APIs & Beautiful… Soup?

Q: What’s an API?

API stands for Application Programming Interface.

Great, we got the acronym out of the way. Not a very helpful acronym since application is such a broad term, programming is such a broad term, and interface is such a broad term.

But then again, API is a broad term, a broad concept.

To non-technical folks, I usually explain that an API is like an outlet in the wall. Just as a wall outlet gives you a relatively standardized way to get electricity into your vacuum, your lamp, or your cellphone, an API gives developers a standardized way to send and receive data between applications that have wildly different purposes and might speak entirely different languages.

The World of Scraping

In high school, a friend and I decided we wanted to set up our computers as web servers to host the music and movies that we had been (very illegally) collecting via Kazaa, LimeWire, and BitTorrent. We quickly became competitive about whose server would have the better design, the better interface, and ultimately the better library.

I remember one particular challenge: I wanted a graphical display for my movies. I had a library of maybe 50 movies, just too many to go and manually find a trailer link and a cover image for each one. IMDB existed, and I figured I’d just use that. At the time I don’t believe there was an API available, or if there was, I wouldn’t have had any idea how to use it, so I decided to scrape what I needed from IMDB.

If you’re not familiar, scraping is the practice of scooping up a bunch of disorganized data (usually markup, such as entire webpages) and then systematically dissecting it to pull out the relevant pieces (in my case, the movie cover image and a link to a trailer). Since places like IMDB build and serve pages in a predictable way, scraping can be a pretty reliable way to get data (until they change the structure, everything breaks, and you have to figure out why).

I dove deep into the world of BeautifulSoup4. I’m ambivalent as to whether or not that was a beautiful world. I remember it being a headache often, not so much because BS4 wasn’t useful, but because scraping is neither an elegant art, nor a science.
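To make the scraping idea concrete, here is a minimal sketch of what “dissecting” a page can look like. It uses Python’s built-in html.parser rather than BeautifulSoup so it runs without installing anything; BS4 wraps the same idea in a friendlier interface. The page markup, the class names "poster" and "trailer", and the example.com URLs are all made up for illustration and are not IMDB’s real structure.

```python
from html.parser import HTMLParser

# Hypothetical page markup (an assumption for this sketch).
# A real movie page is far messier and changes over time.
SAMPLE_PAGE = """
<html><body>
  <div class="poster"><img src="https://example.com/covers/shawshank.jpg" alt="Poster"></div>
  <a class="trailer" href="https://example.com/trailers/shawshank">Watch trailer</a>
  <a href="https://example.com/about">About</a>
</body></html>
"""


class MovieScraper(HTMLParser):
    """Collects the cover image URL and the trailer link from one page."""

    def __init__(self):
        super().__init__()
        self.in_poster = False   # True while we are inside <div class="poster">
        self.cover_url = None
        self.trailer_url = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "poster":
            self.in_poster = True
        elif tag == "img" and self.in_poster and self.cover_url is None:
            # First image inside the poster div is assumed to be the cover.
            self.cover_url = attrs.get("src")
        elif tag == "a" and attrs.get("class") == "trailer":
            self.trailer_url = attrs.get("href")

    def handle_endtag(self, tag):
        # Simplification: any closing </div> ends the poster section.
        if tag == "div":
            self.in_poster = False


scraper = MovieScraper()
scraper.feed(SAMPLE_PAGE)
print(scraper.cover_url)
print(scraper.trailer_url)
```

Notice how much of the code is really a set of guesses about the page layout. The moment the site renames a class or moves the image, the scraper quietly returns None, which is exactly the headache described above.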

The API Way

Now in an alternate universe, had I known what an API was and had I simply googled “movie database api,” I’m sure I would have found a cleaner solution. Probably I could have written a script that, each time I added a movie file to my library, would automatically visit a URL like:

https://v2.sg.media-imdb.com/suggests/s/shawshank.json

Automatically download the response:

imdb$shawshank({  
   "v":1,
   "q":"shawshank",
   "d":[  
      {  
         "l":"The Shawshank Redemption",
         "id":"tt0111161",
         "s":"Tim Robbins, Morgan Freeman",
         "y":1994,
         "q":"feature",
         "vt":4,
         "i":[  
            "https://m.media-amazon.com/images/M/MV5BMDFkYTc0MGEtZmNhMC00ZDIzLWFmNTEtODM1ZmRlYWMwMWFmXkEyXkFqcGdeQXVyMTMxODk2OTU@._V1_.jpg",
            674,
            1000
         ],
         "v":[  
            {  
               "l":"Official Trailer",
               "id":"vi3877612057",
               "s":"2:11",
               "i":[  
                  "https://m.media-amazon.com/images/M/MV5BNjQ2NDA3MDcxMF5BMl5BanBnXkFtZTgwMjE5NTU0NzE@._V1_.jpg",
                  640,
                  480
               ]
            }
         ]
      }
   ]
})

… and be on my merry way.

Most APIs return JSON or XML: structured responses that almost every language has a way to parse so you can extract the good bits.
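As a rough illustration of “extracting the good bits,” here is a small Python sketch that works on a trimmed copy of the response shown above. That response is JSONP (JSON wrapped in a function call, here imdb$shawshank(...)), so the wrapper has to come off before a JSON parser will accept it. The helper names strip_jsonp and summarize are my own, and my reading of the short keys ("l" as title, "i" as image, "v" as videos) is an assumption based on the values, not on any documentation.

```python
import json

# Trimmed copy of the response above, pasted in so the sketch runs offline.
RAW = """imdb$shawshank({
  "v": 1,
  "q": "shawshank",
  "d": [
    {
      "l": "The Shawshank Redemption",
      "id": "tt0111161",
      "y": 1994,
      "i": [
        "https://m.media-amazon.com/images/M/MV5BMDFkYTc0MGEtZmNhMC00ZDIzLWFmNTEtODM1ZmRlYWMwMWFmXkEyXkFqcGdeQXVyMTMxODk2OTU@._V1_.jpg",
        674,
        1000
      ],
      "v": [
        {"l": "Official Trailer", "id": "vi3877612057", "s": "2:11"}
      ]
    }
  ]
})"""


def strip_jsonp(text: str) -> str:
    """Return the JSON between the first '(' and the last ')'."""
    start = text.index("(") + 1
    end = text.rindex(")")
    return text[start:end]


def summarize(payload: dict) -> dict:
    """Pick out the fields a movie library page would need."""
    first = payload["d"][0]          # assume the first suggestion is the match
    trailers = first.get("v", [])    # not every result has videos
    return {
        "title": first["l"],
        "year": first["y"],
        "imdb_id": first["id"],
        "cover_url": first["i"][0],  # the list is [url, width, height]
        "trailer_id": trailers[0]["id"] if trailers else None,
    }


movie = summarize(json.loads(strip_jsonp(RAW)))
print(movie["title"], movie["year"])
print(movie["cover_url"])
print(movie["trailer_id"])
```

Compared with scraping, there is no guessing about page layout here: the data arrives already labeled, and the script only has to know which keys to read.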

Salesforce also provides APIs, wall outlets that allow developers to write programs that push data into Salesforce and pull data out. To learn about the REST API, check out my post about REST Endpoints.

That’s all, cheers! Check out the other Conversations here.
