
REST APIs as Data Backends

Out Of Date Warning

Languages change. Perspectives are different. Ideas move on. This article was published on March 19, 2012 which is more than two years ago. It may be out of date. You should verify that technical information in this article is still current before relying upon it for your own purposes.

Some months ago, the Socorro team agreed that our current mix of REST API middleware calls and direct SQL calls from the web interface simply wasn’t meeting our needs. We were faced with an increasing number of data sources, including the coming addition of Elastic Search to the data storage system, and maintenance was becoming a problem. Thus, the decision was made to move our data layer to our REST API exclusively, removing all direct access to data storage from the web interface.
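As a rough illustration of what "all data through the REST API" can look like from the web interface's side, here is a minimal Python sketch. It is not Socorro's actual code: the class name `CrashDataClient`, the `reports` resource, and the `api.example.invalid` host are all hypothetical, and the real middleware's URLs and response formats may differ.

```python
import json
from typing import Callable
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get(url: str) -> str:
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


class CrashDataClient:
    """Hypothetical single entry point the web interface uses to reach data.

    No SQL and no direct storage access live here, only HTTP calls to the API.
    """

    def __init__(self, base_url: str, fetch: Callable[[str], str] = http_get):
        self.base_url = base_url.rstrip("/")
        self.fetch = fetch  # injectable, so tests can run without a network

    def build_url(self, resource: str, **params) -> str:
        """Build the URL for a resource, with query parameters in sorted order."""
        query = urlencode(sorted(params.items()))
        url = f"{self.base_url}/{resource}"
        return f"{url}?{query}" if query else url

    def get(self, resource: str, **params) -> dict:
        """Call the API and decode its JSON response."""
        return json.loads(self.fetch(self.build_url(resource, **params)))
```

A page controller would then call something like `client.get("reports", product="Firefox")` and never need to know whether the data came from a relational database, Elastic Search, or anything else behind the API.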

This is the second project I’ve been on where an external API has been used to retrieve all of an application’s data. It’s a novel concept, but one that certainly takes some getting used to.

There are numerous benefits to the approach we took. First, it allowed us to write our middleware in a different language (in this case, Python) from our web interface. Second, for users of the application (and Socorro has users outside Mozilla), it becomes much easier to strip off the Firefox-focused web interface we’ve built and replace it with something that produces reports aimed at their specific needs. Finally, when we do rewrite the web interface, the job will be that much easier because the data storage backend will already be in place.

At first I was opposed to the idea of making one or more HTTP calls on every page load, but I got used to it. I realized that this model makes sense both for our application and in general practice: building applications that use your own API forces you to “dogfood” your own work, and it opens up numerous opportunities down the road should you wish to open the API to outside developers or build something like a smartphone or desktop application.

Of course, this has been a long process and for most applications that are built in a “traditional” style (where the data backend is built into the application), it takes a bit of planning. Adrian Gaudebert (site in French), who is our Serial Reorganizer, has been focused on this as one of his top Quarter 1 goals and he’s unlikely to complete it in this quarter, though not for a lack of trying. There’s a tremendous amount of code that interfaces directly with some sort of data source, and it’s difficult to pull that code apart, push it into the API, and piece the web interface back together. For our use case, however, it makes a lot of sense, and is worth the effort.

In a world where data increasingly comes from both internal and external sources, this sort of approach makes more and more sense. It’s one I’ll use in my future web applications, and one I encourage you to use in yours.



Simon wrote at 3/19/2012 8:41 am:

Interesting, I’ve seen this done in one or two other places. How are you finding the impact on performance? My concern would be with the potential overhead of a lot of HTTP calls happening in the background, but I don’t know if that concern is well-founded.

Brandon Savage (@brandonsavage) wrote at 3/19/2012 8:48 am:

To date I don’t know that we’ve seen a performance hit. This is largely due to the fact that the API servers are on the same rack as the web servers, and we utilize several servers strictly for the API, helping to ensure consistent availability. We also cache a lot of routine calls, so that we’re not pinging the API for every single page load, or multiple times per page load for the same data.
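The caching of routine calls mentioned in this reply could be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the `TTLCache` class is hypothetical, it keeps entries in process memory, and the thread does not say which cache store or expiry policy Socorro actually uses.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Hypothetical in-memory cache: reuse a result for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable, so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling fetch() only on a miss or expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: no API call needed
        value = fetch()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

With a cache like this in front of the API client, several requests for the same data within one page load (or across page loads inside the expiry window) result in a single API call.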

Lukas (@lsmith) wrote at 3/19/2012 11:05 am:

We are using this kind of architecture on many projects here at Liip.ch and are very happy with the clean separation this offers.

One important thing for performance is to look into using parallel requests wherever possible. We have added the necessary features for this to the Buzz HTTP client (though others like Guzzle already support this internally):
https://github.com/kriswallsmith/Buzz/pull/27
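To make the idea of parallel requests concrete, here is a minimal sketch using Python's standard library thread pool. It does not use Buzz or Guzzle and does not reflect their APIs; the `fetch_all` helper and its parameters are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def http_get(url: str) -> str:
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], str] = http_get,
    max_workers: int = 8,
) -> Dict[str, str]:
    """Issue all requests concurrently and return a mapping of URL to body."""
    urls = list(urls)
    if not urls:
        return {}
    with ThreadPoolExecutor(max_workers=min(max_workers, len(urls))) as pool:
        # map() runs the calls in parallel but yields results in input order.
        bodies = list(pool.map(fetch, urls))
    return dict(zip(urls, bodies))
```

When a page needs several independent API resources, the total wait is then close to the slowest single call rather than the sum of all of them.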

Maarten wrote at 3/19/2012 1:38 pm:

Sounds like what I’ve heard iBuildings is doing a lot.

chris wrote at 3/21/2012 8:11 am:

Just to tell you this is not really a novel concept.

When we started a new app in 2006 (6 years ago!), that was what we did (except it was SOAP instead of REST). Besides our app, there were lots of similar designs (IIRC, we based our design on a white paper published by Lufthansa).

We have since replaced SOAP with REST to reduce the cost of the backend calls, and we also cache lots of data in memcache (the data that does not change frequently).

The thing to be careful about when doing this is the DTO: pay very special attention to the objects you serialize in the backend (e.g., with an ORM, the number of SQL queries can be huge for data you do not need).
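One way to picture this warning about DTOs is the sketch below, written in Python for illustration. The `ReportSummaryDTO` class and its fields are hypothetical. The point is that the serialized object lists its fields explicitly and copies only those, so a lazy-loading ORM is never asked for relations the client does not need.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReportSummaryDTO:
    """Hypothetical DTO: only the fields the client needs, no ORM relations."""

    id: int
    signature: str

    @classmethod
    def from_record(cls, record) -> "ReportSummaryDTO":
        # Copy plain attributes explicitly. Related objects on the record are
        # never touched, so no extra queries are triggered on their behalf.
        return cls(id=record.id, signature=record.signature)


def to_json(dto: ReportSummaryDTO) -> str:
    """Serialize the DTO, not the ORM record it was built from."""
    return json.dumps(asdict(dto), sort_keys=True)
```

Serializing the ORM record directly would instead walk whatever attributes the serializer discovers, which is where the unexpected queries tend to come from.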

Doug E wrote at 3/23/2012 1:10 am:

This is exactly what my team is beginning to do. Luckily our product is fairly new and is still undergoing a lot of development, so we’ve already started the process of offloading new features to an API. Good luck with the project, Brandon.

Stephen Flee wrote at 3/26/2012 2:55 pm:

Better yet, use HMVC and you get all the benefits of a REST-based API without any of the overhead of additional HTTP requests!

It’s a great idea though: a bit more work in the beginning gives you loads of extensibility down the line.