Sunday, November 20, 2011

Thinking about your data model

Web applications are so much more than they used to be. With integrations into other web applications through exposed APIs, the shift to Single Sign On mechanisms, and data sources that range from traditional database backends to NoSQL stores such as Cassandra and even flat files, the amount of data an application needs to process and be aware of is pretty intense.

And yet most web applications treat every data source except the local database as a second-class citizen. Even though those alternate data sources are critical to the running of the application, it's only the database itself that is abstracted behind the application's model layer.

Model layer? Well, any web developer attempting to build a web application in this day and age without the structure of some form of MVC (Model-View-Controller) architecture behind it is asking for a difficult time ahead. MVC imparts a fixed structure to a project, with a sensible separation of concerns that makes your web application more maintainable and more extensible. If you still work in the days of single files with HTML, business logic and data access all scrunched together, then you are woefully behind current best practice.

Unfortunately, a lot of the power of the MVC design pattern is diluted by misuse. Hell, I have even caught myself doing it at times. The one aspect I am discussing here is the model (or data) layer, which exists for the sole reason of being a central mechanism that lets you grab the data you need for your application without having to worry about how that data is implemented, where it is stored, what the database architecture is, or even if it's a database at all. And that last point is where things fall short.

A number of web apps I have seen (and BrandFu is not exempt from this, unfortunately) use the model layer exclusively for the application's own database. Any other data source is accessed ad hoc, and in varying ways, throughout the application's controller and view layers, and occasionally within the model, but only to extend what can be done with rows grabbed out of the database. The problem with this approach is that if you ever want to decouple from a specific data source, for example to switch from consuming a web service to storing and managing that data in your own database, it will be a nightmare.

I am not saying I am not to blame either. I do get caught out with this myself. Developing BrandFu, we found ourselves occasionally making calls to external web services from outside the model layer. And a few weeks ago, we had some interest from a company that would like to have the service installed as a separate instance on their own network, so they can provide BrandFu services to their own clients from servers they manage themselves.

Sounds great, but there's one problem. At SYNAQ we have an internally used "API" and Single Sign On (SSO) system called SASY. The BrandFu application itself relies quite heavily on SASY as a data source, but unfortunately for us, the web service requests are scattered around the code in the controller layer. Not all, but a fair number of them.

The solution? Replicate the object model returned from these existing API calls as pseudo-database tables in our symfony schema.yml file. Essentially, map the data returned from these API calls as if they were tables in our local database. symfony can then auto-generate the model classes for these API calls, exactly as it would in the more traditional database model. The difference is that the methods we add to these model classes will not query our database; they will make the API request to SASY, hydrate the object and send that back.
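To make the idea concrete, here is a minimal sketch in Python rather than in symfony's generated PHP classes. Everything in it is an assumption made for illustration: the Account model, its fields, the AccountPeer finder, the "/accounts/<id>" path and the fetch callable are all hypothetical, and none of it reflects SASY's real interface. The point is only that the caller asks the model for an object and never sees the web service behind it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Hypothetical transport: takes a resource path and returns the decoded
# response as a dict. In a real application this would wrap the HTTP call.
Fetch = Callable[[str], Dict[str, Any]]


@dataclass
class Account:
    """Stands in for a generated model class; the fields are invented."""
    id: int
    name: str
    email: str

    @classmethod
    def hydrate(cls, row: Dict[str, Any]) -> "Account":
        # Build a model object from a raw record, whatever its origin.
        return cls(id=int(row["id"]), name=row["name"], email=row["email"])


class AccountPeer:
    """Finder for Account objects, backed by an API instead of a table."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch

    def retrieve_by_pk(self, account_id: int) -> Account:
        # Where a database-backed model would run a SELECT, this one
        # makes the API request and hydrates the result.
        payload = self._fetch(f"/accounts/{account_id}")
        return Account.hydrate(payload)


def fake_fetch(path: str) -> Dict[str, Any]:
    # Canned response standing in for the remote service.
    account_id = path.rsplit("/", 1)[-1]
    return {"id": account_id, "name": "Example Ltd", "email": "ops@example.com"}


account = AccountPeer(fake_fetch).retrieve_by_pk(42)
print(account.name)
```

A controller that calls retrieve_by_pk gets an Account back and has no way of telling that a web service was involved, which is the whole point.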

The result is that any chunk of code that needs that data doesn't know where it came from. It doesn't care. As long as it gets what it wants and can continue processing, why should it? This also encourages re-use a lot more, reduces code complexity, and makes maintenance even easier.

The other benefit is that if we ever need to move away from an API-based data source for those "tables", well, their schema has already been defined, and adding the bit of additional code to make a database query instead of a REST request is a lot simpler. You could even support both an API data source and a local database, and switch between the two via config.
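Switching via config could look something like the sketch below. Again this is Python for illustration only, and the names are assumptions: the "account_source" config key, the two source classes, the account table and the fake fetch function are all hypothetical. An in-memory SQLite database stands in for the local database.

```python
import sqlite3
from typing import Any, Callable, Dict, Protocol


class AccountSource(Protocol):
    """What the rest of the application is allowed to depend on."""

    def find(self, account_id: int) -> Dict[str, Any]: ...


class ApiAccountSource:
    def __init__(self, fetch: Callable[[str], Dict[str, Any]]) -> None:
        self._fetch = fetch

    def find(self, account_id: int) -> Dict[str, Any]:
        # Remote lookup through the (hypothetical) web service transport.
        return self._fetch(f"/accounts/{account_id}")


class DbAccountSource:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def find(self, account_id: int) -> Dict[str, Any]:
        # Local lookup against a table with the same shape as the API data.
        row = self._conn.execute(
            "SELECT id, name FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no account with id {account_id}")
        return {"id": row[0], "name": row[1]}


def build_source(config: Dict[str, str], *, fetch, conn) -> AccountSource:
    # The config value is the only place the choice of backend is made.
    kind = config["account_source"]
    if kind == "api":
        return ApiAccountSource(fetch)
    if kind == "database":
        return DbAccountSource(conn)
    raise ValueError(f"unknown account_source: {kind!r}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO account VALUES (1, 'Stored locally')")


def fake_fetch(path: str) -> Dict[str, Any]:
    # Canned response standing in for the remote service.
    return {"id": 1, "name": "Fetched remotely"}


for kind in ("api", "database"):
    source = build_source({"account_source": kind}, fetch=fake_fetch, conn=conn)
    print(kind, source.find(1)["name"])
```

Both sources return records of the same shape, so the calling code stays identical whichever one the config selects.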

In fact, that's exactly what I will be doing now. BrandFu is going to be transitioned to as clean a data model as possible over the next few weeks. This will make the application easier to maintain, easier to extend and easier to deploy across a variety of systems and networks.
