Cloud computing | Rethinking the DBMS

2009/03/02

Not another article about the doomed relational model

Filed under: Uncategorized — Tags: Cloud computing, DOOM! — Ben Samuel @ 10:18

This one is pretty bad. The author, Mr. Bain, is, according to the bio, the founder of “a company that makes investments in early stage software development.” Well, his article is hype about some new technologies and hype is a necessary reality of life, but if he represents investors he ought to give them better advice than he’s giving in this article.
And from the first line, it doesn’t make sense:

Recently, a lot of new non-relational databases have cropped up both inside and outside the cloud. One key message this sends is, “if you want vast, on-demand scalability, you need a non-relational database”.

After that complete non sequitur:

During this time, several so-called revolutions flared up briefly, all of which were supposed to spell the end of the relational database. All of those revolutions fizzled out, of course, and none even made a dent in the dominance of relational databases.

If he’s just saying that all the silly articles predicting that the relational model was doomed were completely wrong, well, that seems vaguely familiar. But if he’s saying that non-relational technologies were attempted and then abandoned, I’d have to take issue with that. OODBMSs should count as one of the “so-called revolutions” and they’re still around and are, in fact, incorporated in many mainstream SQL DBMSs. Most of the other trendy technologies I can think of, such as XML or some of the data warehousing tech, were never really considered direct competitors to the relational model.
The lesson here is that the marketplace of human ingenuity is always bigger than your imagination and there’s plenty of space for all these various technologies. And, to be clear, I’m not pooh-poohing these technologies, rather, I’d like to follow them more closely because I think they’re extremely interesting. My argument is that they aren’t going to shift relational DBMS technology because they don’t compete with it.
What exactly is it that the relational database dominates, and how does it dominate? The what of the answer isn’t way out there, and it surprises me that a business guy like Mr. Blain missed it.
Here’s the basic business reason for a SQL DBMS: your business has many processes that involve many (metaphorical) moving parts and you need to capture that logic, known as business logic, in a set of rules. You need your data to conform, at all times, to those rules.
This isn’t to say that there isn’t a whole lot of other data that can’t be stored in other ways or even must be stored in other ways, but here are a few of the things that businesses have to worry about:

Tracking employees.
Paying taxes.
Government regulations.
Tracking customers.
Tracking shipments.
Managing vendors.

This stuff goes back to before the Roman Empire, and, if anything, it’s only going to get more regulated and more complex, which means the need to track business logic, and thus the case for the relational DBMS, will only grow.
The second half of the answer, the how, is that the relational model is well defined. DBMSs using SQL (which is pretty much all of them) don’t use the relational model but what is often referred to as the SQL model. The differences between the relational model and the SQL model are a little arcane (and I’ll probably touch on them in this blog) but they are significant enough that it’s simply wrong to conflate the two.
Many of the competitors to the relational model, such as the old Pick systems (still around, incidentally) used a model that was not well defined. So what’s wrong with loosely defined systems? There are a few, but I’d argue the complete and total lack of interoperability is the problem. At first blush XML refutes this, but really, it proves it.
What did we have pre-XML? A complete mess of binary formats, and the n-squared problem of writing translators for all of them. Did you ever try these translators? Even simply opening a Word document in a later version tends to destroy the underlying data.
Along comes the web and the successful tag structure of HTML is generalized to arbitrary data. LISP advocates complain that it’s just symbolic expressions with angle brackets.
So now we have a post-XML world. DOC has become DOCX and what’s changed? Well, loading and saving documents is somewhat more robust because some of the logic has been offloaded to reusable libraries. And now you can use XSLT to transform documents.
But the original problem remains: you have a whole mess of incompatible file formats and n-squared translators that don’t actually work.
The various key-value stores are very new, and it remains to be seen how well they will work together, but I haven’t seen any evidence that they even attempt to address the issue.
(Updates: 4 Mar: formatting.)

Comments (5)