At a certain scale you have to give up on the single, normalized, canonical database

(Written by Lawrence Krubner; indented passages are often quotes.) You can contact Lawrence at lawrence@krubner.com, or follow him on Twitter.

I am surprised this article by Yorick Peterse, attacking MongoDB, got so much attention. The company used to use MySQL and MongoDB, but now they just use Postgres for everything. Postgres is a great technology, and I would always prefer it to MySQL. I question the intelligence of a company that uses MySQL when Postgres is such a good choice. But if a writer tries to compare Postgres and MongoDB, then I have to question their intelligence, since the two technologies are so radically different that it doesn’t make much sense to compare them. And if you conclude that in the future companies can give up on denormalized databases and just use a normalized database for everything, then you are very much swimming against the tide of history.

Apparently Yorick Peterse’s company does not yet face the kind of traffic where they need a dedicated denormalized database. And I would guess that 99% of all companies in the world are small enough that they can get by with a single, normalized, canonical database. But there does come a certain scale where that won’t be possible, and then you need to look for ways of storing denormalized data and serving it fast. Among your choices nowadays:

ElasticSearch

Solr

MongoDB

CouchDB

ReactDB

Riak

Cassandra

Etc. I cannot possibly list them all. There has been an explosion in the number of databases, and the reason is that a lot of companies are reaching “web scale” and can no longer get by with a single, normalized, canonical database.

And there is a good argument for moving to this architecture before you actually need it. Moving away from it only makes sense if you’ve concluded that you are growing slowly and won’t need such an architecture for some years.

Yorick Peterse’s argument:

This brings me to the requirements of a good database, more specifically the requirements Olery has. When it comes to a system, especially a database, we value the following:

Consistency.

Visibility of data and the behaviour of the system.

Correctness and explicitness.

Scalability.

I sure hope these are the arguments against MySQL, rather than MongoDB. If any of those concerns were on their mind when they chose MongoDB, then they are completely incompetent.

With the above values in mind we set out to find a replacement for MongoDB. The values noted above are often a core set of features of traditional RDBMS’ and so we set our eyes on two candidates: MySQL and PostgreSQL.

Wow!!! Really? Seriously? You are giving up on MongoDB because it doesn’t give you Consistency? Did you ever, for one second, think that MongoDB was designed to give you Consistency? Did you randomly pick MongoDB from a list of words, or did you actually read its feature set before you chose it? And then you felt the choice came down to MySQL and PostgreSQL? That’s like having to choose between eating dirt versus eating ice cream.

And then there is this:

In the end we decided to settle with PostgreSQL for providing a balance between the various subjects we care about. The process of migrating an entire platform from MongoDB to a vastly different database is no easy task. To ease the transition process we broke this process up in roughly 3 steps:

Set up a PostgreSQL database and migrate a small subset of the data.

Update all applications that rely on MongoDB to use PostgreSQL instead, along with whatever refactoring is required to support this.

Migrate production data to the new database and deploy the new platform.

This bit: “migrating an entire platform from MongoDB” means that they had canonical data in their MongoDB database. That can be done, but you have to be very, very careful about doing that. MongoDB is at its best when used as the denormalized, read-only front-end database, in particular if you are serving JSON to a JavaScript frontend. MongoDB is great at that. But then, so is ElasticSearch.
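To make the split concrete, here is a minimal sketch of what I mean by keeping the canonical data normalized and generating denormalized, read-only documents from it. Everything in it is an assumption for illustration: the `products` and `reviews` tables, the function names, and the use of an in-memory SQLite database as a stand-in for Postgres. The resulting documents are just printed as JSON; in a real system they would be written to whatever document store serves the frontend.

```python
import json
import sqlite3


def make_canonical_db():
    """Build a tiny normalized database (hypothetical schema, SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE reviews (
            id INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES products(id),
            author TEXT NOT NULL,
            rating INTEGER NOT NULL
        );
        INSERT INTO products VALUES (1, 'Product A'), (2, 'Product B');
        INSERT INTO reviews VALUES (1, 1, 'ann', 5), (2, 1, 'bob', 3);
    """)
    return conn


def build_product_documents(conn):
    """Join the normalized rows into one self-contained document per product."""
    docs = {}
    rows = conn.execute(
        "SELECT p.id, p.name, r.author, r.rating "
        "FROM products p LEFT JOIN reviews r ON r.product_id = p.id "
        "ORDER BY p.id, r.id"
    )
    for product_id, name, author, rating in rows:
        doc = docs.setdefault(
            product_id, {"_id": product_id, "name": name, "reviews": []}
        )
        # A product with no reviews yields one row with NULL review columns.
        if author is not None:
            doc["reviews"].append({"author": author, "rating": rating})
    return list(docs.values())


if __name__ == "__main__":
    # Each printed line is a document the frontend could read with no joins.
    for doc in build_product_documents(make_canonical_db()):
        print(json.dumps(doc))
```

The point of the design is that the documents are disposable: if they are lost or the shape needs to change, they can be regenerated from the canonical tables, which remain the only place where writes happen.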

Clearly these people stumbled into using MongoDB without knowing how to use it.

Post external references

  1. http://developer.olery.com/blog/goodbye-mongodb-hello-postgresql/