March 27th, 2017
(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: email@example.com
I like the fact that Kinesis has an SQL interface:
I can see this being useful for Business Intelligence. But that seems like a niche to me. Both SQL and NoSQL solved big, universal problems.
But maybe the SQL is just window dressing? Maybe the real breakthrough is a distributed system with strong consistency guarantees? It would be a very big deal if someone found a way around the CAP Theorem. But otherwise, NewSQL is just more bad marketing for Kyle Kingsbury to tear apart.
All the same, this is a big prediction:
Object Stores going the way of NoSQL Databases?
Object stores have much in common with NoSQL databases. Both arose from scalability limitations in hierarchical filesystems and relational databases, respectively. However, many NoSQL databases are now being replaced by NewSQL systems that overcome the scalability limitations of single-node relational databases, but maintain the strong consistency guarantees of relational databases [newsql]. Sometimes, you can have your cake and eat it.
Given this, I make the following prediction:
Object store systems grew from the need to scale out filesystems. A new generation of hierarchical filesystems will appear that brings both scalability and hierarchy, and many companies will move back to scalable hierarchical filesystems for their stronger guarantees.
Object stores were designed to scale up a storage service to tens of thousands of servers, providing effectively unlimited storage capacity, but at the cost of weakening the guarantees provided to application developers, in the form of eventually consistent semantics. Eventual consistency implies that data read by applications may be stale (not the latest version written). This creates problems for application developers: it is challenging to write correct applications when the store gives only weak guarantees about which version of an object or file will be read.
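To make the stale-read problem concrete, here is a minimal sketch of a toy store in Python. It is not how S3 or any real object store is built. The class name, the single lagging replica, and the explicit sync() step are all assumptions made for illustration.

```python
class EventuallyConsistentStore:
    """Toy model (hypothetical): writes land on a primary copy and
    reach the replica that serves reads only when sync() runs."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def put(self, key, value):
        # The write is acknowledged before the replica has seen it.
        self.primary[key] = value

    def get(self, key):
        # Reads are served by the replica, which may lag behind.
        return self.replica.get(key)

    def sync(self):
        # Replication eventually catches up with the primary.
        self.replica.update(self.primary)


store = EventuallyConsistentStore()
store.put("report.csv", "v1")
store.sync()

store.put("report.csv", "v2")      # overwrite the object
stale = store.get("report.csv")    # replica has not caught up: still "v1"

store.sync()
fresh = store.get("report.csv")    # now "v2"
```

The application that wrote "v2" and immediately read the key back got "v1", with no error and no hint that the answer was old. That silent staleness is what makes correct application code hard to write.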
Amazon has not published any details on how metadata works in S3 or what guarantees it provides (other than don’t expect to read what you wrote). Companies have reacted by building their own metadata stores for the metadata in S3: they developed key-value stores that are eventually consistent replicas of the eventually consistent metadata in S3. Even Amazon doesn’t trust S3’s metadata and recommends that users use EmrFS, which stores replicas of S3 metadata in DynamoDB.
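The general idea of a separate metadata store can be sketched roughly as follows. This is not the EmrFS implementation or any AWS API. The function name, the version numbers, and the plain dicts standing in for the object store and the metadata table are hypothetical, and the sketch assumes the metadata table itself returns the latest write.

```python
class StaleReadError(Exception):
    """Raised when the object store returns an older version than recorded."""


def checked_read(key, object_store, metadata):
    """Return the data for `key` only if it is at least as new as the metadata says.

    object_store: dict of key -> (version, data); may lag behind writes.
    metadata:     dict of key -> latest written version; assumed up to date.
    """
    expected = metadata.get(key)
    found = object_store.get(key)
    if expected is None:
        # Key is not tracked in the metadata table: pass through whatever exists.
        return None if found is None else found[1]
    if found is None or found[0] < expected:
        raise StaleReadError(f"{key}: expected version {expected}, got {found}")
    return found[1]


metadata = {"report.csv": 2}                       # writer recorded version 2
lagging_store = {"report.csv": (1, "old rows")}    # store still serves version 1

try:
    checked_read("report.csv", lagging_store, metadata)
    detected = False
except StaleReadError:
    detected = True                                # stale read is caught

caught_up_store = {"report.csv": (2, "new rows")}
result = checked_read("report.csv", caught_up_store, metadata)
```

A caller that catches StaleReadError can retry after a delay instead of silently processing old data. The check is only as good as the metadata table: if that table is itself eventually consistent, as the quoted passage says of the replicas companies built, the same problem comes back one level up.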