September 5th, 2015
(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: firstname.lastname@example.org
Stability was the key aspect of Kafka we were unhappy with after a yearlong journey with it.
An HA deployment of Kafka requires an HA deployment of zookeeper, which Kafka uses to coordinate distributed state and configuration. As I explained before, we’ve experienced a number of stability issues with this stateful cluster maintaining consistency through outages such as VM recycle. Some serious engineering time was going into reacting to issues, tracking them down, and subsequent stabilization attempts. Not to mention the cost that every downtime incurred.
With the transition to ZeroMQ, this entire layer of complexity was gone overnight. There are no moving parts to configure, deploy, manage, and support in our ZeroMQ design – all state is transient between in-memory data and the network.
Parts of the system that are not there never break.
What about access to historical logs?
With the problem of real-time log consolidation solved with ZeroMQ, supporting durable storage of logs for later post-processing becomes a smaller and a much more manageable challenge. For example, off the shelve solutions like Logstash are capable of capturing data from ZeroMQ and publishing it to a variety of destinations.
In our particular situation at Auth0, we are already using AWS Kinesis, ElasticSearch, and Kibana as a log processing pipeline in other parts of our operations. As such we are likely to develop a small, stateless message pump that will act as a ZeroMQ subscriber on one hand, and an AWS Kinesis client on the other to tap into this pipeline.