Why would anyone choose Docker over fat binaries?

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com

They are blinded by love:

What I like most about Docker is how responsive and quick it is, and the instant repeatability it offers. Spinning up a new container takes literally the same time it takes to run the command on bare metal. It’s a joy to behold! Since this post was originally published in 2013, my fellow Atlassians and I have written several more articles about Docker. What can I say?… we’re in love.

I have the impression that Docker is an incredible technical feat that barely manages to keep an old paradigm alive. It seems sad that so much effort should be made to keep old technologies going, when better alternatives are now available.

Rewind the clock to the mid-1990s. Most companies, and most early web sites, had only a small handful of servers. These servers were lovingly groomed by their sysadmins. There is an old joke that in those days the servers were named like pets, whereas later everything became “web scale,” meaning sites had thousands of servers, meaning the servers had numbers instead of names.

The popular script languages emerged in the 1990s. Python and Ruby, in particular, both adopted the attitude that they should be, as much as possible, thin wrappers over the underlying OS. This attitude lasted a long time.

The Python community felt they didn’t need a story about concurrency because that should be outsourced to the OS. As late as 2012 I caught a talk by Andrew Montalenti in which he repeated the Python slogan “threads drool, processes rule” — meaning, if you need parallelism, call the system-level “fork” command and hand off the work to some other process; let the underlying OS do the scheduling. Montalenti is a brilliant guy who says a lot of smart things, and he’s taken a lot of chances on new technology. The fact that he was still handing off concurrency work to the OS in the year 2012 suggests that there is a wonderful kind of comfort to that kind of programming (scripting that hands off the heavy stuff to the OS). I did that kind of programming for more than 10 years, so I can relate to the enjoyment of it. That doesn’t change the fact that it is obsolete (for sites at scale).

Ryan Tomayko wrote I Love Unicorn Because It’s Unix in 2009, which is right about when this paradigm began to die. I loved his essay at the time and I sent it to all of my friends. It took me another 2 or 3 years, but eventually I saw that what he suggested took us in the wrong direction.

When I wrote “An architecture of small apps” in 2013 I’d already figured out at least half of this story. I was certainly aware of the simplicity of fat binaries, though I was working at a company that was overwhelmingly committed to PHP and Ruby, so I had to be gentle about introducing new ideas. What I didn’t fully appreciate then was how much the complexity of configuring lots of small apps would force us to give up on those languages that cannot be turned into a fat binary.

In recent years, all of the script languages have made an effort to do a better job managing their dependencies and configuration. “Bundler” for Ruby is a fantastic dependency management tool. Virtualenv, for Python, allows the creation of artificial environments where dependencies and environment variables are set for a particular app. These solutions managed to get these script languages halfway to the modern world. If you run a 100% pure Ruby or Python shop, you can go a long way with these technologies. Still, they have their limits. They don’t actually bundle assets the way an uberjar might contain all your HTML and CSS and images (Rails has the Asset Pipeline, but you end up relying on a monolithic framework to handle asset bundling for you). And they are language specific.

Depending on the underlying OS makes sense for the script languages of the 1990s. But it doesn’t make sense for our modern world of microservices running on thousands of servers and needing to auto-discover which ports are used by which services. Relying on environment variables, and paths that are global to the server itself, is massively stupid when you’ve got 100 apps, written in different languages, each perhaps with different needs regarding configuration and startup. In the world of hundreds of microservices running on thousands of servers, what works best is to have self-enclosed apps that contain all of their own dependencies.

There are two possible approaches. Docker is one. The other is a fat binary.

The Go language is the pioneer for fat binaries. Compile a Go app and it has everything it needs inside of it. I recall someone demonstrating a simple text reader on Hacker News, and someone else complained “This app is 25 megs, but it should only be 25k at most — just use the built-in tools that exist on every server, like sed and awk.” But what if a server doesn’t have those tools, or has the tools but only as old versions that lack some of the functions that you want? Writing a fat binary means independence from the server — you no longer have to worry about what is on the server, or what the environment variables are, or what the paths are, or what versions are installed, because everything was taken care of for you, by your build tool, when you created that fat binary.

This comment on Hacker News is a good sample of the Golang attitude:

Doesn’t statically compiling programs solve the deployment issue better? I mean, as far as I can tell Docker only exists because it’s impossible to link to glibc statically, so it’s virtually impossible to make Linux binaries that are even vaguely portable.

Except now Go and Rust make it very easy to compile static Linux binaries that don’t depend on glibc, and even cross-compile them easily.

If I have a binary built by Go, what problems does Docker solve that just copying that binary to a normal machine doesn’t?

From 1999 to 2009 I only worked with script languages, especially PHP and Ruby. I was immersed in that world. I only slowly discovered Java and then Clojure. I recall the first time I created an uberjar: all of my code, and all of my HTML, and all of the CSS and the Javascript and the images — all of it now existed in a single file. This seemed like an incredible magic trick. I no longer needed a complex deployment strategy; I could simply “scp” the file to anywhere on any server and spin it up.

But of course, what I was doing was child’s play compared to what others were doing. When Colin Steele was the CTO of Hotelicopter/RoomKey, he had his team create uberjars that included a snapshot of the database, so that the app truly needed nothing from outside of itself.

Solr quickly went from a piecewise solution to our searching needs to something far more interesting. In our problem domain, stale reads of certain data are tolerable, and exploiting that was a lever we could pull. Eventually, we ended up baking an instance of Solr/Lucene directly into our individual application processes, making it possible to achieve true linear horizontal scalability for the application.

20 years from now Colin could spin up one of those instances and it would run and show you all of its data. The hotel data would be out of date, but the thing is, that binary does not need anything except itself. It is truly independent.

Well, I am lying. Any app that runs on the JVM will need the Java runtime, which has always been a separate install. Or it was, until recently: starting with Java 9, the jlink tool can bundle a trimmed-down runtime with the app, so JVM apps can now be made truly independent, just like Go and Rust apps.

Docker is a tool that allows you to continue to use tools that were perfect for the 1990s and early 2000s. You can create a web site using Ruby on Rails, and you’ll have tens of thousands of files, spread across hundreds of directories, and you’ll be dependent on various environment variables. What ports are in use? What sockets do you use to talk to other apps? How does the application server talk to the web server? How could you possibly easily port this to a new server? That is where Docker comes in. It will create an artificial universe where your Rails app has everything it needs. And then you can hand around the Docker image in the same way a Golang programmer might hand around a fat binary.

Docker is a fantastic achievement in that it allows apps written for one age to continue to seem viable in the current age. However, Docker has a heavy price, in that it brings in a great deal of new complexity. If you’ve already experienced the incredible simplicity of fat binaries, then listening to people defend Docker can be a bit incredible:

You now have generic interfaces (Dockerfile, docker-compose, Kubernetes/Rancher templates, etc.) to define your app and how to tie it together with the infrastructure.

Having these declarative definitions make it easy to link your app with different SDN or SDS solutions.

For example, RexRay for the storage backend abstraction of your container:

http://rexray.readthedocs.io/en/stable/

You can have the same app connected to either ScaleIO in your enterprise or EBS as storage.

We are closer than ever to true hybrid cloud apps and it’s now much more easier to streamline the development process from your workstation to production.

This reads almost like a parody, especially “You now have generic interfaces” followed by a bunch of non-generic, very specific technologies: “(Dockerfile, docker-compose, Kubernetes/Rancher templates, etc.).”

Also incredible is the claim that you can now store your files in a variety of places: “You can have the same app connected to either ScaleIO in your enterprise or EBS as storage.” Of course, that is true of all apps, not just Docker apps. Your Go or Rust or Clojure app can also use ScaleIO or EBS. Docker is only adding complexity, it is not expanding the range of options.

Now, it is very easy to bash Docker, given that so many people have written stories about how it failed them. But I would estimate that about 75% of those stories involved bugs that were only present because Docker is young and growing rapidly. Those stories don’t interest me. What does interest me is the other 25%, about things that are core to Docker, which some people actually think of as Docker’s strengths, but which are in fact points of great pain.

Consider “Docker in Production: A History of Failure.” Most of this post is about bugs that only exist because Docker is so young and immature. But this issue is much more fundamental:

The most requested and most lacking feature in Docker is a command to clean older images (older than X days or not used for X days, whatever). Space is a critical issue given that images are renewed frequently and they may take more than 1GB each.

Perhaps someday the technology will exist to make this easy. But, in comparison, think about how you would delete a fat binary that was x days old. On Unix, the “rm” and “stat” commands are both several decades old, and writing a simple bash script to remove a file that is x days old is something Larry Wall would have found easy back in 1987. No new technology needed.

The architects of Docker imagined a world of microservices in which all apps talk to each other using HTTP (actually TCP). And for sure, the best argument for Docker is something like “Python and Ruby and NodeJS can all speak HTTP, and with that they can talk to databases and also the public, so let’s bundle them up and manage their paths and dependencies, so these great technologies can work as well in the future as they have in the past.” But this leads to the biggest problem with Docker. Yes, the network can be very powerful, but trying to use it for everything is a royal pain. Docker initially took the attitude that apps would no longer need a file system, because everything could go over the net, but eventually Docker recognized a need for some kind of storage system:

Ideally, very little data is written to a container’s writable layer, and you use Docker volumes to write data. However, some workloads require you to be able to write to the container’s writable layer. This is where storage drivers come in.

Docker supports several different storage drivers, using a pluggable architecture. The storage driver controls how images and containers are stored and managed on your Docker host.

And how is that going?

Docker has various storage drivers. The only one (allegedly) wildly supported is AUFS.

The AUFS driver is unstable. It suffers from critical bugs provoking kernel panics and corrupting data.

It’s broken on [at least] all “linux-3.16.x” kernel. There is no cure.

…So, the docker guys wrote a new filesystem, called overlay.

“OverlayFS is a modern union filesystem that is similar to AUFS. In comparison to AUFS, OverlayFS has a simpler design, has been in the mainline Linux kernel since version 3.18 and is potentially faster.” — Docker OverlayFS driver

Note that it’s not backported to existing distributions. Docker never cared about [backward] compatibility.

…A filesystem driver is a complex piece of software and it requires a very high level of reliability. The long time readers will remember the Linux migration from ext3 to ext4. It took time to write, more time to debug and an eternity to be shipped as the default filesystem in popular distributions.

Making a new filesystem in 1 year is an impossible mission. It’s actually laughable when considering that the task is assigned to Docker, they have a track record of unstability and disastrous breaking changes, exactly what we don’t want in a filesystem.

Long story short. That did not go well. You can still find horror stories with Google.

Overlay development was abandoned within 1 year of its initial release.

Please note, no one is saying that the folks at Docker are stupid. Just the opposite. There is clearly real talent there. Even in my own social world, some of the smartest programmers that I know are devoted to Docker. But surely we should all stop and take a deep breath, and for a moment wonder how so much talent got spent on a project this useless? I’m seeing great brilliance expended to solve a problem that should never be solved. Or rather, if the problem is resource and dependency and configuration management, we should solve the problem by moving to those languages that support fat binaries. Because fat binaries solve the main problem, without creating additional problems.

There is the issue of orchestrating network resources, such as ports. That is an important issue. Docker initially failed to address this, and now it faces competition from Kubernetes. It seems to me possible that much of what Docker has done so far will eventually be forgotten, and the only thing that will matter in the end is the software they created that focused on the issue of orchestration. And this is an area where Docker lags behind:

Leave it to Google, the technology Behemoth with more container expertise than any organization on the planet, to supply the most compelling alternative, Kubernetes.

Two years after open sourcing the software and donating it to a new Cloud Native Computing Foundation (CNCF), Kubernetes (aka K8s to developers) has become the de facto standard container management and orchestration tool.

For those working in or following container technology, the emergence of Kubernetes as the standard workload orchestration platform is no surprise as the software has been winning developer mindshare, garnering endorsements and accumulating support from all the major cloud container services.

Data from container usage surveys is noisy and inconsistent, primarily due to the relatively small sample set of users and methodological differences. However most results show growing interest in orchestration platforms. For example, Datadog, a provider of application monitoring software, found that 40% of its customers running containers use an orchestration system, primarily AWS ECS and Kubernetes.

Over the last few years I’ve seen orchestration typically handled by stuffing configuration information into a system like etcd or consul. Some of these systems have clever features, like they only hold data for a few minutes, so the data needs to be constantly refreshed, and therefore failure to refresh can be read as a sign that some app has died. It’s like an inverse health check: the absence of data suggests a death. Apparently this is an area where Kubernetes (which grew out of Google’s internal Borg system) tries to offer an innovative approach:

IP-per-Pod. In Borg, all tasks on a machine use the IP address of that host, and thus share the host’s port space. While this means Borg can use a vanilla network, it imposes a number of burdens on infrastructure and application developers: Borg must schedule ports as a resource; tasks must pre-declare how many ports they need, and take as start-up arguments which ports to use; the Borglet (node agent) must enforce port isolation; and the naming and RPC systems must handle ports as well as IP addresses.

Thanks to the advent of software-defined overlay networks such as flannel or those built into public clouds, Kubernetes is able to give every pod and service its own IP address. This removes the infrastructure complexity of managing ports, and allows developers to choose any ports they want rather than requiring their software to adapt to the ones chosen by the infrastructure. The latter point is crucial for making it easy to run off-the-shelf open-source applications on Kubernetes–pods can be treated much like VMs or physical hosts, with access to the full port space, oblivious to the fact that they may be sharing the same physical machine with other pods.

I don’t know if Kubernetes is better at orchestrating network resources than a system built around etcd or consul, but it does come from Google, so I assume it is battle tested. It’s possible that a small startup only needs a simple setup via consul (or similar), and that Kubernetes is only worth the investment when a shop is at scale. Time will tell. I am keeping an open mind about the possibility that Kubernetes offers a unique benefit when it comes to orchestration.

But when it comes to managing dependencies or configuration or file systems or paths, I feel confident in saying that Docker is a complete waste of time. For every situation where some developers advocate Docker, I would instead advocate fat binaries.

[ [ UPDATE ] ]

There is a great conversation about this post over on Hacker News.

friend-monoid made a few points, some of which are good, but not entirely related to what I said above:

* If I can just take their work, put it in a container, no matter the language or style, my work is greatly simplified. I don’t have to worry about how to handle crashes, infinite loops, or other bad code they write.

That is only true if you can fix the app by restarting it after it has crashed. But if your system has been designed so that restarting apps will fix your problems, then Supervisord can handle that for you, and it is a lot simpler.
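For the restart-on-crash case, a Supervisord stanza of roughly this shape is all that is needed (the program name and paths here are invented for illustration):

```ini
[program:myapp]
command=/opt/myapp/bin/myapp
autorestart=true              ; restart whenever the process exits
startretries=3                ; give up after repeated immediate crashes
stdout_logfile=/var/log/myapp.log
```

One small config file per app, and the decades-old process supervisor does the rest — no container runtime required.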

* We have a whole lot of HTTP services in a range of languages. Managing them all with fat binaries would be a chore – the author would have to give me a way to set port and listen address, and I have to keep track of every way to set a port. With a net namespace and clever iptables routing, docker can do that for me.

* Its possible for me to namespace everything myself with unshare(1) and “ip netns” and cgroups and chroot and iptables… but that would consume all my available time. Docker can do that for me.

You can also do that with a Chef script or an Ansible script, and you’ll have the advantage of still sticking with fairly ordinary Linux tools, rather than introducing a whole new eco-system (I say a whole new eco-system because no one seems able to use Docker alone; they immediately run into problems, and so they start using more and more tools to manage those problems).

* When you reach more then 20 or so services to keep track of, you need tools to help you out.

Once you’ve got 4 or 5 services, or 4 or 5 servers, you should start thinking about standardizing them with something like Chef or Ansible. I don’t see the need for Docker.

* load balancing. Luckily, I don’t have to deal with extreme network loads which would require hand-made solutions, but just pushing up that number a little to do ad-hoc load balance makes things a lot easier.

As I said above, resource allocation is an important issue. That is what makes Kubernetes interesting. I’m not yet sure that Kubernetes is better than older approaches to resource allocation, such as using Consul or etcd or ZooKeeper, plus maybe Chef or Ansible scripts, but I am keeping an open mind. If the only argument for Docker is that containerization is necessary for Kubernetes, I don’t regard that as a very strong argument for Docker.

notyourday also had some good comments:

If my colleagues don’t have to understand how do deploy applications properly, their work is simplified greatly. If I can just take their work, put it in a container, no matter the language or style, my work is greatly simplified. I don’t have to worry about how to handle crashes, infinite loops, or other bad code they write.

Of course you do, you just moved the logic into the “orchestration” and “management” layer. You still need to write the code to correctly handle it. Throwing K8S at it is putting lipstick on a pig. It is still a pig.

and:

We have a whole lot of HTTP services in a range of languages. Managing them all with fat binaries would be a chore – the author would have to give me a way to set port and listen address, and I have to keep track of every way to set a port. With a net namespace and clever iptables routing, docker can do that for me.

Nope, you wrote a set of rules and as long as everyone adheres to those rules things kind of work ( in a clever way ). Of course if you had the same kind of rules written and followed in any other method, you would arrive at the exactly the same place. In fact, you probably would arrive at a better place because you would stop thinking that your application works because of some clever namespace and iptables thing.

and:

sometimes, I have to deploy and insecure app. Usually, it’s a badly configured memcache or similar. With net namespaces, I can make sure only a certain server has access to that service, and that the service cannot ruin my host server.

You may be able to guarantee this with a VM but you certainly cannot guarantee it with a container.

Source