Docker is the dangerous gamble which we will regret

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com, or follow me on Twitter.

Also read: “My final post regarding the flaws of Docker and Kubernetes and their eco-system”

Summary: don’t use Docker, or any other container technology. Use Terraform and Packer instead. It’s one less level of virtualization, and therefore one less level of complexity.

There is perhaps one good argument for using Docker. It is hidden by the many bad arguments for using Docker. I’m going to try to explain why so much Docker rhetoric is stupid, and then look at the one reason that might be good.

Every time I criticize Docker I get angry responses. When I wrote “Why would anyone choose Docker over fat binaries?” six months ago, I saw some very intelligent responses on Hacker News, but also some angry ones. So in writing this current essay, I am trying to answer some of the criticism I expect to get in the future.

But I guess I am lucky because so far I have not gotten a reaction as angry as what The HFT Guy had to face when he talked about his own failed attempt to use Docker in production at the financial firm where he works:

I received a quite insulting email from a guy who is clearly in the amateur league to say that “any idiot can run Docker on Ubuntu” then proceed to give a list of software packages and advanced system tweaks that are mandatory to run Docker on Ubuntu, that allegedly “anyone could have found in 5 seconds with Google“.

On the Internet, that kind of anger is normal. I don’t know why, but it is. Many developers get angry when they hear someone criticize a technology which they favor. That anger gets in the way of their ability to figure out the long-term reality of the situation.

Docker promises portability and security and resource management and orchestration. Therefore the question that rational people should want to answer is “Is Docker the best way to gain portability and security and resource management and orchestration?”

I’m going to respond to some of the responses I got. Some of these are easy to dismiss. There is one argument that is not easy to dismiss. I’ll save that for the end.

One person wrote:

Because choosing Docker requires boiling fewer oceans, and whether those oceans should or should not be boiled has no bearing on whether I can afford to boil them right now.

Okay, but compared to what? Having your devops person write some scripts to standardize the build, the deployment, the orchestration, and the resource use? The criticism seems to imply “I don’t want the devops person to do this, because the result will be ad-hoc, and I want something standardized.”

Docker wins because developers and managers see it as offering something less custom, less chaotic, less ad-hoc, and more standardized. Or at least, having the potential to do so. The reality of Docker has been an incredible mess so far (see Docker in production: a history of failure). But many are willing to argue that all of the problems will soon be resolved, and Docker will emerge as the stable, consistent standard for containerization. This is a very large gamble. Nearly every company that’s taken this gamble so far has ended up burned, but companies keep taking this gamble on the assumption it is going to pay off big at some point soon.

Every company that I have worked with, over the last two years, was either using Docker or was drawing up plans to soon use Docker. They are implicitly paying a very high price to have a standardized solution, rather than an ad-hoc build/deploy/orchestrate script. I personally have not yet seen a case where this was the economically rational choice, so either companies are implicitly hoping this will pay off in the long-run, or they are being irrational.

I use the word “implicitly” because I’ve yet to hear a tech manager verbalize this gamble explicitly. Most people who defend Docker talk about how it offers portability or security or orchestration or configuration. Docker can give us portability or security or orchestration or configuration, but at a cost of considerable complexity. Writing an ad-hoc script would be easier in most cases.

The best articles about Docker emphasize the trade-offs that one makes by choosing to use it:

It’s best to think of Docker as an advanced optimization. Yes, it is extremely cool and powerful, but it adds significantly to the complexity of your systems and should only be used in mission critical systems if you are an expert system administrator that understands all the essential points of how to use it safely in production.

At the moment, you need more systems expertise to use Docker, not less. Nearly every article you’ll read on Docker will show you the extremely simple use-cases and will ignore the complexities of using Docker on multi-host production systems. This gives a false impression of what it takes to actually use Docker in production.

In the world of computer programming, we have the saying “Premature optimization is the root of all evil.” Yet most of my clients this year have insisted “We must Dockerize everything, right from the start.” Rather than build a working system, and then put it in production, and then maybe see if Docker offers an advantage over simpler tools, the push has been to standardize the development and deployment around Docker.

A common conversation:

Me: “We don’t need Docker this early in the project.”

Them: “This app requires Nginx and Postgres and Redis and a bunch of environment variables. How are you going to set all that up without Docker?”

Me: “I can write a bash script. Or “make”. Or any other kind of installer, of which there are dozens. We have all been doing this for many years.”

Them: “That’s insane. The bash script might break, and you’ll have to spend time debugging it, and it won’t necessarily work the same on your machine, compared to mine. With Docker, we write the build script and then it’s guaranteed to work the same everywhere, on your machine as well as mine.”

Like all sales pitches, this is seductive because it leads with the most attractive feature of Docker. As a development tool, Docker can seem less messy and more consistent than other approaches. It’s the second phase of Docker use, when people try to use it in production, where life becomes painful.
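For what it’s worth, here is a minimal sketch of the kind of bash script I mean. The package names, service names, and environment variables are hypothetical, and you would adapt them to your own distribution and your own app:

#!/usr/bin/env bash
# dev-setup.sh -- hypothetical sketch of a one-shot dev environment script.
# Assumes a Debian/Ubuntu machine; adjust package names for other distros.
set -euo pipefail

# install the services the app depends on
sudo apt-get update
sudo apt-get install -y nginx postgresql redis-server

# make sure they are running and start on boot
sudo systemctl enable --now postgresql redis-server nginx

# write the environment variables the app expects (names are made up)
cat > .env <<'EOF'
DATABASE_URL=postgres://localhost:5432/myapp_dev
REDIS_URL=redis://localhost:6379
APP_ENV=development
EOF

echo "Development environment ready."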

Your dev team might have one developer who owns a Windows machine, another who runs a Mac, another who has installed Ubuntu, and another who has installed RedHat. Perhaps you, the team lead, have no control over what machines they run. Docker can seem like a way to be sure they all have the same development environment. (However, when you consider the hoops you have to jump through to use Docker from a Windows machine, anyone who tells you that Docker simplifies development on a Windows machine is clearly joking with you.)

But when you go to production, you will have complete control over what machines you run in production. If you want to standardize on CentOS, you can. You can have thousands of CentOS servers, and you can use an old technology, such as Puppet, to be sure those servers are identical. The argument for Docker is therefore weaker for production. But apparently having used Docker for development, developers feel it is natural to also use it in production. Yet this is a tough transition.

I can cite a few examples, regarding the problems with Docker, but after a certain point the examples are boring. There are roughly a gazillion blog posts where people have written about the flaws of Docker. Anyone who fails to see the problems with Docker is being willfully blind, and this essay will not change their mind. Rather, they will ignore this essay, or if they read it, they will say, “The Docker eco-system is rapidly maturing and by next year it is going to be really solid and production ready.” They have said this every year for the last 5 years. At some point it will probably be true. But it is a dangerous gamble.

Despite all the problems with Docker, it does seem to be winning — every company I work with seems eager to convert to Docker. Why is that? As near as I can tell, the main thing is standardization.

Again, from the Hacker News responses to my previous essay, “friend-monoid” wrote this defense of Docker:

We have a whole lot of HTTP services in a range of languages. Managing them all with [uber] binaries would be a chore – the author would have to give me a way to set port and listen address, and I have to keep track of every way to set a port. With a net namespace and clever iptables routing, docker can do that for me.

notyourday wrote the response that I wish I’d written:

Of course if you had the same kind of rules written and followed in any other method, you would arrive at the exactly the same place. In fact, you probably would arrive at a better place because you would stop thinking that your application works because of some clever namespace and iptables thing.

anilakar wrote this response to notyourday:

I think that the main point was that docker skills are transferable, i.e. you can expect a new hire to be productive in less time. Too many companies still have in-house build/deploy systems that are probably great for their purpose but don’t offer valuable experience that would be usable outside that company.

And as near as I can tell, this is 100% why Docker is winning. Forget all the nonsense you read about Docker making deployment or security or orchestration easier. It doesn’t. But it is emerging as a standard, something a person can learn at one company and then take to another company. It isn’t messy and ad-hoc the way a custom bash script would be. And that is the real argument in favor of Docker. Whether it can live up to that promise is the gamble.

At the risk of being almost petty, I should point out that these arguments confuse containers with Docker. And I think many pro-Docker people deliberately confuse the issue. Even if containers are a great idea, Docker is driven forward by a specific company with specific problems. Again from The HFT Guy:

Docker has no business model and no way to monetize. It’s fair to say that they are releasing to all platforms (Mac/Windows) and integrating all kind of features (Swarm) as a desperate move to 1) not let any competitor have any distinctive feature 2) get everyone to use docker and docker tools 3) lock customers completely in their ecosystem 4) publish a ton of news, articles and releases in the process, increasing hype 5) justify their valuation.

It is extremely tough to execute an expansion both horizontally and vertically to multiple products and markets. (Ignoring whether that is an appropriate or sustainable business decision, which is a different aspect).

In the meantime, the competitors, namely Amazon, Microsoft, Google, Pivotal and RedHat all compete in various ways and make more money on containers than Docker does, while CoreOS is working an OS (CoreOS) and competing containerization technology (Rocket).

So even if you believe containers are a fantastic idea because they make everyone’s setup consistent, Docker itself remains a dangerous gamble.

But okay, let’s treat Docker and containers as somewhat the same thing for now. Of the criticisms that were thrown at my earlier essay, which criticisms were valid?

One mistake I made in that earlier essay was using the phrase “fat binary.” That led to a lot of confusion. After a few hours I added this disclaimer:

In this essay, I use the phrase “fat binary” to refer to a binary that has included all of its dependencies. I am not using it to refer to the whole 32 bit versus 64 bit transition. If I was only writing about the world of Java and the JVM, I would have used the word “uberjar” but I avoided that word because I also wanted to praise the Go language and its eco-system.

I wish I’d used the phrase “uber binary” which might be a little bit better, though it is a biased phrase, as it shows how much I’ve worked in the JVM world. But it’s the best I can think of, so I’ll use “uber binary” in this essay.

I wish developers were more willing to consider the possibility that their favorite computer programming language may not be ideal for a world of distributed computing in the cloud. Apparently I’m shouting into the wind on this issue. Developers feel strongly that the world needs to adapt to their PHP code, or their Ruby code, or their Python code. It’s never their code that needs to adapt to a changing world.

If you are starting a new project today, and you expect it to grow large enough that you will have to worry about scale, or you simply want it to be highly available, you have the option to use a modern language that has been designed for cloud computing. There are many new languages that have some wonderful features. The only two that I have experience with are Go and Clojure. I don’t have much experience with Rust or Scala or Kotlin, so I cannot say much about them. Maybe they are wonderful (in my previous essay, many readers seemed to think I was insulting the languages that I did not mention; I don’t mean to insult these languages, but I can only praise the languages that I’ve had some exposure to). Everything I’ve seen and read about Scala’s Akka framework makes me think there are a lot of good ideas there. I have not used it, but it seems smart and modern.

Responding to my earlier essay, btown wrote:

Anyone who thinks that all modern web applications are made in Golang or on the JVM is in a pretty weird echo chamber.

Again, it’s great that there are so many languages out there, but I don’t know all of them, and I can only meaningfully praise the ones I’ve had some experience with. But I can also meaningfully criticize the older languages that I’ve worked with, and that includes many years working with PHP, then Ruby, and more recently Python. They arose from an old paradigm, one not suited to a world of microservices and distributed computing in the cloud. I wrote about this in Docker protects a programming paradigm that we should get rid of. Nobody seems to be listening to this point right now. I’m reminded of the mania for Object Oriented Programming, which peaked around the year 2000. At that time, it was almost impossible to speak out against that paradigm. The tech industry considers itself open minded, but in fact it is full of movements that gather momentum and shut down all competing conversations for a few years, then recede, and then it becomes acceptable for all of us to poke fun at how excessive some of the arguments were. In 2000 the excesses were XML and Object Oriented Programming. Nowadays it is Docker.

I have used Clojure a lot. Writing an app and creating an uberjar that bundles up all the dependencies seems like a very wise step. And I know how to set up a system such as Jenkins, so my Clojure builds are automated. And I know that some companies have gone to incredible extremes in terms of building apps that have no outside dependencies, including the astonishing step of bundling the database inside the uberjar. Consider “60,000% growth in 7 months using Clojure and AWS“:

This led to Decision Two. Because the data set is small, we can “bake in” the entire content database into a version of our software. Yep, you read that right. We build our software with an embedded instance of Solr and we take the normalized, cleansed, non-relational database of hotel inventory, and jam that in as well, when we package up the application for deployment.

We earn several benefits from this unorthodox choice. First, we eliminate a significant point of failure – a mismatch between code and data. Any version of software is absolutely, positively known to work, even fetched off of disk years later, regardless of what godawful changes have been made to our content database in the meantime. Deployment and configuration management for differing environments becomes trivial.

Second, we achieve horizontal shared-nothing scalabilty in our user-facing layer. That’s kinda huge. Really huge.

If you can get this kind of massive scaling without having to introduce new technologies (such as Docker), then you should do so. Solve your problems in the simplest way you can. If switching away from Ruby/Python/Perl to a newer language and eco-system allows you to achieve massive scale with fewer technologies and fewer moving parts, then you absolutely have a professional obligation to do so. Again, the ideal is to achieve your goals in the simplest way possible, and most of the time this means using the fewest technologies. Inspired by Rich Hickey, I would contrast “simple” with “easy”. Using an old language that you already know is easy, whereas learning a new language that allows you to reduce the total number of technologies in your system is hard but simple. “Simple” here means that your system ends up being simpler than it would be otherwise — that is, it has less code, less configuration, or fewer technologies in use.

I know, from previous essays, that as soon as I mention Jenkins, some people will suggest that my mindset is out of date, but Sometimes Boring Is Better:

The nice thing about boringness (so constrained) is that the capabilities of these things are well understood. But more importantly, their failure modes are well understood. […] But for shiny new technology the magnitude of unknown unknowns is significantly larger, and this is important.

In other words, software that has been around for a decade is well understood and has fewer unknowns. Fewer unknowns mean less operational overhead, which is a good thing.

I’ve learned that many developers have strong biases, and when they read essays like this they tend to be looking for an excuse to dismiss the whole essay. So if I mention Jenkins or Ansible or Go or Clojure or Kotlin or Akka or any other tech, and they know of a flaw with any of those technologies, they go “This guy is stupid, so I can ignore this essay.” I don’t know of any way to reach those people, other than putting in these disclaimers, and even these disclaimers probably won’t convince those who really don’t want to be convinced.

Regarding my earlier essay, tytso wrote:

And the statement that the Go language is the pioneer for fat binaries is, well, just wrong. People were using static binaries (with, in some cases, built-in data resources) to solve that problem for literally decades. MacOS, for one.

I apologize for any confusion, but the idea I am trying to communicate is a binary file that contains all dependencies, plus all necessary configuration, plus any resource that you can possibly put in it, if putting that resource in simplifies the overall system. The concept is somewhat broader than simply linking static libraries. Modern continuous integration systems can be configured to ensure that each binary is given specific configuration information, which might be unique to that particular instance of the binary, so even if you need a thousand instances of the same app, you can build a thousand variants with slight variations of the configuration. You can do this using slightly older build systems, such as Jenkins, which are well understood and which are boring in all of the good ways. You should be careful about jumping up to the level of complexity of Docker. (And please don’t obsess over the fact that I used Jenkins in this example; if you prefer Travis, TeamCity, or Bamboo, then use those. I am often stunned at developers’ willingness to dismiss a whole essay because they didn’t like one technology whose name is used merely as an example.)
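To make that concrete, here is a hedged sketch of the kind of build step a CI server could run. I am assuming a Leiningen project that reads resources/config.edn at startup; the directory layout and file names are placeholders:

#!/usr/bin/env bash
# build-variants.sh -- hypothetical CI step that bakes per-instance config
# into separate uberjars. Assumes a Leiningen project that reads
# resources/config.edn at startup; adjust paths for your own layout.
set -euo pipefail

mkdir -p builds

for cfg in config/instances/*.edn; do
  name=$(basename "$cfg" .edn)

  # bundle this instance's configuration as a resource inside the jar
  cp "$cfg" resources/config.edn

  lein clean
  lein uberjar

  # the exact jar path depends on your project.clj; this is a placeholder
  cp target/uberjar/*-standalone.jar "builds/myapp-${name}.jar"
done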

A few people said this:

Docker protects against the danger of vendor lock-in on the part of the cloud providers

This is bunk. Any devops tool that standardizes deployment protects you from vendor lock-in. And there is an abundance of such tools.

friend-monoid wrote:

If my colleagues don’t have to understand how do deploy applications properly, their work is simplified greatly. If I can just take their work, put it in a container, no matter the language or style, my work is greatly simplified. I don’t have to worry about how to handle crashes, infinite loops, or other bad code they write.

notyourday’s response sums up my own attitude:

Of course you do, you just moved the logic into the “orchestration” and “management” layer. You still need to write the code to correctly handle it. Throwing K8S at it is putting lipstick on a pig. It is still a pig.

Moreover, if you are at a small company with only three developers, you will have to deal with each other’s code, which might be code that crashes. If you are at a large company, where the devops team is separate from the programming team, this is an issue that devops has been dealing with for many years, typically with health checks of some kind. And the health checks still need to be written; that is something Docker has not standardized. You, the developer of the app, need to create some endpoint that returns a 200 response that the devops team can test regularly. It is bogus to mention Docker in this context, since it contributes nothing. Many devops teams have scripts that test whether an app is alive, and if it seems to be non-responsive, the app is killed and restarted.
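Here is a hedged sketch of the kind of health-check script I mean, assuming the app exposes a /health endpoint on port 8000 and runs as a systemd unit named “myapp”; both assumptions are mine, and neither comes from Docker:

#!/usr/bin/env bash
# healthcheck.sh -- hypothetical watchdog a devops team might run from cron.
# Assumes the app exposes GET /health and runs as a systemd unit "myapp".
set -u

URL="http://127.0.0.1:8000/health"

# curl -f fails on any non-2xx response
if ! curl -fsS --max-time 5 "$URL" > /dev/null; then
  echo "$(date -Is) myapp failed its health check, restarting" >> /var/log/myapp-watchdog.log
  systemctl restart myapp
fi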

Lazare wrote:

Let’s say I’m working on an existing code base that has been built in the old-style scripting paradigm using a scripting language like Ruby, PHP, or (god help us) node.js.

…I can just about see how we can package up all our existing code into docker containers, sprinkle some magic orchestration all over the top, and ship that.

I can also see, as per the article, an argument that we’d be much better off with [uber] binaries. But here’s the thing: You can dockerise a PHP app. How am I meant to make a [uber] binary out of one? And if your answer is “rewrite your entire codebase in golang”, then you clearly don’t understand the question; we don’t have the resources to do that, we wouldn’t want to spend them on a big bang rewrite even if we did, and in any case, we don’t really like golang.

In this example a company has had a PHP app for a long time, and now it needs to Dockerize that app. Why is this? What has changed such that the app needs to be Dockerized now? This feels like an artificially constrained example. How did the app work before you Dockerized it? What was the problem with the old approach, that you feel that Docker will solve?

Is the end goal orchestration of this app with other (perhaps newer) apps? Docker plus orchestration generally means Docker plus Kubernetes. (Or you could pursue a more unusual choice, using Mesos or Nomad or some other alternative, rather than Kubernetes.) This is a very complex setup, and a company should think long and hard before committing to this path. Read Is K8s Too Complicated? If your app is small, then a complete re-write might be a good option, and if your app is large, ask yourself if there is a simpler path forward for your company, such as writing some Chef or Ansible scripts.

I once worked with a massive monolithic app that had been written in PHP, using the Symfony framework. It suffered terrible performance issues. We began to very gradually pull it apart, keeping the Symfony monolith for the HTML template rendering, but pulling out the real performance bottlenecks and re-writing them in more performant languages. I wrote about this in An architecture of small apps. At that time, the devops crew was using Puppet plus some custom scripts to handle deployments. And that was enough for us. And it had the wonderful benefit of using boring, stable technologies.

Remember, you only have a finite amount of time. Whatever time you spend Dockerizing your PHP code is time you are not modernizing your app in other ways. Be sure that investment is worth it.

A curious fact is that the apps I’ve seen where orchestration is needed are not the ones people bring up in examples when discussing Docker. I’ve seen long running data analysis scripts that are run on Spark, and then Nomad was used for orchestration. I’m aware of a massive system where data (a terabyte a day) is added to Kafka, then it goes to Apache Storm and then to ElasticSearch. That system has a complex set of health checks and monitoring but for the bulk of the work, Storm itself is the orchestration tool. Web apps need to be dealing with massive amounts of data before they need massive orchestration. Twitter deals with massive amounts of data, and uses Aurora for orchestration. Are you Twitter scale? If you are using Docker and Kubernetes for an ordinary website, then please stop. There are simpler ways to run a website. We’ve been building websites for 25 years now, and we didn’t need Docker.

mwcampbell wrote:

“It seems sad that so much effort should be made to keep old technologies going”

I strongly disagree with this part. To make progress as a technological civilization, without constantly wasting time reinventing things, we need to keep old technologies working. So, if Docker keeps that Rails app from 2007 running, that’s great. And maybe we should still develop new apps in Rails, Django, PHP, and the like. It’s good to use mature platforms and tools, even if they’re not fashionable.

That word “fashionable” brings me to something that really rubs me the wrong way about this piece, and our field in general. Can we stop being so fashion-driven? It’s tempting to conflate technology with pop culture, to assume that anything developed during the reign of grunge music, for example, must not be good now. But good technology isn’t like popular music; something that was a good idea and well executed in 1993 is probably still good today.

This is a wildly ironic comment. Apparently sober realists think we should Dockerize everything, whereas crazy people like me think we should use older devops tools, combined with newer languages. My choice is driven by “fashion” whereas their love of Docker is driven by a desire “to make progress as a technological civilization.”

We should use older, boring technologies as long as they can still do their job well without inflicting additional costs because of the paradigm they bind us to. However, when there are significant changes in the way technology works, we should ask ourselves whether there are some technologies that are no longer the correct choice for the new circumstances. In particular, the shift to cloud computing, and the rise of microservices that run in the cloud, should force us to rethink which technologies we choose to use. The guiding rule should be “What is the simplest way to do what we need to do?” If the older technology gets the job done, and is the simpler approach, then it should be preferred. But if there is a new technology that allows us to simplify our systems, then we should use the new technology.

chmike wrote:

Containers are not only a solution for dependencies. It’s also protection boundary.

neilwilson replied:

It’s just a process with a fancy chroot. Don’t believe all the docker hype. Sensible admins have been doing something similar for years. We just didn’t have a massive PR budget

I’ve nothing to add to that.

Above, I asked “Why not use an uber binary that has no outside dependencies?” You could respond, “That is exactly what Docker is! It is a big fat binary that contains everything the app needs! And it isn’t language dependent! You can use it with Ruby, PHP, Python, Java, Haskell, everything!”

All of which is true, though I would recommend that a company look for chances where it can consolidate the number of technologies that it uses, and perhaps use modern languages and eco-systems that can do this kind of uber binary natively.

A great many people argue for Docker with the assumption that they have no power to affect the technologies in use at their company. The assumption is that the company is automatically going to use a heterogeneous mix of technologies, including some old ones that are not well suited to distributed computing in the cloud. And so Docker is the band-aid that hides the penalty the company pays for not using a language and eco-system suited to distributed computing in the cloud.

This kind of passiveness can destroy a company in the long run. I don’t favor chasing the latest fashions in tech, but I do favor a constant reassessment of what is best for the company, with a view to how the overall landscape of computing is changing. Passive acceptance of legacy apps that become pain points will slow the company over time, and when the day finally arrives when a legacy app can no longer be kept alive, the re-write will be more dangerous for the company, because it will have to be a complete re-write. It is better if a company looks for ways to pull apart pieces of legacy apps and modernize them. Indeed, one of the most important aspects of microservices is that they allow the piecemeal, incremental modernization of old apps. I wrote about this in The Coral Reef Pattern of incremental improvement.

Above, I mentioned an analytics firm, where data (a terabyte a day) is added to Kafka, then it goes to Apache Storm and then to ElasticSearch. This firm was strongly committed to Python for a long time. As they ran into performance issues, they looked to use Python concurrency systems, such as Tornado, to build massive concurrency into their system. They gave this project to a very intelligent engineer, previously from Intel, and they gave him 3 months to build the new system. Utter failure was the result. They could not get the performance they needed, and even Tornado failed to give them the level of fine-grained concurrency and parallelism they were looking for. Finally, they confronted the idea that they could not use Python for everything. They are now examining Go and Elixir as languages that might give them what they need. (I believe there is a bit of sadness at this company — they had been idealists regarding Python, true Pythonistas.)

I approve of this reappraisal, but I think it should happen constantly.

This is the strongest argument for Docker (written by pmoriarty):

You could make pretty much the exact same set of complaints against all those configuration management tools (ansible/chef/salt/cfengine/puppet). They’re all a huge mess of spaghetti and hackery that works when they work and can be a nightmare otherwise.

All these tools need at least a couple of decades more to mature.

That is true.

Do you need a recipe for running WordPress? Ansible has that, but so does Docker. Likewise, Docker has what you need for running Drupal, or MySQL, or hundreds of other tasks. Docker has somewhat caught up, in terms of offering default setups for common devops needs.

If Chef or Ansible were more mature then the argument for Docker would be much weaker. I knew of a startup, in 2013, that was focused on building a framework for Chef (their framework aimed to be sort of the Ruby On Rails of devops). They ran into some problems, and also they got so much lucrative devops work that they ended up getting distracted. But even if they fail, something similar to what they were working on might one day succeed. Such a framework would have the advantage of relying directly on OS features, without doing as much as Docker to mask the OS.

Both Chef and Ansible promised that there would soon be thousands of scripts for all of the common devops tasks, for every possible type of machine. They have failed to fully deliver on this promise. As late as 2005 it still seemed normal that each company would have a devops person who would write custom scripts for all of the devops tasks in the company. By 2010 it was clear that there should be a central store of recipes for common tasks, much more specific than what had been offered by the old Perl CPAN libraries, and focused especially on the issues of consistency (to help with portability) and security and resource management and orchestration. And Chef took off, and then a little later Ansible took off, and Docker was only a few years behind. And many developers felt that Docker finally delivered some of the things that Chef and Ansible had promised, though for a long time Docker could only deliver for local development. As late as 2015, trying to use Docker in production was suicide. And even the companies that tried to use Docker in production in 2016 ran into an inferno of pain. But clearly, over the last year, things have gotten a bit more stable.

Docker strikes me as a direction that one day will be seen as a mistake. The strongest arguments for it are that it might be a standard if it can mature, and it offers a bandaid for many of the other failures that the tech industry is currently suffering from. Those are bad reasons to love Docker.

I suspect that 5 years from now, looking back, it’ll be clear that there was a less complex way to standardize devops for distributed computing in the cloud. But for now, Docker is winning everywhere.

[ [ UPDATE ] ]

There is a good conversation on Hacker News about this essay.

My favorite comment was from cm2187:

I can’t help thinking that Docker is building a massive legacy application headache further down the line when all these applications will not be developed anymore, the developers will have moved on, the technology stack moved on too to new shiny tools, but the underlying OS needs to be updated and these legacy apps still need to run. The compatible base image will probably not even be available anymore. Some poor guy is going to have a horrible time trying to support that mess.

[ [ UPDATE of 2018-05-17 ] ]

I received this question in an email:

It seems that sometimes you see value in systems like Kubernetes (and maybe Istio layered on top?) and at other times you seem to suggest a hand-knit series of bash scripts to simply copy files over to servers.

I apologize if I expressed myself in a confusing way. I do think that Kubernetes is a powerful and useful technology, but it only works with Docker containers, so you need to commit to Docker before you can use Kubernetes. I’d like Kubernetes without Docker, and that is what Nomad tries to be. It lacks some features but offers more flexibility.

When I’m writing essays, I occasionally point out that a simple bash script could be used for most tasks. I don’t think that is the best way to go, but I bring up the idea to remind folks that sometimes very simple approaches can still be made to work. And we should all ask ourselves, given a simple but non-standard approach, versus a complicated but structured approach, is the structured approach worth the extra complexity?
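To be concrete about what “very simple” looks like, this is roughly the shape of the deploy script I have in mind, assuming an uber binary and a systemd unit on the target hosts; the host names, paths, and unit name are all placeholders:

#!/usr/bin/env bash
# deploy.sh -- hypothetical "copy the binary and restart" deploy script.
# Assumes passwordless ssh to the hosts and a systemd unit named "myapp".
set -euo pipefail

BINARY="build/myapp"
HOSTS=(app1.example.com app2.example.com)

for host in "${HOSTS[@]}"; do
  echo "deploying to $host"
  scp "$BINARY" "deploy@${host}:/opt/myapp/myapp.new"
  ssh "deploy@${host}" 'sudo mv /opt/myapp/myapp.new /opt/myapp/myapp && sudo systemctl restart myapp'
done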

At a philosophical or perhaps architectural level, I prefer the set of tools being developed by Hashicorp.

Rather than use Docker and Kubernetes, the system that appeals to me is:

Nomad

Consul

Terraform

When I say that Docker is confusing, some developers question my intelligence. But I’m not saying Docker is impossible — obviously a lot of people work with it. I would suggest, however, that people ask which approach is simpler: Docker/Kubernetes, or ordinary binaries running on ordinary Linux, with something like Nomad and Consul handling the orchestration?
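For readers who have not used that stack, the day-to-day workflow looks roughly like the following. This is a sketch, not a tutorial, and the file names are hypothetical:

# A rough sketch of the HashiCorp-style workflow, running ordinary
# binaries on ordinary Linux. File names are hypothetical.

# 1. Bake a machine image with the binary already on it
packer build webapp.json

# 2. Stand up (or update) the infrastructure that runs those images
terraform apply

# 3. Ask Nomad to schedule the service; Consul handles discovery and health
nomad job run webapp.nomad

# 4. Check what is running and what Consul knows about it
nomad job status webapp
consul catalog services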

[ [ UPDATE 2018-05-28 ] ]

Also see “Revisiting Using Docker” which is a look at the technical problems with Docker. Their conclusion:

Having seen various groups use Docker and having spent a lot of time in the trenches battling technical problems, my conclusion is Docker is unsuitable as a general purpose container runtime. Instead, Docker has its niche for hosting complex network services. Other uses of Docker should be highly scrutinized and potentially discouraged.

[ [ UPDATE 2018-06-14 ] ]

I currently have a client that is making a major push to Dockerize everything. Some questions have come up. For instance, they realized it can be a security risk to have the app run as root inside of Docker. So we decided to use gosu to help with this issue. There was substantial conversation about how best to use gosu, and whether it solves all the security issues. Mind you, the question here is “How can we run an app as something other than the root user?” This is a problem that Unix solved 40 years ago, but the question is new and fresh again, thanks to Docker.

This is just one example of the many ways that Docker brings in complexity that eats up the team’s time.
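For contrast, here is what the 40-year-old Unix answer looks like on a plain Linux host. The user name and paths are hypothetical, and a systemd unit’s User= directive accomplishes the same thing:

#!/usr/bin/env bash
# run-as-appuser.sh -- hypothetical sketch of the plain-Unix answer to
# "don't run the app as root". User name and paths are made up.
set -euo pipefail

# one-time setup: a dedicated, unprivileged system user
sudo useradd --system --no-create-home --shell /usr/sbin/nologin appuser
sudo chown -R appuser:appuser /opt/myapp

# run the app as that user
sudo -u appuser /opt/myapp/bin/server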

————————–

From Sean Hull in private email:

Terraform can do everything up until the machine is running. So things post boot, *could* be done with Ansible. That said another route is to BAKE your AMIs the way you want them. There’s a tool called “packer” also from hashicorp. It builds AMIs the way docker builds images. In that model, you can simply use terraform to deploy a fully baked AMI. If your configuration changes, you re-bake and redeploy that new AMI.

Barring that, it’s terraform for everything up until boot, and then Ansible for all post-boot configuration.

Sean Hull thinks Docker is a very powerful and important technology, but he suggested this bit with baking the AMIs as another way to go.


[ [ UPDATE 2018-06-20 ] ]

We can agree that “immutable deployments” sounds like a good thing, and so does “infrastructure as code”. But don’t let those slogans sell you on Docker. If you are looking for a sane alternative to the Docker approach, check out Using HashiCorp’s “Packer” tool to “bake” your AMI for Immutable AWS Deployments:

AMIs and Immutable Deployments

An AMI can be thought of as the lowest level of control for a deployment. Going up the chain, there are application servers, with application artifacts (RPM, WAR, node package) at the highest level. The goal is to achieve immutability by bundling the artifact in the AMI (baking in) so the instances created from this AMI would need no other configuration, and completely identical instances can be spun up in an automated manner. The logs on this instance can be monitored by pushing them out to a central logging service, such as Splunk or Loggly, or a home grown ELK setup. There should absolutely be no need to ssh into any of these instances.

Packer

Packer is a tool to automate building of AMIs (and other targeted Machine Images) from configuration files, or templates. The templates are JSON files containing basic information about the AMI to be created, and can contain inline scripts to provision software on the resulting AMI. Any instances launched from this AMI will have the exact same configuration – from the Operating System to the deployed code. It is ideal to not have any ssh access on these instances either to achieve a much higher, or absolute level of immutability. That may take a few iterations and maturity cycles though.
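To give a feel for what that looks like in practice, here is a hedged sketch of a minimal Packer template and the commands that bake the AMI. The region, source AMI, and provisioning script are placeholders, and credentials are assumed to come from the usual AWS environment variables:

#!/usr/bin/env bash
# bake-ami.sh -- hypothetical sketch: write a minimal Packer template and
# build an AMI from it. All IDs, names, and scripts are placeholders.
set -euo pipefail

cat > webapp.json <<'EOF'
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "ami-0123456789abcdef0",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "webapp-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "script": "install-webapp.sh"
  }]
}
EOF

packer validate webapp.json
packer build webapp.json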

[ [ UPDATE 2018-06-23 ] ]

I’m currently working for a client where everything is Python and everything needs to be Dockerized. So I’m developing a Python app and Dockerizing it. I’m also responsible for the continuous integration build system, for which I’m using Jenkins.

If I do “docker-compose up” and run the unit tests, they work from the CLI. But if I do “docker run” with the CMD set to run the unit tests, I get:

ModuleNotFoundError: No module named ‘zeron’

In other words, the CWD is different when I use “docker run” to run the tests from Jenkins, versus when I “docker-compose up” a container and then get shell on that container.

So what do I do? I add some debugging info, re-run “docker build”, wait a few minutes for it to build, then run “docker run”, then look at the results, then add some more debugging info, then run “docker build”, then do “docker run”, then look at the results, then…
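Spelled out, the loop looks roughly like this (the image name is a placeholder):

# the debugging loop described above, roughly
docker build -t myapp .      # wait a few minutes
docker run --rm myapp        # the CMD runs the unit tests; watch them fail
# add a print statement, then repeat the two commands above, again and again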

What does that resemble? It exactly resembles the work cycle of a compiled language 15 years ago, circa 2003, before we had hot reloading. And it was exactly that time-consuming work cycle that gave compiled languages a bad name, and caused programmers to get interested in scripting languages such as PHP and Python and Ruby. Nowadays, of course, if you develop on the JVM (Java, Scala, Clojure, Kotlin, JRuby and so many others) there are many tools that support hot reloading when you make a change, so you no longer need to recompile.

But I now need to compile Python, following a slow development pattern that I thought we had all escaped more than 10 years ago.

The stacktrace tells me the error is happening here:

File "/usr/local/lib/python3.6/site-packages/django/__init__.py", line 19, in setup

I’d love to open that file and look at it, and maybe print out the variables that exist on that line, but of course, this is the world of Docker, so nothing like that is possible. That particular Django file does not exist on my machine; it’s in an image in a Docker repo somewhere. I’d have to build the image, get shell in the container, and then look at the code, which is a lot more effort than just opening a file in Emacs or running “cat”.
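For completeness, the Docker-flavored way to peek at that file is roughly the following, assuming the image builds and contains a shell. It is exactly the extra ceremony I am complaining about:

# compare opening a local file in Emacs, or running "cat", with this:
# (assumes the image does not set an ENTRYPOINT that would swallow the command)
docker build -t myapp .
docker run --rm myapp cat /usr/local/lib/python3.6/site-packages/django/__init__.py

# or poke around interactively inside the container
docker run --rm -it myapp /bin/bash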

But Docker helps with development?

I am entirely certain this project would have been done a month ago if we were not fighting against Docker every step of the way.

[ [ UPDATE 2018-06-24 ] ]

On Hacker News, in the responses to this blog post, there is a strange comment from the user “endymi0n” who describes themselves as “Cofounder & CTO – JustWatch.” Dominik Raute is the Cofounder & CTO of JustWatch, so I assume this comment is from Raute:

The point is: Even if you’re using single, static binaries (like we do in 90% of cases with Go), the big benefit of Docker is that you don’t just get a binary, you get a service interface.

I can’t stress this point enough. If you don’t know what’s up with a failing service, any SRE can go into the Dockerfile definition (actually, we’re rarely dealing with Dockerfiles anymore, these days it’s more the Kubernetes definition), look at which ports are exposed, what the environment variables are, what the service is talking to, etc.

You can harden security between those services much tighter than between binaries, by dropping capabilities, seccomp, network policies, read-only file-systems.

This is literally a request for the standardization of configuration files. I think that is something we can all support 100%. But there is absolutely no need to use Docker for this. Rather, it seems that Raute is indulging the single worst vice of the technology industry — looking for a technological solution to a social problem.

Here is a simple example of a Dockerfile for a simple NodeJS app:

Add the following to a file called Dockerfile in the project directory:

# use a node base image
FROM node:7-onbuild

# set maintainer
LABEL maintainer "miiro@getintodevops.com"

# set a health check
HEALTHCHECK --interval=5s \
            --timeout=5s \
            CMD curl -f http://127.0.0.1:8000 || exit 1

# tell docker what port to expose
EXPOSE 8000

It would be great if the software industry could get together and agree on this kind of format for standardizing all the different aspects of configuring apps. I could see the Apache Foundation taking a leadership role on this. They could encourage the tech leads of Apache-sponsored projects to adopt a standard, and that standard could spread through the tech industry. It would be another example of Apache helping the tech industry get its act together. I could also see Google potentially playing a leadership role here.

Whoever leads, it is a great idea and I’d like to see more progress on this. It would be great if 5 years from now, whenever I have a question about how an app is configured, I can simply look at the config file, which will be in a universally agreed upon location, and written in a universally agreed upon format, and covering a universally agreed upon list of variables.

That would be awesome. It does not require any new technology. It certainly doesn’t require Docker. It just needs some industry coordination, of the type that’s given us the many other standards we depend upon (such as Unicode and TCP and IP and UDP and Ethernet and email and HTTP and all the others).

Regarding the other capabilities that Raute mentions:

You can harden security between those services much tighter than between binaries, by dropping capabilities, seccomp, network policies, read-only file-systems.

None of these things come from Docker; all of them are part of the underlying operating system. On a Unix system, you can use chroot to isolate a process to a directory, or use the ulimit command to limit the resources available to a user:

User limits - limit the use of system-wide resources.

Syntax
      ulimit [-acdfHlmnpsStuv] [limit]

Options

   -S   Change and report the soft limit associated with a resource. 
   -H   Change and report the hard limit associated with a resource. 

   -a   All current limits are reported. 
   -c   The maximum size of core files created. 
   -d   The maximum size of a process's data segment. 
   -f   The maximum size of files created by the shell (default option) 
   -l   The maximum size that can be locked into memory. 
   -m   The maximum resident set size. 
   -n   The maximum number of open file descriptors. 
   -p   The pipe buffer size. 
   -s   The maximum stack size. 
   -t   The maximum amount of cpu time in seconds. 
   -u   The maximum number of processes available to a single user. 
   -v   The maximum amount of virtual memory available to the process. 

You have all this power at your disposal without ever using Docker. Standardizing where user limits are set, or what scripts should be run before an app starts, is a great idea. But using Docker adds a layer of complexity that gives nothing in return. When you use Docker, your system suddenly has a new technology that you and your devs have to worry about, and meanwhile you have not gained a single new ability that you didn’t have before.
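As a small, hedged example of using those built-in facilities, a launch wrapper can apply resource limits with plain ulimit before starting the app; the numbers and paths are arbitrary:

#!/usr/bin/env bash
# start-myapp.sh -- hypothetical launcher that applies resource limits with
# plain ulimit before starting the app. Numbers and paths are arbitrary.
set -euo pipefail

ulimit -n 4096       # max open file descriptors
ulimit -u 512        # max processes for this user
ulimit -v 2097152    # max virtual memory, in kilobytes

exec /opt/myapp/bin/server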

[ [ UPDATE 2019-01-19 ] ]

Further complaints, worth noting, over on Reddit:

I hate docker because it’s a security mess. I still use docker because it does provide other tangible benefits and I mitigate the security issues.

[ [ UPDATE 2019-01-20 ] ]

A current article on Hacker News about failures with Kubernetes:

It’s not for everyone and it has significant maintenance overhead if you want to keep it up to date _and_ can’t re-create the cluster with a new version every time. This is something most people at Google are completely insulated from in the case of Borg, because SRE’s make infrastructure “just work”. I wish there was something drastically simpler. I don’t need three dozen persistent volume providers, or the ability to e.g. replace my network plugin or DNS provider, or load balancer. I want a sane set of defaults built-in. I want easy access to persistent data (currently a bit of a nightmare to set up in your own cluster). I want a configuration setup that can take command line params without futzing with templating and the like. As horrible and inconsistent as Borg’s BCL is, it’s, IMO, an improvement over what K8S uses.

Most importantly: I want a lot fewer moving parts than it currently has. Being “extensible” is a noble goal, but at some point cognitive overhead begins to dominate. Learn to say “no” to good ideas.

Unfortunately there’s a lot of K8S configs and specific software already written, so people are unlikely to switch to something more manageable. Fortunately if complexity continues to proliferate, it may collapse under its own weight, leaving no option but to move somewhere else.

Note, the title of my essay is “Docker is the dangerous gamble which we will regret”. I chose the word “gamble” because there is a double-or-nothing quality to investment in this infrastructure. You have to go double-or-nothing at each stage, which is the real risk. Docker didn’t solve everything? Try Kubernetes! But wait, Kubernetes didn’t solve everything? Maybe try it with Rancher! Or give up and go to Mesos?

I’ve worked with startups that have invested more and more money in an attempt to get the assumed benefits of containers. The double-or-nothing aspect is what had me worried when I originally wrote this essay. If you want security and flexibility, there are less complex, easier ways to get it.

Post external references

  1. https://news.ycombinator.com/item?id=15578147
  2. https://thehftguy.com/2017/02/23/docker-in-production-an-update/
  3. https://thehftguy.com/2016/11/01/docker-in-production-an-history-of-failure/
  4. https://hvops.com/articles/docker-misconceptions/
  5. https://en.wikiquote.org/wiki/Donald_Knuth
  6. http://www.colinsteele.org/post/27929539434/60000-growth-in-7-months-using-clojure-and-aws
  7. https://www.infoq.com/presentations/Simple-Made-Easy
  8. https://medium.com/production-ready/sometimes-boring-is-better-d16d38214186
  9. https://code-maze.com/top-8-continuous-integration-tools/
  10. http://jmoiron.net/blog/is-k8s-too-complicated/
  11. https://www.hashicorp.com/blog/running-apache-spark-on-nomad
  12. https://thenewstack.io/twitters-aurora-replaces-operating-systems-stateless-services/
  13. https://github.com/ansible/ansible-examples/tree/master/wordpress-nginx
  14. https://hub.docker.com/_/drupal/
  15. https://news.ycombinator.com/item?id=17062288
  16. https://www.hashicorp.com/
  17. https://blog.online.net/2016/09/14/build-your-infrastructure-with-terraform-nomad-and-consul-on-scaleway/
  18. https://gregoryszorc.com/blog/2018/05/16/revisiting-using-docker/
  19. https://github.com/tianon/gosu
  20. https://www.iheavy.com/
  21. https://lobster1234.github.io/2017/04/23/packer-your-AMIs-for-immutable-aws-deployments/
  22. https://getintodevops.com/blog/building-your-first-docker-image-with-jenkins-2-guide-for-developers
  23. https://ss64.com/bash/ulimit.html
  24. https://www.reddit.com/r/devops/comments/8j9yrn/docker_is_the_dangerous_gamble_which_we_will/
  25. https://news.ycombinator.com/item?id=18953647