
May 13th, 2018

In Technology

12 Comments











If you enjoy this article, see the other most popular articles









































Docker protects a programming paradigm that we should get rid of

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com, or follow me on Twitter.

Do you know about giraffes? In particular, their recurrent laryngeal nerve? Here is the deal: in fish the ancestral version of this nerve is a short, direct connection from the brain, passing behind the blood vessels of the gills. But as fish evolved into creatures that lived on land, those vessels were remodeled into the large arteries near the heart, deep in the chest. As that evolution happened, the nerve got pulled further and further into the body, because it was stuck looping around one of those arteries. The extreme case is the giraffe. Remember, this nerve is supposed to go from the brain to the larynx in the throat, yet now it has to go all the way down the neck, loop around the aortic arch, then travel all the way back up the neck to reach the larynx. It’s a huge biological waste. As Wikipedia says:

The route of the recurrent laryngeal nerve is such that it travels from the brain to the larynx by looping around the aortic arch. This same configuration holds true for many animals; in the case of the giraffe, this results in about twenty feet of extra nerve.

Some biologists refer to this as Incompetent Design. The problem is that nobody ever sat down to fully redesign creatures so that they could live on land. Rather, each living creature simply wanted to survive, and pass on its genes, and its children were just a little bit different compared to the parent. If God, or you, were to sit down today, knowing all that we now know about life on land, you could design a creature that didn’t have these kinds of obvious mistakes. But instead what happened was something much uglier: the design of the moment, the ancestors of the giraffe, saw small, incremental changes, which over time morphed into a new shape, full of obvious design errors.

Python grew up in the world of the 1990s, when a developer might work on the same server for many years. Servers were permanent. In that world, it didn’t seem like a problem if a library was installed globally. After all, the developer had years to get to know the various paths on that server, and years to set the environment variables to whatever their project needed. But that paradigm broke down in the new world of cloud computing. Servers became impermanent. And then Docker came along to help fix some of the problems that eco-systems such as Python faced in this new world of fast changing servers. Certainly, Docker helped a lot with paths and managing environment variables. For this reason, the use of Python tends to lead to the use of Docker, and the use of Docker encourages the use of Kubernetes. And in the end you have the recurrent laryngeal nerve of the giraffe. You end up with something enormously complex, that arose incrementally, by trying to keep alive some pre-existing system. But if you were to sit down and design something entirely new, knowing all that you know now, you could build something much cleaner and simpler than what Python/Docker/Kubernetes gives you.

Consider these comments in favor of Docker:

- An application will not mess with the configuration of another app (that’s solving the problem of virtualenv, rvm and apt incompatibilities).

Right, so that is why I wrote “Why would anyone choose Docker over fat binaries?”. Rather than use Ruby or Python, and rely on the path and variables of the underlying server, or operating system, why not use an uber binary that has no outside dependencies? Why not keep it clean and simple and isolated?

Let’s consider the counter-argument first. In 2009, Ryan Tomayko wrote “I Love Unicorn Because It’s Unix”:

Eric Wong’s mostly pure-Ruby HTTP backend, Unicorn, is an inspiration. I’ve studied this file for a couple of days now and it’s undoubtedly one of the best, most densely packed examples of Unix programming in Ruby I’ve come across…

We’re going to get into how Unicorn uses the OS kernel to balance connections between backend processes using a shared socket, fork(2), and accept(2) – the basic Unix prefork model in 100% pure Ruby.

We should be doing more of this. A lot more of this. I’m talking about fork(2), execve(2), pipe(2), socketpair(2), select(2), kill(2), sigaction(2), and so on and so forth. These are our friends. They want so badly just to help us.

Ruby, Python, and Perl all have fairly complete interfaces to common Unix system calls as part of their standard libraries. In most cases, the method names and signatures match the POSIX definitions exactly.
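The prefork model described in that quote can be sketched in Python as well, since its standard library wraps the same system calls. This is only a minimal illustration, not Unicorn's code: the function name prefork_demo, the one-connection-per-worker behavior, and the choice to let the parent act as the client are all assumptions made to keep the example small. It needs a Unix-like system, because it calls os.fork().

```python
import os
import socket


def prefork_demo(num_workers=3):
    """Fork workers that all call accept() on one shared listening socket.

    Hypothetical helper for illustration. Each worker serves exactly one
    connection, replies with its own PID, and exits.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(num_workers)
    address = listener.getsockname()

    worker_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child: it inherited the listening socket across fork(), so the
            # kernel hands each incoming connection to one waiting worker.
            try:
                conn, _peer = listener.accept()
                conn.sendall(str(os.getpid()).encode("ascii"))
                conn.close()
            finally:
                os._exit(0)  # leave without running the parent's cleanup
        worker_pids.append(pid)

    # Parent: play the client so the sketch is self-contained.
    replies = []
    for _ in range(num_workers):
        with socket.create_connection(address) as client:
            chunks = []
            while True:
                chunk = client.recv(64)
                if not chunk:  # worker closed the connection
                    break
                chunks.append(chunk)
            replies.append(int(b"".join(chunks).decode("ascii")))

    for pid in worker_pids:
        os.waitpid(pid, 0)  # reap the workers
    listener.close()
    return worker_pids, replies


if __name__ == "__main__":
    pids, answered_by = prefork_demo()
    print("workers:", sorted(pids))
    print("answered by:", sorted(answered_by))
```

Note how much of the work is done by the operating system here: the script never decides which worker gets which connection. That is the convenience the quote praises, and also the dependence on the host that the rest of this essay is about.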

This was the attitude that drove Ruby and Python and Perl during the last 30 years: that the operating system was powerful and a lightweight scripting language should exist partly to offer a convenient wrapper over OS calls. Larry Wall said that Perl was appropriate for any app that was too big for Bash but didn’t need to be in C. That is a huge space. And Ruby and Python and Perl all were built with the idea that the developer should rely on the OS as much as possible. This was brilliant in the 90s, but it means that the apps written in these languages tend to be dependent on file paths and environment variables and user paths and OS permissions. The app is heavily dependent on the overall context of the machine, because the app is supposed to be a lightweight wrapper around all the functionality already provided by the OS.

This paradigm used to be brilliant but it becomes pathological when it tries to transition to cloud computing.

There is a devops joke that says that when you’ve got a handful of servers you name them like pets: bob, alice, li, lo. But when you’ve got hundreds of servers, you simply number them, like widgets coming off the assembly line at the widget factory.

The paradigm that Ryan Tomayko praises is well suited to a world where the servers are named like pets. But calling fork() and join() from your Ruby app, when you’ve got a thousand instances of your Ruby app running on a thousand servers, is a very bad idea. At that point you need higher level frameworks for dealing with concurrency.

I loved Ryan Tomayko’s essay at the time and I sent it to all of my friends. I wanted everyone to understand and appreciate it. It influenced how I wrote Ruby. But as I worked on bigger and bigger systems, I realized that this paradigm needs to die. fork() and join() cannot control the concurrency of your system when your system is spread across a large number of servers. To handle these bigger systems, many new frameworks have emerged. Ruby programmers can now use Celluloid, an actor framework “which lets you build multithreaded programs out of concurrent objects just as easily as you build sequential programs out of regular objects.” Developers who write Scala are in love with Akka, which appears to be very good. In the world of Clojure, Michael Drogalis has led the way with the Onyx framework. And the Go language has very good primitives for certain kinds of concurrency, and it has the Circuit framework for large scale distributed computing.
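To give a rough feel for what “concurrent objects” means in an actor framework, here is a minimal single-machine sketch in Python. It is not the API of Celluloid, Akka, or Onyx, and it does not distribute anything across servers. The class name CounterActor and its methods are invented for illustration. The only idea it shows is that the object owns its state and other threads talk to it by sending messages to a mailbox.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: owns a running total, reachable only by messages."""

    _STOP = object()  # sentinel message that tells the actor to shut down

    def __init__(self):
        self._mailbox = queue.Queue()
        self._total = 0
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, amount):
        # Callers never touch _total directly; they only enqueue a message.
        self._mailbox.put(amount)

    def stop(self):
        # The mailbox is FIFO, so everything sent before stop() is processed
        # before the actor's thread exits.
        self._mailbox.put(self._STOP)
        self._thread.join()
        return self._total

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._total += message  # only the actor's own thread mutates state


if __name__ == "__main__":
    actor = CounterActor()
    senders = [
        threading.Thread(target=lambda: [actor.send(1) for _ in range(100)])
        for _ in range(4)
    ]
    for sender in senders:
        sender.start()
    for sender in senders:
        sender.join()
    print(actor.stop())
```

Because only the actor's own thread ever changes the total, the callers need no locks. The frameworks named above build on that idea and add supervision, routing, and distribution on top of it.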

Some people have said to me, “With Docker, I can run two instances of my Python app, or 5, or even 20, on the same host, and I can automate how many instances are running, so as to scale up the number of instances based on how much traffic/demand I need to deal with.” Okay, awesome. So Docker helps manage concurrency? And this is important with Python because Python has historically had a difficult time handling concurrency (the global interpreter lock, or GIL, keeps threads from running Python bytecode in parallel), and even now, Python programmers tend to spin up new processes, rather than using something they consider ambiguous, such as threads. But if that is your need, why not use a language/eco-system that has first-class support for concurrency? There are older, mature options, such as Java and C# and Erlang, and there are many newer options, such as Go or Elixir or Clojure.

What is Docker for? You can take your old Python and Ruby and Perl apps and wrap them up in Docker, and thus those old apps can make the transition to the modern world of cloud computing. In that sense, Docker allows you to take apps developed with a paradigm from the 1990s, and deploy them in 2018. The folks working with Python and Ruby and Perl (and PHP) are jealous of the way a Java programmer can create an uberjar, and they are jealous of the way a Golang programmer can create a binary that has no outside dependencies. So the Python programmer, and the Ruby and Perl and PHP programmer, turn to Docker, which allows them to create the equivalent of an uberjar. But if that is what they want, maybe they should simply use a language that supports that natively, without the need for an additional technology?

Many people regard this as one of the greatest things about Docker, but I regard the entire effort as an example of what is wrong with the tech industry. We suffer an unwillingness to confront the reality of the emerging situation, and commit to new paradigms that are well adapted to the new situation. Instead we commit to very complex technologies that allow us to wallow in the past. This is “conservative” in the negative sense: rigid, nostalgic, reactionary.

I know a great many developers are going to dismiss this blog post, but as a thought experiment, you might want to consider two different companies. One spends the next 5 years committing to those languages and eco-systems that have been built for the era. The other spends the next 5 years using Docker so they can keep using script languages from the 1990s. Now it is the year 2023, and a crisis happens at both companies. Which of those two companies do you think will be more ready to adapt to the crisis?

[ [ UPDATE 2018-06-14 ] ]

A friend and I just spent an hour trying to get a short Python script running on an EC2 instance. We got stuck dealing with this error:

ModuleNotFoundError: No module named 'MySQLdb'

The EC2 instance was running Python 2.7 by default. Thinking we needed to use pip3 for this install, we upgraded to Python 3 and pip3. But we still got the same error. We tried a few other things.
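One common cause of this kind of error is that pip installed the package for a different interpreter than the one actually running the script. A minimal sketch for checking that, using only the standard library, might look like the following. The function name describe_module is made up for illustration, and the sketch only reports what the current interpreter can see. It does not install or fix anything.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether *this* interpreter can find a top-level module.

    Hypothetical helper for illustration; `name` should be a top-level
    module name such as "MySQLdb".
    """
    spec = importlib.util.find_spec(name)  # None when the module is absent
    return {
        "interpreter": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        "module": name,
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_module("MySQLdb").items():
        print(key, "=", value)
```

If the module is reported as missing, running pip through the same interpreter (python3 -m pip install ...) at least guarantees that the package lands where that interpreter will look. On Python 3 the MySQLdb module is normally provided by the mysqlclient package, which in turn needs the MySQL client libraries and a compiler on the host, and that is exactly the kind of dependence on the underlying machine this essay is complaining about.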

Eventually, my friend said, “Hey, let me take this home and write a real install script, and we can try to run this in a few days. Maybe I can build this in Docker.”

Of all the forces that currently push Docker forward, I suspect that the Python community is the strongest. And that is because the dependency management in the Python community is so badly broken.

Compare the Python community with the Java community. At no point in the last 10 years have I had a Java project where I ran into the kinds of dependency management problems that I run into, routinely, with Python.

And again, many of the problems that Python faces go back to that idea that Python should rely on the underlying machine, and the underlying OS, a set of ideas which the whole tech industry is now trying to get away from.

I absolutely understand why you want to use Docker, if you are working with Python. Because Python is broken. But you owe it to yourself, and your company, to consider that the time you invest in Docker might be better spent moving away from Python.

.

.

[ [ UPDATE 2018-07-09 ] ]

The following happened today. This is exactly the kind of thing that Docker is supposed to protect us from, and it can’t even get this right. At the very least, Docker is supposed to offer a consistent development environment for every developer. To fail at this is to fail at what used to be the core argument in favor of Docker. Really pathetic.

This is me and a co-worker, trying to reconcile our different parts of a Python app:






.

.

[ [ UPDATE 2019-07-07 ] ]

A great conversation on Hacker News:

https://news.ycombinator.com/item?id=20371961

crdoconnor made a very good comment, responding to someone else who’d suggested that Docker standardized things:

There are many arguments I can think of for using docker but “standard” is a pretty poor one. Directories, bash scripts and package managers are also arguably “standard” and sufficient to install most software. We should care primarily about alleviating deployment pain – not about argumentum ad popularum.

“Shipping container standardisation” is not a metaphor for what docker is, it’s marketing. Docker, of course wants to be a “standard” product. Every product wants to be that.

In my experience, it’s been more like a buggy kludge to deal with applications that have isolation issues with their dependencies and unnecessary overhead for applications that don’t. All with a sprinkling of marketing hype.

cassianoleal digs in their heels and pushes back against my argument with this:

So, someone who doesn’t understands his tools blames the tools for the fact that somehow they need to keep using a language they don’t like (and don’t understand either, it seems).

Nothing of what he complains about is Docker’s fault or even Python the language’s fault.

…It would do the author good to do some soul searching and perhaps understand that not all problems are nails, where the best tool to deal with is a hammer.

In the entire above essay, I never said one single critical thing about Docker, so what does it mean to say that this is not “Docker’s fault”? I’ve suggested that Docker is an excellent way of handling some of the problems with Python. I did not accuse Docker of anything, unless one is referring to the problems mentioned in the Slack conversation that I have posted. The Slack conversation details problems that would not have existed if we were using some other VM that was not Docker.

I would ask cassianoleal to consider what I said up above:

I know a great many developers are going to dismiss this blog post, but as a thought experiment, you might want to consider two different companies. One spends the next 5 years committing to those languages and eco-systems that have been built for the era. The other spends the next 5 years using Docker so they can keep using script languages from the 1990s. Now it is the year 2023, and a crisis happens at both companies. Which of those two companies do you think will be more ready to adapt to the crisis?

cies wrote:

The author compares docker-based deploys to deploying straight on top of the OS. And looks at it from a “application programming” perspective.

I argue docker-based deploys should not be looked at from a “application programming” perspective, but from a “dev ops” perspective. It is docker vs puppet/ansible. Not docker vs akka.

To me docker is a big step fwd from provisioning OSes with puppet/ansible and deploying apps on top…

In response, I’ll say “Use Terraform”. I try to keep my essays to less than 4,000 words because I’ve found that if they get longer than that then no one will read them. So I cannot cover every argument in one essay. I have elsewhere suggested that Terraform gives all of the benefits that we need from virtual machines:

Docker Is The Dangerous Gamble Which We Will Regret

At some point I’ll publish some of my easy-to-get-going Terraform code, to make clear why I think it is better to stick with Terraform, which allows you to work with technologies that you already know, such as basic Linux, rather than use Docker, which then pulls you into a complicated eco-system of new technologies (Kubernetes, Rancher, Docker Swarm, registries).

tomohawk wrote:

His point about fat binaries is a good one. After using languages with lots of dangling parts such as Java and Ruby, moving to Go was amazing on the deploy end.

…With a fat binary you only have to manage that once, and deployment becomes super simple. We actually put off using containers for quite a while because what’s the point of putting a single binary into a container?…

This is precisely the argument I make here:

Why would anyone choose Docker over fat uber binaries?

.

Also, great conversation on Lobste.rs about this essay.

.

[ [ UPDATE 2019-07-09 ] ]

I’m grateful to Andrew MacGinitie for pointing out some typos, which I’ve now fixed.









12 COMMENTS

May 15, 2018
12:11 am

By Agam Brahma

Great article, so true @ “unwillingness to confront the reality of the emerging solution”.

Minor typo: s/Onxy/Onyx/

May 15, 2018
10:32 am

By lawrence

Agam Brahma, thank you. I have now fixed the typo.

July 6, 2019
2:52 pm

By Alireza Bashiri

Awesome reading.


July 6, 2019
5:33 pm

By Scott Smith

It’s true, though all any good encapsulation strategy can be used to encapsulate bad things. Teams I’ve been on that use Scala wrap it in docker containers too, even though it can do its own concurrency.

A docker container seems to have fixed the problem that a Unix process has virtualized access to RAM and file descriptors, but not the filesystem itself, and this gives us a higher order “process”.

This solves the problem that two brilliantly written fat executables on my laptop can read each other’s files, even if one is a video gam and the other is managing my banking interactions.

Let’s keep docker and still get on to improving how we code.

July 6, 2019
7:17 pm

By Mark

You’ve very well described a long known and well-recognized ‘issue’ with Docker (OK, maybe not by everybody!) and one widely used by the ‘serverless’ crowd to justify that direction of application Cloud architecture.

Sort of like ‘fork-lifting’ to the Cloud, one would hope that this use of Docker to simply shift existing conventional (90s ‘old-school’) apps to the Cloud (or a distributed OS like Kubernetes) is just an interim step to newer more effective software design/architectures. But for most conservative businesses that is probably wishful thinking and a ways off depending upon the competitive environment.


August 18, 2019
7:40 pm

By Michael L

If you weren’t criticizing Docker, why did your frame the article as anti the Docker/Kubernetes/Python troika? What is it with geeks that they have to write a blog post about the total failure of technologies X,Y,Z whenever they don’t scratch their particular itch? Guess what? You can be right and wrong. Maybe it’s not right for you. If you aren’t willing to spend 2 minutes researching how to install the MysqlDb module, don’t use Python. Others will use it, and it will work well for them. How about find tools you like and write about using them?

August 18, 2019
7:48 pm

By Michael L

You think that containerization is going anywhere? I agree that it isn’t strictly necessary, but you mistake your perspective for the truth. It’s here to stay, at least until it’s supplanted by a superior paradigm. Fat binaries aren’t a full replacement for a fully orchestrated environment. Does that fat binary contain the deployment logic for the application in a specific organization? Nope. You should be careful to understand that there are multiple valid viewpoints out there. When you confuse your own for the only valid argument, you only come off as myopic.

I have a background in operations, with a focus on VMs (I was “lucky” enough to start on an IBM mainframe so I had virtualization before Intel). I see how containers are largely optional in the abstract. However, expecting anyone to regret the time that they spent containerizing their apps, and completely eliminating the back and forth with ops since everything was already included, is absurd. It won’t happen. don’t hold your breath.

I don’t care if this gets published, btw. Feel free to delete it.

August 18, 2019
8:00 pm

By Michael L

To build on my last statement, I’m not trying to show that I’m “smarter.” I’m probably not, or if I am, who gives a fuck? Comment sections suck, and I don’t want to drag down yours. But I do feel like you’re being shortsighted here. It’s not so simple. Docker is being left behind, although largely by other runtimes and technologies that mimic it. K8s is powerful, albeit not necessarily friendly. But have you tried to encapsulate how to run a datacenter via a set of automated routines?

Also, your angst against Python doesn’t make a strong point. It makes you look impatient to anyone remotely familiar with that language.

Anyway, I didn’t give you an actual email address with which to contact me so I should be more nice. I have no doubt that you’re skilled, and I’m not trying to peacock my own views as superior to your own. Please delete my comments, they were meant for you, no one else.

August 18, 2019
8:34 pm

By Michael L

Me again. I’ve worked for a company that focuses on containerized applications for some time now. There is absolutely no strong correlation between Python and Docker. Sure, you can run Python, but it isn’t a preferred mechanism for doing so. Dude, you either have a really specific and dumb group of friends or you have constructed one hell of a straw man.

Python is one of the runtimes that my company works with in containers, but it isn’t dominant. Did you ever consider that your premise is just absolutely, fucking, incorrect? It is. I’ve preferred Python for many years, much longer than I’ve worked with containers. And I never heard anything about “you need to run Docker to run Python, what with it’s terrible dependency managment” or whatever.

Have you ever considered that you suck at Python? That’s fine. Not everyone needs to know the same language. But you are either dramatizing your experiences, or you’re fucking terrible at it. don’t use Python. It’s ok. If someone tells you that you should, tell them to fuck off. What’s the point of this fucking post? It’s embarrassing.
