
June 6th, 2019

In Philosophy

1 Comment

Nils Meyer: there are advantages to containers, but fairly easy to get wrong

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com

Here is a comment that Nils Meyer left on LinkedIn, in response to something I said. I agree that I could see some use for Docker/Kubernetes in a non-virtual world; the irony is that I’ve only seen Docker/Kubernetes used in virtual setups.

I would agree that using Packer and treating the VM like a container (or rather a pod, in Kubernetes parlance) is the easier approach. It’s also somewhat bizarre that a tarball of multiple tarballs, built by shell scripts concatenated with “&&”, is now considered the pinnacle of software packaging. EDIT: There is a lot to dislike about Docker: the container runtime, the image format, the standard way to build images, the use of outdated lowest-common-denominator technology (iptables, anyone? And why oh why must we use IPv4 and NAT?), feature creep, etc. Containers are a neat idea but Docker leaves a lot to be desired.

There are advantages to using containers, especially with an additional orchestration layer, and especially when NOT running on virtualization (most container deployments probably are to VMs). It is very difficult to get there, though, and it requires a lot more work on infrastructure, so that’s not a step to be taken lightly. It’s also fairly easy to get wrong, with disastrous consequences for security: very often I see people pull in containers and code from all manner of sources, and many of the “official” container images have known vulnerabilities.

For context, see my earlier essay, Docker Is The Dangerous Gamble Which We Will Regret.

[ [ UPDATE 2019-06-10 ] ]

Nils Meyer added a comment to this blog post, which I’m incorporating into the blog post itself:

To elaborate a bit on my remarks: Many organizations use containers and container orchestration without having a defined and valid use case and without the necessary skillset and manpower in house to manage a complex setup. It’s extremely easy to get up and running with containers, setting up Kubernetes is also extremely easy when you’re just using kops. There are of course hosted solutions as well. This can be very deceptive since you skip ahead on the learning curve.

A risk therein is that people don’t actually understand what they have built, especially when it’s mostly developers with little Linux background setting it up. There is a lot of technology involved that you should understand when running a complex setup: networking, routing, network overlays, the Linux distributions you’re using, overlay filesystems as well as the underlying filesystems, and proxying. If you’re using a proprietary cloud, you should have an understanding of how those components work as well.

The risk is that you’ll get a lot of rope to hang yourself with, since re-use of components is very easy. For example, you can pull in a lot of stuff from Docker Hub and other container registries, but there is no quality assurance or curation there (like you would get with Python core modules, for example) – some recent research found that many security issues which had already been fixed upstream were still present in “official” Docker images, simply because the underlying OS layer wasn’t updated.

So to do this properly you would end up building your own container images, which means a lot of duplicated effort. Since you often end up running different Linux distributions you’ll need to know how to manage those as well. You will need a CI/CD system for that. You’ll want your own container registry. You’ll want to run vulnerability scans on your containers and have alerting when an image is vulnerable.
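As a rough illustration of the alerting idea (an editorial sketch, not part of Meyer’s comment): a real scanner compares installed packages against a vulnerability database, which does not fit in a few lines. The Python below covers only the simpler signal of an image whose base layer has not been rebuilt recently. The function name `stale_images`, the 30-day threshold, and the image names are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def stale_images(images, now, max_age=timedelta(days=30)):
    """Return the names of images whose base layer is older than max_age.

    `images` maps an image name to the UTC datetime at which its base
    layer was last rebuilt. The 30-day default is an arbitrary assumption.
    """
    return sorted(
        name for name, rebuilt in images.items() if now - rebuilt > max_age
    )


# Hypothetical inventory; these names and dates are placeholders.
now = datetime(2019, 6, 10, tzinfo=timezone.utc)
images = {
    "app/web": datetime(2019, 6, 1, tzinfo=timezone.utc),
    "app/worker": datetime(2019, 1, 15, tzinfo=timezone.utc),
}

for name in stale_images(images, now):
    print(f"ALERT: {name} has not had its base layer rebuilt in over 30 days")
```

A check like this would sit in the CI/CD system alongside a proper scanner; it only catches images that nobody has rebuilt, not specific vulnerabilities.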

Once you have a large orchestration layer it becomes more and more difficult to get things to run similarly on developers’ machines. You’ve already lost when developers run an OS that doesn’t natively support containers, and most of your developers probably don’t run Linux.

You need to be able to debug a container build – this can be especially annoying with Dockerfiles due to the layered approach. If the list of commands you chained together with && \ fails, you’ll have some trouble trying to fix it. This is of course true of other packaging systems to a certain degree as well. Wouldn’t it be great if, instead of having every command create a new layer, you could just create a layer explicitly (just like database transactions)?
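To make the layering point concrete, here is a small Python sketch (an editorial illustration, not Meyer’s code) that counts how many layers a Dockerfile would produce. It assumes, following Docker’s build guidance, that only RUN, COPY and ADD add filesystem layers. The helper names `logical_lines` and `count_layers` and both sample Dockerfiles are made up for the example.

```python
# Instructions assumed to add a filesystem layer, per Docker's build guidance.
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}


def logical_lines(dockerfile_text):
    """Yield Dockerfile instructions with backslash continuations joined."""
    buffer = ""
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith("\\"):
            # Instruction continues on the next line: accumulate it.
            buffer += line[:-1].rstrip() + " "
            continue
        yield buffer + line
        buffer = ""
    if buffer.strip():
        yield buffer.strip()


def count_layers(dockerfile_text):
    """Count the instructions that would each create a new layer."""
    return sum(
        1
        for line in logical_lines(dockerfile_text)
        if line.split()[0].upper() in LAYER_INSTRUCTIONS
    )


SEPARATE = r"""
FROM debian:stable
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
"""

CHAINED = r"""
FROM debian:stable
RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*
"""

print(count_layers(SEPARATE))  # three RUN instructions, three layers
print(count_layers(CHAINED))   # one chained RUN instruction, one layer
```

The chained form keeps the image to a single layer, but it also means a failure anywhere in the chain leaves no cached intermediate state to inspect, which is the debugging annoyance described above.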

If you’re in a strongly regulated business you’ll want to be able to audit what software in what version and under which license you’re running at any given time. That also means you need to keep old container images around but be able to prevent their use.
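One way to picture that requirement is an inventory that never forgets an image but can refuse to deploy it. The Python below is a minimal editorial sketch under that assumption, not something from Meyer’s comment. `ImageRecord`, `ImageInventory`, the digests, and the package data are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """What an auditor would need to know about one image (hypothetical schema)."""
    digest: str
    packages: dict = field(default_factory=dict)  # name -> (version, license)
    blocked: bool = False


class ImageInventory:
    """Keeps every image on record, but lets old ones be barred from use."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[record.digest] = record

    def block(self, digest):
        # The record is kept for audits; only deployment is refused.
        self._records[digest].blocked = True

    def may_deploy(self, digest):
        record = self._records.get(digest)
        return record is not None and not record.blocked

    def images_with(self, package, version):
        """Digests of all recorded images shipping `package` at `version`."""
        return sorted(
            digest
            for digest, record in self._records.items()
            if record.packages.get(package, (None, None))[0] == version
        )


# Placeholder data, invented for the example.
inventory = ImageInventory()
inventory.register(ImageRecord("sha256:aaa", {"openssl": ("1.0.2", "OpenSSL")}))
inventory.register(ImageRecord("sha256:bbb", {"openssl": ("1.1.1", "OpenSSL")}))
inventory.block("sha256:aaa")

print(inventory.may_deploy("sha256:aaa"))
print(inventory.images_with("openssl", "1.0.2"))
```

The blocked image still shows up in `images_with`, so the audit trail survives even though `may_deploy` now refuses it.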

Once you have all this you can do some pretty cool things. For example, you can run optimized builds of your software; you can use newer versions of libraries only where you need them, instead of running a completely bleeding-edge distro; you can achieve far better utilization with containers on given hardware; containers scale very fast; and virtual machines suffer from some unique CPU bugs that aren’t as high impact with containers. Some of these benefits you’ll realize a lot more easily by running on bare metal, but few do this.

All of this doesn’t even take into account the most difficult thing: storing data. At a certain point you have to store data, and usually, due to limitations in networking and storage systems, this can’t be very elastic, and it gets very difficult if you have certain requirements for durability.

Source



1 COMMENT

June 10, 2019
8:21 pm

By Sean Hull

Some great points. Especially the one about storage systems. Using a lot of microservices encourages breaking up your database, so that each service has its own. When you do that you have to put an API in front of it, so that all the other services can get at such data. You can see things getting hairy already.

What about network overhead? Longer code paths? How do you do joins across different APIs? How do you back up multiple databases sitting behind different services at the same point in time? How do you restore them together?
