Systems of process orchestration

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com

I get that Linux cgroups might be acceptable, in the sense that “perfect is the enemy of the good” and cgroups are maybe good enough for now. I’d prefer it if the industry could rethink what a multi-user computer is, get rid of Unix, take the best ideas forward into a new system, and build something where clunky add-ons like cgroups and Docker are not needed. It should be possible to have a “fat process” which does what Docker does, and that should be one of the primitives of the system.

But clearly that is not going to happen during the next 10 years, so for now we are stuck with Docker.

Docker’s complexity sucks, but it does facilitate orchestration, which is a real problem that deserves serious attention. Again, to speak for a moment about the perfect, it would be nice to have a fat process that has more than stdin and stdout and stderr — it should also have stdconfig, to facilitate orchestration. And if it also had a standard byte frame (I can’t think of a reason why this is impossible) then it would be better than Docker, and less complex. (The standard byte frame, combined with standard network sockets, would also remove the need to use HTTP.)

But orchestration remains an issue, and for now most systems are designed to work with Docker. Systems that don’t rely on Docker, such as Mesos, are extremely complex, because they are trying to orchestrate every imaginable kind of app. So, to be clear, I would love to see the standardization that Docker brings, but I’d like to see it as a primitive that goes down to the hardware level, not an extra feature built on top of an extra feature that has been built on top of Unix. There are only so many features you can bolt onto a system before a complete rethink of the foundations is needed. You can start with a simple foundation if you are creating a building that is one, two, or three stories tall, but if you are building a 90-story skyscraper, you need to rethink the depth of the foundations.

So here are some current interesting options:

Docker Swarm

Docker Swarm is Docker’s own tool for cluster management and orchestration. It was introduced into Docker Engine as “swarm mode” with the Docker 1.12 update, which added multi-host and multi-container orchestration to the Docker Engine. Administrators and software developers can create and manage a virtual system known as a “swarm,” composed of one or more Docker nodes. You can connect directly to the Docker API, which gives you access to native tools such as Docker Compose. Container deployments are typically handled via Docker Compose or the Docker command line. Docker claims that the software can handle up to 30,000 containers and clusters of up to 1,000 nodes without any dip in performance.
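As a rough sketch of what this looks like in practice, a service can be described in a Compose-format file and deployed to a swarm with `docker stack deploy`. The service name, image, and replica count below are placeholders, not anything from the Docker docs:

```yaml
# docker-stack.yml -- a minimal service definition for swarm mode.
# "web" and the nginx image are placeholders; "replicas" tells the
# swarm how many copies of the container to keep running.
version: "3"
services:
  web:
    image: nginx:stable
    deploy:
      replicas: 3
      restart_policy:
        condition: on-failure
    ports:
      - "8080:80"
```

On a node that has run `docker swarm init`, this would be deployed with something like `docker stack deploy -c docker-stack.yml mystack`, and the swarm takes responsibility for keeping three replicas running.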

Kubernetes

The Google-designed Kubernetes is an open-source system for Docker container management and orchestration. Kubernetes uses a single master server that manages multiple nodes through the command-line interface kubectl. In Kubernetes, the basic unit of scheduling is a “pod”: a group of typically one to five containers that are deployed together on a single node in order to execute a particular task. Pods are temporary; they may be created and deleted at will while the system is running. Higher-level concepts such as Deployments are constructed as sets of pods. Users can set up custom health checks on each pod, including HTTP checks and container execution checks, to ensure that applications are operating correctly.
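To make the pod and health-check ideas concrete, here is a minimal sketch of a single-container pod manifest with both kinds of check mentioned above. The pod name, image, port, and probed paths are placeholder assumptions:

```yaml
# pod.yaml -- one pod, one container, two health checks.
apiVersion: v1
kind: Pod
metadata:
  name: web-pod
spec:
  containers:
  - name: web
    image: nginx:stable
    ports:
    - containerPort: 80
    livenessProbe:            # HTTP check: GET / on port 80
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:           # container execution check: run a command inside
      exec:
        command: ["cat", "/usr/share/nginx/html/index.html"]
```

A manifest like this would be submitted with `kubectl apply -f pod.yaml`; if the liveness probe fails repeatedly, Kubernetes restarts the container.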

Marathon

Marathon is a production-grade open-source framework for container management and orchestration that is based on Apache Mesos and intended for applications or services that run over a long period of time. Marathon is a fully REST-based solution and can also be operated through a web user interface. To guard against failure, Marathon can run multiple schedulers at once, so that the system continues if one scheduler crashes. Like Kubernetes, Marathon lets you run regular health checks, so you stay up to date on the status of your applications. Another benefit of Marathon is its maturity; the software is stable and offers a variety of useful features such as event subscriptions and metrics.
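Since Marathon is REST-based, a long-running app is described as a JSON document and POSTed to the API. This is a minimal sketch; the app id, command, sizing, and check intervals are placeholders (`$PORT0` is Marathon’s convention for the first port it assigns):

```json
{
  "id": "/web",
  "cmd": "python3 -m http.server $PORT0",
  "cpus": 0.25,
  "mem": 64,
  "instances": 2,
  "healthChecks": [
    {
      "protocol": "HTTP",
      "path": "/",
      "portIndex": 0,
      "intervalSeconds": 30,
      "maxConsecutiveFailures": 3
    }
  ]
}
```

Saved as `app.json`, this would be submitted with something like `curl -X POST http://<marathon-host>:8080/v2/apps -H "Content-Type: application/json" -d @app.json`, after which Marathon keeps two instances running and restarts any instance that fails its HTTP health check.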

Amazon ECS

Amazon EC2 Container Service (ECS) is a container management service for Docker containers. Importantly, containers managed by Amazon ECS run only on Amazon Web Services EC2 instances; so far, there is no support for external infrastructure. On the positive side, this also means you have access to AWS features such as elastic load balancing, which redistributes application traffic to provide better performance under pressure, and CloudTrail, a logging and monitoring service. Tasks are the basic unit of Amazon ECS and are grouped into services by the task scheduler. Persistent data storage can be accomplished via data volumes or Amazon Elastic File System.
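A task in ECS is declared as a task definition. The following is a minimal sketch, with the family name, container name, image, and resource numbers all placeholders:

```json
{
  "family": "web-task",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "nginx:stable",
      "cpu": 128,
      "memory": 128,
      "essential": true,
      "portMappings": [
        { "containerPort": 80, "hostPort": 8080 }
      ]
    }
  ]
}
```

A definition like this would be registered with `aws ecs register-task-definition --cli-input-json file://web-task.json`, and the service scheduler then keeps the requested number of copies of the task running across the cluster’s EC2 instances.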

For a lot of the work I do, I can imagine that Nomad would be more of what I want, much more than Kubernetes:

Why Spark on Nomad?

Nomad’s design (inspired by Google’s Borg and Omega) has enabled a set of features that make it well-suited to run analytical applications. Particularly relevant is its native support for batch workloads and parallelized, high throughput scheduling (more on Nomad’s scheduler internals here). Nomad is also easy to set up and use, which has the potential to ease the learning curve and operational burden for Spark users. Key ease-of-use related features include:

Single binary deployment and no external dependencies

A simple and intuitive data model

A declarative job specification

Support for high availability and multi-datacenter federation out-of-the-box

Nomad also integrates seamlessly with HashiCorp Consul and HashiCorp Vault for service discovery, runtime configuration, and secrets management.
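To illustrate the declarative job specification mentioned above, here is a minimal sketch of a Nomad job in HCL. The job, group, and task names, the image, and the resource numbers are placeholders; “docker” is one of several task drivers Nomad supports:

```hcl
# example.nomad -- a minimal declarative job specification.
job "web" {
  datacenters = ["dc1"]
  type        = "service"

  group "frontend" {
    count = 2  # keep two instances of this group running

    task "nginx" {
      driver = "docker"

      config {
        image = "nginx:stable"
      }

      resources {
        cpu    = 200  # MHz
        memory = 128  # MB
      }
    }
  }
}
```

A file like this would be submitted with `nomad run example.nomad`, and the scheduler places the two instances on available nodes.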

Source