A parable about concurrency (demonstrated with comical cartoons)

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com

(Note: Leah McCloskey is a brilliant illustrator who brings warmth and humor to every graphic she creates. She is also head of design at Haystack.im. Save for those rights which she specifically granted to me, she reserves all rights on these images, so if you wish to re-use them, then you must contact her directly at leah @ dendritecorp.com. I am grateful that she found the time to work on this project. View her portfolio!)

This is part 1 of a 12 part series:

1.) Quincy’s Restaurant, a parable about concurrency

2.) Why I hate all articles about design patterns

3.) Clojure has mutable state

4.) Immutability changes everything

5.) Mutable iterators are the work of the Devil

6.) Get rid of all Dependency Injection

7.) Sick politics is the driving force of useless ceremony

8.) Functional programming is not the same as static data-type checking


9.) Inheritance has nothing to do with Objects

10.) Is there a syntax for immutability?

11.) Immutability enables concurrency

12.) Quincy’s Restaurant, the game

I offer this as a fun thought-experiment for thinking about the issues mutability, immutability, and concurrency. The following is the start of a 12 part series. The next 11 parts will dive into code, but this part is purely at the level of analogy. At the end of this series we will build a game that simulates Quincy’s Restaurant.

Have you ever seen this image?

I saw this on a forum somewhere. This image was offered as an explanation of the differences between concurrency and parallelism. But there is a problem: this makes parallelism appear orderly, whereas concurrency looks like a chaotic mess. And that is incorrect, right?

Leah McCloskey and I worked out a set of images that were slightly better:

Having done this much, I thought it would be good to have a story to think about how we handle concurrency in real-life, and why is it so difficult to translate our real-life experiences to software. The difficulties have two sources:

1.) the difficulty of formalizing implicit or intuitive knowledge

2.) the lack of support for concurrency in nearly all programming languages

We can not do much about #1, but as software developers it is our job to worry about #2.

So I wrote out a parable about running a restaurant, and we created some images for that parable. A warning: I don’t want to repeatedly write “concurrency and parallelism” so I treat parallelism as a sub-category of concurrency.

We can not live without concurrency

None of us would survive a single day if we had to live without concurrency. In that sense, concurrency is the most natural thing in the world, literally as easy as breathing (since so many things in our body have to happen concurrently: we have to breath, and our heart has to beat, and our liver has to purify our blood, and we have to avoid running into objects). And yet somehow, when we try to write software, concurrency becomes a subject of great complexity.

Joe Armstrong, the inventor of the Erlang language, said this better than anyone:

The word concurrency refers to sets of events which happen simultaneously. The real world is concurrent, and consists of a large number of events many of which happen simultaneously. At an atomic level our bodies are made up of atoms, and molecules, in simultaneous motion. At a macroscopic level the universe is populated with galaxies of stars in simultaneous motion.

When we perform a simple action, like driving a car along a freeway, we are aware of the fact that there may be several hundreds of cars within our immediate environment, yet we are able to perform the complex task of driving a car, and avoiding all these potential hazards without even thinking about it.

In the real world sequential activities are a rarity. As we walk down the street we would be very surprised to find only one thing happening, we expect to encounter many simultaneous events.

If we did not have the ability to analyze and predict the outcome of many simultaneous events we would live in great danger, and tasks like driving a car would be impossible. The fact that we can do things which require processing massive amounts of parallel information suggests that we are equipped with perceptual mechanisms which allow us to intuitively understand concurrency without consciously thinking about it.

When it comes to computer programming things suddenly become inverted. Programming a sequential chain of activities is viewed as the norm , and in some sense is thought of as being easy, whereas programming collections of concurrent activities is avoided as much as possible, and is generally perceived as being diffcult.

Quincy’s Restaurant

Let’s talk, for a moment, about the (fictional) restaurant known as Quincy’s.

When Quincy’s first started, it was the smallest restaurant in town – barely a hole in the wall. It had only four tables, and there was only need to hire one waiter and one cook. Becca, the manager, had an easy job; all she had to do was pay the bills. She did not need to organize the waiters, because there was only one waiter. The job of the waiter was demanding, but at a conceptual level it was straightforward:

1.) when customers come in, tell them they can sit anywhere

2.) then present them with a menu

3.) after a few minutes, go ask them what they want

4.) next, go tell the cook what the customers want (put the written order into a queue that the cook will pull from)

5.) a few moments later, take the food from the cook to the customers

6.) collect payment from the customers

7.) after the customers leave, collect all of their dishes and take them to kitchen to be washed

This is a simple algorithm. At a restaurant that only has one waiter and one cook, the staff simply needs to run through these steps over and over again, as fast as possible.

Everyone loved Quincy’s, and the place became very popular. However, it tended to attract a particularly mutable type of customer, one who loved to change their minds after they placed an order. A customer might come in and order an omelet with ham and bacon, but then, a moment later, they might suddenly decide they are vegan and therefore only want rice cakes and hummus for breakfast. Becca told the staff that all such requests must be greeted with a smile. Customer service was important at Quincy’s, and therefore, all orders were mutable up to the moment the food was delivered. Whenever a customer changed their mind, the waiter would walk back to the kitchen, find the ticket that held the customer’s order, cross out the old order and scribble in the new. The cook would sometimes be baffled by all the scribbling, but since there were only four tables, the waiter and the cook could always talk things over until they achieved clarity.

Later, Quincy’s built an addition. Now it had twelve tables and three waiters (Svati, Marcus, and Claudette) and two cooks (Timothy and Kwanlo). Becca thought about implementing some complex rules about how the customers would be divided among the waiters, but ultimately she decided that informality was still the best policy: “Be friendly with each other and use common sense.” And this turned out to be just fine. The waiters were able to solve all the problems they faced in an ad hoc manner, inventing informal rules to fit each situation and changing those rules whenever common sense suggested that they should.

Customers told their friends about how great Quincy’s was, and the place grew and grew. They built another addition, and then another, and then another. Pretty soon the restaurant had 40 tables.

Sadly, the simple ad hoc method of dealing with problems, which Becca favored, did not scale up very well. This was a lesson that Becca had to learn the hard way. In retrospect, she would later admit that she had been too slow to recognize the need for change.

A big part of the problem was the mutable customer orders. Any customer could change their mind at any time — a policy which had helped the restaurant establish its reputation as the most pleasant restaurant in the city. However, now that Quincy’s was big, the mutable customer orders were driving the staff completely crazy. Kwanlo was especially upset after he got eight orders in a row that started off as requests for a ham and bacon omelet and then mutated into requests for rice cakes and hummus! Timothy got an order for steak that mutated into an order for alfalfa sprouts topped with dandelion petals, and then an order for alfalfa sprouts topped with dandelion petals that mutated into an order for steak! Kwanlo admitted that every time he went to the refrigerator to get supplies, instead of just getting the food that had been ordered, he would pull out all of the food that the order might mutate into. Instead of getting one thing, he would get a dozen things. Instead of just ham or bacon or eggs, he would get ham, bacon, eggs, rice cakes, hummus, steak, alfalfa sprouts, dandelion petals, tofu, chicken, soybeans, kale, green peppers and potatoes. It was slowing him down! He always had to grab too many resources!

When the customers can change their mind on a whim, the staff have to be on guard against unexpected change.

If a cook read an order, and then the order changed, how would the cook know to read the order again? If a waiter initially wrote “rice cakes and hummus” but then later scribbled that out and wrote “ham and bacon omelet” would the cook see the change? Wouldn’t it make sense to establish some pattern that would notify the cooks of changes?

Things were even more complicated when the cooks divided up the work. This was highly efficient when the order didn’t mutate, but it slowed them down when the order did mutate. If they got an order for a ham and bacon omelet, Timothy would get the eggs and Kwanlo would get the bacon and ham. If they got an order for rice cakes and hummus, Timothy would get the rice cakes while Kwanlo prepared the hummus. But a few times an order mutated right in the middle of their work, and only one of them saw the change, so they ended up with ham and bacon and rice cakes. (A computer developer would say that the customer’s order was now “in an inconsistent state”.)

Finally, the staff rebelled. They took their complaints to Becca. They had seen one too many rice cakes piled high with bacon and ham, one too many kale and dandelion salads with a slab of very rare steak on top. And Becca listened. She was sad to see the end of the old informality, which had helped Quincy’s achieve its initial success, but she realized it was time for a change. Would it be possible to make customer orders completely immutable? Becca decided they would experiment for a week.

Immutability changed everything. Now, when Timothy got an order for alfalfa sprouts topped with dandelion petals, he knew that the only ingredients he needed to be concerned with were alfalfa sprouts and dandelion petals. When Kwanlo got an order for a ham and bacon omelet he would run back to the refrigerator to grab only those three ingredients, and if he noticed the rice cakes he would laugh triumphantly at them and shout, “Not this time, Rice Cakes! Not this time!” Even more importantly, when Timothy and Kwanlo divided up the work, they knew that when they both got back to their work space, they would have the right ingredients – there was no more risk of the order changing half way through, with only one of them seeing the change.

However, as it turned out, they could not get rid of all mutability. They had to be pragmatic. For instance, a woman came in and ordered the alfalfa sprouts and dandelion petals, and only after the order had been sent to the cooks did the woman think to inquire what was in the dressing on top of the salad. Her waiter, Marcus, told her the dressing was a mix of herbs, but also contained peanut oil. The woman explained that she was deathly allergic to peanuts. Moreover, Quincy’s was at fault for not having listed all possible allergens on its menu. In this case, some change of the order had to be allowed – the woman wanted oil and vinegar to be substituted for the usual dressing.

Since it was impossible to eliminate all cases of mutable orders, Becca decided that, at the very least, changes of orders would be handled with some strict rules. Specifically, the waiters would no longer write notes about a change on the original order form. Under the old system it had been too easy for a cook to overlook the fact that an order had changed. Now, when an order changed, the waiter would create a completely new ticket, and the old ticket would be moved to a different queue and marked “Old Value.” When a cook saw a ticket on the “Old Value”, queue they knew there would be some other ticket with “New Value”; therefore the cook would feel confident that they could completely ignore the ticket marked “Old Value,” and focus on creating the meal described by the ticket marked “New Value.” The ingredients for the “Old Value” meal could be taken back to the storage room, whereas the ingredients for the “New Value” meal needed to be fetched. So, for the woman with the peanut allergy, instead of scribbling a note on the old ticket requesting, “Please change the dressing to oil and vinegar,” Marcus created a new ticket with the full order: “Alfalfa sprouts and dandelion petals salad topped with oil and vinegar.”

Immutability as the default, with only rare occurrences of carefully-controlled mutability, resolved the problems that too much popularity had created.

The Prince of Doderstod marries the actress Nadine

Then Becca decided to increase profitability by going into the catering business. The Prince Of Doderstod had fallen in love with a beautiful American actress known as Nadine. It was agreed that the wedding ceremony should be held in Nadine’s home town, which just happened to be the town where Quincy’s was located. Becca bid to cater the wedding. While she was delighted to win, Becca became greatly concerned about the scale of the event. The Prince had invited all the nobility of Doderstod, and Nadine had invited every celebrity in Hollywood. Becca was told there would be 4,000 people attending the reception.

Becca understood that the old informality was inadequate to handle an event of this magnitude. Whereas “Be friendly and use common sense” was the perfect rule when Quincy’s had only three waiters, a new set of rules would now be needed. Becca decided that in this case, rigid allocations of guests would be the only way forward: there would be 200 tables, each holding 20 guests, and there would be one waiter for each table.

But the rigidity of the plan sealed its fate. Three big problems came up:

1. The celebrities in Hollywood were total flakes and many of them didn’t show up. There were some tables that only had two guests!

2. The professional caliber of the waiters varied greatly. That is to say, they were not homogenous. Each had a unique mix of strengths and weaknesses. For instance, some waiters had Awesome Memories But Uselessly Slow Heels (these we will call the AMBUSHs), whereas others had Near Amnesia yet Incredible Feet (these we will call the NAIFs). The AMBUSHs could easily handle twenty customers but they could only handle one table. The NAIFs could only handle ten guests but could easily handle them at five different tables.

3. At large enough scale, unexpected events can be expected to happen all the time. Sometimes waiters twist their ankles and have to go home, and then other waiters need to take over their work, and therefore a rigid allocation of waiters to tables doesn’t work.

At the wedding reception, some waiters were assigned a table that only had two guests. Some waiters went home and others were asked to cover their tables, so some waiters had 40 guests to take care of. Half-way through the wedding Becca was forced to once again resort to temporary ad hoc rules (just as many computer developers are forced to write a bunch of one-off rules to deal with a crisis). Pure chaos ensued. The guests grumbled about food being cold. The waiters complained about being either under-worked or over-worked.

Somehow, they got through the wedding. And regardless of the catering fiasco, Quincy’s was popular and they kept building expansions. Becca was determined to figure out the right way to handle all of the people.

The simplest way to handle concurrency is to not have any concurrency. If Becca only hired one waiter, then there would be no need to have a policy about how to distribute work among waiters! However, that would be a monumentally stupid policy, since one waiter cannot possibly take on all of the work involved! On a busy night Quincy’s might nowadays have 320 guests sitting at 80 tables.

So there had to be multiple waiters, and the work must be divided among them somehow. But how?

What if the waiters’ names were kept on a list – and each time a party of customers came in, the party was assigned to the next waiter on the list? That is, a pure “round robin” strategy. So one party comes in and is assigned to Claudette, and then another party comes in and is assigned to Svati, and then another party comes in and is assigned to Marcus, and so on. What could be more fair?

But wait, there is a problem. What if, by random chance, all the parties assigned to Claudette have eight customers in them, whereas all the parties assigned to Marcus have just one customer in them? Claudette is overwhelmed, whereas Marcus is bored and wants more customers.

Surely it would be more fair to the waiters to split up all the friends who came to Quincy’s and apply a “round robin” strategy to the individual customers that came in the door? That would solve the problems created by the unequal sizes of the groups of patrons arriving for a meal. This fine-grained strategy directs that when, for example eight friends come in together, one person is assigned to one of Claudette’s tables, one person is assigned to one of Svati’s tables, one person is assigned to one of Marcus’s tables, and so on, until all eight have been assigned. Oddly enough, the customers rebelled against this idea. It turned out that when a group of friends decided to go have dinner at Quincy’s, they expected to sit together. They did not want to be broken up and forced to sit with strangers at different tables. (A software developer might say that there are some data sets that resist easy division – indeed, tasks of uneven workloads are one of the reasons why concurrency is so hard.)

Another approach would be to hire a specific waiter for each party that comes to Quincy’s. But wait! If, in the course of a day, 500 parties visit the restaurant, then 500 waiters will have to be hired! This is a disaster for several reasons: on the one hand, the waiters are bored and they want more work; on the other hand, the paperwork involved in hiring 500 waiters would overwhelm Becca and keep her from all the other important work that she needs to do. Clearly, the restaurant needs to hire a fixed number of waiters, and the work needs to be divided among that finite group.

(For some reason, this is the kind of mistake that it is easy to make in software, yet no one ever makes this mistake in real-life. In software, I have made this many times myself – if I accidentally write my code such that a new thread is spawned for every task, then I rapidly exhaust all the resources available to me on my machine, at which point I have no option but to kill the app and reboot it. The 500 waiters are a metaphor for 500 threads that are initiated but never reused and never get garbage collected – and resources that are never released will eventually exhaust all the resources of your machine. Just as one should use a fixed-size thread pool in one’s software, and re-use the threads to handle additional work, so to, when running a restaurant, one should hire a fixed-size pool of waiters, and re-use them to handle more than one group of customers.)

Becca had learned this much:

1.) At a certain scale, informal rules break down, at which point there is a need for a formal system for dividing customers among waiters.

2.) Friends who come in together are unhappy when asked to split up and sit at different tables. Wisdom dicatated that the best approach was to deal with the parties in a coarse-grained manner, allowing each group, regardless of its size, to sit together.

3.) A rigid allocation of tables to waiters was the wrong approach. Because parties vary in size, the work loads are lumpy.

4.) AMBUSHs should get a small number of tables with many customers, while the NAIFs should get many tables with few customers.

What about the idea of waiters pulling work from a central queue of parties of customers? (A pattern usually given a name such as “Worker/Queue Demand Pull”)? Each waiter asks for more work as they are ready for it. New assignments are handed out in the order in which the waiters demand them.

This might be flexible enough to deal with the restaurant’s varying needs. The work is distributed so as to roughly average out the number of customers per waiter, even though the parties need to be assigned as a whole to each waiter. For instance, in the morning, when things are slow, and the only waiters on duty are Svati, Marcus, and Claudette, if the first party that comes in has eight people, it goes to the first waiter on the list, which in this case might be Svati, and if the next party has only two people, it goes to Marcus, and if the next party has only two people, it goes to Claudette, and if the next party has only six people is goes to Marcus, and if the next party has only two people it goes to Claudette, and if the next party after that only has four people, it again goes to Claudette, because Svati has eight customers, and Marcus has eight customers, and so Claudette needs four more customers to catch up to the other waiters. In other words:

# of People in Party Waiter

8 Svati
2 Marcus
2 Claudette
6 Marcus
2 Claudette
4 Claudette

So now they each have eight people. Whoever is least busy gets the next party. And then, when Svati’s party of eight leaves, Svati gets the next few parties, until she has as much work as the other two.

A pure (not random) round-robin strategy would stupidly give more work to Svati when she is already busy, and would not give work to Claudette even when she is bored:

# of People in Party Waiter

8 Svati
2 Marcus
2 Claudette
6 Svati
2 Marcus
4 Claudette

This gives Svati 14 customers while Claudette only has 6 customers. A simple round-robin strategy is easy to implement, but if the tasks are uneven, then it is an inefficient strategy.

You can fix a lot with a queue

There is a final issue to consider: AMBUSHs should get a small number of tables with many customers, while the NAIFs should get many tables with few customers. At large scale, the different abilities of one’s workers can have a large impact on how fast work gets done. In a situation like this, it might be wise to actually have two queues, one for small parties (which get handled by the NAIFs), and one for large parties (which get handled by the AMBUSHs). Whether there is an efficiency gain by breaking up work into multiple Demand Pull queues is something which, often, can only be determined by empirical testing.

The Worker/Queue Demand Pull strategy also gives us a way to deal with the problem that came up at the wedding, where some waiters had to go home. When a task is left incomplete, and the worker who was working on it “dies,” we just put the task back on the queue, and have it assigned again to the next waiter who is idle and looking for work.

A large category of software can be structured with this pattern. Messages are put on the queue. The word “messages” can refer to many things: an image, a JSON document, a serialized object, a simple string recording some information, such as an email address — almost any kind of data. There are many libraries that help implement queues, and also independent software that can be used as queues (see the Github post that introduced Resque for an overview of some the reasons they used, but then rejected, various different kinds of queue software). “Workers” (little software robots) will then pull those messages off the queue and do whatever work needs to be done according to each individual message.

(At the end of this series we build a game that simulates Quincy’s Restaurant. There are a few ways to model what we’ve talked about here. We will start simple, in the 3rd post, using vectors as if they were queues. That’s a bad idea in real-life code, but it can be a simple way to get started. At the end of the series, we will build the game in a more serious fashion.

customers -> waiters -> cooks -> waiters -> customers, with possible mutations to orders and possible “dying” waiters, could be seen as an example of the recursive Event/Consequences Pattern that Stuart Sierra talked about in his 2012 presentation to StrangeLoop. The customers are the Event, and they set off a chain of Consequences: a waiter takes an order, the cooks cook the food for that order, the waiter takes the food to the customers, the customers eat the food, the customers (hopefully) pay for the food, the customers leave the restaurant and tell their friends that Quincy’s is the best restaurant in the city, thus causing even more customers to walk in the door, thus setting off another round of Events and Consequences. Possible side effects exist at each stage, including killing/mutating orders and waiters who injure themselves and have to go home, necessitating the re-assignment of their customers.)

The parable in images

So now we can revisit these subjects, in a graphical form.

Good Parallelism:

Bad Parallelism:

Good Concurrency:

Bad Concurrency:

Rough rules of thumb for Concurrency

I’m leaving some stuff out because this is the intro and I will talk about these subjects more in the next few posts. But I would like to forestall the thousand comments on Reddit where people explain how He’s Doing It Wrong, so I’ll add these next two paragraphs in a partial attempt to placate them:

Concurrency and Parallelism are not mutually exclusive. I have often written software that divided up tasks among workers in a concurrent manner, and then each worker further split up its task in a parallel manner. We will soon look at an example of this.

I once went to a job interview where the guy asking me questions asked me the difference between concurrency and parallelism, and I gave an answer similar to the one above. He pounced on my answer and said I was all wrong; in his view, concurrency is an architectural decision that leaves the scheduling of tasks to the operating system, whereas with parallelism the programmer is still in charge of scheduling and is forcing the work to be broken up and run on multiple CPUs “right now”. From that point of view, concurrency is a strategy, whereas parallelism is a tactic.

I doubt we gain much from such strict definitions. Most of the time, when writing software, we simply want to remember these rough rules of thumb:

1.) if you have a task that can be split into sub-tasks, all of which can be done in about the same amount of time, then parallelism is a good approach.

2.) if you have some workers (that is, some functions) that handle a type of task, but that task can greatly vary in size (or rather, time), then concurrency might be a great approach

3.) if you have many tasks which take some time to setup but which then run quickly then concurrency might be a bad approach. That is, if switching costs are “high”, then concurrency is probably a bad approach. What do I mean when I talk about high switching costs? Ultimately, you can test concurrent and non-concurrent approaches, and empirically figure out which is fastest, but as a general rule, if the time needed to switch to the next task is a significant percentage of the time needed to run the task, then perhaps your switching costs are too high, and concurrency is therefore a bad option for you.

What kind of concurrency do I mean?

For the remainder of this series, unless I specify a particular kind of concurrency, I’m probably referring to the overall concept, broadly understood. That means light-wight threads in a language like Java, or async calls in NodeJS, and sometimes I might even mean spinning up children processes of the current processes, which, for a long time, was the only way to get concurrency in Ruby and Python.


I offer a huge “Thank you” to Natalie Sidner for the tremendous editing she did on the rough draft of this post. To the extent that this article is readable, it is thanks to her. Any mistakes are entirely my fault, and I probably added them after she was done editing. If you need to hire a good editor, contact Natalie Sidner at “nataliesidner at gmail dot com”.

Also, I thank Blanche Krubner for reviewing this work. As Mrs Krubner studied computer programming during the 1970s, I found it fascinating to get feedback from someone whose views of the discipline were shaped during a different era.)