Clojure has mutable state

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com, or follow me on Twitter.

Warning: This post is an intro to mutable state in Clojure. If you already know Clojure, you can skip this. If you have no interest in learning Clojure, you can skip this.

This is part 3 of a 12 part series:

1.) Quincy’s Restaurant, a parable about concurrency

2.) Why I hate all articles about design patterns

3.) Clojure has mutable state

4.) Immutability changes everything

5.) Mutable iterators are the work of the Devil

6.) Get rid of all Dependency Injection

7.) Sick politics is the driving force of useless ceremony

8.) Functional programming is not the same as static data-type checking

Interlude

9.) Inheritance has nothing to do with objects

10.) Is there a syntax for immutability?

11.) Immutability enables concurrency

12.) Quincy’s Restaurant, the game

What does immutability look like?

If you can forgive the irony of the joke, there is a lot of truth to the saying, “Immutability changes everything.” It changes the idioms and the design patterns we rely on. It changes the kinds of architectures we tend to favor. And most developers eventually report that it changes the very way they think.

The goal of this series is to make clear how powerful an idea immutability can be. A lot of my examples will be in Clojure, and yet my audience is all the folks who are working with Javascript, Ruby, PHP, and other highly mutable languages. So, before we discuss immutability, we should talk about mutability in Clojure. But before we do that, we should probably offer an example of what immutability looks like.

In the book Programming Clojure, Stuart Halloway writes:

Clojure is high signal, low noise. As a result, Clojure programs are short programs. Short programs are cheaper to build, cheaper to deploy, and cheaper to maintain. This is particularly true when the programs are concise rather than merely terse. As an example, consider the following Java code, from the Apache Commons:

  public class StringUtils {
    public static boolean isBlank(String str) {
      int strLen;
      if (str == null || (strLen = str.length()) == 0) {
        return true;
      }
      for (int i = 0; i < strLen; i++) {
        if ((Character.isWhitespace(str.charAt(i)) == false)) {
          return false;
        }
      }
      return true;
    }
  }

The "isBlank()" method checks to see whether a string is blank: either empty or consisting of only whitespace. Here is a similar implementation in Clojure:

		    
 (defn blank? [s] (every? #(Character/isWhitespace %) s))

The Clojure version is shorter. More important, it is simpler: it has no variables, no mutable state, and no branches. This is possible thanks to higher-order functions. A higher-order function is a function that takes functions as arguments and/or returns functions as results. The "every?" function takes a function and a collection as its arguments and returns true if that function returns true for every item in the collection.

Because the Clojure version has no branches, it is easier to read and test. These benefits are magnified in larger programs. Also, while the code is concise, it is still readable. In fact, the Clojure program reads like a definition of blank: a string is blank if every character in it is whitespace. This is much better than the Commons method, which hides the definition of blank behind the implementation detail of loops and if statements.

Why Clojure?

Throughout this series, for most of my examples of Functional code, I give examples in Clojure. Why only Clojure? I am studying Erlang, and I am fascinated with Elixir, but I don't feel competent to give examples in those languages. I don't want to offer flawed code when I'm trying to offer good examples.

We will also look at Facebook's Immutable.js. Facebook has done a great job of bringing some of the best ideas of the Functional world to Javascript.

What follows is somewhat informal. Check out Kyle Kingsbury's overview for a more formal treatment.

Clojure has mutable state

Nearly all software needs to keep track of some data which changes over time. Clojure, like any computer programming language, has ways of tracking data as it changes. Let's look at the four kinds of reference which might contain mutable state in Clojure:

1.) var created by 'def'

2.) atom

3.) agent

4.) ref

All of these are automatically global, unless you mark them private:

(def ^:private conversations (ref {}))

Being private means they can only be seen in the namespace where they are defined. This enables what I call the Pseudo Object Oriented style. But namespaces are much simpler than objects -- you don't instantiate them with "new" and you don't have instances of them. I offer an example on Github of what I mean.

1. var created by def. Most of the time a var should be treated as a constant. If we find ourselves calling def on the same var multiple times, then we have probably done something wrong. There are a tiny handful of exceptions to this situation, but generally, treat vars as constants. We might create a var called "users", like this:

(def users (agent {}))

This establishes that "users" is a var that holds an agent. The content of the agent might change over time, but we are probably making a mistake if we find ourselves later doing something like:

(def users ["Susan" "John" "Megan"])

I tend to treat a var created with def as if it were a constant, whose value is set at compile time. However, the ability to redef a var exists.

An interesting use for these vars is when you mark the var as dynamic. Then it can hold different values in each thread. To demonstrate this, I created a simple app:

(def users ["lisa" "meghan" "kwan" "svati"])
(def ^:dynamic banned ["higgins" "shon" "suela" "metz"])

(defn change-in-another-thread []
  (future
    (def users ["bill" "kirti" "carlitta" "mercer"])
    (Thread/sleep 10000)
    (println users))
  (println users))

(defn change-in-another-thread-2 []
  (future
    (def banned ["henson" "carlos" "vendia" "lola"])
    (Thread/sleep 10000)
    (println banned))
  (println banned))

(defn -main [& args]
  (change-in-another-thread)
  (change-in-another-thread-2))

Which gives me output that looks like this:

[bill kirti carlitta mercer]
[higgins shon suela metz]
[bill kirti carlitta mercer]
[henson carlos vendia lola]

In other words, we never see the initial value of "users":

["lisa" "meghan" "kwan" "svati"]

but we do see the initial value of "banned":

["higgins" "shon" "suela" "metz"]

This is because we change the value of "users" in "change-in-another-thread" -- the change we make affects its value in all threads.

However, we do see the initial value of "banned" because it is dynamic. In the main thread, where our app starts, "banned" keeps its original value. But in the other thread, created by "future", we change the value of "banned".
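For comparison, here is a minimal sketch of the way a dynamic var is more commonly given a per-thread value: with "binding" rather than "def". As far as I know, "def" always sets a var's root value, while "binding" sets a value that is visible only on the current thread, and only for the duration of its body. The var "*banned*" and the helper function are hypothetical names made up for this sketch; they are not code from the app above.

```clojure
;; Hypothetical names, for illustration only.
(def ^:dynamic *banned* ["higgins" "shon"])

(defn banned-in-other-thread
  "Starts a plain Java thread, rebinds *banned* on that thread only,
  and returns the value that thread saw."
  []
  (let [seen (promise)
        t    (Thread. (fn []
                        (binding [*banned* ["henson" "carlos"]]
                          (deliver seen *banned*))))]
    (.start t)
    (.join t)
    @seen))

(def seen-in-other-thread (banned-in-other-thread))

;; Back on the main thread the root value is untouched.
(def seen-in-main-thread *banned*)
```

The other thread sees its own value, and once the "binding" body ends, nothing has leaked out to the rest of the program.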

In the debate over Dependency Injection, there are those who use this approach to configure their apps. We will talk about this a lot more in the article on Dependency Injection.

2. atom. This is just like a global variable, except any change in its value is guaranteed to be atomic; we don't have to worry about the situation where two functions, in two different threads, try to update the same variable at the same time, with the final result being some kind of incomprehensible gibberish. In most languages we would achieve atomic updates by acquiring and managing locks ourselves, but in Clojure that coordination is done for us.

Consider the productivity benefits of having Clojure manage the locks for us. Many developers now work with languages that manage the memory for us -- garbage collected languages which free us from the burden of having to manually worry about RAM. We've gotten used to this. Garbage collected languages have been around for a few decades. Programmer productivity is boosted by such languages. Clojure takes the next step and automates aspects of concurrency for us. This offers yet another boost to productivity.
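To make the atomic update concrete, here is a minimal sketch in which several threads increment one shared atom. The names "counter" and "workers" are made up for this example.

```clojure
(def counter (atom 0))

;; Eight futures each increment the shared counter 1000 times.
(def workers
  (doall (for [_ (range 8)]
           (future (dotimes [_ 1000] (swap! counter inc))))))

;; Dereferencing a future blocks until that worker has finished.
(run! deref workers)

;; Every increment was applied; none were lost to a race.
(def final-count @counter)

;; Futures run on a thread pool. In a standalone script this lets the
;; JVM exit promptly; leave this line out at a REPL.
(shutdown-agents)
```

One thing to keep in mind: "swap!" may call the function we give it more than once if another thread changes the atom in the meantime, so that function should be free of side effects.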

3. agent. This is just like an atom, but updates to an agent are asynchronous. If we call "send" on several agents, we don't know in what order the functions will execute. In other words, if we do this:

(def users (agent {}))
(def movies (agent {}))
(send users find-users-with-favorite-movies)
(send movies update-movie-rankings-based-on-user-favorites)

we have no idea what order these will execute in. If we need these to execute in order, then an atom would be a better bet.

I'm using "send" in this somewhat silly example. I could also use "send-off". The rule is that "send-off" is preferred for functions which might block, for instance on IO, whereas "send" is for functions which are CPU-bound. If I were hitting a database then I should probably use "send-off".

Each agent has its own work queue, which executes in order, so these three functions should execute in the order they are being called:

(send users calculate-winnings-per-category)
(send users calculate-winnings-total)
(send users remove-the-losers)
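Here is a minimal, runnable sketch of that ordering. Instead of the winnings functions, which are not shown in this article, each action just records a keyword in a vector, so we can see the order in which the actions ran. The agent name "steps" is made up for this example.

```clojure
(def steps (agent []))

;; Three actions sent from the same thread to the same agent.
(send steps conj :per-category)
(send steps conj :total)
(send steps conj :remove-losers)

;; Block until everything sent so far has run.
(await steps)

(def recorded @steps)

;; Agents run on a thread pool. In a standalone script this lets the
;; JVM exit; leave this line out at a REPL.
(shutdown-agents)
```

The vector ends up in the same order as the calls to "send", because all three went into one agent's queue.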

The only trick that comes up is what happens when a function such as "calculate-winnings-total" throws an exception. Can work proceed? For this situation we have "restart-agent":

When an agent is failed, changes the agent state to new-state and then un-fails the agent so that sends are allowed again. If a :clear-actions true option is given, any actions queued on the agent that were being held while it was failed will be discarded, otherwise those held actions will proceed.
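A minimal sketch of that recovery, assuming the agent uses the default error mode (no error handler set, so an exception puts the agent into a failed state). The agent name "winnings" and the numbers are made up for this example.

```clojure
(def winnings (agent {:total 0}))

;; This action throws, which puts the agent into a failed state.
(send winnings (fn [_] (throw (ex-info "calculation failed" {}))))

;; Poll briefly until the failure is recorded. We avoid await here,
;; because await can block on a failed agent.
(loop [tries 0]
  (when (and (nil? (agent-error winnings)) (< tries 200))
    (Thread/sleep 10)
    (recur (inc tries))))

(def failed? (some? (agent-error winnings)))

;; Reset the state and discard anything queued while the agent was failed.
(restart-agent winnings {:total 0} :clear-actions true)

;; The agent accepts work again.
(send winnings update :total + 50)
(await winnings)

(def after-restart @winnings)

;; Lets a standalone script exit; leave this line out at a REPL.
(shutdown-agents)
```

Until "restart-agent" is called, any further "send" to the failed agent throws, which is how the rest of our code finds out that something went wrong.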

At the end of this series of essays I'm going to build a multi-threaded example of Quincy's -- The Game, and I'll show how agents might be used as one possible approach.

Agents bring up all the normal scheduling issues that asynchronous code involves, but Clojure is rich in tools for making those issues fairly simple to deal with. For instance, suppose a user wants to change their password, and suppose our system has one agent for each user. We might write 5 functions to accomplish the following:

1.) log their request in a vector inside the agent

2.) check to see if this is a possible brute-force hacker attack, by checking the history in that log vector

3.) generate a special key to email to them

4.) store the key in the agent

5.) send the user a confirmation email

The first four functions would be functions that we send to the agent. The fifth function has nothing to do with the agent, though it does need to happen after the other four functions have executed. That raises the issue, how can outside code know when a series of asynchronous updates are complete? For that, Clojure enables "watches", which we will talk about later.
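As a small preview, here is a sketch of how a watch could tell outside code that the key has been stored. Everything here is hypothetical: the shape of the agent's state, the key string, and the use of a promise to hand the key to whatever code would send the email.

```clojure
;; Hypothetical per-user agent state.
(def user-agent (agent {:requests [] :reset-key nil}))

;; Outside code waits on this promise.
(def key-stored (promise))

;; The watch function is called after each state change with
;; the watch key, the agent, the old state and the new state.
(add-watch user-agent :reset-key-watch
           (fn [_key _agent _old-state new-state]
             (when-let [k (:reset-key new-state)]
               (deliver key-stored k))))

(send user-agent update :requests conj :password-change-requested)
(send user-agent assoc :reset-key "hypothetical-key-123")

;; Wait up to 2 seconds for the key; this is where we would send the email.
(def emailed-key (deref key-stored 2000 :timed-out))

;; Lets a standalone script exit; leave this line out at a REPL.
(shutdown-agents)
```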

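What if, while the agent is working on its four functions, the user again initiates an attempt to update their password? In that case, we might want to dismiss our four functions and start them over again. As far as I know, Clojure does not give us a function that cancels the actions already queued on a healthy agent. ("release-pending-sends" sounds like it might, but it does something else: called inside an action, it dispatches any sends that action has made right away, instead of holding them until the action completes.) We would have to handle this ourselves, for example by storing a request id in the agent and having each queued function do nothing if its request id is stale (whether this raises a security issue, or a worry about inconsistent data, would be a completely separate conversation).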

4. ref. A ref is just like an atom, with one primary difference. We can wrap multiple refs inside a single transaction, and we can ensure that they either all update together, or they all fail to update together. This is referred to as "Software Transactional Memory" (STM). It is very similar to the way most database transactions work. Clojure gives us this ability inside our own app and makes it easy to use.
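A minimal sketch of the all-or-nothing behavior, using two refs. The data and the function "add-favorite!" are made up for this example.

```clojure
(def movies (ref {"The Seven Samurai" {:favorited-by 0}}))
(def users  (ref {"Susan" {:favorites []}}))

(defn add-favorite!
  "Records the movie as a favorite of the user and bumps the movie's
  counter, both inside one transaction."
  [user movie]
  (dosync
    (alter users update-in [user :favorites] conj movie)
    ;; If the movie is unknown we throw, and the change to users
    ;; made on the line above is discarded as well.
    (when-not (contains? @movies movie)
      (throw (ex-info "unknown movie" {:movie movie})))
    (alter movies update-in [movie :favorited-by] inc)))

;; Both refs change together.
(add-favorite! "Susan" "The Seven Samurai")

;; This transaction fails, so neither ref changes.
(def rejected?
  (try
    (add-favorite! "Susan" "No Such Film")
    false
    (catch Exception _ true)))

(def susans-favorites (get-in @users ["Susan" :favorites]))
(def samurai-count (get-in @movies ["The Seven Samurai" :favorited-by]))
```

After the failed call, Susan's favorites still contain only the one movie, even though the first "alter" had already run inside the transaction.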

When I first heard of Clojure in 2009, STM was one of the things that people were most excited about. And yet, a surprise in the Clojure community has been how little STM gets used. Atoms are used more often. Why is that? Suppose we need to track Movies and Users, and suppose we will often need to update them at the same time. You could have two refs:

(def movies (ref {}))
(def users (ref {}))

Or, you could have a hashmap with these two keys:

{
:users ["Susan" "John" "Megan"]
:movies ["Citizen Kane" "Moscow does not believe in tears" "Dum Laga Ke Haisha" "The Seven Samurai"]
}

And we could store this hashmap in an atom:

(def media (atom {:users ["Susan" "John" "Megan"]
:movies ["Citizen Kane" "Moscow does not believe in tears" "Dum Laga Ke Haisha" "The Seven Samurai"]}))

Since any update to an atom is guaranteed to be atomic, we could update the :users and :movies in the hashmap, and the update will be automatically synchronized. In other words, we don't have to use refs to get synchronization.
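A minimal sketch of such an update. The function "add-user-and-movie" is a made-up name; the point is only that both keys change inside a single "swap!".

```clojure
(def media (atom {:users  ["Susan" "John" "Megan"]
                  :movies ["Citizen Kane" "The Seven Samurai"]}))

(defn add-user-and-movie
  "Pure function from the old map to the new map. Both keys are
  updated in the value that swap! installs in one step."
  [state user movie]
  (-> state
      (update :users conj user)
      (update :movies conj movie)))

(swap! media add-user-and-movie "Kwan" "Dum Laga Ke Haisha")
```

No other thread can ever see the new user without the new movie, because the atom goes from the old map to the new map in one step.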

But wait, aren't we violating some Design Patterns if we are stuffing unrelated data into the same var? Yes. This is a bit icky. I am a little surprised that this has become a pattern in the Clojure community. But it can be very convenient, and, so long as we put the right contract on the function that updates the atom, we can achieve perfect safety. So I guess this can be done? If the data is somewhat related then I suppose it can be justified. I would not put totally unrelated things together.

I rarely use refs, but I am amazed at how easy they are, considering what they enable. If you have experience trying to synchronize the mutation of multiple variables in Java, then you know how much of a pain that can be. Clojure makes this kind of change remarkably easy.

Last year I worked on a project where a salesperson could use our iPhone app to send messages to SalesForce, via our Natural Language Processing (NLP) engine, which converted their free-form text messages into the format that SalesForce understands. (That is, the messages went from the iPhone to our servers, then we did our NLP processing, and then we sent the processed messages to SalesForce.) For security reasons, we needed one app ("fudfs", fetch user data from salesforce) that allowed a user to log in and get a token from us, and we needed a different app that let the user send messages to our NLP engine. The token we sent them was a UUID. We then used that UUID to track their messages in "conversations", a var where we tracked the messages they had sent us.

What happened when the user's credentials in SalesForce expired, and they had to log in to SalesForce again and get a new OAuth access token? How could we link their past messages with their future messages if we were using a UUID that was linked to a specific access token? That is, the UUID was the key in the hashmap that pointed to their conversation, and also, in user-tokens, it pointed to the SalesForce credentials. My original take on this update was somewhat clumsy:

(def ^:private conversations (ref {}))
(def ^:private user-tokens (ref {}))

(defn update-user-tokens
  "For the sake of security, when a message comes from the 'fudfs' channel in Redis, it will be a map containing :salesforce-credentials which will contain the user_id and a new token, but when a message comes from the 'api' channel in Redis (a normal sales meeting report), it will have a token but no user_id. We track conversations by token, because we don't have access to user_id for most normal messages. But to provide the user with a smooth experience, when the user logs in again, and their token changes, we need to link the old token to the new token.

  To put this from the point of view of an iPhone user: For security, the iPhone only ever sends the user_id to the '/token' endpoint, and so the nlp-housing only gets user_id from fudfs if the credentials sent to fudfs (by the iPhone) were valid. All the normal debriefs, which go from the iPhone to the api, at the '/chat' endpoint, contain only the token. Thus we have to do some work to be sure a user's conversation continues from old tokens to new tokens. Note that currently the iPhone app pings the '/token' endpoint every time the iPhone wakes up, so in the course of 10 minutes we might see 2 or 3 different tokens from the same user."
  [redis-vector]
  {:pre [(vector? redis-vector)
         (= (count redis-vector) 3)]}
  (try
    (let [[type-of-message channel-name content-of-message] redis-vector]
      (when (= channel-name "fudfs")
        (let [salesforce-credentials (get content-of-message :salesforce-credentials :credential-error)
              user_id (get salesforce-credentials :user_id :credential-error)
              new-token (get salesforce-credentials :token :credential-error)
              old-token (get @user-tokens user_id)]
          (if (#{salesforce-credentials user_id new-token} :credential-error)
            (publish-credential-error "Message from fudfs is lacking either :salesforce-credentials or :user_id or :token" content-of-message)
            ;; I originally wrote this with 2 atoms and 2 swaps, but anytime you have 2 related swaps, it's time to move to refs and dosync
            (dosync
             ;; now that we have old-token, we can move the old-conversation to a new place, under the new token
             (alter conversations
                    (fn [old-conversations]
                      (let [
                            ;; did the user have any earlier conversation?
                            this-users-old-conversation (get old-conversations old-token {})

                            ;; lets merge the stuff from fudfs into the users conversation, overwriting the old stuff with the new stuff where the keys are set
                            ;; Note: if this is the first time that the user has logged in, then this-users-old-conversation is an empty map and
                            ;; content-of-message has all the info that initiates the conversation
                            this-users-new-conversation (into this-users-old-conversation content-of-message)

                            ;; copy the old conversation to the new token
                            old-conversations (assoc old-conversations new-token this-users-new-conversation)

                            ;; delete the old conversation to conserve on memory
                            old-conversations (dissoc old-conversations old-token)]

                        old-conversations)))

             ;; then we overwrite the old token with the new token
             (alter user-tokens
                    (fn [old-user-tokens]
                      (assoc old-user-tokens user_id new-token))))))))
    (catch Exception e (timbre/log :trace " exception in update-user-tokens" e))))

I wrote this comment for my co-workers, but it is worth discussing here:

"I originally wrote this with 2 atoms and 2 swaps, but anytime you have 2 related swaps, it's time to move to refs and dosync"

As far as that goes, it is true. Any time you are calling "swap!" twice in a function, on two atoms, you should ask yourself if you really want to use refs and not atoms. (Or skip refs and just use one huge atom. Like I said, there are some talented software developers in the Clojure community who find it more convenient, and just as safe, to stuff the different kinds of data into a hashmap inside of a single atom.)

About this:

"now that we have old-token, we can move the old-conversation to a new place, under the new token"

We used the UUID as the key in conversations, so when we changed the UUID we had to move the conversation to the new UUID. That is, we were using a UUID to track the access tokens we got from Salesforce, and we used the same UUID to track the user's conversation. But what happened when the Salesforce access tokens expired? We would need to get a new access token, and that would be linked to a new UUID. So we would need to move this user's conversation from the old UUID to the new UUID.

At least, that was the system I came up with. It was more complicated than necessary. My co-worker eventually came up with a much better idea: we could have a private key for encryption and use that to encrypt the user's SalesForce username. Then instead of using UUIDs, we could use the encrypted username to link together all the messages they sent us. Then we thought "We won't ever have to change it! This could last forever!" but then we decided, to be very safe, we should change our private key every month or so. Still, using the private key simplified things.

Rules for Clojure's mutable state

We can summarize Clojure's mutable state with some rules of thumb:

1.) Most of the time we should only call "def" on a var once, unless the var is dynamic.

2.) We swap! out the old value in an atom and use a function to replace the old value with a new value.

3.) We send a function to an agent, and the function updates the value of the agent.

4.) Transactions with refs are just like transactions with a database.

Atoms are simple to use and they feel just like global variables, so new Clojure developers often use them. And that's fine, up to a point. If, however, we ever see code where a developer starts a new thread just to swap the value of an atom, we know they have made a mistake – they actually want to use an agent. If we ever see code where a developer wanted to update 2 atoms at once, then probably that developer really wanted to use refs.

Gary Verhaegen recently offered this summary:

These functions are not equivalent. Just like 'send' on an agent has fundamental differences with 'swap!', 'alter' has fundamental differences with both. There is also some similarity, of course: in all three cases, the intent is to "update" the "current" value by supplying a transformation function, but:

* 'swap!' will happen immediately, in the current thread, with no regard for whatever else happens;

* 'send' will happen at some point in the (hopefully near) future on a different thread;

* 'alter' can only be used within a transaction and will do some complex magic to coordinate changes to multiple refs.

Bear in mind that Clojure is built on top of a polymorphic foundation, so names like "atom-swap!" and "ref-swap!" would not really make sense; if the point was to call them the same, they could just be called the same, like 'conj' is the same for all data structures, and like 'deref', which does have the same semantics for atoms, refs, and agents, is called the same for the three of them.

What does all of this reveal? Should we conclude that Clojure has mutable state; therefore Clojure is not really a Functional language? Not exactly. Clojure is pragmatic. If you want purity, you should look at a language such as Haskell. Clojure encourages good habits, but does not force them on you. Clojure knows that changes to its data are a dangerous thing, so it tries to get us to show some care when making changes. We still end up mutating state, just like we would in any other language, but we do so in specific ways that tend to minimize mistakes. These differences might be subtle, but the overall effect is surprisingly powerful.

Lastly, I should say that despite the wealth of options given to us for managing state, Clojure developers often end up relying on the same tools (especially message queues) that Java or even Ruby programmers would use.

Quincy's -- The Game, the prototype

In the prototype, I use vectors as a sort of primitive queue, but I would never do that in real life. I'm doing it here because I have not yet talked about queues. Later I'm going to show another example of this code, where I use Zach Tellman's library "Manifold" to put orders onto a "stream". But I have not yet talked about queues or Manifold, so I will leave that out for now.

In the final article in this series I talk about concurrent ways to simulate Quincy's as a game, but here I only want to demonstrate mutability in Clojure, so this example is single-threaded.

Check out the repo for the prototype of Quincy's -- The Game.

We have not yet implemented the work of the cooks, or waiters delivering food. All we do in this version is show what it looks like when a customer changes their order. In the final version, we will see that when an order comes in, the cooks need to fetch ingredients from the Storage and move the ingredients to the Kitchen Counter, but when an order is cancelled the cooks need to move the ingredients from the Kitchen Counter back to Storage.

When I run:

java -jar target/quincy_the_game_1-1-standalone.jar

and then I let the app run for a minute, I end up with output such as:

The customer is unhappy!

We must change the customer's order!
{:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
 :customer "5f4da75a-3f38-4cf4-8035-e4cea5c8787c",
 :menu-item-name :deep-fried-dodo-bird}

We are going to remove this order: 
{:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
 :customer "5f4da75a-3f38-4cf4-8035-e4cea5c8787c",
 :menu-item-name :deep-fried-dodo-bird}

We are going to add this order: 
{:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
 :customer "5f4da75a-3f38-4cf4-8035-e4cea5c8787c",
 :menu-item-name :tofu-and-ginger}

The customer-orders: 
({:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "5f4da75a-3f38-4cf4-8035-e4cea5c8787c",
  :menu-item-name :tofu-and-ginger}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "fd5d4f82-8bcb-457b-8d7c-70e8f1ffb81b",
  :menu-item-name :deep-fried-dodo-bird}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "28281d16-6f40-4518-858c-2698984a18ca",
  :menu-item-name :deep-fried-dodo-bird}

 {:waiter "f63b3456-ab9c-40f8-a160-fbd567a9fd94",
  :customer "a3730af3-092b-4fe5-83a2-7a889ac7be27",
  :menu-item-name :deep-fried-dodo-bird}

 {:waiter "f63b3456-ab9c-40f8-a160-fbd567a9fd94",
  :customer "ab7fbca2-2abc-452e-b00d-8082dde05c79",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "f63b3456-ab9c-40f8-a160-fbd567a9fd94",
  :customer "07f2b7ec-a032-4560-a6a0-4c838d524b95",
  :menu-item-name :deep-fried-dodo-bird}

 {:waiter "f63b3456-ab9c-40f8-a160-fbd567a9fd94",
  :customer "fe95600c-8ade-474a-9613-6ccf2a228121",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "f63b3456-ab9c-40f8-a160-fbd567a9fd94",
  :customer "f88d40f5-6472-4ee6-8c2d-9938bebc4c0e",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "3d86cd08-eaa7-4f2d-8074-0764b7c588fb",
  :menu-item-name :ham-and-bacon-omelete}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "7be459e9-7d53-452b-80a1-27ba21bdcfd9",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "b1bb5c12-b43d-4dec-bc0a-f33082c0a6b6",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "df580ec3-aaf5-434a-919b-f3c455b59d7f",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "aa23471c-cb1b-476a-becd-7f1bbcb2c88f",
  :customer "ef5f36f4-0305-41ed-abde-4e848d245286",
  :menu-item-name :ham-and-bacon-omelete}

 {:waiter "aa23471c-cb1b-476a-becd-7f1bbcb2c88f",
  :customer "3bceb0a9-fe92-40ee-abb2-85ef8cfc3d19",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "aa23471c-cb1b-476a-becd-7f1bbcb2c88f",
  :customer "b3c08145-ee73-4d36-bc04-f111442f5988",
  :menu-item-name :deep-fried-dodo-bird}

 {:waiter "aa23471c-cb1b-476a-becd-7f1bbcb2c88f",
  :customer "9a734b56-5835-419d-b00a-026c60174c45",
  :menu-item-name :ham-and-bacon-omelete}

 {:waiter "aa23471c-cb1b-476a-becd-7f1bbcb2c88f",
  :customer "71d97944-7b23-4667-baa0-93c720651f48",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "aa23471c-cb1b-476a-becd-7f1bbcb2c88f",
  :customer "cb2248f8-fcfb-45f6-9b8a-868e45e3608b",
  :menu-item-name :tofu-and-ginger}
)


The customer-orders-cancelled: 

[{:waiter "aa23471c-cb1b-476a-becd-7f1bbcb2c88f",
  :customer "71d97944-7b23-4667-baa0-93c720651f48",
  :menu-item-name :ham-and-bacon-omelete}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "b1bb5c12-b43d-4dec-bc0a-f33082c0a6b6",
  :menu-item-name :tofu-and-ginger}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "28281d16-6f40-4518-858c-2698984a18ca",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "28281d16-6f40-4518-858c-2698984a18ca",
  :menu-item-name :deep-fried-dodo-bird}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "28281d16-6f40-4518-858c-2698984a18ca",
  :menu-item-name :alfalfa-sprouts-and-dandelion-salad}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "fd5d4f82-8bcb-457b-8d7c-70e8f1ffb81b",
  :menu-item-name :tofu-and-ginger}

 {:waiter "8cce8c53-0472-4c21-8d3e-b2d8ce3b888e",
  :customer "5f4da75a-3f38-4cf4-8035-e4cea5c8787c",
  :menu-item-name :deep-fried-dodo-bird}
]

If we look in customer-orders-cancelled we see customers such as this one:

{:waiter "aa23471c-cb1b-476a-becd-7f1bbcb2c88f",
:customer "71d97944-7b23-4667-baa0-93c720651f48",
:menu-item-name :ham-and-bacon-omelete}

and if we find that customer in customer-orders, we see that they have a new order:

{:waiter "aa23471c-cb1b-476a-becd-7f1bbcb2c88f",
:customer "71d97944-7b23-4667-baa0-93c720651f48",
:menu-item-name :alfalfa-sprouts-and-dandelion-salad}

Apparently they ordered ham-and-bacon-omelete but then suddenly decided that they were vegetarian, so they went with the alfalfa-sprouts-and-dandelion-salad. If we could ban all mutation, then running Quincy's would be much simpler, and software would be simpler, and indeed, life itself would be simpler. But we must allow some mutation, so the best we can do is guard it carefully.

How to improve the game?

When offering examples in an article I am often torn about how "perfect" the example should be. Should I include all error checking, or does that cloud the point I'm trying to make? I took several shortcuts, and I should call them out.

In real software I would write this:

(defn new-customers []
  (lazy-seq
    (cons
      {:allergies (rand-int 2)
       :vegetarian (rand-int 2)}
      (new-customers))))

A lazy-seq is a sequence whose items are computed only when something asks for them, which is why it can be infinite. This is the same idea as generators in Python. You have a function that can generate each item in the sequence. Since it is "lazy" the realization is deferred.
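A minimal sketch of the idea, using a deterministic sequence so the result is easy to check. The function "numbered-customers" is made up for this example; it is not part of the game.

```clojure
(defn numbered-customers
  "Infinite lazy sequence of customer maps with ids n, n+1, n+2, ..."
  [n]
  (lazy-seq
    (cons {:id n} (numbered-customers (inc n)))))

;; Only the three items we ask for are ever computed.
(def first-three (take 3 (numbered-customers 1)))
```

The recursive call looks like it should run forever, but the body of "lazy-seq" is not evaluated until someone asks for the next item, and "take" stops asking after three.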

This line:

(take (+ (rand-int 8) 1)

hardcodes that we want to grab 1 to 8 customers. I add "1" because otherwise the range is 0 to 7, but having a party with zero customers is silly.

I am slothful and this is an example, so I'm hardcoding the "take" inside the function:

(defn new-customers []
  (take (+ (rand-int 8) 1)
        (lazy-seq
          (cons
            {:allergies (rand-int 2)
             :vegetarian (rand-int 2)}
            (new-customers)))))

I would never do that in real software.

I call "new-customers" to generate parties. I'm again using "take" and hardcoding that we want from 0 to 2 parties.

(defn new-parties []
  (map #(apply vector %)
       (take (rand-int 3)
             (lazy-seq
               (cons
                 (new-customers)
                 (new-parties))))))

Here I am fine having zero be a possible value. In the final version of the game I'm going to treat each loop in "start" as equal to 5 minutes, and of course, at a restaurant, there will be spans of 5 minutes when no new customers come in.

This part:

(take (rand-int 3)
(lazy-seq
(cons
(new-customers)
(new-parties))))))

returns a bunch of lists inside of lists:

(({:allergies 0, :vegetarian 1}
{:allergies 1, :vegetarian 1}
{:allergies 0, :vegetarian 0})
({:allergies 1, :vegetarian 1}
{:allergies 0, :vegetarian 1}
{:allergies 0, :vegetarian 0}
{:allergies 0, :vegetarian 0}
{:allergies 0, :vegetarian 1}))

and we actually want vectors, so we do:

(map #(apply vector %)

Which gives us:

([{:allergies 0, :vegetarian 0}
{:allergies 0, :vegetarian 1}
{:allergies 1, :vegetarian 1}
{:allergies 1, :vegetarian 1}
{:allergies 1, :vegetarian 0}]
[{:allergies 1, :vegetarian 0}
{:allergies 0, :vegetarian 1}
{:allergies 0, :vegetarian 0}
{:allergies 1, :vegetarian 0}])

Here we have a list of 2 parties, the parties are vectors of customers, the customers are represented by maps. The first party has 5 customers, the second party has 4 customers.

In real-life software, this would be in one function:

(map #(apply vector %)
(take (rand-int 3)

and this would be in another function:

(lazy-seq
 (cons
  (new-customers)
  (new-parties)))

Again, I'm being slothful because this is an example.
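One way that split might look is sketched below. All of the names here are hypothetical, and the sketch swaps the hand-written "lazy-seq" and "cons" for the core function "repeatedly", which also returns an endless lazy sequence. The generators know nothing about how many items are wanted; only the callers use "take".

```clojure
;; Hypothetical stand-in for one randomly generated customer.
(defn random-customer []
  {:allergies (rand-int 2)
   :vegetarian (rand-int 2)})

;; Generator only: an endless lazy sequence of customers.
(defn customer-stream []
  (repeatedly random-customer))

;; The caller decides the size of a party (1 to 8 customers).
(defn random-party []
  (vec (take (inc (rand-int 8)) (customer-stream))))

;; The caller decides how many parties arrive (0 to 2).
(defn arriving-parties []
  (take (rand-int 3) (repeatedly random-party)))
```

Because the "take" is no longer buried inside a recursive call, the party size here really is a single random draw from 1 to 8.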

With functional programming it is common to have functions that are only three or four lines long, perhaps because each function does so much. Having small functions increases composability and reuse.

"allergies" can be 0 or 1, and "vegetarian" can be 0 or 1, which gives us a grid with 4 possibilities. Here I use pattern matching, because I think this is easy to read:

(defn customer-order [{:keys [allergies vegetarian _]}]
  (match [allergies vegetarian]
         [0 0] :ham-and-bacon-omelete
         [0 1] :alfalfa-sprouts-and-dandelion-salad
         [1 0] :deep-fried-dodo-bird
         [1 1] :tofu-and-ginger))

I could have used "cond", which is Clojure's version of an if/else-if chain, but I think this is less readable:

(defn customer-order [{:keys [allergies vegetarian _]}]
  (cond
    (and (= allergies 0) (= vegetarian 0)) :ham-and-bacon-omelete
    (and (= allergies 0) (= vegetarian 1)) :alfalfa-sprouts-and-dandelion-salad
    (and (= allergies 1) (= vegetarian 0)) :deep-fried-dodo-bird
    (and (= allergies 1) (= vegetarian 1)) :tofu-and-ginger))

The pattern matching is much clearer.
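For completeness, Clojure also has a built-in "case" form that dispatches on compile-time constants, and vectors are allowed as constants. The following is a sketch of a third variant, not the version used in the game; it needs no extra library, but unlike "match" it cannot use wildcards or bind parts of the pattern.

```clojure
;; Variant of customer-order using the built-in case form.
;; Each vector on the left is a literal constant compared by value.
(defn customer-order [{:keys [allergies vegetarian]}]
  (case [allergies vegetarian]
    [0 0] :ham-and-bacon-omelete
    [0 1] :alfalfa-sprouts-and-dandelion-salad
    [1 0] :deep-fried-dodo-bird
    [1 1] :tofu-and-ginger))

(println (customer-order {:allergies 1 :vegetarian 1})) ; prints :tofu-and-ginger
```

With no default clause, "case" throws an IllegalArgumentException when nothing matches, so a malformed customer fails loudly.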

In this next line, I am relying on the destructuring that Clojure makes available:

(defn customer-order [{:keys [allergies vegetarian _]}]

This is similar to using the extract function in PHP. Very convenient. But, to be clear, Clojure destructuring goes way beyond what PHP's "extract" allows.
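To make the destructuring concrete, here is a small sketch with a made-up customer map and made-up function names. The two functions do the same thing; the second lets the argument vector do the unpacking.

```clojure
;; Hypothetical customer map, with an extra key that is simply ignored.
(def customer {:allergies 1 :vegetarian 0 :name "Ann"})

;; Without destructuring: pull each value out by hand.
(defn describe-verbose [c]
  (let [allergies (:allergies c)
        vegetarian (:vegetarian c)]
    [allergies vegetarian]))

;; With :keys destructuring: the same lookups, declared in the argument vector.
(defn describe [{:keys [allergies vegetarian]}]
  [allergies vegetarian])

(println (describe customer)) ; prints [1 0]
```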

In add-orders-to-customer-orders I examine the insides of what is being handed to me:

(defn add-orders-to-customer-orders [orders]
  {:pre [
         (vector? orders)
         (map? (first orders))
         (:waiter (first orders))
         (:customer (first orders))
         (:menu-item-name (first orders))
         ]}

This is a step down the road to "structural typing", since I am looking inside the structure I am given and I am enforcing a contract not just on "orders" but also on the item inside of "orders". This goes beyond what normal static data-type checking does. (I know some of you will argue with this, and I'll address those arguments in the article "Functional programming is not the same as static data-type checking".)

Here we are making sure that the first order has a ":waiter" key and a ":menu-item-name" key and a ":customer" key. I assume that if the first order is correctly formatted, then they probably all are. If I thought it was necessary to check every order in orders, I would probably do that in the body of the function, as checking a whole sequence might be a bit much for an assertion. Or maybe I would put it in the assertion after all. I've never done it, but I can imagine doing it.
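If checking every order did seem worthwhile, a precondition could do it with "every?". This is only a sketch: the helper name, the function name, and the placeholder body are all hypothetical.

```clojure
;; Hypothetical helper: does one order have the three expected keys?
(defn well-formed-order? [order]
  (and (map? order)
       (contains? order :waiter)
       (contains? order :customer)
       (contains? order :menu-item-name)))

;; Hypothetical variant that checks the whole vector, not just the first item.
(defn add-orders-checked [orders]
  {:pre [(vector? orders)
         (every? well-formed-order? orders)]}
  ;; Placeholder body: the real function would store the orders somewhere.
  (count orders))
```

The cost is one pass over the vector on every call, which is the "bit much" the text is worried about.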

I want to emphasize how simple and easy it can be to write these assertions. Unlike in Ruby, I am not switching to a different part of my software to write unit tests. Unlike in Java, I am not switching to different class definitions to specify the data types that I am expecting. Please consider how much work it would be to do this much checking using something like Java. I'd have to establish that "add-orders-to-customer-orders" takes a Collection full of Order and then, in the file that defines the class Order, I would specify that an Order should have 3 fields: waiter, customer and menu item name.

I never got into Test Driven Development, but I sometimes write the assertions on a function before writing the body of the function. It is simply convenient. It is literally as easy as writing a note to myself about what I expect a function to do. It is as fast as writing a comment in English on the function, yet the meaning is less open to incorrect interpretations.

Of course, we might not want to run assertions in production, since they tend to slow the software down. A nice fact about the Clojure compiler is that it checks the dynamic var *assert* when it compiles your code, and if *assert* is false it leaves the assertions out. So you can run the assertions in development (or in production if you face one of those devilish bugs that only appears in production) but when things are stable you can run without the assertions.
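The effect of turning assertions off at compile time can be sketched as follows. The function names are made up, and "eval" is used here only so that the second function is compiled while *assert* is bound to false; a real project would normally set *assert* once for the whole build.

```clojure
;; Compiled while *assert* is true (the default): the :pre check is kept.
(defn checked-half [n]
  {:pre [(even? n)]}
  (quot n 2))

;; Compiled while *assert* is false: the :pre check is left out entirely.
(binding [*assert* false]
  (eval '(defn unchecked-half [n]
           {:pre [(even? n)]}
           (quot n 2))))

(println (checked-half 4))   ; prints 2
(println (unchecked-half 3)) ; prints 1, because no assertion was compiled in
```

Calling (checked-half 3) throws an AssertionError, while (unchecked-half 3) silently returns 1.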

The big win, in my opinion, is when I change my mind. I can just adjust the assertions, or remove them. This is easier than, in Ruby, re-writing a unit test or, in Java, re-writing several class definitions. I'm not suggesting we can do without unit tests, but these runtime contracts are very convenient, easy to write, and have the advantage of giving us insight about how our software encounters the real world (that is, we are not relying on mocks to imitate the real world).

Also, about this:

(apply conj previous-customer-queue parties)

If I wanted to add parties to customer queue I could do:

(swap! customer-queue conj parties)

but that would add the whole collection as a single item, and it is each individual party inside of parties that I want to add to customer-queue. So I use "apply" to unpack what is inside of "parties".
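The difference is easy to see with made-up data. In this sketch the queue is a plain vector held in a hypothetical atom; "into" is shown as a core alternative that gives the same result as "apply conj".

```clojure
;; Hypothetical data: a queue holding one party, and two newly arrived parties.
(def queue [[:party-a]])
(def parties [[:party-b] [:party-c]])

;; conj adds the whole collection as ONE new item.
(println (conj queue parties))       ; prints [[:party-a] [[:party-b] [:party-c]]]

;; apply unpacks parties, so each party becomes its own item.
(println (apply conj queue parties)) ; prints [[:party-a] [:party-b] [:party-c]]

;; The same thing against an atom, using into instead of apply conj.
(def customer-queue (atom [[:party-a]]))
(swap! customer-queue into parties)
(println @customer-queue)            ; prints [[:party-a] [:party-b] [:party-c]]
```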

You might have noticed that I have pre and post assertions on "customer-orders":

(defn customer-orders [party name-of-waiter]
  {:pre [
         (vector? party)
         (string? name-of-waiter)
         ]
   :post [
          (vector? %)
          (:waiter (first %))
          (:customer (first %))
          (:menu-item-name (first %))
          ]}

The post assertions here match the pre assertions of "add-orders-to-customer-orders" which is one way to indicate that I expect the return value of "customer-orders" to be fed to "add-orders-to-customer-orders".
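Here is a runnable sketch of how such a pair of assertions behaves. The function name and its body are hypothetical (every customer simply orders the same dish); only the shape of the ":pre" and ":post" checks follows the excerpt above.

```clojure
;; Hypothetical stand-in for customer-orders, with a made-up body.
(defn customer-orders-sketch [party name-of-waiter]
  {:pre [(vector? party)
         (string? name-of-waiter)]
   :post [(vector? %)
          (:waiter (first %))
          (:customer (first %))
          (:menu-item-name (first %))]}
  ;; One order per customer; mapv returns a vector, as the :post demands.
  (mapv (fn [customer]
          {:waiter name-of-waiter
           :customer customer
           :menu-item-name :tofu-and-ginger})
        party))

(println (count (customer-orders-sketch [{:allergies 1 :vegetarian 1}] "Sam"))) ; prints 1
```

Passing a list instead of a vector fails the first ":pre" check with an AssertionError. Note that an empty party would fail the ":post" checks in this sketch, since (first []) is nil, so these assertions also encode the assumption that a party is never empty.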

About this:

(defn change-customer-order? []
  (if (= (rand-int 4) 0)
    true
    false))

Should predicate functions this short be independent functions? It mostly depends on whether you are going to re-use the function. I often start off with stuff like this inline and then later I refactor it to its own function, if I need to reuse it.

Also, fans of static data-typing argue that we should add return types to all functions, but this is a good example of where doing so is a waste of time. The function is 4 lines of code, but could be rewritten as 2 lines of code. Do I really need the compiler to check this function for me? There are some functions where it feels like a waste of time to check the data-type of the arguments or the return type.
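The two-line version could look like this sketch. The comparison already returns true or false, so the surrounding "if" is not needed:

```clojure
;; Returns true roughly one time in four, false otherwise.
(defn change-customer-order? []
  (zero? (rand-int 4)))
```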

Conclusions

Here are some takeaways:

1.) Clojure has mutable state

2.) Clojure has a wealth of options for handling mutable state

3.) There are ways of guarding your mutations that are simpler and just as safe as those offered by Object Oriented Programming


(Acknowledgements:

I offer a huge "Thank you" to Natalie Sidner for the tremendous editing she did on the rough draft of this post. To the extent that this article is readable, it is thanks to her. Any mistakes are entirely my fault, and I probably added them after she was done editing. If you need to hire a good editor, contact Natalie Sidner at "nataliesidner at gmail dot com".

Also, I thank Blanche Krubner for reviewing this work. As Mrs Krubner studied computer programming during the 1970s, I found it fascinating to get feedback from someone whose views of the discipline were shaped during a different era.)

Post external references

  1. http://www.amazon.com/Programming-Clojure-Stuart-Halloway/dp/1934356867
  2. https://facebook.github.io/immutable-js/docs/#/
  3.
  4. https://github.com/lkrubner/quincys-the-game-single-threaded/blob/master/src/quincy_the_game_1/customer.clj
  5. https://github.com/lkrubner/demonstrate-def
  6. http://tech.puredanger.com/2014/01/03/clojure-dependency-injection/
  7. http://clojuredocs.org/clojure.core/restart-agent
  8. https://groups.google.com/forum/#!topic/clojure/H1MQ-XzuIvw
  9. https://www.haskell.org/
  10. https://github.com/ztellman/manifold
  11. https://github.com/lkrubner/quincys-the-game-single-threaded
  12. http://php.net/manual/en/function.extract.php
  13. http://blog.brunobonacci.com/2014/11/16/clojure-complete-guide-to-destructuring/