December 19th, 2012
(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: email@example.com
Tony Arcieri has a long post about how Ruby can adapt to the new multi-core, multi-threaded world we live in. He has many smart suggestions about how Ruby can be changed, but I am curious why anyone would even bother. Why not simply use Clojure? If you want a language with the advanced meta-programming of Ruby, but one that has already implemented Arcieri’s ideas (or made them irrelevant with better ideas), then the obvious solution is to use Clojure. It’s important to realize how weak some of Arcieri’s suggestions are: even if they were fully implemented, the programmer would still be left manually tracking things such as which thread has ownership of an object. Compared to what dosync offers in Clojure, where changes to shared refs are coordinated inside a transaction and nobody has to track ownership by hand, that is pathetic. Remember when Ruby was on the cutting edge of the best ideas for increasing programmer productivity? Even if all of Arcieri’s ideas were implemented, Ruby would still be left with fewer facilities for multi-threaded apps than Clojure.
These are some of Arcieri’s ideas:
Copying object graphs every time we pass a reference to another thread is one solution to providing both mutability and thread safety, however making copies of object graphs is a lot slower than a zero copy system. Can we have our cake and eat it too: zero-copy mutable state objects that are free of any potential concurrent mutation bugs?
There’s a great solution to this: we can pin whole object graphs to a single thread at a time, raising exceptions in other threads that may hold a reference to any object in the graph but do not own it and attempt to perform any type of access. This idea is called ownership transfer.
The Kilim Isolation-Typed Actor system for Java is one implementation of this idea. Kilim supports the idea of “linear ownership transfer”: only one actor can ever see any particular object graph in the system, and object graphs can be transferred wholesale to other actors, but cannot be shared. For more information on the messaging model in Kilim, I definitely suggest you check out the portion of Kilim-creator Sriram Srinivasan’s talk on the isolation system Kilim uses for its messages.
Another language that supports this approach to ownership transfer is Go. References passed across channels between goroutines change ownership. For more information on how this works in Go, I recommend checking out Share Memory By Communicating from the Go documentation. (Edit: I have been informed that Go doesn’t have a real ownership transfer system and that the idea of ownership is more of a metaphor, which means the safety guarantees around concurrent mutation are as nonexistent as they are in Ruby/Celluloid)
Ruby could support a similar system with only a handful of methods. We could imagine Object#isolate. Like the other methods I’ve described in this post, this method would need to do a deep traversal of all references, isolating them as well so as to isolate the entire object graph.
Moreover, to be truly effective, isolation would have to apply to any object that an isolated object came in contact with. If we add an object to an isolated array, the object we added would also need to be isolated to be safe. This would also have to apply to any objects referenced from the object we’re adding to the isolated aggregate. Isolation would have to spread like a virus from object-to-object, or otherwise we’d have leaky bits of our isolated aggregate which could be concurrently accessed or mutated without errors.
If a reference to an isolated object were to ever leak out to another thread, and that thread tried to reference it in any way, the system would raise an OwnershipError informing you that an unpermitted cross-thread object access was performed. This would prevent any concurrent access or mutation errors by simply making any cross-thread access to objects without an explicit transfer of ownership an error.
To pass ownership to another thread, we could use a method like Thread#transfer_ownership(obj) which would raise OwnershipError unless we owned the object graph rooted in obj. Otherwise, we’ve just given control of the object graph to another thread, and any subsequent accesses by ourselves will result in OwnershipError. If we ever want to get it back again, we will have to hand the reference off to that other thread, and the other thread must explicitly transfer control of the object graph back to us.
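To make the proposal above concrete, here is a minimal sketch of ownership transfer in plain Ruby. None of it exists in Ruby today: OwnershipError, the Isolated wrapper class, and its transfer_ownership method are hypothetical names used for illustration, and the shape differs from the Object#isolate and Thread#transfer_ownership(obj) methods described in the quote. It also skips the deep, "viral" traversal entirely: only calls made through the wrapper are checked, so any inner reference you pull out of it (an element of the array, say) escapes unchecked.

```ruby
# Hypothetical sketch: these names are illustrative, not part of Ruby.
class OwnershipError < StandardError; end

# Wraps an object and forwards method calls only from the owning thread.
class Isolated
  def initialize(target)
    @target = target
    @owner  = Thread.current   # the creating thread starts as the owner
    @lock   = Mutex.new
  end

  # Hand the wrapped object to another thread. Only the current owner may do this.
  def transfer_ownership(new_owner)
    @lock.synchronize do
      check_owner!
      @owner = new_owner
    end
    self
  end

  # Forward everything else to the wrapped object, after an ownership check.
  # Methods already defined on Object (to_s, inspect, ==) are not intercepted.
  def method_missing(name, *args, &block)
    @lock.synchronize { check_owner! }
    @target.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end

  private

  def check_owner!
    return if Thread.current.equal?(@owner)
    raise OwnershipError, "object is owned by another thread"
  end
end

graph = Isolated.new([1, 2, 3])
graph << 4                         # allowed: the main thread owns the array

inbox  = Queue.new
worker = Thread.new do
  g = inbox.pop                    # blocks until the main thread hands it over
  g << 5                           # allowed: the worker is now the owner
  g.size
end

graph.transfer_ownership(worker)   # give up ownership, then pass the reference
inbox << graph
worker_size = worker.value         # size of the array as seen by the worker

begin
  graph.size                       # we no longer own it
  lost_ownership = false
rescue OwnershipError
  lost_ownership = true
end
```

Notice what the sketch leaves to the programmer: the main thread has to remember that it gave the array away, and the only thing that reminds it is an exception at runtime.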