Beware of Ruby libraries that generate way too many objects

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com

This is several years old and Ruby garbage collection has gotten better, but still, the point about certain libraries being excessive remains valid.

Be aware that you are allocating objects, for instance something as simple as 100.times{ ‘foo’ } allocates 100 string objects (strings are mutable and therefore each version requires its own memory allocation).

Make sure to evaluate the libraries you use, for instance switching a Sinatra XML rendering action from Builder to Nokogiri XML Builder saved us about 12k object allocations (Thanks Aaron Patterson). Make sure that if you are using a library allocating a LOT of objects, that other alternatives are not available and your choice is worth paying the GC cost. (you might not have a lot of requests/s or might not care for a few dozen ms per requests). You can use memprof or one of the many existing tools to check on the GC cycles using load tests or in dev mode. Also, be careful to analyze the data properly and to not only look at the first request. Someone sent me this memory dump from a Rails3 ‘hello world’ with Ruby 1.8.7 and it shows that Rails is using 331973 objects. While this is totally true, it doesn’t mean that 330k objects are created per request. Instead that means that 330k objects are currently in memory. Rubygems loading already allocate a lot of objects, Rails even more but these objects won’t be GC’d and don’t matter as much as the ones allocated every single request. The total amount of memory used by a Ruby process isn’t that important, however the fluctuation forcing the GC to run often is. This is why my middleware only cares about the allocation change during a request. (The GC should still traverse the entire memory so, smaller is better)

The more object allocation you do at runtime/per request, the more the GC will need to run, the slower your code will be. So this is not a question of memory space, but more of performance. If your framework/ORM/library/plugin allocates too many objects per request maybe you should start by reporting the problem and if you can, offer some patches.

What about a better GC?

That’s the easy answer. When I mentioned this problem with Rails, a lot of people told me that I should use JRuby or Rubinius because their GC were much better. Unfortunately, that’s not that simple and choosing an alternative implementation requires much further research and tests.

But what annoys me with this solution is that using it is not solving the issue, it’s just working around it. Yes, Ruby’s GC isn’t that great but that’s the not the key issue, the key issue is that some libraries/frameworks allocate way too many objects and that nobody seems to care (or to even know it). I know that the Ruby Core Team is working on some optimizations and I am sure Ruby will eventually get an improved GC. In the meantime, it’s easy to blame Matz, Koichi and the rest of the core team but again, it’s ignoring that the root cause, totally uncontrolled memory allocation.

Maybe it’s time for us, Rubyists, to think a bit more about our memory usage.

Source