Mutable strings in Ruby

When people ask why we use Ruby instead of Python, I usually mention the beautiful, nearly-invisible English-like syntax, the ease of writing DSLs, trivially simple block arguments, meta-programming including reflection… But I have to admit that Python has a few clearly superior features.  First on the list: Python strings are better because they are immutable.

The Perils of Mutation

What’s so bad about Ruby’s mutable strings?  Consider this innocuous-looking Ruby method that puts “[EOL]” at the end of the message that is being logged:

def log_message(message)
  message << "[EOL]"
  puts message

LOOP_MESSAGE = "In the loop"

3.times do

Here’s the output:

In the loop[EOL]
In the loop[EOL][EOL]
In the loop[EOL][EOL][EOL]

What are those extra [EOL]s doing in there?!  Drop into pry and have a look at LOOP_MESSAGE after the loop has finished:

=> "In the loop[EOL][EOL][EOL]"

Uggh! Every pass through the loop mutated that “constant” LOOP_MESSAGE string.  Not good.

Avoid Mutating Methods and Operators

Let’s banish the nasty mutating “String#<<” operator:

def log_message(message)
  puts message + "[EOL]"

With that, the bug is fixed!

In the loop[EOL]
In the loop[EOL]
In the loop[EOL]

Besides String#<<, most of the other mutating operators to avoid end in !, like slice!, downcase!, etc.

Run-time Immutability with Object#freeze

Ruby has always had a run-time form of immutability.  If you call “.freeze” on an object, the Ruby runtime will raise an exception if any code tries to mutate that object:

>> pets = "cat"
=> "cat"
>> pets << " dog"
=> "cat dog"
>> pets.freeze;
>> pets << " canary"
RuntimeError: can't modify frozen String
 from (irb):8

So if you remember to freeze your string literals, you can avoid bugs with mutating shared references. You may have seen this done in your favorite Ruby gem or in Rails itself:

CONTENT_TYPE = "Content-Type".freeze

Performance Problems Caused by Mutability

Every time this code is called, the string literal “[EOL]” will be allocated again off the heap.  If this method were called inside the tightest loop in our code, the time to dynamically allocate this string literal (and later garbage collect it) can start to dominate the performance of the application.

The only reason Ruby goes to all this effort to allocate string literals every time the code is run is to allow the string to be mutated.  For example:

def concatenate(*args)
  result = ''
  args.each do |arg|
    result << arg

Non-Mutation is the Functional Way

Mutating variables is bug-prone, particularly in a language with shared references like Ruby, because side-effects from one method can change the behavior of another.  More generally, it is much harder for a programmer to understand the behavior of code that mutates because the name of an object is insufficient to predict its content—the programmer must also know the point in the object’s lifetime.

It may be less intuitive, but non-mutating code can actually perform better because objects that are never mutated can be freely shared, both by the language and the application.

Non-mutation is a cornerstone of the functional programming style.

One thought on “Mutable strings in Ruby

  1. Pingback: Magic comment ‘immutable: string’ makes Ruby 2.1′s “literal”.freeze optimization the default | Invoca Development Team Blog

Leave a Reply

Your email address will not be published. Required fields are marked *

× 2 = ten