Hash Tricks

Ruby’s Hash can accept a block when you initialize it. The block is called any time you attempt to access a key that is not present. Hash’s initialization block expects the following format:

Hash.new{|hash, key| ... }

The hash parameter references the hash itself, and the key parameter is the missing key. With this, you can initialize default values in the hash the first time they are accessed. Here are a few interesting things you can do with a hash’s initialization block.
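To make the mechanics concrete, here is a minimal sketch of when the block runs. The names `calls`, `unstored`, and `stored` are only for illustration. The block fires when a missing key is read with `[]`, and the value is kept only if the block assigns it:

```ruby
calls = 0

# This block returns a value but never stores it.
unstored = Hash.new { |hash, key| calls += 1; "no #{key}" }

first  = unstored[:a]             # block runs, returns "no a"
second = unstored[:a]             # block runs again, nothing was stored
has_a  = unstored.key?(:a)        # false
fetched = unstored.fetch(:a, 1)   # fetch uses its own default and skips the block

# This block assigns, so the key exists after the first read.
stored = Hash.new { |hash, key| hash[key] = key.to_s * 2 }
doubled = stored[:ab]             # "abab"
has_ab  = stored.key?(:ab)        # true
```

Every example in this post assigns inside the block for that reason.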

By setting the value to an array, you can easily group items in a list:

groups = Hash.new{|h,k| h[k] = [] }
list   = ["cake", "bake", "cookie", "car", "apple"]

# Group by string length:
list.each{|v| groups[v.length] << v}
groups #=> {4=>["cake", "bake"], 6=>["cookie"], 3=>["car"], 5=>["apple"]}
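For comparison, here is a sketch of the same grouping with `Enumerable#group_by`, which builds the hash in one call. The variable `by_length` is just a name picked for this example:

```ruby
list = ["cake", "bake", "cookie", "car", "apple"]

# The default-block version from above.
groups = Hash.new { |h, k| h[k] = [] }
list.each { |v| groups[v.length] << v }

# group_by produces the same keys and values.
by_length = list.group_by { |v| v.length }
same = (groups == by_length)   # true, Hash#== compares contents only

# The difference shows up on a missing key:
missing_plain = by_length[99]  # nil, a plain hash has no default
missing_block = groups[99]     # [], and the key 99 now exists in groups
```

The block version is still handy when items arrive over time rather than as one list.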

Setting the value to 0 is a good way to count the occurrences of various items in a list:

counts = Hash.new{|h,k| h[k] = 0 }
list   = ["cake", "cake", "cookie", "car", "cookie"]

# Count occurrences of each item:
list.each{|v| counts[v] += 1 }
counts #=> {"cake"=>2, "cookie"=>2, "car"=>1}
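For an immutable default like 0, the non-block form `Hash.new(0)` counts just as well. The sketch below contrasts it with a mutable default, where every missing key shares one object (`shared` is a name used only here):

```ruby
counts = Hash.new(0)
%w[cake cake cookie car cookie].each { |v| counts[v] += 1 }
# counts is {"cake"=>2, "cookie"=>2, "car"=>1}

# With a mutable default, all missing keys see the same array.
shared = Hash.new([])
shared[:a] << 1          # mutates the single default array, stores no key
shared[:b] << 2
leaked = shared[:c]      # [1, 2]
keys   = shared.keys     # []
```

That is why the grouping example needs the block form: the block creates a fresh array per key and assigns it.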

Or if you return hashes that return hashes, you can build a tree structure:

tree_block = lambda{|h,k| h[k] = Hash.new(&tree_block) }
opts = Hash.new(&tree_block)
opts['dev']['db']['host'] = "localhost:2828"
opts['dev']['db']['user'] = "me"
opts['dev']['db']['password'] = "secret"
opts['test']['db']['host'] = "localhost:2828"
opts['test']['db']['user'] = "test_user"
opts['test']['db']['password'] = "test_secret"
opts #=> {"dev"=>
           {"db"=>{"host"=>"localhost:2828", "user"=>"me", "password"=>"secret"}}, 
          "test"=>
            {"db"=>{"host"=>"localhost:2828", "user"=>"test_user", "password"=>"test_secret"}}
          }
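One thing to watch with this tree: merely reading a missing path creates it. A small sketch, using a made-up 'staging' key, shows the side effect and how `key?` checks without creating anything:

```ruby
tree_block = lambda { |h, k| h[k] = Hash.new(&tree_block) }
opts = Hash.new(&tree_block)
opts['dev']['db']['host'] = "localhost:2828"

before = opts.key?('staging')   # false, key? does not trigger the block
opts['staging']['db']           # a plain read, but it creates both levels
after  = opts.key?('staging')   # true
node   = opts['staging']        # {"db"=>{}}
```

So a typo in a lookup quietly adds an empty branch instead of returning nil.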

A block can also be used to create a caching layer:

require 'net/http'
http = Hash.new{|h,k| h[k] = Net::HTTP.get_response(URI(k)).body }
http['http://www.google.com'] # makes a request
http['http://www.google.com'] # returns cached value
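Here is the same idea without the network, as a rough sketch. `slow_square` is a hypothetical stand-in for any expensive call, and `lookups` counts how often it really runs:

```ruby
lookups = 0

# Hypothetical expensive operation; imagine an HTTP request here.
slow_square = lambda { |n| lookups += 1; n * n }

cache = Hash.new { |h, k| h[k] = slow_square.call(k) }

first  = cache[12]   # miss: calls slow_square and stores 144
second = cache[12]   # hit: the block is not called
```

After both reads `lookups` is 1. Note that a result of nil is cached too, since the block only runs when the key is absent.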

In Ruby 1.9 and later, hashes preserve insertion order, so you can cap the cache at a fixed size and evict the oldest entries:

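http = Hash.new{|h,k|
  body = h[k] = Net::HTTP.get_response(URI(k)).body
  # Evict the oldest entry once the cache holds more than 3
  h.delete(h.keys.first) if h.length > 3
  body # return the new value; otherwise a miss would return the result of the if
}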
http['http://www.google.com']
http['http://www.yahoo.com']
http['http://www.bing.com']
http['http://www.reddit.com'] # this evicts http://www.google.com
http.keys #=> ["http://www.yahoo.com", "http://www.bing.com", "http://www.reddit.com"]
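A network-free sketch of the eviction behavior follows. The names `max_entries` and `cache` and the `upcase` stand-in are assumptions for the demo. Eviction goes by insertion order, so this is a first-in, first-out cache rather than a least-recently-used one:

```ruby
max_entries = 3  # illustrative limit

cache = Hash.new do |h, k|
  value = h[k] = k.upcase                        # stand-in for the expensive lookup
  h.delete(h.keys.first) if h.length > max_entries
  value                                          # what [] returns on a miss
end

%w[a b c].each { |k| cache[k] }
hit    = cache["a"]   # hit: reading does not move "a" to the back
result = cache["d"]   # miss: stores "d" and evicts "a", the oldest insertion
keys   = cache.keys   # ["b", "c", "d"]
```

Even though "a" was read most recently, it is the first to go.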

You can also use it to compute recursive functions:

factorial = Hash.new do |h,k| 
  if k > 1
    h[k] = h[k-1] * k
  else
    h[k] = 1
  end
end

This will cache each result, so if you have computed part of a number’s factorial, it won’t need to compute it again. For instance, factorial[4] will compute the values for 1, 2, and 3 along the way, so a later call to factorial[3] already has the result. This is a somewhat contrived use, but it’s interesting nonetheless.
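A quick usage sketch makes the caching visible. It uses the same logic as above, written with a ternary:

```ruby
factorial = Hash.new do |h, k|
  h[k] = k > 1 ? h[k - 1] * k : 1
end

four       = factorial[4]      # 24
after_four = factorial.keys    # [1, 2, 3, 4]
six        = factorial[6]      # 720, built on the stored value for 4
after_six  = factorial.keys    # [1, 2, 3, 4, 5, 6]
```

Each miss recurses through the block, so asking for a very large number on an empty hash may run into Ruby's stack depth limit.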

As you can see, the default block for a Hash has a lot of interesting uses. Are there any that you find particularly useful?

Filed under ruby

16 responses to “Hash Tricks”

  1. Rockwell

    This is very cool! Thanks for sharing. It would be cool if you boiled down one or two of these examples and submitted them to http://rubyquicktips.tumblr.com/

  2. Nice exposition! I like the fixed-size hash example for 1.9.

    I’m a big fan of the caching usage as well, as I wrote about not too long ago.

  3. Pingback: Hash Tricks | Ruby Here Blog


  4. tree_block = lambda{|h,k| h[k] = Hash.new(&tree_block) }
    opts = Hash.new(&tree_block)

    You can shorten these two lines to one:

    opts = Hash.new {|h,k| h[k] = Hash.new(&opts.default_proc) }

  5. Dan

    You can shorten this:

    Hash.new{|h,k| h[k] = 0 }

    To this:

    Hash.new(0)

    Note that you can’t do this with the Array example that preceded it, because it would then reuse the same object for every key in the Hash. There are no such issues with using an integer like this, though.

  6. http://pahanix.tumblr.com/post/389121275/interesting-refactoring-1

    A code that lets you define a date by using syntax like Feb[28][2010].

    require 'date'

    FancyDate = Hash.new do |hash, month|
      Hash.new do |hash, day|
        Hash.new do |hash, year|
          Date.new(year, month, day)
        end
      end
    end

    # Define Jan..Dec as constants; month numbers are 1-based, so add 1 to the index.
    Date::ABBR_MONTHNAMES[1..-1].each_with_index do |name, index|
      Object.const_set(name, FancyDate[index + 1])
    end

  7. Tuomas

    I really liked the cache example, but the downside of all these cool Ruby tricks is that the code is utterly unreadable to other teammates, unless these become well-known practices.

    • Adam Sanderson

      Entirely true. Some of them such as using a hash to group items or count them are pretty common though.

    • wes

      If your team mates can’t figure out the code then they are just tourists anyway.

    • The same could be said of just about any Ruby feature. You could say “these blocks are really powerful, but they are pretty confusing if you don’t understand blocks”. At some point you have to draw a line and say look, being a Ruby programmer means understanding the features of the language.

      Yes, there are some things which are truly obscure. I wrote an article recently about overloading the backtick quotes, something that very few Ruby programmers know is possible. Actually using that “trick” could be deeply confusing. But using essential Hash features like the default action block ought to be a part of any established Ruby programmer’s repertoire, and I give kudos to the author for bringing more attention to it.

  8. Pingback: Getting to Know the Ruby Standard Library – Abbrev | End of Line

  9. Manoj

    The factorial example really impressed me. Great…

  10. Pingback: Трюки с передачей блока кода в хэш | Разработка на Ruby и Rails c нуля

  11. m4

    I must admit, my first reaction was “what a hacky trick” as well. On second thought, though, I can’t see anything wrong there. Maybe one might have to look up what passing a block to the new hash means. So what? I’m constantly looking things up when coding.

    And from a readability point of view? That depends on the block itself. The examples here are, in my opinion, concise and understandable. From an (object-oriented) design point of view? There is a little bit of individual behavior attached to the hash objects. I really can’t see anything wrong with that. It’s even very nicely encapsulated and transparent to users of the hashes, once they are created.
    Although there is certainly a point when creating a new class rather than “customizing” standard classes becomes more advisable.

  12. Pingback: Ruby, le langage Objet par excellence, un véritable bijou pour les développeurs - Architec TIC

  13. “Hash Tricks | End of Line” was indeed a delightful article, cannot wait to look over a lot more of your blogs.
    Time to waste a lot of time on-line haha. Thank you -Lynda