Dennis Hackethal’s Blog

My blog about philosophy, coding, and anything else that interests me.

Tally in Ruby vs. Clojure

Published · 2-minute read

I saw that Ruby has a tally method I wasn't aware of. From the docs:

Tallies the collection, i.e., counts the occurrences of each element. Returns a hash with the elements of the collection as keys and the corresponding counts as values.

["a", "b", "c", "b"].tally  #=> {"a"=>1, "b"=>2, "c"=>1}

In other words, it counts the number of occurrences of each element in the array and then returns a dictionary containing that information.

At this point, I am interested in two things:

  1. Can I write such a function in Ruby?
  2. What would an equivalent function look like in Clojure?

First pass in Ruby:

def tally a
  a.reduce({}) do |acc, curr|
    if acc.key?(curr)
      acc[curr] += 1
    else
      acc[curr] = 1
    end
  end
end

reduce seems like the natural thing to use here. I run it:

tally ["a", "b", "c", "b"]
# raises exception: NoMethodError (undefined method `key?' for 1:Integer)

I use key? on the hash, i.e., the initial value, so maybe I didn't set the initial value correctly. I check the (very slowly loading) docs but no, I set it correctly.

What else could it be? I debug further and add a print statement to the beginning of the block so I can inspect the accumulated value:

def tally a
  a.reduce({}) do |acc, curr|
    puts acc # <== added this thing

    if acc.key?(curr)
      acc[curr] += 1
    else
      acc[curr] = 1
    end
  end
end

I run the function again, with the same input. It prints:

{}
1

So I really did set the initial value correctly. It’s the empty hash, as it should be. But the second line contains the information we need: the accumulated value changes to 1. Why? Because in Ruby, an assignment to a hash returns the assigned value, not the resulting hash. That means acc[curr] += 1 returns the new count and (the offending line) acc[curr] = 1 returns 1.
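To see that behavior in isolation, here is a minimal sketch outside of reduce. The variable names (counts, result, result_after_increment) are made up for illustration and do not appear in the code above:

```ruby
# Assigning to a hash key is an expression that evaluates to the
# assigned value, not to the hash.
counts = {}

result = (counts["a"] = 1)
# result is the Integer 1; counts is now {"a"=>1}

# `+=` expands to `counts["a"] = counts["a"] + 1`, so it evaluates
# to the new count as well.
result_after_increment = (counts["a"] += 1)
# result_after_increment is 2; counts is now {"a"=>2}
```

Inside reduce, the block's last expression becomes the next acc, which is how the Integer ends up where the hash should be.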

For variable assignment, it makes sense to return the assigned value (better than to return undefined (?!) like JavaScript does when initializing a newly declared variable in a single line). But for assignments to a hash, that strikes me as a design flaw. The data structure has changed, and in 99% of cases, I’ll need to know what it looks like as a result. I already know what I’m assigning.

Or maybe I'm just a Clojure snob. I correct my code:

def tally a
  a.reduce({}) do |acc, curr|
    if acc.key?(curr)
      acc[curr] += 1
      acc # <== return the hash, not the assigned value
    else
      acc[curr] = 1
      acc # <== return the hash, not the assigned value
    end
  end
end

Those double accs are an annoyance, but it works:

tally ["a", "b", "c", "b"]
# => {"a"=>1, "b"=>2, "c"=>1}
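As an aside, one way to sidestep the double accs while keeping the mutation is each_with_object, which always hands the same object back to the block regardless of what the block returns. This is a sketch of a different technique, not the approach I take in this post, and the name tally_with_object is hypothetical:

```ruby
# Hypothetical alternative: each_with_object ignores the block's return
# value and keeps passing along the same memo object.
def tally_with_object(a)
  # Hash.new(0) makes missing keys read as 0, so no key? check is needed.
  a.each_with_object(Hash.new(0)) do |curr, counts|
    counts[curr] += 1
  end
end

tally_with_object(["a", "b", "c", "b"])
# => {"a"=>1, "b"=>2, "c"=>1}
```

Note that the block arguments come in the opposite order from reduce: the element first, then the accumulator.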

So, what would the same function look like in Clojure? Like this:

(defn tally [v]
  (reduce
    (fn [acc curr]
      (if (contains? acc curr)
        (update acc curr inc)
        (assoc acc curr 1)))
    {}
    v))

And it works:

(tally ["a" "b" "c" "b"])
; => {"a" 1, "b" 2, "c" 1}

I like that much more. Eight lines instead of eleven, no double accs, and Clojure gives me the functions Ruby makes me write manually: inc, update, and assoc. The reason there are no double accs is that update and assoc return the resulting data structure of their respective operations rather than the assigned value. Also, there's no mutation, whereas my Ruby code mutates the hash at every iteration. That's not a big deal, since the only code that has access to the hash is the block, but having no mutation at all just feels cleaner.

There is a way to avoid the double accs and mutations in Ruby using merge:

def tally a
  a.reduce({}) do |acc, curr|
    if acc.key?(curr)
      acc.merge(curr => acc[curr] + 1)
    else
      acc.merge(curr => 1)
    end
  end
end

And it works:

tally ["a", "b", "c", "b"]
# => {"a"=>1, "b"=>2, "c"=>1}

That feels better. The first call to merge still feels a bit low level, but we're down to nine lines and have gotten rid of mutation.
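The if can probably go as well. Here is a minimal sketch, assuming Hash#fetch with a default of 0 is an acceptable stand-in for the key? check. The name tally_merge is hypothetical, and the frozen seed hash is only there to demonstrate that nothing gets mutated:

```ruby
# Hypothetical variant of the merge version without the if.
def tally_merge(a)
  # The seed is frozen to show that no iteration mutates it:
  # merge returns a new hash and leaves the receiver untouched.
  a.reduce({}.freeze) do |acc, curr|
    # fetch(curr, 0) returns the current count, or 0 for an unseen key.
    acc.merge(curr => acc.fetch(curr, 0) + 1)
  end
end

tally_merge(["a", "b", "c", "b"])
# => {"a"=>1, "b"=>2, "c"=>1}
```

The trade-off is that every iteration allocates a new hash, which may matter for large arrays.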

Overall, I still prefer the Clojure version. It's more convenient and concise. But it's possible to bring the Ruby version pretty close to that.


What people are saying

Looking at this again, I noticed that I mix having and skipping parentheses in Ruby for method invocations. It would probably be better to settle on one approach (maybe parentheses because those are never ambiguous) and then use it consistently.

#6 · Dennis (verified commenter) ·

You can actually avoid the if in Clojure. Just make the body of your function: (reduce (fn [acc cur] (update acc cur (fnil inc 0))) {} v)

#134 · Richard (people may not be who they say they are) · · Referenced in comment #247

Yes, nice.

#138 · dennis (verified commenter) · in response to comment #134

Were you aware of Clojure's clojure.core/frequencies function? It appears quite similar to tally.

#244 · mksybr (people may not be who they say they are) ·

I wasn't. Here's the link for those interested: https://clojuredocs.org/clojure.core/frequencies

The source code looks fairly close to what I do above. One difference is that they use another solution to the problem of inferring zero when a key does not yet exist:

(defn frequencies
  "Returns a map from distinct items in coll to the number of times
  they appear."
  {:added "1.2"
   :static true}
  [coll]
  (persistent!
   (reduce (fn [counts x]
             (assoc! counts x (inc (get counts x 0))))
           (transient {}) coll)))

Namely, passing 0 to get.

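They also use transient and persistent!, which my solution lacks. As far as I understand, those are a performance optimization: the map is updated in place while it is being built and only turned back into a persistent map at the end. Without them, my solution still returns the correct result, but it creates a new persistent map at every step, which will be slower when the map gets large.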

#247 · dennis (verified commenter) · in response to comment #244
