Stretching the Language with Macros; Implementing a `for` Loop In Crystal

·

11 min read

This article was originally published a couple of years ago on New Relic's now-defunct Relicans site. I am reprinting it here as I was the original author.


What Are Crystal's Looping Structures?

The Crystal language offers three different language-level looping structures. It provides a while loop and an until loop, conditional looping structures, which loop over the body of the loop while the provided condition is true, or until the provided condition is true. Crystal also offers an unconditional, infinite looping construct using loop do. In addition, Crystal also provides iterators for various collection classes, like Array or Hash, in the form of an #each method which can be called on the collection types.

This set of looping constructs is sufficient to write any loop that one wants. If this were a computer science course, you might be taught several different types of loops.

  1. Infinite Loops -- repeat without any way of escaping the loop.

  2. Count Controlled Loops -- Repeat a specific number of times.

  3. Condition Controlled Loops -- Repeat until a specific condition is met.

In addition, Condition Controlled Loops have a couple of specific subtypes, the entry-controlled loop, and the exit-controlled loop. The condition is checked in the entry-controlled loop before the loop body is executed, which raises the possibility that the loop body will never be executed. In the exit-controlled loop, the condition is checked after the loop body executes, which means that the loop body will always run at least a single time.

All of these loop types are possible with the existing looping constructs for Crystal, including loops like exit-controlled loops where the body will be executed at least once:

counter = 0
loop do
  counter += 1
  break if rand() < 0.1
end

puts counter
# The counter will be at least 1.

Crystal also offers convenient methods on collection classes for iterating through the elements of that class. So, if one has a hash, one can iterate through all of the elements in the hash:

h = {"a" => 1,
     "b" => 2,
     "c" => 3}

h.each do |key, value|
  puts "#{key} -> #{value}"
end

Introducing Javascript's for/in and for/of

Some languages implement loops in different ways from Crystal. Javascript uses for as one of its primary looping keywords, with several different types of loops all possible using that keyword. For example, with Javascript, one can iterate through all of the keys or all of the values in an object with for/in or for/of.

ary = ["first", "second", "third"]

ary.somethingNew = "Cool, a new value."

console.log("Keys:")
for (var key in ary) {
  console.log(key)
}

console.log("Values:")
for (var value of ary) {
  console.log(value)
}

The Star of the Show, the C-style for Loop

Javascript also implements a C-style for loop using the same keyword. The C-style for loop takes three arguments, an initialization clause, a test clause, and an update clause. It operates by first executing the initialization clause to set up any preconditions that should be in place before the loop starts. It then enters the loop.

It checks the test clause, and if that clause evaluates to true, the loop body will execute. After the loop body executes, the update clause will run.

This is a versatile structure that can be used for various types of iteration. It is one that anyone with C or any C-family programming language experience will be familiar with. Most types of a loop (except for exit-controlled loops) can be implemented using only the C-style for loop syntax.

Everything that can be done with for in Javascript can be done with an existing looping feature of the Crystal language. However, people coming from languages like Javascript still sometimes wonder whether Crystal offers a loop that they are familiar with.

Does Crystal Offer a for Loop?

While the answer to that question is "No," Crystal has some fantastic dynamic/metaprogramming capabilities that make it possible to change that answer somewhat.

Introducing iterate

Let me introduce you, first, to a looping structure that is similar to the Javascript for/in|for/of loops:

h = {"a" => 1,
     "b" => 2,
     "c" => 3}

iterate(key, value, using h) do
  puts "#{key} -> #{value}"
end

This looping structure is interesting principally because it showcases Crystal macros' power in a small-scale item. It introduces apparent keywords and syntax that don't exist elsewhere in the language, and while not syntactically identical to the Javascript for/in|for/of loops, it is reminiscent of them.

Functionally, the above code is identical to this standard Crystal code:

h = {"a" => 1,
     "b" => 2,
     "c" => 3}

h.each do |key, value|
  puts "#{key} -> #{value}"
end

In this case, the native Crystal syntax is superior, but let's take a quick look at how this alternative looping structure is implemented before moving on to the following, more useful example.

Implementing using

The simplest piece is the using part of that syntax. That is implemented as a macro that does nothing more than return whatever it is given:

macro using(target)
  {{ target }}
end

Its only purpose is to be syntactic sugar, allowing the using word in code without having any material effect on the argument that follows it.

Implementing iterate

The implementation of the iterate loop is a little bit more interesting:

macro iterate(*elements, &blk)
  {% target = elements[-1] %}
  {{ target }}.each do |{{ elements[0..-2].join(", ").id }}|
    {{ blk.body.id }}
  end
end

Crystal macro definitions generally look like regular Crystal methods, except that the code they define is executed at compile-time and is used to write other (compliable) code. This means that they can take splat arguments, have default arguments, accept blocks, etc., just like traditional method definitions.

The above macro accepts any number of arguments and a block. The last argument is assumed to be the object that is to be iterated over. In contrast, the first arguments are the variable or variables that will receive the elements that will be iterated over.

The macro then rewrites those arguments into a familiar #each loop. It is a fun and hopefully digestible piece of macro code, but this loop doesn't offer any benefits over what we already have. However, consider this trivial example:

Does Crystal Offer a for Loop?

require "prime"
require "for"

prime = 4 # not a prime
for({max = 2147483647}, ->(){!prime.prime?}, {prime = rand(max)})

puts "random prime number: #{prime}"

It is roughly equivalent to this:

require "prime"
require "for"

prime = 4 # not a prime
max = 2147483647
while !prime.prime?
  prime = rand(max)
end

puts "random prime number: #{prime}"

Except that in the for example, the loop runs in its own scope, making it more like:

require "prime"
require "for"

prime = 4 # not a prime
->() do
  max = 2147483647
  while !prime.prime?
    prime = rand(max)
  end
end.call

puts "random prime number: #{prime}"

It can also be done as an exit-controlled loop:

require "prime"
require "for"

prime = uninitialized Int32
do_until({max = 2147483647}, ->(){prime.prime?}, {prime = rand(max)})

puts "random prime number: #{prime}"

This is roughly equivalent to:

require "prime"
require "for"

prime = uninitialized Int32
->() do
  max = 2147483647
  loop do
    prime = rand(max)
    break if prime.prime?
  end
end.call

puts "random prime number: #{prime}"

Using Macros, It Can!

The use of a for loop here allows the code to be more concise.

The implementation of for and do_until loops provides a few features.

They both take init, test, and update clauses, just like the C-style for loop, along with an optional block for the loop's body. As demonstrated, the test and update clauses might be sufficient without any other loop body for trivial things. In addition, they both run in their own scope, as closures to the current scope, and finally, that scope, which is a Proc, is itself assignable to a variable, just like any other scope.

This last capability is legitimately useful, as it transforms the loop into an active entity that can be stored, passed around, and invoked at will.

Consider the following example, which simulates an HTTP worker API that would receive jobs via an HTTP request and then run them on an internal worker pool.

require "for"
require "http/server"

jobs = [] of HTTP::Server::Context
handlers = [] of Proc(Nil)
queue = Channel(Tuple(HTTP::Server::Context?, Channel(Nil))).new(1000)

8.times do
  handlers << for(
    {counter = 1},
    ->{ tup = queue.receive? },
    {counter += 1},
    run: false) do
    puts "REQ #{counter} -- #{tup[0]}"
    tup[1].send(nil)
  end
end

server = HTTP::Server.new do |context|
  pong = Channel(Nil).new
  queue.send({context, pong})
  pong.receive # Worker has finished; return response
end

spawn(name: "work loop") do
  handlers.each { |handler| spawn(name: "worker") { handler.call } }
end

server.bind_tcp 8080
server.listen

The server creates eight handlers, each of which is just a for loop that is not yet running. Those for loops, when started, either receive a tuple containing a request and a channel to send a signal on when the request is done being processed or it can receive nil. A nil will signal to the worker to exit the loop (shut down). Otherwise, it just keeps looping, accepting work, and incrementing a counter of the number of jobs it has acted upon.

There is nothing here that you can't code without using the for looping macro, but using it makes the whole thing a little more brief, and the implementation of both the for and the do_until macros is itself concise.

Implementing for

macro for(init = "", test = "", update = "", run = true, &blk)
  ->() do
    {{ (init.is_a?(ProcLiteral) ? init.body : init).id }}
    while {{ (test.is_a?(ProcLiteral) ? test.body : test).id }}
      {{ blk.is_a?(Nop) ? "".id : blk.body.id }}
      {{ (update.is_a?(ProcLiteral) ? update.body : update).id }}
    end
  end{{ run ? ".call".id : "".id }}
end

The implementation allows for some interesting behavior. The code from the various clauses is accessed through the #body method if it is passed as a ProcLiteral, or through the #id method if passed any other way. The intention of this was that one could pass code that the compiler will not successfully evaluate as a Proc, as a String instead. For example:

for(->(){ counter = 0 }, ->(){ counter < 10 }, ->(){ counter += 1 }) { puts counter }

This code fails with Error: '+=' before definition of 'counter'. The error is because when Crystal evaluates that expression, it doesn't know that it's part of a macro and won't execute as written; that the code therein will become part of the code assembled by the macro. So it throws an error because we can't call #+= on something that is not defined yet.

So, the intention was that one could do this:

for(%( counter = 0 ), %( counter < 10 ), %( counter += 1 )) { puts counter }

That will work just fine. However, it is a little ugly. Then an interesting thing happened. I wrote this one time while testing:

{counter = 0}

And my code still worked!

Detour! What is Happening Here?

That code, { counter = 0 } itself evaluates to a Tuple, {0}. However, when calling #id on it from within a macro, Crystal returns the full text for the Tuple declaration. It is completely legal to have any code that one wants inside of a Tuple declaration. So while a Tuple can't be used for the test clause of a for loop since any Tuple is a truthy value, it can be used for the initialization and the update portions, which results in something that looks like this:

for({ counter = 0 }, ->(){ counter < 10}, { counter += 1}) { puts counter }

That works because Crystal thinks that the variable declarations inside of those Tuple declarations are going to be executed in the current scope, and thus counter will be initialized when { counter = 0 } is executed and will therefore be known when the equality check and the increment operations happen after.

This isn't what happens.

That code ends up being part of the for macro, executed in its own scope (so there are no side effects like creating a counter variable in the current scope). It conveniently makes the for macro syntax a little more pleasant to read than if strings were used to express the code.

Implementing do_until

macro do_until(init = "", test = "", update = "", run = true, &blk)
  ->() do
    {{ (init.is_a?(ProcLiteral) ? init.body : init).id }}
    loop do
      {{ blk.is_a?(Nop) ? "".id : blk.body.id }}
      {{ (update.is_a?(ProcLiteral) ? update.body : update).id }}

      break if {{ (test.is_a?(ProcLiteral) ? test.body : test).id }}
    end
  end{{ run ? ".call".id : "".id }}
end

This is similar to for, except that a loop do is implemented to check the exit condition at the bottom of the loop body before it repeats.

Final Thoughts

Macros are a powerful tool in Crystal. They make it possible to do some pretty fantastic dynamic and metaprogramming tasks in Crystal that would otherwise be extremely difficult or impossible to accomplish. A few lines can augment or alter syntax or create entirely new and valuable syntax and tools.

You can look at the code for this for implementation, as well as how to integrate this shard into your code, here. Please file an issue if you try it and run into any problems.