# Roll Your Own Ruby Type Checking: Part 1

26 December 2022

Earlier this year, I was talking with a colleague about Sorbet and other type-checking features that are available for Ruby. Although we were both familiar with the use of gradual typing in other languages such as TypeScript and Python, neither of us really knew how this would work under the hood.

This post begins with an example from the Sorbet documentation, and builds up the machinery to support a simplistic type-checking implementation. While our implementation will not be production-ready like Sorbet, I hope that the exploration will be enlightening.

Later posts will refine the type-checking implementation, making it more versatile, and capable of being used with more sophisticated Ruby programs.

## Type-checking

But first, what does it mean to check types?

In a dynamically-typed language such as Ruby, the types of variables, parameters, and so on are typically determined at runtime. That is, they can vary depending on how your program is run. A variable that is a String in some cases may be Numeric in other cases. The purpose of a type-checker is to detect bugs that may occur due to mismatched types. An example of this is passing a String as an argument when a number is expected.
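To make the String-versus-number case concrete, here is a minimal sketch. The describe_total method is a hypothetical helper invented for this illustration. Nothing stops a caller from passing a String, and the mistake only surfaces once the method body actually runs:

```ruby
# Hypothetical helper for illustration; not part of the post's Repeater code.
def describe_total(count)
  "Total: " + (count + 1).to_s
end

puts describe_total(2) # an Integer works as intended

begin
  describe_total("2") # a String is accepted until String#+ rejects the Integer
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```

A type-checker aims to report this kind of mismatch at the method boundary, rather than somewhere deep inside the body.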

Where does Sorbet fit into this?

## Sorbet

Sorbet is a powerful type-checker, designed specifically for Ruby. Its annotations are written using ordinary Ruby language features, unlike type-checkers that rely on pre-processors and other tricks. The Sorbet ecosystem also provides plugins for IDE integration and additional development tooling.

Let’s look at some code. When we visit the Sorbet website, we’re greeted with this little example, illustrating just how easy it is to add type-checking to both new and existing code:

extend T::Sig

sig {params(name: String).returns(Integer)}
def main(name)
  puts "Hello, #{name}!"
  name.length
end


We can see from this example that Sorbet provides type-checking through the use of sig annotations.

As you would probably expect from the Ruby ecosystem, this little snippet is built upon quite a bit of magic. In this post, we’ll attempt to re-invent our own annotation machinery, and use it to build a simple type-checker.

NOTE: All of this is just one way that type-checking could be implemented. This should not be considered a description of how Sorbet itself works!

## Repeater

For the examples in this post, we’ll work with a toy class called Repeater. All this class does is repeat a string str a given number of times count, with each copy of the string separated by separator. Each version of this class will be given its own number, e.g. Repeater1 below:

class Repeater1
  def repeat(str, count, separator: '')
    Array.new(count, str).join(separator)
  end
end


This class is very easy to test from IRB, or from a script:

puts Repeater1.new.repeat('test', 3, separator: ', ')


In this case, the output will look like:

test, test, test


Now let’s say we want to add some functionality to this class, so that we print its arguments before the body is executed, and print out the return value before it is returned to the caller.

This is how we could implement this:

class Repeater2
  def repeat(*args, **kwargs)
    puts "before: #{args}, #{kwargs}"
    fn = -> (str, count, separator: '') do
      Array.new(count, str).join(separator)
    end
    # forward keyword arguments as well as positional ones
    ret = fn.call(*args, **kwargs)
    puts "after: #{ret}"
    ret
  end
end


What is unusual about this class is that the repeat method does not directly specify its parameters. Instead, it uses *args and **kwargs so that we can easily print out the arguments using puts. This will print the actual arguments, not just those that the method expects. The code from the original repeat method is captured within a lambda literal that specifies the actual parameters.
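As a small sketch of how the splat operators divide up a call, the hypothetical capture method below (not part of the Repeater classes) simply returns what it received. Positional arguments land in an Array, and keyword arguments land in a Hash keyed by symbol:

```ruby
# Hypothetical helper that returns its arguments instead of printing them.
def capture(*args, **kwargs)
  [args, kwargs]
end

args, kwargs = capture('test', 3, separator: ', ')

p args              # positional arguments, in call order
p kwargs            # keyword arguments, keyed by symbol
p kwargs[:separator]
```

Passing both collections on with `*args, **kwargs` reproduces the original call, which is what lets a wrapper stay agnostic about the wrapped method’s parameters.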

Now if we were to run the following line of code:

puts Repeater2.new.repeat('test', 3, separator: ', ')


It would produce the following output:

before: ["test", 3], {:separator=>", "}
after: test, test, test
test, test, test


## Hooks

Of course, we would like to simplify Repeater2 so that the code to run before and after the method body is more clearly separated, using an annotative style. Ideally something like this:

before -> (*args, **kwargs) { puts "before: #{args}, #{kwargs}" }
after -> (returns) { puts "after: #{returns}" }
def repeat(str, count, separator: '')
  Array.new(count, str).join(separator)
end


In this snippet, before is a hook that is called with the arguments supplied to the method, but before the body of the method is executed. And after is a hook that is called with the value returned by the method.

If we consider that Ruby class declarations are executed from top to bottom, and existing methods can be called from within the class body, then we see that before and after could be provided by extending a module.
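Here is a minimal sketch of that observation. The Notes module, the note method and the Widget class are all hypothetical names chosen for this illustration. Each call to note records which instance methods exist at that moment, showing that the class body really is executed line by line:

```ruby
# Hypothetical module; extending it adds `note` as a class-level method.
module Notes
  def note(label)
    log << [label, instance_methods(false)]
  end

  def log
    @log ||= []
  end
end

class Widget
  extend Notes

  note 'first'  # runs before spin exists
  def spin; end
  note 'second' # runs after spin has been defined
end

p Widget.log # => [["first", []], ["second", [:spin]]]
```

Because note is an ordinary method call made while the class body runs, it can see (and remember) state for whichever method definition comes next.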

How does the Hooks module work?

We can see from the snippet above that we need to define a module that provides before and after methods. We’ll call this module Hooks. before and after are very simple setter methods that store a lambda in an instance variable on the class, so that it can be used later. We’ll call this the hook context for a method.

module Hooks
  def before(tag)
    @before = tag
  end

  def after(tag)
    @after = tag
  end


We can then use Ruby’s powerful meta-programming functionality to augment method declaration. In this case, we’re making use of method_added, which is called whenever a method is added to a class:

  def method_added(name)
    # short-circuit, to avoid infinite loop
    return unless @before || @after

    # reset hook context, but store current values
    before = @before
    @before = nil
    after = @after
    @after = nil

    # capture the original method
    meth = instance_method(name)

    # wrap the original method with calls to before and after hooks,
    # forwarding positional arguments, keyword arguments and any block
    define_method(name) do |*args, **kwargs, &block|
      before.call(*args, **kwargs) if before
      ret = meth.bind(self).call(*args, **kwargs, &block)
      after.call(ret) if after
      ret
    end
  end
end


define_method is used to replace the newly added method with a block that wraps the original method with calls to our before and after hooks. We have to be careful to clear @before and @after before calling define_method, otherwise this code will keep triggering method_added, and we’ll end up with a stack overflow.
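The re-triggering behaviour is easy to observe in isolation. In the sketch below, Tracker and Sample are hypothetical names used only for this demonstration. The module records every name passed to method_added, and redefining a method with define_method shows up as a second entry:

```ruby
# Hypothetical module that records each method_added notification.
module Tracker
  def added_names
    @added_names ||= []
  end

  def method_added(name)
    added_names << name
  end
end

class Sample
  extend Tracker

  def greet
    'hi'
  end

  # Redefining the method fires method_added a second time.
  define_method(:greet) { 'hello' }
end

p Sample.added_names # => [:greet, :greet]
p Sample.new.greet   # => "hello"
```

This is why a wrapper that calls define_method from inside method_added needs some guard, such as clearing the hook context first.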

We can finally implement Repeater3, extending the Hooks module:

class Repeater3
  extend Hooks

  before -> (*args, **kwargs) { puts "before: #{args}, #{kwargs}" }
  after -> (returns) { puts "after: #{returns}" }
  def repeat(str, count, separator: '')
    Array.new(count, str).join(separator)
  end
end


Let’s check that it works as expected, using the same test code as earlier:

puts Repeater3.new.repeat('test', 3, separator: ', ')


This should produce exactly the same output:

before: ["test", 3], {:separator=>", "}
after: test, test, test
test, test, test


## Type Checking

The next step is to develop this into a convenient, if somewhat primitive, type-checking system.

Our first thought may be to use our Hooks module to write bespoke type-checking code like this:

before -> (str, count, separator: nil) do
  raise "invalid str" unless str.is_a? String
  raise "invalid count" unless count.is_a? Numeric
  raise "invalid separator" \
    unless separator.is_a?(String) || separator.nil?
end
after -> (ret) do
  raise "invalid return value" unless ret.is_a? String
end
def repeat(str, count, separator: '')
  Array.new(count, str).join(separator)
end


While this may work, it would be time-consuming, error-prone, and not particularly easy to maintain. We would like to streamline this so that we write something more compact, like this:

typedef { params(String, Numeric, separator: String).returns(String) }
def repeat(str, count, separator: '')
  Array.new(count, str).join(separator)
end


There’s a fair bit going on here. Firstly, we’re no longer using before and after. Instead we’re using a hook called typedef. Within the typedef call, we’re also using methods called params and returns.

This is getting a bit closer to the interface offered by Sorbet:

sig {
  params(str: String, count: Integer, separator: String).returns(String)
}
def repeat(str, count, separator: '')
  Array.new(count, str).join(separator)
end


## Method Parameters

One of the interesting problems here is that we’re using both positional parameters (str and count) and named/keyword parameters (separator).

Sorbet uses some additional metaprogramming magic to handle both kinds in the same way (this is not the only reason for that magic, but it is one that particularly stands out). However, for the purposes of this experiment, we’ll keep things simple by handling them separately.
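For the curious, Ruby itself can report which parameters are positional and which are keywords. The sketch below uses UnboundMethod#parameters on a hypothetical Example class. It is only meant to show that the information is available through reflection, and it should not be read as a description of what Sorbet does:

```ruby
# Hypothetical class with the same method signature as Repeater.
class Example
  def repeat(str, count, separator: '')
    Array.new(count, str).join(separator)
  end
end

params = Example.instance_method(:repeat).parameters
p params # => [[:req, :str], [:req, :count], [:key, :separator]]

# Split the parameter names by kind.
positional = params.select { |kind, _| %i[req opt].include?(kind) }.map(&:last)
keyword    = params.select { |kind, _| %i[key keyreq].include?(kind) }.map(&:last)

p positional # => [:str, :count]
p keyword    # => [:separator]
```

A checker that keyed all of its types by name could use this kind of lookup to match each name back to a positional slot or a keyword.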

## Types

What we end up with is a Types module that provides two methods, params and returns, which can be used to specify the types for a method. These methods both return self, which is what allows them to be chained.
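The chaining trick can be shown on its own with a stripped-down sketch. Chainable, Demo and recorded are hypothetical names for this illustration, and the module stores positional types only:

```ruby
# Hypothetical, cut-down module showing only the chaining behaviour.
module Chainable
  def params(*types)
    @params = types
    self # returning self lets the caller chain another call
  end

  def returns(type)
    @returns = type
    self
  end

  def recorded
    [@params, @returns]
  end
end

class Demo
  extend Chainable

  # params returns Demo itself, so returns is called on Demo as well.
  params(String, Integer).returns(String)
end

p Demo.recorded # => [[String, Integer], String]
```

If params returned anything other than self, the chained call to returns would be sent to that other object instead of the class.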

Here is the new Types module in all its glory:

module Types
  def params(*arg_types, **kwarg_types)
    @arg_types, @kwarg_types = arg_types, kwarg_types
    self
  end

  def returns(ret_type)
    @ret_type = ret_type
    self
  end

  def typedef
    yield
  end

  def method_added(name)
    # short-circuit, to avoid infinite loop
    return unless @arg_types || @kwarg_types || @ret_type

    # reset hook context, but store current values
    arg_types, kwarg_types, ret_type = @arg_types, @kwarg_types, @ret_type
    @arg_types, @kwarg_types, @ret_type = nil, nil, nil

    # capture the original method
    meth = instance_method(name)

    # wrap the original method with type checks
    define_method(name) do |*args, **kwargs, &block|
      # check positional arguments
      arg_types.each_with_index do |type, idx|
        raise "Invalid type for arg in pos #{idx}; expected: #{type}" \
          unless args[idx].is_a? type
      end

      # check keyword arguments; omitted ones fall back to the method's default
      kwarg_types.each do |key, type|
        next unless kwargs.key?(key)
        raise "Invalid type for kwarg '#{key}'; expected: #{type}" \
          unless kwargs[key].is_a? type
      end

      ret = meth.bind(self).call(*args, **kwargs, &block)

      # check return type
      raise "Invalid return type; expected: #{ret_type}" \
        unless ret.is_a? ret_type

      ret
    end
  end
end
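All of the checks above rest on is_a?, which follows the class hierarchy rather than requiring an exact class match. This short sketch shows why declaring a parameter as Numeric accepts both Integer and Float values, while a String or nil is rejected:

```ruby
# is_a? returns true for the object's class, its ancestors and included modules.
p 3.is_a?(Numeric)    # => true  (Integer inherits from Numeric)
p 3.5.is_a?(Numeric)  # => true  (Float inherits from Numeric)
p '3'.is_a?(Numeric)  # => false
p nil.is_a?(String)   # => false
p 3.is_a?(Comparable) # => true  (modules count too)
```

One consequence is that nil fails every check except NilClass and its ancestors, so a method that may legitimately return nil cannot be described with a single class in this simple scheme.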


## Putting it together

Now we can implement the Repeater class using our home-grown type-checking system:

class Repeater4
  extend Types

  typedef { params(String, Numeric, separator: String).returns(String) }
  def repeat(str, count, separator: '')
    Array.new(count, str).join(separator)
  end
end


If we pass valid arguments, as per previous examples, we should see the expected output:

test, test, test


But if we replace the count argument with a string, we’ll see our type-checking in action:

begin
  puts Repeater4.new.repeat("test", "3", separator: ", ")
rescue StandardError => e
  puts "Error: #{e}"
end


The output should look something like:

Error: Invalid type for arg in pos 1; expected: Numeric

## Closing Thoughts

If you take away anything from this post, hopefully it is that Ruby allows you to build annotation-like constructs, using features that are part of the language.

These annotations can be evaluated when classes are loaded, and Ruby’s rich support for metaprogramming means that it is possible to do much more than simply annotate methods.

As for the type-checker we constructed in this post, it is currently very simplistic, and not particularly useful. However, it lays the groundwork for future posts, in which we’ll develop a more complete type-checker.

### Code

All of the code for this post can be found in my ruby-type-checking repo on GitHub.