Writing a Gem with native extensions
There are many reasons you might want to write a Gem using native extensions. Performance is perhaps the most obvious. CPU-heavy tasks, such as number crunching, can be rewritten in C, and will often run many times faster than the equivalent Ruby code. For the daring among us, native code also opens the door to multiple threads, GPUs, etc.
Another is to re-use existing code. Whether that be legacy code that is critical to your business, or a third-party library that happens to do exactly what you need, a native extension can give you access to that functionality from Ruby.
Examples
Let’s make this more concrete with some examples of well-known gems that rely on native extensions:
- Byebug - a debugger for Ruby, that uses Ruby’s TracePoint API for execution control and the Debug Inspector API for call stack navigation. Written as a C extension for speed.
- nokogiri - an HTML and XML parser. Uses native libraries for speed and to ensure standards compliance.
- RMagick - bindings for the ImageMagick image manipulation library.
- sqlite3 - bindings for the SQLite3 database engine.
My motivation
My motivation for learning about native extensions is to improve the performance, and memory footprint, of a data structure I implemented using Ruby. This data structure is called a partially persistent tree.
A persistent data structure preserves the previous version of itself when modified, and partial persistence implies that only the latest version can be updated (as opposed to full persistence, where updates can be performed on any previous version). So what I’ve implemented is essentially a versioned tree.
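To make the idea of partial persistence a little more tangible, here is a minimal sketch using a stack rather than a tree. The VersionedStack class and its method names are hypothetical, invented for this illustration, and are not taken from my actual implementation. Every push creates a new version, older versions remain readable, and only the latest version can be updated:

```ruby
# A toy partially persistent stack (hypothetical, for illustration only).
# Nodes are never modified after creation, so versions can share structure.
class VersionedStack
  Node = Struct.new(:value, :next_node)

  def initialize
    # @heads[v] is the top node of version v; version 0 is the empty stack.
    @heads = [nil]
  end

  # Number of the most recent version.
  def latest
    @heads.size - 1
  end

  # Push onto the latest version only, and return the new version number.
  def push(value)
    @heads << Node.new(value, @heads.last)
    latest
  end

  # Read any version, old or new, as an array (top of the stack first).
  def to_a(version = latest)
    raise ArgumentError, "unknown version #{version}" unless (0..latest).cover?(version)

    items = []
    node = @heads[version]
    while node
      items << node.value
      node = node.next_node
    end
    items
  end
end

stack = VersionedStack.new
v1 = stack.push(:a)
v2 = stack.push(:b)
p stack.to_a(v1) # version 1 is unaffected by the later push
p stack.to_a(v2)
```

A tree works on the same principle, but has to copy or annotate the nodes along the path to each change, which is where the memory overhead comes from.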
To improve performance, and reduce memory usage, I have been reimplementing the core functionality in C++, exposing it to Ruby via a native extension. This code is currently incomplete, but the challenge has been interesting enough to warrant writing this post, and to do a talk at the Melbourne Ruby meetup.
Clipboard access
Persistent data structures are a vast subject, so to make this all more accessible, the remainder of this post steps through a more contained example. We’ll use a C library called libclipboard to access a user’s clipboard from Ruby code.
For those who want to jump ahead and see all of the code in one place, you can find it on GitHub at:
The libclipboard API
libclipboard is a cross-platform clipboard library, which provides the following functions for interacting with a user’s clipboard:
- clipboard_new - create a context through which to access the user’s clipboard
- clipboard_free - free any memory allocated by clipboard_new
- clipboard_text - read the contents of the clipboard as text, if possible
- clipboard_set_text - replace the contents of the clipboard with new text
Before we can use libclipboard from our own code, we’ll need to install it.
Installing libclipboard
The following instructions assume you are working in a UNIX-based environment, with git, cmake and a compiler toolchain installed. On Mac OS X, this can be achieved by installing Xcode (and command-line tools), Homebrew, and then using Homebrew to install CMake.
With these prerequisites met, you should be able to run these commands to compile and install libclipboard:
git clone https://github.com/jtanx/libclipboard
cd libclipboard
mkdir build
cd build
cmake ..
make -j4
sudo make install
Once that is complete, you can test that it is working:
./bin/clip_sample1 -s hello
./bin/clip_sample1
The second command should print out ‘hello’.
Extending Ruby using C
We’re going to use C to create a module called SimpleClipboard. This module will contain two methods, get_text and set_text. get_text will return the current contents of the clipboard, as text. set_text will replace the contents of the clipboard, but its return value will be the previous contents of the clipboard.
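Before diving into C, it may help to see the intended behaviour written in plain Ruby. The FakeClipboard module below is a hypothetical in-memory stand-in, not the real extension. It never touches the system clipboard, and only mirrors the contract described above:

```ruby
# Hypothetical in-memory stand-in for the SimpleClipboard module.
# It only models the contract; it does not talk to the system clipboard.
module FakeClipboard
  @contents = nil

  # Returns the current contents, or nil if nothing has been set.
  def self.get_text
    @contents
  end

  # Replaces the contents and returns whatever was there before.
  def self.set_text(text)
    raise TypeError, "expected a String" unless text.is_a?(String)

    previous = @contents
    @contents = text.dup
    previous
  end
end

FakeClipboard.set_text("first")
previous = FakeClipboard.set_text("second")
p previous               # the value that was replaced
p FakeClipboard.get_text # the value now held by the stand-in
```

The C version has to do the same two steps in set_text: read the old contents first, then write the new ones.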
Big picture
Here’s the code we’re working towards:
#include <ruby.h>
#include <libclipboard.h>
#include "extconf.h"

static clipboard_c *cb = NULL;

VALUE set_text(VALUE _self, VALUE val) {
    Check_Type(val, T_STRING);
    VALUE result = Qnil;
    char *text = clipboard_text(cb);
    if (NULL != text) {
        result = rb_str_new(text, strlen(text));
        free(text);
    }
    if (false == clipboard_set_text(cb, StringValueCStr(val))) {
        rb_raise(rb_eRuntimeError, "Failed to write to clipboard.");
    }
    return result;
}

VALUE get_text(VALUE _self) {
    VALUE result = Qnil;
    char *text = clipboard_text(cb);
    if (NULL != text) {
        result = rb_str_new(text, strlen(text));
        free(text);
    }
    return result;
}

void Init_simple_clipboard() {
    cb = clipboard_new(NULL);
    if (NULL == cb) {
        rb_raise(rb_eRuntimeError, "Failed to create clipboard context.");
    }
    VALUE mod = rb_define_module("SimpleClipboard");
    rb_define_module_function(mod, "get_text", get_text, 0);
    rb_define_module_function(mod, "set_text", set_text, 1);
}
If you aren’t fluent in C, this will look… complex. To break this down, we’ll start from the bottom.
Initialisation
When Ruby first loads a native extension, it will look for a function called Init_{extname} where {extname} is the name of the extension. This gives the extension an opportunity to define modules, classes, etc. and to do any other initialisation that is required. We will call our extension ‘simple_clipboard’, so this function will be named Init_simple_clipboard.
Here, we define a module called ‘SimpleClipboard’, and store a reference to it in mod. We then define two module methods, get_text and set_text, that take 0 arguments, and 1 argument, respectively:
void Init_simple_clipboard() {
    cb = clipboard_new(NULL);
    if (NULL == cb) {
        rb_raise(rb_eRuntimeError, "Failed to create clipboard context.");
    }
    VALUE mod = rb_define_module("SimpleClipboard");
    rb_define_module_function(mod, "get_text", get_text, 0);
    rb_define_module_function(mod, "set_text", set_text, 1);
}
Note that we also call clipboard_new to set up a context through which to access the clipboard. This is required by the libclipboard library. If this fails, we raise a RuntimeError.
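The Init_{extname} naming rule can be sketched in a few lines of Ruby. The init_function_name helper below is hypothetical and only illustrates the convention. It assumes the name is taken from the last path component of the file being loaded, with the file extension removed:

```ruby
# Hypothetical helper: given the path of a compiled extension, work out
# the name of the init function Ruby would look for when loading it.
def init_function_name(path)
  "Init_#{File.basename(path, '.*')}"
end

puts init_function_name("./simple_clipboard.bundle")
puts init_function_name("simple_clipboard/simple_clipboard.so")
```

Both paths lead to the same function name, which is why the directory an extension lives in does not affect what the init function is called.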
Reading from the clipboard
Moving further up, we have the C implementation of the get_text method:
VALUE get_text(VALUE _self) {
    VALUE result = Qnil;
    char *text = clipboard_text(cb);
    if (NULL != text) {
        result = rb_str_new(text, strlen(text));
        free(text);
    }
    return result;
}
This C function returns a VALUE, which can refer to any Ruby value. We set a default return value of nil. We then call clipboard_text to get the current contents of the clipboard. The clipboard may not contain any text, in which case clipboard_text returns NULL and we return nil. The other tricky thing here is that we can’t just return the string (char *text) returned from libclipboard. We first need to turn it into a VALUE using rb_str_new. rb_str_new takes two arguments - a pointer to a string (an array of char), and the number of bytes to take from that array.
Once we’re done with the string, we free it, and then we can return the VALUE we created using rb_str_new.
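One detail worth keeping in mind is that strlen counts bytes, not characters, and the length passed to rb_str_new is a byte count too. The difference only shows up with non-ASCII text, as this small Ruby sketch illustrates (the sample string is just an example):

```ruby
# "h\u00e9llo" is "héllo"; the é occupies two bytes when encoded as UTF-8.
text = "h\u00e9llo"

puts text.length   # number of characters
puts text.bytesize # number of bytes, which is what strlen counts in C
```

Because get_text passes the result of strlen straight through, the whole string is copied regardless of how many bytes each character needs.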
Note that we use the Ruby convention of prepending an underscore to the name of unused parameters.
Writing to the clipboard
Writing to the clipboard is similar, but the first line of the function is worth highlighting:
Check_Type(val, T_STRING);
val is the value given to the function by the Ruby interpreter. This must be a string, and the Check_Type macro is used to raise a TypeError if that is not the case.
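In Ruby terms, the check behaves roughly like the guard below. The check_string! helper is hypothetical, and the message text is only an approximation of what the interpreter produces. Note that the exception Ruby reports for a failed Check_Type is a TypeError:

```ruby
# Hypothetical Ruby equivalent of Check_Type(val, T_STRING).
# Returns the value unchanged if it is a String, otherwise raises TypeError.
def check_string!(val)
  return val if val.is_a?(String)

  raise TypeError, "wrong argument type #{val.class} (expected String)"
end

check_string!("fine") # passes silently

begin
  check_string!(42)
rescue TypeError => e
  puts e.message
end
```

Doing this check up front means the rest of set_text can safely assume it is working with a string.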
extconf.rb
Finally, some Ruby!
The ‘extconf.rb’ file should contain:
require 'mkmf'
# Leading spaces keep these flags separate from anything already present
$LOCAL_LIBS << ' -lclipboard'
if RUBY_PLATFORM =~ /darwin/
  $LDFLAGS << ' -framework AppKit'
end
create_header
create_makefile 'simple_clipboard'
Once you’ve created this file, you can run it:
ruby extconf.rb
This configures the build parameters needed to compile our native extension, and generates several files:
- extconf.h
- Makefile
- mkmf.log
The one we care about right now is ‘Makefile’. The make command will look for this file in the current directory, and use the definitions within it to compile our native extension:
make
Running make on Mac, you’ll see something like this:
compiling simple_clipboard.c
linking shared-object simple_clipboard.bundle
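The ‘.bundle’ suffix is specific to Mac. If you are unsure what file name to expect on your own platform, Ruby can tell you. This is a small sketch using the standard rbconfig library, with the extension name from this post filled in:

```ruby
require 'rbconfig'

# DLEXT is the file extension Ruby uses for compiled extensions on this
# platform (for example 'bundle' on Mac, or 'so' on Linux).
dlext = RbConfig::CONFIG['DLEXT']
puts "simple_clipboard.#{dlext}"
```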
Now we can use IRB to load the simple_clipboard extension:
require './simple_clipboard'
SimpleClipboard.methods
You should see a method list something like this:
=> [:get_text, :set_text, :<=>, :module_exec, :class_exec, :<=, :>=,
:==, :===, :include?, :included_modules, :ancestors, :name,
:public_instance_methods, :instance_methods, :private_instance_methods,
:protected_instance_methods, :const_get, :constants, :const_defined?,
:const_set, :class_variables, :class_variable_get, :remove_class_variable,
:class_variable_defined?, :class_variable_set, :private_constant, ...
To test out set_text, first copy some text (such as ‘this text’). Now, run the following in IRB:
SimpleClipboard.set_text 'test'
The result should be the previous contents of the clipboard (‘this text’, or whatever you placed on the clipboard). Now call:
SimpleClipboard.get_text
The result should now be ‘test’.
Putting this into a gem
Structure
This is the structure for our simple_clipboard gem:
ext/
  simple_clipboard/
    extconf.rb
    simple_clipboard.c
lib/
  simple_clipboard/
    version.rb
  simple_clipboard.rb
simple_clipboard.gemspec
LICENSE
README.md
The key difference from a regular gem is that we have an ‘ext’ directory, for any native extensions.
Our ‘simple_clipboard.gemspec’ file also looks pretty normal. We only have to add a line to specify extensions, and add our .c source file to the list of files to be bundled in the gem:
require File.expand_path("../lib/simple_clipboard/version", __FILE__)
Gem::Specification.new do |s|
  s.name = 'simple_clipboard'
  s.version = SimpleClipboard::VERSION
  s.date = '2018-07-24'
  s.summary = 'Simple clipboard example gem'
  s.authors = ['Tristan Penman']
  s.email = ['tristan@tristanpenman.com']
  s.licenses = ['MIT']
  s.homepage = 'https://www.github.com/tristanpenman/simple-clipboard'
  s.extensions = ['ext/simple_clipboard/extconf.rb']
  s.files = [
    'ext/simple_clipboard/simple_clipboard.c',
    'lib/simple_clipboard.rb',
    'lib/simple_clipboard/version.rb'
  ]
  s.require_paths = ['lib']
end
lib/simple_clipboard/version.rb:
module SimpleClipboard
  VERSION = '0.0.1'
end
lib/simple_clipboard.rb:
# Ensure that native extension is loaded
require 'simple_clipboard/simple_clipboard'
require 'simple_clipboard/version'
Build
Building the gem is straightforward. Note that the native extension is not actually compiled at this point.
gem build simple_clipboard.gemspec
No surprises in the output:
Successfully built RubyGem
Name: simple_clipboard
Version: 0.0.1
File: simple_clipboard-0.0.1.gem
Install
Installing the gem is when things get more interesting:
gem install simple_clipboard-0.0.1.gem
We can see in the output that this is when the native extension is actually compiled. And this is why installing gems with native extensions is a common pain point for new Ruby users:
Building native extensions. This could take a while...
Successfully installed simple_clipboard-0.0.1
Parsing documentation for simple_clipboard-0.0.1
Done installing documentation for simple_clipboard after 0 seconds
1 gem installed
Once the gem is installed, we can try it out in IRB:
2.3.3 :001 > require 'simple_clipboard'
=> true
2.3.3 :002 > SimpleClipboard::VERSION
=> "0.0.1"
2.3.3 :003 > SimpleClipboard::get_text
=> "gem install simple_clipboard-0.0.1.gem"
2.3.3 :004 > SimpleClipboard::set_text "Hello world"
=> "gem install simple_clipboard-0.0.1.gem"
2.3.3 :004 > SimpleClipboard::get_text
=> "Hello world"
Testing
Testing a native extension is a little trickier than usual. Before running any tests, we need to compile our code and put the bundle somewhere that Ruby can find it (ideally not with our system-level gems).
Here is the approach that I used to get RSpec working (I chose RSpec because the test suite for my partially persistent tree was written using it). We’re going to create a Rakefile, and use the Rake::ExtensionTask class from the ‘rake-compiler’ gem to automatically compile our native extension before running tests.
To do this, we need two new files - a Gemfile and a Rakefile:
Gemfile:
source 'https://rubygems.org'
# Dependencies specified in simple_clipboard.gemspec
gemspec
Rakefile:
require "bundler/gem_tasks"
require "rspec/core/rake_task"
require 'rake/extensiontask'
desc "simple_clipboard test suite"
RSpec::Core::RakeTask.new(:spec) do |t|
  t.pattern = "spec/*_spec.rb"
  t.verbose = true
end
gemspec = Gem::Specification.load('simple_clipboard.gemspec')
Rake::ExtensionTask.new do |ext|
  ext.name = 'simple_clipboard'
  ext.source_pattern = "*.{c,h}"
  ext.ext_dir = 'ext/simple_clipboard'
  ext.lib_dir = 'lib/simple_clipboard'
  ext.gem_spec = gemspec
end
task :default => [:compile, :spec]
We also need to update ‘simple_clipboard.gemspec’ to include some additional development dependencies:
require File.expand_path("../lib/simple_clipboard/version", __FILE__)
Gem::Specification.new do |s|
  s.name = 'simple_clipboard'
  s.version = SimpleClipboard::VERSION
  s.date = '2018-07-24'
  s.summary = 'Simple clipboard example gem'
  s.authors = ['Tristan Penman']
  s.email = 'tristan@tristanpenman.com'
  s.licenses = ['MIT']
  s.homepage = 'https://www.github.com/tristanpenman/simple-clipboard'
  s.extensions = ['ext/simple_clipboard/extconf.rb']
  s.files = [
    'ext/simple_clipboard/simple_clipboard.c',
    'lib/simple_clipboard.rb',
    'lib/simple_clipboard/version.rb'
  ]
  s.require_paths = ['lib']
  # Required to run tests
  s.add_development_dependency "rspec", ">= 2.13.0"
  s.add_development_dependency "rake", ">= 1.9.1"
  s.add_development_dependency "rake-compiler", ">= 0.8.3"
end
Doing this gives us several rake tasks we can use for native extension development:
- The default task (just running rake) will both compile the code, and run tests
- Running rake compile will just compile the code
- And running rake spec will just run the test suite
The reason we don’t simply make compile a dependency of spec is to avoid unnecessary compile steps if we are just writing/iterating on tests.
Resources
I’m going to wrap up this post with a list of resources that I have found helpful while learning about Ruby internals, and doing native extension development.
Ruby Under a Microscope
One of my favourite references has been Pat Shaughnessy’s book Ruby Under a Microscope. While not specifically about native extensions, this book goes into plenty of detail about the inner workings of several Ruby implementations. This is the kind of knowledge that will help guide your intuition about Ruby performance.
Official Documentation
The Ruby and RubyGems documentation is also a good place to start:
- https://ruby-doc.org/core-2.3.3/doc/extension_rdoc.html
- https://guides.rubygems.org/gems-with-extensions/
Blogs and web sites
Aaron Bedra’s Extending Ruby guide:
Chris Lalancette’s in-depth series on writing Ruby extensions in C, which covers the following topics:
- Project Setup
- RDoc
- Extension Initialisation
- Types and Return Values
- Exceptions
- Catch/Throw
- Numbers
- Strings
- Arrays
- Hashes
- Blocks and Callbacks
- Allocating Memory
Ruby Native Extensions in C, starter gem
I found this repo on GitHub useful when figuring out how to get RSpec to work:
Complete example
Finally, you can find the complete code for the ‘simple_clipboard’ gem on GitHub: