September 25, 2008

Metaprogramming in the Ruby C API: Part One: Blocks

This is the first in a series of articles on Metaprogramming in the C API. This series will explain how to implement dynamic method definition, eigenclass and metaclass manipulation, and ultimately DSL construction in pure C. This particular article will discuss Ruby blocks, and how we use them in C.

Blocks

In Ruby, blocks play a central role in Metaprogramming, and this is equally true in the C API. Blocks can be passed to methods both implicitly:

def my_method
    if block_given? then
        yield
    else
        raise ArgumentError, "a block is required"
    end
end

And explicitly:

def my_method(&block)
    if block then
        block.call
    else
        raise ArgumentError, "a block is required"
    end
end

Implicit Blocks

We will examine the implicit case first from the C perspective. Here is the corresponding code in C:

static VALUE
my_method(VALUE self) {
    if(rb_block_given_p())
        rb_yield(Qnil);
    else
        rb_raise(rb_eArgError, "a block is required");

    return Qnil;
}

Comparing the C and the Ruby code, you should be struck by how similar they are: rb_block_given_p() and rb_yield() are the C counterparts to Ruby’s block_given? and yield. The prototypes for the two C functions are:

int rb_block_given_p(void);
VALUE rb_yield(VALUE val);

The return values mirror their Ruby equivalents: rb_block_given_p() returns a C int (not a VALUE) that is nonzero when a block was passed and zero otherwise, so it can be tested directly in an if statement, and rb_yield() returns the value returned by the block. No surprises here.
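As a quick check of the return-value behaviour, here is a minimal Ruby-side sketch. The method name yield_and_report is hypothetical; it relies only on block_given? and yield, which rb_block_given_p() and rb_yield() mirror in C.

```ruby
# Hypothetical helper for illustration: yields once and returns
# whatever the block returned, as rb_yield() does in C.
def yield_and_report
    return :no_block unless block_given?   # cf. rb_block_given_p()
    yield                                  # the method returns the block's value
end

with_block    = yield_and_report { 6 * 7 }   # => 42
without_block = yield_and_report             # => :no_block
```

The value of the block becomes the value of the yield expression, so a C method can likewise return the result of rb_yield() straight to its caller.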

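Now let’s look at another example of implicit block passing: converting an implicit block into a proc. (Calling Proc.new without a block, as below, worked in the Ruby 1.8 of this article’s era; Ruby 3.0 and later raise an ArgumentError instead, so there the explicit &block form is the one to use.)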

def my_method
    if block_given? then
        p = Proc.new
        p.call
    else
        raise ArgumentError, "a block is required"
    end
end

Here is the corresponding C code:


static VALUE
my_method(VALUE self) {
    VALUE p;

    if(rb_block_given_p()) {
        /* braces are required: both statements belong to the if branch */
        p = rb_block_proc();
        rb_funcall(p, rb_intern("call"), 0);
    }
    else
        rb_raise(rb_eArgError, "a block is required");

    return Qnil;
}

Like Proc.new, the C function rb_block_proc() converts the implicitly passed block into a proc. The rb_funcall() function (which you should be familiar with) then executes the proc by invoking its ‘call’ method. The prototype for rb_block_proc() is as follows:

VALUE rb_block_proc(void);

The return value is the newly created proc.
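A minimal Ruby sketch of the same conversion, assuming the explicit &block spelling (the name capture_block is hypothetical). It shows that the captured object is an ordinary Proc that can be stored and called later, which is what rb_block_proc() hands back in C.

```ruby
# Hypothetical helper: turns the caller's block into a Proc object
# and returns it instead of calling it straight away.
def capture_block(&block)
    raise ArgumentError, "a block is required" unless block
    block
end

doubler = capture_block { |x| x * 2 }
doubler.class      # => Proc
doubler.call(21)   # => 42
```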

One more function, which exists only in the C API but is nonetheless very useful, is rb_need_block(). It raises a LocalJumpError exception if no block is present. Use it as follows:

static VALUE
my_method(VALUE self) {
    rb_need_block();

    rb_yield(Qnil);

    return Qnil;
}

Here is its prototype:

void rb_need_block(void);
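The closest Ruby-side analogue is a bare yield with no block, which raises the same exception class. A small sketch (both method names are hypothetical):

```ruby
# Yields unconditionally, so calling it without a block raises
# LocalJumpError, the same exception class rb_need_block() raises.
def needs_block
    yield
end

# Calls needs_block without a block and reports what happened.
def try_without_block
    needs_block
    :no_error
rescue LocalJumpError => e
    e.class
end

try_without_block   # => LocalJumpError
```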

Explicit Blocks

As a refresher, here is the Ruby code for explicit block passing:

def my_method(&block)
    if block then
        block.call
    else
        raise ArgumentError, "a block is required"
    end
end

And here is the corresponding C code:

static VALUE
my_method(int argc, VALUE *argv, VALUE self) {
    VALUE block = Qnil;

    rb_scan_args(argc, argv, "0&", &block);

    if(RTEST(block))
        rb_funcall(block, rb_intern("call"), 0);
    else
        rb_raise(rb_eArgError, "a block is required");

    return Qnil;
}

As the code above shows, C has no native way of declaring a ‘block parameter’, so to achieve the equivalent we must use rb_scan_args() and a variable-length parameter list (which means registering the method with an argc of -1 in rb_define_method()).

The “0&” format string passed to rb_scan_args() indicates we have no (0) ordinary parameters and one block (&) parameter. The &block argument tells rb_scan_args() to save the block in the variable called ‘block’; if no block was given, the variable is set to nil. And, as in previous examples, rb_funcall() invokes the block (now really a proc).
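The behaviour that the “0&” format encodes can be sketched on the Ruby side. The method name explicit_block is hypothetical; the point is that the block parameter is nil when no block is given, and that any ordinary argument is rejected, just as the 0 in the format string demands.

```ruby
# Hypothetical method: no ordinary parameters, one block parameter.
def explicit_block(&block)
    block.nil? ? :no_block : block.call
end

explicit_block               # => :no_block
explicit_block { :called }   # => :called

# An existing proc can be passed in the block slot with &.
prepared = proc { :from_proc }
explicit_block(&prepared)    # => :from_proc

# Passing an ordinary argument raises ArgumentError.
begin
    explicit_block(1)
rescue ArgumentError => e
    arity_error = e.class    # => ArgumentError
end
```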

Caveats

Blocks in C do not behave entirely like their Ruby counterparts. Take the following Ruby code:

def my_method
    yield
    instance_eval "puts 'heya'"
end

and its C “equivalent”:

static VALUE
my_method(VALUE self) {
    VALUE cmd = rb_str_new2("puts 'heya'");

    rb_yield(Qnil);
    rb_obj_instance_eval(1, &cmd, self);

    return Qnil;
}

When we invoke the Ruby version of my_method on receiver ‘obj’ we get:

obj.my_method { puts "hello" }
output:
hello
heya

But for the C version we get:

obj.my_method { puts "hello" }
output:
hello
ArgumentError: wrong number of arguments (1 for 0)
from (irb):3:in `my_method'
from (irb):3

Why the difference in behaviour? The error arises because, in the C API, a block passed to a method is still ‘live’ inside that method and is automatically passed on to other block-taking methods when they are called directly as C functions, as rb_obj_instance_eval() is here.

So, the line that errors in C appears to Ruby as:

instance_eval("puts 'heya'") { puts "hello" }

Which is an error: instance_eval can take either a String or a block, but not both.
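The restriction is easy to confirm from plain Ruby. In this sketch the helper name both_forms_rejected? is hypothetical; each form works on its own, and combining them raises ArgumentError.

```ruby
obj = Object.new

string_result = obj.instance_eval("1 + 1")   # => 2
block_result  = obj.instance_eval { self }   # the receiver itself

# Hypothetical helper: true if passing a String and a block together
# is rejected with ArgumentError.
def both_forms_rejected?(target)
    target.instance_eval("1 + 1") { 2 }
    false
rescue ArgumentError
    true
end

both_forms_rejected?(obj)   # => true
```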

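So how do we get the behaviour we want in C? We call instance_eval through rb_funcall(), which makes a fresh method call that does not inherit the current block: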


static VALUE
my_method(VALUE self) {
    VALUE cmd = rb_str_new2("puts 'heya'");

    rb_yield(Qnil);
    rb_funcall(self, rb_intern("instance_eval"), 1, cmd);

    return Qnil;
}

It is important to understand this difference as it can be the source of many headaches and frustrations.
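For contrast, pure Ruby never forwards a block implicitly, which is why the Ruby version of my_method behaves as expected. A minimal sketch, with hypothetical method names:

```ruby
# Reports whether this particular call received a block.
def inner
    block_given?
end

# The block given to outer is not visible to inner.
def outer
    yield
    inner
end

# Forwarding has to be spelled out with &block.
def outer_forwarding(&block)
    block.call
    inner(&block)
end

outer { :ignored }              # => false
outer_forwarding { :ignored }   # => true
```

In Ruby the forwarding is opt-in; with a directly called C function it happens whether you want it or not, and rb_funcall() is the way to opt out.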

Summary

The intention of this article was to provide some groundwork for the more advanced Metaprogramming articles to come. Although not technically ‘Metaprogramming’, blocks are an integral part of the field, and a thorough understanding of them is necessary before moving on to the more advanced material.

In the next article we’ll learn about dynamic method definitions and singletons.

For full prototypes with explanations of some (but not all) of the functions presented here, check out the Pickaxe.
