In Ruby, everything is open to modification at run time. Classes, modules, and even the behavior of individual objects can all be changed while a program is running. It’s trivial to write code which defines new classes or adds methods to an existing class or object. Virtually nothing is off-limits. This so-called “metaprogramming” is one of Ruby’s most powerful features. It’s also one of its most dangerous.
There are a lot of good uses for metaprogramming. Cleaning up redundant code, generalizing a feature to work with more than one class, and creating domain-specific languages are just a few examples. But there are downsides too. Methods like eval tend to become a crutch since they can be used to solve so many programming problems, yet at the same time they can expose your application to serious security issues. Knowing which kinds of metaprogramming are safe and helpful, and which are problematic, is the responsibility of every Ruby programmer.
Metaprogramming can be a slippery slope. This chapter will help you maintain sure footing.
Love them or hate them, callbacks are a recurring pattern in software design. From user interface event handling to asynchronous APIs, callbacks are practically everywhere. Working with a library or framework that requires callbacks isn’t always the most enjoyable experience. That said, some languages make it more natural to pass functions or anonymous chunks of code around as callbacks. You could even consider a block in Ruby to be a type of callback. In that light callbacks don’t seem so bad after all.
At the risk of outing myself, my favorite text editor uses callbacks as a way to hook into nearly every feature it offers. I love them. I can write a so-called hook function to be notified before or after a file has been opened, saved, or closed. Almost every action performed in my editor triggers an event which runs registered hook functions. I’ve done some pretty weird stuff using hooks, always for the greater good of course.
Ruby also uses this idea of events and hook functions, albeit with a much simpler model. Registering to receive event notifications is as simple as writing a method with the correct name. No doubt you’ve already done something similar by defining methods such as initialize and method_missing. While these methods do seem to be callbacks with respect to object lifetimes and method dispatching, Ruby doesn’t consider them to be hooks. Technically, hooks facilitate metaprogramming at the class and module level and are therefore written as class and module methods (also known as singleton methods). There are about a dozen such methods which you can define, but let’s start with the most frequently used hooks first.
As you know, it’s quite common to mix modules into objects, classes, and other modules using the include and extend methods. Each time you mix in a module Ruby calls the included or extended hook depending on how the module was mixed in. This is basically a notification to the module that it’s being inserted into a class hierarchy somewhere. Both hooks are given a single argument, the receiver of the include or extend method. In other words, the argument is the object doing the including or extending. What can you do in one of these hooks? Let’s explore the extended hook by revisiting the example from Item 21.
Recall that the RaisingHash class used delegation instead of inheritance in order to reuse the functionality of the Hash class. The major difference between Hash and RaisingHash is that the latter raises an exception if a nonexistent key is accessed. The majority of the Hash instance methods were made available as instance methods in RaisingHash using the def_delegators method from the Forwardable module. These delegated methods in RaisingHash simply forward the method call on to the @hash instance variable.
There were a few instance methods which weren’t so simple, however. The freeze, taint, and untaint methods needed to invoke the appropriate method on @hash, followed by a call to super, so the RaisingHash object itself was updated accordingly. The implementations of these methods were almost identical. Let’s fix that by writing a new delegation helper method which calls super after delegating the method to @hash. Now, we could just add this new method to the existing Forwardable module but Item 32 advises against this. Instead, let’s write a new module called SuperForwardable. To make life a bit easier for users of SuperForwardable, we’ll use the extended module hook to make sure that the Forwardable library file is loaded and any class which extends SuperForwardable also extends Forwardable. Consider this:
module SuperForwardable
  # Module hook.
  def self.extended (klass)
    require('forwardable')
    klass.extend(Forwardable)
  end

  # Creates delegator which calls super.
  def def_delegators_with_super (target, *methods)
    methods.each do |method|
      target_method = "#{method}_without_super".to_sym
      def_delegator(target, method, target_method)

      define_method(method) do |*args, &block|
        send(target_method, *args, &block)
        super(*args, &block)
      end
    end
  end
end
Extending a class with the SuperForwardable module triggers the SuperForwardable::extended hook method. This method is given the object doing the extending as its only argument. It’s common practice to name this argument klass or mod depending on how you assume the module is going to be used. The name “klass” is used because “class” is a keyword in Ruby and can’t be used as a variable name. The extended hook in SuperForwardable expects to be given a class which it then extends with the Forwardable module. Therefore, extending a class with SuperForwardable not only brings in all of the methods defined in that module but it also brings in all of the methods defined in the Forwardable module as well. Let’s see how we can use this new module from within the RaisingHash class:
class RaisingHash
  extend(SuperForwardable)
  def_delegators(:@hash, :[], :[]=) # etc.
  def_delegators_with_super(:@hash, :freeze, :taint, :untaint)

  def initialize
    # Create @hash...
  end
end
Okay, let’s take a moment and trace what’s going on in the RaisingHash class definition. The first line in the class uses the extend method with a single argument, the SuperForwardable module. The extend method can actually take any number of modules as arguments. For each module, the extend method adds all of the methods and constants defined in the module to the receiver. For the RaisingHash class this means that the single instance method defined in SuperForwardable becomes a class method in RaisingHash.
After all of the definitions from the module are loaded into the receiver, extend invokes the module’s extended hook. When the hook method is called it is passed the object which invoked extend. In our example SuperForwardable::extended is called with RaisingHash as its argument. This leads to SuperForwardable::extended calling RaisingHash::extend and passing in the Forwardable module. This whole process repeats so that the Forwardable module is loaded into RaisingHash.
With SuperForwardable and Forwardable loaded into RaisingHash we can use def_delegators and def_delegators_with_super to set up the method forwarding to @hash. We’ve seen def_delegators before in Item 21, and def_delegators_with_super does the exact same thing with one exception. After forwarding the original method call to the target object, the generated delegation method also calls super. Using this helper method gives RaisingHash three new instance methods: freeze, taint, and untaint. Each of these calls a corresponding method on the @hash object followed by super.
This trick of using the extended hook in a module to further extend a class is an interesting way to emulate module inheritance. It’s as if the SuperForwardable module inherited from the Forwardable module. A similar sort of thing can be done with the include method and the included hook. Of course, the include method brings in a module’s instance methods and sets them up as instance methods on the receiver. Mixing in a module with the include method triggers the included hook, which is defined just like the extended hook was in SuperForwardable.
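To make that concrete, here’s a minimal sketch of an included hook. The Auditing module and its method names are hypothetical, invented for illustration; a common use of this hook is to add class methods to the including class at the same time the instance methods are mixed in:

```ruby
module Auditing
  # Instance method mixed into any including class.
  def audit_log (message)
    "AUDIT: #{message}"
  end

  # Module hook, triggered by `include`. The argument is the
  # class (or module) doing the including.
  def self.included (klass)
    klass.extend(ClassMethods)
  end

  module ClassMethods
    def auditing_enabled?
      true
    end
  end
end

class Order
  include(Auditing)
end
```

With this in place a single include gives Order both an audit_log instance method and an auditing_enabled? class method.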
The extended and included hooks are unique to modules. There’s a third hook which was introduced in Ruby 2.0, prepended. It’s triggered when you use the prepend method to mix in a module. The prepended hook and the prepend method are discussed further in Item 35.
Almost all of the remaining hooks are available on modules and classes. The one exception is unique because it only works with classes. Each time a class is defined, Ruby triggers the inherited hook on its parent class to notify it about the new subclass. Let’s do something interesting with this hook. Item 21 makes the case that you should never inherit from the core collection classes. Let’s enforce that with some code.
We can use the inherited hook to intercept a class definition and raise an exception, preventing the inheritance. Since we want this logic to apply to all of the core collection classes, it makes sense to write it into a module. We can then use extend to insert the module’s instance methods as class methods on each of the core collection classes. Consider the PreventInheritance module:
module PreventInheritance
  class InheritanceError < StandardError; end

  def inherited (child)
    raise(InheritanceError,
          "#{child} cannot inherit from #{self}")
  end
end
As an instance method in a module, the inherited method does nothing special. But turn it into a class method using extend and it becomes a proper hook method:
irb> Array.extend(PreventInheritance)
irb> class BetterArray < Array; end
PreventInheritance::InheritanceError:
BetterArray cannot inherit from Array
Defining a hook in a module and then mixing it into a class is rather indirect, but useful in this case. If you want to define an inherited hook directly in a class you need to make sure it’s a class method. For example:

class Parent
  def self.inherited (child)
    # ...
  end
end
The Parent::inherited method will be called anytime a class is defined which inherits from the Parent class. It’s worth mentioning that when the inherited hook is called, the child class isn’t fully defined. That is, the body of the child class hasn’t yet executed and therefore hasn’t had a chance to define any methods. This may limit what you can do in an inherited hook, something you’ll want to keep in mind.
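A small, hypothetical sketch makes the timing visible. The hook fires as soon as the subclass is opened, before its body runs, so from inside the hook the subclass reports no instance methods of its own:

```ruby
class Tracked
  def self.inherited (child)
    super
    # The child's body hasn't executed yet, so it has no
    # instance methods of its own at this point.
    $seen_in_hook = child.instance_methods(false)
  end
end

class Gadget < Tracked
  def spin; end
end
```

Inside the hook $seen_in_hook captures an empty list, even though Gadget defines spin immediately afterwards.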
That leaves us with the final six hooks which apply to both modules and classes alike. All of them have to do with methods. The method_added, method_removed, and method_undefined hooks are for instance methods while the singleton_method_added, singleton_method_removed, and singleton_method_undefined hooks are for class and module methods.
Defining these hooks for modules or classes is similar to the previous hooks we’ve seen. All of them should be defined in modules as module methods and in classes as class methods. Here’s an example of a class which monitors instance methods:
class InstanceMethodWatcher
  def self.method_added (m); end
  def self.method_removed (m); end
  def self.method_undefined (m); end

  # Triggers method_added(:hello)
  def hello; end

  # Triggers method_removed(:hello)
  remove_method(:hello)

  # Triggers method_added(:hello), again.
  def hello; end

  # Triggers method_undefined(:hello)
  undef_method(:hello)
end
There are a couple of things to watch out for when using these hooks. The only argument which is given to each of the hooks is a symbol representing the name of the method which was added, removed, or undefined. You’re not given the class on which the method status changed. If a method is added to a subclass, you’ll need to rely on the value of self to know that. Speaking of subclasses, since classes can participate in inheritance, you should probably call super from within these hooks. See Item 29 for more information on hooks and super (including the inherited hook).
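As a sketch (class names are hypothetical), here’s how self identifies where a method was added, and how the hook cooperates with super:

```ruby
class Watched
  def self.method_added (name)
    super
    # `self` is the class that received the method, which may
    # be a subclass rather than Watched itself.
    (@added ||= []) << [self, name]
  end

  def self.added
    @added || []
  end
end

class ChildWatched < Watched
  def hello; end
end
```

Because self is ChildWatched when hello is added, the record lands in ChildWatched’s class instance variable; Watched.added stays empty.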
Hooks relating to singleton methods are very similar to their instance method counterparts, except for one strange side effect. Defining a singleton_method_added hook will trigger itself. That is, defining the hook—which is a singleton method—causes Ruby to trigger the singleton_method_added hook which now exists. You’ll want to watch out for that. Otherwise, just remember that class methods are implemented as singleton methods, which is why the following code uses the “class << self” trick to enter into the singleton class before invoking remove_method or undef_method:
class SingletonMethodWatcher
  def self.singleton_method_added (m); end
  def self.singleton_method_removed (m); end
  def self.singleton_method_undefined (m); end

  # Triggers singleton_method_added(:hello)
  def self.hello; end

  # Triggers singleton_method_removed(:hello)
  class << self; remove_method(:hello); end

  # Triggers singleton_method_added(:hello), again.
  def self.hello; end

  # Triggers singleton_method_undefined(:hello)
  class << self; undef_method(:hello); end
end
And there you have it, all ten hook methods. Let’s wrap up with a few notes about hooks. First, all of the hook methods are automatically marked as private. They’re meant to be called by the Ruby interpreter and not from user space for obvious reasons. Second, there are three methods which are related to the hook methods but are not hooks themselves: extend_object, append_features, and prepend_features.
None of these methods should be overridden; that’s what the hooks are for, after all. As an example, when you use the include method to mix a module into a class, the module’s append_features method is invoked to do the actual work before the included hook is triggered. While you could certainly override the append_features method and call super, the preferred way of intercepting this type of module mixing is by defining the included hook. The same thing goes for extend_object and the extended hook, and prepend_features and the prepended hook. Prefer defining hook methods to overriding these internal methods.
• All of the hook methods should be defined as singleton methods.
• The hooks which are called when a method is added, removed, or undefined only receive the name of the method, not the class where the change occurred. Use the value of self if you need to know this.
• Defining a singleton_method_added hook will trigger itself.
• Don’t override the extend_object, append_features, or prepend_features methods. Use the extended, included, or prepended hooks instead.
Let’s say that after reading Item 28 you became really excited about using hook functions, specifically the inherited class hook. As a matter of fact, it solves a problem that you’ve been having with one of your class hierarchies. Suppose that the base class of the troublesome hierarchy uses the factory method pattern. It’s an abstract class representing an interface for downloading files when given URLs. Each subclass knows how to work with a single protocol such as HTTP or FTP. The application you’re writing only has to pass a URL to the base class, and in response it receives back an instance of the appropriate subclass ready to download the specified file.
The problem you’ve been having has to do with the base class knowing about each of its subclasses. Up to this point you’ve had to manually connect everything together. But now, with the inherited hook, things are much easier. Consider this:

class DownloaderBase
  def self.inherited (subclass)
    handlers << subclass
  end

  def self.handlers
    @handlers ||= []
  end

  private_class_method(:handlers)
end
Thanks to the inherited hook, the subclasses will now automatically register themselves with the base class when they are defined. As recommended in Item 15, the base class uses a class instance variable to track all of its subclasses as opposed to a regular class variable. This keeps it from appearing in the subclasses and avoids any accidental mutation.
With just a few more methods the DownloaderBase class will be able to accept a URL and return the appropriate subclass. But there’s something missing in the inherited hook that’s more important for our discussion here: it needs to invoke super. Technically, the inherited hook works fine just the way it is. DownloaderBase implicitly inherits from Object and so there’s no real need to use super to call an inherited hook higher up in the hierarchy. As we’ve seen before though, inheritance isn’t the only way for a class to be associated with a superclass.
Including or extending a module might insert an inherited hook higher in the hierarchy. Popular frameworks such as Ruby on Rails do this from time to time. Take the ActiveModel::Validations module for example. When you include that module into a class it sets up an inherited hook which copies any attribute validation callbacks into subclasses. If you tried to use the ActiveModel::Validations module with the current implementation of DownloaderBase, this copying wouldn’t happen. Other modules with more important inherited hooks might break entirely. Of course, the solution is simple: make sure you call super:

def self.inherited (subclass)
  super
  handlers << subclass
end
As the title of this item suggests, this advice goes beyond the inherited hook. All of the class hooks should invoke versions of themselves higher up in the hierarchy using super. (For a full list of the class hooks go back and take a look at Item 28.)
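Here’s a hedged sketch of the failure mode (the module and class names are invented for illustration). The Registering module plays the role of something like ActiveModel::Validations: it installs an inherited hook via extend, and that hook only runs when the class’s own hook calls super:

```ruby
module Registering
  def self.included (klass)
    klass.extend(ClassMethods)
  end

  module ClassMethods
    def inherited (subclass)
      super
      # Record every subclass on the class doing the including.
      (@registry ||= []) << subclass
    end

    def registry
      @registry || []
    end
  end
end

class Base
  include(Registering)

  # Without the call to super here, Registering's inherited
  # hook would be silently skipped.
  def self.inherited (subclass)
    super
  end
end

class Derived < Base; end
```

After these definitions, Base.registry contains Derived only because Base’s own hook delegated upward with super.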
It might seem redundant to use super from hooks in classes like DownloaderBase, and perhaps it is. Keep in mind that since modules can insert class hooks, it’s not always obvious when the hook you’re writing might override another one higher up in the inheritance hierarchy. Using super is a good way to future-proof your code, but ultimately, you’ll have to use your best judgment.
• Invoke super from within class hooks.
When newcomers to Ruby discover method_missing, it’s as if they’ve just found a multipurpose tool which is begging to be used. It calls to them while in the shower and professes itself to be the perfect solution to a difficult problem from the previous day. One way or another, method_missing is going to end up in their code, even if they have to use a crowbar to get it to fit. What is it about method_missing that makes it so attractive?
Clearly, method_missing is one of the most powerful tools in the Ruby toolbox. Unfortunately, it has a lot of dubious uses. Want an object to respond to any possible message? No problem. Ever needed a Hash to act more like an OpenStruct? Piece of cake. Do you like the idea of method names automatically being turned into SQL? Take a look at Rails 2. It only goes downhill from there.
You can do all these things with method_missing because it’s a catchall, your last-ditch effort to respond to a message when a matching method can’t be found. But it comes with a cost. We’ve already seen in Item 7 how defining method_missing can lead to confusing error messages when using super. Then you have introspection methods like respond_to? which won’t agree with reality. There’s also a small performance difference when you use method_missing due to the extra traversal of the inheritance hierarchy, but it’s fairly negligible so we won’t consider it further.
The good news is that there’s nearly always a way to implement the same features without resorting to method_missing. In order to demonstrate this I’ll tackle two of the most common uses of method_missing and show how define_method can be used without incurring the drawbacks listed above. Let’s start with the biggest use of method_missing, proxies.
We’ve already seen how to use the Forwardable module to delegate methods to an instance variable. Back in Item 21 we looked at the RaisingHash class which forwarded many of its instance methods without ever exposing the internal hash kept in @hash. This is a great use of the Forwardable module and the way I recommend you implement delegators or proxies. Unfortunately, the Forwardable module isn’t well known and method_missing seems to be the next best thing. Consider the HashProxy class:

class HashProxy
  def initialize
    @hash = {}
  end

  private

  def method_missing (name, *args, &block)
    if @hash.respond_to?(name)
      @hash.send(name, *args, &block)
    else
      super
    end
  end
end
This very simple class uses method_missing to forward all undefined methods to its @hash instance variable. That is, as long as the hash object responds to the current message. If it doesn’t, method_missing calls super so that the version of method_missing in the BasicObject class can raise a NoMethodError exception. My biggest complaint about this class is that while it’s pretending to be a Hash, it doesn’t do a very good job. Consider this:
irb> h = HashProxy.new
irb> h.respond_to?(:size)
---> false
irb> h.size
---> 0
irb> h.public_methods(false)
---> []
Ruby programmers espouse that duck typing is the correct way to work with dynamic types. The type of an object isn’t what’s important, it’s the interface that you should be concerned with. But using method_missing this way exposes no interface at all. Using respond_to? and other introspective methods to confirm that an object supports the needed interface isn’t possible. Even if you prefer run-time NoMethodError exceptions to using respond_to?, there’s no good reason why delegation can’t be implemented properly. (One way to fix respond_to? is with respond_to_missing?. We’ll look at why it’s not a great solution in just a bit.)
If you’ve decided that the Forwardable module won’t work for you, the next best thing is to use define_method. That’s essentially what the Forwardable module is doing behind the scenes anyway. Given a method name and a block, define_method will create an instance method whose body and arguments are specified by the block. It’s a private class method so you can’t call it with a receiver. That’s usually not a problem though. Consider this implementation of the HashProxy class:
class HashProxy
  Hash.public_instance_methods(false).each do |name|
    define_method(name) do |*args, &block|
      @hash.send(name, *args, &block)
    end
  end

  def initialize
    @hash = {}
  end
end
This version uses a little metaprogramming to iterate over the public instance methods from the Hash class. For each of them an instance method is created in HashProxy using define_method. The generated methods simply forward their messages on to the @hash object. The effect is the same as the version which used method_missing, but this implementation is more explicit and correctly exposes a Hash-like interface. See for yourself:
irb> h = HashProxy.new
irb> h.respond_to?(:size)
---> true
irb> h.public_methods(false).sort.take(5)
---> [:==, :[], :[]=, :assoc, :clear]
Ah, that looks much better. I would argue that using define_method in this case isn’t any more complicated than using method_missing. In fact, I’d say that it’s quite a bit clearer and you don’t have to guess at what’s going on. It’s too easy for method_missing to become a black hole of confusion. Hopefully you can now see that it’s also easy to replace with define_method. Let’s solidify this by looking at a more complicated example.
The next major thing that Ruby programmers use method_missing for is to implement the decorator pattern. This pattern is very similar to the delegation pattern we just explored, but with a twist. Classes implementing the decorator pattern wrap an arbitrary object and extend its capabilities in some way. In the HashProxy class we knew ahead of time that we’d be forwarding messages to a hash object. The decorator class, on the other hand, accepts objects of any class and needs to delegate to them appropriately. Let’s write a decorator class which records log entries before delegating to the target object. It’s pretty easy to implement using method_missing:
require('logger')

class AuditDecorator
  def initialize (object)
    @object = object
    @logger = Logger.new($stdout)
  end

  private

  def method_missing (name, *args, &block)
    @logger.info("calling `#{name}' on #{@object.inspect}")
    @object.send(name, *args, &block)
  end
end
The AuditDecorator class can add a method logging feature to any object. Calling a method on an instance of AuditDecorator will log the message and then forward it to the wrapped object, with an exception of course. Methods already defined in AuditDecorator (or its superclasses) won’t trigger method_missing. Therefore, implementing the decorator pattern this way isn’t as transparent as we’d like. Consider this:
irb> fake = AuditDecorator.new("Am I a String?")
irb> fake.downcase
INFO: calling `downcase' on "Am I a String?"
---> "am i a string?"
irb> fake.class
---> AuditDecorator
As before, using method_missing means that AuditDecorator instances don’t respond correctly to introspective methods such as respond_to? and public_methods. But now we have an additional problem. The AuditDecorator instance methods get in the way and keep us from logging and forwarding methods like class. Ideally, the decorator class would be completely transparent and would forward all methods. That’s where define_method comes in. However, since the AuditDecorator class can wrap any object we’re going to have to rely on a little more metaprogramming to make this work. The initialize method needs to inspect the object which it’s wrapping and then create the appropriate forwarding methods. But those methods can’t be instance methods for the AuditDecorator class since each instance should be able to wrap different objects with different classes. Therefore, the generated methods will have to exist in a single instance of AuditDecorator and not for all AuditDecorator instances. Thankfully, we have anonymous modules to work with:
class AuditDecorator
  def initialize (object)
    @object = object
    @logger = Logger.new($stdout)

    mod = Module.new do
      object.public_methods.each do |name|
        define_method(name) do |*args, &block|
          @logger.info("calling `#{name}' on #{@object.inspect}")
          @object.send(name, *args, &block)
        end
      end
    end

    extend(mod)
  end
end
There’s a bit more going on in this version compared to its predecessor. In order to generate methods on the current AuditDecorator instance (as opposed to all AuditDecorator instances) we need to create an anonymous module and define the methods we want inside the module. Then, all we have to do is extend the AuditDecorator instance with the anonymous module, thus copying all of those generated methods into the instance. Pretty slick, huh? In exchange for a few extra lines of code we now have full transparency:
irb> fake = AuditDecorator.new("I'm a String!")
irb> fake.downcase
INFO: calling `downcase' on "I'm a String!"
---> "i'm a string!"
irb> fake.class
INFO: calling `class' on "I'm a String!"
---> String
There’s one very subtle but important thing I need to point out. Did you notice that inside the block given to Module::new the public_methods message was sent to the object variable and not @object? That’s because inside the module—but outside a method definition—@object refers to a module variable and not the instance variable defined in AuditDecorator#initialize. The block does, however, form a closure which allows us to access the local variables defined in the initialize method. That’s why using the object variable works from inside the module. Knowing about these scoping rules will keep you from pulling your hair out while you’re exercising Ruby’s metaprogramming features.
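A stripped-down sketch of the two scopes (the Wrapper class is hypothetical). Directly inside the block given to Module::new only the closed-over local is safe to use, while inside the generated method bodies self is the wrapped-up instance, so its instance variables are available again:

```ruby
class Wrapper
  def initialize (object)
    @object = object

    mod = Module.new do
      # Here we can read the closed-over local `object`, but
      # `@object` at this point would belong to the module itself.
      define_method(:description) { "wrapping #{object.inspect}" }

      # Inside the method body `self` is the Wrapper instance,
      # so `@object` refers to the instance variable set above.
      define_method(:wrapped_class) { @object.class }
    end

    extend(mod)
  end
end
```

Both generated methods work, but for different reasons: one through the closure, the other through the instance variable.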
I would be remiss if I didn’t mention a method related to define_method. While only modules and classes respond to define_method, objects have their own version called define_singleton_method. Thanks to this metaprogramming gem we can remove the need for Module::new in the previous example. Using define_singleton_method has the same effect as defining a method in an anonymous module which is then used to extend the object:
class AuditDecorator
  def initialize (object)
    @object = object
    @logger = Logger.new($stdout)

    @object.public_methods.each do |name|
      define_singleton_method(name) do |*args, &block|
        @logger.info("calling `#{name}' on #{@object.inspect}")
        @object.send(name, *args, &block)
      end
    end
  end
end
Replacing method_missing with define_method doesn’t just make your code more explicit, it also restores proper introspection capabilities, and in the case of the decorator pattern, allows for complete transparency. Before reaching for method_missing you should consider whether using define_method is possible. I’ve yet to find a situation where method_missing was the only possible solution. If you do find yourself unable to use define_method (or define_singleton_method) then there’s one last trick you should know about.
Just as method_missing is a catchall for method dispatch, introspection in Ruby uses the respond_to_missing? method as a half-baked callback. If you define this method it will be called for two different reasons. First, if you use respond_to? to see if an object responds to a specific message and a matching method isn’t defined, respond_to? will invoke respond_to_missing?, giving you the opportunity to make respond_to? return true. For the HashProxy above you’d want to implement the respond_to_missing? method like this:

def respond_to_missing? (name, include_private)
  @hash.respond_to?(name, include_private) || super
end
Adding this code to the HashProxy class would allow the respond_to? method to return true for all of the Hash instance methods. The reason I said that it’s half-baked is because it doesn’t have any effect on methods like public_methods. This leads to some of the introspective methods reporting that a method exists while others say it doesn’t. That’s even more confusing in my opinion. If you’re using method_missing it’s the best you can do.
The other reason respond_to_missing? is used has to do with an interesting method which goes by the name “method”. It takes the name of a method and returns an object which can be used to invoke the method at a later time. If you call method for a method which isn’t defined but for which respond_to_missing? returns true, Ruby will return a Method object that encapsulates a call to method_missing. Again, this is another example where some methods report one interface and other methods report a different interface. Something that isn’t a problem with define_method.
• Prefer define_method to method_missing.
• If you absolutely must use method_missing, consider defining respond_to_missing?.
We’ve seen that Ruby has a rich set of features for run-time metaprogramming, and without a doubt the most powerful of these are the family of eval methods. The vanilla eval method is similar to those found in other interpreted languages: you build up a string of valid Ruby code and have it evaluated at run time. This, of course, can be very dangerous, especially in an application which is processing untrustworthy data. Fortunately, there’s rarely ever a valid reason to evaluate a string these days. That’s because the majority of the evaluation methods in Ruby all accept blocks of code. Combined with metaprogramming tricks like define_method, you can replace nearly every use of string evaluation with block evaluation.
Knowing which of the evaluation methods to use can be the confusing part. Most of them have names which hint at how they work. Or more accurately, the context in which they evaluate their input. That’s the major differentiating feature between them, that and what they’re willing to evaluate: strings, blocks, or both. Most of the evaluation methods use their receiver as the evaluation context. A notable exception is the eval method from the Kernel module.
While eval only accepts strings as input, you can have that string evaluated in any context you want. If you don’t specify a context then the string is evaluated as if it were written into the code at the point where eval is used. That’s not always desirable. Perhaps you don’t want to expose the variables which are currently in scope. In this case you can explicitly provide a Binding object which represents the context in which the string should be evaluated. The Kernel module defines a private method called binding which captures the local scope and returns it inside a Binding object. This context can then be given to eval as its second argument:
irb> def glass_case_of_emotion
x = "I'm in a " + __method__.to_s.tr('_', ' ')
binding
end
irb> x = "I'm in scope"
irb> eval("x")
---> "I'm in scope"
irb> eval("x", glass_case_of_emotion)
---> "I'm in a glass case of emotion"
Being able to specify the exact context to use for evaluation is pretty neat. But eval
only accepts strings as input, so you need to be very careful with what goes into them. Allowing any untrustworthy data (such as user input) to make its way into eval
exposes your application to code injection attacks. That’s why we’ll turn our attention away from strings and towards blocks. All of the remaining evaluation methods support blocks as input. The only difference between them is the context they use when evaluating those blocks.
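To see what’s at stake, here’s a small sketch of an injection through eval. The fetch_ivar helper is hypothetical, invented just for this illustration:

```ruby
# Hypothetical helper: reads an instance variable named by the
# caller. The name is interpolated straight into the evaluated
# string -- dangerous if it ever comes from user input.
def fetch_ivar (name)
  eval("@#{name}")
end

@account = "12345"
fetch_ivar("account")              # works as intended => "12345"
# A malicious "name" smuggles extra code into the string, and
# eval happily runs it:
fetch_ivar("account; :injected")   # => :injected
```

Anything after the semicolon is executed with full privileges, which is exactly why block-based evaluation is preferable.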
Thanks to the BasicObject
class, every object in Ruby responds to the instance_eval
method. Its name provides a clue about the context it uses when evaluating its input. Unlike with eval
, you can’t provide a Binding
object directly to instance_eval
. Instead, the object you invoke instance_eval
on becomes the context for the evaluation. This allows you to reach into an object and access its private methods and instance variables. Things get a little confusing when you start defining methods with instance_eval
. In order to play around with the evaluation methods let’s look at a simple Widget
class:
class Widget
def initialize (name)
@name = name
end
end
Now we can see how instance_eval
can be used to access instance variables and define methods:
irb> w = Widget.new("Muffler Bearing")
irb> w.instance_eval {@name}
---> "Muffler Bearing"
irb> w.instance_eval do
def in_stock?; false; end
end
irb> w.singleton_methods(false)
---> [:in_stock?]
If you use instance_eval
to define a method then that method will only exist for a single object. In other words, instance_eval
creates singleton methods. What happens if you use instance_eval
with a class object? What are singleton methods in the context of a class? Yep, class methods. Observe:
irb> Widget.instance_eval do
def table_name; "widgets"; end
end
irb> Widget.table_name
---> "widgets"
irb> Widget.singleton_methods(false)
---> [:table_name]
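If it helps, the same effect can be had without instance_eval by using define_singleton_method, which makes the “class methods are singleton methods” relationship explicit. A small sketch, using a hypothetical Gadget class:

```ruby
# Gadget is a hypothetical class for illustration.
class Gadget; end

# Defining a singleton method on a class object produces a
# class method, just like instance_eval does.
Gadget.define_singleton_method(:table_name) { "gadgets" }

Gadget.table_name                 # => "gadgets"
Gadget.singleton_methods(false)   # => [:table_name]
```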
It might be a bit confusing at first but if you remember that methods defined using instance_eval
are singleton methods, you’ll be in good shape. What if, on the other hand, we wanted to define an instance method in the Widget
class so that it’s available to all instances? That’s where our next evaluation method comes in: class_eval
. Just as its name suggests, class_eval
evaluates a string or a block in the context of a class. It’s exactly like opening the class back up and inserting new code. Anything you can do between the class
and end
keywords in a normal class definition can be done using class_eval
. For example:
irb> Widget.class_eval do
attr_accessor(:name)
def sold?; false; end
end
irb> w = Widget.new("Blinker Fluid")
irb> w.public_methods(false)
---> [:name, :name=, :sold?]
You can’t use class_eval
on just any object though. As a matter of fact, it’s defined in the Module
module as a singleton method which means it can only be used on modules and classes. There’s even an alias for it so you can make your code look better when you’re manipulating modules: module_eval
. It’s purely aesthetic, though; there’s no difference between class_eval
and module_eval
.
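A quick sketch showing module_eval doing the same job on a module (Helpers and Speaker are hypothetical names):

```ruby
# Helpers is a hypothetical module we want to patch at run time.
module Helpers; end

# module_eval behaves exactly like class_eval; it simply reads
# more naturally when the receiver is a module.
Helpers.module_eval do
  def shout (str)
    str.upcase
  end
end

class Speaker
  include Helpers
end

Speaker.new.shout("hi")   # => "HI"
```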
An easy way to remember the context for these evaluation methods is to think about the receiver. While evaluating their input the instance_eval
and class_eval
methods set the self
variable to their receiver. That’s why you can access instance variables with instance_eval
and define instance methods with class_eval
. They also yield their receiver to the input block. This can sometimes be useful if there’s some indirection between the receiver and the block. For example, consider this variant of the Widget
class:
class Widget
attr_accessor(:name, :quantity)
def initialize (&block)
instance_eval(&block) if block
end
end
irb> w = Widget.new do |widget|
widget.name = "Elbow Grease"
@quantity = 0
end
irb> [w.name, w.quantity]
---> ["Elbow Grease", 0]
Because the block given to initialize
is passed to instance_eval
, it is evaluated in the context of the new Widget
object. When instance_eval
invokes the block it sets self
to its receiver (the Widget
object) and yields the same object to the block. Since the self
variable is set to the Widget
object the block can manipulate internal instance variables directly as if it were an instance method. This might sometimes be useful, but it does break encapsulation.
So far we’ve focused on some simple uses of run time evaluation. It might seem that evaluating a block isn’t as flexible as evaluating a string since you’re stuck with static code. It’s actually quite common to see instance_eval
or class_eval
used in Ruby libraries out in the wild with strings instead of blocks. Let’s put this myth to bed using the final set of evaluation methods: instance_exec
, class_exec
, and module_exec
.
These methods are very similar to their eval
counterparts. Where the eval
versions accept strings or blocks the exec
variants only accept blocks. They also differ in what they yield to their blocks. The exec
methods don’t yield anything to their blocks by default. Instead, any arguments given to them are passed on to the block. This gives us enough power to do just about everything we can do when evaluating strings.
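Here’s a minimal sketch of that argument-forwarding behavior:

```ruby
w = Object.new

# instance_exec sets self to its receiver (like instance_eval)
# but forwards its own arguments into the block instead of
# yielding the receiver.
sum = w.instance_exec(1, 2) do |a, b|
  @total = a + b   # sets an instance variable on w
  @total
end

sum                                # => 3
w.instance_variable_get(:@total)   # => 3
```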
Suppose you have a class that represents a counter which can be used to increment an instance variable, but not reset it back to its starting value. Why can’t you reset it? Don’t ask me, this is supposed to be your code, not mine. Anyways, also suppose that you don’t want to add a reset feature directly to the class but instead you want to reset counter objects externally. Consider this:
class Counter
DEFAULT = 0
attr_reader(:counter)
def initialize (start=DEFAULT)
@counter = start
end
def inc
@counter += 1
end
end
You’d like to reset the @counter
instance variable back to the value set in the DEFAULT
constant. You also want to make this code generic so you can use it with other classes in the future. As a first stab in the dark you resort to evaluating a string:
module Reset
def self.reset_var (object, name)
object.instance_eval("@#{name} = DEFAULT")
end
end
Using this helper method is pretty straightforward. If you give the reset_var
module function an object and a variable name it will set an instance variable with that name to the value in DEFAULT
. But notice what happens if you give it an invalid variable name:
irb> c = Counter.new(10)
---> #<Counter @counter=10>
irb> Reset.reset_var(c, "counter")
---> 0
irb> Reset.reset_var(c, "x;")
SyntaxError: (eval):1:
syntax error, unexpected '=', expecting end-of-input
This example might be slightly contrived but it does demonstrate how to inject code into an evaluation method. Let’s look at how we can use instance_exec
to write reset_var
without having to evaluate a string. Since instance_exec
will pass its arguments on to the block which it evaluates, we can use that as a way to pass constructed names into the block for use in methods like instance_variable_set
:
module Reset
def self.reset_var (object, name)
object.instance_exec("@#{name}".to_sym) do |var|
const = self.class.const_get(:DEFAULT)
instance_variable_set(var, const)
end
end
end
Ruby’s metaprogramming API is rich enough that we rarely need to even use the evaluation methods, especially those which evaluate strings. Methods like define_method
and instance_variable_set
are also much easier to read than a mess of strings with interpolated variables. Back to our use of instance_exec
, look what happens now if you give reset_var
a bad variable name:
irb> Reset.reset_var(c, "x;")
NameError: `@x;' is not allowed as an instance variable name
This time the code raises a NameError
unlike the previous version which raised a SyntaxError
. The difference of course is that the string which was passed to reset_var
wasn’t evaluated as Ruby code. Instead, it was only used to look up an instance variable in an object’s variable table. But before that even happened it was validated to ensure that it could be used as a valid variable name, which failed and raised an exception. This is another important difference between the variants of eval
, one that you’ll want to keep in mind.
• Methods defined using instance_eval
or instance_exec
are singleton methods.
• The class_eval
, module_eval
, class_exec
, and module_exec
methods can only be used with classes and modules. Methods defined with one of these become instance methods.
Unless you’ve been living on an island where this book happened to wash up you’ve no doubt heard of Ruby on Rails. It made a pretty big splash in the web application development community and helped put Ruby in the spotlight. But it hasn’t been all rainbows and roses. Rails includes a library called Active Support which modifies nearly every Ruby core class, something referred to as “monkey patching”.
While it’s not a new concept, Active Support’s heavy use of monkey patching kicked up quite a bit of dust in the Ruby community. The lines were drawn: you were either for or against monkey patching. Okay, maybe it wasn’t so dramatic, but there are definitely some outspoken Ruby experts who strongly advise against modifying the core classes. So, what’s the harm; what’s so bad about monkey patching?
As you know, classes, objects, and modules are always open in Ruby. You can modify them at any point while a program is running. Nothing is really off limits, maybe you’ll get a run time warning, maybe not. Have you ever wished that the String
class had a to_french
method? No problem, open the class or use something like class_exec
to add it. But what if one of the libraries you’re using in a project also adds a String#to_french
method? This is a pretty big problem and is often referred to as patch collision. Ruby will definitely give you a warning if you redefine an existing method, which is why Item 5 urges you to pay attention to run time warnings. But as we’ll see shortly, there are ways to run into patch collision without triggering a warning.
Now, if both of the colliding methods do the exact same thing, maybe you won’t care so much. If they happen to do completely different things but collide because they have the same name, well, you’ll probably care a lot more. Actually, both situations seem pretty serious to me. I want to know for sure which implementation of a method I’m using and that it’s been tested appropriately. Imagine tearing your hair out because your code looks totally fine, except thanks to monkey patching, it’s not your code which is actually running and breaking things. Trust me, this has happened to me more than once. Clearly monkey patching can be dangerous. That’s why we’re going to explore alternatives, ways to do some of the same things you can do with monkey patching but with different trade-offs.
There are a handful of Ruby Gems (including Active Support) which monkey patch the String
class to add a method which tests if a string is empty or only includes space characters. It’s a surprisingly useful feature which has somehow avoided getting included into the official String
class. Let’s write our own version of this method and experiment with a few ways to use it without resorting to altering the String
class. One of the safest ways to write this method is to make it a module function. Consider this:
module OnlySpace
ONLY_SPACE_UNICODE_RE = %r/\A[[:space:]]*\z/
def self.only_space? (str)
if str.ascii_only?
!str.bytes.any? {|b| b != 32 && !b.between?(9, 13)}
else
ONLY_SPACE_UNICODE_RE === str
end
end
end
The only_space?
method is callable directly through the OnlySpace
module. The biggest downsides to this technique are rather obvious. It’s not very object-oriented and it’s a bit verbose. Using the module function is simple, but just doesn’t feel right:
irb> OnlySpace.only_space?("
")
---> true
One way to improve upon this is to define an instance method version of only_space?
in the OnlySpace
module. You can then extend individual string objects as necessary.
module OnlySpace
def only_space?
# Forward to module function.
OnlySpace.only_space?(self)
end
end
irb> str = "Yo Ho!"
irb> str.extend(OnlySpace)
irb> str.only_space?
---> false
While this restores some object-oriented flavoring, it’s still a bit long-winded. The upside is that we’ve managed to avoid monkey patching the String
class. Strings which haven’t been extended by our module won’t be affected. On the other hand, this technique introduces inconsistency because some string objects will respond to only_space?
while others won’t. Extending individual objects with a module tends to work best when very little of your code needs to use the methods defined in that module. As you use those methods more and more, you’ll probably want to consider another alternative.
Even though extending individual string objects with the OnlySpace
module doesn’t alter the String
class in any way, it is still a form of monkey patching, albeit on a smaller scale and a bit more controlled. To completely avoid monkey patching altogether, let’s turn to our next technique, creating a new String
class.
Back in Item 21 we looked at changing the behavior of the Hash
class by writing a new class, RaisingHash
. We avoided inheritance in order to maintain total control over which methods were exposed by the RaisingHash
class. Instead, RaisingHash
stored a hash inside an instance variable. It then used method delegation to forward methods to the hash using the Forwardable
module. We can use this technique to create a new string class:
require('forwardable')
class StringExtra
extend(Forwardable)
def_delegators(:@string,
*String.public_instance_methods(false))
def initialize (str="")
@string = str
end
def only_space?
...
end
end
The StringExtra
class works just like the core String
class thanks to the Forwardable
module and the def_delegators
method. Unlike with the RaisingHash
class, I haven’t overridden any methods which might return a new String
object instead of a StringExtra
object. I also haven’t implemented some important methods like freeze
and taint
. If you define a delegating class like StringExtra
, make sure you go back to Item 21 and add in these missing features.
Using StringExtra::new
to wrap an existing string object can be easier to stomach than using the extend
method. Both techniques suffer from the fact that you need to take an extra step to get an object which responds to the only_space?
message. The String
class has a monopoly on automatic string creation from syntax literals. There’s just no getting away from needing to take this extra step if you want to avoid monkey patching. Then again, if you tuck the call to StringExtra::new
away in your initialize
method, it’s not such a big deal. And it’s a lot less painful than having to debug the mess caused by adding methods to an existing class.
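To make the trade-off concrete, here’s a runnable sketch of the delegating class, with a simplified body standing in for the elided only_space? implementation:

```ruby
require('forwardable')

class StringExtra
  extend(Forwardable)

  # Forward all of String's own public instance methods to the
  # wrapped string object.
  def_delegators(:@string,
                 *String.public_instance_methods(false))

  def initialize (str="")
    @string = str
  end

  # Simplified stand-in for the elided implementation.
  def only_space?
    /\A[[:space:]]*\z/ === @string
  end
end

str = StringExtra.new("Muffler Bearing")
str.upcase        # delegated to the wrapped String
str.only_space?   # => false
StringExtra.new(" \t\n").only_space?   # => true
```

The wrapped object answers all the usual String messages through delegation, while only_space? lives on StringExtra itself.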
Sometimes though, even after considering the alternatives, you really do want to modify one of the core classes. If you’re using at least Ruby 2.0, there’s a feature specifically designed to rein in monkey patching, refinements. You can think of refinements as being somewhat similar to the StringExtra
class, except that Ruby will automatically wrap and unwrap the string which we want to add extra features to. There are two parts to refinements, modifying a class in some way (usually by adding instance methods) and activating those changes for a limited scope.
Refinements are a very interesting way to deal with monkey patching, but they do come with some limitations. The biggest limitation may be that refinements are relatively new to Ruby and they’re still in flux. As I mentioned earlier, they were introduced in Ruby 2.0 as an experimental feature. Defining and activating refinements will produce run time warnings reminding you that these features are subject to change. Starting in Ruby 2.1 refinements are no longer an experimental feature and won’t produce any warnings. But the feature still isn’t considered stable and the next version of Ruby is free to change refinements as necessary.
Another limitation is that you can only refine classes. Attempting to refine anything else—like a module—will raise a TypeError
exception. This probably isn’t too restrictive, just something to keep in mind.
Defining a refinement is done inside a module using the refine
method. You pass in the class you plan to modify as the argument to refine
and do any necessary patching inside a block. Take a look at a refinement which adds the only_space?
method to String
:
module OnlySpace
refine(String) do
def only_space?
...
end
end
end
Using the refine
method to define a refinement isn’t enough to add the only_space?
method to String
, it’s just the first step. The next thing you need to do is to activate the refinement with the using
method. This is where things get a little tricky. Ruby 2.0 only allows you to activate a refinement at the top-level of a file, outside of any module or class definition. After activating the refinement it will be available from that point until the end of the file. Ruby 2.1 is more flexible: you can activate a refinement at the top-level of a file, inside a module, or inside a class. Consider this:
class Person
using(OnlySpace)
def initialize (name)
@name = name
end
def valid?
!@name.only_space?
end
def display (io=$stdout)
io.puts(@name)
end
end
The using
method expects a single argument, a module which contains refinements. The refinements in the module are activated, but only for the current lexical scope. This is an important feature and the reason why refinements are safer than monkey patching. Instead of patching a class and making those changes globally visible, refinements automatically deactivate outside of the lexical scope in which they were activated. Clearly, the only_space?
method is available on strings inside the Person
class. But what about the display
method and the string it passes to puts
? Here’s the cool part. Once control leaves display
and enters puts
, the refinements defined in OnlySpace
are deactivated. The puts
method can’t call only_space?
on the string, that method is no longer available.
It makes sense that puts
can’t use the refinements activated in the Person
class. But you may be wondering why I made it a point to say “lexical scoping” over and over again in the previous paragraph. Obviously it’s important. The lexical scoping rules are stricter than you might first think. For example, if you defined a Customer
class which inherits from Person
, you would not be able to use only_space?
from within Customer
just because its parent class can. Refinements aren’t scoped that way. In this example, if you invoked the valid?
method on a Customer
object it would indeed work correctly since that method is defined in Person
. But any methods defined directly in the Customer
class cannot call only_space?
without the refinement being activated in Customer
first. (For a refresher on the lexical scoping rules take a look at Item 11.)
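The scoping rule can be sketched with a self-contained example. Shouting, Person, and Customer here are hypothetical stand-ins for the classes discussed above:

```ruby
module Shouting
  refine(String) do
    def shout
      upcase + "!"
    end
  end
end

class Person
  using(Shouting)
  def greet
    "hello".shout   # refinement is active in this lexical scope
  end
end

class Customer < Person
  # Shouting is NOT active here; only the methods defined in
  # Person carry the activation with them.
  def greet_directly
    "hello".shout   # raises NoMethodError
  rescue NoMethodError
    "hello"
  end
end

Person.new.greet              # => "HELLO!"
Customer.new.greet            # inherited from Person => "HELLO!"
Customer.new.greet_directly   # => "hello"
```

The inherited greet still works on a Customer because refinements are resolved where a method is defined, not where it is called.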
Just like anything in software development, you should use the simplest technique which will get the job done. If you can get away with something like the StringExtra
class, prefer that over refinements. If you can’t resist the temptation to monkey patch one of the core classes, then at least protect those around you with one of these techniques.
• While refinements might not be experimental anymore, they’re still subject to change as the feature matures.
• A refinement must be activated in each lexical scope in which you want to use it.
Back when I programmed on Motorola 68k processors running Macintosh System 7, I would use a pretty neat hack to replace parts of the operating system with my own code. System 7 had a dispatch table in RAM where it would look up the RAM or ROM locations of system code. I remember working on an application that needed to monitor key press events, even when it wasn’t the active application. You couldn’t do this directly so a really common workaround back then was to patch the system dispatch table.
The technique was simple enough. You start by searching through the dispatch table and find the entry for the system function which handles keyboard events, then you stash away the function’s address somewhere in your running application, and finally you install a new entry with an address that points to your code. (System 7 didn’t have protected memory so an application could write into any RAM location, including the system heap.) When a key was pressed, the operating system would look in the dispatch table, find your address, and then invoke your code instead of the system function. Since you had the location of the original system function you could resume normal keyboard processing by invoking the function at the stored location.
As a matter of fact, this technique was so common that when you fetched the address of the system function from the dispatch table, it might not refer to the actual system function. Another application might have already patched the dispatch table and inserted its address in place of the original. Each application that modified the dispatch table formed a call chain where one function would invoke the next function until the original operating system function was finally called and the chain ended. (Apple itself used this technique to patch bugs that were burned into the system ROM.)
If you think about method names as being addresses to some chunk of code you want to run, it becomes clear that this sort of thing is possible in Ruby too. That’s because you can use alias_method
to give an existing method a new name. You can then call the method by its old name and its new name. But if you then redefine that method and give it a new implementation, you’ll still be able to invoke the original implementation through its other name. You can therefore hijack a method like in my dispatch table example and eventually call the real version. This is referred to as alias chaining and an example is in order.
Suppose you want to enhance a method in one of the core classes so it outputs logging information each time it’s called. You don’t want to change its behavior in any way, you just want to wrap around it so you can log when it’s called and when it’s finished. Sounds like a good use of alias chaining to me.
Even though it’s a form of monkey patching, what allows alias chaining to avoid the downsides discussed in Item 30 is that it can be undone, and usually doesn’t alter the behavior of the target class in a way that will affect other code. Let’s take a look at a module which can be used to add logging capabilities to any method, in any class:
module LogMethod
def log_method (method)
# Choose a new, unique name for the method.
orig = "#{method}_without_logging".to_sym
# Make sure name is unique.
if instance_methods.include?(orig)
raise(NameError, "#{orig} isn't a unique name")
end
# Create a new name for the original method.
alias_method(orig, method)
# Replace original method.
define_method(method) do |*args, &block|
$stdout.puts("calling method `#{method}'")
result = send(orig, *args, &block)
$stdout.puts("`#{method}' returned #{result.inspect}")
result
end
end
end
When a class is extended with the LogMethod
module, it will receive a new class method named log_method
. You can use log_method
to wrap any existing method so that it outputs messages before and after it invokes the original method. Before we dig into the details let’s see it in action:
irb> Array.extend(LogMethod)
irb> Array.log_method(:first)
irb> [1, 2, 3].first
calling method `first'
`first' returned 1
---> 1
irb> %w(a b c).first_without_logging
---> "a"
Before redefining the target method, log_method
uses alias_method
to create a new name for it. The first argument to alias_method
is the new name you want to create and the second argument is the existing name. After alias_method
is called the method will be available by both names. Then log_method
redefines the method using define_method
, giving it a new implementation which performs the logging and uses the aliased name to invoke the original method. It’s like my Macintosh hacking days all over again. Unfortunately, the LogMethod
module isn’t as safe as patching the good old dispatch table in System 7.
One thing you’ll want to ensure is that the new name you create with alias_method
is unique. If a method already exists with that name you’ll clobber it without so much as a warning. That’s why log_method
raises an exception if the aliased name already exists. This version is also a bit simplistic and can’t be used with operators. If you passed “:*
” as the method name to log_method
, it would try to create an alias named “:*_without_logging
”. Obviously that’s not going to work. If you’re looking for something more elaborate and robust you might consider continuously generating method names with a random component until you find one that isn’t already defined. The technique you choose for generating the aliased name will depend on your particular needs.
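One way to do that, as a sketch: keep generating names with a random suffix until one is unused. SafeAlias and unique_alias are hypothetical names, and because the alias never embeds the original method name, operators like :* pose no problem:

```ruby
require('securerandom')

module SafeAlias
  # Hypothetical helper: pick an alias with a random hex suffix,
  # retrying until the name is not already taken.
  def unique_alias
    loop do
      candidate = "original_#{SecureRandom.hex(4)}".to_sym
      next if instance_methods.include?(candidate)
      next if private_instance_methods.include?(candidate)
      return candidate
    end
  end
end

Array.extend(SafeAlias)
name = Array.unique_alias
Array.instance_methods.include?(name)   # => false, safe to use
```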
A final feature to consider is adding a method which can put things back to the way they were originally. This usually involves a call to alias_method
to restore the original implementation and a couple of calls to remove_method
to delete the patched version and aliased name. Consider unlog_method
:
module LogMethod
def unlog_method (method)
orig = "#{method}_without_logging".to_sym
# Make sure log_method was called first.
if !instance_methods.include?(orig)
raise(NameError, "was #{orig} already removed?")
end
# Remove the logging version.
remove_method(method)
# Put the method back to its original name.
alias_method(method, orig)
# Remove the name created by log_method.
remove_method(orig)
end
end
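Putting both halves together (repeating the earlier log_method so this sketch runs on its own), the chain can be installed and then removed, leaving the class exactly as it started:

```ruby
module LogMethod
  def log_method (method)
    orig = "#{method}_without_logging".to_sym
    if instance_methods.include?(orig)
      raise(NameError, "#{orig} isn't a unique name")
    end
    alias_method(orig, method)
    define_method(method) do |*args, &block|
      $stdout.puts("calling method `#{method}'")
      result = send(orig, *args, &block)
      $stdout.puts("`#{method}' returned #{result.inspect}")
      result
    end
  end

  def unlog_method (method)
    orig = "#{method}_without_logging".to_sym
    if !instance_methods.include?(orig)
      raise(NameError, "was #{orig} already removed?")
    end
    remove_method(method)          # remove the logging version
    alias_method(method, orig)     # restore the original
    remove_method(orig)            # drop the aliased name
  end
end

Array.extend(LogMethod)
Array.log_method(:first)
[1, 2, 3].first        # logs, then returns 1
Array.unlog_method(:first)
[1, 2, 3].first        # no logging, back to normal
Array.instance_methods.grep(/without_logging/)   # => []
```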
Alias chaining is an interesting way to intercept method calls. As long as each link in the chain uses a unique name with alias_method
, the original method can eventually be called through the chain.
• When setting up an alias chain, make sure the aliased name is unique.
• Consider providing a method which can undo the alias chaining.
Instances of the Proc
class are ubiquitous in Ruby. Off the top of my head I can think of at least seven different ways to create Proc
objects. And that’s saying something since I have the attention span of a block argument. (That’s right folks, enjoy the comedy while you can.) The most idiomatic way to create a Proc
object is by passing a block to a method. While the block itself is just Ruby syntax, it eventually gets wrapped up in a Proc
and passed to the method. We can see this directly if we write a method that accepts a block and then passes that block through as its return value:
irb> def pass (&block) block; end
irb> greeter = pass {|name| "Hello #{name}"}
---> #<Proc>
irb> greeter.call("World")
---> "Hello World"
The pass
method takes a block and binds it to the variable block
, then simply returns it. What is actually bound to the block
variable is an instance of the Proc
class. Like all good objects you can send it messages, call
being one of them. Creating Proc
objects this way is the most common, but not the only way. There are the proc
and lambda
methods, the Proc::new
method, lambda syntax literals, and several other ways which I could continue to enumerate and bore us both to death. The reason I even bring this up is because all these various ways of creating a Proc
object can be divided into two categories which I’ll call weak and strong. The major differences between weak and strong Proc
objects are how they deal with invalid arguments. (They also differ in how they’re affected by control flow expressions. I won’t go into that here but any introductory book on Ruby should include this information.)
Weak Proc
objects play fast and loose with their arguments. Calling a weak Proc
object with the wrong number of arguments doesn’t raise an exception or produce a warning. If you give too few arguments the missing ones will be set to nil
. If you give too many arguments the extras are ignored. This is much different than strong Proc
objects. Calling a strong Proc
object obeys all the rules of a normal method call. If the given number of arguments isn’t exactly correct, an ArgumentError
exception will be raised. Blocks turn into weak Proc
objects and lambdas into strong, it’s pretty easy to see this in action:
irb> def test
# Yield one argument.
yield("a")
end
irb> test {|x, y, z| [x, y, z]} # Expect 3.
---> ["a", nil, nil]
irb> test {"b"} # Expect 0.
---> "b"
irb> func = ->(x) {"Hello #{x}"} # Expect 1.
---> #<Proc>
irb> func.call("a", "b") # Send 2.
ArgumentError: wrong number of arguments (2 for 1)
You can distinguish between weak and strong Proc
objects using the lambda?
method. It returns false
for weak Proc
objects and true
for strong. This can be helpful in methods which accept blocks because they may receive weak or strong blocks, depending on how they’re called. For example:
irb> def test (&block)
block.lambda?
end
irb> test {|x| x}
---> false
irb> test(& ->(x){x})
---> true
Knowing whether a Proc
is weak or strong isn’t in itself very useful. That is to say, it’s not likely that you’ll want to treat them any differently just by knowing what type of Proc
they are. But knowing that strong Proc
objects raise exceptions if they are called with the wrong number of arguments is good motivation for knowing how many arguments they expect. Let me illustrate this with an example. Suppose you’ve written a class for streaming data from an I/O object to a Proc
. The class feeds the Proc
data in chunks until the input has been exhausted. You also keep track of how many seconds it takes to read each chunk just in case the Proc
wants to calculate throughput. Consider this:
class Stream
def initialize (io=$stdin, chunk=64*1024)
@io, @chunk = io, chunk
end
def stream (&block)
loop do
start = Time.now
data = @io.read(@chunk)
return if data.nil?
time = (Time.now - start).to_f
block.call(data, time)
end
end
end
The stream
method will always give the Proc
object two arguments, the data which was read and the timing information. If the Proc
doesn’t want it—and it’s a weak Proc
—it just ignores the argument. Consider this naive (and inefficient) method for calculating the size of a file:
def file_size (file)
File.open(file) do |f|
bytes = 0
s = Stream.new(f)
s.stream {|data| bytes += data.size}
bytes
end
end
By always yielding two arguments to the Proc
object you’re limiting yourself to weak Proc
objects or alternatively, strong Proc
objects which declare an argument that goes unused. But you can’t always control how many arguments a Proc
expects. What if you wanted to pass a method to stream
instead of a block and that method only accepted a single argument? For example, here’s a method which uses the Stream
class to generate a SHA256 cryptographic hash:
require('digest')
def digest (file)
File.open(file) do |f|
sha = Digest::SHA256.new
s = Stream.new(f)
s.stream(&sha.method(:update))
sha.hexdigest
end
end
The Digest::SHA256
class has a method called update
which allows you to supply data in chunks instead of having to read an entire file into memory. It expects one argument, a string containing the next chunk to add to the hash. We can use the “&
” operator within a method invocation to turn the update
method into a strong Proc
object. But with the Stream
class the way it is now, passing the update
method to stream
will raise an exception because it’s passing two arguments to the Proc
instead of one.
Wouldn’t it be nice if we knew how many arguments the Proc
object expected? That’s where the Proc#arity
method comes in. It returns a Fixnum
which contains the number of arguments which the Proc
object expects to be given. Well, almost. If only it were that simple. Recall that methods can have default arguments, making them optional; what should arity
return in that case? A method might also have a variadic argument by using “*
” to collect all remaining arguments into a single array, basically making the method’s arity infinite. In these cases the arity
method will return a negative Fixnum
which tells you indirectly how many arguments are required.
I say that it’s indirect because the negative Fixnum is actually the one’s complement of the number of required arguments. If a method has one mandatory argument and one optional argument then arity will return -2. You can use the unary complement operator (“~”) to turn that result into the number of required arguments:
irb> func = ->(x, y=1) {x+y}
irb> func.arity
---> -2
irb> ~ func.arity
---> 1
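A few more parameter shapes make the pattern clear (a sketch showing what MRI reports for each signature):

```ruby
# arity for several parameter shapes:
p ->(x)      {}.arity  # →  1 (one required argument)
p ->(x, y)   {}.arity  # →  2 (two required arguments)
p ->(x, y=1) {}.arity  # → -2 (one required, one optional)
p ->(*args)  {}.arity  # → -1 (variadic: no required arguments)

# The unary complement operator recovers the required-argument count:
p ~(->(x, y=1) {}.arity)  # → 1
p ~(->(*args)  {}.arity)  # → 0
```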
Now we can rewrite the stream method so that it only gives the timing information to Proc objects that are expecting two arguments:
def stream (&block)
loop do
start = Time.now
data = @io.read(@chunk)
return if data.nil?
arg_count = block.arity
arg_list = [data]
if arg_count == 2 || ~arg_count == 2
arg_list << (Time.now - start).to_f
end
block.call(*arg_list)
end
end
And with that change the digest method can now pass the Digest::SHA256#update method as a Proc object to the stream method. Being able to use strong Proc objects this way is a neat trick and something you should consider when writing methods which take blocks.
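Putting the pieces together, the arity-aware stream method now serves both kinds of caller (a sketch; the Stream class is reproduced from above, and the 1024-byte default chunk size is an assumption since the original never shows how @chunk is set):

```ruby
require('digest')
require('tempfile')

class Stream
  def initialize (io, chunk = 1024)  # chunk size is an assumed default
    @io    = io
    @chunk = chunk
  end

  def stream (&block)
    loop do
      start = Time.now
      data = @io.read(@chunk)
      return if data.nil?
      arg_list = [data]
      if block.arity == 2 || ~block.arity == 2
        arg_list << (Time.now - start).to_f
      end
      block.call(*arg_list)
    end
  end
end

file = Tempfile.new('stream')
file.write("hello world")
file.rewind

# A weak Proc (block) declaring two arguments receives the timing info...
bytes = 0
Stream.new(file).stream {|data, time| bytes += data.size}

# ...while a strong Proc expecting one argument only receives the data.
file.rewind
sha = Digest::SHA256.new
Stream.new(file).stream(&sha.method(:update))

p bytes          # → 11
p sha.hexdigest  # → the SHA256 of "hello world"
```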
• Unlike weak Proc objects, their strong counterparts will raise an ArgumentError exception if called with the wrong number of arguments.
• You can use the Proc#arity method to find out how many arguments a Proc object expects. A positive number means it expects exactly that number of arguments. A negative number means there are optional arguments; it’s the one’s complement of the number of required arguments.
Back in Item 6 we looked into Ruby’s internals to see how including modules into a class altered the inheritance hierarchy. Recall that when you use the include method from within a class, Ruby creates a singleton class to hold the module’s methods and inserts it as an invisible superclass. When multiple modules are included into a class they are searched by the method-dispatching algorithm in reverse order. An example makes this easier to visualize:
module A
def who_am_i?
"A#who_am_i?"
end
end
module B
def who_am_i?
"B#who_am_i?"
end
end
class C
include(A)
include(B)
def who_am_i?
"C#who_am_i?"
end
end
The two modules (A and B) are included into the C class. All three define a method named who_am_i? to help us see how the different implementations override one another based on the include order. We can also use the ancestors class method to get an idea of how the class hierarchy is constructed and which method would be invoked if we called super from within each of the who_am_i? methods. Consider this:
irb> C.ancestors
---> [C, B, A, Object, Kernel, BasicObject]
irb> C.new.who_am_i?
---> "C#who_am_i?"
As you’d expect, methods defined in C come before those included from the modules. And since the include method inserts modules between the C class and its superclass, the B module comes before A in the search order. Based on the output from the ancestors method we can see that if the C#who_am_i? method used super it would invoke the B#who_am_i? method. The ordering is really important because it allows you to emulate multiple inheritance while retaining the ability to override specific methods in the class which is doing the including, just like you can with a traditional parent class. In other words, methods defined in the class take priority over any methods higher up in the hierarchy. That’s pretty standard object-oriented behavior. But it’s not the only way a module can appear in the class hierarchy.
Starting in Ruby 2.0 you can use the prepend method as another way to insert a module into the inheritance hierarchy. It looks and feels just like the include method; it even has its own module hook called prepended. But prepend behaves very differently from include. Where include inserts a list of modules between the receiver and its superclass, prepend inserts them before the receiver. That’s right, before. This makes for some very surprising changes to method dispatching. Let’s change the C class to use prepend instead of include and see what happens:
class C
prepend(A)
prepend(B)
def who_am_i?
"C#who_am_i?"
end
end
irb> C.ancestors
---> [B, A, C, Object, Kernel, BasicObject]
irb> C.new.who_am_i?
---> "B#who_am_i?"
After prepending the A and B modules into the C class you can see that they show up before C in the list of ancestors. Calling the who_am_i? method on an instance of C will therefore trigger the implementation in the B module first, before the definition in C is even seen. The biggest side effect of prepending is that you can no longer override a module’s method by simply defining a version in the class. The definition in the class is overridden by the module and will only be invoked if the module’s method calls super. This goes against the grain of most object-oriented languages because method dispatch starts below the object’s class in the hierarchy and only makes its way up to the class if the method isn’t found or super is used.
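A prepended module’s method can still reach the class’s own definition with super, which is how you wrap behavior around it (a sketch with a hypothetical Person class invented for this example):

```ruby
module Loud
  def who_am_i?
    super.upcase  # super moves up the hierarchy to the class's definition
  end
end

class Person
  prepend(Loud)

  def who_am_i?
    "person"
  end
end

p Person.ancestors.first(2)  # → [Loud, Person]
p Person.new.who_am_i?       # → "PERSON"
```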
You might be wondering if prepending a module is useful or not. For the most part it gives us a second way of doing things we could already do without it. Take method alias chaining from Item 33, for example. We used alias_method to create a new name for an existing method so we could redefine it with a new implementation but retain the ability to invoke the original implementation. This is analogous to prepending a module in order to redefine a method and then using super to access the original. I prefer using alias_method, however, because it’s easy to put things back the way they were by using it a second time to restore the original implementation. There’s no way to do the same thing with prepend; removing a module once it has been prepended isn’t possible.
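For comparison, here’s what that wrapping technique looks like with prepend and super instead of alias_method (a sketch; the Downloader class and its fetch method are hypothetical names invented for this example):

```ruby
module Instrumented
  def fetch (url)
    calls << url  # record the call...
    super         # ...then fall through to the class's implementation
  end

  def calls
    @calls ||= []
  end
end

# Downloader is a hypothetical class used only for illustration.
class Downloader
  prepend(Instrumented)

  def fetch (url)
    "body of #{url}"
  end
end

d = Downloader.new
p d.fetch("http://example.com/a")  # → "body of http://example.com/a"
p d.calls                          # → ["http://example.com/a"]
```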
Overall, using prepend to add a module to a class leaves the inheritance hierarchy in a nonintuitive state. If you’re going to use it, think very carefully about how it affects method dispatching before proceeding.
• Using the prepend method inserts a module before the receiver in the class hierarchy, which is very different from include, which inserts a module between the receiver and its superclass.
• Similar to the included and extended module hooks, prepending a module triggers the prepended hook.