Custom YAML Emitter
Friday, March 23, 2007 by Nate Murray.
Rational Numbers
Just recently I needed to store a rational number in a database. YAML is perfect for this sort of thing. Unfortunately there isn’t a built-in to_yaml for the standard Rational class.
require 'yaml'
rat = Rational(4,3)
rat.to_s
y = YAML.dump(rat)
back = YAML.load(y)
Notice that rat gets emitted as a vanilla Ruby object with class Rational, but then the emitter just converts rat into a string and we get "4/3" appended to the YAML output. Because the YAML parser doesn’t know what to do with the string "4/3", we get back a Rational object that doesn’t have its numerator or denominator set. We want back to be set to Rational(4, 3), just like the original object.
Register Your Class
What we need to do is register our Rational class with YAML so that it knows how to emit and parse our specific type of object.
We can specify our yaml_type by defining a method to_yaml_type. We then register with YAML by calling YAML::add_domain_type and passing it a block. The YAML parser will then call this block when it loads a value tagged with this matching type.
Notice below that YAML::add_domain_type yields two variables, type and val. type is the YAML type we specified with to_yaml_type and val is the value that was stored during the YAML creation process.
require 'yaml'
class Rational
  def to_yaml_type; "!pasadenarb.com,2007-03-23/rational"; end
end
YAML::add_domain_type( "pasadenarb.com,2007-03-23", "rational") do |type, val|
  type
  val
end
rat = Rational(4,3)
yam = YAML.dump(rat)
back = YAML.load(yam)
Simple Parsing
Notice here that YAML.load returned the value returned by the block we passed to add_domain_type. In this case it is val ("4/3"). We are a little closer to our goal, but back is still not a Rational, it’s a String. What we need to do is improve on the block we are passing to add_domain_type.
We are getting the string "4/3" in val, so we can derive the numerator and denominator from that string and then return a Rational number built from them.
require 'yaml'
class Rational
  def to_yaml_type; "!pasadenarb.com,2007-03-23/rational"; end
end
YAML::add_domain_type( "pasadenarb.com,2007-03-23", "rational") do |type, val|
  # val is the string "4/3"; split it into numerator and denominator
  num, den = val.split("/")
  Rational(num.to_i, den.to_i)
end
rat = Rational(4,3)
yam = YAML.dump(rat)
back = YAML.load(yam)
Notice that back is Rational(4, 3), just as we originally wanted.
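The parsing step can be tried on its own, without YAML. This is only a minimal sketch: the helper name parse_rational is made up for illustration and is not part of the article's code or of the YAML library.

```ruby
# Hypothetical helper: turn a string such as "4/3" back into a Rational.
def parse_rational(text)
  num, den = text.split("/")
  # A bare integer such as "5" has no denominator part, so default to 1.
  Rational(num.to_i, (den || "1").to_i)
end

parse_rational("4/3")  # => (4/3)
parse_rational("5")    # => (5/1)
```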
However, this method is not as perfect as it could be. In this case we are able to derive the attributes we need pretty easily, but what if we had a more complicated object that did not expose all of its attributes when you call #to_s on it? What we need is more control over the YAML creation process. Thankfully that power is available by creating our own #to_yaml method.
Advanced Emitting
If you look at the #to_yaml method below you will see that we are iterating through the instance_variables, using each instance variable name as the key and its value as the value.
Then when we need to create the Rational number from that we just look up the values by key in the val hash.
require 'yaml'
class Rational
  def to_yaml_type; "!pasadenarb.com,2007-03-23/rational"; end
  def to_yaml( opts = {} )
    YAML.quick_emit( self.object_id, opts ) { |out|
      out.map( taguri, to_yaml_style ) { |map|
        instance_variables.sort.each { |iv|
          map.add( iv[1..-1], instance_eval( iv ) )
        }
      }
    }
  end
end
YAML::add_domain_type( "pasadenarb.com,2007-03-23", "rational") do |type, val|
  num, den = val["numerator"], val["denominator"]
  Rational(num.to_i, den.to_i)
end
rat = Rational(4,3)
yam = YAML.dump(rat)
back = YAML.load(yam)
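The to_yaml and YAML.quick_emit calls above belong to the YAML library that shipped with Ruby when this was written. On current Rubies, where YAML is backed by Psych, the same idea is usually expressed with encode_with and init_with. The following is a rough sketch under two assumptions: Psych 3.1 or later is available, and Fraction is a stand-in class invented for this example (Psych already knows how to serialize Rational itself).

```ruby
require "yaml"

# Hypothetical stand-in for a class that needs custom YAML handling.
class Fraction
  attr_reader :numerator, :denominator

  def initialize(numerator, denominator)
    @numerator = numerator
    @denominator = denominator
  end

  # Called by Psych when dumping: choose exactly which keys are emitted.
  def encode_with(coder)
    coder["numerator"] = @numerator
    coder["denominator"] = @denominator
  end

  # Called by Psych when loading: rebuild the state from the emitted keys.
  def init_with(coder)
    @numerator = coder["numerator"]
    @denominator = coder["denominator"]
  end
end

# Dump and reload a Fraction; the class must be explicitly permitted on load.
def round_trip(fraction)
  yaml = YAML.dump(fraction)
  YAML.safe_load(yaml, permitted_classes: [Fraction])
end

back = round_trip(Fraction.new(4, 3))
back.numerator    # => 4
back.denominator  # => 3
```

Keeping the emitted keys explicit in encode_with plays the same role as the map.add calls in the older API.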
Conclusion
As you can see, YAML is a very powerful way to get complex objects into strings. There are a few other shortcuts to get custom objects into YAML, such as defining #to_yaml_properties. If you are interested in doing something simple, I’d start by looking here.
Labels: articles, ruby, yaml
Introduction to Bindings
Tuesday, March 20, 2007 by Nate Murray.
The Pickaxe book describes Binding objects as objects that:
encapsulate the execution context at some particular place in the code and retain this context for future use.
You can get a Binding for the current context by calling Kernel#binding. The Binding stores information about the variables, methods, and self, and you can access them by passing the Binding to eval.
class Product
  def set_title(title)
    @title = title
  end
  def get_binding
    binding
  end
end
p = Product.new
p.set_title("nice and shiny")
q = Product.new
q.set_title("old and ugly")
eval "@title", p.get_binding
eval "@title", q.get_binding
You can see here that @title gets evaluated differently depending on the binding. The first eval returns "nice and shiny" because that is the value of @title for the first Product, p. The second returns "old and ugly", the value of @title for q.
Blocks and Procs
Blocks carry information about their Binding.
a = "inside a"
a_block = lambda { a }
def try_to_set_a(block)
a = "resetting a"
block.call
end
try_to_set_a(a_block)
Notice here that a is "inside a" and not "resetting a". This is because a block keeps the Binding of the place where it was defined, and a method body started with def opens a fresh scope. The a in try_to_set_a is a different local variable and does not interfere with the a in a_block.
An interesting note is that you can redefine a variable within a Binding.
a = "inside a"
a_block = lambda { a }
def try_to_set_a(block)
a = "resetting a"
block.call
end
eval "a = 'something else'"
try_to_set_a(a_block)
This time the call returns "something else". This is because the Binding used by eval in this case is the top-level binding, which happens to be the same binding in which a was originally defined.
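The same effect can be shown without a string eval. The sketch below is an illustration only: make_reader is a made-up helper, and Binding#local_variable_set is a newer API (Ruby 2.1 and later) than the Ruby this article was written against.

```ruby
# Returns a lambda that reads the local variable a, plus the Binding
# of the scope in which that lambda was created.
def make_reader
  a = "inside a"
  reader = lambda { a }
  [reader, binding]
end

reader, scope = make_reader
reader.call                                     # => "inside a"

# Change a through the Binding; the lambda shares that scope, so it sees the change.
scope.local_variable_set(:a, "something else")
reader.call                                     # => "something else"
```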
Practical Use
Bindings are often used when evaluating ERB. (For those of you who don’t know, ERB is a template system that is included in the Ruby Standard Library.) ERB#result takes a Binding object as its argument and the variables in the ERB template are evaluated in this context.
Going back to our Product example from earlier, let’s see how we can use the Product’s bindings in this fashion:
require 'erb'
class Product
  def set_title(title)
    @title = title
  end
  def set_cost(cost)
    @cost = cost
  end
  def get_binding
    binding
  end
end
p = Product.new
p.set_title("nice and shiny")
p.set_cost("19.95")
q = Product.new
q.set_title("old and ugly")
q.set_cost("230.00")
template = ERB.new <<-EO_ERB
== Invoice
Title: <%= @title %>
Cost: <%= @cost %>
EO_ERB
template.result(p.get_binding)
template.result(q.get_binding)
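To make the result concrete, here is a condensed sketch of the same idea with the rendered strings shown. The one-line template, the constructor and the render_invoice helper are simplifications invented for this example.

```ruby
require "erb"

class Product
  def initialize(title, cost)
    @title = title
    @cost = cost
  end

  # Expose this object's execution context so ERB can read its instance variables.
  def get_binding
    binding
  end
end

INVOICE = ERB.new("Title: <%= @title %>, Cost: <%= @cost %>")

# Hypothetical helper: render the shared template in one product's context.
def render_invoice(product)
  INVOICE.result(product.get_binding)
end

render_invoice(Product.new("nice and shiny", "19.95"))
# => "Title: nice and shiny, Cost: 19.95"
render_invoice(Product.new("old and ugly", "230.00"))
# => "Title: old and ugly, Cost: 230.00"
```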
Conclusion
As you can see, Binding is a very handy object, but this article serves as only an introduction to the subject. Here are a couple of articles that deal with bindings a little more in depth.
Jim Weirich’s Variable Bindings in Ruby
Pick Axe page on Binding
Labels: articles, bindings, ruby
6 Ways to Run Shell Commands in Ruby
Tuesday, March 13, 2007 by Nate Murray.
Often we want to interact with the operating system or run shell commands from within Ruby. Ruby provides a number of ways for us to perform this task.
Exec
Kernel#exec (or simply exec) replaces the current process by running the given command. For example:
$ irb
>> exec 'echo "hello $HOSTNAME"'
hello nate.local
$
Notice how exec replaces the irb process with the echo command, which then exits. Because the Ruby process effectively ends, this method has only limited use. The major drawback is that you have no knowledge of the success or failure of the command from your Ruby script.
System
The system command operates similarly but the system command
runs in a subshell instead of replacing the current process. system
gives us a little more information than exec in that it returns
true if the command ran successfully and false otherwise.
$ irb
>> system 'echo "hello $HOSTNAME"'
hello nate.local
=> true
>> system 'false'
=> false
>> puts $?
256
=> nil
>>
system sets the global variable $? to the exit status of the
process. Notice that we have the exit status of the false command
(which always exits with a non-zero code). Checking the exit code gives us the
opportunity to raise an exception or retry our command.
System is great if all we want to know is “Was my command successful or not?” However, we often want to capture the output of the command and then use that value in our program.
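The "raise an exception on failure" idea can be sketched as a small wrapper. run_checked is a hypothetical helper name, and Process::Status#exitstatus is used to read the plain exit code rather than the raw status integer printed in the session above.

```ruby
# Hypothetical helper: run a command, raise if it fails, return its exit code.
def run_checked(*command)
  ok = system(*command)
  code = $?.exitstatus  # plain exit code, e.g. 0 or 3
  raise "#{command.join(' ')} exited with #{code}" unless ok
  code
end

run_checked("true")  # => 0

begin
  run_checked("sh", "-c", "exit 3")
rescue RuntimeError => e
  e.message  # => "sh -c exit 3 exited with 3"
end
```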
Backticks (`)
Backticks (also called “backquotes”) run the command in a subshell and return the standard output from that command.
$ irb
>> today = `date`
=> "Mon Mar 12 18:15:35 PDT 2007\n"
>> $?
=> #<Process::Status: pid=25827,exited(0)>
>> $?.to_i
=> 0
This is probably the most commonly used and widely known method to run commands
in a subshell. As you can see, this is very useful in that it returns the
output of the command and then we can use it like any other string.
Notice that $? is not simply an integer of the return status but
actually a Process::Status object. We have not only the exit status
but also the process id. Process::Status#to_i gives us the exit status
as an integer (and #to_s gives us the exit status as a string).
One consequence of using backticks is that we only get the standard
output (stdout) of this command but we do not get the standard
error (stderr). In this example we run a Perl script which outputs
a string to stderr.
$ irb
>> warning = `perl -e "warn 'dust in the wind'"`
dust in the wind at -e line 1.
=> ""
>> puts warning
=> nil
Notice that the variable warning doesn’t get set! When we warn in
Perl this is output on stderr which is not captured by backticks.
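If the stderr text is needed, two common workarounds are to redirect stderr into stdout inside the shell, or to use Open3.capture3 from the standard library (available in Ruby 1.9 and later, so newer than the Ruby used in this article). The sketch below assumes a POSIX sh; both helper names are made up.

```ruby
require "open3"

# Wrap the command in a subshell and merge its stderr into stdout,
# so that backticks capture both streams.
def capture_all(command)
  `(#{command}) 2>&1`
end

# Keep the two streams separate and also report whether the command succeeded.
def capture_separately(command)
  stdout, stderr, status = Open3.capture3("sh", "-c", command)
  { stdout: stdout, stderr: stderr, ok: status.success? }
end

capture_all("echo 'dust in the wind' >&2")
# => "dust in the wind\n"
capture_separately("echo out; echo 'dust in the wind' >&2")
# => {:stdout=>"out\n", :stderr=>"dust in the wind\n", :ok=>true}
```

The parentheses in capture_all matter: without them, a command that already redirects its own output would be affected by the appended 2>&1 in an unexpected order.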
IO.popen
IO.popen is another way to run a command in a subprocess. popen gives you a bit more control in that the subprocess’s standard input and standard output are both connected to the IO object.
$ irb
>> IO.popen("date") { |f| puts f.gets }
Mon Mar 12 18:58:56 PDT 2007
=> nil
While IO.popen is nice, I typically use Open3.popen3 when I need this level of granularity.
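Since both ends are connected, the same IO object can be written to and read from when it is opened in "r+" mode. A small sketch, assuming the tr utility is installed; upcase_via_tr is a made-up helper name.

```ruby
# Hypothetical helper: pipe text through `tr` and read back the result.
def upcase_via_tr(text)
  IO.popen(["tr", "a-z", "A-Z"], "r+") do |io|
    io.write(text)
    io.close_write  # signal end of input so tr can finish
    io.read
  end
end

upcase_via_tr("hello\n")  # => "HELLO\n"
```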
Open3.popen3
The Ruby standard library includes the Open3 module. It’s easy to use and returns stdin, stdout and stderr. In this example, let’s use the interactive command dc. dc is a reverse-Polish calculator that reads from stdin. We will push two numbers and an operator onto the stack, then use p to print the result of the operator applied to the two numbers. Below we push on 5, 10 and + and get a response of 15\n on stdout.
$ irb
>> require "open3"
=> true
>> stdin, stdout, stderr = Open3.popen3('dc')
=> [#<IO:0x6e5474>, #<IO:0x6e5438>, #<IO:0x6e53d4>]
>> stdin.puts(5)
=> nil
>> stdin.puts(10)
=> nil
>> stdin.puts("+")
=> nil
>> stdin.puts("p")
=> nil
>> stdout.gets
=> "15\n"
Notice that with this command we not only read the output of the command
but we also write to the stdin of the command. This allows us a
great deal of flexibility in that we can interact with the command if needed.
popen3 will also give us the stderr if we need it.
# (irb continued...)
>> stdin.puts("asdfasdfasdfasdf")
=> nil
>> stderr.gets
=> "dc: stack empty\n"
However, there is a shortcoming with popen3 in Ruby 1.8.5: it doesn’t return the proper exit status in $?.
$ irb
>> require "open3"
=> true
>> stdin, stdout, stderr = Open3.popen3('false')
=> [#<IO:0x6f39c0>, #<IO:0x6f3984>, #<IO:0x6f3920>]
>> $?
=> #<Process::Status: pid=26285,exited(0)>
>> $?.to_i
=> 0
0? false is supposed to return a non-zero exit status! It is
this shortcoming that brings us to Open4.
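For readers on a newer Ruby than the 1.8.5 discussed here: since Ruby 1.9 the block form of Open3.popen3 passes a fourth argument, a thread whose value is the Process::Status of the command. A minimal sketch, with exit_code_of as a made-up helper name:

```ruby
require "open3"

# Hypothetical helper: run a command via popen3 and return its exit code.
def exit_code_of(*command)
  Open3.popen3(*command) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    # Drain both pipes so the child cannot block on a full buffer.
    # (Fine for small outputs; very large stderr output would need threads.)
    stdout.read
    stderr.read
    wait_thr.value.exitstatus
  end
end

exit_code_of("false")  # => 1
exit_code_of("true")   # => 0
```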
Open4.popen4
Open4 is a Ruby gem put together by Ara Howard. Its popen4 method operates similarly to Open3.popen3 except that we can get the exit status from the program. popen4 returns a process id for the subshell and we can get the exit status by waiting on that process. (You will need to do a gem install open4 to use this.)
$ irb
>> require "open4"
=> true
>> pid, stdin, stdout, stderr = Open4::popen4 "false"
=> [26327, #<IO:0x6dff24>, #<IO:0x6dfee8>, #<IO:0x6dfe84>]
>> $?
=> nil
>> pid
=> 26327
>> ignored, status = Process::waitpid2 pid
=> [26327, #<Process::Status: pid=26327,exited(1)>]
>> status.to_i
=> 256
A nice feature is that you can call popen4 as a block and it will
automatically wait for the return status.
$ irb
>> require "open4"
=> true
>> status = Open4::popen4("false") do |pid, stdin, stdout, stderr|
?> puts "PID #{pid}"
>> end
PID 26598
=> #<Process::Status: pid=26598,exited(1)>
>> puts status
256
=> nil
Please send comments and revision suggestions to Nate Murray
$Id$ Tue Mar 13 07:45:42 PDT 2007
Labels: articles, shell
Testing Private Methods
Friday, March 09, 2007 by Nate Murray.
This probably isn't news to most of you, but it might help someone. Sometimes you want to test private methods. If you want, you can just make the method public from within a #class_eval and then call it in your test. For example:
def test_private_method
  product = products(:first)
  product.class.class_eval do
    public :some_private_method
  end
  assert product.some_private_method
end
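One thing to keep in mind is that class_eval with public changes the class for the rest of the test run. An alternative that leaves visibility untouched is Object#send, which ignores it. A small sketch with a made-up Invoice class:

```ruby
# Hypothetical class with a private method we want to exercise.
class Invoice
  private

  def secret_total
    42
  end
end

invoice = Invoice.new
invoice.send(:secret_total)         # => 42
invoice.respond_to?(:secret_total)  # => false, the method is still private
```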
Labels: beginner, ruby, snippets
Directory Trees
Tuesday, March 06, 2007 by Nate Murray.
Below is a code snippet for putting a directory tree into a data structure.
Basically what I wanted was for each folder to be a hash, with the key being the folder name and the value being an array of the files and folders it contains. For example:
The folders:
content/policy
content/policy/privacy_policy.txt
content/policy/about_us.txt
content/policy/mean_policy
content/policy/mean_policy/nice_people.txt
content/policy/mean_policy/mean_people.txt
content/index.txt
content/content
content/content/misc
Creates the structure:
{"content"=>
[{"content"=>[{"misc"=>[]}]},
"index.txt",
{"policy"=>
["about_us.txt",
{"mean_policy"=>["mean_people.txt", "nice_people.txt"]},
"privacy_policy.txt"]}]}
The recursive code snippet is posted below:
def content_files_in_dir(dir, results = {}, opts = {})
  return nil unless File.exist?(dir)
  # Skip hidden entries (including "." and "..") and sort so the result is stable.
  entries = Dir.entries(dir).delete_if { |f| f =~ /^\./ }.sort
  key = File.basename(dir)
  values = []
  entries.each do |entry|
    full_entry = File.join(dir, entry)
    # Directories become nested hashes; plain files stay as strings.
    values << ( File.directory?(full_entry) ?
      content_files_in_dir(full_entry, results, opts) :
      entry )
  end
  { key => values }
end
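To try the idea out without touching real content, the sketch below builds a small tree in a temporary directory. It uses a trimmed-down variant of the function (no results or opts arguments, entries sorted); demo_tree is a made-up helper.

```ruby
require "tmpdir"
require "fileutils"

# Trimmed-down variant of the snippet: folder name => sorted array of contents.
def content_files_in_dir(dir)
  return nil unless File.exist?(dir)
  entries = Dir.entries(dir).reject { |f| f.start_with?(".") }.sort
  values = entries.map do |entry|
    full_entry = File.join(dir, entry)
    File.directory?(full_entry) ? content_files_in_dir(full_entry) : entry
  end
  { File.basename(dir) => values }
end

# Hypothetical demo: create a throwaway tree, then convert it.
def demo_tree
  Dir.mktmpdir do |tmp|
    root = File.join(tmp, "content")
    FileUtils.mkdir_p(File.join(root, "policy", "mean_policy"))
    FileUtils.touch(File.join(root, "index.txt"))
    FileUtils.touch(File.join(root, "policy", "about_us.txt"))
    FileUtils.touch(File.join(root, "policy", "mean_policy", "nice_people.txt"))
    content_files_in_dir(root)
  end
end

demo_tree
# => {"content"=>["index.txt", {"policy"=>["about_us.txt", {"mean_policy"=>["nice_people.txt"]}]}]}
```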
Labels: recursion, ruby, snippets
Using Parameters as Default Parameters
by Nate Murray.
I noticed something interesting about default parameters today. You can actually use an earlier parameter inside the default value of a later parameter, even inside a data structure. For instance:
[nathan@nate ~]$ irb
>> def foo(arg1, arg2 = [arg1])
>> puts arg1.inspect
>> puts arg2.inspect
>> end
=> nil
>> foo 3
3
[3]
=> nil
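Defaults are evaluated left to right at call time, so a later default can also build on an earlier default. A small sketch with a made-up box method:

```ruby
# height defaults to width, and area defaults to the product of the two.
def box(width, height = width, area = width * height)
  [width, height, area]
end

box(3)     # => [3, 3, 9]
box(3, 4)  # => [3, 4, 12]
```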
Labels: interesting, ruby, snippets
Dealing with Collections in Rails Views
Wednesday, February 14, 2007 by Nate Murray.
Often, we want to render collections of things in Rails views. In this example the end goal is to output a list of Category names with URLs. This is often referred to as "breadcrumbs".
Example 1
For our breadcrumbs the first thing one might think to do is create a helper to perform this function.
For those of you who are new to Rails, helpers are often designed to return a string and this string is added to the output.
Below is our first attempt at a helper for this problem. This code returns a string with our anchor tags separated by an image.
== helpers/sites_helper.rb
def make_breadcrumbs(breadcrumbs)
  crumbs = []
  spacer = "<img src='/images/separator.gif' width=5 height=5 border=0>"
  breadcrumbs.each do |crumb|
    crumbs << "<a href='#{url_for_category(crumb)}'>#{crumb.name}</a>"
  end
  return crumbs.join(spacer)
end
We can call this helper easily from our view:
== views/sites/_breadcrumbs.rhtml
<%= make_breadcrumbs(@breadcrumbs) %>
This works just fine, but there are a couple of consequences:
- It's very specific to our problem.
- All the HTML is in the code instead of in our views.
Example 2
"But wait a minute," you ask, "Doesn't Rails come with built-in collection-rendering methods?" You're right! In fact we can use Rails' render method to achieve the same effect.
== views/sites/_breadcrumbs.rhtml
<%= render :partial => "sites/breadcrumb",
:collection => @breadcrumbs,
:spacer_template => 'sites/breadcrumb_spacer' %>
In our view we call the render method and we pass the options :collection and :spacer_template. This renders the partial sites/breadcrumb once for each element of @breadcrumbs, exposing that element as the local variable breadcrumb.
The two templates are below:
== views/sites/_breadcrumb.rhtml
<a href='<%= url_for_category(breadcrumb)%>'><%=breadcrumb.name%></a>
== views/sites/_breadcrumb_spacer.rhtml
<img src='/images/separator.gif' width=5 height=5 border=0>
A benefit of this is that all of our HTML is in our views. However, a consequence is that we have to create two new partials that contain only one line each.
What we really want is a way to keep all of our HTML in our view while not having to create any extra templates.
Example 3
Here is an example of the syntax we want:
== views/sites/_breadcrumbs.rhtml
<% spacer = capture do %>
<img src="/images/separator.gif" width=5 height=5 border=0>
<% end %>
<% with_collection @breadcrumbs, :spacer_template => spacer do |crumb| %>
<a href='<%= url_for_category(crumb) %>'><%= crumb.name %></a>
<% end %>
The Rails helper capture takes the block and puts it in the variable spacer.
Then we put this in helpers/sites_helper.rb
== helpers/sites_helper.rb
def with_collection(collection, opts={}, &proc)
  collection.each_with_index do |element, i|
    yield element
    if spacer = opts[:spacer_template]
      concat(spacer, proc.binding) unless i == collection.size - 1
    end
  end
end
This helper takes the collection and yields each element. As each element is
yielded the block in the view is added to the output.
The trick is that we add the spacer to the output by using the concat method. The
concat method outputs the string to the view from the helper. This is how we
can output something to the template without using <%= (This is also
what the helpers such as form_for use).
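The interleaving logic itself can be tried outside Rails. The sketch below is a plain-Ruby stand-in, not the Rails helper: it builds and returns a string instead of calling concat, and Crumb and join_with_spacer are names invented for this example.

```ruby
# Hypothetical stand-in for a category record.
Crumb = Struct.new(:name, :url)

# Yield each element, appending the spacer between elements but not after the last.
def join_with_spacer(collection, spacer = nil)
  output = "".dup
  collection.each_with_index do |element, i|
    output << yield(element)
    output << spacer if spacer && i < collection.size - 1
  end
  output
end

crumbs = [Crumb.new("Home", "/"), Crumb.new("Books", "/books")]
join_with_spacer(crumbs, " &gt; ") { |c| "<a href='#{c.url}'>#{c.name}</a>" }
# => "<a href='/'>Home</a> &gt; <a href='/books'>Books</a>"
```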
Labels: articles, helpers, rails, ruby
Finding a Better State Pattern (in Ruby)
Wednesday, February 07, 2007 by Nate Murray.
Introduction
Gang of Four outlines a pattern for modifying an object depending on its state.
This is called the State pattern. I'm not going to go into all the details
here, so if you are unfamiliar with the State pattern I'd recommend looking
here and here.
This post addresses how to implement the typical State pattern in Ruby and
explores its consequences and alternatives.
Problem: An object's behavior needs to be modified depending on what state it is in.
Solution: Create a State object and delegate the functionality to that State object.
Traditional State Pattern
This is typically done by creating an Abstract State and then subclassing it to create a Concrete State. The originating object, referred to as the Context, then delegates the specific method to an instance of the Concrete State along with any information needed.
For example, say we have a Product and we want the inventory levels to vary
depending on where we are selling the Product. While the actual inventory we
have doesn't change, we may want to tell, say, eBay or Amazon, that the
inventory is lower than what we actually have to prevent overselling.
Figure 1. is a traditional implementation of this.
A simple implementation would delegate the method to the state 100% of the
time. However, in our system, assume that we only want to use the State if it
exists and if it contains the method we are interested in. This allows
us to have default behaviors in our Product object. This may not always be the
case, but in our example it is.
In Ruby, this looks something like the following:
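class ProductState; end # the Abstract State

class Product
  attr_accessor :state
  def inventory
    if state && state.respond_to?(:inventory)
      return state.send(:inventory, self)
    else
      return @inventory
    end
  end
end # end Product

class AmazonProductState < ProductState
  # Take the raw inventory from the Context object (the Product) and divide it by 2.
  # Read @inventory directly: calling context.inventory here would delegate
  # straight back to this state and recurse forever.
  def inventory(context)
    context.instance_variable_get(:@inventory) / 2
  end
end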
Consequences
- A consequence of this approach is that you have to write the lines (call them #1)
  if state && state.respond_to?(:inventory)
  return state.send(:inventory, self)
  every time. Ideally we wouldn't have to write this over and over for every method we want to delegate.
A slight improvement
What we could do is write a method that writes the delegation code for the method for us.
def define_state_method(name, &block)
  # ...
end
# Then just call that method for each method we want delegated
define_state_method :quantity_on_hand do
  return @inventory
end # writes #1 in effect
Consequences
- Eliminates the repetition of #1
- However, you still have to plan ahead to make this method be delegated to the State object.
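The body of define_state_method is left out above. One possible way to fill it in is sketched below; this is a guess at the intent, not the author's code. The initialize method and the HalvingState class are invented for the example, and the default block omits return so it can be run with instance_exec.

```ruby
class Product
  attr_accessor :state

  def initialize(inventory)
    @inventory = inventory
  end

  # Hypothetical macro: define `name` so that it delegates to the state when
  # the state responds to it, and otherwise runs the default block.
  def self.define_state_method(name, &default)
    define_method(name) do
      if state && state.respond_to?(name)
        state.send(name, self)
      else
        instance_exec(&default)
      end
    end
  end

  define_state_method :inventory do
    @inventory
  end
end

# Hypothetical state that reports half of the real inventory.
class HalvingState
  def inventory(context)
    # Read the raw value to avoid delegating back into this state.
    context.instance_variable_get(:@inventory) / 2
  end
end

product = Product.new(10)
product.inventory                 # => 10
product.state = HalvingState.new
product.inventory                 # => 5
```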
Ideally, what we want to do is have any method overwritten for a
particular instance when the state is set. We want to keep the
same class, and we don't want to change the methods of other instances of that
class.
Just extend it
We can achieve the effect we are looking for by simply extending our Product object with a module. Note that extend only accepts modules, so the state is written as a module rather than as a subclass of ProductState. See below:
module AmazonProductState
  def inventory
    @inventory / 2
  end
  def new_method
    "7 llamas"
  end
end
class Product
  ...
  def set_state(state_module)
    extend state_module
  end
  ...
end
p = Product.find(1) # => #<Product:@id=1...>
q = Product.find(1) # => #<Product:@id=1...>
q.inventory # => 10
p.set_state(AmazonProductState) # => #<Product:@id=1...>
p.inventory # => 5
p.respond_to?(:new_method) # => true
q.respond_to?(:new_method) # => false
Consequences
- Allows us to override any method without having to plan ahead when designing the Product object
- Allows us to add new methods that only exist in the state
- The new state does not change the class of the object
- The new state only affects particular instances of the object, not the whole class
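The consequences listed above can be checked with a self-contained sketch. It uses a plain Ruby Product with a constructor in place of the database-backed one, which is an assumption made so the example runs on its own.

```ruby
# The state is a module so that it can be mixed into a single object.
module AmazonProductState
  def inventory
    @inventory / 2
  end

  def new_method
    "7 llamas"
  end
end

class Product
  def initialize(inventory)
    @inventory = inventory
  end

  def inventory
    @inventory
  end

  # Mix the state module into this one instance only.
  def set_state(state_module)
    extend(state_module)
  end
end

amazon = Product.new(10)
plain  = Product.new(10)
amazon.set_state(AmazonProductState)

amazon.inventory                 # => 5
plain.inventory                  # => 10
amazon.respond_to?(:new_method)  # => true
plain.respond_to?(:new_method)   # => false
amazon.class                     # => Product
```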
Labels: articles, design patterns, gof, mixins, ruby