
April 2008 Archives

April 13, 2008

I lead a typical, boring life

~ $ history|awk '{a[$2]++} END{for(i in a){printf "%5d\t%s\n",a[i],i}}'|sort -rn|head
  285	hg
   51	script/server
   32	cd
   17	rm
   15	ls
   10	rake
    8	vi
    8	ps
    8	mate
    8	cap

April 1, 2008

A Modest Syntax Proposal: RBXML

Ruby is a wonderful language, but its syntax has some shortcomings. Inspired by Perl, the syntax of Ruby requires many special cases and is difficult to parse by anything but a full Ruby interpreter. We seek to fix this issue with a new, unambiguous syntax for Ruby: RBXML.

"Business Reporting and XML go hand in hand. I plan to convert Ruport to RBXML as soon as I can!"
-- Gregory Brown, Ruby Reports maintainer

Consider the following Ruby code. While it is certainly concise, it achieves this concision through the use of obscure symbols such as . and |. This makes it difficult to read and, more importantly, difficult for a machine to parse.

class Integer
  def factorial
    (1..self).inject(1) {|p, x| p*x}
  end
end
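
For a sense of what a machine makes of this code, here is a minimal sketch using Ripper, the parser interface in Ruby's standard library. The choice of Ripper is an assumption made for illustration; the proposal itself names no particular tool. The exact shape of the tree varies between Ruby versions, so no output is shown.

```ruby
require "ripper"
require "pp"

source = <<~RUBY
  class Integer
    def factorial
      (1..self).inject(1) {|p, x| p*x}
    end
  end
RUBY

# Ripper.sexp parses the source without running it and returns the
# parse tree as nested arrays whose first element is :program.
tree = Ripper.sexp(source)
pp tree
```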

Rubyists talk a lot about metaprogramming, but how are you supposed to introspect on a language like that? Some projects try to work around these limitations by letting Ruby code introspect on itself; however, the result can be unwieldy. Why reinvent the wheel when XML has been around for years? Consider the much more regular and understandable syntax in RBXML:

<class name="Integer">
  <def name="factorial">
    <method-call>
      <name>inject</name>
      <receiver>
        <range exclude-end="false">
          <start>
            <integer>1</integer>
          </start>
          <end>self</end>
        </range>
      </receiver>
      <argument>
        <integer>1</integer>
      </argument>
      <block>
        <parameter name="p" />
        <parameter name="x" />
        <code>
          <method-call>
            <name>*</name>
            <receiver>p</receiver>
            <argument>x</argument>
          </method-call>
        </code>
      </block>
    </method-call>
  </def>
</class>

Just as languages in the Lisp family represent their parse trees with s-expressions, this regular syntax represents the parse tree directly, with only a small bit of XML parsing. Rubyists now have the advantage of not having an extra layer of syntax between them and their concepts.
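
As a rough illustration of that "small bit of XML parsing," the sketch below walks the verbose RBXML form with REXML (shipped with Ruby, as a bundled gem since 3.0) and emits old-style Ruby source. The module name RBXMLSketch and every method in it are hypothetical; since RBXML has no public release, this is a guess at how a reader might work, not the real implementation.

```ruby
require "rexml/document"

# Hypothetical translator from verbose RBXML to old-style Ruby source.
module RBXMLSketch
  module_function

  # Convert one RBXML element to a string of Ruby source.
  def to_ruby(node)
    case node.name
    when "class", "def"
      "#{node.name} #{node.attributes['name']}\n#{body(node)}\nend"
    when "integer"
      Integer(node.text.strip).to_s
    when "range"
      dots = node.attributes["exclude-end"] == "true" ? "..." : ".."
      "(#{value(node.elements['start'])}#{dots}#{value(node.elements['end'])})"
    when "method-call"
      method_call(node)
    else
      raise ArgumentError, "unknown RBXML element: #{node.name}"
    end
  end

  # Translate every child element and join the results line by line.
  def body(node)
    node.elements.map { |child| to_ruby(child) }.join("\n")
  end

  # A wrapper such as <receiver> holds one child element or bare text.
  def value(wrapper)
    child = wrapper.elements[1]
    child ? to_ruby(child) : wrapper.text.strip
  end

  def method_call(node)
    name     = node.elements["name"].text.strip
    receiver = value(node.elements["receiver"])
    args     = node.get_elements("argument").map { |arg| value(arg) }
    source =
      if name.match?(/\A\W+\z/) && args.size == 1
        "#{receiver} #{name} #{args.first}" # binary operator such as *
      else
        "#{receiver}.#{name}(#{args.join(', ')})"
      end
    block = node.elements["block"]
    block ? "#{source} #{block_source(block)}" : source
  end

  def block_source(block)
    params = block.get_elements("parameter").map { |param| param.attributes["name"] }
    "{ |#{params.join(', ')}| #{body(block.elements['code'])} }"
  end
end

xml = <<~XML
  <method-call>
    <name>*</name>
    <receiver>p</receiver>
    <argument>x</argument>
  </method-call>
XML

puts RBXMLSketch.to_ruby(REXML::Document.new(xml).root) # prints: p * x
```

Fed the full Integer#factorial document above, the same sketch should hand back the original class definition, minus the indentation.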

But it doesn't stop there. We can add some simple syntactic sugar to condense our code further without making it significantly less readable. With a simple application of "duck typing," we infer a value of 1 from the string "1", thus eliminating the need for complex type annotations such as <integer>1</integer>. This principle can be applied to the <parameter> tag as well, using the well-known grammatical trick of a "comma splice." Here is the result:

<class name="Integer">
  <def name="factorial">
    <method-call name="inject" arguments="1"> <!-- quack! -->
      <receiver>
        <range exclude-end="false" start="1" end="self" />
      </receiver>
      <block parameters="p,x">
        <code>
          <method-call name="*" receiver="p" arguments="x" />
        </code>
      </block>
    </method-call>
  </def>
</class>
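
The two bits of sugar might be handled along these lines. This is only a sketch: the helper names quack and comma_splice are invented for illustration and do not come from RBXML.

```ruby
# Hypothetical helpers; neither name is part of RBXML.

# "Duck typing": if the string looks like an integer, it is one.
def quack(value)
  Integer(value, 10)
rescue ArgumentError
  value # not a number, so keep it as an identifier such as "self" or "p"
end

# "Comma splice": split a parameters or arguments attribute into names.
def comma_splice(list)
  list.split(",").map(&:strip)
end

p quack("1")           # => 1
p quack("self")        # => "self"
p comma_splice("p,x")  # => ["p", "x"]
```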

With a special syntax (Integer#factorial) familiar to all Ruby coders, we can eliminate the <class> tag that is really just a holdover from Ruby's original syntax. The <block> tag also now contains a superfluous <code> tag, which we can remove. In addition, the exclude-end attribute of <range> can be omitted, as it sensibly defaults to false. This tightens up the RBXML even more:

<def name="Integer#factorial">
  <method-call name="inject" arguments="1">
    <receiver>
      <range start="1" end="self" />
    </receiver>
    <block parameters="p,x">
      <method-call name="*" receiver="p" arguments="x" />
    </block>
  </method-call>
</def>

Some community members have expressed the feeling that some Ruby syntax is "intuitive." As a transition aid for such people, we offer a compromise syntax. The code attribute will interpret code according to Ruby's old syntax rules, hopefully easing the transition to the new RBXML syntax:

<def name="Integer#factorial">
  <method-call name="inject" arguments="1">
    <receiver code="1..self" />
    <block code="|p, x| p * x" />
  </method-call>
</def>

However, users should be cautioned that the code attribute spins up a new Ruby interpreter process for every invocation, and thus should be used sparingly. It may be more efficient to use the <code> tag at the toplevel, as all toplevel blocks are evaluated within one Ruby interpreter. This code should perform almost as well as current versions of MRI:

<code>
<![CDATA[
  class Integer
    def factorial
      (1..self).inject(1) {|p, x| p*x}
    end
  end
]]>
</code>
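
To get a feel for the cost difference described above, the sketch below times the same snippet run in a fresh interpreter per call and inside the running one. Both helper methods are hypothetical stand-ins for the assumed behavior of the code attribute and the toplevel <code> tag; the timings depend entirely on the machine, so none are shown.

```ruby
require "rbconfig"
require "benchmark"

SNIPPET = "(1..5).inject(1) { |p, x| p * x }"

# Assumed behavior of the code attribute: one new interpreter per invocation.
def eval_in_new_interpreter(code)
  IO.popen([RbConfig.ruby, "-e", "print(#{code})"], &:read)
end

# Assumed behavior of a toplevel <code> block: reuse the running interpreter.
def eval_in_process(code)
  eval(code).to_s
end

spawned = Benchmark.realtime { 5.times { eval_in_new_interpreter(SNIPPET) } }
shared  = Benchmark.realtime { 5.times { eval_in_process(SNIPPET) } }
puts format("new interpreter: %.4fs, same interpreter: %.4fs", spawned, shared)
```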

RBXML is still under development. Unfortunately, we are unable to do a public release at this point, on the advice of our legal team. We hope to release the 1.0 version by April 1, 2009.

About April 2008

This page contains all entries posted to Brad Ediger in April 2008. They are listed from oldest to newest.
