Tools that rewrite Ruby code, such as rubocop, do so using the excellent parser gem. The parser gem allows you to convert your Ruby code into an AST (abstract syntax tree). For an introduction to this topic, see the introduction to the parser gem.
When building textractor, we often found ourselves writing code to query and filter ASTs to find the exact node to modify. For example, to programmatically transform
<%= f.text_field :name, placeholder: "Your name" %> into
<%= f.text_field :name, placeholder: t('.your_name') %> we have to find the node holding the value for the
placeholder key, in a hash that happens to be an argument to a method call.
It turns out that there is already an excellent query language for searching trees: XPath! All we need to do is turn an AST into an XML tree, run the XPath query, and find the original AST nodes belonging to the matches.
TL;DR: This article shows you how to query a Ruby AST with XPath and get the original AST nodes back.
So what does the AST for our example input
<%= f.text_field :name, placeholder: "Your name" %> look like?
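As a rough sketch, this is the tree we would expect when the Ruby inside the tag is parsed on its own, so that `f` is seen as a plain method call rather than a block argument. To keep the snippet runnable without the parser gem installed, `Node`, `s` and `EXAMPLE_AST` are stand-ins we made up to mimic `Parser::AST::Node` and its `s(...)` notation; they are not the gem's own classes.

```ruby
# Stand-in for Parser::AST::Node: all we rely on is #type and #children.
Node = Struct.new(:type, :children)

# Helper mirroring the s(:type, child, ...) notation used for parser ASTs.
def s(type, *children)
  Node.new(type, children)
end

# f.text_field :name, placeholder: "Your name"
EXAMPLE_AST =
  s(:send,
    s(:send, nil, :f), :text_field,   # receiver and method name
    s(:sym, :name),                   # first argument
    s(:hash,                          # trailing hash argument
      s(:pair, s(:sym, :placeholder), s(:str, "Your name"))))

# Render a tree back into s(...) notation so we can inspect it.
def to_sexp(node)
  return node.inspect unless node.is_a?(Node)
  parts = [node.type.inspect] + node.children.map { |child| to_sexp(child) }
  "s(#{parts.join(', ')})"
end

puts to_sexp(EXAMPLE_AST)
```

Note that children are a mix of nested nodes and plain Ruby literals (`nil`, symbols, strings), which matters for the XML conversion.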
We need to recursively convert this data structure to XML. Here is a small class that does exactly that:
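The original listing is not reproduced here, so the following is a minimal sketch of such a class rather than the exact implementation. It assumes a node only needs to respond to `type` and `children` (as `Parser::AST::Node` does); the `Node` struct and `s` helper are stand-ins, and naming literal elements after their Ruby class (`Symbol`, `String`, `NilClass`) is our own choice.

```ruby
require 'rexml/document'

# Stand-in for Parser::AST::Node so the sketch runs without the parser gem.
Node = Struct.new(:type, :children)
def s(type, *children)
  Node.new(type, children)
end

class XMLAST
  attr_reader :doc

  def initialize(ast)
    @doc = REXML::Document.new
    build(@doc, ast)
  end

  private

  # AST nodes become elements named after their type, with children nested.
  # Literal children (nil, symbols, strings) become empty elements named
  # after their Ruby class.
  def build(parent, node)
    if node.respond_to?(:children)
      element = parent.add_element(node.type.to_s)
      node.children.each { |child| build(element, child) }
    else
      parent.add_element(node.class.name)
    end
  end
end

SMALL_XML = XMLAST.new(s(:send, nil, :puts, s(:str, "hi")))
puts SMALL_XML.doc.to_s
```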
We use REXML because it comes with the Ruby standard library. Performance so far has been good, but if XML / XPath processing becomes your bottleneck, it’s pretty easy to replace it with nokogiri.
Let’s see it in action:
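A possible way to try this out, sketched with a compact function instead of a class so the snippet runs on its own. `Node`, `s`, `ast_to_xml` and `EXAMPLE_AST` are hypothetical names, and the tree is built by hand instead of being parsed with the parser gem.

```ruby
require 'rexml/document'

# Stand-in for Parser::AST::Node (an assumption, so no gem is required).
Node = Struct.new(:type, :children)
def s(type, *children)
  Node.new(type, children)
end

# Compact recursive conversion: nodes become elements, literals become
# empty elements named after their Ruby class.
def ast_to_xml(parent, node)
  unless node.respond_to?(:children)
    return parent.add_element(node.class.name)
  end
  element = parent.add_element(node.type.to_s)
  node.children.each { |child| ast_to_xml(element, child) }
  element
end

# f.text_field :name, placeholder: "Your name"
EXAMPLE_AST = s(:send, s(:send, nil, :f), :text_field, s(:sym, :name),
                s(:hash, s(:pair, s(:sym, :placeholder), s(:str, "Your name"))))

DOC = REXML::Document.new
ast_to_xml(DOC, EXAMPLE_AST)

# Document#write takes an output target and an indent width.
PRETTY_XML = String.new
DOC.write(PRETTY_XML, 2)
puts PRETTY_XML
```

At this point the structure of the tree is visible in the XML, but the literals are empty elements: we can see that there is a symbol, not which symbol it is.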
However, if we want to be able to query the values of literals, we will also need to add a value attribute:
Now our XML looks like this:
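A sketch of that change, under the same assumptions as before (hand-built tree, hypothetical `Node`, `s` and `ast_to_xml` names): the only difference is that literal elements now carry a `value` attribute holding `to_s` of the literal.

```ruby
require 'rexml/document'

# Stand-in for Parser::AST::Node (an assumption, so no gem is required).
Node = Struct.new(:type, :children)
def s(type, *children)
  Node.new(type, children)
end

def ast_to_xml(parent, node)
  unless node.respond_to?(:children)
    # Literals: store their string form so XPath predicates can match it.
    return parent.add_element(node.class.name, 'value' => node.to_s)
  end
  element = parent.add_element(node.type.to_s)
  node.children.each { |child| ast_to_xml(element, child) }
  element
end

EXAMPLE_AST = s(:send, s(:send, nil, :f), :text_field, s(:sym, :name),
                s(:hash, s(:pair, s(:sym, :placeholder), s(:str, "Your name"))))

DOC = REXML::Document.new
ast_to_xml(DOC, EXAMPLE_AST)
DOC.write($stdout, 2)
puts
```

One caveat of this simple scheme: `nil` and an empty string both end up with an empty `value`, so they can only be told apart by their element name.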
It’s time to try XPath. First, we add a convenience method to our
XMLAST class:
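Such a method could be as small as a wrapper around `REXML::XPath.match`. The sketch below assumes the class shape and element naming used for illustration in this rewrite (literals as `Symbol`/`String` elements with a `value` attribute); the method name `xpath` is our guess, not necessarily the original one.

```ruby
require 'rexml/document'

# Stand-in for Parser::AST::Node (an assumption, so no gem is required).
Node = Struct.new(:type, :children)
def s(type, *children)
  Node.new(type, children)
end

class XMLAST
  attr_reader :doc

  def initialize(ast)
    @doc = REXML::Document.new
    build(@doc, ast)
  end

  # Convenience method: run an XPath query, return the matching XML elements.
  def xpath(query)
    REXML::XPath.match(@doc, query)
  end

  private

  def build(parent, node)
    unless node.respond_to?(:children)
      return parent.add_element(node.class.name, 'value' => node.to_s)
    end
    element = parent.add_element(node.type.to_s)
    node.children.each { |child| build(element, child) }
  end
end

EXAMPLE_AST = s(:send, s(:send, nil, :f), :text_field, s(:sym, :name),
                s(:hash, s(:pair, s(:sym, :placeholder), s(:str, "Your name"))))

# The <str> element inside the <pair> whose key is the symbol :placeholder.
MATCHES = XMLAST.new(EXAMPLE_AST).xpath('//pair[sym/Symbol/@value="placeholder"]/str')
puts MATCHES.map(&:to_s)
```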
Very neat! But we are not there yet. If we’re going to do anything useful with the results, we’ll need the original Ruby objects representing the AST nodes.
We could cheat and convert the XML results into a new AST, but that would almost certainly break the rewriting library built into the parser gem. Not to mention being horribly inefficient.
Instead, we’ll add a bit of metadata to our XML tree, specifically the Ruby object IDs of the original nodes. Fortunately, it’s as easy as
Which gives the following XML:
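A minimal sketch of this step, assuming the id is simply stored as an `id` attribute on every element that represents an AST node. `Node`, `s` and `ast_to_xml` are hypothetical names. The printed ids will differ on every run, since they are Ruby object IDs.

```ruby
require 'rexml/document'

# Stand-in for Parser::AST::Node (an assumption, so no gem is required).
Node = Struct.new(:type, :children)
def s(type, *children)
  Node.new(type, children)
end

def ast_to_xml(parent, node)
  unless node.respond_to?(:children)
    return parent.add_element(node.class.name, 'value' => node.to_s)
  end
  # object_id identifies this exact Ruby object for as long as it is alive.
  element = parent.add_element(node.type.to_s, 'id' => node.object_id.to_s)
  node.children.each { |child| ast_to_xml(element, child) }
  element
end

EXAMPLE_AST = s(:send, s(:send, nil, :f), :text_field, s(:sym, :name),
                s(:hash, s(:pair, s(:sym, :placeholder), s(:str, "Your name"))))

DOC = REXML::Document.new
ast_to_xml(DOC, EXAMPLE_AST)
DOC.write($stdout, 2)
puts
```

Only node elements get an id; literals are reachable through their parent node, which is the thing a rewriter operates on anyway.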
Now that we have the original object IDs in our XML output, we can walk the AST to find the original nodes. The implementation below is not very efficient, but it is very short. Optimizing the recursive tree traversal is left as an exercise for the reader.
First, we need a way to recursively add all nodes to an array:
Then we can use it to find our matching object ID:
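Both steps could look roughly like the sketch below. The names `all_nodes` and `find_node` are ours, and `Node`/`s` are again stand-ins for the parser gem's node class. Every lookup walks the whole tree, which matches the "short but not efficient" trade-off described above.

```ruby
# Stand-in for Parser::AST::Node (an assumption, so no gem is required).
Node = Struct.new(:type, :children)
def s(type, *children)
  Node.new(type, children)
end

# Recursively collect every AST node (but not the literals) into an array.
def all_nodes(node, found = [])
  return found unless node.respond_to?(:children)
  found << node
  node.children.each { |child| all_nodes(child, found) }
  found
end

# Linear search for the node with the given object ID; nil when not found.
def find_node(root, id)
  all_nodes(root).find { |node| node.object_id == id }
end

EXAMPLE_AST = s(:send, s(:send, nil, :f), :text_field, s(:sym, :name),
                s(:hash, s(:pair, s(:sym, :placeholder), s(:str, "Your name"))))

# Pretend this id came out of an XML attribute (where it would be a string).
STR_NODE = EXAMPLE_AST.children[3].children[0].children[1]
id_from_xml = STR_NODE.object_id.to_s
p find_node(EXAMPLE_AST, id_from_xml.to_i).equal?(STR_NODE)
```

If lookups become a bottleneck, one option is to fill a hash from object ID to node while building the XML, trading a little memory for constant-time lookups.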
There you have it, a very quick and expressive way to juggle your AST:
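Putting the pieces together, an end-to-end sketch might look like this. It is a condensed illustration under our own assumptions (the `Node` stand-in, the element naming, the `xpath` method name), not the article's full source.

```ruby
require 'rexml/document'

# Stand-in for Parser::AST::Node (an assumption, so no gem is required).
Node = Struct.new(:type, :children)
def s(type, *children)
  Node.new(type, children)
end

class XMLAST
  def initialize(ast)
    @ast = ast
    @doc = REXML::Document.new
    build(@doc, ast)
  end

  # Run an XPath query and return the original AST nodes it matched.
  # Assumes the query selects elements; literal elements have no id
  # and are skipped.
  def xpath(query)
    ids = REXML::XPath.match(@doc, query).map { |el| el.attributes['id'] }.compact
    collect(@ast).select { |node| ids.include?(node.object_id.to_s) }
  end

  private

  def build(parent, node)
    unless node.respond_to?(:children)
      return parent.add_element(node.class.name, 'value' => node.to_s)
    end
    element = parent.add_element(node.type.to_s, 'id' => node.object_id.to_s)
    node.children.each { |child| build(element, child) }
  end

  def collect(node, found = [])
    return found unless node.respond_to?(:children)
    found << node
    node.children.each { |child| collect(child, found) }
    found
  end
end

EXAMPLE_AST = s(:send, s(:send, nil, :f), :text_field, s(:sym, :name),
                s(:hash, s(:pair, s(:sym, :placeholder), s(:str, "Your name"))))

# One query takes us from "the value of the placeholder key" to the Ruby object.
FOUND = XMLAST.new(EXAMPLE_AST).xpath('//pair[sym/Symbol/@value="placeholder"]/str')
p FOUND.first.type
```

Because the result is the very same object that lives in the AST, it can be handed straight to whatever rewriting code needs it.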
See the full source at the bottom of this article.
If you want to further shorten your XPaths, you can add more metadata to your XML tree. For example, in textractor, if we encounter a
send node (a method call) we automatically add
message="method_name" to the XML element. This allows us to write shorter XPath expressions that select a call by its method name.
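To illustrate the idea, here is a hedged sketch of how such a `message` attribute could be added. It relies on the layout of `send` nodes in parser ASTs (receiver first, method name second); the queries at the end are our own examples, not ones taken from textractor.

```ruby
require 'rexml/document'

# Stand-in for Parser::AST::Node (an assumption, so no gem is required).
Node = Struct.new(:type, :children)
def s(type, *children)
  Node.new(type, children)
end

def ast_to_xml(parent, node)
  unless node.respond_to?(:children)
    return parent.add_element(node.class.name, 'value' => node.to_s)
  end
  element = parent.add_element(node.type.to_s)
  # A send node's children are [receiver, method_name, *arguments].
  element.add_attribute('message', node.children[1].to_s) if node.type == :send
  node.children.each { |child| ast_to_xml(element, child) }
  element
end

EXAMPLE_AST = s(:send, s(:send, nil, :f), :text_field, s(:sym, :name),
                s(:hash, s(:pair, s(:sym, :placeholder), s(:str, "Your name"))))

DOC = REXML::Document.new
ast_to_xml(DOC, EXAMPLE_AST)

# Select calls by method name without spelling out the literal child.
CALLS  = REXML::XPath.match(DOC, '//send[@message="text_field"]')
VALUES = REXML::XPath.match(DOC, '//send[@message="text_field"]/hash/pair/str')
puts CALLS.size
puts VALUES.size
```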
We are currently developing several products using this library. Once the XML format is stabilized, we plan to extract the library from our product and release a gem. If you would like to use these techniques in your project, we would be happy to help! Email us at firstname.lastname@example.org.
At Snooty software, we develop tools that modify code programmatically. Our first product, Textractor, takes an existing Rails project and prepares your ERB views for translation by replacing the string literals with