Glyph – A quick look at Glyph's internals

If you plan on extending Glyph, knowing how it works inside helps. It is not mandatory by any means, but it definitely helps, especially when creating complex macros.

What happens behind the scenes when you call glyph compile? Glyph's code is parsed, analyzed and then translated into text, and here's how:

A sequence diagram for document generation

From the diagram, it is possible to divide the document generation process into three phases:

  • The Parsing Phase starts when a chunk of Glyph code is passed (by the generate:document Rake task, for example) to a Glyph::Interpreter. The interpreter initializes a Glyph::Parser that parses the code and returns an Abstract Syntax Tree (AST) of Glyph::SyntaxNode objects.
  • The Analysis Phase (Processing) starts when the interpreter calls the analyze method, instantiating a new Glyph::Document. The Glyph::Document object evaluates the AST, expanding all macro nodes (that's when macros are executed), and generates a string.
  • The Finalization Phase (Post-Processing) starts when the interpreter calls the finalize method, causing the Glyph::Document object to perform a series of finalizations on the string obtained after analysis: it replaces escape sequences and placeholders.
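The three phases above can be illustrated with a toy pipeline. This is only a sketch under stated assumptions: the parse, analyze, finalize and generate methods, the MACROS table and the simplified name[text] syntax are all hypothetical stand-ins and are not Glyph's actual classes or API.

```ruby
require "strscan"

# Hypothetical macro table: macro name => block that renders its body.
MACROS = { em: ->(body) { "<em>#{body}</em>" } }

# Parsing: turn text into a flat list of Hash nodes (a stand-in for an AST).
def parse(text)
  scanner = StringScanner.new(text)
  nodes = []
  until scanner.eos?
    if scanner.scan(/\\[\[\]]/)                        # escaped bracket
      nodes << { value: scanner.matched, escaped: true }
    elsif scanner.scan(/(\w+)\[([^\[\]\\]*)\]/)        # name[text] macro call
      nodes << { name: scanner[1].to_sym, children: [{ value: scanner[2] }] }
    else                                               # any other character
      nodes << { value: scanner.getch }
    end
  end
  nodes
end

# Analysis: expand macro nodes; text nodes (escaped or not) pass through.
def analyze(nodes)
  nodes.map do |node|
    node[:name] ? MACROS.fetch(node[:name]).call(analyze(node[:children])) : node[:value]
  end.join
end

# Finalization: replace escape sequences in the analyzed string.
def finalize(text)
  text.gsub(/\\([\[\]])/) { $1 }
end

def generate(text)
  finalize(analyze(parse(text)))
end

puts generate('em[hello] \[raw\]')
```

Keeping finalization as a separate pass over the analyzed string means escaped brackets stay protected while macros are being expanded and are only unescaped once no further expansion can happen.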

Example: A short note

As an example, consider the following Glyph code:

fmi[something|#test]
...
section[
  @title[Test Section]
  @id[test]
...
]

This simple snippet uses the fmi macro to link to a section later on in the document. When parsed, the produced AST is the following:

{:name=>:"--"}
  {:name=>:fmi, :escape=>false}
    {:name=>:"0"}
      {:value=>"something"}
    {:name=>:"1"}
      {:value=>"#test"}
  {:value=>"\n"}
  {:value=>"\[", :escaped=>true}
  {:value=>"..."}
  {:value=>"\]", :escaped=>true}
  {:value=>"\n"}
  {:name=>:section, :escape=>false}
    {:name=>:"0"}
      {:value=>"\n\t"}
      {:value=>"\n\t"}
      {:value=>"\n"}
      {:value=>"\[", :escaped=>true}
      {:value=>"..."}
      {:value=>"\]", :escaped=>true}
      {:value=>"\n"}
    {:name=>:title, :escape=>false}
      {:value=>"Test Section"}
    {:name=>:id, :escape=>false}
      {:value=>"test"}

This output is produced by calling the inspect method on the AST. Each Glyph::SyntaxNode object in the tree is basically an ordinary Ruby Hash with a parent and zero or more children, so the listing above shows how the syntax nodes are nested.

The AST contains information about macro, parameter and attribute names, escaping, and raw text values (the nodes without a :name key), but nothing more.
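To make the "Hash with a parent and children" idea concrete, here is a minimal sketch. ToyNode and its from, << and tree methods are hypothetical names chosen for illustration; they are not Glyph::SyntaxNode's real interface.

```ruby
# Hypothetical stand-in for a syntax node: a Hash that also knows
# its parent and its children.
class ToyNode < Hash
  attr_accessor :parent
  attr_reader :children

  # Build a node from a plain Hash of keys such as :name or :value.
  def self.from(hash)
    new.merge!(hash)
  end

  def initialize
    super
    @children = []
  end

  # Append a child, record this node as its parent, and return self
  # so that calls can be nested.
  def <<(child)
    child.parent = self
    @children << child
    self
  end

  # Render the subtree one node per line, indented two spaces per level.
  def tree(depth = 0)
    line = "#{'  ' * depth}#{to_h.inspect}"
    ([line] + children.map { |child| child.tree(depth + 1) }).join("\n")
  end
end

root = ToyNode.from(name: :"--")
fmi  = ToyNode.from(name: :fmi, escape: false)
fmi << (ToyNode.from(name: :"0") << ToyNode.from(value: "something"))
root << fmi
puts root.tree
```

The exact text printed for each Hash depends on the Ruby version, but the indentation shows the nesting in the same way as the listing above.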

When the AST is analyzed, the resulting textual output is the following:

<span class="fmi">for more information on something, see ‡‡‡‡‡PLACEHOLDER ¤ 1‡‡‡‡‡
</span>
\/[...\/]
<div class="section">
<h2 id="test">Test Section</h2>
\/[...\/]

</div>

This looks almost perfect, except that:

  • There's a nasty placeholder instead of a link: when the link is processed, there is no #test anchor in the document yet, but one is defined later on.
  • There are some escaped brackets.

Finally, when the document is finalized, placeholders and escape sequences are removed and the final result is the following:

<span class="fmi">for more information on something,
  see <a href="#test">Test Section</a></span>
[...]
<div class="section">
<h2 id="test">Test Section</h2>
[...]

</div>
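The deferred resolution described above can be sketched as follows. This is an illustration under assumptions, not Glyph's implementation: ToyDocument, its placeholder and finalize methods, the bookmarks Hash and the <<PLACEHOLDER n>> marker format are all hypothetical. Only the \/[ and \/] escape forms are taken from the intermediate output shown earlier.

```ruby
# Hypothetical sketch: a placeholder is a block recorded during analysis
# and run during finalization, once the whole document is known.
class ToyDocument
  attr_reader :bookmarks

  def initialize
    @bookmarks = {}      # anchor id => title, filled in as sections are analyzed
    @placeholders = []   # deferred blocks, run once analysis is complete
  end

  # Called during analysis: store the block and return a marker string.
  def placeholder(&block)
    @placeholders << block
    "<<PLACEHOLDER #{@placeholders.size - 1}>>"
  end

  # Called after analysis: swap markers for resolved text,
  # then turn \/[ and \/] back into plain brackets.
  def finalize(text)
    resolved = text.gsub(/<<PLACEHOLDER (\d+)>>/) { @placeholders[$1.to_i].call(self) }
    resolved.gsub(%r{\\/([\[\]])}) { $1 }
  end
end

doc = ToyDocument.new
link = doc.placeholder { |d| %(<a href="#test">#{d.bookmarks.fetch("test")}</a>) }
analyzed = "see #{link}\n\\/[...\\/]\n"
doc.bookmarks["test"] = "Test Section"   # the section is only reached later
puts doc.finalize(analyzed)
```

Because the block runs only at finalization time, it can look up a bookmark that did not exist yet when the link itself was analyzed.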