
TUTORIAL: Let's build a parser!

famished-tiger edited this page Feb 13, 2022 · 10 revisions

Overview

This tutorial aims to show how to write a parser with Rley.
The chosen language to parse is TOML, a readable format for configuration files.

Why TOML?

TOML was chosen mainly for two reasons:

  • It is a language of moderate complexity, which fits the format of this tutorial.
  • A TOML reader is easier to test than a parser for a programming language (e.g., no runtime system for an interpreter or compiler is needed).

How is the tutorial organized?

Iteratively. We start with a limited subset of TOML; then, with each iteration, we expand the grammar, tokenizer and parser to cover more of the language's intricacies.
Each iteration follows the same flow:

Tutorial structure

Here are the links to the different iterations:

Our challenge

The home page of the official TOML site displays a sample TOML document.
For convenience, you can find its contents here:

# This is a TOML document

title = "TOML Example"

[owner]
name = "Tom Preston-Werner"
dob = 1979-05-27T07:32:00-08:00

[database]
enabled = true
ports = [ 8000, 8001, 8002 ]
data = [ ["delta", "phi"], [3.14] ]
temp_targets = { cpu = 79.5, case = 72.0 }

[servers]

[servers.alpha]
ip = "10.0.0.1"
role = "frontend"

[servers.beta]
ip = "10.0.0.2"
role = "backend"

By the end of the tutorial, our challenge is to:

  • Build a parser that can read this sample document and generate an abstract syntax tree (AST) from it.
  • Convert the parsed input into the following Ruby data structure:
{ :title=>"TOML Example",
  :owner=>{:name=>"Tom Preston-Werner", :dob=>1979-05-27 07:32:00 -0800},
  :database=>{:enabled=>true, :ports=>[8000, 8001, 8002],
    :data=>[["delta", "phi"], [3.14]],
    :temp_targets=>{:cpu=>79.5, :case=>72.0}},
  :servers=>{:alpha=>{:ip=>"10.0.0.1", :role=>"frontend"},
    :beta=>{:ip=>"10.0.0.2", :role=>"backend"}
  }
}
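
The structure above is shown the way Ruby prints it, so the `dob` value cannot be pasted back into a script as-is. As a minimal sketch, the same target can be written as a Ruby literal, for instance to serve as the expected value in a test. The method name `expected_sample_data` is hypothetical, and representing `dob` as a `Time` object is an assumption based on how that value is printed.

```ruby
# Target data structure for the sample TOML document, as a Ruby literal.
# `expected_sample_data` is a hypothetical helper name, not part of Rley.
def expected_sample_data
  {
    title: 'TOML Example',
    owner: {
      name: 'Tom Preston-Werner',
      # Assumption: a TOML offset date-time maps to a Ruby Time object
      dob: Time.new(1979, 5, 27, 7, 32, 0, '-08:00')
    },
    database: {
      enabled: true,
      ports: [8000, 8001, 8002],
      data: [['delta', 'phi'], [3.14]],
      temp_targets: { cpu: 79.5, case: 72.0 }
    },
    servers: {
      alpha: { ip: '10.0.0.1', role: 'frontend' },
      beta: { ip: '10.0.0.2', role: 'backend' }
    }
  }
end

# Nested tables become nested hashes, so values are reachable with Hash#dig
puts expected_sample_data.dig(:servers, :beta, :role) # prints "backend"
```

Once the final iteration is complete, comparing the converter's output with such a literal using `==` is one simple way to check that the challenge has been met.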

What's next?

Iteration 1