Catala surface representation
This representation is the first in the compilation chain (see Architecture). Its purpose is to host the output of the Catala parser, before any transformations have been made.
The module describing the abstract syntax tree is:
Surface.Ast
Abstract syntax tree built by the Catala parser
This representation can also be woven into literate programming outputs using the literate programming modules.
Lexing
Relevant modules:
Surface.Lexer
Concise syntax with English abbreviated keywords.
Surface.Lexer_fr
Surface.Lexer_en
The lexing in the Catala compiler is done using sedlex, a modern OCaml lexer generator that offers full support for UTF-8. This support lets users of non-English languages use their usual diacritics and symbols in their code.
While Catala has a single parser, three different lexers can be used to produce the parser tokens.
Surface.Lexer corresponds to a concise, programming-language-like syntax for Catala. Examples of this syntax can be found in the test suite of the compiler.
Surface.Lexer_en is the adaptation of Surface.Lexer with verbose English keywords matching legal concepts.
Surface.Lexer_fr is the adaptation of Surface.Lexer with verbose French keywords matching legal concepts.
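The idea of several lexers feeding one parser can be sketched in plain OCaml. This is a minimal illustration using only the standard library, not the sedlex-based implementation: the token type, the keyword spellings and the `lex` function are assumptions made for the example, and the concise lexer is left out.

```ocaml
(* Hypothetical token type shared by every lexer; in the compiler the
   real tokens are declared in Surface.Parser. *)
type token = DEFINITION | RULE | IDENT of string

(* Each surface syntax differs only in its keyword table.
   The spellings below are illustrative assumptions. *)
let keywords_en = [ ("definition", DEFINITION); ("rule", RULE) ]
let keywords_fr = [ ("définition", DEFINITION); ("règle", RULE) ]

(* Turn one word into a token: a keyword if the table knows it,
   otherwise an identifier. *)
let classify keywords word =
  match List.assoc_opt word keywords with
  | Some tok -> tok
  | None -> IDENT word

(* Toy lexer: split on spaces and classify each non-empty word. *)
let lex keywords source =
  String.split_on_char ' ' source
  |> List.filter (fun w -> w <> "")
  |> List.map (classify keywords)
```

Because both keyword tables produce the same `token` values, a single grammar can consume the output of either lexer without knowing which language the source was written in.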
Parsing
Relevant modules:
Surface.Parser
Surface.Parser_driver
Wrapping module around parser and lexer that offers the parse_source_file API.
Surface.Parser_errors
Interface of the module auto-generated based on "parser.messages".
The Catala compiler uses Menhir to perform its parsing.
Surface.Parser is the main file where the parser tokens and the grammar are declared. Menhir automatically translates it into the equivalent parsing automaton.
In order to provide decent syntax error messages, the Catala compiler uses the novel error handling provided by Menhir and detailed in Section 11 of the Menhir manual.
A parser.messages source file has been manually annotated with a custom error message for every potential erroneous state of the parser, and Menhir automatically generates the Surface.Parser_errors module containing the function that links each erroneous parser state to its custom error message.
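The state-to-message lookup can be sketched as follows. The `message` function below is a hand-written stand-in for the generated one (Menhir's `--compile-errors` emits a function from state numbers to strings that raises `Not_found` for states without a message); the state numbers, the message texts and the `error_message_of_state` wrapper are assumptions made for the example.

```ocaml
(* Stand-in for the auto-generated function. The state numbers and
   message texts here are invented for illustration. *)
let message (state : int) : string =
  match state with
  | 3 -> "expected a scope name after the scope keyword\n"
  | 17 -> "expected an expression after the equality sign\n"
  | _ -> raise Not_found

(* Hypothetical wrapper: look up the custom message for an erroneous
   state and fall back to a generic one when the state has none. *)
let error_message_of_state (state : int) : string =
  match message state with
  | msg -> String.trim msg
  | exception Not_found -> "syntax error"
```

The fallback branch keeps the driver total: a state missing from the annotated file still yields a message instead of an uncaught exception.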
To wrap it up, Surface.Parser_driver glues all the parsing and lexing together to perform the translation from source code to abstract syntax tree, with meaningful error messages.
Name resolution and translation
Relevant modules:
Surface.Name_resolution
Builds a context that allows for mapping each name to a precise uid, taking lexical scopes into account.
Surface.Desugaring
Translation from Surface.Ast to Desugared.Ast.
The desugaring consists of translating Surface.Ast to Desugared.Ast, the abstract syntax tree of the desugared representation. The translation is implemented in Surface.Desugaring, but it relies on a helper module to perform the name resolution: Surface.Name_resolution. Indeed, in Surface.Ast, variable identifiers are plain values of type string, whereas in Desugared.Ast they have been turned into well-categorized types with a unique identifier, like Scopelang.Ast.ScopeName.t.
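The move from plain strings to categorized unique identifiers can be sketched as below. This is a minimal illustration, not the actual Surface.Name_resolution code: the `ScopeName` module here is a hypothetical stand-in for Scopelang.Ast.ScopeName, and `context`, `declare_scope` and `resolve_scope` are names invented for the example.

```ocaml
(* Hypothetical stand-in for a categorized identifier module: the type
   is abstract, so a scope name cannot be confused with a raw string. *)
module ScopeName : sig
  type t
  val fresh : string -> t
  val to_string : t -> string
  val equal : t -> t -> bool
end = struct
  type t = { id : int; name : string }
  let counter = ref 0
  let fresh name =
    incr counter;
    { id = !counter; name }
  let to_string u = u.name
  let equal a b = a.id = b.id
end

module StringMap = Map.Make (String)

(* Resolution context: maps the strings found in the surface AST to
   the unique identifiers used after desugaring. *)
type context = { scopes : ScopeName.t StringMap.t }

let empty = { scopes = StringMap.empty }

(* Register a scope name and return its fresh uid with the new context. *)
let declare_scope ctx name =
  let uid = ScopeName.fresh name in
  (uid, { scopes = StringMap.add name uid ctx.scopes })

(* Look up a name; None signals an unknown scope. *)
let resolve_scope ctx name = StringMap.find_opt name ctx.scopes
```

Two declarations that share a spelling still receive distinct identifiers, which is what allows later passes to compare identifiers instead of strings.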