Parsing library based on combinators and ppx extension to write languages

Pacomb is a parsing library that compiles grammars to combinators prior to parsing together with a PPX extension to write parsers inside OCaml files.

The advantages of Pacomb are

  • Grammars as first class values defined in your OCaml files. This is an example from the distribution:

(* The three levels of priorities ) type p = Atom | Prod | Sum let%parser rec ( This includes each priority level in the next one ) expr p = Atom < Prod < Sum ( all other rule are selected by their priority level ) ; (p=Atom) (x::FLOAT) => x ; (p=Atom) '(' (e::expr Sum) ')' => e ; (p=Prod) (x::expr Prod) '' (y::expr Atom) => x*.y ; (p=Prod) (x::expr Prod) '/' (y::expr Atom) => x/.y ; (p=Sum ) (x::expr Sum ) '+' (y::expr Prod) => x+.y ; (p=Sum ) (x::expr Sum ) '-' (y::expr Prod) => x-.y

  • Good performances:

    • on non ambiguous grammars, 2 to 3 time slower compared to ocamlyacc
    • on ambiguous grammars O(N^3 ln(N)) can be achieved.
  • Parsing from left to right (despite the use of combinators) allowing not to keep the whole input in memory and allowing to parse streams.

  • Dependant sequence allowing for self extensible grammars (like new infix with a given priority in a given example).

  • Managing of blanks that for instance allows for nested language using different kind of comments or blanks.

  • Support for cache and merge for ambiguous grammars (to get O(N^3 ln(N)))

  • Enough support for utf8 to write parser for a language using utf8.

  • Comes with documentation and various examples illustrating most possibilities.

All this makes Pacomb a promising solution to write languages in OCaml.

14 Oct 2021
ppxlib >= "0.10.0"
dune >= "1.9.0"
ocaml >= "4.04.1"
Reverse Dependencies