package bap-core-theory

  1. Overview
  2. Docs

An untyped grammar for a subset of Core Theory languages.

This module defines grammars for six sub-languages for each sort of Core Theory terms.

Notation

Each rule in the grammar S returns a value of type S.t, which is a semantic action that will eventually build a Core Theory term of a corresponding to the rule type. This type is abstract and is totally co-inductive (i.e., it could also produced, there are no grammar rules (functions) that consume values of this type.

Since each grammar rule will eventually build a Core Theory term, we will describe rules using the Core Theory denotations. The parser generator will invoke recursively the top-level rules on each non-terminal. Those rules are referenced using the following names:

  • bitv - parses the language of bitvectors;
  • bool - parses the language of booleans;
  • mem - parses the language of memories;
  • stmt - parses the language of statements;
  • float - parses the language of floats;
  • rmode - parses the language of rounding modes.

The parsing rules occasionally need to refer to the sorts of therms, we use the following short-hand notation for sorts:

  • bits m = Bitv.define m - a sort of bitvectors with m bits;
  • Bool.t - the sort of booleans;
  • mems k v = Mem.define k v - a sort of memories;

Finally, the width associated with the sort s is denoted with size s = Bitv.size s.

Example,

(** [add x y] is [add (bitv x) (bitv y)].  *)
val add : exp -> exp -> t

says that the grammar rule add interprets its arguments in the bitv context (recursively calls the bitv function) and have the denotation of the add p q term of the Core Theory, where p = bitv x and q = bitv y.

Contexts

To ensure the freshness of generated variables and to enable a higher-order abstract syntax style (a generalization of the De Bruijn notation) we wrap each semantic action in a context that holds the binding rules.

The var family of grammar rules rename the passed names if they are bound in the renaming context, which is denoted with context s which is a function that returns the name bound to s in the context or s if it is unbound.

The let_<T> n ... and tmp_<T> n ... take the old name n, create a fresh variable n' and append the n,n' binding to the context in which the grammar rule is invoked, denoted with rule [<non-term>|n->n'], e.g.,

[let_reg s x y] is [scoped @@ fun v -> (bitv x) (bitv [y|s->v])]

As a result of the let_reg rule applications any free occurrence of the variable s will be substituted with the freshly generated variable v. This will ensure alpha-equivalence of expressions that use the let_<T> forms.

The tmp_<T> rules are basically the same as let_<T> except that the scope of the freshly created variable is indefinite, so these forms could be used to create hygienic symbol generators.

type ieee754 = IEEE754.parameters
module type Bitv = sig ... end

Bitvectors.

module type Bool = sig ... end

Booleans.

module type Mem = sig ... end
module type Stmt = sig ... end

Statements.

module type Float = sig ... end

Floating point expressions.

module type Rmode = sig ... end

Rounding modes.

OCaml

Innovation. Community. Security.