#+TITLE: =ppx_monad=

/An OCaml Syntax Extension for all Monadic Syntaxes./

[[https://github.com/Niols/ppx_monad/actions/workflows/ci.yml][https://github.com/Niols/ppx_monad/actions/workflows/ci.yml/badge.svg?branch=main]]

** Overview

=ppx_monad= aims to provide its users with a set of standard monadic syntaxes as well as an easy means of defining their own. Concretely, it consists of:

- a library making it easy(-ier) to write syntax extensions for monads,
- a set of pre-defined such extensions, and
- a library providing runtime dependencies for the pre-defined extensions.

One of the simplest examples is the option monad. Using =ppx_monad=, one can write:

#+BEGIN_SRC ocaml
  let add_string_ints (s1 : string) (s2 : string) : string option =
    let%option n1 = int_of_string_opt s1 in
    let%option n2 = int_of_string_opt s2 in
    Some (string_of_int (n1 + n2))
#+END_SRC

which is then converted to:

#+BEGIN_SRC ocaml
  let add_string_ints (s1 : string) (s2 : string) : string option =
    Option.bind (int_of_string_opt s1) @@ fun n1 ->
    Option.bind (int_of_string_opt s2) @@ fun n2 ->
    Some (string_of_int (n1 + n2))
#+END_SRC

equivalent to the following bind-free code:

#+BEGIN_SRC ocaml
  let add_string_ints (s1 : string) (s2 : string) : string option =
    match int_of_string_opt s1 with
    | None -> None
    | Some n1 ->
      match int_of_string_opt s2 with
      | None -> None
      | Some n2 -> Some (string_of_int (n1 + n2))
#+END_SRC

=ppx_monad= can act both as a PPX for the option monad and as a library to help define similar extensions.

** Installation

The easiest way to install =ppx_monad= is via OPAM:

#+BEGIN_SRC sh
  opam install ppx_monad
#+END_SRC

Alternatively, one can clone this repository, make sure to have Dune and ppxlib installed, and run:

#+BEGIN_SRC sh
  make install
#+END_SRC

** Using =ppx_monad= as a PPX

Using =ppx_monad= as a PPX is simply a matter of adding a preprocessing step to your target.
With Dune, for instance, this is done by putting:

#+BEGIN_SRC dune
  (executable
   (name example)
   (preprocess (pps ppx_monad)))
#+END_SRC

in the appropriate Dune file. With ocamlbuild, this is done by adding the following to the =_tags= file:

#+BEGIN_SRC
  <src/*>: package(ppx_monad)
#+END_SRC

*** Extensions

=ppx_monad= contains the following syntax extensions:

| Name/Label     | Other Labels                    | Error | Since |
|----------------+---------------------------------+-------+-------|
| =option=       | =opt=                           | yes   |       |
| =result.ok=    | =res.ok=, =res=, =result=, =ok= | yes   |  4.08 |
| =result.error= | =err=, =error=                  | no    |  4.08 |
| =list=         | =lst=                           | no    |       |
| =seq=          |                                 | no    |  4.07 |
| =either.right= | =either=, =right=               | yes   |  4.12 |
| =either.left=  | =left=                          | no    |  4.12 |

The second column indicates alternative labels that one can use to trigger the PPX. For instance, =let%result.ok=, =let%res= and =let%ok= all trigger the same extension.

The third column indicates whether the extension covers an error part of the monad. This error part gives access to error-related structures such as =assert= and =try=. For instance, one can write =try%opt f () with () -> Some 7=, which returns =Some x= if =f= returns =Some x= and =Some 7= if =f= returns =None=. On the other hand, =try%list= does not exist.

In the case of =Result=, the same module gives rise to two monads and therefore two extensions: =result.ok= and =result.error=. The former is what you expect: =let%result.ok= (=let%ok= or =let%res= for short) binds on values of the form =Ok x=, propagating values of the form =Error y=. It features an error part, and =try%res f () with Foo -> Ok 7 | Bar -> Error "no bar"= returns =Ok x= if =f= returns =Ok x=, =Ok 7= if =f= returns =Error Foo= and =Error "no bar"= if =f= returns =Error Bar=. The latter is the counterpart, which binds on values of the form =Error y=, propagating values of the form =Ok x=. It does not feature an error part.
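
To make the two sides of =Result= concrete, here is a minimal sketch in plain OCaml, with no PPX involved, of the behaviour described above. The names =bind_ok=, =bind_error=, =err= and =recover= are hypothetical and not part of =ppx_monad=; the code that the extensions actually generate may well differ.

#+BEGIN_SRC ocaml
  (* Hypothetical helpers for illustration only; not part of ppx_monad. *)

  (* "ok" side: bind on [Ok x], propagate [Error y] unchanged. *)
  let bind_ok (m : ('a, 'e) result) (f : 'a -> ('b, 'e) result) : ('b, 'e) result =
    match m with
    | Ok x -> f x
    | Error y -> Error y

  (* "error" side: bind on [Error y], propagate [Ok x] unchanged. *)
  let bind_error (m : ('a, 'e) result) (f : 'e -> ('a, 'f) result) : ('a, 'f) result =
    match m with
    | Ok x -> Ok x
    | Error y -> f y

  (* Hypothetical error type matching the [Foo] and [Bar] cases in the text. *)
  type err = Foo | Bar

  (* Roughly what [try%res m with Foo -> Ok 7 | Bar -> Error "no bar"]
     is described as computing. *)
  let recover (m : (int, err) result) : (int, string) result =
    bind_error m (function
      | Foo -> Ok 7
      | Bar -> Error "no bar")
#+END_SRC

Seen this way, catching on the =ok= side and binding on the =error= side are the same operation, which is presumably why =result.error= needs no error part of its own.
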
The same considerations hold for =Either=, with the convention that values of the form =Right x= are the “good” ones, even though OCaml's standard library specifies that one should only use =Either= “without assigning a specific meaning to what each case should be.”

*** The =monad= Extension

=ppx_monad= contains one specific extension, =ppx_monad.ppx.monad=, which does not rely on any particular monad but uses the “current” one. For instance:

#+BEGIN_SRC ocaml
  let%monad x = ...some computation... in
  ...some other computation...
#+END_SRC

is rewritten into:

#+BEGIN_SRC ocaml
  bind ...some computation... (fun x -> ...some other computation...)
#+END_SRC

and it is up to the user to make sure that the =bind= function is in scope. This can be useful when manipulating modules that define a monad with the usual function names. For instance:

#+BEGIN_SRC ocaml
  let open Result in
  let%m x = ...some computation... in
  ...some other computation...
#+END_SRC

will rely on =bind= from the =Result= monad. (Note that =m= can be used as a short label for =monad=.) This extension expects the functions =return=, =bind=, =fail= and =catch= to be defined in order to provide all the possible structures.

*** The =do= Extension

Additionally, =ppx_monad= contains a specific extension, =ppx_monad.ppx.do=, that implements a Haskell-like =do=-notation. For instance, the following code:

#+BEGIN_SRC ocaml
  let add_string_ints (s1 : string) (s2 : string) : string option =
    begin%do [@monad Option]
      n1 <- int_of_string_opt s1;
      n2 <- int_of_string_opt s2;
      Some (string_of_int (n1 + n2))
    end
#+END_SRC

is rewritten to:

#+BEGIN_SRC ocaml
  let add_string_ints (s1 : string) (s2 : string) : string option =
    Option.bind (int_of_string_opt s1) @@ fun n1 ->
    Option.bind (int_of_string_opt s2) @@ fun n2 ->
    Some (string_of_int (n1 + n2))
#+END_SRC

This notation supports the use of a sequence when no value should be bound.
It does not, however, support patterns on the left-hand side of =<-=. One can recover this functionality with =<--=, which supports basic patterns, with the peculiarity that the wildcard =_= has to be written =__=; otherwise, the compiler complains with =Syntax error: wildcard "_" not expected=. To use an actual =<-= or =<--= operator, or a usual OCaml sequence, wrap it inside a =let () = ... in=.

Finally, this notation supports two optional attributes, =@bind= and =@monad=. =@bind= allows one to provide a custom =bind= function; =@monad= allows one to specify a module in which to find the =bind= function. Without attributes, the PPX simply uses =bind=.

Note that this extension has little to do with the others because it only requires a =bind= function. It is, however, compatible with them, and it is possible to use e.g. =match%res= inside a =do=-notation.

** Defining Custom Monadic Syntaxes

Defining a new monadic syntax consists of writing a rewriter library that calls the following entry point at top level:

#+BEGIN_SRC ocaml
  val register :
    (* Easiest *)
    ?monad:string ->
    ?monad_error:string ->
    (* Finer grained *)
    ?mk_return:(loc:Location.t -> expression -> expression) ->
    ?mk_bind:(loc:Location.t -> expression -> expression -> expression) ->
    ?mk_fail:(loc:Location.t -> expression -> expression) ->
    ?mk_catch:(loc:Location.t -> expression -> expression -> expression) ->
    (* Bookkeeping *)
    ?applies_on:string ->
    string ->
    unit
#+END_SRC

The easiest way to create an extension is to provide only the =~monad= argument and a name. For instance, one can redefine =ppx_monad.ppx.option=, an extension for the option monad, with:

#+BEGIN_SRC ocaml
  let () = Ppx_monad.register "option" ~monad:"Stdlib.Option"
#+END_SRC

The =~monad= argument tells =ppx_monad= to look for the functions =return=, =bind=, =fail= and =catch= in the given module.
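
As an illustration of what such a module might look like, here is a minimal sketch of a hand-written option monad exposing those four functions. The module name =My_option= is hypothetical, and the exact signatures expected for =fail= and =catch= are an assumption (a =unit= error payload, inferred from the =try%opt ... with () -> ...= example); the registration call in the final comment simply mirrors the one shown above.

#+BEGIN_SRC ocaml
  (* Hypothetical module; the signatures of [fail] and [catch] are assumptions. *)
  module My_option = struct
    let return (x : 'a) : 'a option = Some x

    let bind (m : 'a option) (f : 'a -> 'b option) : 'b option =
      match m with
      | Some x -> f x
      | None -> None

    (* The option monad carries no error payload, hence [unit]. *)
    let fail () : 'a option = None

    let catch (m : 'a option) (handler : unit -> 'a option) : 'a option =
      match m with
      | Some x -> Some x
      | None -> handler ()
  end

  (* Plain, PPX-free use of the module, in the style of the expansion
     shown in the overview. *)
  let add_string_ints (s1 : string) (s2 : string) : string option =
    My_option.bind (int_of_string_opt s1) @@ fun n1 ->
    My_option.bind (int_of_string_opt s2) @@ fun n2 ->
    My_option.return (string_of_int (n1 + n2))

  (* Assumed registration, to be placed in a separate rewriter library:
     let () = Ppx_monad.register "my_option" ~monad:"My_option" *)
#+END_SRC
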
(Note that the actual definition of =ppx_monad.ppx.option= also makes use of the =~applies_on= argument explained below.)

Sometimes, one might want to define an error-side monad, where =let%ext= would be expanded to =catch= instead of =bind=. This is the purpose of the =~monad_error= argument. For instance, one can redefine =ppx_monad.ppx.result=, an extension for the error monad, both its normal and error sides, with:

#+BEGIN_SRC ocaml
  let () = Ppx_monad.register "result" ~monad:"Stdlib.Result"
  let () = Ppx_monad.register "error" ~monad_error:"Stdlib.Result"
#+END_SRC

The last two arguments of =register= are the name of the PPX and what it applies on. By default, the name is also the label on which the PPX applies: a PPX of name =result= will apply on =let%result=, =match%result=, etc. Additionally, one can provide =applies_on=, another string describing the labels on which the PPX applies. This string can contain simple regular expressions using grouping with =(...)=, optional parts with =?= and choices with =|=. For instance, =ok|res(ult)?(.ok)?= matches exactly the following: =ok=, =res=, =result=, =res.ok=, =result.ok=.

*** Finer-grained Syntax With the =mk_*= Functions

The functions =mk_return=, =mk_bind=, =mk_fail= and =mk_catch= take precedence over =~monad= and =~monad_error=. They are here to build the corresponding monadic expressions. =mk_bind=, for instance, takes two OCaml expressions representing the two usual arguments to a =bind= function and builds an expression representing the application of such a =bind= function to the arguments. For instance, one could define it as:

#+BEGIN_SRC ocaml
  let mk_bind ~loc e f = [%expr bind [%e e] [%e f]]
#+END_SRC

The syntax extensions =[%expr ...]= and =[%e ...]= are helpers provided by ppxlib to manipulate OCaml expressions. =[%expr ...]= allows one to write OCaml code and to get an OCaml syntax tree out of it. This OCaml code can contain holes which =[%e ...]= allows one to fill with a syntax tree.
/For more information about this, check out [[https://ppxlib.readthedocs.io/en/latest/ppx-for-plugin-authors.html#metaquot][the documentation of ppxlib's Metaquot]]./

The functions =mk_return=, =mk_bind=, =mk_fail= and =mk_catch= are all optional, but providing more of them allows =ppx_monad= to provide more syntactic structures. For instance, =mk_bind= will suffice for =ppx_monad= to provide a simple =let ... in=, but =mk_return= is necessary for =let ... and ... in=. As another example, most structures can be defined using only =mk_return= and =mk_bind=, but =mk_fail= and =mk_catch= are necessary to define =assert=, =try= or =match= with exception patterns.

*** Redefining =ppx_monad_result=

Let us now re-implement the PPX for =Result=, available through =ppx_monad.ppx.result= or =ppx_monad=. We can do this with the following file, of only 21 lines:

#+BEGIN_SRC ocaml
  open Ppxlib

  let mk_return ~loc x =
    [%expr Result.ok [%e x]]

  let mk_bind ~loc e f =
    [%expr Result.bind [%e e] [%e f]]

  let mk_fail ~loc y =
    [%expr Result.error [%e y]]

  let mk_catch ~loc e f =
    [%expr
      (fun e f -> match e with
        | Ok x -> Ok x
        | Error y -> f y)
        [%e e] [%e f]]

  let () =
    Ppx_monad.register "result" ~applies_on:"ok|res(ult)?(.ok)?"
      ~mk_return ~mk_bind ~mk_fail ~mk_catch
#+END_SRC

It is then only a matter of building this file as a library that depends on =ppx_monad= and gets pre-processed by ppxlib's Metaquot. For instance, with Dune, assuming that our library is called =ppx_result=:

#+BEGIN_SRC dune
  (library
   (name ppx_result)
   (public_name ppx_result)
   (libraries ppx_monad)
   (preprocess (pps ppxlib.metaquot))
   (kind ppx_rewriter))
#+END_SRC

You can have a look at how =ppx_monad.ppx.result= [[./src/ppx/result/][is actually defined in this repository]].

*** Sanitising Variable Names

For most of the functions in the example above (=mk_return=, =mk_bind= and =mk_fail=), there exists an implementation in the =Result= module which we can use directly.
This is, however, not the case for =mk_catch=, and we had to implement it by hand. The way we wrote it might feel weird, and we might be tempted to write it as either of the following:

#+BEGIN_SRC ocaml
  (* First attempt: a local [catch] function. *)
  let incorrect_mk_catch ~loc e f =
    [%expr
      let catch e f =
        match e with
        | Ok x -> Ok x
        | Error y -> f y
      in
      catch [%e e] [%e f]]

  (* Second attempt: matching directly on the spliced expression. *)
  let incorrect_mk_catch ~loc e f =
    [%expr
      match [%e e] with
      | Ok x -> Ok x
      | Error y -> [%e f] y]
#+END_SRC

These implementations, however, are incorrect, because they bind variables in the scope of =e= and =f=. For instance, in the first implementation, if =e= or =f= were to contain the free variable =catch=, it would be captured by the local definition. The same issue is present in the second implementation if =f= were to contain the free variable =y=.

If one wants to go down this road, the proper way is to ensure that the variable names are unique. Luckily, =ppx_monad= includes a mechanism for this. A proper way to write the above functions would be the following:

#+BEGIN_SRC ocaml
  (* First variant: the local function gets a fresh name. *)
  let mk_catch ~loc e f =
    let (pcatch, catch) = Ppx_monad.fresh_variable () in
    [%expr
      let [%p pcatch] = fun e f ->
        match e with
        | Ok x -> Ok x
        | Error y -> f y
      in
      [%e catch] [%e e] [%e f]]

  (* Second variant: the variable bound by the [Error] case gets a fresh name. *)
  let mk_catch ~loc e f =
    let (py, y) = Ppx_monad.fresh_variable () in
    [%expr
      match [%e e] with
      | Ok x -> Ok x
      | Error [%p py] -> [%e f] [%e y]]
#+END_SRC

=[%p ...]= is similar to =[%e ...]= for holes in pattern positions. =Ppx_monad.fresh_variable= returns a pair of a pattern and an expression, the former binding a unique variable name which the latter mentions.

** Related Works

This section attempts to list all works that provide features similar to those of =ppx_monad=. We consider not mentioning such a project here a bug and welcome any [[https://github.com/Niols/ppx_monad/issues/new][issue]] or [[https://github.com/Niols/ppx_monad/pulls/compare][pull request]] aiming at fixing this.

- [[https://ocaml.org/manual/bindingops.html][OCaml's binding operators]]:
  - pure OCaml
- [[https://github.com/zepalmer][zepalmer]]/[[https://github.com/zepalmer/ocaml-monadic][ocaml-monadic]], a “lightweight PPX extension for OCaml to support natural monadic syntax.”
  - last updated in 2021.
- [[https://github.com/marigold-dev][marigold-dev]]/[[https://github.com/marigold-dev/ppx_let_binding][ppx_let_binding]], an “OCaml syntax extension for monads in the style of ReasonML.”
  - very new (to be followed),
  - not documented yet, and
  - not published on OPAM yet.
- [[https://github.com/kandu][kandu]]/[[https://github.com/kandu/ppx_ok_monad][ppx_ok_monad]], “a ppx syntax extension for monad syntax sugar.”
  - last updated 2 years ago.
- [[https://github.com/foretspaisibles][foretspaisibles]]/[[https://github.com/foretspaisibles/ppx_monad][ppx_monad]], “a monad syntax extension for OCaml, that provides two major monad syntaxes: clean but incomplete Haskell-style monad syntax and verbose but complete let monad syntax.”
  - last updated in 2017.
- [[https://github.com/rizo][rizo]]/[[https://github.com/rizo/ppx_monad][ppx_monad]], a “minimalistic monad syntax for OCaml.”
  - last updated in 2017.
- [[https://github.com/danmey][danmey]]/[[https://github.com/danmey/omonad][omonad]], a “monad syntax using ppx extensions.”
  - last updated in 2013,
  - no documentation, and
  - no OPAM package.
- [[https://github.com/pippijn][pippijn]]/[[https://github.com/pippijn/pa_monad_custom][pa_monad_custom]]:
  - based on Camlp4,
  - last updated in 2013, and
  - no OPAM package.