OCaml Compiler - May 2021

I’m happy to publish the second issue of the “OCaml compiler development newsletter”. (This is by no means exhaustive: many people didn’t end up having the time to write something, and it’s fine.)

Feel free of course to comment or ask questions!

If you have been working on the OCaml compiler and want to say something, please feel free to post in this thread! If you would like me to get in touch next time I prepare a newsletter issue (some random point in the future), please let me know by email at (gabriel.scherer at gmail).

Previous issue:

OCaml compiler development newsletter, issue 1: before May 2021

Gabriel Scherer and Nicolas Chataing (@gasche and @nchataing)

[Gabriel writing] my main recent compiler-related activity is ongoing work with my intern Nicolas Chataing to implement a prototype of variant constructor unboxing, a core subset of what Jeremy Yallop proposed ( https://github.com/ocaml/RFCs/pull/14 ). Currently OCaml can "unbox" a variant if it has a single constructor (with a single parameter),

type t = Int of int [@@unboxed]

Jeremy's idea is to support the case where there are other constructors, but the tag (immediate value or block constructor tag) of the constructor parameter is disjoint from the tag of any other value at this type.

type t = Short of int [@unboxed] | Long of Mpz.t

Nicolas' prototype implementation is going along nicely, with some interesting challenges encountered and solved, and a few refactoring PRs along the way (#10307, #10412, #10428).

A key ingredient is to be able to compute the "head shape" of an OCaml type, an over-approximation of the set of possible tags of its values. We hit a few engineering and research issues in doing this. Where in the codebase should this be computed (beware of circular module dependencies)? Can we compute this information in a precise way in presence of mutually-recursive types, without risking non-termination?

We are taking inspiration from a general approach proposed by Leo White and Stephen Dolan to compute these kind of "type declaration properties" on-demand instead of as part of the type declaration's signature (see their proposal for "immediacy" at #10017, #10041 ), but our property is demanded more often (any occurrence of the constructor) and is more fine-grained (it is sensitive to type parameters), so we had to invent some solutions for new problems. (A close cousin is get_unboxed_type_representation, which avoids non-termination by using fuel, and we wanted something nicer than that.)

We discuss our handling of termination in (some) details in the following short abstract: Unfolding ML datatype definitions without loops.

Xavier Leroy (@xavierleroy)

I worked with Damien Doligez and Sadiq Jaffer on the "safe points" proposal (#10039), which is required to move forward with integrating Multicore OCaml. I re-expressed the static analyses that support the insertion of polls as backward dataflow analyses, making them simpler to understand and more robust. We also discussed whether to insert polls at the top of loops or at the bottom. Both strategies are implemented in the current state of the PR, and Sadiq is currently benchmarking them.

All this rekindled my interest in dataflow analyses. I wrote a generic backward dataflow analyzer, parameterized by an abstract domain and a transfer function (#10404). Originally I intended to use it only for the insertion of polls, but I also used it to reimplement the liveness analysis that plays a crucial role for register allocation and dead code elimination. A problem with the old liveness analysis is that it takes time exponential in the nesting of loops. The new generic analyzer avoids this pitfall by starting fixpoint iterations not systematically at the bottom of the abstract domain, but at the fixpoint found earlier, if any. This makes liveness analysis linear in the nesting of loops, and at worst cubic in the size of the function, instead of exponential.

Then I applied the same trick to the two passes that insert spills and reloads preventively (#10414). These are "analyze and simultaneously transform" passes, so I could not use the generic dataflow analyzer, but I could reuse the same improved fixpoint iteration strategy, again avoiding behaviors exponential in the nesting of loops. For instance, a trivial function consisting of 16 nested "for" loops now compiles in a few milliseconds, while it took several seconds before.

Jacques Garrigue (@garrigue)

No new PR this month, but I have kept working on those that were started in April, and are not yet merged:

#10348 improves the way expansion is done during unification, to avoid some spurious GADT related ambiguity errors
#10364 changes the typing of the body of the cases of pattern-matchings, allowing to warn in some non-principal situations; it also uncovered a number of principality related bugs inside the the type-checker
#10337 enforces that one always manipulate a normalized view of types by making type_expr an abstract type (with Takafumi Saikawa (@t6s))

For this last PR, we have interestingly observed that while this multiplied the number of calls to repr by a factor of up to 4, resulting in a 4% overhead in stdlib for instance, we could see no performance degradation in the compilation of Coq.

I have also discovered a new principality bug in the implementation of GADTs (see #10348 again), which fortunately should not affect soundness.

In a slightly different direction, I have started working on a backend targetting Coq: https://github.com/COCTI/ocaml/tree/ocaml_in_coq

If you add the -coq option to ocamlc, you get a .v file in place of a .cmo. It is still in a very early stage, only able to compile core ML programs, including references. The main difference with coq-of-ocaml is that the translation is intended to be soundness preserving: the resulting Coq code can be typed and evaluated without axioms, and should reduce to the same resut as the source program, so that the type soudness of Coq underwrites that of ocaml (for individual programs). At this point, it only relies on a single relaxation of the positivity restriction of Coq.

Thomas Refis (@trefis)

Recently, Didier Rémy and I have been looking at modular explicits, a small extension between the core and module language to help manipulate first-class functors and give the illusion of abstraction over module arguments in the core language via a new construct (tentatively) called dependent functions.

This construct was first introduced in the context of the modular implicits proposal; roughly it's what you're left with if you take away the "resolution of implicit arguments" part of that proposal.

As such, it is a natural stepping stone towards modular implicits and already has its own self-contained PR: #9187, contributed by Matthew Ryan.

What Didier and I have been focusing on recently is producing a more formal description of the feature and its relationship to first-class modules, as well as some arguments to justify that it is reasonable and desirable to add it to the language, even in the absence of modular implicits.

Stephen Dolan (@stedolan)

I've just opened #10437, which allows explicit quantifiers for type variables in signatures and GADTs, a small feature I promised to OCaml maintainers a dozen of months ago. (The ulterior motive is that these explicit quantifiers give a good place to put layout information, but I think they're worth having on their own merits).

Note: The on-demand immediacy proposal in #10017 / #10041, which Gabriel mentioned above, is extracted from part of the kinding system in the experimental branch https://github.com/janestreet/ocaml/tree/layouts , which additionally allows quantification over types of a given immediacy / layout: for instance, one can write type ('a : immediate) t = { foo : 'a } and have inference, etc. work as expected.

Sadiq Jaffer (@sadiqj)

The Safepoints PR #10039 has a few updates. It now has a new static analysis, written by Xavier Leroy and Damien Doligez, and has working code emitters on all 64-bit platforms.

The static analysis had some flexibility on poll placement in loops. We've benchmarked on amd64 and arm64, choosing to go with the option that results in slightly fewer instructions and branches across the Sandmark suite. Short of some refactoring I don't think there are any other oustanding issues with the PR.

Building on safepoints, I should soon have an attribute to propose which will enable users with atomic code blocks to safely migrate to a version of OCaml with safepoints. A draft PR or RFC will be coming very soon.

I am also doing some work on the instrumented runtime. One of the project's goals is to be able to continuously monitor OCaml applications running in a production environment. To that end I'm evaluating the instrumented runtime's performance overhead (both enabled and not), determining what work would be required to reduced the overhead and how we could modify the runtime to continuously extract metrics and events.

Anil Madhavapeddy (@avsm)

Ewan Mellor and I are working on a CI that'll make it easy to test individual changesets to the OCaml compiler and run "reverse dependencies" against a set of opam packages to isolate precisely what's causing a failure.

A failure to build an opam package can come from a variety of reasons. This can range from a build failure against a stable released compiler, to a failure on just OCaml trunk (but success on a released compiler), to a failure just on OCaml trunk + the PR in question. It's the triage of which of these situations is causing the package build failure that our new CI focusses on. Having this CI should let us quickly determine a PR's impact and potential regressions on the package ecosystem. Once the CI is stable on the OCaml multicore trees, I plan to submit it as a CI to run against mainline OCaml as well.

The working tree is at ocaml-multicore-ci (although it's called a "multicore CI", its really just turned into an "ocaml-compiler-ci" and we will rename the repository before a first release).

Florian Angeletti (@Octachron)

This week I have been working a bit on adding swaps and moves to the diffing based error messages for type declarations in #10361.

(And the release of the first alpha for OCaml 4.13.0)

The core idea of the PR is that when comparing

type t = { a:int; b:int; c:int; d:int }

with

type t = { a:int; c:int; d:int }

in an error message, it is better to notice that we are missing one field rather than trying to compare the fields b and c.

And with the machinery introduced for functor diffing, this is quite straigthforward to implement. I have been experimenting with this option since last december, and with the functor diffing PR merged #9331, I proposed a PR #10361 in April.

However, compared to functors, in type declarations, we have have one supplementary piece of information: the name of fields and constructors at a given position. Not using this piece of information yields slightly akward error messages:

  module M: sig type t = { a:int; b:int } end = struct type t = {b:int;
a:int}

   1. Fields have different names, x and y.
   2. Fields have different names, y and x.

Here, it would be better to recognize that the two fields have been swapped.

One simple way to do this without increasing the diffing complexity is to identify swaps at posteriori on the optimal patch produced by the diffing algorithm.

In this way we can replace the previous error message by

   1<->2. Fields x and y have been swapped.

without increasing the cost of the error analysis.

A similar situation happens when the position of a field changes between the interface and the implementation

module M: sig
   type t = { a:unit; b:int; c:float}
end = struct
   type t = { b:int; c:float; a:unit}
end

Explaining that the implementation can be transformed into the interface by adding a field a before the field b and deleting another field a after c is correct. But it is much nicer to sum up the issue as

1->3. Field a has been moved from position 1 to 3

Both composite moves are now recognized.

People interested by error message in OCaml should also have a look at the great work by Antal Spector-Zabusky in #10407 to improve the module level error message by expanding them with a full error trace. (The two PRs are quite complementary.)