Chapter 27 Runtime detection of data races with ThreadSanitizer

1 Overview and usage

OCaml, since version 5.0, allows shared-memory parallelism and thus mutation of data shared between multiple threads. This creates the possibility of data races, i.e., unordered accesses to the same memory location with at least one of them being a write. In OCaml, data races are easy to introduce, and the behaviour of programs with data races can be unintuitive — the observed behaviours cannot be explained by simply interleaving operations from different concurrent threads. More information about data races and their consequences can be found in section ‍9.4 and Chapter ‍10.

To help detect data races, OCaml supports ThreadSantizer (TSan), a dynamic data race detector that has been successfully used in languages such as C/C++, Swift, etc. TSan support for OCaml is available since OCaml 5.2.

To use TSan, you must configure the compiler with --enable-tsan. You can also install an opam switch with the TSan feature enabled as follows:

opam switch create <YOUR-SWITCH-NAME-HERE> ocaml-option-tsan

TSan support for OCaml is currently available for the x86_64 architecture, on FreeBSD, Linux and macOS, and for the arm64 architecture on Linux and macOS. Building OCaml with TSan support requires GCC or Clang. Minimal supported versions are GCC 11 and Clang 14. Note that TSan data race reports with GCC 11 are known to result in poor stack trace reporting (no line numbers), which is fixed in GCC 12.

A TSan-enabled compiler differs from a regular compiler in the following way: all programs compiled by ocamlopt are instrumented with calls to the TSan runtime, and TSan will detect data races encountered during execution.

For instance, consider the following program:

let a = ref 0 and b = ref 0 let d1 () = a := 1; !b let d2 () = b := 1; !a let () = let h = Domain.spawn d2 in let r1 = d1 () in let r2 = Domain.join h in assert (not (r1 = 0 && r2 = 0))

This program has data races. The memory locations a and b are read and written concurrently by multiple domains d1 and d2. a and b are “non-atomic” locations according to the memory model (see Chapter ‍10), and there is no synchronization between accesses to them. Hence, there are two data races here corresponding to the two memory locations a and b.

When you compile and run this program with ocamlopt, you may observe data race reports on the standard error, such as:

==================
WARNING: ThreadSanitizer: data race (pid=3808831)
  Write of size 8 at 0x8febe0 by thread T1 (mutexes: write M90):
    #0 camlSimple_race.d2_274 simple_race.ml:8 (simple_race.exe+0x420a72)
    #1 camlDomain.body_706 stdlib/domain.ml:211 (simple_race.exe+0x440f2f)
    #2 caml_start_program <null> (simple_race.exe+0x47cf37)
    #3 caml_callback_exn runtime/callback.c:197 (simple_race.exe+0x445f7b)
    #4 domain_thread_func runtime/domain.c:1167 (simple_race.exe+0x44a113)

  Previous read of size 8 at 0x8febe0 by main thread (mutexes: write M86):
    #0 camlSimple_race.d1_271 simple_race.ml:5 (simple_race.exe+0x420a22)
    #1 camlSimple_race.entry simple_race.ml:13 (simple_race.exe+0x420d16)
    #2 caml_program <null> (simple_race.exe+0x41ffb9)
    #3 caml_start_program <null> (simple_race.exe+0x47cf37)
[...]

WARNING: ThreadSanitizer: data race (pid=3808831)
  Read of size 8 at 0x8febf0 by thread T1 (mutexes: write M90):
    #0 camlSimple_race.d2_274 simple_race.ml:9 (simple_race.exe+0x420a92)
    #1 camlDomain.body_706 stdlib/domain.ml:211 (simple_race.exe+0x440f2f)
    #2 caml_start_program <null> (simple_race.exe+0x47cf37)
    #3 caml_callback_exn runtime/callback.c:197 (simple_race.exe+0x445f7b)
    #4 domain_thread_func runtime/domain.c:1167 (simple_race.exe+0x44a113)

  Previous write of size 8 at 0x8febf0 by main thread (mutexes: write M86):
    #0 camlSimple_race.d1_271 simple_race.ml:4 (simple_race.exe+0x420a01)
    #1 camlSimple_race.entry simple_race.ml:13 (simple_race.exe+0x420d16)
    #2 caml_program <null> (simple_race.exe+0x41ffb9)
    #3 caml_start_program <null> (simple_race.exe+0x47cf37)
[...]

==================
ThreadSanitizer: reported 2 warnings

For each detected data race, TSan reports the location of the conflicting accesses, their nature (read, write, atomic read, etc.), and the associated stack trace.

If you run the above program several times, the output may vary: sometimes TSan will report two data races, sometimes one, and sometimes none. This is due to the combination of two factors:

This example illustrates the fact that data races can sometimes be hidden by unrelated synchronizing operations.

2 Performance implications

TSan instrumentation imposes a non-negligible cost at runtime. Empirically, this cost has been observed to cause a slowdown which can range from 2x to 7x. One of the main factors of high slowdowns is frequent access to mutable data. In contrast, the initialising writes to and reads from immutable memory locations are not instrumented. TSan also allocates very large amounts of virtual memory, although it uses only a fraction of it. The memory consumption is increased by a factor between 4 and 7.

3 False negatives and false positives

As illustrated by the previous example, TSan will only report the data races encountered during execution. Another important caveat is that TSan remembers only a finite number of memory accesses per memory location. At the time of writing, this number is 4. Data races involving a forgotten access will not be detected. Lastly, the documentation of TSan states that there is a tiny probability to miss a race if two threads access the same location at the same time. TSan may overlook data races only in these three specific cases.

For data races between two memory accesses made from OCaml code, TSan does not produce false positives; that is, TSan will not emit spurious reports.

When mixing OCaml and C code, through the use of C primitives, the very notion of false positive becomes less clear, as it involves two memory models – OCaml and C11. However, TSan should behave mostly as one would expect: non-atomic reads and writes in C will race with non-atomic reads and writes in OCaml, and C atomics will not race with OCaml atomics. There is one theoretical possibility of false positive: if a value is initialized from C without using caml_initialize (which is allowed under the condition that the GC does not run between the allocation and the write, see Chapter ‍22) and a conflicting access is made later by another thread. This does not constitute a data race, but TSan may report it as such.

4 Runtime options

TSan supports a number of configuration options at runtime using the TSAN_OPTIONS environment variable. TSAN_OPTIONS should contain one or more options separated by spaces. See the documentation of TSan flags and the documentation of flags common to all sanitizers for more information. Notably, TSAN_OPTIONS allows suppressing some data races from TSan reports. Suppressing data race reports is useful for intentional races or libraries that cannot be fixed.

For example, to suppress reports originating from functions in the OCaml module My_module, one can run

TSAN_OPTIONS="suppressions=suppr.txt" ./my_instrumented_program

with suppr.txt a file containing:

race_top:^camlMy_module

(Note that this depends on the format of OCaml symbols in the executable. Some builders, like Dune, might result in different formats. You should adapt this example to the symbols effectively present in your stack traces.)

The TSAN_OPTIONS variable also allows for increasing the “history size”, e.g.:

TSAN_OPTIONS="history_size=7" ./my_instrumented_program

TSan’s history records events such as function entry and exit, and is used to reconstruct stack traces. Increasing the history size can sometimes be necessary to obtain the second stack trace, but it also increases memory consumption. This setting does not change the number of memory accesses remembered per memory location.

5 Guidelines for linking

As a general rule, OCaml programs instrumented with TSan should only be linked with OCaml or C objects also instrumented with TSan. Doing otherwise may result in crashes. The only exception to this rule are C libraries that do not call into the OCaml runtime system in any way, i.e., do not allocate, raise exceptions, call back into OCaml code, etc. Examples include the libc or system libraries. Data races in non-instrumented libraries will not be reported.

C code interacting with OCaml should always be built through the ocamlopt command, which will pass the required instrumentation flags to the C compiler. The CAMLno_tsan qualifier can be used to prevent functions from being instrumented:

CAMLno_tsan void f(int arg)
{
  /* This function will not be instrumented. */
  ...
}

Races from non-instrumented functions will not be reported. CAMLno_tsan should only be used by experts. It can be used to reduce the performance overhead in certain corner cases, or to suppress some known alarms. For the latter, using a suppressions file with TSAN_OPTIONS should be preferred when possible, as it allows for finer-grained control, and qualifying a function f with CAMLno_tsan results in missing entries in TSan’s stack traces when a data race happens in a transitive callee of f.

There is no way to disable instrumentation in OCaml code.

6 Changes in the delivery of signals

TSan intercepts all signals and passes them down to the instrumented program. This overlay from TSan is not always transparent for the program. Synchronous signals such as SIGSEV, SIGILL, SIGBUS, etc. will be passed down immediately, whereas asynchronous signals such as SIGINT will be delayed until the next call to the TSan runtime (e.g. until the next access to mutable data). This limitation of TSan can have surprising effects: for instance, pure, recursive functions that do not allocate cannot be interrupted until they terminate.