art

Adaptive Radix Tree
README

This is an implementation in OCaml of ART. Adaptive Radix Tree is like a
simple Hashtbl with order:

# let tree = Art.make () ;;
# Art.insert tree (Art.key "foo") 42 ;;
# Art.insert tree (Art.key "bar") 21 ;;
# Art.find tree (Art.key "foo")
- : int = 42

Operation like minimum or maximum are available (which don't exist for a
simple Hashtbl.t):

# let tree = Art.make () ;;
# Art.insert tree (Art.key "0") 0 ;;
# Art.insert tree (Art.key "1") 1 ;;
# Art.insert tree (Art.key "2") 2 ;;
# Art.minimum tree
- : int = 0
# Art.maximum tree
- : int = 2

If you want the order and the speed of Hashtbl.t, Art is your library:

Read Optimised Write Exclusion (ROWEX) in OCaml

ROWEX is a second implementation of ART with atomic operations. It's a functor
which expects an implementation of atomic operations such as load or store.

Parallelism, atomic operation & OCaml

The current version of OCaml has a global lock for the GC. By this way, it's not
possible for us to execute ROWEX operations (find/insert) with true
parallelism if we use the same OCaml runtime. Even if you use LWT or ASYNC, you
execute jobs concurrently.

However, ROWEX wants to provide an implementation where find/insert can be
executed in parallel without any problems (race condition or ABA problem). So
ROWEX provides an implementation, persistent, which implements atomic
operations on a memory area. Then, we are able, as parmap, to
simulate true parallelism as long as each operations are executed into their own
fork().

The goal of this library is provide:

  • the most easy way to switch the implementation to
    ocaml-multicore

  • a baby step to be able to manipulate a file by several processes
    (consumers/find, producers/insert) in parallel

ROWEX follows two main papers:

  • The initial implementation of ROWEX

  • A derivation of it to be persistent: PART

Tools

The distribution comes with some tools to manipulate an index:

$ opam pin add -y https://github.com/dinosaure/art
$ opam install rowex
$ part.make index.idx
$ ls -lh
-rw-r--r-- 1 user user 8,0M ----- -- --:-- index.idx
prw------- 1 user user    0 ----- -- --:-- index.idx.socket
prw------- 1 user user    0 ----- -- --:-- index.idx-truncate.socket
$ part.insert index.idx foo 1
$ part.find index.idx foo
1

On the OCaml side, a Part module exists which implements these functions:

type 'a t constraint 'a = [< `Rd | `Wr ]

val create : ?len:int -> string -> unit
val insert : [> `Rd | `Wr ] t -> string -> int -> unit
val lookup : [> `Rd ] t -> string -> int

part is Unix dependent (and it need an Unix named pipe). It ensures with
explained internal mechanisms to use multiple readers and one writer:

  • The writer can take the exclusive ownership on the index file and its named
    pipe

  • readers don't need to take the ownership but they must send a signal into
    the named pipe (to the writer) that they start to introspect the index

For readers, some functions exist to signal their existence to the write:

val append_reader : Ipc.t -> unit
val delete_reader : Ipc.t -> unit

val ipc : _ t -> Ipc.t

Status: experimental

This part of the distribution is experimental - even if the distribution
comes with several tests to ensure that the implementation works, ROWEX is
fragile! It still need a synchronization mechanism fsync() which is added
pervasively in some parts of the code according to outcomes of errors.

Install
Published
20 Jul 2022
Sources
art-0.2.0.tbz
sha256=3b0690b13902d30c857bb7edf4491731c374c7504a999e6fabed9228504de9bb
sha512=b3f96eb07fb50ed014483dabe4b364c195a9e7bd2641225cb599d783c096c9d8be036966c33e5eb0c1d9c193328f4163292d460481c258aa1c5d512ccc713690
Dependencies
monolith
with-test
crowbar
with-test
alcotest
with-test
fmt
>= "0.8.7"
dune
>= "2.8.0"
ocaml
>= "4.07.0"
Reverse Dependencies