package obandit

OCaml Multi-Armed Bandits

Obandit

module type BanditParam = sig ... end
module type Bandit = sig ... end

Exp3 Bandit.
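
To make the Exp3 entry concrete, here is a minimal sketch of the textbook Exp3 rule with a fixed mixing parameter and rewards assumed to lie in [0, 1]. It is not taken from the Obandit sources: the names `exp3`, `create_exp3`, `probabilities`, `sample`, and `update_exp3` are hypothetical, and the variant implemented by this package may differ (for example in how the learning rate is set).

```ocaml
(* Hypothetical sketch of textbook Exp3; not Obandit's API. *)
type exp3 = {
  gamma : float;          (* fixed mixing/exploration parameter in (0, 1] *)
  weights : float array;  (* one positive weight per arm *)
}

let create_exp3 ~gamma k = { gamma; weights = Array.make k 1.0 }

(* Mix the normalized weights with the uniform distribution. *)
let probabilities b =
  let k = float_of_int (Array.length b.weights) in
  let total = Array.fold_left (+.) 0.0 b.weights in
  Array.map (fun w -> (1.0 -. b.gamma) *. w /. total +. b.gamma /. k) b.weights

(* Sample an arm index from a probability vector by walking its CDF. *)
let sample probs =
  let u = Random.float 1.0 in
  let n = Array.length probs in
  let rec go i acc =
    if i >= n - 1 then i
    else
      let acc = acc +. probs.(i) in
      if u < acc then i else go (i + 1) acc
  in
  go 0 0.0

(* Importance-weighted update: only the played arm's weight changes. *)
let update_exp3 b arm reward =
  let p = (probabilities b).(arm) in
  let k = float_of_int (Array.length b.weights) in
  let estimate = reward /. p in
  b.weights.(arm) <- b.weights.(arm) *. exp (b.gamma *. estimate /. k)
```

A round would call `sample (probabilities b)` to pick an arm, observe its reward, and pass both to `update_exp3`.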

UCB1 Bandit.
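
As an illustration of the UCB1 entry, the following is a minimal sketch of the standard UCB1 rule: play every arm once, then pick the arm maximizing the empirical mean plus sqrt(2 ln t / n). Rewards are assumed to lie in [0, 1]. The names `ucb`, `create_ucb`, `ucb_index`, `choose_ucb`, and `update_ucb` are hypothetical and are not part of this package's interface.

```ocaml
(* Hypothetical sketch of standard UCB1; not Obandit's API. *)
type ucb = {
  pulls : int array;   (* number of times each arm was played *)
  avg : float array;   (* empirical mean reward of each arm *)
  mutable t : int;     (* total number of rounds played so far *)
}

let create_ucb k = { pulls = Array.make k 0; avg = Array.make k 0.0; t = 0 }

(* Upper confidence bound of arm [i]; requires pulls.(i) > 0. *)
let ucb_index b i =
  b.avg.(i) +. sqrt (2.0 *. log (float_of_int b.t) /. float_of_int b.pulls.(i))

(* Play each arm once, then choose the arm with the largest index. *)
let choose_ucb b =
  let k = Array.length b.pulls in
  let rec first_unplayed i =
    if i >= k then None
    else if b.pulls.(i) = 0 then Some i
    else first_unplayed (i + 1)
  in
  match first_unplayed 0 with
  | Some i -> i
  | None ->
    let best = ref 0 in
    for i = 1 to k - 1 do
      if ucb_index b i > ucb_index b !best then best := i
    done;
    !best

(* Record a reward for [arm] with an incremental mean update. *)
let update_ucb b arm reward =
  b.t <- b.t + 1;
  let n = b.pulls.(arm) + 1 in
  b.pulls.(arm) <- n;
  b.avg.(arm) <- b.avg.(arm) +. (reward -. b.avg.(arm)) /. float_of_int n
```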

Epsilon-Greedy Bandit with a fixed exploration rate.
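
The fixed-rate epsilon-greedy rule can be sketched as follows: with probability epsilon a uniformly random arm is played, otherwise the arm with the best empirical mean. This is a self-contained illustration under that assumption, not the package's code; `eg`, `create_eg`, `best_arm`, `choose`, and `update` are hypothetical names.

```ocaml
(* Hypothetical sketch of fixed-rate epsilon-greedy; not Obandit's API. *)
type eg = {
  epsilon : float;      (* fixed exploration rate in [0, 1] *)
  counts : int array;   (* number of times each arm was played *)
  means : float array;  (* empirical mean reward of each arm *)
}

let create_eg ~epsilon k =
  { epsilon; counts = Array.make k 0; means = Array.make k 0.0 }

(* Index of the arm with the highest empirical mean (first one on ties). *)
let best_arm b =
  let best = ref 0 in
  Array.iteri (fun i m -> if m > b.means.(!best) then best := i) b.means;
  !best

(* Explore with probability epsilon, otherwise exploit the best arm. *)
let choose b =
  if Random.float 1.0 < b.epsilon then Random.int (Array.length b.means)
  else best_arm b

(* Record a reward for [arm] with an incremental mean update. *)
let update b arm reward =
  let n = b.counts.(arm) + 1 in
  b.counts.(arm) <- n;
  b.means.(arm) <- b.means.(arm) +. (reward -. b.means.(arm)) /. float_of_int n
```

Because the rate never decays, a constant fraction of rounds is spent exploring no matter how much data has been collected.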

This functor wraps a bandit algorithm with the doubling trick: all rewards are rescaled by a scale, initially 1. When a reward above the current scale is observed, the bandit algorithm is restarted and the scale is doubled. This is useful when the reward scale is unknown and may be larger than 1.
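
The restart-and-double behaviour described above can be sketched as below. This is an illustration only, written as a plain record rather than a functor, and none of the names (`inner`, `wrapped`, `create`, `observe`) come from the package. Two details are assumptions: the scale is doubled repeatedly until the reward fits, and the reward that triggered the restart is then passed to the fresh inner state. The inner "algorithm" is a stand-in that merely sums rescaled rewards.

```ocaml
(* Hypothetical sketch of the doubling trick; not Obandit's API. *)

(* Stand-in for the wrapped bandit state: sums of rescaled rewards. *)
type inner = { sums : float array; mutable rounds : int }

let fresh_inner k = { sums = Array.make k 0.0; rounds = 0 }

type wrapped = {
  k : int;                 (* number of arms *)
  mutable scale : float;   (* current reward scale, initially 1 *)
  mutable inner : inner;   (* state of the wrapped algorithm *)
  mutable restarts : int;  (* number of restarts so far *)
}

let create k = { k; scale = 1.0; inner = fresh_inner k; restarts = 0 }

(* Feed a raw reward for [arm] to the wrapped algorithm. *)
let observe w arm reward =
  if reward > w.scale then begin
    (* Assumption: keep doubling until the reward fits under the scale. *)
    while reward > w.scale do
      w.scale <- 2.0 *. w.scale
    done;
    (* Restart: discard everything the inner algorithm has learned. *)
    w.inner <- fresh_inner w.k;
    w.restarts <- w.restarts + 1
  end;
  (* The inner algorithm only ever sees rewards of at most 1. *)
  let r = reward /. w.scale in
  w.inner.sums.(arm) <- w.inner.sums.(arm) +. r;
  w.inner.rounds <- w.inner.rounds + 1
```

For example, after `observe w 0 0.5` the scale is still 1; a following `observe w 1 3.0` raises the scale to 4, resets the inner state, and records the rescaled reward 0.75 for arm 1.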
