package obandit


The Exp3 Bandit for adversarial regret minimization.

Parameters

module P : BanditParam

Signature

val getAction : float -> int

A mutable bandit.

The getAction function advances the bandit by one step of the bandit game. The argument is the reward for the last action, and the result is the next action. Rewards are floats in [0,1] and actions are integers in [0,n-1]. The first reward is discarded, since no action has been played before the first call. To use rewards larger than 1, use the WrapDoubling functor.
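The reward-in, action-out loop described above can be illustrated with a minimal, self-contained sketch. It does not use obandit itself: the names SketchParam, MakeSketchExp3, and gamma are hypothetical, the contents of the real BanditParam are not shown here and may differ, and the update rule is the textbook Exp3 with a fixed exploration rate, which is an assumption rather than the library's actual implementation.

```ocaml
(* Hypothetical parameter signature; the real BanditParam may differ. *)
module type SketchParam = sig
  val n : int         (* number of actions *)
  val gamma : float   (* fixed exploration rate in (0,1], assumed here *)
end

module MakeSketchExp3 (P : SketchParam) : sig
  val getAction : float -> int
end = struct
  (* Cumulative importance-weighted reward estimates, one per action. *)
  let estimates = Array.make P.n 0.0

  (* Last action played and the probability it was drawn with;
     None until getAction has been called once. *)
  let last : (int * float) option ref = ref None

  (* Exponential weights mixed with uniform exploration.
     Estimates are shifted by their maximum for numerical stability. *)
  let distribution () =
    let k = float_of_int P.n in
    let m = Array.fold_left max neg_infinity estimates in
    let w = Array.map (fun g -> exp (P.gamma *. (g -. m) /. k)) estimates in
    let total = Array.fold_left (+.) 0.0 w in
    Array.map (fun x -> (1.0 -. P.gamma) *. x /. total +. P.gamma /. k) w

  (* Draw an index from a probability vector. *)
  let sample p =
    let u = Random.float 1.0 in
    let rec go i acc =
      if i >= Array.length p - 1 then i
      else
        let acc = acc +. p.(i) in
        if u < acc then i else go (i + 1) acc
    in
    go 0 0.0

  let getAction reward =
    (match !last with
     | None -> ()  (* the first reward is discarded *)
     | Some (a, pa) ->
       if reward < 0.0 || reward > 1.0 then
         invalid_arg "getAction: reward outside [0,1]";
       (* Importance weighting keeps the estimate unbiased. *)
       estimates.(a) <- estimates.(a) +. reward /. pa);
    let p = distribution () in
    let a = sample p in
    last := Some (a, p.(a));
    a
end

(* Usage: action 2 always pays 1, the others pay 0. *)
module B = MakeSketchExp3 (struct let n = 3 let gamma = 0.1 end)

let () =
  let reward = ref 0.0 in  (* ignored on the first call *)
  let counts = Array.make 3 0 in
  for _ = 1 to 2000 do
    let a = B.getAction !reward in
    counts.(a) <- counts.(a) + 1;
    reward := if a = 2 then 1.0 else 0.0
  done;
  Array.iteri (Printf.printf "action %d played %d times\n") counts
```

The caller feeds each reward back on the following call, so the bandit state stays hidden inside the module, which matches the float -> int signature above. With the real library, the same loop would presumably call getAction on the module obtained by applying the Exp3 functor to a BanditParam argument.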
