Random forests are an ensemble learning method for classification and regression. Random forests correct decision trees' habit of over-fitting to their training set. Cf. "Breiman, Leo, 2001. Random forests. Machine learning, 45(1), pp.5-32" for details.
Published: 15 Feb 2018
A pure OCaml implementation of a random forest classifier based on OC4.5. See Wikipedia for more information.
OC4.5 is an implementation of C4.5 that can be found here.
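One ingredient of Breiman-style random forests is bootstrap sampling: each tree is trained on a sample drawn with replacement from the training set, so the trees differ from one another and their combined answer over-fits less. The following is a minimal, standalone sketch of that idea using only the OCaml standard library. The function name `bootstrap_sample` is hypothetical; it is not part of ORandForest or OC4.5 and does not claim to reflect how this library draws its samples.

```ocaml
(* Illustrative sketch only: not taken from ORandForest or OC4.5.
   Draw n items with replacement from an array of n items. *)
let bootstrap_sample (data : 'a array) : 'a array =
  let n = Array.length data in
  if n = 0 then [||]
  else Array.init n (fun _ -> data.(Random.int n))

let () =
  (* Fixed seed so that repeated runs draw the same sample. *)
  Random.init 42;
  let data = [| 10; 20; 30; 40; 50 |] in
  let sample = bootstrap_sample data in
  (* Some elements usually appear several times, others not at all. *)
  Array.iter (Printf.printf "%d ") sample;
  print_newline ()
```

Each tree of a forest would be grown on its own such sample.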
This project uses OBuild as a compilation manager.
To sum up, in order to get the project running:
opam install obuild # If you don't have it yet
Basic usage example
Assuming you're using integers (if not, use
ORandForest.FloatRandForest, or reimplement the needed functions to functorize
ORandForest.ORandForest with your datatype), a basic session would look like the following.
First generate a dataset;
let trainSet = Oc45.IntOc45.emptyTrainSet
    nbFeatures nbCategories featuresContinuity in
Oc45.IntOc45.addDataList trainDataPoints trainSet
then tweak the dataset to your needs;
then turn it into a random forest;
let forest = ORandForest.IntRandForest.genRandomForest nbTrees trainSet in
then classify your test data points;
let categories = List.map
    (ORandForest.IntRandForest.classify forest) testDataPoints in
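Conceptually, classifying a point with a forest means asking every tree for a category and keeping the most frequent answer. The sketch below illustrates that majority vote with plain functions standing in for trees. It is an assumption-laden illustration, not ORandForest's implementation: `majority_vote` and `classify_with` are hypothetical names, categories are assumed to be integers, and ties are broken toward the smaller category, which is an arbitrary choice made here.

```ocaml
(* Illustrative sketch only: not ORandForest's actual code.
   Return the most frequent category; ties go to the smaller category. *)
let majority_vote (votes : int list) : int =
  (* Count how many times each category was voted for. *)
  let counts = Hashtbl.create 16 in
  List.iter
    (fun c ->
      let n = try Hashtbl.find counts c with Not_found -> 0 in
      Hashtbl.replace counts c (n + 1))
    votes;
  (* Keep the best (category, count) pair seen so far. *)
  let best =
    Hashtbl.fold
      (fun c n acc ->
        match acc with
        | Some (bc, bn) when bn > n || (bn = n && bc < c) -> acc
        | _ -> Some (c, n))
      counts None
  in
  match best with
  | Some (c, _) -> c
  | None -> invalid_arg "majority_vote: empty list"

(* A "tree" is modelled as any function from a data point to a category. *)
let classify_with (trees : ('a -> int) list) (x : 'a) : int =
  majority_vote (List.map (fun tree -> tree x) trees)

let () =
  (* Three toy "trees" over a single integer feature. *)
  let trees =
    [ (fun x -> if x > 0 then 1 else 0);
      (fun x -> if x > 5 then 1 else 0);
      (fun _ -> 1) ]
  in
  Printf.printf "%d\n" (classify_with trees 3)
```

With the toy trees above, the point 3 receives the votes 1, 0 and 1, so the majority category is 1.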