package biotk

Data structures to represent sets of (possibly annotated) genomic regions

This module handles sets of genomic regions. It provides set operations such as union, intersection and difference, as well as membership tests. Dedicated data types are also provided for regions that are annotated with a value.

A genomic region is represented as a pair of a range and an abstract sequence (chromosome) identifier. The data structures implemented here are parameterized over this abstract identifier type. For the most common case, where chromosomes are identified by strings, apply the functor Make to the String module.
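The parameterization described above can be illustrated with a small, self-contained sketch. It is not the biotk implementation: the signature Chromosome, the record type range (assumed here to be a closed interval), and the functions add, union and mem are hypothetical names chosen for illustration only. The sketch shows how a functor over an ordered identifier type yields a collection of non-overlapping regions, and how applying it to String gives the string-keyed case.

```ocaml
(* Hypothetical sketch, NOT the biotk API: all names are illustrative. *)

(* What the functor needs from a chromosome identifier: a type and an order. *)
module type Chromosome = sig
  type t
  val compare : t -> t -> int
end

module Make (Chr : Chromosome) = struct
  (* Assumption: a range is a closed interval [lo, hi] of positions. *)
  type range = { lo : int; hi : int }

  (* A genomic region pairs a chromosome identifier with a range. *)
  type region = Chr.t * range

  module ChrMap = Map.Make (Chr)

  (* Each chromosome maps to a sorted list of pairwise disjoint ranges. *)
  type selection = range list ChrMap.t

  let empty : selection = ChrMap.empty

  (* Insert a range into a sorted disjoint list, merging any overlaps. *)
  let rec insert r = function
    | [] -> [ r ]
    | x :: rest when x.hi < r.lo -> x :: insert r rest
    | x :: rest when r.hi < x.lo -> r :: x :: rest
    | x :: rest -> insert { lo = min x.lo r.lo; hi = max x.hi r.hi } rest

  let add ((chr, r) : region) (s : selection) : selection =
    let ranges = Option.value (ChrMap.find_opt chr s) ~default:[] in
    ChrMap.add chr (insert r ranges) s

  (* Union: add every region of [a] to [b]. *)
  let union (a : selection) (b : selection) : selection =
    ChrMap.fold
      (fun chr ranges acc ->
        List.fold_left (fun acc r -> add (chr, r) acc) acc ranges)
      a b

  (* Membership test: is position [pos] on [chr] covered by the selection? *)
  let mem chr pos (s : selection) =
    match ChrMap.find_opt chr s with
    | None -> false
    | Some ranges -> List.exists (fun r -> r.lo <= pos && pos <= r.hi) ranges
end

(* The common case: chromosomes identified by strings. *)
module S = Make (String)

(* The two overlapping chr1 ranges are merged into a single range. *)
let sel =
  S.empty
  |> S.add ("chr1", { S.lo = 100; hi = 200 })
  |> S.add ("chr1", { S.lo = 150; hi = 300 })
  |> S.add ("chr2", { S.lo = 10; hi = 20 })
```

Keeping the ranges sorted and disjoint per chromosome is one simple way to obtain the non-overlapping variant; a real library would more likely use a balanced interval structure than a list.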

The functor Make provides four datatypes, which correspond to variants that differ in whether:

  • the regions in the set can overlap or not
  • the regions are annotated with some values

module Selection : sig ... end

A collection of non-overlapping regions (e.g. a set of CpG islands)

module LSet : sig ... end

A set of locations (e.g. a set of gene loci)

module LMap : sig ... end

A set of locations with an attached value on each of them

module LAssoc : sig ... end