Legend:
Library
Module
Module type
Parameter
Class
Class type
This module offers a partition refinement data structure.
This data structure is useful in partition refinement algorithms, such as automata minimization algorithms. It is taken from Antti Valmari's 2012 paper, "Fast brief practical DFA minimization".
An element is represented as an integer index in the range [0, n). See Indexing for more information about type-level indices.
type'n t
A data structure of type 'n t represents a partition of a set of nelements. The cardinal of this set is represented at the type level by 'n. The elements are grouped into nonempty disjoint blocks. The data structure allows blocks to be split, so, over time, blocks become smaller: hence the name partition refinement.
The data structure allows elements to be temporarily marked; marks are used to determine how blocks must be split.
Furthermore, the data structure allows some elements to be permanently discarded. The discarded elements are set aside in a special block and no longer participate in the partition refinement process.
create ?partition n creates a fresh partition data structure for a set of cardinal n.
If the optional argument partition is absent, then the partition initially has a single block, which contains all elements. In this case, the time complexity of this operation is O(n).
If the optional argument partition is present, then partition must be a function that implements a total order. Two elements x and y are placed in distinct blocks if and only if they are distinguished by this order, that is, if and only if partition x y is nonzero. In this case, the time complexity of this operation is O(n.log n).
mark p elt marks the element elt. If this element was marked already, then this operation has no effect. If this element was discarded earlier, then this operation has no effect.
split p splits every block that has both marked and unmarked elements into two blocks: a block of the marked elements and a block of the unmarked elements. Then, all marks are cleared.
Whenever a block is split into two subblocks, the existing block index is re-used for the larger half, while a new block number is allocated and given to the smaller half.
The amortized time complexity of this operation is O(1).
discard p f discards every element x such that f x is true. discard requires that no element is marked, and preserves this property.
If every element of a block is discarded, then this block disappears; so, this operation can cause the number of blocks to decrease.
The time complexity of this operation is O(n).
Querying the Data Structure
type block = int
Blocks are identified by integer indices, comprised between 0 inclusive and block_count p exclusive. Of course, the number of blocks, the mapping of elements to blocks, and the choice of a distinguished element within each block can change when the data structure is modified.
find p elt returns the index of the block that currently contains the element elt. If this element has been discarded, the special index -1 is returned. Otherwise, a block index in the range [0, block_count p) is returned.
The time complexity of this operation is O(1).
val iter_block_elements : 'nt->block->('nelt-> unit)-> unit
iter_block_elements p blk yield applies yield to each element that currently belongs to the block blk.
The time complexity of this operation is linear in the size of the block blk.
val iter_all_elements : 'nt->('nelt-> unit)-> unit
iter_all_elements p yield applies yield to every element of every block. The discarded elements are ignored.
is_chosen p elt returns true if the element elt has not been discarded and is the distinguished element of its block, that is, if choose p (find p elt) is elt.
The time complexity of this operation is O(1).
val exhaust_marked_elements : 'nt->block->('nelt-> unit)-> unit
exhaust_marked_elements p blk yield applies yield to each marked element of the block blk. The function yield is allowed to mark new elements of this block or of other blocks. The iteration continues until every marked element of the block blk has been yielded once.
The time complexity of this operation is linear in the number of yielded elements.