uuseg
v14.0.0
Uuseg is an OCaml library for segmenting Unicode text. It implements
the locale-independent Unicode text segmentation algorithms to
detect grapheme cluster, word and sentence boundaries and the
Unicode line breaking algorithm to detect line break
opportunities.
The library is independent from any IO mechanism or Unicode text data
structure and it can process text without a complete in-memory
representation.
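As an illustration of this incremental style, here is a minimal sketch that feeds code points one at a time to a segmenter through `Uuseg.create` and `Uuseg.add`. The `segments` helper and its list-based input are assumptions made for this example and are not part of the library. Any other source of `Uchar.t` values would work the same way.

```ocaml
(* Sketch only: requires the uuseg library, e.g.
   ocamlfind ocamlopt -package uuseg -linkpkg segments.ml *)

(* [segments kind text] splits [text] into segments of the given [kind].
   Hypothetical helper written for this example, not a Uuseg function. *)
let segments (kind : Uuseg.boundary) (text : Uchar.t list) : Uchar.t list list =
  let seg = Uuseg.create kind in
  (* [cur] holds the current segment in reverse order,
     [acc] holds the finished segments in reverse order. *)
  let flush cur acc = if cur = [] then acc else List.rev cur :: acc in
  (* After each addition, keep calling [Uuseg.add] with [`Await]
     until the segmenter asks for more input or reports the end. *)
  let rec drain cur acc v = match Uuseg.add seg v with
    | `Uchar u -> drain (u :: cur) acc `Await
    | `Boundary -> drain [] (flush cur acc) `Await
    | `Await | `End -> (cur, acc)
  in
  let cur, acc =
    List.fold_left (fun (cur, acc) u -> drain cur acc (`Uchar u)) ([], []) text
  in
  (* Signal the end of the text and collect what remains. *)
  let cur, acc = drain cur acc `End in
  List.rev (flush cur acc)

(* Usage: word segments of the ASCII text "ab cd". *)
let words = segments `Word (List.map Uchar.of_char ['a'; 'b'; ' '; 'c'; 'd'])
```

Because the segmenter only ever sees one code point at a time, the same loop can be driven from a decoder or a channel instead of a list, without holding the whole text in memory.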
Uuseg depends on Uucp and
optionally on Uutf for support on
OCaml UTF-X encoded strings. It is distributed under the ISC license.
Homepage: http://erratique.ch/software/uuseg
Installation
Uuseg can be installed with opam:
opam install uuseg
opam install uutf uuseg # for support on OCaml UTF-X encoded strings
If you don't use opam, consult the opam file for build instructions.
Documentation
The documentation and API reference can be consulted online or
via odig doc uuseg.
Sample programs
If you installed Uuseg with opam, sample programs are located in
the directory opam config var uuseg:doc.
In the distribution, sample programs are located in the test
directory. They can be built with:
topkg build --tests true