sedlex

An OCaml lexer generator for Unicode
Library sedlex
Module Sedlexing.Utf16
type byte_order =
| Little_endian
| Big_endian
val from_gen : char Gen.t -> byte_order option -> lexbuf

Utf16.from_gen s opt_bo creates a lexbuf from a UTF-16 encoded stream. If opt_bo is None, the function expects a BOM (Byte Order Mark) and falls back to Utf16.Big_endian if it cannot find one. If opt_bo is Some bo, bo is taken as the byte order. In this case a leading BOM is kept in the stream, so the lexer has to ignore it, and a 'wrong' BOM (0xfffe) will raise Utf16.InvalidCodepoint.
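The byte-order rule described above can be sketched with the standard library alone. This is an illustration of the documented behaviour, not sedlex's implementation: the function detect_byte_order is hypothetical, the byte_order type is redeclared locally so the snippet stands alone, and it assumes that a detected BOM is consumed when opt_bo is None.

```ocaml
(* Local stand-in for Sedlexing.Utf16.byte_order, so the sketch is
   self-contained. *)
type byte_order = Little_endian | Big_endian

(* Hypothetical helper: returns the byte order to use for the UTF-16 bytes
   in [s], plus the number of leading bytes treated as a BOM (0 or 2).
   Assumption: with [None], a detected BOM is skipped; with [Some bo],
   nothing is skipped and any BOM stays in the stream. *)
let detect_byte_order (s : string) (opt_bo : byte_order option)
  : byte_order * int =
  match opt_bo with
  | Some bo -> (bo, 0)
  | None when String.length s >= 2 && s.[0] = '\xfe' && s.[1] = '\xff' ->
    (Big_endian, 2)
  | None when String.length s >= 2 && s.[0] = '\xff' && s.[1] = '\xfe' ->
    (Little_endian, 2)
  | None ->
    (* No BOM found: fall back to big endian, as documented. *)
    (Big_endian, 0)
```

With Some bo, a lexer built on top of such a buffer would still see U+FEFF as its first code point and has to skip it itself.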

val from_channel : in_channel -> byte_order option -> lexbuf

Works like Utf16.from_gen, but reads from an in_channel.

val from_string : string -> byte_order option -> lexbuf

Works like Utf16.from_gen, but reads from a string.

val lexeme : lexbuf -> byte_order -> bool -> string

lexeme lb bo bom works like Sedlexing.lexeme, but returns the result encoded in UTF-16 with byte order bo, starting with a BOM if bom = true.

val sub_lexeme : lexbuf -> int -> int -> byte_order -> bool -> string

sub_lexeme lb pos len bo bom works like Sedlexing.sub_lexeme, but returns the result encoded in UTF-16 with byte order bo, starting with a BOM if bom = true.
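To make the shape of the returned string concrete, here is a minimal standard-library sketch of the encoding step, assuming the matched code points are available as a Uchar.t array. The function encode_utf16 is hypothetical and only mirrors the documented bo and bom parameters; it is not how sedlex produces the string internally.

```ocaml
(* Local stand-in for Sedlexing.Utf16.byte_order. *)
type byte_order = Little_endian | Big_endian

(* Hypothetical helper: encode [cps] as UTF-16 in byte order [bo],
   prefixed with a BOM (U+FEFF) when [bom] is true. *)
let encode_utf16 (cps : Uchar.t array) (bo : byte_order) (bom : bool)
  : string =
  let buf = Buffer.create (2 * Array.length cps + 2) in
  (* Pick the stdlib encoder matching the requested byte order. *)
  let add =
    match bo with
    | Big_endian -> Buffer.add_utf_16be_uchar buf
    | Little_endian -> Buffer.add_utf_16le_uchar buf
  in
  if bom then add (Uchar.of_int 0xfeff);
  Array.iter add cps;
  Buffer.contents buf
```

For the single code point U+0041, little endian with a BOM yields the four bytes ff fe 41 00, and big endian without a BOM yields 00 41.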