package astring

US-ASCII character support

The following functions act only on US-ASCII code points, that is, on bytes in the range [0x00;0x7F]. They can safely be used on UTF-8 encoded strings, where they only deal with the US-ASCII code points and leave all other bytes alone.
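
The UTF-8 safety claim rests on a property of the encoding: every byte of a multi-byte UTF-8 sequence lies in [0x80;0xFF], so a byte-wise US-ASCII test can never match part of a non-ASCII character. The following is a minimal sketch using only the standard library; `is_ascii` and `count_ascii` are hypothetical helpers introduced for illustration and are not part of this module.

```ocaml
(* Hypothetical helper: a byte is US-ASCII iff it is in [0x00;0x7F]. *)
let is_ascii (c : char) : bool = Char.code c <= 0x7F

(* Hypothetical helper: count the US-ASCII bytes of a string, byte by byte. *)
let count_ascii (s : string) : int =
  let n = ref 0 in
  String.iter (fun c -> if is_ascii c then incr n) s;
  !n

let () =
  (* "\xC3\xA9" is the UTF-8 encoding of U+00E9: two bytes, both >= 0x80. *)
  let s = "caf\xC3\xA9!" in
  assert (String.length s = 6);
  (* Only 'c', 'a', 'f' and '!' are US-ASCII bytes. *)
  assert (count_ascii s = 4)
```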

Predicates

val is_valid : char -> bool

is_valid c is true iff c is a US-ASCII character, that is a byte in the range [0x00;0x7F].

val is_digit : char -> bool

is_digit c is true iff c is a US-ASCII digit '0' ... '9', that is a byte in the range [0x30;0x39].

val is_hex_digit : char -> bool

is_hex_digit c is true iff c is a US-ASCII hexadecimal digit '0' ... '9', 'a' ... 'f', 'A' ... 'F', that is a byte in one of the ranges [0x30;0x39], [0x41;0x46], [0x61;0x66].
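
The three ranges above can be written directly as character comparisons. This is a sketch of the documented behaviour using only the standard library, not the library's source; `hex_value` is a hypothetical helper added to show a typical use of the predicate.

```ocaml
(* Sketch of the documented ranges. *)
let is_digit c = '0' <= c && c <= '9'            (* [0x30;0x39] *)

let is_hex_digit c =
  is_digit c                                     (* [0x30;0x39] *)
  || ('A' <= c && c <= 'F')                      (* [0x41;0x46] *)
  || ('a' <= c && c <= 'f')                      (* [0x61;0x66] *)

(* Hypothetical helper: numeric value of a hexadecimal digit, if any. *)
let hex_value c =
  if is_digit c then Some (Char.code c - 0x30)
  else if 'A' <= c && c <= 'F' then Some (Char.code c - 0x41 + 10)
  else if 'a' <= c && c <= 'f' then Some (Char.code c - 0x61 + 10)
  else None

let () =
  assert (is_hex_digit 'f' && is_hex_digit 'A' && not (is_hex_digit 'g'));
  assert (hex_value 'F' = Some 15);
  assert (hex_value 'z' = None)
```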

val is_upper : char -> bool

is_upper c is true iff c is a US-ASCII uppercase letter 'A' ... 'Z', that is a byte in the range [0x41;0x5A].

val is_lower : char -> bool

is_lower c is true iff c is a US-ASCII lowercase letter 'a' ... 'z', that is a byte in the range [0x61;0x7A].

val is_letter : char -> bool

is_letter c is is_lower c || is_upper c.

val is_alphanum : char -> bool

is_alphanum c is is_letter c || is_digit c.
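
The letter and alphanumeric predicates are plain compositions of the range checks above. As a sketch under that reading (standard library only, not the library's source), the following rebuilds them and uses them in `is_simple_ident`, a hypothetical helper that accepts non-empty strings made of a letter followed by letters or digits.

```ocaml
let is_digit c = '0' <= c && c <= '9'
let is_upper c = 'A' <= c && c <= 'Z'
let is_lower c = 'a' <= c && c <= 'z'
let is_letter c = is_lower c || is_upper c
let is_alphanum c = is_letter c || is_digit c

(* Hypothetical helper: non-empty, starts with a letter, and every
   following byte is a letter or a digit. *)
let is_simple_ident (s : string) : bool =
  let rec all i =
    i >= String.length s || (is_alphanum s.[i] && all (i + 1))
  in
  s <> "" && is_letter s.[0] && all 1

let () =
  assert (is_simple_ident "x1");
  assert (not (is_simple_ident "1x"));
  assert (not (is_simple_ident "a_b"));
  assert (not (is_simple_ident ""))
```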

val is_white : char -> bool

is_white c is true iff c is a US-ASCII white space character, that is one of space ' ' (0x20), tab '\t' (0x09), newline '\n' (0x0A), vertical tab (0x0B), form feed (0x0C), carriage return '\r' (0x0D).

val is_blank : char -> bool

is_blank c is true iff c is a US-ASCII blank character, that is either space ' ' (0x20) or tab '\t' (0x09).
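
The difference between the two predicates is easy to miss: blank characters are a strict subset of white space characters. A minimal sketch of both definitions as pattern matches, written against the lists above and using only the standard library (it is not the library's source):

```ocaml
(* White space: space, tab, newline, vertical tab, form feed, carriage return. *)
let is_white = function
  | ' ' | '\t' | '\n' | '\x0B' | '\x0C' | '\r' -> true
  | _ -> false

(* Blank: space or tab only. *)
let is_blank = function ' ' | '\t' -> true | _ -> false

let () =
  (* Every blank character is also white space... *)
  assert (is_white ' ' && is_blank ' ');
  assert (is_white '\t' && is_blank '\t');
  (* ...but a newline is white space without being blank. *)
  assert (is_white '\n' && not (is_blank '\n'))
```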

val is_graphic : char -> bool

is_graphic c is true iff c is a US-ASCII graphic character, that is a byte in the range [0x21;0x7E].

val is_print : char -> bool

is_print c is is_graphic c || c = ' '.

val is_control : char -> bool

is_control c is true iff c is a US-ASCII control character, that is a byte in the range [0x00;0x1F] or 0x7F.
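
Taken together, the ranges given for is_print and is_control split the US-ASCII range into two disjoint parts: [0x20;0x7E] is printable and [0x00;0x1F] plus 0x7F is control. The sketch below re-states the documented ranges with the standard library only (it is not the library's source) and checks that relationship over all 256 byte values.

```ocaml
let is_valid c = Char.code c <= 0x7F
let is_graphic c = '\x21' <= c && c <= '\x7E'
let is_print c = is_graphic c || c = ' '
let is_control c = c <= '\x1F' || c = '\x7F'

let () =
  for i = 0 to 255 do
    let c = Char.chr i in
    (* No byte is both printable and a control character. *)
    assert (not (is_print c && is_control c));
    (* A byte is valid US-ASCII iff it is printable or a control character. *)
    assert (is_valid c = (is_print c || is_control c))
  done
```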

Casing transforms

val uppercase : char -> char

uppercase c is c with US-ASCII characters 'a' to 'z' mapped to 'A' to 'Z'.

val lowercase : char -> char

lowercase c is c with US-ASCII characters 'A' to 'Z' mapped to 'a' to 'z'.
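
In US-ASCII each lowercase letter sits exactly 0x20 above its uppercase counterpart, so both transforms can be sketched as a range check plus a fixed offset. The code below is an illustration using only the standard library, not the library's source; the final loop checks that it agrees with the standard library's `Char.uppercase_ascii` and `Char.lowercase_ascii` on every byte.

```ocaml
let uppercase c =
  if 'a' <= c && c <= 'z' then Char.chr (Char.code c - 0x20) else c

let lowercase c =
  if 'A' <= c && c <= 'Z' then Char.chr (Char.code c + 0x20) else c

let () =
  assert (uppercase 'q' = 'Q');
  assert (lowercase 'Q' = 'q');
  (* Non-letters are returned unchanged. *)
  assert (uppercase '1' = '1');
  (* Bytes outside US-ASCII are untouched as well. *)
  assert (uppercase '\xE9' = '\xE9');
  for i = 0 to 255 do
    let c = Char.chr i in
    assert (uppercase c = Char.uppercase_ascii c);
    assert (lowercase c = Char.lowercase_ascii c)
  done
```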

Escaping to printable US-ASCII

val escape : char -> string

escape c escapes c with:

  • '\\' (0x5C) escaped to the sequence "\\\\" (0x5C,0x5C).
  • Any byte in the ranges [0x00;0x1F] and [0x7F;0xFF] escaped by a hexadecimal "\xHH" escape, with each H an uppercase hexadecimal digit. These bytes are the US-ASCII control characters and the non US-ASCII bytes.
  • Any other byte is left unchanged.

Use String.Ascii.unescape to unescape.
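
The three rules above translate into a three-way pattern match. This is a sketch written from the rules as stated, using only the standard library; it is not the library's source, and its local `escape` merely mirrors the documented behaviour.

```ocaml
let escape (c : char) : string =
  match c with
  (* A backslash is doubled. *)
  | '\\' -> "\\\\"
  (* Control characters and non US-ASCII bytes become \xHH, uppercase hex. *)
  | '\x00' .. '\x1F' | '\x7F' .. '\xFF' ->
      Printf.sprintf "\\x%02X" (Char.code c)
  (* Every other byte is returned as a one-character string. *)
  | c -> String.make 1 c

let () =
  assert (escape 'a' = "a");
  assert (escape '\\' = "\\\\");
  assert (escape '\n' = "\\x0A");
  assert (escape '\xFF' = "\\xFF")
```

Under these rules a newline becomes the four bytes `\x0A`, and the result always consists of printable US-ASCII bytes only.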

val escape_char : char -> string

escape_char c is like escape except it escapes c according to OCaml's lexical conventions for characters, with:

  • '\b' (0x08) escaped to the sequence "\\b" (0x5C,0x62).
  • '\t' (0x09) escaped to the sequence "\\t" (0x5C,0x74).
  • '\n' (0x0A) escaped to the sequence "\\n" (0x5C,0x6E).
  • '\r' (0x0D) escaped to the sequence "\\r" (0x5C,0x72).
  • '\'' (0x27) escaped to the sequence "\\'" (0x5C,0x27).
  • Other bytes follow the rules of escape.

Use String.Ascii.unescape_string to unescape.
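
The OCaml-style variant can be sketched by placing the five named cases ahead of the general escape rules, so that they take priority over the hexadecimal fallback. As before, this is an illustration written from the rules listed above with the standard library only, not the library's source.

```ocaml
let escape_char (c : char) : string =
  match c with
  (* Named escapes from OCaml's lexical conventions come first. *)
  | '\b' -> "\\b"
  | '\t' -> "\\t"
  | '\n' -> "\\n"
  | '\r' -> "\\r"
  | '\'' -> "\\'"
  (* The remaining cases follow the rules of escape. *)
  | '\\' -> "\\\\"
  | '\x00' .. '\x1F' | '\x7F' .. '\xFF' ->
      Printf.sprintf "\\x%02X" (Char.code c)
  | c -> String.make 1 c

let () =
  assert (escape_char '\n' = "\\n");
  assert (escape_char '\'' = "\\'");
  (* Control characters without a named escape still use \xHH. *)
  assert (escape_char '\x00' = "\\x00");
  (* A double quote is not special in a character literal. *)
  assert (escape_char '"' = "\"")
```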
