package mecab

Character kinds

val between : 'a -> 'a -> 'a -> bool

Top-level character ranges

is_ascii c checks c is an ASCII character: [U+0000-U+007F].

val is_basic_latin : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

An alias of is_ascii.

is_latin_1 c checks c is a Latin-1 character: [U+00A0-U+00FF].

is_latin_a c checks c is a Latin-A character: [U+0100-U+017F].

is_latin_b c checks c is a Latin-B character: [U+0180-U+024F].

is_ipa_ext c checks c is an IPA extension: [U+0250-U+02AF].

val is_greek_coptic : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_greek_coptic c checks c is a Greek and Coptic letter: [U+0370-U+03FF].

is_cyrillic c checks c is a Cyrillic letter: [U+0400-U+052F].

is_american c checks c is an Armenian letter: [U+0530-U+058F].

is_hebrew c checks c is a Hebrew letter: [U+0590-U+05FF].

is_arabic c checks c is an Arabic letter: [U+0600-U+06FF].

is_syriac c checks c is a Syriac letter: [U+0700-U+074F].

is_thaana c checks c is a Thaana letter: [U+0780-U+07BF].

is_devanagari c checks c is a Devanagari letter: [U+0900-U+097F].
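
Each predicate in this group amounts to a closed-interval test on the code point. The following is a minimal sketch of that idea, not the package's implementation: it uses the standard library's Uchar instead of Camomile's UChar, and in_range and the *_sketch names are hypothetical.

```ocaml
(* Sketch only: closed-interval tests on Unicode code points.
   Uses Stdlib.Uchar; [in_range] and the [*_sketch] names are hypothetical
   and are not part of this package. *)
let in_range lo hi (c : Uchar.t) : bool =
  let n = Uchar.to_int c in
  lo <= n && n <= hi

(* Ranges copied from the descriptions above. *)
let is_ascii_sketch = in_range 0x0000 0x007F
let is_greek_coptic_sketch = in_range 0x0370 0x03FF
let is_cyrillic_sketch = in_range 0x0400 0x052F
let is_hebrew_sketch = in_range 0x0590 0x05FF

let () =
  assert (is_ascii_sketch (Uchar.of_char 'A'));
  assert (is_greek_coptic_sketch (Uchar.of_int 0x03B1));   (* Greek alpha *)
  assert (not (is_cyrillic_sketch (Uchar.of_int 0x03B1)))
```

Because both bounds are inclusive, the first and last code point of each listed range satisfy the corresponding predicate.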

ASCII characters

val is_ascii_digit : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_digit c checks c is an ASCII digit: [0-9].

val is_ascii_upper : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_upper c checks c is an ASCII uppercase character: [A-Z].

val is_ascii_lower : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_lower c checks c is an ASCII lowercase character: [a-z].

val is_ascii_alpha : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_alpha c checks c is an ASCII alphabet: [a-zA-Z].

val is_ascii_alnum : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_alnum c checks c is an ASCII digit or alphabet: [a-zA-Z0-9].

is_ascii_word c checks c is an ASCII word character: [a-zA-Z0-9_].

val is_ascii_cntrl : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_cntrl c checks c is an ASCII control character: [\x00-\x1f\x7f].

val is_ascii_graph : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_graph c checks c is a visible character (anything except spaces and control characters): \x21-\x7e.

val is_ascii_print : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_print c checks c is a visible character or a space: \x20-\x7e.

val is_ascii_punct : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_punct c checks c is punctuation or symbol: is_ascii_graph c && not (is_ascii_alnum c).

val is_ascii_blank : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_blank c checks c is a space or tab: [ \t].

val is_ascii_space : CamomileLibraryDefault.Camomile.UChar.uchar -> bool

is_ascii_space c checks c is whitespace: [ \t\r\n\v\f].
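
Several of these classes are built from others, for example punctuation as "graphic but not alphanumeric". The sketch below shows one way to express those relationships on the standard library's Uchar. Every name ending in _sketch, and the helpers code and between_int, are hypothetical and not part of this package.

```ocaml
(* Sketch only: some of the ASCII classes described above, on Stdlib.Uchar.
   All names here are hypothetical, not this package's definitions. *)
let code = Uchar.to_int
let between_int lo hi n = lo <= n && n <= hi

let is_digit_sketch c = between_int 0x30 0x39 (code c)            (* [0-9] *)
let is_alpha_sketch c =
  between_int 0x41 0x5A (code c) || between_int 0x61 0x7A (code c) (* [a-zA-Z] *)
let is_alnum_sketch c = is_alpha_sketch c || is_digit_sketch c
let is_word_sketch c = is_alnum_sketch c || code c = 0x5F          (* adds '_' *)
let is_cntrl_sketch c = between_int 0x00 0x1F (code c) || code c = 0x7F
let is_graph_sketch c = between_int 0x21 0x7E (code c)
let is_print_sketch c = between_int 0x20 0x7E (code c)
(* Punctuation or symbol: graphic, but neither a letter nor a digit. *)
let is_punct_sketch c = is_graph_sketch c && not (is_alnum_sketch c)
let is_blank_sketch c = code c = 0x20 || code c = 0x09             (* [ \t] *)
let is_space_sketch c =
  is_blank_sketch c
  || List.mem (code c) [0x0D; 0x0A; 0x0B; 0x0C]                    (* \r \n \v \f *)

let () =
  let u = Uchar.of_char in
  assert (is_punct_sketch (u '!'));
  assert (not (is_punct_sketch (u 'a')));
  assert (is_word_sketch (u '_'));
  assert (is_space_sketch (u '\n') && not (is_blank_sketch (u '\n')))
```

Note that the space character is printable but not graphic, so it is neither punctuation nor alphanumeric under these definitions.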

Non-ASCII characters

is_wide_ascii c checks c is a full-width ASCII character.

is_wide_hira c checks c is a Japanese hiragana.

is_wide_kana c checks c is a Japanese full-width katakana.

is_half_kana c checks c is a Japanese half-width katakana.

is_kana c is is_half_kana c || is_wide_kana c.
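
The composition can be sketched as follows. The code-point ranges are assumptions: they are the Unicode Katakana block (U+30A0-U+30FF) and the halfwidth katakana characters (U+FF65-U+FF9F), chosen for illustration, since this page does not state which ranges the package uses. The in_range helper and the *_sketch names are hypothetical.

```ocaml
(* Sketch only. The ranges are Unicode ranges chosen for illustration;
   the ranges this package actually uses are not stated on this page. *)
let in_range lo hi (c : Uchar.t) : bool =
  let n = Uchar.to_int c in
  lo <= n && n <= hi

let is_wide_kana_sketch = in_range 0x30A0 0x30FF   (* Katakana block *)
let is_half_kana_sketch = in_range 0xFF65 0xFF9F   (* halfwidth katakana *)

(* Mirrors the stated definition: half-width or full-width katakana. *)
let is_kana_sketch c = is_half_kana_sketch c || is_wide_kana_sketch c

let () =
  assert (is_kana_sketch (Uchar.of_int 0x30AB));        (* full-width KA *)
  assert (is_kana_sketch (Uchar.of_int 0xFF76));        (* half-width KA *)
  assert (not (is_kana_sketch (Uchar.of_int 0x304B)))   (* hiragana KA *)
```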

is_kanji c checks c is a Chinese or Japanese kanji.

is_blank c checks c is

  • U+0009 horizontal tab,
  • U+0020 space,
  • U+00a0 non-breaking space,
  • U+2002-U+200B various width space,
  • U+3000 ideographic space, or
  • U+ffef zero width non-breaking space.
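
The list above maps directly onto a pattern match over the code point. This is a minimal sketch on the standard library's Uchar, with is_blank_sketch as a hypothetical name. The last entry is taken verbatim from the list (U+ffef); Unicode's ZERO WIDTH NO-BREAK SPACE is U+FEFF, so the package may intend that code point instead.

```ocaml
(* Sketch only: the code points listed above, on Stdlib.Uchar.
   [is_blank_sketch] is a hypothetical name, not this package's function. *)
let is_blank_sketch (c : Uchar.t) : bool =
  match Uchar.to_int c with
  | 0x0009                                      (* horizontal tab *)
  | 0x0020                                      (* space *)
  | 0x00A0 -> true                              (* non-breaking space *)
  | n when 0x2002 <= n && n <= 0x200B -> true   (* various-width spaces *)
  | 0x3000                                      (* ideographic space *)
  | 0xFFEF -> true                              (* value as listed above *)
  | _ -> false

let () =
  assert (is_blank_sketch (Uchar.of_char ' '));
  assert (is_blank_sketch (Uchar.of_int 0x3000));
  assert (not (is_blank_sketch (Uchar.of_char '\n')))
```

Unlike is_ascii_space, this predicate as listed does not include line breaks such as \n or \r.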