package macaque

type reserved_keyword_status = {
  reserved_in_sql2003 : bool;
  reserved_in_postgresql : bool;
}

Most lowercase identifiers are perfectly fine as far as Macaque is concerned, but some may fail at "SQL query generation time" because they turn out to be SQL reserved keywords. This runtime failure was reported by Vincent Valat.

We keep a list of SQL keywords here, warn the user when one is used, and "escape" it into a non-reserved identifier.

Macaque currently supports only PostgreSQL, so it would make sense to care about PostgreSQL reserved keywords only. This will hopefully change someday, however, so I decided to also consider the keywords reserved by the SQL:2003 standard.

val reserved_keywords : (string * reserved_keyword_status) list
val normalize_keyword_case : string -> string
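
As an illustration of how a table of this type could be consulted, here is a minimal sketch. The two entries are a hand-picked excerpt rather than the real list, and `reserved_keywords_excerpt`, `keyword_status` and `is_reserved` are hypothetical names that are not part of this interface. The sketch also assumes that keys are stored in lowercase.

```ocaml
(* Same record type as declared in this interface. *)
type reserved_keyword_status = {
  reserved_in_sql2003 : bool;
  reserved_in_postgresql : bool;
}

(* Illustrative excerpt only: the real table is much longer.
   Assumption: keys are stored in lowercase. *)
let reserved_keywords_excerpt : (string * reserved_keyword_status) list = [
  "select", { reserved_in_sql2003 = true; reserved_in_postgresql = true };
  "year",   { reserved_in_sql2003 = true; reserved_in_postgresql = false };
]

(* Hypothetical helper: look up an identifier, ignoring its case. *)
let keyword_status (id : string) : reserved_keyword_status option =
  List.assoc_opt (String.lowercase_ascii id) reserved_keywords_excerpt

(* Hypothetical helper: an identifier is treated as reserved if either
   the standard or PostgreSQL reserves it. *)
let is_reserved (id : string) : bool =
  match keyword_status id with
  | Some s -> s.reserved_in_sql2003 || s.reserved_in_postgresql
  | None -> false
```

Treating a word as reserved when either flag is set mirrors the decision described above to consider the standard as well as PostgreSQL; a stricter PostgreSQL-only check would test `reserved_in_postgresql` alone.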

SQL compatibility warning:

We "quote" identifiers that correspond to reserved keywords so that the query stays syntactically correct. An issue with automatic quoting is that quoted identifiers, besides being allowed to contain reserved words, are treated case-sensitively, whereas unquoted identifiers are case-insensitive, in the sense that the SQL server implicitly normalizes their case.

Funny problems may arise from this: if you define a table as tAbLe, it will internally be defined as TABLE on the server side (if the server normalizes to uppercase), and requesting the quoted table "tAbLe" will then fail with a "table not found" error.
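
The mismatch can be modelled with a toy resolver. This is only a sketch of the folding rule described here, not of any real server: `ident_ref`, `stored_name` and `same_object` are hypothetical names, and the folding function is passed in as a parameter so that either convention can be tried.

```ocaml
(* How an identifier is written in a query. *)
type ident_ref =
  | Unquoted of string  (* case-folded by the server before lookup *)
  | Quoted of string    (* taken literally, case preserved *)

(* Name the toy server uses for lookup, given its folding function. *)
let stored_name ~fold = function
  | Unquoted s -> fold s
  | Quoted s -> s

(* Two references denote the same object only if their stored names
   are exactly equal. *)
let same_object ~fold a b =
  String.equal (stored_name ~fold a) (stored_name ~fold b)
```

With `~fold:String.uppercase_ascii`, `same_object (Unquoted "tAbLe") (Quoted "tAbLe")` is `false`, which corresponds to the "table not found" situation, whereas `same_object (Unquoted "tAbLe") (Quoted "TABLE")` is `true`.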

Our choice is therefore to case-normalize reserved identifiers before quoting them.

Finally, PostgreSQL does not follow the SQL standard's rule of normalizing identifiers to uppercase; it normalizes to lowercase instead. As long as Macaque is PostgreSQL-only, we choose lowercase here, but this will have to be runtime-configurable in a hopeful future where Macaque is ported to other backends.
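
The "normalize, then quote" choice can be sketched as follows. This is not Macaque's actual implementation: `reserved_sample`, `normalize_case_sketch`, `quote_ident` and `keyword_safe_sketch` are hypothetical names, the keyword list is a three-word sample, double-quote quoting is assumed as the escaping mechanism, and the user warning is left out.

```ocaml
(* Tiny sample of reserved words, stored in lowercase (assumption). *)
let reserved_sample = ["select"; "table"; "user"]

(* Lowercase folding, matching the PostgreSQL convention chosen above. *)
let normalize_case_sketch (id : string) : string =
  String.lowercase_ascii id

(* Wrap an identifier in double quotes. Reserved words contain no
   double quote themselves, so no inner escaping is attempted here. *)
let quote_ident (id : string) : string =
  "\"" ^ id ^ "\""

(* Reserved identifiers are case-normalized and then quoted;
   all other identifiers are returned unchanged. *)
let keyword_safe_sketch (id : string) : string =
  let folded = normalize_case_sketch id in
  if List.mem folded reserved_sample then quote_ident folded else id
```

Because the case is folded before quoting, `keyword_safe_sketch "tAbLe"` yields the quoted form of `table`, which is the name an unquoted `tAbLe` would also resolve to on a lowercase-folding server.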

val keyword_safe : string -> string

It is rather awkward to protect SQL identifiers here, at the parser level. It would make more sense to preserve the user-input identifier as long as possible, that is, up to SQL query generation. However, this would require sharing the keyword-base code between the Camlp4 extension (which needs it to generate the warnings) and the output code, which is cumbersome with the .cmo loading scheme used for Camlp4 extensions. Doing everything in the extension is just more convenient.