Legend:
Library
Module
Module type
Parameter
Class
Class type

The Basic Theory of Floating Points.

Floating point numbers represent a finite subset of the set of real numbers. Some formats also extend this set with special values to represent infinities or error conditions. This, in general, exceeds the scope of the floating point theory, however the theory includes predicates with domains that potentially may include this special numbers, e.g., is_nan. For floating point formats that do not support special values, such predicates will become constant functions.

All operations in the Floating Point theory are defined in terms of operations on real numbers. Since floating point numbers represent only a subset of the real set, denotations select a number from the set of numbers of the floating point sort using concrete rules expressed in terms of rounding modes. The rounding mode is a parameter of many operations, denoted with a term of sort rmode.

The denotation is not defined if x represents zero.

Rounding modes

Many operations in the Theory of Floating Point numbers are defined using the rounding mode parameter.

The rounding mode gives a precise meaning to the phrase "the closest floating point number to x", where x is a real number. When x is not representable by the given format, some other number x' is selected based on rules of the rounding mode.

The denotation is the floating point number nearest to the denoted real number. If the two nearest numbers are equally close, then the one with an even least significant digit shall be selected. The denotation is not defined, if both numbers have an even least significant digit.

The denotation is the floating point number nearest to the denoted real number. If the two nearest numbers are equally close, then the one with larger magnitude shall be selected.