Adding NUMERIC and DECIMAL types (#186)

`NUMERIC` and `DECIMAL` types are used for arbitrary precision exact numbers. This has a been long in development. It's required several rewrites and some major changes and improvements to vsql in previous versions to lay the foundation for these types to be fully integrated as first class citizens. Now, number literals that contains a '.' (even if they are a whole numbers such as `123.`) are treated as `NUMERIC` with the scale and precision determined from the number. Arithmetic operations can result in types that are higher in scale and precision, according to the standard (which is very specific about all of that). As far as I'm aware vsql is the only SQL database to treat these as distinct types according to the standard, rather than being aliases of the same underlying type. In a nutshell, `NUMERIC` and `DECIMAL` are both stored as fractions, but `NUMERIC` permits any denominator within range, whereas a `DECIMAL` must have a base 10 denominator. You can think of `DECIMAL` as having "exactly" the precision specified (i.e good for currency), but `NUMERIC` has "at least" the precision specified. Meaning it's possible to `CAST` a `NUMERIC` to a higher precision and get more decimal places (from the inherent nature of a fraction). The docs explain this much better, with examples. Since this does introduce new storage types, a database file version bump is required but this likely be the last version bump before v1.0.0 is released. Along with the two new SQL types, there are some functions that work directly with exact numbers: `ABS`, `CEIL`, `FLOOR` and `MOD`. SQL Standard E011-03
elliotchance · Dec 24, 2023 · 9895e3b · 9895e3b
1 parent 2a1989a
commit 9895e3b
Show file tree

Hide file tree

Showing 32 changed files with 1,553 additions and 447 deletions.
diff --git a/docs/functions.rst b/docs/functions.rst
@@ -151,15 +151,19 @@ ABS
 .. code-block:: sql
 
    ABS(DOUBLE PRECISION) DOUBLE PRECISION
+   ABS(NUMERIC) NUMERIC
 
-Absolute value.
+Absolute value. When using ``NUMERIC`` the same scale and precision is returned.
 
 **Examples**
 
 .. code-block:: sql
 
-  VALUES ABS(1.2);
-  -- 1.2
+  VALUES ABS(1.2e0);
+  -- 1.2e0
+  
+  VALUES ABS(-1.23e0);
+  -- 1.23e0
   
   VALUES ABS(-1.23);
   -- 1.23
@@ -218,34 +222,43 @@ CEIL
 .. code-block:: sql
 
    CEIL(DOUBLE PRECISION) DOUBLE PRECISION
+   CEIL(NUMERIC) NUMERIC
 
-Round up to the nearest integer.
+Round up to the nearest integer. When using ``NUMERIC`` the same scale and
+precision is returned.
 
 **Examples**
 
 .. code-block:: sql
 
-  VALUES CEIL(3.7);
-  -- COL1: 4
+  VALUES CEIL(3.7e0);
+  -- COL1: 4e0
+
+  VALUES CEIL(3.3e0);
+  -- COL2: 4e0
 
-  VALUES CEIL(3.3);
-  -- COL2: 4
+  VALUES CEIL(-3.7e0);
+  -- COL3: -3e0
 
-  VALUES CEIL(-3.7);
-  -- COL3: -3
+  VALUES CEIL(-3.3e0);
+  -- COL4: -3e0
 
-  VALUES CEIL(-3.3);
-  -- COL4: -3
+  VALUES CEILING(3.7e0);
+  -- COL1: 4e0
 
-  VALUES CEILING(3.7);
+  VALUES CEIL(3.7);
   -- COL1: 4
 
+  VALUES CEILING(-3.7);
+  -- COL1: -3
+
 CEILING
 ^^^^^^^
 
 .. code-block:: sql
 
    CEILING(DOUBLE PRECISION) DOUBLE PRECISION
+   CEILING(NUMERIC) NUMERIC
 
 ``CEILING`` is an alias of ``CEIL``.
 
@@ -303,25 +316,33 @@ FLOOR
 .. code-block:: sql
 
    FLOOR(DOUBLE PRECISION) DOUBLE PRECISION
+   FLOOR(NUMERIC) NUMERIC
 
-Round down to the nearest integer.
+Round down to the nearest integer. When using ``NUMERIC`` the same scale and
+precision is returned.
 
 **Examples**
 
 .. code-block:: sql
 
-  VALUES FLOOR(3.7);
-  -- COL1: 3
+  VALUES FLOOR(3.7e0);
+  -- COL1: 3e0
 
-  VALUES FLOOR(3.3);
+  VALUES FLOOR(3.3e0);
+  -- COL1: 3e0
+
+  VALUES FLOOR(-3.7e0);
+  -- COL1: -4e0
+
+  VALUES FLOOR(-3.3e0);
+  -- COL1: -4e0
+
+  VALUES FLOOR(3.7);
   -- COL1: 3
 
   VALUES FLOOR(-3.7);
   -- COL1: -4
 
-  VALUES FLOOR(-3.3);
-  -- COL1: -4
-
 LN
 ^^^
 
@@ -360,15 +381,20 @@ MOD
 .. code-block:: sql
 
    MOD(DOUBLE PRECISION, DOUBLE PRECISION) DOUBLE PRECISION
+   MOD(NUMERIC, NUMERIC) NUMERIC
 
-Modulus.
+Modulus. When using ``NUMERIC`` the result will have a precision that is the
+highest precision between either parameter. The scale is undetermined.
 
 **Examples**
 
 .. code-block:: sql
 
-  VALUES MOD(232, 3);
-  -- COL1: 1
+  VALUES MOD(232e0, 3e0);
+  -- COL1: 1e0
+
+  VALUES MOD(10.7e0, 0.8e0);
+  -- COL1: 0.3e0
 
   VALUES MOD(10.7, 0.8);
   -- COL1: 0.3

diff --git a/docs/numbers.rst b/docs/numbers.rst
@@ -195,6 +195,155 @@ range a ``SQLSTATE 22003 numeric value out of range`` is raised.
     - -9223372036854775808 to 9223372036854775807
     - 8 or 9 bytes [2]_
 
+  * - ``DECIMAL(scale,prec)``
+    - Variable, described below.
+    - Variable based on scale
+
+  * - ``NUMERIC(scale,prec)``
+    - Variable, described below.
+    - Variable based on scale
+
+DECIMAL vs NUMERIC
+^^^^^^^^^^^^^^^^^^
+
+``DECIMAL`` and ``NUMERIC`` are both exact numeric types that require a scale
+and precision. Both store their respective values as fractions. For example,
+``1.23`` could be represented as ``123/100``.
+
+The main difference between these two types comes down to the allowed
+denominators. In short, a ``NUMERIC`` may have any denominator, whereas a
+``DECIMAL`` must have a denominator of exactly 10^scale. This can also be
+expressed as:
+
+.. list-table::
+  :header-rows: 1
+
+  * - Type
+    - Numerator
+    - Denominator
+
+  * - ``NUMERIC(scale, precision)``
+    - ± 10^scale (exclusive)
+    - ± 10^scale (exclusive)
+
+  * - ``DECIMAL(scale, precision)``
+    - ± 10^scale (exclusive)
+    - 10^scale
+
+When calculations are performed on a ``DECIMAL``, the result from each operation
+will be normalized to always satisfy this constraint.
+
+This means that a ``DECIMAL`` is always exact at the scale and precision
+specified and casting to a higher precision will not alter the value. In
+contrast, a ``NUMERIC`` promises to have *at least* the precision specified but
+the value may change as to be more exact if the precision is increased. This is
+best understood with some examples:
+
+.. code-block:: sql
+
+   VALUES CAST(1.23 AS DECIMAL(3,2)) / CAST(5 AS DECIMAL) * CAST(5 AS DECIMAL);
+   -- 1.20
+
+Because:
+
+1. ``1.23 AS DECIMAL(3,2)`` -> ``123/100``
+2. Normalize denominator -> ``123/100``
+3. Divide by ``5`` -> ``123/500``
+4. Normalize denominator -> ``24/100``
+5. Multiply by ``5`` -> ``120/100``
+6. Normalize denominator -> ``24/100``
+
+Whereas,
+
+.. code-block:: sql
+  
+  VALUES CAST(1.23 AS NUMERIC(3,2)) / 5 * 5;
+  -- 1.23
+
+Because:
+
+1. ``1.23 AS NUMERIC(3,2)`` -> ``123/100``
+2. Divide by ``5`` -> ``123/500``
+3. Multiply by ``5`` -> ``615/500``
+
+This may seem like the only difference is that ``NUMERIC`` does not normalize
+the denominator, but actually they both need to normalize a denominator that
+would be out of bounds. Consider the example:
+
+.. code-block:: sql
+
+   VALUES CAST(1.23 AS NUMERIC(3,2)) / 11;
+   -- 0.11
+
+1. ``1.23 AS NUMERIC(3,2)`` -> ``123/100``
+2. Divide by ``11`` -> ``123/1100``
+3. Denominator is out of bounds as it cannot be larger than 100. Highest
+   precision equivalent would be -> ``11/100``
+
+This the same process and result that a ``DECIMAL`` that the equivalent decimal
+operation. Casting to higher precision might result in a different value for
+``NUMERIC`` values, for example:
+
+.. code-block:: sql
+
+   VALUES CAST(CAST(5 AS NUMERIC(3,2)) / CAST(7 AS NUMERIC(5,4)) AS NUMERIC(5,4));
+   -- 0.7142
+
+Because:
+
+1. ``5 AS NUMERIC(3,2)`` -> ``5/1``
+2. Divide by ``7`` -> ``5/7``
+3. Cast to ``NUMERIC(5,4)`` -> ``5/7``
+4. Formatted result based on 4 precision -> ``0.7142``
+
+.. code-block:: sql
+
+   VALUES CAST(CAST(5 AS DECIMAL(3,2)) / CAST(7 AS DECIMAL) AS DECIMAL(5,4));
+   -- 0.7100
+
+Because:
+
+1. ``5 AS DECIMAL(3,2)`` -> ``500/100``
+2. Divide by ``7`` -> ``500/700``
+3. Normalize denominator -> ``71/100``
+4. Cast to ``DECIMAL(5,4)`` -> ``7100/10000``
+5. Formatted result based on 4 precision -> ``0.7100``
+
+Operations Between Exact Types
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Arithmetic operations can only be performed when both operates are the same
+fundamental types, ``NUMERIC`` or ``DECIMAL``, although they do not need to
+share the same scale or precision.
+
+.. list-table::
+  :header-rows: 1
+
+  * - Operation
+    - Result Type
+
+  * - ``NUMERIC(s1, s2) + NUMERIC(s2, p2)``
+    - ``NUMERIC(MAX(s1, s2), MIN(p1, p2))``
+
+  * - ``NUMERIC(s1, s2) - NUMERIC(s2, p2)``
+    - ``NUMERIC(MAX(s1, s2), MIN(p1, p2))``
+
+  * - ``NUMERIC(s1, s2) * NUMERIC(s2, p2)``
+    - ``NUMERIC(s1 * s2, p1 + p2)``
+
+  * - ``NUMERIC(s1, s2) / NUMERIC(s2, p2)``
+    - ``NUMERIC(s1 * s2, p1 + p2)``
+
+Examples:
+
+.. code-block:: sql
+
+   VALUES CAST(10.24 AS NUMERIC(4,2)) + CAST(12.123 AS NUMERIC(8,3));
+   -- 22.36 as NUMERIC(32, 3)
+
+   VALUES CAST(10.24 AS NUMERIC(4,2)) * CAST(12.123 AS NUMERIC(8,3));
+   -- 124.13952 as NUMERIC(32, 5)
+
 Casting
 -------
 

diff --git a/docs/sql-compliance.rst b/docs/sql-compliance.rst
@@ -13,7 +13,7 @@ Mandatory Features
 ------------------
 
 As of the latest version (or at least the version of this documentation)
-**vsql supports 60 of the 164 mandatory features** of the
+**vsql supports 62 of the 164 mandatory features** of the
 `SQL:2016 Standard <https://www.iso.org/standard/63556.html>`_.
 
 .. list-table:: Table 43 — Feature taxonomy and definition for mandatory features
@@ -22,7 +22,7 @@ As of the latest version (or at least the version of this documentation)
    * - Feature ID
      - Feature Name
 
-   * - ⭕ **E011**
+   * - ✅ **E011**
      - **Numeric data types**
 
    * - ✅ E011-01
@@ -31,7 +31,7 @@ As of the latest version (or at least the version of this documentation)
    * - ✅ E011-02
      - ``REAL``, ``DOUBLE PRECISON``, and ``FLOAT`` data types
 
-   * - ❌ E011-03
+   * - ✅ E011-03
      - ``DECIMAL`` and ``NUMERIC`` data types
 
    * - ✅ E011-04