- Support for PostgreSQL! The test suite now runs against PostgreSQL, and `datools.explanations.diff` now allows you to ask "why" about data stored in Postgres. Get excited!
- `datools.sqlalchemy_utils.grouping_sets_query` will now generate a `GROUPING SETS` query for databases that support grouping sets (e.g., Postgres, DuckDB) or the equivalent `UNION ALL` version for databases without grouping sets support (e.g., SQLite). For more, check out the example in the docs.
- Python 3.10 support.
- Updated test suite to run tests against multiple databases, in particular expanding from SQLite only to DuckDB and SQLite.
- As a result of the last bullet, ensured code runs against DuckDB in addition to SQLite.
- First stab at documentation (https://datools.readthedocs.io/en/latest/).
- Introduced mypy to linting and CI to ensure code that makes it to `main` has proper types.
- Created first working example of DIFF working on a real-world dataset as a Jupyter notebook. This example partially replicates the Scorpion paper when only moteid/sensorids are considered.
- Separated the `on_columns` argument of `diff` into `on_column_values` (columns for which you want to generate equality predicates as explanations) and `on_column_ranges` (columns for which you want to generate range predicates as explanations after bucketing the ranges into 15 equi-sized buckets).
- First release of DIFF algorithm implementation.
- First release on PyPI.
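
To illustrate the `grouping_sets_query` entry above, here is a minimal sketch of how a `GROUPING SETS` query can be emulated with `UNION ALL` on a database such as SQLite. It is not the datools implementation: the helper `union_all_grouping_sets`, the `readings` table, and its columns are hypothetical names chosen for the example, and identifiers are interpolated directly into the SQL, so it is only suitable for trusted names.

```python
import sqlite3


def union_all_grouping_sets(table, grouping_sets, aggregate):
    """Build a UNION ALL query emulating GROUP BY GROUPING SETS.

    Hypothetical helper for illustration only; not part of datools.
    """
    # Every branch must project the same columns in the same order.
    all_columns = []
    for grouping_set in grouping_sets:
        for column in grouping_set:
            if column not in all_columns:
                all_columns.append(column)

    branches = []
    for grouping_set in grouping_sets:
        # Columns outside the current grouping set are reported as NULL,
        # which mirrors what GROUPING SETS returns for them.
        projected = ", ".join(
            column if column in grouping_set else f"NULL AS {column}"
            for column in all_columns
        )
        branch = f"SELECT {projected}, {aggregate} FROM {table}"
        if grouping_set:
            branch += " GROUP BY " + ", ".join(grouping_set)
        branches.append(branch)
    return " UNION ALL ".join(branches)


# Made-up sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (moteid INTEGER, sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(1, "temp", 20.0), (1, "light", 5.0), (2, "temp", 22.0)],
)

# Emulates: GROUP BY GROUPING SETS ((moteid), (sensor))
query = union_all_grouping_sets(
    "readings", [("moteid",), ("sensor",)], "COUNT(*) AS n"
)
rows = set(conn.execute(query).fetchall())
print(query)
print(rows)
```

On a database with native support (e.g., Postgres or DuckDB), the same result would come from a single `GROUP BY GROUPING SETS ((moteid), (sensor))` clause, which lets the engine compute all groupings in one statement instead of one `SELECT` per grouping set.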