Adding SnowflakeCompare (Snowflake/Snowpark compare) (#333)
* adding snowflake/snowpark compare

* undo change

* add partial case sensitive support

* update doc

* pr comments

* remaining PR comments

* update readme

* mocking abs, trim

* remove local testing, fix text

* ignore snowflake tests

* catch snowpark import in test config

* python 3.12 actions without snowflake

* clean

* fix

* catch snowflake imports, fix snowflake type annotations

* conditional install

* conditional install

* conditional install

* fix conditional
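The "catch snowpark import" / "catch snowflake imports" bullets above refer to guarding an optional dependency. A minimal sketch of that pattern, with names hypothetical rather than taken from the PR's actual code:

```python
# Hypothetical sketch of the optional-import guard the commit messages
# describe: attempt the Snowpark import and degrade gracefully when it
# is missing (e.g. on Python 3.12, where the Snowflake extra is skipped).
try:
    import snowflake.snowpark  # noqa: F401  # optional dependency
    SNOWPARK_AVAILABLE = True
except ImportError:
    SNOWPARK_AVAILABLE = False


def require_snowpark() -> None:
    """Raise an actionable error when Snowflake features are requested."""
    if not SNOWPARK_AVAILABLE:
        raise ImportError(
            "snowflake-snowpark-python is not installed; "
            "run `pip install datacompy[snowflake]` to enable SnowflakeCompare."
        )
```

With this guard, importing the package never fails outright; only code paths that actually need Snowpark raise, and test configuration can skip Snowflake tests when `SNOWPARK_AVAILABLE` is false.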
rhaffar authored Oct 29, 2024
1 parent 1013ca8 commit 1ea649a
Showing 9 changed files with 2,844 additions and 6 deletions.
18 changes: 14 additions & 4 deletions .github/workflows/test-package.yml
```diff
@@ -64,17 +64,27 @@ jobs:
           java-version: '8'
           distribution: 'adopt'

-      - name: Install Spark and datacompy
+      - name: Install Spark, Pandas, and Numpy
         run: |
           python -m pip install --upgrade pip
           python -m pip install pytest pytest-spark pypandoc
           python -m pip install pyspark[connect]==${{ matrix.spark-version }}
           python -m pip install pandas==${{ matrix.pandas-version }}
           python -m pip install numpy==${{ matrix.numpy-version }}
+      - name: Install Datacompy without Snowflake/Snowpark if Python 3.12
+        if: ${{ matrix.python-version == '3.12' }}
+        run: |
+          python -m pip install .[dev_no_snowflake]
+      - name: Install Datacompy with all dev dependencies if Python 3.9, 3.10, or 3.11
+        if: ${{ matrix.python-version != '3.12' }}
+        run: |
+          python -m pip install .[dev]
       - name: Test with pytest
         run: |
-          python -m pytest tests/
+          python -m pytest tests/ --ignore=tests/test_snowflake.py
   test-bare-install:
```
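The version gate in the workflow above can be mirrored in plain Python when scripting installs. A hypothetical helper (not part of the PR) that picks the extras group the CI condition would install:

```python
import sys


def datacompy_extra(py_version=sys.version_info):
    """Return the pip extras target mirroring the CI condition above.

    Hypothetical sketch: Python 3.12 gets the dev extras without
    Snowflake/Snowpark; 3.9-3.11 get the full dev extras.
    """
    if (py_version.major, py_version.minor) == (3, 12):
        return ".[dev_no_snowflake]"
    return ".[dev]"
```

One could then run `python -m pip install` with the returned target, reproducing the two conditional workflow steps in a single script.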

```diff
@@ -101,7 +111,7 @@ jobs:
           python -m pip install .[tests]
       - name: Test with pytest
         run: |
-          python -m pytest tests/
+          python -m pytest tests/ --ignore=tests/test_snowflake.py
   test-fugue-install-no-spark:
```

```diff
@@ -127,4 +137,4 @@ jobs:
           python -m pip install .[tests,duckdb,polars,dask,ray]
       - name: Test with pytest
         run: |
-          python -m pytest tests/
+          python -m pytest tests/ --ignore=tests/test_snowflake.py
```
2 changes: 2 additions & 0 deletions README.md
````diff
@@ -34,6 +34,7 @@ pip install datacompy[spark]
 pip install datacompy[dask]
 pip install datacompy[duckdb]
 pip install datacompy[ray]
+pip install datacompy[snowflake]

 ```
````

```diff
@@ -95,6 +96,7 @@ with the Pandas on Spark implementation. Spark plans to support Pandas 2 in [Spa
 - Pandas: ([See documentation](https://capitalone.github.io/datacompy/pandas_usage.html))
 - Spark: ([See documentation](https://capitalone.github.io/datacompy/spark_usage.html))
 - Polars: ([See documentation](https://capitalone.github.io/datacompy/polars_usage.html))
+- Snowflake/Snowpark: ([See documentation](https://capitalone.github.io/datacompy/snowflake_usage.html))
 - Fugue is a Python library that provides a unified interface for data processing on Pandas, DuckDB, Polars, Arrow,
   Spark, Dask, Ray, and many other backends. DataComPy integrates with Fugue to provide a simple way to compare data
   across these backends. Please note that Fugue will use the Pandas (Native) logic at its lowest level
```
2 changes: 2 additions & 0 deletions datacompy/__init__.py
```diff
@@ -43,12 +43,14 @@
     unq_columns,
 )
 from datacompy.polars import PolarsCompare
+from datacompy.snowflake import SnowflakeCompare
 from datacompy.spark.sql import SparkSQLCompare

 __all__ = [
     "BaseCompare",
     "Compare",
     "PolarsCompare",
+    "SnowflakeCompare",
     "SparkSQLCompare",
     "all_columns_match",
     "all_rows_overlap",
```