fix snowflake formatting issue #346

Merged on Oct 30, 2024 (3 commits)
Changes from 2 commits
2 changes: 1 addition & 1 deletion datacompy/__init__.py
@@ -18,7 +18,7 @@
Then extended to carry that functionality over to Spark Dataframes.
"""

-__version__ = "0.14.2"
+__version__ = "0.14.3"

import platform
from warnings import warn
26 changes: 15 additions & 11 deletions docs/source/snowflake_usage.rst
@@ -6,11 +6,12 @@ For ``SnowflakeCompare``
- ``on_index`` is not supported.
- Joining is done using ``EQUAL_NULL`` which is the equality test that is safe for null values.
- Compares ``snowflake.snowpark.DataFrame``, which can be provided as either raw Snowflake dataframes
-or the as the names of full names of valid snowflake tables, which we will process into Snowpark dataframes.
+or as the full names of valid Snowflake tables, which we will process into Snowpark dataframes.


-SnowflakeCompare Object Setup
----------------------------------------------------
+SnowflakeCompare setup
+----------------------

There are two ways to specify input dataframes for ``SnowflakeCompare``

Provide Snowpark dataframes
@@ -66,11 +67,12 @@ Provide Snowpark dataframes
print(compare.report())


-Provide the full name (``{db}.{schema}.{table_name}``) of valid Snowflake tables
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Provide the full name (``db.schema.table_name``) of valid Snowflake tables
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Given the dataframes from the prior examples...

.. code-block:: python

df_1.write.mode("overwrite").save_as_table("toy_table_1")
df_2.write.mode("overwrite").save_as_table("toy_table_2")

@@ -210,6 +212,7 @@ There are a few convenience methods and attributes available after the compariso
print(compare.df2_unq_columns())
# OrderedSet()


Duplicate rows
--------------

@@ -260,9 +263,10 @@ as uniquely in the second.

Additional considerations
-------------------------
-- It is strongly recommended against joining on float columns (or any column with floating point precision).
-Columns joining tables are compared on the basis of an exact comparison, therefore if the values comparing
-your float columns are not exact, you will likely get unexpected results.
-- Case-sensitive columns are only partially supported. We essentially treat case-sensitive
-columns as if they are case-insensitive. Therefore you may use case-sensitive columns as long as
-you don't have several columns with the same name differentiated only be case sensitivity.

+- Joining on float columns, or any column with floating point precision, is strongly discouraged.
+Join columns are compared exactly, so if the values in your float columns are not exactly
+equal, you will likely get unexpected results.
+- Case-sensitive columns are only partially supported. We essentially treat case-sensitive columns as
+if they are case-insensitive. Therefore you may use case-sensitive columns as long as you don't have several
+columns with the same name differentiated only by case.
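
Note on the "Additional considerations" hunk: the two caveats can be illustrated with a minimal plain-Python sketch. This is not datacompy's code. The helper names ``exact_join_match`` and ``case_insensitive_collisions`` are hypothetical, and rounding before the join is only one possible workaround, assumed here for illustration.

```python
from collections import defaultdict


def exact_join_match(left_key: float, right_key: float) -> bool:
    """Mimic an exact-equality join condition on a single key value (hypothetical helper)."""
    return left_key == right_key


def case_insensitive_collisions(columns: list[str]) -> dict[str, list[str]]:
    """Group column names that differ only by case (hypothetical helper)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in columns:
        groups[name.upper()].append(name)
    # Keep only the groups where more than one original name maps to the same key.
    return {key: names for key, names in groups.items() if len(names) > 1}


# 0.1 + 0.2 is not exactly 0.3 in binary floating point, so an exact join misses the row.
print(exact_join_match(0.1 + 0.2, 0.3))  # False

# Assumption: rounding both sides to a fixed precision first makes the keys comparable.
print(exact_join_match(round(0.1 + 0.2, 6), round(0.3, 6)))  # True

# Columns that collide once case is ignored are the ones the docs warn about.
print(case_insensitive_collisions(["id", "Amount", "AMOUNT", "name"]))
# {'AMOUNT': ['Amount', 'AMOUNT']}
```

Running a check like the second helper over a table's column names before comparing is one way to spot the unsupported case early.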
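
Note on the first docs hunk: the docs say joining uses ``EQUAL_NULL`` because it is safe for null values. The sketch below models that semantics in plain Python, with ``None`` standing in for SQL ``NULL``. The functions ``equal_null`` and ``sql_equals`` are illustrative stand-ins, not Snowflake or datacompy APIs.

```python
from typing import Optional


def sql_equals(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """Ordinary SQL '=': any NULL operand yields NULL (None here), never TRUE."""
    if a is None or b is None:
        return None
    return a == b


def equal_null(a: Optional[int], b: Optional[int]) -> bool:
    """Null-safe equality: two NULLs compare equal, NULL versus a value is False."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


# With ordinary '=', rows whose join key is NULL on both sides would not match.
print(sql_equals(None, None))  # None
# With null-safe equality, those rows do match.
print(equal_null(None, None))  # True
print(equal_null(None, 1))     # False
print(equal_null(2, 2))        # True
```

This is why rows with a null join key on both sides can still pair up in the comparison instead of being reported as unique to each side.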