This repository was archived by the owner on May 17, 2024. It is now read-only.

Commit 25692cb

Tidying up duplication between /docs and docs.datafold.com (#495)
* remove how to use file, which is duplicative of docs.datafold.com
* post questions in github discussions
* update docs/index.rst
* remove use cases
* Create python_examples.rst

---------

Co-authored-by: Will Sweet <[email protected]>
1 parent ed971de commit 25692cb

4 files changed

+48
-196
lines changed

README.md

Lines changed: 1 addition & 24 deletions
@@ -32,26 +32,6 @@ For their corresponding connection strings, check out our [detailed table](https
3232
#### Looking for a database not on the list?
3333
If a database is not on the list, we'd still love to support it. [Please open an issue](https://github.com/datafold/data-diff/issues) to discuss it, or vote on existing requests to push them up our todo list.
3434

35-
## Use cases
36-
37-
### Diff Tables Between Databases
38-
#### Quickly identify issues when moving data between databases
39-
40-
<p align="center">
41-
<img alt="diff2" src="https://user-images.githubusercontent.com/1799931/196754998-a88c0a52-8751-443d-b052-26c03d99d9e5.png" />
42-
</p>
43-
44-
### Diff Tables Within a Database
45-
#### Improve code reviews by identifying data problems you don't have tests for
46-
<p align="center">
47-
<a href=https://www.loom.com/share/682e4b7d74e84eb4824b983311f0a3b2 target="_blank">
48-
<img alt="Intro to Diff" src="https://user-images.githubusercontent.com/1799931/196576582-d3535395-12ef-40fd-bbbb-e205ccae1159.png" width="50%" height="50%" />
49-
</a>
50-
</p>
51-
52-
&nbsp;
53-
&nbsp;
54-
5535
## Get started
5636

5737
### Installation
@@ -126,10 +106,7 @@ In both code examples, I've used `<>` angle brackets to represent values that **should
126106

127107
### We're here to help!
128108

129-
We know that in some cases, the data-diff command can become long and dense. And maybe you're new to the command line.
130-
131-
* We're here to help [on slack](https://getdbt.slack.com/archives/C03D25A92UU) if you have ANY questions as you use `data-diff` in your workflow.
132-
* You can also post a question in [GitHub Discussions](https://github.com/datafold/data-diff/discussions).
109+
We're here to help! Please post any questions in [GitHub Discussions](https://github.com/datafold/data-diff/discussions).
133110

134111
## How to Use
135112

docs/how-to-use.md

Lines changed: 0 additions & 159 deletions
This file was deleted.

docs/index.rst

Lines changed: 3 additions & 13 deletions
@@ -4,22 +4,12 @@
44
:hidden:
55

66
python-api
7+
python_examples
78

89
data-diff
910
---------
1011

11-
**Data-diff** is a command-line tool and Python library to efficiently diff
12-
rows across two different databases.
13-
14-
⇄ Verifies across many different databases (e.g. *PostgreSQL* -> *Snowflake*) !
15-
16-
🔍 Outputs diff of rows in detail
17-
18-
🚨 Simple CLI/API to create monitoring and alerts
19-
20-
🔥 Verify 25M+ rows in <10s, and 1B+ rows in ~5min.
21-
22-
♾️ Works for tables with 10s of billions of rows
12+
**Data-diff** is a command-line tool and Python library for comparing tables in and across databases.
2313

2414
For more information, `see our README <https://github.com/datafold/data-diff#readme>`_.
2515

@@ -32,4 +22,4 @@ Resources
3222
- :doc:`python-api`
3323
- The rest of the `documentation`_
3424

35-
.. _documentation: https://docs.datafold.com/os_diff/about/
25+
.. _documentation: https://docs.datafold.com/guides/os_data_diff

docs/python_examples.rst

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
Python API Examples
2+
-------------------
3+
4+
**Example 1: Diff tables in PostgreSQL and MySQL**
5+
6+
.. code-block:: python
7+
# Optional: Set logging to display the progress of the diff
8+
import logging
9+
logging.basicConfig(level=logging.INFO)
10+
11+
from data_diff import connect_to_table, diff_tables
12+
13+
table1 = connect_to_table("postgresql:///", "table_name", "id")
14+
table2 = connect_to_table("mysql:///", "table_name", "id")
15+
16+
for different_row in diff_tables(table1, table2):
17+
    plus_or_minus, columns = different_row
18+
    print(plus_or_minus, columns)
19+
20+
21+
**Example 2: Connect to Snowflake using dictionary configuration**
22+
23+
.. code-block:: python
24+
SNOWFLAKE_CONN_INFO = {
25+
    "driver": "snowflake",
26+
    "user": "erez",
27+
    "account": "whatever",
28+
    "database": "TESTS",
29+
    "warehouse": "COMPUTE_WH",
30+
    "role": "ACCOUNTADMIN",
31+
    "schema": "PUBLIC",
32+
    "key": "snowflake_rsa_key.p8",
33+
}
34+
35+
snowflake_table = connect_to_table(SNOWFLAKE_CONN_INFO, "table_name")  # Uses "id" as the key column by default
36+
37+
Run ``help(connect_to_table)`` and ``help(diff_tables)``, or read our API reference, to learn more about the available options:
38+
39+
- connect_to_table_
40+
41+
- diff_tables_
42+
43+
.. _connect_to_table: https://data-diff.readthedocs.io/en/latest/python-api.html#data_diff.connect_to_table
44+
.. _diff_tables: https://data-diff.readthedocs.io/en/latest/python-api.html#data_diff.diff_tables
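
The two examples in `docs/python_examples.rst` stop at printing each differing row and at hard-coding credentials. The following is a minimal sketch of how they might be extended, not part of this commit. It assumes that `diff_tables` yields `(sign, row)` pairs as Example 1 unpacks them, and it uses a stand-in list in place of a real diff so that it runs without a database. The helper names `snowflake_conn_info` and `summarize_diff` and the `SNOWFLAKE_*` environment variable names are hypothetical, chosen only for illustration.

```python
import os
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple


def snowflake_conn_info(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build a dictionary shaped like SNOWFLAKE_CONN_INFO in Example 2.

    The SNOWFLAKE_* variable names are hypothetical; the dictionary keys
    are the ones shown in Example 2.
    """
    return {
        "driver": "snowflake",
        "user": env["SNOWFLAKE_USER"],
        "account": env["SNOWFLAKE_ACCOUNT"],
        "database": env["SNOWFLAKE_DATABASE"],
        "warehouse": env["SNOWFLAKE_WAREHOUSE"],
        "role": env["SNOWFLAKE_ROLE"],
        # Fall back to the schema used in Example 2 when none is set.
        "schema": env.get("SNOWFLAKE_SCHEMA", "PUBLIC"),
        "key": env["SNOWFLAKE_KEY_PATH"],
    }


def summarize_diff(diff: Iterable[Tuple[str, tuple]]) -> Dict[str, int]:
    """Count differing rows per sign, assuming (sign, row) pairs as in Example 1."""
    counts = Counter(sign for sign, _row in diff)
    return {"+": counts.get("+", 0), "-": counts.get("-", 0)}


# Stand-in for diff_tables(table1, table2); the rows are made up for this sketch.
sample_diff = [("-", ("1", "alice")), ("+", ("1", "alicia")), ("+", ("2", "bob"))]
summary = summarize_diff(sample_diff)
print(summary)  # {'+': 2, '-': 1}
```

With a real connection, the result of `snowflake_conn_info()` would be passed to `connect_to_table` in place of the literal dictionary, and `summarize_diff` would receive the iterator returned by `diff_tables`. Reading credentials from the environment keeps them out of source control; counting by sign gives a quick check before inspecting individual rows.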
