Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Testing #10

Open
visch opened this issue Jul 17, 2022 · 2 comments
Open

Performance Testing #10

visch opened this issue Jul 17, 2022 · 2 comments

Comments

@visch
Copy link
Member

visch commented Jul 17, 2022

Setup a test for how long it takes to load

  1. 10,000 records
  2. 100,000 records
  3. 1,000,00 records

Compare these to:

  1. current target-postgres
  2. loading directly to postgres with a psql import directly

Programmatically would be best here

@sebastianswms
Copy link
Contributor

I have run the specified tests. The first few rows from the dummy database I used are provided below.

last_name|first_name |street_address                |email                      |index|_sdc_file_name                   |_sdc_line_number|_sdc_last_modified     |_sdc_extracted_at      |_sdc_received_at       |_sdc_batched_at        |_sdc_deleted_at|_sdc_sequence|_sdc_table_version|_sdc_sync_started_at|
---------+-----------+------------------------------+---------------------------+-----+---------------------------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+---------------+-------------+------------------+--------------------+
Butler   |Chelsea    |914 Davis Street              |[email protected]        |0    |/git/jsonl2postgres/data/1M.jsonl|               1|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.904|2023-11-24 21:46:52.904|               |1700862412905|                  |       1700862412850|
Garrison |Raymond    |66496 Pierce Burgs Apt. 300   |[email protected]     |1    |/git/jsonl2postgres/data/1M.jsonl|               2|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.905|2023-11-24 21:46:52.904|               |1700862412906|                  |       1700862412850|
Daniel   |Nancy      |276 Michael Oval              |[email protected]|2    |/git/jsonl2postgres/data/1M.jsonl|               3|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.906|2023-11-24 21:46:52.904|               |1700862412906|                  |       1700862412850|
Combs    |Christopher|314 Griffin Loaf              |[email protected]       |3    |/git/jsonl2postgres/data/1M.jsonl|               4|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.906|2023-11-24 21:46:52.904|               |1700862412907|                  |       1700862412850|
Perry    |Richard    |9258 Dawson Manors Apt. 575   |[email protected]         |4    |/git/jsonl2postgres/data/1M.jsonl|               5|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.907|2023-11-24 21:46:52.904|               |1700862412907|                  |       1700862412850|
Brennan  |Erica      |94620 Nicholas Avenue Apt. 600|[email protected]          |5    |/git/jsonl2postgres/data/1M.jsonl|               6|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.907|2023-11-24 21:46:52.904|               |1700862412907|                  |       1700862412850|
Molina   |Brandon    |9311 Juan Mission Suite 872   |[email protected]     |6    |/git/jsonl2postgres/data/1M.jsonl|               7|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.907|2023-11-24 21:46:52.904|               |1700862412908|                  |       1700862412850|
Pratt    |Alyssa     |7118 Michael Hollow Suite 103 |[email protected]  |7    |/git/jsonl2postgres/data/1M.jsonl|               8|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.907|2023-11-24 21:46:52.904|               |1700862412908|                  |       1700862412850|
Wood     |Robert     |884 Gaines Estates            |[email protected]      |8    |/git/jsonl2postgres/data/1M.jsonl|               9|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.908|2023-11-24 21:46:52.904|               |1700862412908|                  |       1700862412850|
Berry    |Kara       |20916 Paul Lights Apt. 284    |[email protected]     |9    |/git/jsonl2postgres/data/1M.jsonl|              10|2023-11-24 21:44:54.000|2023-11-24 21:46:52.688|2023-11-24 21:46:52.908|2023-11-24 21:46:52.904|               |1700862412909|                  |       1700862412850|

pg_dump and psql

time pg_dump -U postgres -t melty.users | psql -U postgres postgres2

10,000

real    0m0.081s
user    0m0.025s
sys     0m0.029s

100,000

real    0m0.294s
user    0m0.064s
sys     0m0.021s

1,000,000

real    0m2.242s
user    0m0.110s
sys     0m0.270s

MeltanoLabs/tap-postgres and MeltanoLabs/target-postgres:

10,000

real    0m6.899s
user    0m6.167s
sys     0m0.392s

100,000

real    0m37.223s
user    0m42.277s
sys     0m1.705s

1,000,000

real    5m43.368s
user    6m57.207s
sys     0m13.646s

transferwise/pipelinewise-tap-postgres and transferwise/pipelinewise-target-postgres:

10,000

real    0m3.147s
user    0m1.524s
sys     0m0.099s

100,000

real    0m12.502s
user    0m15.688s
sys     0m1.751s

1,000,000

real    1m38.532s
user    2m24.671s
sys     0m16.911s

@visch
Copy link
Member Author

visch commented Nov 28, 2023

I like what you did here, I think we should make some kind of script that we can run to test this. Ideally we'd probably make a whole performance infrastructure to test over time and track but that seems a bit overkill.

Could you make a script python or something, drop it in the ./scripts dir here maybe performance_testing.py or something that we can run that does all of these tests for us and outputs the results in a similar format.

I want to see what happens with the new performance update you made, and then it'd be nice to be able to "objectively" have folks run the test to see if we improved things or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants