I'm using this project to split the .sql file so that the pg_dump output is in an order that lets backup programs deduplicate against existing data.
The dumped file is more than 1000 GB, which is fairly large. I suspect the data is sorted in memory, so it's easy to run out of RAM.
You're right, the data for individual tables from COPY SQL commands is sorted in memory. The total size of the SQL file doesn't matter as long as individual tables fit in RAM.
One way to solve this on Unix-like systems would be to pipe COPY data into the Unix sort command instead of sorting Python lists in memory (a rough sketch of that idea is below). I'd be fine with that approach, but I guess it would still leave Windows users out of luck, and we'd need separate code paths for different operating systems.
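
To illustrate, here is a minimal sketch of piping data through the external `sort(1)` command, assuming a Unix-like system with `sort` on the PATH. The function name `sort_lines_with_unix_sort` is hypothetical and not part of pg_dump_splitsort; it just shows how the table data could be sorted without ever holding it all in Python memory:

```python
import os
import subprocess
import tempfile


def sort_lines_with_unix_sort(lines):
    """Sort text lines with the external Unix sort(1) command.

    Hypothetical sketch: spools the unsorted lines to a temporary file,
    then streams the sorted output back, so the data never has to fit
    in Python's memory at once.
    """
    # Write the unsorted lines to a temp file first; this avoids any
    # pipe-buffer deadlock between feeding stdin and reading stdout.
    with tempfile.NamedTemporaryFile(
        mode="w", delete=False, suffix=".unsorted"
    ) as tmp:
        tmp.writelines(lines)
        tmp_path = tmp.name
    try:
        proc = subprocess.Popen(
            ["sort", tmp_path],
            stdout=subprocess.PIPE,
            # LC_ALL=C gives a stable byte-wise order, independent of locale.
            env={**os.environ, "LC_ALL": "C"},
            text=True,
        )
        for line in proc.stdout:  # stream sorted lines as they arrive
            yield line
        if proc.wait() != 0:
            raise RuntimeError(f"sort exited with status {proc.returncode}")
    finally:
        os.unlink(tmp_path)


# Usage idea: feed one table's COPY data lines and write them back out sorted.
# for line in sort_lines_with_unix_sort(copy_lines):
#     output_file.write(line)
```

The temporary-file spool trades some disk I/O for bounded memory use; `sort` itself already does external merge sorting on disk when its input exceeds available RAM.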
Also, the intended use for pg_dump_splitsort is to allow storing database dumps efficiently in version control. Storing a terabyte's worth of SQL in Git definitely doesn't make sense – @oldcai did you have another use case in mind here?