High Memory Usage #40

Open · azurefreecovid opened this issue Jun 9, 2020 · 3 comments

azurefreecovid commented Jun 9, 2020

Hi there,

Love this library, just found it and it seems to work exactly as I want, except for one issue. It just ran my box out of memory and caused it to crash.

I'm trying to check a fairly large batch of files (about 3.7 TB, or 1,206,600 files), and bitrot really chews through RAM, causing my box to crash. All up it seems to need 4.3 GB of RAM to run, which does seem like a lot.

I'd prefer not to split my checks into multiple smaller sets if at all possible, but obviously I can't have my system crashing.

Any ideas on what I can do to fix this issue?

My system is:
AMD64
Debian Stretch
Python 3.8

azurefreecovid changed the title from "Memory Usage Seems High" to "High Memory Usage" on Jun 9, 2020
ambv (Owner) commented Jun 11, 2020

Sadly I don't think this is actionable for our little project.

Honestly, plowing through 3.7 TB of data was never my intended use case for this. The problem is likely not the size of the data but the sheer number of files. There have been performance updates over the years that brought some data into memory so we don't have to reach for it in our SQLite database all the time. That makes things much faster but requires more memory.

I think that if we removed this optimization, you'd get lower memory usage, but you would wait a very long time for the checksums to be calculated, making the tool useless regardless.
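As a rough way to see this tradeoff in practice, one can compare the tool's peak resident memory against the size of the on-disk index it caches. A minimal sketch, assuming GNU time is installed, the bitrot command is on PATH, and the database sits in its default .bitrot.db location at the root of the checked tree:

# Peak memory of one run (GNU time, not the shell builtin):
/usr/bin/time -v bitrot 2>&1 | grep 'Maximum resident set size'

# Size of the on-disk SQLite index that gets cached:
ls -lh .bitrot.db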

If a process using too much memory is causing your entire box to crash, you should check what's up with that. It's a userspace application; this shouldn't happen no matter what it does.

azurefreecovid (Author) commented

Thanks for the reply, really appreciate it.

> Sadly I don't think this is actionable for our little project.

Totally understand, no problems at all.

> Honestly, plowing through 3.7 TB of data was never my intended use case for this. The problem is likely not the size of the data but the sheer number of files. There have been performance updates over the years that brought some data into memory so we don't have to reach for it in our SQLite database all the time. That makes things much faster but requires more memory.
>
> I think that if we removed this optimization, you'd get lower memory usage, but you would wait a very long time for the checksums to be calculated, making the tool useless regardless.
>
> If a process using too much memory is causing your entire box to crash, you should check what's up with that. It's a userspace application; this shouldn't happen no matter what it does.

Unfortunately, it is a feature of Linux-based systems. When the system runs out of RAM, the kernel starts killing things, hopefully the right things. On headless boxes where your web GUIs and other system access run in Docker containers, those containers can end up being killed, and hence the box crashes (at least as far as the user is concerned) and has to be power-cycled to recover.
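As an aside, you can usually confirm after the fact that the OOM killer was responsible, and make critical processes less likely to be picked. A sketch; the PID, score, and image name below are illustrative:

# Check the kernel log for past OOM kills:
dmesg -T | grep -i 'killed process'

# Make a critical process less attractive to the OOM killer
# (scores range from -1000, never kill, to 1000; 1234 is an illustrative PID):
echo -500 > /proc/1234/oom_score_adj

# Docker exposes the same knob per container (my-image is illustrative):
docker run --oom-score-adj=-500 my-image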

Thanks again for the software. I think I have a solution (as of this afternoon), which is to give the box some swap space, which is essentially writing the SQLite data back to disk, but in a much less efficient manner. Anyway, I think it will solve the problem, and speed is not a concern for me (if it takes a week to run, that is fine), so it should be OK.
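For anyone landing here later, creating a swap file on a Debian box looks roughly like this (a sketch to be run as root; the 8G size and /swapfile path are illustrative):

fallocate -l 8G /swapfile   # reserve space (use dd instead on filesystems without fallocate support)
chmod 600 /swapfile         # swap files must not be readable by other users
mkswap /swapfile            # write swap area metadata
swapon /swapfile            # enable it immediately
echo '/swapfile none swap sw 0 0' >> /etc/fstab   # enable on every boot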

Keep up the great work.

ambv (Owner) commented Jun 13, 2020

> Unfortunately, it is a feature of Linux-based systems. When the system runs out of RAM, the kernel starts killing things, hopefully the right things. On headless boxes where your web GUIs and other system access run in Docker containers, those containers can end up being killed, and hence the box crashes (at least as far as the user is concerned) and has to be power-cycled to recover.

I'm aware of the OOM killer. Big companies like Facebook and Google disable it in their fleets because it is unpredictable. You can do it too:

# echo 2 > /proc/sys/vm/overcommit_memory
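
Note that writing to /proc only lasts until the next reboot. To make the setting persistent you would typically go through sysctl; a sketch, assuming a Debian system with /etc/sysctl.d support (the file name is arbitrary):

echo 'vm.overcommit_memory = 2' > /etc/sysctl.d/90-overcommit.conf
sysctl --system    # reload sysctl settings from all configuration files

# With overcommit_memory=2 the kernel caps committed memory at
# swap + overcommit_ratio percent of RAM (default 50), so oversized
# allocations fail with ENOMEM up front instead of triggering the
# OOM killer later.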
