
Abnormally high memory usage in mainframe on staging #311

Open
jonathan-d-zhang opened this issue Aug 19, 2024 · 3 comments

@jonathan-d-zhang
Contributor

Prod mainframe uses ~100MiB while staging uses ~250MiB. I suspect this is due to SQLAlchemy caching the distributions field that was recently added.

@jonathan-d-zhang jonathan-d-zhang self-assigned this Aug 19, 2024
@jonathan-d-zhang
Contributor Author

jonathan-d-zhang commented Aug 22, 2024

SQLAlchemy's identity map would be garbage-collected as soon as the session is closed, so I don't think my original guess is correct. I also found https://docs.pydantic.dev/latest/concepts/json/#caching-strings, but by a quick estimate, a completely full cache of 63-character strings would amount to at most ~2 MiB.
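For reference, here's roughly how that estimate works out. The cache capacity below is an assumption for illustration (the docs linked above don't pin down an exact number), so treat the result as an upper-bound sanity check rather than a measurement:

```python
# Back-of-envelope estimate for the Pydantic string-cache theory.
# ASSUMED_CACHE_ENTRIES is a hypothetical cap, not a documented value.
import sys

ASSUMED_CACHE_ENTRIES = 16_384           # assumed number of cached strings
string_size = sys.getsizeof("x" * 63)    # ~112 bytes for a 63-char ASCII str in CPython

total = ASSUMED_CACHE_ENTRIES * string_size
print(f"~{total / 2**20:.1f} MiB for a completely full cache")
```

Either way it comes out well under the ~150 MiB gap between prod and staging, so the string cache alone can't explain it.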

Also, when attempting to increase the memory usage by requesting a large time span with GET /package?since=<30 minutes ago> (to simulate the situation described in vipyrsec/bot#255), the memory usage jumped up to ~350MiB, but did not go down even after a few minutes. This leads me to believe that mainframe is somehow holding onto the memory used to build the response. I'm using memray to test.
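In case it helps anyone reproduce the measurement, this is roughly how the capture can be taken with memray's Python API (the module path and port are placeholders, not mainframe's actual entrypoint):

```python
# Sketch: profile allocations while the API serves requests.
# "mainframe.server:app" and port 8000 are placeholders for the real entrypoint.
import memray
import uvicorn

with memray.Tracker("mainframe-staging.bin", native_traces=True):
    uvicorn.run("mainframe.server:app", host="0.0.0.0", port=8000)

# Then render the capture with:
#   memray flamegraph mainframe-staging.bin
```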

@Robin5605
Contributor

> Also, when attempting to increase the memory usage by requesting a large time span with GET /package?since=<30 minutes ago> (to simulate the situation described in vipyrsec/bot#255), the memory usage jumped up to ~350MiB, but did not go down even after a few minutes. This leads me to believe that mainframe is somehow holding onto the memory used to build the response. I'm using memray to test.

After some investigation, it doesn't seem like this issue is necessarily introduced by this PR. I ran memray against the main branch, requested GET /package?since=<3 days ago> to produce a very large response body, and saw this memory usage graph:
[memray memory usage graph]

A good chunk of memory is still being held onto.
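For completeness, the request that produces the large response body looks roughly like this (the base URL and the timestamp format of `since` are assumptions; adjust to whatever the endpoint actually expects):

```python
# Exercise GET /package with a wide time window to force a large response body.
import time

import httpx

since = int(time.time()) - 3 * 24 * 60 * 60   # "3 days ago" as epoch seconds (assumed format)
resp = httpx.get("http://localhost:8000/package", params={"since": since}, timeout=60.0)
print(resp.status_code, len(resp.content), "bytes in body")
```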

@Robin5605
Contributor

[memray memory usage graph]

This shows about 91 MB of memory still being retained well after the response completed. I'm unsure why it's being held onto.
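One way to narrow this down (a diagnostic sketch, not a fix): check whether the retained memory is still reachable from Python objects, or is freed memory that glibc simply hasn't returned to the OS. This is Linux/glibc-specific:

```python
# Run inside the server process (e.g. from a temporary debug endpoint) after the
# large response has completed.
#  - RSS drops after gc.collect():      objects were still referenced somewhere
#  - RSS drops only after malloc_trim:  glibc was holding freed heap pages
#  - RSS drops after neither:           something is genuinely retaining the memory
import ctypes
import gc


def rss_kib() -> int:
    """Current resident set size in KiB, read from /proc (Linux only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return -1


print("before:     ", rss_kib(), "KiB")
gc.collect()                                  # collect unreachable Python objects
print("after gc:   ", rss_kib(), "KiB")
ctypes.CDLL("libc.so.6").malloc_trim(0)       # ask glibc to release free pages to the OS
print("after trim: ", rss_kib(), "KiB")
```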

Labels
None yet
Projects
Status: 🏗 In progress
Development

No branches or pull requests

2 participants