
Handle outOfMemory errors and exceptions #32

Open
mjbriggs opened this issue Aug 15, 2019 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@mjbriggs
Collaborator

When an individual server instance has to load more than three parser models, the server throws an OutOfMemoryError. On the user's end, all they see is a web page that is stuck loading documents. There should be a way for us to handle such situations gracefully. Ideally we would find a way to work around this project's high memory needs.
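One way to keep the front end from hanging forever is to catch the error at the model-loading boundary and report the failure to the caller. A minimal sketch of that idea; `loadParserModel` and the class name are hypothetical stand-ins, not actual FLAIR APIs:

```java
// Hypothetical sketch: loadParserModel is a placeholder, not a real FLAIR API.
import java.util.Optional;

public class SafeModelLoad {
    // Try to load a large model; on OutOfMemoryError, report failure
    // instead of letting the client request hang indefinitely.
    static Optional<Object> tryLoad(String language) {
        try {
            return Optional.of(loadParserModel(language));
        } catch (OutOfMemoryError e) {
            // The model did not fit; tell the caller so the front end
            // can show an error instead of loading forever.
            System.out.println("Model for " + language + " could not be loaded: out of memory");
            return Optional.empty();
        }
    }

    // Stand-in for the real model-loading code.
    static Object loadParserModel(String language) {
        if (language.equals("huge")) {
            throw new OutOfMemoryError("simulated");
        }
        return new Object();
    }

    public static void main(String[] args) {
        System.out.println("arabic loaded: " + tryLoad("arabic").isPresent());
        System.out.println("huge loaded: " + tryLoad("huge").isPresent());
    }
}
```

Catching `OutOfMemoryError` is only safe right at the allocation site like this; by the time it propagates further, the JVM may be in an unrecoverable state.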

@mjbriggs mjbriggs added the bug Something isn't working label Aug 15, 2019
@mjbriggs
Collaborator Author

Even if we can beef up our deployment server enough to handle the high memory usage, I still think this is worthwhile if we want individual developers to test their code by performing multiple website searches.

@reynoldsnlp
Owner

I agree this would be good.

Models are only loaded once they are requested, so if you only want to work on the Arabic part of the app, you can keep your memory requirements low by not querying English or Russian, for example. I could imagine the possibility of unloading a model to try a different language, but I don't know how feasible or worthwhile that would be.

Regardless, the front end should be sensitive to these kinds of failures whenever possible.
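The load-on-request plus unload idea could be sketched as a small bounded cache: keep at most N models in memory and drop the least recently used one when a new language is requested, letting the garbage collector reclaim its memory. A hypothetical sketch (the real model objects and loading code are stand-ins):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a small LRU cache that keeps at most maxModels
// parser models in memory, unloading the least recently used one.
public class ModelCache {
    private final LinkedHashMap<String, Object> models;

    ModelCache(final int maxModels) {
        // accessOrder=true: iteration order runs least- to most-recently used
        this.models = new LinkedHashMap<String, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                // Dropping the reference lets the GC reclaim the model's memory.
                return size() > maxModels;
            }
        };
    }

    Object get(String language) {
        Object model = models.get(language);
        if (model == null) {
            model = new Object();  // stand-in for actually loading the model
            models.put(language, model);  // put() triggers the LRU eviction check
        }
        return model;
    }

    int loadedCount() { return models.size(); }

    public static void main(String[] args) {
        ModelCache cache = new ModelCache(2);
        cache.get("english");
        cache.get("russian");
        cache.get("arabic");  // evicts "english", the least recently used
        System.out.println("models in memory: " + cache.loadedCount());
    }
}
```

Whether this is worthwhile depends on how long a model takes to load from disk; evicting and reloading on every language switch could make things slower rather than safer.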

@mjbriggs
Collaborator Author

mjbriggs commented Aug 22, 2019

I have not done any work on this issue. I will say that the integration tests are able to successfully load each model when testing web search operations. This works because JUnit creates a new test-class instance for each @Test method, so the running process only uses one model at a time and then the memory is freed. Although in its current state FLAIR doesn't handle these exceptions well, we can use JUnit to test different languages without a problem.
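The lifecycle described above can be imitated in plain Java to see why memory stays bounded: each "test" runs on a fresh instance, so the model loaded by the previous run becomes unreachable and eligible for collection. A hypothetical sketch, not actual FLAIR test code:

```java
public class PerTestLifecycle {
    Object model;  // one model per instance, like a field in a JUnit test class

    void testSearch(String language) {
        model = new byte[16];  // stand-in for loading a parser model
        System.out.println("ran search test for " + language);
    }

    public static void main(String[] args) {
        // Like JUnit 4: a fresh instance per test method, so only one
        // model is strongly reachable at any time.
        for (String lang : new String[]{"english", "russian", "arabic"}) {
            new PerTestLifecycle().testSearch(lang);
            // the previous instance (and its model) is now eligible for GC
        }
    }
}
```

This is also why the same languages can exhaust memory in the server: a long-lived server process keeps every loaded model strongly reachable at once, unlike the per-test instances here.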

@reynoldsnlp
Owner

I was just looking at the jar files for the NLP models, and all told they take up about 3.7 GB on disk, although they will probably be bigger when uncompressed into memory:

$ find . -name "*.jar" | xargs du -ach
3.5M	./stanford-parser/3.4.1/stanford-parser-3.4.1.jar
8.0K	./stanford-parser/2.0.2/stanford-parser-2.0.2.jar
2.7M	./stanford-parser/3.2.0/stanford-parser-3.2.0.jar
511M	./stanford-corenlp-russian-models/master-SNAPSHOT/stanford-corenlp-russian-models-master-SNAPSHOT.jar
8.0K	./stanford-parser-models/2.0.2/stanford-parser-models-2.0.2.jar
9.9M	./stanford-corenlp/master-SNAPSHOT/stanford-corenlp-master-SNAPSHOT.jar
64M	./stanford-corenlp/3.9.2/stanford-corenlp-3.9.2-models-arabic.jar
177M	./stanford-corenlp/3.9.2/stanford-corenlp-3.9.2-models-german.jar
8.7M	./stanford-corenlp/3.9.2/stanford-corenlp-3.9.2.jar
993M	./stanford-corenlp/3.9.2/stanford-corenlp-3.9.2-models-english.jar
991M	./stanford-corenlp/3.8.0/stanford-corenlp-3.8.0-models-english.jar
124M	./stanford-corenlp/3.8.0/stanford-corenlp-3.8.0-models-german.jar
7.7M	./stanford-corenlp/3.8.0/stanford-corenlp-3.8.0.jar
4.5M	./stanford-corenlp/3.2.0/stanford-corenlp-3.2.0.jar
867M	./parser-models/2015-12-11/parser-models-2015-12-11.jar
3.7G	total

We need to profile this to see what is taking up so much space. Maybe something like VisualVM (there's a Tomcat example here)? Or maybe YourKit, which has a free license for open-source projects? Something that will tell us exactly which objects are taking up how much space.

Maybe the models really are just that big, but it would be nice to be sure of that. ;-p
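Short of a full profiler, a rough per-model footprint can be estimated from `Runtime` by measuring used heap before and after a load. A coarse sketch (the 50 MB byte array is a stand-in for a real model, and GC timing makes the numbers approximate):

```java
public class HeapDelta {
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.gc();  // encourage a collection so the baseline is steadier
        long before = usedHeap();
        byte[] model = new byte[50 * 1024 * 1024];  // stand-in for loading a model
        long after = usedHeap();
        System.out.println("approx. model footprint: "
                + (after - before) / (1024 * 1024) + " MB");
        if (model == null) { throw new AssertionError(); }  // keep the reference alive
    }
}
```

For a precise object-by-object breakdown, a heap-dump tool like VisualVM is still the right answer; this only gives a ballpark total per load.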
