Remove NLTK dependency #85

feanil · 2024-07-15T18:19:56Z

An LMS web worker process uses over 300 MB of RAM once initialized. Over 10% of that comes from loading nltk, which we use in only one place: to parse chemical equation inputs in ProblemBlocks (see chemcalc.py in the openedx-chem repo).
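For anyone who wants a rough, local feel for what an import costs, here is a minimal sketch using the standard-library `tracemalloc`. It is not the method behind the numbers above. It only counts allocations made through Python's allocator, so it will not match process RSS, and the helper name `import_cost_bytes` is made up for illustration.

```python
import importlib
import sys
import tracemalloc


def import_cost_bytes(module_name: str) -> int:
    """Approximate bytes of Python-level memory retained by importing a module.

    Hypothetical helper for illustration. The result is only meaningful when
    the module has not been imported yet in the current process.
    """
    if module_name in sys.modules:
        raise RuntimeError(
            f"{module_name} is already imported; measure in a fresh process"
        )
    tracemalloc.start()
    try:
        # Traced memory currently in use, before and after the import.
        before, _peak = tracemalloc.get_traced_memory()
        importlib.import_module(module_name)
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

Calling `import_cost_bytes("nltk")` in a fresh interpreter gives a lower bound on the import's footprint; memory allocated by C extensions outside Python's allocator is not counted.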

It's not clear why the grammar is specified using NLTK instead of pyparsing. It could have been to get around some limitation that pyparsing had twelve years ago, or the author may simply have been more familiar with NLTK and able to hack the code out faster that way.

Task: Remove our dependency on NLTK by changing the parser implementation in this repo. It would likely involve some digging, and exact backwards compatibility would be extremely important.
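One way to keep the backwards-compatibility requirement honest is to run the old and new parsers side by side over a corpus of inputs and flag any difference, including differences in which inputs raise errors. The sketch below is only an illustration: `find_mismatches` and `_outcome` are hypothetical names, and the two parser callables are placeholders for the existing NLTK-based parser and its replacement. If the existing parser returns NLTK tree objects, they would first need converting to something comparable (nested tuples, for example).

```python
from typing import Callable, Iterable, List, Tuple

Outcome = Tuple[str, object]


def _outcome(parser: Callable[[str], object], text: str) -> Outcome:
    """Run a parser and capture either its result or the exception type."""
    try:
        return ("ok", parser(text))
    except Exception as exc:
        # Which inputs fail, and how, is part of the behaviour to preserve.
        return ("error", type(exc).__name__)


def find_mismatches(
    old_parser: Callable[[str], object],
    new_parser: Callable[[str], object],
    inputs: Iterable[str],
) -> List[Tuple[str, Outcome, Outcome]]:
    """Return (input, old_outcome, new_outcome) for every input that differs."""
    mismatches = []
    for text in inputs:
        old = _outcome(old_parser, text)
        new = _outcome(new_parser, text)
        if old != new:
            mismatches.append((text, old, new))
    return mismatches
```

An empty result over a large set of real learner inputs would be reasonable evidence that the rewrite preserves behavior, though not proof.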

More details about the process of getting this info: https://discuss.openedx.org/t/reducing-memory-usage-nltk/13406

ormsbee added the performance label (relates to improving latency/throughput or reducing resource usage) · Aug 5, 2024
