Slow performance #39

bkuczenski · 2016-07-14T20:17:26Z

Hi, I've been using pylzma to handle large(ish) 7z files ranging from 50 MB to 1.0 GB compressed. I am trying to access individual files from the archive, one at a time, and I noticed that performance is highly variable and very slow compared to ZipFile.

Below I compared performance for two archives containing the same files (I created the ZIP by extracting the 7z file and recompressing it with zip):

http://nbviewer.jupyter.org/github/bkuczenski/lca-tools/blob/master/doc/7z%20profiling.ipynb

On the one hand, the ZIP file is almost 6x as large as the 7Z file; on the other hand, 7z access seems 10x-100x slower.

My question: is there a way for me to improve the performance of py7zlib? Is there a better way to use the archive to reference single files? Or is there a technical limitation that prevents this?

N.B. The performance is no different if I keep the archive open between successive retrievals. It is consistent for the same file over multiple trials (some are fast, others are slow; in this case all the files are about the same size, so that's not the issue).
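For reference, the per-file, multi-trial timing described above can be sketched roughly as follows. This is only an illustration, not the code from the linked notebook: the helper names `time_member_reads` and `build_demo_zip` are hypothetical, and a small in-memory ZIP stands in for the real archives so the snippet runs with the standard library alone.

```python
import io
import time
import zipfile


def time_member_reads(read_member, names, trials=3):
    """Return {name: [seconds per trial]} for calling read_member(name).

    read_member is any callable that takes a member name and returns its
    bytes, so the same harness can wrap different archive libraries.
    """
    timings = {}
    for name in names:
        samples = []
        for _ in range(trials):
            start = time.perf_counter()
            read_member(name)
            samples.append(time.perf_counter() - start)
        timings[name] = samples
    return timings


def build_demo_zip(n_files=3, size=100_000):
    """Build a small in-memory ZIP (a stand-in for the real archive)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for i in range(n_files):
            # Each member is `size` bytes of a single repeated letter.
            zf.writestr("file_%d.xml" % i, bytes([65 + i]) * size)
    buf.seek(0)
    return buf


# Keep the archive open across all retrievals, as in the test described above.
archive = zipfile.ZipFile(build_demo_zip())
timings = time_member_reads(archive.read, archive.namelist())
for name, samples in timings.items():
    print(name, ["%.6f" % s for s in samples])
```

For the 7z side, the same harness could presumably be given a callable such as `lambda name: archive7z.getmember(name).read()`, assuming `archive7z` is a `py7zlib.Archive7z` opened on the file; that part is not exercised here.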

Thanks for any feedback.

bkuczenski · 2016-08-29T18:43:09Z

This turns out to be due to high memory requirements.
