Rewrite ODS support based on loxun XMLWriter module #244

bdauvergne · 2016-07-02T08:34:18Z

It uses constant memory and is a lot faster than the odf and odf3 packages, as the document is not built in memory prior to serialization. OpenDocument is a simple format that should not need many thousands of lines of code and gigabytes of memory to export a simple table of tens of thousands of lines.

A temporary file is needed because zipfile does not support streaming directly into it. If that is a problem, I can do it in memory with BytesIO, at the cost of slightly higher memory consumption.
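To illustrate the temporary-file approach described above, here is a minimal sketch, not the code from this PR. It assumes the standard library's `xml.sax.saxutils.XMLGenerator` as a stand-in for loxun's XMLWriter, and the function name `write_ods` and its parameters are hypothetical. Rows are written to `content.xml` one at a time, so memory use does not grow with the table size; the finished file is then added to the zip archive.

```python
import os
import tempfile
import zipfile
from xml.sax.saxutils import XMLGenerator

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
MIMETYPE = "application/vnd.oasis.opendocument.spreadsheet"

MANIFEST = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<manifest:manifest xmlns:manifest="%s" manifest:version="1.2">'
    '<manifest:file-entry manifest:full-path="/" manifest:media-type="%s"/>'
    '<manifest:file-entry manifest:full-path="content.xml" '
    'manifest:media-type="text/xml"/>'
    '</manifest:manifest>' % (MANIFEST_NS, MIMETYPE)
)


def write_ods(path, rows, sheet_name="Sheet1"):
    """Hypothetical helper: stream rows of strings into an ODS file."""
    # Stream content.xml to a temporary file, one row at a time.
    with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
        xml = XMLGenerator(tmp, encoding="utf-8")
        xml.startDocument()
        xml.startElement("office:document-content", {
            "xmlns:office": OFFICE,
            "xmlns:table": TABLE,
            "xmlns:text": TEXT,
            "office:version": "1.2",
        })
        xml.startElement("office:body", {})
        xml.startElement("office:spreadsheet", {})
        xml.startElement("table:table", {"table:name": sheet_name})
        for row in rows:
            xml.startElement("table:table-row", {})
            for value in row:
                xml.startElement("table:table-cell",
                                 {"office:value-type": "string"})
                xml.startElement("text:p", {})
                xml.characters(value)  # escapes &, < and >
                xml.endElement("text:p")
                xml.endElement("table:table-cell")
            xml.endElement("table:table-row")
        xml.endElement("table:table")
        xml.endElement("office:spreadsheet")
        xml.endElement("office:body")
        xml.endElement("office:document-content")
        xml.endDocument()
    try:
        with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
            # The mimetype entry must come first and be stored uncompressed.
            archive.writestr("mimetype", MIMETYPE,
                             compress_type=zipfile.ZIP_STORED)
            archive.writestr("META-INF/manifest.xml", MANIFEST)
            archive.write(tmp.name, "content.xml")
    finally:
        os.unlink(tmp.name)
```

Because `rows` can be any iterable, a generator can feed the writer without the whole table ever being held in memory, for example `write_ods("out.ods", ([str(i), "x"] for i in range(100000)))`.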

With the current implementation it's nearly impossible to export a 100,000-line table to ODS in a memory-constrained environment (a VM with 1 GB of memory).

chfw · 2018-08-03T22:06:49Z

bdauvergne · 2018-08-04T07:44:21Z

This code is new; I wrote it on my employer's (Entr'ouvert) time. It's freely inspired by this package (http://git.entrouvert.org/wcs.git/tree/wcs/qommon/ods.py), also from Entr'ouvert, which uses ElementTree and so does not have bounded memory consumption. For that you need a streaming, XmlWriter-like API.

chfw · 2018-08-04T08:03:07Z

Thanks for your reply.

I planned to copy your code to produce a specialised ods writer for pyexcel, as pyexcel-odsw. As you mentioned in this PR, odfpy and ezodf do not use constant memory when writing an ods file. I hope you will be OK with my copying.

For your information, messy-tables had a better-performing ods reader, and it inspired pyexcel-odsr. So your code is the missing piece of the puzzle that completes the ods story: performant writer + performant reader.

bdauvergne · 2018-08-04T09:06:06Z

No problem, just keep the copyright.

chfw · 2018-08-04T23:04:55Z

tablib/packages/ods.py

+        self.status = self.INSHEET
+        self.xmlwriter.endTag()
+
+    def add_cell(self, content, hint=None):


Just an observation here. It is not a bug or anything.

add_cell supports only unicode strings; it does not support other cell data types such as int or float.
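As a rough illustration of what typed cells could look like, here is a minimal sketch. It is not part of this PR: the helper name `cell_attributes` is hypothetical, and it only maps a Python value to the OpenDocument attributes a `table:table-cell` element would carry, leaving the actual XML writing to the caller.

```python
from datetime import date


def cell_attributes(value):
    """Hypothetical helper: return (attributes, display_text) for a cell."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        text = "true" if value else "false"
        return ({"office:value-type": "boolean",
                 "office:boolean-value": text}, text)
    # OpenDocument uses the "float" value type for all plain numbers.
    if isinstance(value, (int, float)):
        return ({"office:value-type": "float",
                 "office:value": repr(value)}, str(value))
    # datetime is a subclass of date, so both end up here.
    if isinstance(value, date):
        iso = value.isoformat()
        return ({"office:value-type": "date",
                 "office:date-value": iso}, iso)
    # Everything else is written as a string cell.
    return ({"office:value-type": "string"}, str(value))
```

The returned dictionary would be passed as the attributes of the cell's start tag, and the display text written into the `text:p` child.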

bdauvergne · 2018-08-06

Yep, small improvements are still possible. I would make them if I heard from the maintainer that integration is likely soon.

chfw · 2018-08-06

Sent an invitation to you.

frallain · 2018-08-29T11:53:14Z

@bdauvergne Why not add loxun to requirements.txt, since it is available on PyPI at https://pypi.org/project/loxun/, instead of copy-pasting the whole file into the tablib project?

bdauvergne · 2018-09-05T09:02:35Z

I just thought it was the tablib way. It contains (contained?) so many external dependencies; I did not know they were all not packaged on PyPI.

Rewrite ODS support based on loxun XMLWriter module

0070fce

It uses constant memory and is a lot faster than odf and odf3 packages.

Merge branch 'master' into develop

69265de

chfw reviewed Aug 4, 2018

chfw referenced this pull request in pyexcel/pyexcel-odsw Aug 4, 2018

✨ use loxun XMLWriter module and Entr'ouvert ods writer. This is take…

a6aea03

…n from https://github.com/kennethreitz/tablib/pull/244 and adapted for pyexcel
