- Refactor Table and Row caching, removing the useless `_caching` attribute from many classes.
- Remove module `scriptutils`.
- Refactoring of Table and Row caching.
- Replace in `container.py` and scripts `show.py` and `styles.py` the previous functions of `scriptutils`.
- `odfdo.scriptutils.py` removed.
- Allow XML export of base64 encoded images (preparing for flat ODF export).
- Update XML properties to ODF 1.2.
- Refactoring of Document.add_file() and export to XML format.
- The `Meta` class which manages the `meta.xml` part has two new methods `as_dict()` and `as_json()` to export its content.
- Improved "pretty" export of documents.
- Add methods: `Meta.as_dict()`, `Meta.as_json()`, `MetaTemplate.as_dict()`, `MetaAutoReload.as_dict()`, `MetaHyperlinkBehaviour.as_dict()`.
- Small XML file formatting changes when saving with "pretty=True".
Fix some small rendering issues for Markdown export.
- Better Markdown export for strike style, non break space, successive tags, line breaks, footnotes
- Change in `str(Paragraph)` which now includes a `'\n'` at the end of the string.
- The `odfdo-to-md` script is renamed to `odfdo-markdown` and should be functional. Markdown export of .odt files supports all standard formatting features (including tables) except quoted text (no clear semantic equivalent in the ODF standard).
- Improved `__str__` methods for many classes: Document.body, Paragraph, Span, Link, Unit, Note, Annotation.
- Some added methods: `Document.get_parent_style()`, `Document.get_list_style()`, `Style.get_list_style_properties()`, `Style.get_text_properties()`.
- The new `Element.inner_text` property is now the preferred way to access an element's inner text.
- Add methods: `Document.get_parent_style()`, `Document.get_list_style()`, `Style.get_list_style_properties()`, `Style.get_text_properties()`.
- Add property `Element.inner_text`.
- Script `odfdo-to-md` renamed to `odfdo-markdown`.
- `str(Paragraph)` now includes a `'\n'` at the end of the string.
- Output of the str method modified for many elements.
- New script `odfdo-to-md` to export a text document in Markdown format to stdout (experimental, does not export image links or tables).
- Fix `VarTime` initialization: the class can now be initialized without the previously mandatory time argument.
- Add script `odfdo-to-md`.
- `odfdo-folder` script now writes XML files with the "pretty" option by default.
- Fix `VarTime` initialization.
The HTML documentation in `/doc` (mostly auto generated) now contains all recipes, sorted by relevance.
- Improvement of documentation.
- Fix a bug of `Paragraph.set_span()` when using an offset argument of zero (the Span was not created).
- Added 3 methods related to searching strings in paragraphs: `search_first()`, `search_all()` and `text_at()`. These methods allow searching for a string with a regex in a paragraph and getting its position; `text_at()` returns the text content at a given position.
- Fix the "pretty" option of `Document.save()`. "pretty" is now the default for odfdo-folder.
Added a new recipe showing several methods to change the style of a paragraph or words in a paragraph with the use of `Paragraph.style = style.name` and `Paragraph.set_span()`.
- Added `Element.search_first()`, `Element.search_all()`, `Element.text_at()`.
- Added `change_paragraph_styles_or_spans.py` recipe (issue #21).
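The search methods listed above locate regex matches in a paragraph's text and read text back by position. A minimal sketch of that idea on a plain string, using only the standard `re` module, is shown here. These standalone functions are hypothetical stand-ins: their signatures and return shapes are assumptions and may differ from the actual `Element` methods.

```python
import re


def search_first(text, pattern):
    """Return the (start, end) position of the first regex match, or None."""
    match = re.search(pattern, text)
    return match.span() if match else None


def search_all(text, pattern):
    """Return the (start, end) positions of every non-overlapping match."""
    return [match.span() for match in re.finditer(pattern, text)]


def text_at(text, start, end=None):
    """Return the text between two positions (up to the end if `end` is None)."""
    return text[start:end]


content = "word1 word2 word3"
first = search_first(content, r"word\d")
every = search_all(content, r"word\d")
second_word = text_at(content, *every[1])
```

Returning positions rather than matched strings makes it possible to pass the result to a position-based call afterwards, for instance to style the matched range.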
- `odfdo-folder` script now writes XML files with the "pretty" option by default.
- Fix `Paragraph.set_span()` when using an offset argument of zero (issue #21).
- Fix the "pretty" option of `Document.save()` (issue #28).
Fix a performance bug on huge .ods tables when the number of rows is large (several thousand). See issue #46 for a table of about 83k rows. `Table.traverse()` on such a table is expected to take about 2 seconds.
- Rewrite the method Table.traverse().
- Fix the performance bug on huge .ods tables (issue #46).
Add support for Python 3.13 final in test suite.
- Add support for Python3.13 in tox.ini
Add support for Python 3.13.0.rc3 in test suite.
- Add support for Python3.13.0.rc3 in tox.ini, add requirement for lxml version 5.3 or higher for Python 3.13.
When creating a Document() allow alias "odt" for "Text", "ods" for "spreadsheet".
Add a recipe showing how to remove parts from a text document.
- Aliases "odt", "ods", "odp" and "odg" for Document creation.
- Add recipe `delete_parts_of_a_text_document.py`.
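The alias mechanism above can be pictured as a small lookup applied before the document type is resolved. The sketch below is illustrative only: the function name `normalize_document_type` is hypothetical, and the targets chosen for "odp" and "odg" are assumptions based on the usual ODF file extensions, not taken from the notes above.

```python
# Hypothetical alias table: "odt" and "ods" follow the release note,
# "odp" and "odg" targets are assumed from the usual ODF extensions.
ALIASES = {
    "odt": "text",
    "ods": "spreadsheet",
    "odp": "presentation",
    "odg": "drawing",
}


def normalize_document_type(name):
    """Map a file-extension alias to its document type name."""
    key = name.strip().lower()
    return ALIASES.get(key, key)
```

Names that are not aliases pass through unchanged, so existing calls using the full type name keep working.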
Two changes in this version:
- Fix of the broken `Table.displayed` property.
- Fix the way spaces are represented for better compliance with the ODF standard and word processors.
The `Table.displayed` property was broken and is removed. The functionality is replaced by the `Document.get_table_displayed` and `Document.set_table_displayed` methods. This change should not affect anyone since the previous implementation was unusable.
In previous versions, 3 spaces were translated into 1 space followed by `'<text:s text:c="2"/>'` unconditionally. However, the standard specifies that spaces at the beginning and end of a paragraph must be discarded by word processors, so in those positions 3 spaces should be coded `'<text:s text:c="3"/>'` and a single space as `'<text:s/>'`. This change should fix the bug of "disappearing" spaces at the beginning of paragraphs.
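The encoding rule described above can be sketched with the standard library alone. The helpers `encode_spaces` and `_spacer` are hypothetical names, not odfdo functions, and the handling of interior runs (one literal space followed by a `text:s` element for the rest) is an assumption drawn from the description of the previous behavior.

```python
import re
from xml.sax.saxutils import escape


def _spacer(count):
    """Build a text:s element standing for `count` spaces."""
    return "<text:s/>" if count == 1 else f'<text:s text:c="{count}"/>'


def encode_spaces(text):
    """Encode space runs so that leading and trailing spaces survive."""
    parts = []
    for match in re.finditer(r" +|[^ ]+", text):
        chunk = match.group()
        if chunk[0] != " ":
            parts.append(escape(chunk))
            continue
        at_edge = match.start() == 0 or match.end() == len(text)
        if at_edge:
            # Edge spaces would be discarded: encode every one of them.
            parts.append(_spacer(len(chunk)))
        elif len(chunk) == 1:
            parts.append(" ")
        else:
            # Interior run: one literal space, the rest as text:s.
            parts.append(" " + _spacer(len(chunk) - 1))
    return "".join(parts)
```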
- Methods `Document.get_table_displayed()`, `Document.set_table_displayed()`, `Document.get_table_style()`.
- The `Spacer()` class has 2 new properties: `Spacer.length` and `Spacer.text`.
- XML generation of spaces at beginning and end of Paragraph content.
- Update of dependency versions.
- `Table.displayed` property (removed).
- Fix the "disappearing" spaces at the beginning of paragraphs bug.
Changed the default behavior for appending text to a `Paragraph`: the behavior of the `Paragraph.append_plain_text()` method is now the default. A `formatted` argument is added, `True` by default, which applies the recognition of "\n", "\t" or a sequence of several spaces and converts them to ODF tags (`text:line-break`, `text:tab`, `text:s`). To ignore this text formatting, set `formatted=False`.
This change affects you if you create paragraphs from text containing line breaks or tabs and you don't want them to appear. In this case, add the argument `formatted=False`.
Details:
- `Paragraph("word1     word2")`
  - previous behavior:
    - product XML: `'<text:p>word1     word2</text:p>'`
    - expected display: `word1 word2` (single space, the ODF standard does not recognize space sequences)
  - new behavior:
    - product XML: `'<text:p>word1 <text:s text:c="4"/>word2</text:p>'`
    - expected display: `word1     word2` (5 spaces)
- `Paragraph("word1     word2", formatted=False)`
  - new behavior:
    - product XML: `'<text:p>word1 word2</text:p>'`
    - expected display: `word1 word2`
- `Paragraph("word1\nword2")`
  - previous behavior:
    - product XML: `'<text:p>word1\nword2</text:p>'`
    - expected display: `word1 word2` (single space, the ODF standard does not recognize "\n" in XML content)
  - new behavior:
    - product XML: `'<text:p>word1<text:line-break/>word2</text:p>'`
    - expected display: `word1` and `word2` on two separate lines
- `Paragraph("word1\nword2", formatted=False)`
  - new behavior:
    - product XML: `'<text:p>word1 word2</text:p>'`
    - expected display: `word1 word2`
On the same principle the `formatted` argument is available for `Paragraph.append(text)`, `Header(text)`, `Span(text)`.
The `Paragraph.append_plain_text(text)` method is retained for compatibility with previous versions and has the same behavior as `Paragraph.append(text, formatted=True)`, the default.
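The effect of the `formatted` argument can be approximated with a short standalone function. This is only a sketch of the conversion described above, not odfdo's implementation: the name `paragraph_xml` is hypothetical, and the special handling of spaces at paragraph edges is left out for brevity.

```python
import re
from xml.sax.saxutils import escape


def paragraph_xml(text, formatted=True):
    """Render text as a text:p element, optionally converting whitespace."""
    if not formatted:
        # Collapse any whitespace run (spaces, tabs, line breaks) to one space.
        collapsed = re.sub(r"\s+", " ", text)
        return "<text:p>" + escape(collapsed) + "</text:p>"
    parts = []
    # The capturing group keeps the separators in the split result.
    for token in re.split(r"(\n|\t| {2,})", text):
        if token == "\n":
            parts.append("<text:line-break/>")
        elif token == "\t":
            parts.append("<text:tab/>")
        elif token.startswith("  "):
            extra = len(token) - 1
            spacer = "<text:s/>" if extra == 1 else f'<text:s text:c="{extra}"/>'
            parts.append(" " + spacer)
        else:
            parts.append(escape(token))
    return "<text:p>" + "".join(parts) + "</text:p>"


five_spaces = "word1" + " " * 5 + "word2"
formatted_xml = paragraph_xml(five_spaces)
plain_xml = paragraph_xml(five_spaces, formatted=False)
```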
- `Paragraph()`, `Paragraph.append()` and subclasses `Header()` and `Span()` have a new `formatted` argument, True by default, that translates "\n", "\t" and multiple spaces into ODF format.
- Updating dependency versions.
- Fix parsing of Date and Datetime for a better compliance with ISO8601.
- Updating dependency versions.
- Fix datetime encoding/decoding for ISO8601 compliance and different Python versions.
- Move from strptime() to date.isoformat() for class Date and DateTime.
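As an illustration of the ISO 8601 approach mentioned above, the standard library can encode and decode dates without `strptime()`. The helper names below are hypothetical, and the explicit handling of a trailing `Z` is an assumption about one way to stay compatible with Python versions whose `fromisoformat()` does not accept it.

```python
from datetime import date, datetime, timezone


def decode_date(value):
    """Parse an ISO 8601 calendar date such as '2024-01-31'."""
    return date.fromisoformat(value)


def decode_datetime(value):
    """Parse an ISO 8601 date-time, accepting a trailing 'Z' for UTC."""
    if value.endswith("Z"):
        # Older Python versions reject 'Z', so rewrite it as an offset.
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def encode_datetime(value):
    """Serialize a datetime to ISO 8601 with second precision."""
    return value.isoformat(timespec="seconds")


stamp = datetime(2024, 1, 31, 10, 30, 0)
encoded = encode_datetime(stamp)
```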
- Update dependencies and test suite, support of `lxml` version 5.3.0.
- Updating dependency versions.
- Fix a type hint in element.py
- Fix missing .venv in gitconfig
- New script `odfdo-userfield` to show or set the user-field content in an ODF file.
- Add script `odfdo-userfield`.
- Updating dependency versions.
- Refactor to add property getter for some common methods. Original get_* method is still available and permits detailed requests with parameters.
  - Body.tables -> Body.get_tables()
  - Element.tocs -> Element.get_tocs()
  - Element.toc -> Element.get_toc()
  - Element.text_changes -> Element.get_text_changes()
  - Element.tracked_changes -> Element.get_tracked_changes()
  - Element.user_defined_list -> Element.get_user_defined_list()
  - Element.images -> Element.get_images()
  - Element.frames -> Element.get_frames()
  - Element.lists -> Element.get_lists()
  - Element.headers -> Element.get_headers()
  - Element.spans -> Element.get_spans()
  - Element.paragraphs -> Element.get_paragraphs()
  - Element.sections -> Element.get_sections()
  - Table.rows -> Table.get_rows()
  - Table.cells -> Table.get_cells()
  - Table.columns -> Table.get_columns()
  - Row.cells -> Row.get_cells()
  - Document.parts -> Document.get_parts()
  - Container.parts -> Container.get_parts()
- Refactor to add property getter/setter for some common methods. Original get_* and set_* methods are still available and permit detailed requests with parameters.
  - Column.default_cell_style -> Column.get/set_default_cell_style()
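The pattern behind this refactor, a property that delegates to the parameterless form of an existing `get_*` method, can be sketched as follows. `DemoElement` is a hypothetical stand-in that stores plain strings, and the `content` filter is an illustrative parameter rather than a statement about the real method signatures.

```python
class DemoElement:
    """Hypothetical element holding paragraphs as plain strings."""

    def __init__(self, paragraphs):
        self._paragraphs = list(paragraphs)

    def get_paragraphs(self, content=None):
        """Detailed request: optionally keep paragraphs containing `content`."""
        if content is None:
            return list(self._paragraphs)
        return [p for p in self._paragraphs if content in p]

    @property
    def paragraphs(self):
        """Shortcut equivalent to get_paragraphs() without arguments."""
        return self.get_paragraphs()


element = DemoElement(["alpha", "beta"])
```

Keeping the method next to the property preserves backward compatibility: the property covers the common case, while calls that need parameters continue to use the `get_*` form.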
- Added `Body.tables`
- Added `Element.tocs`
- Added `Element.toc`
- Added `Element.text_changes`
- Added `Element.tracked_changes`
- Added `Element.images`
- Added `Element.frames`
- Added `Element.lists`
- Added `Element.headers`
- Added `Element.spans`
- Added `Element.paragraphs`
- Added `Element.sections`
- Added `Column.default_cell_style`
- Added `Table.rows`
- Added `Table.cells`
- Added `Table.columns`
- Added `Row.cells`
- Added `Document.parts`
- Added `Container.parts`
- Refactor the Body access methods, creating a relevant Body class and related sub-classes. Moved some access methods from the Element class to the relevant Body sub-classes.
- Refactor metadata methods to permit access through @property (the legacy get_* and set_* methods are still available).
- Added a few metadata elements from the ODF standard (hyperlink-behaviour, auto-reload, template, print-date, printed-by)
- Added `MetaAutoReload` class
- Added `MetaHyperlinkBehaviour` class
- Added `MetaTemplate` class
- Added `DcCreatorMixin` class
- Added `DcDateMixin` class
- Added `Body` class
- Added `Chart` class
- Added `Database` class
- Added `Drawing` class
- Added `Image` class
- Added `Presentation` class
- Added `Spreadsheet` class
- Added `Text` class (renaming the previous internal `Text` class to `EText`)
Fix embedded chart analysis in documents, see recipe `change_values_of_a_chart_inside_a_document.py`.
- Added `change_values_of_a_chart_inside_a_document.py` recipe
- The "pretty" setting when saving the file always defaults to False. This setting should only be used for debugging purposes
- `meta.generator` can be used via a @property accessor
- (Internal change) move body() definition to xmlpart
- (Internal change) refactoring for future XML feature
- Fix parsing of Table when parent uses "table:table-rows" kind of wrapper
- Fix a bug when a Cell contains the valid 'NaN' Decimal number
Improvement of the `lxml` dependency support.
- Added a `CHANGES.md` file
- Automatic tests for ubuntu-latest, macos-latest, windows-latest
- Now supports a wider range of `lxml` versions:
  - python 3.9: lxml version 4.8.0 to 4.9.4
  - python 3.10: lxml version 4.8.0 to 5.1.1
  - python 3.11: lxml version 4.9.4 to 5.2.0 and beyond
  - python 3.12: lxml version 4.9.4 to 5.2.0 and beyond
- autogenerated documentation now uses `mkdocs`
- Use `sys.executable` to ensure all tests can pass in a github virtualenv on Windows.
- Remove import of `lxml` internal `_ElementUnicodeResult` and `_ElementUnicodeResult` classes.
Quick fix for the crash with new `lxml` version 5.1.1
- Fix crash with `lxml` 5.1.1 by restricting version to 5.1.0
Add the method `get_cell_background_color` to retrieve the background color of a cell in a table.
- Tables: some users need to easily access the background color of cells, including cells without "value" content. That required a complex parsing of styles. Hence a new method: `Document.get_cell_background_color(sheet_id, cell_coords)`.
- See the corresponding recipe `recipes/get_cell_background_color.py` for an example of usage.
- Tables: (related to previous). It is often useful to reduce the table size before working on it, especially if styles apply to whole rows. A method called `Table.rstrip()` already permitted removing empty bottom rows and empty right columns. However, a `Cell` may have no value but a style (background color for example), and `rstrip()` was removing such cells. So a new, smarter method is provided: `Table.optimize_width()`, which shrinks the table size while keeping styled empty cells.
- To test the actual result of this method, you can use the new script `odfdo-table-shrink`, which is basically a wrapper around this method. (Note: all of this aims to facilitate some features for the related github project `odsparsator`).
- `repr()` method for `Cell`, `Row` and `Column`.
- Ancillary methods related to above features.
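The difference between the two shrinking strategies described above can be shown on a single row. The sketch below uses a hypothetical `DemoCell` tuple and two illustrative functions; it does not reproduce the real `Table.rstrip()` or `Table.optimize_width()` implementations, only the rule that separates them.

```python
from typing import NamedTuple, Optional


class DemoCell(NamedTuple):
    """Hypothetical cell: a value and a style name, both optional."""

    value: Optional[str] = None
    style: Optional[str] = None


def strip_by_value(row):
    """Drop trailing cells without a value, even when they carry a style."""
    end = len(row)
    while end and row[end - 1].value is None:
        end -= 1
    return row[:end]


def strip_keep_styled(row):
    """Drop trailing cells only when they have neither value nor style."""
    end = len(row)
    while end and row[end - 1].value is None and row[end - 1].style is None:
        end -= 1
    return row[:end]


row = [DemoCell("a"), DemoCell(None, "yellow_background"), DemoCell()]
```

With this row, the value-only rule discards the styled cell, whereas the second rule keeps it and removes only the fully empty trailing cell.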
- `Document(path)` now accepts a `str` path starting with `~` as the path relative to the user home.
- Tables: (related to previous), change the `Cell.is_empty()` test. A cell is now considered as not empty if part of a `span` (a cell spanned on several rows or columns). This may induce some changes for parsing scripts. Before that, only the first cell of the span (which actually contains the value) was considered as non empty. Now other cells of the span are not empty (but contain a null value).
- Minor refactor of code, version updates of dependencies.
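The `~` handling mentioned above for `Document(path)` corresponds to the usual home-directory expansion. A minimal sketch with `pathlib` follows; the function name `resolve_user_path` is hypothetical and the library may implement the expansion differently.

```python
from pathlib import Path


def resolve_user_path(path):
    """Expand a leading '~' to the current user's home directory."""
    return Path(path).expanduser()


expanded = resolve_user_path("~/report.odt")
```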
Add a recipe as example of programmatically setting text styles for headers and paragraphs, with basic font and color properties.
- Add recipe `create_basic_text_styles`.
- All style fields related to color accept a color name from the CSS list of colors.
- Updating dependency versions.
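Accepting CSS color names in style fields amounts to translating a name into its hexadecimal value before it is written. The sketch below assumes a tiny excerpt of the CSS named colors and a hypothetical `to_hex_color` helper; the full list and the actual conversion code live in the library.

```python
# Small excerpt of the CSS named colors, for illustration only.
CSS_COLORS = {
    "black": "#000000",
    "white": "#ffffff",
    "red": "#ff0000",
    "navy": "#000080",
    "yellow": "#ffff00",
}


def to_hex_color(color):
    """Return a '#rrggbb' string for a CSS color name or a hex value."""
    name = color.strip().lower()
    if name.startswith("#"):
        return name
    if name not in CSS_COLORS:
        raise ValueError(f"unknown color name: {color!r}")
    return CSS_COLORS[name]
```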
Internal maintenance release.
- Fix logo link on `Pypi` page.
- Technical updates from `optparse` to `argparse`.
- Updating dependency versions.
Internal maintenance release.
- Use `pdoc` for autogenerated documentation.
- Refactor some recipes to use them in a test suite.
- Code refactor, updating dependency versions.
Minor performance improvement of script `odfdo-headers`.
- Use better algorithm for script `odfdo-headers`.
New script `odfdo-headers` to print the headers of an `ODF` file.
- Add script `odfdo-headers`.
- Updating dependency versions.
New script `odfdo-highlight` to highlight the text matching a pattern (regex) in an `ODF` file.
- Add script `odfdo-highlight`.
- Updating dependency versions.
Fix the update method of `Table of Content` and add a recipe to show how to update a `TOC`.
- Add recipe `update_a_text_document_with_a_table_of_content`.
- Refactor of TOC related code.
- Updating dependency versions.
2024 release, updated ODF templates and better test suite.
- Update `ODF` templates.
- Refactor many Python files for use of type hints.
- Updates for year 2024, updating dependency versions.
Update to `lxml` version 5.
- Update `lxml` from version 4 to 5.
Add script `odfdo-replace` to find a pattern (regex) in an `ODF` file and replace it by some string.
- Fix reading content from a `BytesIO`.
- Add script `odfdo-replace`.
Add recipes showing how to save/read document from `io.BytesIO`.
- Add recipes `read_document_from_bytesio.py` and `save_document_as_bytesio.py`.
- Refactoring of code.
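The BytesIO recipes mentioned above rely on the fact that an ODF file is a ZIP container, so it can live entirely in memory. The sketch below is not one of those recipes: it uses only the standard `zipfile` module, and the helper names `save_to_bytesio` and `read_from_bytesio` are hypothetical. It only illustrates the in-memory round trip.

```python
import io
import zipfile

MIMETYPE = "application/vnd.oasis.opendocument.text"


def save_to_bytesio(parts):
    """Write named parts into an in-memory ZIP container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry is written first and stored uncompressed.
        archive.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        for name, data in parts.items():
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    buffer.seek(0)  # rewind so the buffer can be read immediately
    return buffer


def read_from_bytesio(buffer):
    """Read every part of an in-memory ZIP container into a dict."""
    with zipfile.ZipFile(buffer) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


stream = save_to_bytesio({"content.xml": b"<demo/>"})
restored = read_from_bytesio(stream)
```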