Error in latex_to_unicode #472

dlesbre · 2024-02-16T22:08:14Z

Describe the bug
The latex_to_unicode function can fail with a rather obscure type error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dorian/Programs/github/bibtex-autocomplete/venv/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 65, in latex_to_unicode
    string = _replace_all_latex(string, itertools.chain(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dorian/Programs/github/bibtex-autocomplete/venv/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 53, in _replace_all_latex
    string = _replace_latex(string, l.rstrip(), u)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dorian/Programs/github/bibtex-autocomplete/venv/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 35, in _replace_latex
    if unicodedata.combining(unicod):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: combining() argument must be a unicode character, not str

The problem is most likely due to lines like the third one below, where the encoding isn't a single Unicode character:

python-bibtexparser/bibtexparser/latexenc.py

Lines 941 to 945 in e4c6eb6

    ("\u2008", "\\hphantom{,}"),
    ("\u2009", "\\hspace{0.167em}"),
    ("\u2009-0200A-0200A", "\\;"),
    ("\u200A", "\\mkern1mu "),
    ("\u2013", "\\textendash "),

(Although this isn't the only example)
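To make the suspected cause concrete, here is a minimal sketch that does not import bibtexparser at all. `SAMPLE_TABLE` only copies the five entries quoted above, and the helper names `multi_char_entries` and `combining_raises` are made up for illustration. It shows that `unicodedata.combining` rejects any string that is not exactly one character, and how such entries could be spotted in a table:

```python
import unicodedata

# The five (unicode, latex) pairs quoted above, copied by hand.
# This is NOT the library's full table, just a sample for illustration.
SAMPLE_TABLE = [
    ("\u2008", "\\hphantom{,}"),
    ("\u2009", "\\hspace{0.167em}"),
    ("\u2009-0200A-0200A", "\\;"),
    ("\u200A", "\\mkern1mu "),
    ("\u2013", "\\textendash "),
]


def multi_char_entries(table):
    """Return the entries whose unicode side is not exactly one character."""
    return [(uni, latex) for uni, latex in table if len(uni) != 1]


def combining_raises(text):
    """Return True if unicodedata.combining(text) raises TypeError."""
    try:
        unicodedata.combining(text)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    for uni, latex in multi_char_entries(SAMPLE_TABLE):
        print(repr(uni), "->", repr(latex), "raises:", combining_raises(uni))
```

Running the same filter over the real tables in latexenc.py should list the other offending entries, assuming they have the same (unicode, latex) tuple shape as the quoted lines.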

Reproducing

Version: 1.4.1

Code:

from bibtexparser.latexenc import latex_to_unicode
latex_to_unicode("\\;")

Remaining Questions (Optional)
Please tick all that apply:

I would be willing to contribute a PR to fix this issue: my solution would be to put a try/except block around the call to unicodedata.combining and assume False if it fails. I haven't submitted this directly because I don't know what these non-Unicode characters are or why they are there. If there is a good reason for them, there is probably a better way to handle them; if not, they should probably be removed.
This issue is a blocker; I'd be grateful for an early fix.
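A rough sketch of the try/except workaround, written as a standalone helper rather than a patch to latexenc.py. The name `safe_combining` is hypothetical, and this only illustrates the idea; it does not answer whether the multi-character entries should exist in the first place:

```python
import unicodedata


def safe_combining(text):
    """Hypothetical helper: combining class of `text`, or 0 when `text`
    is not a single character (where unicodedata.combining raises TypeError)."""
    try:
        return unicodedata.combining(text)
    except TypeError:
        # Multi-character strings such as "\u2009-0200A-0200A" land here;
        # treat them as non-combining.
        return 0


if __name__ == "__main__":
    print(safe_combining("\u0301"))              # combining acute accent
    print(safe_combining("\u2009-0200A-0200A"))  # malformed table entry
```

A check like `if safe_combining(unicod):` would then fall through to the non-combining branch for malformed entries instead of raising.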

Related issue: dlesbre/bibtex-autocomplete#12

MiWeiss added bug v1 labels Feb 17, 2024
