Special character \H removed from filepath #492

Open
stephanedebove opened this issue Oct 28, 2024 · 0 comments
Describe the bug

The convert_to_unicode() customization treats the \H sequence in a Windows file path as the LaTeX double acute accent command and converts it, which corrupts the path.

Code

This code:

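import bibtexparser
from bibtexparser.bparser import BibTexParser
from bibtexparser.customization import convert_to_unicode

with open(BIB_PATH, 'r', encoding='utf-8') as bib_file:
    parser = BibTexParser()
    parser.customization = convert_to_unicode
    bib_database = bibtexparser.load(bib_file, parser=parser)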

running on a bib file containing this entry:

@article{Hagger2022,
  title = {Perceived Behavioral Control Moderating Effects in the Theory of Planned Behavior: {{A}} Meta-Analysis},
  file = {C:\Users\name\Documents\Zotero\storage\7J78GAC5\Hagger et al_2022_Perceived behavioral control moderating effects in the theory of planned.pdf}
}

removes the "\H" from the file path and puts a double acute accent on the following "a", so the file path becomes:

C:\Users\name\Documents\Zotero\storage\7J78GAC5a̋gger et al_2022_Perceived behavioral control moderating effects in the theory o f planned.pdf
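
To illustrate why the path ends up like this, here is a toy converter. It is not bibtexparser's implementation, only a sketch that assumes a converter which maps the LaTeX command \H plus one letter to that letter followed by a combining double acute accent. The function name and the shortened path are made up for the example.

```python
import re
import unicodedata

# Toy converter: NOT bibtexparser's implementation, only an illustration.
DOUBLE_ACUTE = "\u030b"  # COMBINING DOUBLE ACUTE ACCENT


def naive_latex_to_unicode(text: str) -> str:
    # Treat "\H" followed by a letter (optionally in braces) as the
    # LaTeX double acute accent command applied to that letter.
    converted = re.sub(
        r"\\H\{?([A-Za-z])\}?",
        lambda m: m.group(1) + DOUBLE_ACUTE,
        text,
    )
    # Compose into a single code point where Unicode has one (e.g. o -> U+0151).
    return unicodedata.normalize("NFC", converted)


# Shortened, hypothetical version of the path from the entry above.
path = r"storage\7J78GAC5\Hagger et al_2022.pdf"
print(naive_latex_to_unicode(path))
```

With intended LaTeX input such as \H{o} this kind of rule is what you want. On a Windows path, the same rule consumes the directory separator and the "H" of "Hagger".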

Reproducing

Version: 1.4.2

Workaround
For now, I just rewrote the convert_to_unicode function to skip the file field:

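from bibtexparser.latexenc import latex_to_unicode

def convert_to_unicode(record):
    for val in record:
        if val == "file":
            continue  # leave file paths untouched
        record[val] = latex_to_unicode(record[val])  # assumes plain string values
    return record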
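
An alternative to rewriting the function is to wrap it. This is a minimal sketch: `skip_fields` is a hypothetical helper, not part of bibtexparser, and it assumes a customization is a callable that takes a record dict and returns a record dict.

```python
from typing import Callable, Dict, Iterable

Record = Dict[str, str]


def skip_fields(
    customization: Callable[[Record], Record],
    fields: Iterable[str] = ("file",),
) -> Callable[[Record], Record]:
    """Wrap a customization so the named fields come back unchanged."""
    protected = tuple(fields)

    def wrapped(record: Record) -> Record:
        # Remember the raw values before the customization touches them.
        saved = {name: record[name] for name in protected if name in record}
        record = customization(record)
        # Put the raw values back over whatever the customization produced.
        record.update(saved)
        return record

    return wrapped
```

Under that assumption, it could be used as `parser.customization = skip_fields(convert_to_unicode)`, which leaves the library function untouched and makes the protected fields easy to extend.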

MiWeiss added the v1 label on Dec 19, 2024