handling of new blocks that aren't on a new line #411

Closed
2 tasks done
benlogan opened this issue Oct 3, 2023 · 6 comments

@benlogan
Contributor

benlogan commented Oct 3, 2023

Describe the bug

V1 handled input files in which new blocks (indicated by @) do not necessarily start on a new line.
V2 attempts to ignore these blocks, but instead gets into a real mess, because they end up 'nested' within other entries. If they were properly ignored (i.e. parsed and dropped), the behaviour would at least be consistent, but they make it into the raw data of the enclosing entry.
This leads to all sorts of problems. For example, they reappear when you later write the library back out to a file!
Very confusing.
IMHO, they should be treated as valid blocks and handled properly. I've tested a fix for this locally and it appears to work OK.

Reproducing

Version: latest

Code:

Parse any file with inline new blocks, e.g.:
parsed_lib = bibtexparser.parse_file(filename)

Bibtex:

@article{KUEH2023S98,
title = {Myocardial Characterisation Using Delayed Dual-Energy Cardiac Computed Tomography},
journal = {Heart, Lung and Circulation},
volume = {32},
pages = {S98},
year = {2023},
note = {Abstracts for the Cardiac Society of Australia and New Zealand Annual Scientific Meeting (New Zealand) 2023, 15 - 17 June 2023, Auckland, New Zealand},
issn = {1443-9506},
doi = {https://doi.org/10.1016/j.hlc.2023.04.262},
url = {https://www.sciencedirect.com/science/article/pii/S1443950623004390},
author = {S.-H. Kueh and J. Benatar and R. Stewart}
}@INPROCEEDINGS{9837531,
  author={Hassan, Mona Bakri and Saeed, Rashid A. and Khalifa, Othman and Ali, Elmustafa Sayed and Mokhtar, Rania A. and Hashim, Aisha A.},
  booktitle={2022 IEEE 2nd International Maghreb Meeting of the Conference on Sciences and Techniques of Automatic Control and Computer Engineering (MI-STA)}, 
  title={Green Machine Learning for Green Cloud Energy Efficiency}, 
  year={2022},
  volume={},
  number={},
  pages={288-294},
  doi={10.1109/MI-STA54861.2022.9837531}}@ARTICLE{9372936,
  author={Hu, Ning and Tian, Zhihong and Du, Xiaojiang and Guizani, Nadra and Zhu, Zhihan},
  journal={IEEE Transactions on Green Communications and Networking}, 
  title={Deep-Green: A Dispersed Energy-Efficiency Computing Paradigm for Green Industrial IoT}, 
  year={2021},
  volume={5},
  number={2},
  pages={750-764},
  doi={10.1109/TGCN.2021.3064683}}@ARTICLE{5445167,
  author={Kumar, Karthik and Lu, Yung-Hsiang},
  journal={Computer}, 
  title={Cloud Computing for Mobile Users: Can Offloading Computation Save Energy?}, 
  year={2010},
  volume={43},
  number={4},
  pages={51-56},
  doi={10.1109/MC.2010.98}}

Workaround
Did you identify a workaround? Yes - I will send a pull request...
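
Until a fix lands in the parser, one possible stopgap is to preprocess the input so that every block starts on its own line. The following is only a minimal sketch and not the fix from the pull request: the helper name `split_inline_blocks` is hypothetical, and it assumes that a `}` directly followed by `@` always marks the start of a new block, which can be wrong if a field value contains text such as `{name}@example.org`.

```python
import re


def split_inline_blocks(bibtex: str) -> str:
    """Move every '@' that follows a closing brace on the same line to a new line.

    Assumption: '}' followed (after optional spaces/tabs) by '@' starts a new
    block. Field values containing such a sequence would be altered as well.
    """
    return re.sub(r"\}[ \t]*@", "}\n@", bibtex)


if __name__ == "__main__":
    raw = "@article{a1, year = {2023}}@ARTICLE{a2, year={2021}}"
    print(split_inline_blocks(raw))
```

The normalised string could then be passed to `bibtexparser.parse_string` instead of calling `parse_file` on the original file.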

Remaining Questions (Optional)
Please tick all that apply:

  • I would be willing to contribute a PR to fix this issue.
  • This issue is a blocker, I'd be grateful for an early fix.
@benlogan
Contributor Author

benlogan commented Oct 3, 2023

fixed via PR, I think - please check and merge...
(and thanks for this awesome library!)

@MiWeiss
Collaborator

MiWeiss commented Oct 22, 2023

Hi

Thanks a lot for your bug report and also for opening the PR!

I can confirm that the library assumes that new entries start on a new line. I went through a range of standard bibtex documentation and, while it all seems to assume this implicitly, it does not seem to be stated explicitly anywhere, making this a valid bug. 🐛

Your PR still has breaking tests and I am afraid more changes would be necessary to implement the function correctly. Would you be able to dedicate more time to it? If not, I would be grateful for a quick heads-up, allowing me to open this issue for other contributors.

@schnoddelbotz

Hi

I just wanted to add that entries that start on a new line but have leading whitespace (before the @) are also (silently) ignored.
FWIW, bibtool ( https://github.com/ge-ne/bibtool ) accepts those entries.
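
For the leading-whitespace case, a similar preprocessing step may help as a temporary workaround. This is a sketch under assumptions, not library behaviour: the helper name `strip_indent_before_at` is made up for illustration, and it treats any line that begins with spaces or tabs followed by `@` as the start of a block.

```python
import re


def strip_indent_before_at(bibtex: str) -> str:
    """Remove spaces/tabs that precede an '@' at the beginning of a line.

    Assumption: an indented '@' always opens a block. A multi-line field
    value with a continuation line starting with '@' would be changed too.
    """
    return re.sub(r"(?m)^[ \t]+@", "@", bibtex)


if __name__ == "__main__":
    raw = "  @article{a1,\n    year = {2023}\n  }\n\t@misc{m1, note = {x}}"
    print(strip_indent_before_at(raw))
```

Indented field lines are left untouched because only whitespace directly in front of an `@` is removed.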

@MiWeiss
Collaborator

MiWeiss commented Nov 2, 2023

For reference, see also PR #412

@zepinglee
Contributor

The original BibTeX skips everything until it finds an @ sign, which starts the next entry or command. The @ does not have to appear at the start of a line.

https://github.com/TeX-Live/texlive-source/blob/e48a783fdaff9b28797bfe95eee30c59bfca3d67/texk/web2c/bibtex.web#L5406-L5417
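
The scanning behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the code from bibtex.web and not bibtexparser's splitter: `split_blocks` is a hypothetical name, only brace-delimited blocks are handled, and parenthesis-delimited entries, `@comment` and unbalanced braces inside quoted strings are ignored.

```python
def split_blocks(text: str) -> list[str]:
    """Return raw blocks, each starting at an '@' found anywhere in the text."""
    blocks = []
    pos = 0
    while True:
        # Skip everything up to the next '@', wherever it sits on the line.
        start = text.find("@", pos)
        if start == -1:
            break
        open_idx = text.find("{", start)
        if open_idx == -1:
            break
        # Walk forward until the opening brace is balanced again.
        depth = 0
        end = None
        for i in range(open_idx, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    end = i + 1
                    break
        if end is None:
            break  # unbalanced braces: give up on the remainder
        blocks.append(text[start:end])
        pos = end
    return blocks


if __name__ == "__main__":
    sample = "junk @a{k1, title={X {Y}}}@b{k2, year={2020}}  \n  @c{k3}"
    for block in split_blocks(sample):
        print(block)
```

With this approach, both the `}@` case from the report and the indented `@` case are found, because the position of the `@` relative to line breaks plays no role.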

zepinglee added a commit to zepinglee/python-bibtexparser that referenced this issue Nov 4, 2023
@MiWeiss
Collaborator

MiWeiss commented Nov 4, 2023

Thanks @zepinglee for opening #416. I have assigned the issue to you.

@MiWeiss MiWeiss closed this as completed in 92d38d8 Nov 4, 2023

4 participants