Pathlib object handling for Universe, SingleFrameReaderBase and Toplogy parsers (Issue #3937) #4535

talagayev · 2024-03-27T13:11:55Z

Partially Fixes #3937. The issue mentioned the addition of support and testing for pathlib objects for SingleFrameReaderBase.

Currently, SingleFrameReaderBase can handle both pathlib objects and str as input. To demonstrate this, this PR focuses on tests that exercise pathlib and str input for SingleFrameReaderBase.

Changes made in this Pull Request:

Addition of tests for pathlib object and str input for SingleFrameReaderBase in test_gro.py and test_lammps.py

Currently the tests are as mentioned for GRO and LAMMPS cases. SingleFrameReaderBase also recognizes INPCRD, CRD, NAMDBIN and DMS if given as a single input, so tests for these cases could also be added if required.
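
A minimal sketch of the kind of check described above, assuming a hypothetical `ToySingleFrameReader` in place of the real reader class so that the example runs without MDAnalysis or its test data. The class, the helper `read_both_ways` and the file contents are illustrative only and are not part of this PR.

```python
import tempfile
from pathlib import Path


class ToySingleFrameReader:
    """Hypothetical stand-in for a single-frame reader (not the MDAnalysis class)."""

    def __init__(self, filename):
        # Normalise str and pathlib.Path input to a plain string.
        self.filename = str(filename)
        with open(self.filename) as handle:
            self.lines = handle.read().splitlines()


def read_both_ways(path_as_str):
    """Open the same file once from a str and once from a pathlib.Path."""
    from_str = ToySingleFrameReader(path_as_str)
    from_path = ToySingleFrameReader(Path(path_as_str))
    return from_str, from_path


with tempfile.TemporaryDirectory() as tmpdir:
    target = Path(tmpdir) / "frame.txt"
    target.write_text("frame 0\n1.0 2.0 3.0\n")
    reader_str, reader_path = read_both_ways(str(target))

# Both input types should yield the same stored filename and the same content.
assert reader_str.filename == reader_path.filename
assert reader_str.lines == reader_path.lines
```

The real tests would compare the coordinates read by the reader rather than raw text lines; the point here is only that the same assertions are run for both input types.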

PR Checklist

Tests?
Docs?
CHANGELOG updated?
Issue raised/referenced?

Developers certificate of origin

I certify that this contribution is covered by the LGPLv2.1+ license as defined in our LICENSE and adheres to the Developer Certificate of Origin.

📚 Documentation preview 📚: https://mdanalysis--4535.org.readthedocs.build/en/4535/

pep8speaks · 2024-03-27T13:12:02Z

Hello @talagayev! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file package/MDAnalysis/topology/base.py:

Line 119:1: W293 blank line contains whitespace

In the file testsuite/MDAnalysisTests/coordinates/base.py:

Line 548:1: W293 blank line contains whitespace
Line 549:47: W291 trailing whitespace
Line 553:80: E501 line too long (86 > 79 characters)

Comment last updated at 2024-12-17 20:40:44 UTC

github-actions · 2024-03-27T13:14:49Z

Linter Bot Results:

Hi @talagayev! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location	Outcome
main package	⚠️ Possible failure
testsuite	⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/10542931892/job/29210309319

Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

codecov · 2024-03-27T13:30:27Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.23%. Comparing base (73acc9b) to head (c181021).
Report is 9 commits behind head on develop.

❗ Current head c181021 differs from pull request most recent head 5dafd01

Please upload reports for the commit 5dafd01 to get more accurate results.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #4535      +/-   ##
===========================================
- Coverage    93.66%   93.23%   -0.44%     
===========================================
  Files          168       12     -156     
  Lines        21248     1079   -20169     
  Branches      3917        0    -3917     
===========================================
- Hits         19902     1006   -18896     
+ Misses         888       73     -815     
+ Partials       458        0     -458

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

orbeckst · 2024-03-29T03:36:26Z

@hmacdope would you be able to look at this PR or assign to someone else, please?

hmacdope · 2024-04-01T11:16:30Z

Apologies for the delay @orbeckst @talagayev, was away over easter. Reviewing now.

hmacdope

@talagayev you have the right idea of what to test, and your test implementations are good!

However, to achieve better coverage and make your tests more powerful, you can instead add your tests to BaseReaderTest and _SingleFrameReader, which cover the base API functionality for all relevant trajectory types. These are in testsuite/MDAnalysisTests/coordinates/base.py. :)

talagayev · 2024-04-01T17:49:21Z

@hmacdope

@talagayev you have the right idea of what to test, and your test implementations are good!

However to achieve better coverage and to make your tests more powerful you can instead add your tests to BaseReaderTest and _SingleFrameReader that cover the base API functionality for all relevant trajectory types. This is in testsuite/MDAnalysisTests/coordinates/base.py. :)

Ah, I think I see :) I would then adjust it so that the tests are in BaseReaderTest and _SingleFrameReader.

Does it then also make sense to move the ReaderBase tests from #3935 in the same way, in a separate PR? Currently those tests cover two cases: PSF and DCD in testsuite/MDAnalysisTests/coordinates/test_dcd.py, and GRO and XTC in testsuite/MDAnalysisTests/coordinates/test_xdr.py.

talagayev · 2024-04-02T21:07:25Z

@hmacdope
I managed to get the test working locally in _SingleFrameReader, where it covers the cases, but adding it to BaseReaderTest leads to errors such as:

AttributeError: 'PosixPath' object has no attribute 'encode'

This happens in test_dcd.py with TestDCDReader, and the same error appears with the TRRReader and XTCReader, whereas TestGROReader in test_gro.py is fine. Is it possible that TestDCDReader also inherits from BaseReaderTest, which leads to the error because it needs different handling than TestGROReader? Would it make sense to leave the SingleFrameReaderBase test in _SingleFrameReader and write a separate one for MultiframeReaderTest, so that the two cases are tested separately in their own classes?
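
The error message indicates that `.encode` was called on a path object somewhere along the way; `pathlib.Path` has no such method, while `str` does. The following is a minimal sketch of that failure and of one possible conversion, assuming a hypothetical helper `to_bytes_name` (it is not a function from MDAnalysis) and an illustrative file name.

```python
import os
from pathlib import Path


def to_bytes_name(filename):
    """Hypothetical helper: return a filename as UTF-8 bytes for str or Path input."""
    # os.fspath returns a str unchanged and converts os.PathLike objects to str.
    return os.fspath(filename).encode("utf-8")


path = Path("traj") / "example.dcd"

try:
    path.encode("utf-8")  # Path objects have no .encode method
except AttributeError as err:
    message = str(err)

# Converting first makes str and Path input behave identically.
encoded_from_path = to_bytes_name(path)
encoded_from_str = to_bytes_name(str(path))
assert encoded_from_path == encoded_from_str
```

Whether the conversion belongs in the reader or in the test is a design choice; the sketch only shows that normalising the input before calling `.encode` avoids the AttributeError.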

talagayev · 2024-04-15T10:25:58Z

@hmacdope
Hey Hugo, I have now adjusted the tests and moved them to base.py as you suggested. As I mentioned, there were problems when I applied the single-frame tests to formats such as DCD and XTC, which are not SingleFrameReader formats, so I had to add exceptions for these cases.
Could you take a look to see if everything is fine now? Most of the checks failed at the codecov stage in the end, so I am not sure whether it is an error on the codecov side. Thanks in advance :)

hmacdope · 2024-05-27T06:52:03Z

@talagayev sorry for the delay here. I will review ASAP.

talagayev · 2024-05-27T13:45:38Z

@hmacdope all good, no worries :)

hmacdope · 2024-05-27T23:55:36Z

@talagayev I have pushed some changes to your branch that fix the encode issue, so we shouldn't need to exclude some readers, and I also added support for pathlib.Path in Universe and the topology parsers. I have adjusted the CHANGELOG and PR title to match.

As we are now co-authors on this I can no longer fairly review, so I will defer to one of the other @MDAnalysis/coredevs. Perhaps @tylerjereddy as the author of previous Pathlib support issues and PRs #3937 (if you have some spare cycles).

hmacdope · 2024-05-28T00:31:32Z

OK, looks like I messed some stuff up. I'll fix it.

talagayev · 2024-05-28T16:01:05Z

@hmacdope Hm, strange error. I tried to find it by going through the pytest runs and running the developer version of MDAnalysis with this branch, but I can't track it down currently.

talagayev · 2024-08-06T18:41:33Z

package/MDAnalysis/coordinates/base.py

+        if isinstance(filename, NamedStream):
+            self.filename = filename
+        else:
+            self.filename = str(filename)


Here we get errors in test_mmtf.py if we use self.filename = str(filename).
This leads to errors similar to this one:

FileNotFoundError: [Errno 2] No such file or directory: '<mmtf.api.mmtf_reader.MMTFDecoder object at 0x135416780>'

I managed to fix the error by adding:

        else:
            self.filename = str(filename)
            if "MMTF" in self.filename:
                self.filename = filename

Basically this adds a specific condition for MMTF cases. It is not the cleanest approach, but maybe it helps @hmacdope fix the error in a better way :)

I have now looked into it more. "MMTF" is a specific case; the other errors appear mainly because of the parmed parser, which has problems with base.py in both coordinates and topology. In the coordinates base.py it is possible to use:

        elif hasattr(filename, 'topology'):
            self.filename = filename

This helps to fix the errors, although it is quite similar to directly using self.filename = filename without any if/elif/else conditions. For the base.py in topology I will try to figure out whether there is a way to fix it, since having only self.filename = filename there also fixes the errors with the parmed parser.

@talagayev I think the FileNotFoundError for mmtf above occurs because, instead of a filename, sometimes an MMTFDecoder object is encountered. I think what you've got fixes it, but it might be clearer to have a block like:

        if isinstance(filename, NamedStream):
            self.filename = filename
        elif 'mmtf' in str(filename.__class__):
            # mmtf case, where we've avoided a mmtf import in detection
            self.filename = filename

where you've enumerated all the special cases explicitly in each elif branch, rather than hiding some special cases inside other branches.

Ah, I see. Yes, the option that I had worked but was definitely not the best one. It was indeed connected with mmtf, and elif 'mmtf' in str(filename.__class__): looks much better :)
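
The explicit-branch idea discussed in this thread can be sketched as a standalone function. This is a hedged illustration only: `ToyNamedStream`, `ToyDecoder`, `ToyStructure` and `normalise_filename` are hypothetical stand-ins so the example runs without MDAnalysis, mmtf or parmed, and the `hasattr(filename, "topology")` branch is taken from the earlier comment rather than from any merged code.

```python
import io
from pathlib import Path


class ToyNamedStream(io.StringIO):
    """Hypothetical stand-in for NamedStream: a stream that carries a name."""

    def __init__(self, text, name):
        super().__init__(text)
        self.name = name


class ToyDecoder:
    """Hypothetical stand-in for an already-decoded MMTF object."""

    # Pretend the class lives in an mmtf module so str(cls) contains 'mmtf'.
    __module__ = "mmtf.api.mmtf_reader"


class ToyStructure:
    """Hypothetical stand-in for a ParmEd-like object exposing `.topology`."""

    topology = object()


def normalise_filename(filename):
    """Return what a reader would store as `self.filename` (sketch only)."""
    if isinstance(filename, ToyNamedStream):
        # Streams are kept as-is so they can still be read from.
        return filename
    elif "mmtf" in str(filename.__class__):
        # Decoder objects are not paths; str() would only give their repr.
        return filename
    elif hasattr(filename, "topology"):
        # In-memory structure objects are passed through untouched.
        return filename
    else:
        # Everything else (str, pathlib.Path) becomes a plain string.
        return str(filename)


decoder = ToyDecoder()
assert normalise_filename(decoder) is decoder
assert normalise_filename(Path("a") / "b.gro") == str(Path("a") / "b.gro")
```

Each non-path input type gets its own branch, so the final `else` only ever sees objects for which `str()` yields a usable path.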

talagayev · 2024-08-06T18:48:08Z

testsuite/MDAnalysisTests/coordinates/base.py

+            if isinstance(reader, MemoryReader):
+                skip_reason = "MemoryReader"
+            pytest.skip(f"Skipping test for Pathlib input with reason: {skip_reason}")
+        path = Path(reader.filename)


Here, adding an additional line:
path = path.as_posix()

fixes the following pytest failure in 4 cases:
AttributeError: 'PosixPath' object has no attribute 'encode'

richardjgowers · 2024-08-13T07:56:43Z

package/CHANGELOG

@@ -39,6 +40,7 @@ Fixes
 * Fix groups.py doctests using sphinx directives (Issue #3925, PR #4374)

 Enhancements
+ * Handling of Pathlib.Path in SingleFrameReaderBase, Topology and Universe (Issue #3937)


This is a bit vague and doesn't actually tell me what is fixed or what is now possible. e.g. Is it that pathlib.Paths are now correctly handled?

Correct. Originally it only targeted pathlib object handling in SingleFrameReaderBase for specific cases, but with @hmacdope's adjustment it would cover pathlib.Path in all cases, provided the errors that currently appear due to the changes in base.py in topology and base.py in coordinates are resolved. These could be fixed with

        elif hasattr(filename, 'topology'):
            self.filename = filename

but this would just exclude the topology cases, so I am not sure whether it would be a real fix that then covers all the pathlib.Path cases.

orbeckst · 2024-12-17T20:41:23Z

@hmacdope would it be worthwhile reviewing again? Can this PR be pushed across the finish line?

talagayev added 5 commits March 27, 2024 12:10

addition of tests for str and pathlib handling of SingleFrameReaderBase + removal of duplicate import

0dbca13

renamed tests to specify GRO input as testcase

c2c7c2c

Added testcase of LAMMPS str and pathlib handling in class SingleFrameReaderBase

05ea8de

Update AUTHORS list

5c0383c

Update CHANGELOG

9a4932b

adding the implementation of tests for str and pathlib handling

talagayev added 2 commits March 27, 2024 14:18

Adjusting blank lines for PEP8

ff2a832

adjsuting blank lines for PEP8

d85defc

orbeckst assigned hmacdope Mar 29, 2024

hmacdope self-requested a review April 1, 2024 11:16

hmacdope requested changes Apr 1, 2024

talagayev added 6 commits April 15, 2024 10:25

removed singleframereader test in test_gro.py

87f7a95

removed singleframe_reader test in test_lammps.py

27e9685

added test for str and path input for singleframereader in base.py

567e6ef

Added Baseframe test for singleframes str and path input in base.py

5558dda

Merge pull request #2 from talagayev/talagayev-patch-1_singleframe

2953310

Adjusted the tests to be in base.py

Merge branch 'develop' into singleframereader_pathlib

c181021

hmacdope self-requested a review May 27, 2024 06:52

add pathlib checking to topology and universe

e06ae55

github-actions bot added Component-Core Component-Readers labels May 27, 2024

github-actions bot added the Component-Topology label May 27, 2024

hmacdope added 2 commits May 28, 2024 09:49

add test for topology Pathlib and fix changelog

0aa40cc

fix spacing chaneg in AUTHORS

5dafd01

hmacdope changed the title ~~Addition of tests for pathlib object handling of SingleFrameReaderBase (Issue #3937)~~ Pathlib object handling for SingleFrameReaderBase and Toplogy parsers (Issue #3937) May 27, 2024

hmacdope changed the title ~~Pathlib object handling for SingleFrameReaderBase and Toplogy parsers (Issue #3937)~~ Pathlib object handling for Universe, SingleFrameReaderBase and Toplogy parsers (Issue #3937) May 28, 2024

talagayev commented Aug 6, 2024

richardjgowers reviewed Aug 13, 2024

Merge branch 'develop' into singleframereader_pathlib

33d6b4e

github-actions bot removed Component-Readers Component-Core Component-Topology labels Aug 25, 2024

Merge branch 'develop' into singleframereader_pathlib

f785eea

orbeckst added Component-Readers Component-Core Component-Topology labels Dec 17, 2024
