PyXB is no longer under development and is preventing us from upgrading from Python 3.8 #40

Closed
makaylas opened this issue Oct 1, 2020 · 20 comments

@makaylas

makaylas commented Oct 1, 2020

rdflib is pinned to v4.2.2 which has a DeprecationWarning

rdflib/plugins/sparql/compat.py:8: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
    from collections import Mapping, MutableMapping  # was added in 2.6

-- Docs: https://docs.pytest.org/en/latest/warnings.html

rdflib should be upgraded to v5.0.0
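
A minimal sketch of one way to surface warnings like the one above in a check, using only the standard `warnings` module. The helper name `collect_deprecations` and the stand-in function `legacy_helper` are hypothetical and are not part of rdflib, PyXB, or d1_python.

```python
import warnings


def collect_deprecations(func, *args, **kwargs):
    """Call func and return (result, list of DeprecationWarning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones that are hidden by default.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    messages = [
        str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages


def legacy_helper():
    # Hypothetical stand-in for library code that still uses a deprecated API.
    warnings.warn("use collections.abc instead", DeprecationWarning, stacklevel=2)
    return 42


if __name__ == "__main__":
    value, messages = collect_deprecations(legacy_helper)
    print(value, messages)
```

Under pytest, running with `-W error::DeprecationWarning` is the more usual way to turn these warnings into test failures. Note that import-time warnings such as the rdflib one fire only the first time a module is imported in a process.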

@vchendrix

Actually, it seems that rdflib has fixed this problem, but the PyXB library has the same problem. It also seems that PyXB is not currently under development; the last commit was in 2017. Do you have any plans to address this?

../../anaconda3/envs/essdive-toolset-py8/lib/python3.8/site-packages/PyXB-1.2.6-py3.8.egg/pyxb/binding/content.py:807
  /Users/val/anaconda3/envs/essdive-toolset-py8/lib/python3.8/site-packages/PyXB-1.2.6-py3.8.egg/pyxb/binding/content.py:807: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
    class _PluralBinding (collections.MutableSequence):

-- Docs: https://docs.pytest.org/en/stable/warnings.html
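
For context, the warning above points at a base class taken from `collections` rather than `collections.abc`. The sketch below shows the general shape of that kind of fix on a simplified class. `PluralBindingSketch` is a hypothetical stand-in, not PyXB's actual `_PluralBinding`, and the real change in any given fork may differ.

```python
from collections.abc import MutableSequence  # was: collections.MutableSequence


class PluralBindingSketch(MutableSequence):
    """Hypothetical, simplified stand-in for PyXB's _PluralBinding."""

    def __init__(self, items=()):
        self._items = list(items)

    # The five abstract methods that MutableSequence requires:
    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value

    def __delitem__(self, index):
        del self._items[index]

    def __len__(self):
        return len(self._items)

    def insert(self, index, value):
        self._items.insert(index, value)


if __name__ == "__main__":
    binding = PluralBindingSketch(["a", "b"])
    binding.append("c")  # append() comes from the MutableSequence mixin
    print(list(binding))
```

Importing from `collections.abc` works on every Python 3 version from 3.3 onward, so a change like this does not drop support for older interpreters.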

@rogerdahl
Collaborator

The author of PyXB is looking for someone to take over maintenance. pabigot/pyxb#100

@gothub self-assigned this on Apr 2, 2021

@gothub

gothub commented Apr 12, 2021

@mbjones @datadavev @csjx After building a local version of d1_python with the PyXB-X fork (of pyxb), I confirmed that the collections.abc issue has been resolved as described here.
Note that most of the other 40 forks of pyxb are either even with or behind the main pyxb repo.

Since the main pyxb repo will not be moving forward, here are a few options:

  • build and release d1_python with the renalreg/PyXB-X fork
  • create and maintain our own DataONEorg fork
  • implement a solution other than pyxb

Are there other options to consider?

@mbjones
Member

mbjones commented Apr 13, 2021

I think migrating to PyXB-X makes sense, as it is simply a better-maintained version of the same library. We may at some point in the future have to jettison PyXB altogether, but it would be a large undertaking. The other alternative is to stick to versions of Python that the old PyXB still works on, as there isn't a very large practical difference in the capabilities of the Python versions involved. But if we must use the latest Python, PyXB-X seems like a reasonable path to me. Have all of the tests passed for both PyXB-X and our d1_python stack?

@rogerdahl
Collaborator

rogerdahl commented Apr 14, 2021 via email

@mbjones
Member

mbjones commented Apr 14, 2021

Thanks @rogerdahl that is really helpful. But it does muddy the water as to our choice a little. Seems like we have the following options:

  • Option 1: Stay with Python 3.8 and continue using PyXB until a security issue causes a need to move
    • Pros: simple, minimal work, uses stable version that most others are using
    • Cons: prevents clients from using Python 3.9 and later, highly unlikely to get security and other critical updates
  • Option 2: Migrate to PyXB-X
    • Pros: supports multiple versions of Python
    • Cons: few people using the package per se, unclear maintenance and feature trajectory
  • Option 3: Refactor to remove PyXB dependency and use ElementTree
    • Pros: supports multiple versions of Python, removes dependency on abandoned package
    • Cons: requires unfunded development work, means manually maintaining DataONE type classes

Are there other options, or pros/cons I missed? I think I still prefer option 2, but any of those could work. And we could change our mind if the situation shifts. I would love to get people's votes/input, especially @datadavev and other users of the python library like @vchendrix. Would ESS-DIVE want to support this maintenance work?

@amoeba

amoeba commented Apr 14, 2021

I think these are also valid options; correct me if I'm wrong:

  • Option 4: Fork pyxb under DataONEorg, patch and maintain
    • Pros: Fewest PyXB code changes from the upstream, full control over changes
    • Cons: Increased maintenance burden on us, hard for other projects to install/depend on PyXB as it's not published
  • Option 5: Vendor pyxb in d1_python
    • Pros: Fewest PyXB code changes from the upstream, full control over changes
    • Cons: Increased maintenance burden on us

I'll add that the con to both approaches (increased maintenance burden on us) is already on us, as we've been having repeated conversations about PyXB for quite some time.
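
As a rough illustration of how Option 5 is sometimes wired up, the sketch below prefers an installed copy of a package and falls back to a vendored one. The helper `import_first_available` is hypothetical, and a module path such as `d1_common._vendor.pyxb` is an assumed layout for illustration only, not something that exists in d1_python.

```python
import importlib


def import_first_available(module_names):
    """Return the first importable module from module_names, tried in order."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # ModuleNotFoundError is a subclass of ImportError, so a missing
            # package or missing parent package both land here.
            errors.append(f"{name}: {exc}")
    raise ImportError("no candidate could be imported: " + "; ".join(errors))


if __name__ == "__main__":
    # In d1_python the candidates might be "pyxb" followed by a vendored path
    # such as "d1_common._vendor.pyxb" (hypothetical). Standard-library names
    # are used here so that the sketch runs anywhere.
    module = import_first_available(["hypothetical_vendored_pyxb", "json"])
    print(module.__name__)
```

A simpler variant of vendoring skips the fallback entirely and always imports the bundled copy, which avoids surprises when a different PyXB version happens to be installed.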

@rogerdahl changed the title from "Upgrade rdflib to remove DeprecationWarning" to "PyXB is no longer under development and is preventing us from upgrading from Python 3.8" on Dec 30, 2021

@vchendrix

After reading all of the comments, I think removing the PyXB dependency from d1_python altogether seems better for the long term, especially if the only thing it is being used for is a few D1 XML data structures. Our only dependency on d1_python is to build SystemMetadata and ResourceMap XML.

@mbjones
Member

mbjones commented Jun 28, 2022

Thanks for the thoughts, @vchendrix. One note is that it's not just a few XML data structures: the types library generates 61 of them under Java using jaxb, and 61 type classes under Python using PyXB. That's a fair number of classes to maintain manually (each needing a serializer/deserializer and validity checking in both directions). See the listings below.

JaxB classes from d1_common
.
├── v1
│   ├── AccessPolicy.java
│   ├── AccessRule.java
│   ├── Checksum.java
│   ├── ChecksumAlgorithmList.java
│   ├── DescribeResponse.java
│   ├── Event.java
│   ├── Group.java
│   ├── Identifier.java
│   ├── Log.java
│   ├── LogEntry.java
│   ├── MonitorInfo.java
│   ├── MonitorList.java
│   ├── Node.java
│   ├── NodeList.java
│   ├── NodeReference.java
│   ├── NodeReplicationPolicy.java
│   ├── NodeState.java
│   ├── NodeType.java
│   ├── ObjectFactory.java
│   ├── ObjectFormat.java
│   ├── ObjectFormatIdentifier.java
│   ├── ObjectFormatList.java
│   ├── ObjectInfo.java
│   ├── ObjectList.java
│   ├── ObjectLocation.java
│   ├── ObjectLocationList.java
│   ├── Permission.java
│   ├── Person.java
│   ├── Ping.java
│   ├── Replica.java
│   ├── ReplicationPolicy.java
│   ├── ReplicationStatus.java
│   ├── Schedule.java
│   ├── Service.java
│   ├── ServiceMethodRestriction.java
│   ├── Services.java
│   ├── Session.java
│   ├── Slice.java
│   ├── Subject.java
│   ├── SubjectInfo.java
│   ├── SubjectList.java
│   ├── Synchronization.java
│   ├── SystemMetadata.java
│   └── TypeFactory.java
├── v1_1
│   ├── ObjectFactory.java
│   ├── QueryEngineDescription.java
│   ├── QueryEngineList.java
│   └── QueryField.java
└── v2
    ├── Log.java
    ├── LogEntry.java
    ├── MediaType.java
    ├── MediaTypeProperty.java
    ├── Node.java
    ├── NodeList.java
    ├── ObjectFactory.java
    ├── ObjectFormat.java
    ├── ObjectFormatList.java
    ├── OptionList.java
    ├── Property.java
    ├── SystemMetadata.java
    └── TypeFactory.java

PyXB classes from d1_common
class ChecksumAlgorithm(pyxb.binding.datatypes.string):
class CrontabEntry(pyxb.binding.datatypes.token):
class CrontabEntrySeconds(pyxb.binding.datatypes.token):
class Event(pyxb.binding.datatypes.string, pyxb.binding.basis.enumeration_mixin):
class NodeState(pyxb.binding.datatypes.NMTOKEN, pyxb.binding.basis.enumeration_mixin):
class NodeType(pyxb.binding.datatypes.NMTOKEN, pyxb.binding.basis.enumeration_mixin):
class NonEmptyString(pyxb.binding.datatypes.string):
class Permission(pyxb.binding.datatypes.string, pyxb.binding.basis.enumeration_mixin):
class ReplicationStatus(
class ObjectFormatIdentifier(NonEmptyString):
class NonEmptyString800(NonEmptyString):
class ServiceName(NonEmptyString):
class ServiceVersion(NonEmptyString):
class NonEmptyNoWhitespaceString800(NonEmptyString800):
class AccessPolicy(pyxb.binding.basis.complexTypeDefinition):
class AccessRule(pyxb.binding.basis.complexTypeDefinition):
class ChecksumAlgorithmList(pyxb.binding.basis.complexTypeDefinition):
class Group(pyxb.binding.basis.complexTypeDefinition):
class LogEntry(pyxb.binding.basis.complexTypeDefinition):
class NodeReplicationPolicy(pyxb.binding.basis.complexTypeDefinition):
class NodeList(pyxb.binding.basis.complexTypeDefinition):
class ObjectFormat(pyxb.binding.basis.complexTypeDefinition):
class ObjectInfo(pyxb.binding.basis.complexTypeDefinition):
class ObjectLocation(pyxb.binding.basis.complexTypeDefinition):
class ObjectLocationList(pyxb.binding.basis.complexTypeDefinition):
class Person(pyxb.binding.basis.complexTypeDefinition):
class Ping(pyxb.binding.basis.complexTypeDefinition):
class Replica(pyxb.binding.basis.complexTypeDefinition):
class ReplicationPolicy(pyxb.binding.basis.complexTypeDefinition):
class Services(pyxb.binding.basis.complexTypeDefinition):
class Session(pyxb.binding.basis.complexTypeDefinition):
class Slice(pyxb.binding.basis.complexTypeDefinition):
class Synchronization(pyxb.binding.basis.complexTypeDefinition):
class SubjectInfo(pyxb.binding.basis.complexTypeDefinition):
class SubjectList(pyxb.binding.basis.complexTypeDefinition):
class SystemMetadata(pyxb.binding.basis.complexTypeDefinition):
class Checksum(pyxb.binding.basis.complexTypeDefinition):
class Log(Slice):
class Node(pyxb.binding.basis.complexTypeDefinition):
class NodeReference(pyxb.binding.basis.complexTypeDefinition):
class ObjectFormatList(Slice):
class ObjectList(Slice):
class ServiceMethodRestriction(SubjectList):
class Schedule(pyxb.binding.basis.complexTypeDefinition):
class Subject(pyxb.binding.basis.complexTypeDefinition):
class Service(pyxb.binding.basis.complexTypeDefinition):
class Identifier(pyxb.binding.basis.complexTypeDefinition):
class QueryEngineDescription(pyxb.binding.basis.complexTypeDefinition):
class QueryEngineList(pyxb.binding.basis.complexTypeDefinition):
class QueryField(pyxb.binding.basis.complexTypeDefinition):
class MediaTypeProperty(pyxb.binding.basis.complexTypeDefinition):
class MediaType(pyxb.binding.basis.complexTypeDefinition):
class SystemMetadata(_ImportedBinding_dataoneTypes_v1.SystemMetadata):
class NodeList(pyxb.binding.basis.complexTypeDefinition):
class Node(_ImportedBinding_dataoneTypes_v1.Node):
class Property(pyxb.binding.basis.complexTypeDefinition):
class ObjectFormat(_ImportedBinding_dataoneTypes_v1.ObjectFormat):
class ObjectFormatList(_ImportedBinding_dataoneTypes_v1.Slice):
class Log(_ImportedBinding_dataoneTypes_v1.Slice):
class LogEntry(pyxb.binding.basis.complexTypeDefinition):
class OptionList(pyxb.binding.basis.complexTypeDefinition):
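
To give a rough sense of what one hand-maintained class could look like under Option 3, here is a minimal sketch of a `Checksum` type with a serializer, a deserializer, and validity checks, built on the standard library's ElementTree. The element name `checksum`, the `algorithm` attribute, and the validation rules are assumptions made for illustration. XML namespaces are left out for brevity, and the real DataONE schema should be treated as the authority.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Checksum:
    """Hand-written stand-in for a generated binding class (illustrative only)."""

    algorithm: str
    value: str

    def validate(self):
        # Assumed rules: both fields are required and must be non-empty.
        if not self.algorithm:
            raise ValueError("checksum algorithm must not be empty")
        if not self.value:
            raise ValueError("checksum value must not be empty")

    def to_xml(self):
        """Serialize to an XML string, validating first."""
        self.validate()
        elem = ET.Element("checksum", {"algorithm": self.algorithm})
        elem.text = self.value
        return ET.tostring(elem, encoding="unicode")

    @classmethod
    def from_xml(cls, xml_text):
        """Parse an XML string and validate the result."""
        elem = ET.fromstring(xml_text)
        if elem.tag != "checksum":
            raise ValueError(f"unexpected element: {elem.tag!r}")
        obj = cls(
            algorithm=elem.get("algorithm", ""),
            value=(elem.text or "").strip(),
        )
        obj.validate()
        return obj


if __name__ == "__main__":
    xml_text = Checksum("SHA-1", "abc123").to_xml()
    print(xml_text)
    print(Checksum.from_xml(xml_text))
```

Even for a two-field type this comes to a few dozen lines, and the same pattern would have to be repeated, and kept in sync with the schema, for each of the classes listed above.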

@iannesbitt

iannesbitt commented Dec 22, 2022

I'll add a point here and say that if we want to support running d1_python on the Apple M1 chipset out-of-the-box, we need to be able to remove the python==3.7 pin for this package, as versions python<=3.9 are not built for arm64.

@vchendrix

Circling back to see if this issue has any traction.

@mbjones
Member

mbjones commented Feb 15, 2023

Traction but no resources for it. Did you have any comments on the 61 classes that would need to be manually rewritten and maintained if we eliminated PyxB? We have a similar situation on the java side with JaxB.

@vchendrix

> Traction but no resources for it. Did you have any comments on the 61 classes that would need to be manually rewritten and maintained if we eliminated PyxB? We have a similar situation on the java side with JaxB.

The 61 classes are overwhelming to me. We currently use all the classes related to SystemMetadata and ResourceMaps, which I appreciate and don't want to have to implement. There is a lot of logic there.

Maybe the following option mentioned by @amoeba? I think this means that pyxb would be forked and added as a submodule to d1_python, and the two would be packaged together and published on PyPI?

Option 5: Vendor pyxb in d1_python

  • Pros: Fewest PyXB code changes from the upstream, full control over changes
  • Cons: Increased maintenance burden on us

@rogerdahl
Collaborator

Now that there is a real need to move to a more recent version of Python, I think the way to go would be to fork the last version of PyXB, and update it. When I looked at this a few years ago, only a few one-line changes were required in order to get PyXB working on the current version of Python, probably 3.9. I'd take a look through the many forks of PyXB to see which tweaks may be required: https://github.com/pabigot/pyxb/network/members

@datadavev
Member

The GeoscienceAustralia fork of PyXB-X looks promising: https://github.com/GeoscienceAustralia/PyXB-X
Evaluating it along with related updates for Python 3.9+.

@srstsavage

Is this fixed by #78?

@mbjones
Member

mbjones commented Nov 29, 2023

Yes, @srstsavage -- it is. From a superficial look, it seems it was not tagged as a release, so we need to update the docs, tag the release as 3.6.0, and publish to PyPI. cc @iannesbitt @rogerdahl @datadavev

@srstsavage

@mbjones We're looking at reinvigorating the effort to dockerize GMN. Any chance of getting that 3.6.0 release in the near future?

@rogerdahl
Collaborator

@mbjones I've tagged the 3.5.2 release on GH now. This release is already on PyPI. I think you're good to go on containerizing, @srstsavage.

@mbjones
Member

mbjones commented May 3, 2024

thanks @rogerdahl -- that's so appreciated.

In terms of docker, I'm sure there would be others that would be interested in running this via docker, so @srstsavage please do consider contributing such changes. We should track dockerization work in issue #77

9 participants