Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python 3.9] Overhaul RMG-RMS Python-Julia dependencies (juliaup and pythoncall), Add Tests for Arkane Network Algorithms #2687

Draft
wants to merge 68 commits into
base: main
Choose a base branch
from

Conversation

JacksonBurns
Copy link
Contributor

Replace #2635, targeting a smaller improvement in Python minor version to keep better compatibility with our dependencies.

@JacksonBurns
Copy link
Contributor Author

JacksonBurns commented Jun 27, 2024

I'll update this comment with the path forward for each platform as the errors roll in.

All Platforms

There are Cython errors all over the place, namely because we have incremented Cython to:
cython 3.0.10
from
cython 0.29.33
...oh boy

M1 Macs

The following packages are outright missing, but without any conflicts:

    - nothing provides requested rmg::symmetry
    - nothing provides requested rmg::pyrms
    - nothing provides requested rmg::pydqed >=1.0.3
    - nothing provides requested rmg::pydas >=1.0.3
    - nothing provides requested conda-forge::muq
    - nothing provides requested conda-forge::julia =1.9.1*
    - nothing provides requested rmg::diffeqpy
    - nothing provides julia >=1.5.0 needed by pyjulia-0.6.0-pyhd8ed1ab_0

Anything in the RMG channel we can build for M1 macs, if it comes to that. muq and julia and pyjulia are out of our hands, though.

Ubuntu

We will need to rebuild the rmg channel packages (or just rebase this branch onto #2640, to at least remove a couple of them)

Intel Macs

Same as Ubuntu.

@rwest
Copy link
Member

rwest commented Jun 27, 2024

Julia and Pyjulia would be taken care of by #2640 (which would be unblocked if we could get past Python 3.7), so that leave muq? (and the ones that are our responsibility)

@rwest
Copy link
Member

rwest commented Jun 27, 2024

Is this faring any better than 3.10 or 3.11? at least it solved the boost - rdkit - muq issue? or do we need to await results on non-M1-platforms?

@JacksonBurns
Copy link
Contributor Author

Local testing shows that this resolves the muq + rdkit issue. I tried leaving in our packages (rmg channel) to see if the conflict resolution would work, to no avail. I have removed them now to let it (hopefully) suggest solutions to conflicts (if there are any) and then we should have an idea of the path forward.

@JacksonBurns
Copy link
Contributor Author

JacksonBurns commented Jun 27, 2024

@rwest initial tests are done. M1 macs are missing a number of packages that are outside of our control. I think that is a dead end.

Ubuntu and MacOS need the rmg channel packages. If we rebase this on #2640 we can avoid building those we plan to immediately remove afterward (I want to do this).

All platforms will need Cython errors fixed.

@JacksonBurns
Copy link
Contributor Author

Following some offline discussion, @hwpang and I have agreed that I will rebase this PR onto #2640 and then merge it into the same. This will:

  • prevent me having to build packages which we will immediately remove, simplifying this PR: Complete conda-recipes Overhaul conda-recipes#18
  • require resolving the Cython compilation errors. This may or may not be very difficult - subsequent work in this PR will focus on addressing the scope of these changes.
  • still require building pydas and pydqed, which may or not be difficult, but will handled in separate PRs on their respective repositories.

@hwpang
Copy link
Contributor

hwpang commented Jul 3, 2024

I'm running the test after relaxing the python version requirement in RMS from version="3.9" to version=">=3.9" to see if this was the problem. I wasn't aware that version="3.9" doesn't imply version="3.9*".

Ah yeah this is a quirk of conda I think

The latest CI has an interessting array of failures:

Going to push a commit to just try and get a better stacktrace and maybe some output during the failure

I was able to make the MacOS to not install rmgmolecule. However, now we run into a new error:

ERROR: LoadError: InitError: ReadOnlyMemoryError()
during initialization of module PythonPlot
in expression starting at /Users/runner/miniconda3/envs/rmg_env/.julia/packages/ReactionMechanismSimulator/C6833/src/Plotting.jl:1
in expression starting at /Users/runner/miniconda3/envs/rmg_env/.julia/packages/ReactionMechanismSimulator/C6833/src/ReactionMechanismSimulator.jl:1
Matplotlib is building the font cache; this may take a moment.

@JacksonBurns
Copy link
Contributor Author

Ok so interesting developments - the Linux build continues to fail when calling out to Lpsolve but the Intel Mac (which didn't use to even install RMS at all, so this is surprising) passes that test, later failing during this unit test because of something I did with Cython, I think:

 >   def same_object(object1, object2, _check_identical, _only_check_label,
                 _generate_initial_map, _strict, _save_order):
E                TypeError: same_object() takes exactly 7 positional arguments (2 given)

rmgpy/reaction.py:1545: TypeError

The M1 error... no idea. Our current docs tell users to just have the M1 hardware emulate the old arch so that they can run old arch binaries - perhaps we do the same in the CI?

@hwpang
Copy link
Contributor

hwpang commented Jul 4, 2024

Ok so interesting developments - the Linux build continues to fail when calling out to Lpsolve but the Intel Mac (which didn't use to even install RMS at all, so this is surprising) passes that test, later failing during this unit test because of something I did with Cython, I think:

 >   def same_object(object1, object2, _check_identical, _only_check_label,
                 _generate_initial_map, _strict, _save_order):
E                TypeError: same_object() takes exactly 7 positional arguments (2 given)

rmgpy/reaction.py:1545: TypeError

The M1 error... no idea. Our current docs tell users to just have the M1 hardware emulate the old arch so that they can run old arch binaries - perhaps we do the same in the CI?

Yeah that sounds fine to me.

@JacksonBurns
Copy link
Contributor Author

@hwpang ARM mac is now installing Intel packages, but this error pops up during RMS installation:

Looking for: ["matplotlib[version='>=1']", "conda-forge::python[version='>=3.8,<4',build=*cpython*]"]


Downloading and Extracting Packages: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Python path configuration:
  PYTHONHOME = '/Users/runner/miniconda3/envs/rmg_env:/Users/runner/miniconda3/envs/rmg_env'
  PYTHONPATH = '/Users/runner/work/RMG-Py/RMG-Py/RMG-Py:'
  program name = '/Users/runner/miniconda3/envs/rmg_env/bin/python'
  isolated = 0
  environment = 1
  user site = 1
  safe_path = 0
  import site = 1
  is in build tree = 0
  stdlib dir = '/Users/runner/miniconda3/envs/rmg_env/lib/python3.12'
  sys._base_executable = '/Users/runner/miniconda3/envs/rmg_env/bin/python'
  sys.base_prefix = '/Users/runner/miniconda3/envs/rmg_env'
  sys.base_exec_prefix = '/Users/runner/miniconda3/envs/rmg_env'
  sys.platlibdir = 'lib'
  sys.executable = '/Users/runner/miniconda3/envs/rmg_env/bin/python'
  sys.prefix = '/Users/runner/miniconda3/envs/rmg_env'
  sys.exec_prefix = '/Users/runner/miniconda3/envs/rmg_env'
  sys.path = [
    '/Users/runner/work/RMG-Py/RMG-Py/RMG-Py',
    '/Users/runner/miniconda3/envs/rmg_env/.julia/packages/ReactionMechanismSimulator/NVrJU/deps',
    '/Users/runner/miniconda3/envs/rmg_env/lib/python312.zip',
    '/Users/runner/miniconda3/envs/rmg_env/lib/python3.12',
    '/Users/runner/miniconda3/envs/rmg_env/lib/python3.12/lib-dynload',
  ]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'

@JacksonBurns
Copy link
Contributor Author

Stack overflow says it's because we are setting PYTHONPATH https://stackoverflow.com/a/65740883

@hwpang
Copy link
Contributor

hwpang commented Jul 8, 2024

Stack overflow says it's because we are setting PYTHONPATH https://stackoverflow.com/a/65740883

Thanks. The only PYTHONPATH it has is the PYTHONPATH we set for the RMG-Py, and we didn't set any other PYTHONPATH. Any ideas on what we could try?

Python path configuration:
  PYTHONHOME = '/Users/runner/miniconda3/envs/rmg_env:/Users/runner/miniconda3/envs/rmg_env'
  PYTHONPATH = '/Users/runner/work/RMG-Py/RMG-Py/RMG-Py:'

@hwpang
Copy link
Contributor

hwpang commented Jul 8, 2024

So unsetting the PYTHONPATH didn't solve the problem we encountered with ARM mac. It failed with the same error:

Downloading and Extracting Packages: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Python path configuration:
  PYTHONHOME = '/Users/runner/miniconda3/envs/rmg_env:/Users/runner/miniconda3/envs/rmg_env'
  PYTHONPATH = (not set)
  program name = '/Users/runner/miniconda3/envs/rmg_env/bin/python'
  isolated = 0
  environment = 1
  user site = 1
  safe_path = 0
  import site = 1
  is in build tree = 0
  stdlib dir = '/Users/runner/miniconda3/envs/rmg_env/lib/python3.12'
  sys._base_executable = '/Users/runner/miniconda3/envs/rmg_env/bin/python'
  sys.base_prefix = '/Users/runner/miniconda3/envs/rmg_env'
  sys.base_exec_prefix = '/Users/runner/miniconda3/envs/rmg_env'
  sys.platlibdir = 'lib'
  sys.executable = '/Users/runner/miniconda3/envs/rmg_env/bin/python'
  sys.prefix = '/Users/runner/miniconda3/envs/rmg_env'
  sys.exec_prefix = '/Users/runner/miniconda3/envs/rmg_env'
  sys.path = [
    '/Users/runner/miniconda3/envs/rmg_env/lib/python312.zip',
    '/Users/runner/miniconda3/envs/rmg_env/lib/python3.12',
    '/Users/runner/miniconda3/envs/rmg_env/lib/python3.12/lib-dynload',
  ]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'

Current thread 0x00000001fe960c00 (most recent call first):
  <no Python frame>
Stacktrace:
  [1] pkgerror(msg::String)
    @ Pkg.Types ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Types.jl:70
  [2] (::Pkg.Operations.var"#67#74"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String})()
    @ Pkg.Operations ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1157
  [3] withenv(::Pkg.Operations.var"#67#74"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, ::Pair{String, String}, ::Vararg{Pair{String}})
    @ Base ./env.jl:257
  [4] (::Pkg.Operations.var"#117#122"{String, Bool, Bool, Bool, Pkg.Operations.var"#67#74"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, Pkg.Types.PackageSpec})()
    @ Pkg.Operations ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1825
  [5] with_temp_env(fn::Pkg.Operations.var"#117#122"{String, Bool, Bool, Bool, Pkg.Operations.var"#67#74"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, Pkg.Types.PackageSpec}, temp_env::String)
    @ Pkg.Operations ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1706
  [6] (::Pkg.Operations.var"#115#120"{Dict{String, Any}, Bool, Bool, Bool, Pkg.Operations.var"#67#74"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, Pkg.Types.Context, Pkg.Types.PackageSpec, String, Pkg.Types.Project, String})(tmp::String)
    @ Pkg.Operations ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1[795](https://github.com/ReactionMechanismGenerator/RMG-Py/actions/runs/9844707391/job/27178685242?pr=2687#step:12:796)
  [7] mktempdir(fn::Pkg.Operations.var"#115#120"{Dict{String, Any}, Bool, Bool, Bool, Pkg.Operations.var"#67#74"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, Pkg.Types.Context, Pkg.Types.PackageSpec, String, Pkg.Types.Project, String}, parent::String; prefix::String)
    @ Base.Filesystem ./file.jl:766
  [8] mktempdir(fn::Function, parent::String)
    @ Base.Filesystem ./file.jl:762
  [9] mktempdir
    @ ./file.jl:762 [inlined]
 [10] sandbox(fn::Function, ctx::Pkg.Types.Context, target::Pkg.Types.PackageSpec, target_path::String, sandbox_path::String, sandbox_project_override::Pkg.Types.Project; preferences::Dict{String, Any}, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool)
    @ Pkg.Operations ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1753
 [11] build_versions(ctx::Pkg.Types.Context, uuids::Set{Base.UUID}; verbose::Bool)
    @ Pkg.Operations ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1138
 [12] build_versions
    @ ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1056 [inlined]
 [13] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}, new_git::Set{Base.UUID}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform)
    @ Pkg.Operations ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1399
 [14] add
    @ ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1377 [inlined]
 [15] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, kwargs::@Kwargs{io::Base.PipeEndpoint})
    @ Pkg.API ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/API.jl:278
 [16] add(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.PipeEndpoint, kwargs::@Kwargs{})
    @ Pkg.API ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/API.jl:159
 [17] add(pkgs::Vector{Pkg.Types.PackageSpec})
    @ Pkg.API ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/API.jl:148
 [18] add(pkg::Pkg.Types.PackageSpec)
    @ Pkg.API ~/miniconda3/envs/rmg_env/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/Pkg/src/API.jl:146
 [19] top-level scope
    @ none:1
Error: Process completed with exit code 1.

It has also caused the previously working Ubuntu installation to fail. So I'm gonna drop that commit. Would appreciate any ideas to try. @JacksonBurns

@rwest
Copy link
Member

rwest commented Jul 9, 2024

This looks promising, right? What @hwpang pushed yesterday compiled and installed everything and started to run all the unit tests?
Then it seg-faulted on a line status = lpsolve('solve', lp)?
(But maybe that's just where we were last week?)

Update: it could feasibly be the underlying lp_solve package causing issues not just the lpsolve55 which is the wrapper/API.

@JacksonBurns
Copy link
Contributor Author

Apologies for the delay, travel and also @hwpang is Dr. @hwpang now 🎉!

I am not sure what else to try RE: ARM Mac is busted. Running ARM Packages natively failed at the Julia installation, emulating Intel packages failed during Julia installation also but for a different, possibly PYTHONPATH related reason. I will be back in front of an ARM Mac a week from now, which might make debugging easier, but I would appreciate help here.

Intel Mac is broken on GitHub because it is too slow to install RMS (except when sometimes it does, for some reason, only to fail due to a Cython error I think I caused). Ubuntu is failing because lpsolve crashes the whole test run, and frustratingly refuses to even give a helpful trace.

For now, maybe we try and figure out why lpsolve is failing?

@rwest
Copy link
Member

rwest commented Jul 22, 2024

How hard is it to remove the need for LPSolve?.... #2623 and #2596 (comment)

@hwpang
Copy link
Contributor

hwpang commented Jul 22, 2024

How hard is it to remove the need for LPSolve?.... #2623 and #2596 (comment)

@rwest This is sort of a chicken and egg problem, because replacing lpsolve with scipy.optimize.milp requires python 3.9, and this PR that upgrades RMG to python 3.9 is failing due to lpsolve :(

@hwpang
Copy link
Contributor

hwpang commented Jul 22, 2024

An alternative thought, maybe we can rebase #2623 on this PR and see if that resolves the lpsolve issue?

@rwest
Copy link
Member

rwest commented Jul 23, 2024

Worth a try. We have both the chickens and the eggs as open PRs. We should try putting them together.

@JacksonBurns
Copy link
Contributor Author

I will (1) make a copy of @xiaoruiDong's branch in #2623 and push it to ReactionMechanismGenerator/RMG-Py and then (2) rebase that branch on this branch and then (3) open a PR from that branch into this one

@JacksonBurns
Copy link
Contributor Author

@hwpang's #2596 is also required in combination with #2694 in order to completely remove lpsolve. I will (1) make a copy of that branch on RMG-Py (2) rebase it onto #2694 (3) tack on a commit to remove lpsolve (4) open a PR into #2694

Merge order would then be Isodesmic into CLAR, CLAR into this, and this into main.

@JacksonBurns JacksonBurns changed the title Overhaul RMG-RMS Python-Julia dependencies (juliaup and pythoncall), Add Tests for Arkane Network Algorithms, Upgrade to Python 3.9 [Python 3.9] Overhaul RMG-RMS Python-Julia dependencies (juliaup and pythoncall), Add Tests for Arkane Network Algorithms Jul 24, 2024
@rwest
Copy link
Member

rwest commented Jul 29, 2024

I lost track of the other PRs. Is this idea of removing lpsolve looking promising?

@JacksonBurns
Copy link
Contributor Author

I lost track of the other PRs. Is this idea of removing lpsolve looking promising?

Yes - the combination of #2694 and #2695 will allow us to remove lpsolve (as well as roll in some bug fixes), which should allow merging this PR.

@JacksonBurns
Copy link
Contributor Author

Don't fix merge conflicts in this PR, I have already done so in #2694

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants