Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update GitHub Actions workflows #849

Draft
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

fabcor-maxiv
Copy link
Contributor

Use conda environments for better consistency

@fabcor-maxiv fabcor-maxiv self-assigned this Feb 5, 2024
@fabcor-maxiv fabcor-maxiv force-pushed the update-github-actions-workflows branch 11 times, most recently from 1e209ea to 466aac3 Compare February 5, 2024 15:56
@marcus-oscarsson
Copy link
Member

Hey @fabcor-maxiv, is this still draft ?

@fabcor-maxiv
Copy link
Contributor Author

@marcus-oscarsson Yes, this led me to uncover some more inconsistencies that I need to find a solution for. And as you can see the pytest workflow fails (for not entirely clear reasons). In short, having both conda and Poetry is problematic, at least in my opinion. Obviously it does not seem to be blocking anyone from their work, so I want to take my time with this.

@marcus-oscarsson
Copy link
Member

@fabcor-maxiv the issue I think is this Cannot install python-ldap. and I think its because the system libraries for ldap is missing. Its basically these libldap2-dev libsasl2-dev slapd ldap-utils.

@fabcor-maxiv
Copy link
Contributor Author

fabcor-maxiv commented Feb 21, 2024

@fabcor-maxiv the issue I think is this Cannot install python-ldap. and I think its because the system libraries for ldap is missing. Its basically these libldap2-dev libsasl2-dev slapd ldap-utils.

@marcus-oscarsson

The way I understand this is as follows:

  • There are no wheels for python-ldap (and as far as I can tell, it is the only dependency in this case, there are wheels for everything else we depend on). So python-ldap needs to be built locally, which requires a compilation step, which requires a compiler, system libraries, and headers.
  • So we add conda to the mix, because with conda it is easy to install this dependency that is hard to install with Poetry. And once python-ldap is installed, we can let Poetry do the rest. But...
  • From what I have observed, this does not actually work, because Poetry rightfully ignores the python-ldap installed by conda:
    1. We had pinned different versions of python-ldap in conda-environment-dev.yml and in poetry.lock, which I fixed in Clean conda-environment-dev.yml #851
    2. There is an issue in the conda package that I reported here: Distributions contain direct_url.json file conda-forge/python-ldap-feedstock#28 -- As far as I can tell, it is not specific to python-ldap it is very likely an issue with a lot of conda packages.

Which means that pre-installing python-ldap with conda is probably useless, and we could even get rid of conda entirely since I am not sure why we need conda for if not for python-ldap.

@marcus-oscarsson
Copy link
Member

It seems like you've understood the situation very well.

We used to use apt-get or similar to install the libraries required by ldap, to make things a bit easier to install we then added these into the conda environment which have been working well with pip and I have not noticed any issue with poetry but thats maybe due to some other fortunate circumstance.

It would of course be very nice if we could remove the conda environment altogether. Looking at the error message from the test run it looks like the issue still is the installation of python-ldap.

Why do you think its right that poetry ignores the python-ldap installed by conda

I guess we could make python-ldap optional and the sites use LDAP have to make sure that the right libraries are in place before installing.

@rhfogh
Copy link
Collaborator

rhfogh commented Feb 21, 2024

I may have misunderstood things, but my impression was that when using Conda, Poetry, too, installed its code in the Conda environment. If Conda is removed, where will Poetry then install? Hopefully not in the main OS - it would be a real mess to have new installations override previous ones, aprticularly when you are testing different branches on the same machine.

@fabcor-maxiv
Copy link
Contributor Author

fabcor-maxiv commented Feb 21, 2024

I may have misunderstood things, but my impression was that when using Conda, Poetry, too, installed its code in the Conda environment. If Conda is removed, where will Poetry then install? Hopefully not in the main OS - it would be a real mess to have new installations override previous ones, aprticularly when you are testing different branches on the same machine.

By default Poetry always uses a virtual environment. If Poetry can not find an active environment (be it a conda, venv, virtualenv, or anything else), then it will create one. When you create the conda environment, activate this conda enviroment, then run a Poetry command such as poetry install, then Poetry does install the packages in the conda environment.

The logic of managing environments is different with Poetry than with conda. With Poetry alone, I am not sure how I would handle one virtual environment per git branch. It is probably possible but not straightforward. It is not something I do or want to do so I have never investigated it.

@fabcor-maxiv
Copy link
Contributor Author

fabcor-maxiv commented Feb 21, 2024

Why do you think its right that poetry ignores the python-ldap installed by conda

Documented in these tickets:

We used to use apt-get or similar to install the libraries required by ldap, to make things a bit easier to install we then added these into the conda environment which have been working well with pip and I have not noticed any issue with poetry but thats maybe due to some other fortunate circumstance.

It is likely that you still have the (apt-get) system dependencies installed on your system(s), or you have the built wheels of python-ldap in some cache somewhere.

I guess we could make python-ldap optional and the sites use LDAP have to make sure that the right libraries are in place before installing.

Oh, right, LDAP itself might not even be used by all facilities. But that is a different topic, kind of orthogonal, and we find ourselves at the intersection of both. Even if we were to get rid of python-ldap entirely, I believe it would still be better to sort out the conda vs. Poetry topic.

It would of course be very nice if we could remove the conda environment altogether.

Everyone should be free to use conda if they want to (we will likely keep using conda for our deployments at MAXIV), but I do not think it is strictly necessary to have our (official, documented) installation procedures mention conda. We should at least make sure that everything works fine without conda and our documentation should reflect that.

But...

We could also say clearly that: "yes we want to keep going with conda for reasons X and Y". I would be fine with that, and I have some ideas on how to make it work with Poetry, but for sure it would add a bit of complexity, and so far I do not know of any such X or Y reason.

For now my understanding is that we only get the impression of things going well but they do not actually go well as can be seen in this failing job where Poetry tries to install python-ldap and fails to do so, even though we had already installed python-ldap in the conda environment.

@marcus-oscarsson
Copy link
Member

marcus-oscarsson commented Feb 21, 2024

Seems like you did a fair amount of digging :), very nice.

So the issue is the python-ldap package and seemingly some routine within the conda build and not that its installed by conda per say. I misinterpreted what you posted as that poetry rightfully ignores installed conda packages, which I found a bit strange.

For the rest I completely agree, making python-ldap optional is not a solution to the issue and I also rather see it solved. Removing the conda environment yml would make the install procedure even simpler which is always good. Then, anyone is (as today) free to use whatever virtual environment solution they prefer. We are using conda and it mostly works very well (but can sometimes give you some real trouble).

My opinion however, is that we should as a community recommend or at least suggest a virtual environment solution that works. Most developers and sites will need one and I do think its good if its documented somewhere, why not in the install section ? (or perhaps at least somewhat in a related section)

@fabcor-maxiv
Copy link
Contributor Author

fabcor-maxiv commented Feb 21, 2024

A couple more things to take into account...

Poetry is a development tool. So doing deployment in production with poetry install is not really something one should do (it's like installing gcc and your IDE on the production machine). I am a bit uncomfortable with our (mxcubeweb) documentation suggesting this.

Also Poetry should never be installed in the same environment it is meant to be installing packages in. But currently, if one uses our conda-environment-dev.yml then this is exactly what happens. Poetry gets installed in the conda environment and then Poetry installs mxcubecore in that same conda environment. We end up having mxcubecore installed side by side with Poetry. This is not how it is supposed to be. In the environment there should be only mxcubecore and its runtime dependencies (including Python).

we should as a community recommend or at least suggest a virtual environment solution that works

Yes, that is what I want to do as well. It will take me some time to come up with a good solution. And also it is not my current priority. But as far as I know it is not a blocker for anyone, so it should be fine. Maybe before the next code/doc camp would be nice, though.

@marcus-oscarsson
Copy link
Member

I don't think there is any urgency to this it seems to work quite alright as it is even if its not ideal. Anytime you have to spend on this is very much appreciated :)

Do we mention something about how to install/deploy in a production environment ?. We talked about it but, the documentation we have is more for development and testing not so much for deployment its so far left up to each institute to deal with. It would be great if we could come up with something though.

But just out of curiosity and its not a discussion for this PR, but how do you deploy or install a python package without pip or poetry.

@fabcor-maxiv
Copy link
Contributor Author

Do we mention something about how to install/deploy in a production environment ?. We talked about it but, the documentation we have is more for development and testing not so much for deployment its so far left up to each institute to deal with. It would be great if we could come up with something though.

True. I guess what I wanted to say is that since the only documentation regarding installation that we have is for development, people might assume that it is applicable for production as well. I should try to write something about production deployment when I get to it (maybe at the code camp).

But just out of curiosity and its not a discussion for this PR, but how do you deploy or install a python package without pip or poetry.

I do not think I said pip should not be used. I would recommend pip or conda. Also, interesting fact, nowadays, pip is able to install in any environment, not necessarily the one pip itself is installed in. So we should be able to pip-install mxcube in an environment that is completely empty (not even pip), which would probably be the ideal case for production.

@marcus-oscarsson
Copy link
Member

I see, I somehow thought that one could use poetry to install from pypi as one does with pip but it does indeed not seem to be the case, so for that matter I but an equal between the two. We have so far been using pip for our deployment so I did not even try with poetry for that purpose.

@marcus-oscarsson
Copy link
Member

Is this one sill in progress ?

@fabcor-maxiv
Copy link
Contributor Author

Do we mention something about how to install/deploy in a production environment ?. We talked about it but, the documentation we have is more for development and testing not so much for deployment its so far left up to each institute to deal with. It would be great if we could come up with something though.

True. I guess what I wanted to say is that since the only documentation regarding installation that we have is for development, people might assume that it is applicable for production as well. I should try to write something about production deployment when I get to it (maybe at the code camp).

Done:

@fabcor-maxiv
Copy link
Contributor Author

Is this one sill in progress ?

I still plan to work on this. I need to find the time.

@marcus-oscarsson
Copy link
Member

Ok, sounds good, time is a precious resource these days :)

Use conda environments for better consistency
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants