Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: update backend dependency management to pdm + dockerfile #5184

Merged
merged 24 commits into from
May 10, 2023

Conversation

spwoodcock
Copy link
Member

Closes #4642

Looks like this issue has been open for a while.

Great suggestion by @frafra to use PDM. Thank you.

Notes:

  • I updated the Dockerfile.backend to reflect the changes.
  • I also added the --no-cache flag to the final stage apk install command to save some space.
  • I pinned the Python version number for version solving to >=3.7,<=3.8.
  • The values for pyproject.toml were taken from the repo. Let me know if they need to change.

@spwoodcock
Copy link
Member Author

This is only a partial solution, as there are multiple references to requirements.txt throughout the repo.

python-venv is also used, but is now redundant with using pdm.

I have a commit on this branch that I have not pushed yet, removing requirements.txt and modifying any docs / workflows with reference to it (14 files).
Let me know if you would like me to push.

@Aadesh-Baral
Copy link
Contributor

Aadesh-Baral commented Jun 16, 2022

Thanks for the PR @swoodcoc. I think it will be easier to review if you push it and maybe you can also describe what the user needs to do to set up the project (like instead of pip install -r requirements.txt) and how this change benefits the project.

@spwoodcock
Copy link
Member Author

No worries, I'll add it now.
I was worried it would increase the scope, as it also affects CI and deployment instructions.

I have updated all references to requirements.txt and using venv that I could find in the repo.

As for the benefit:

  • PDM uses a dependency solver to check for compatibility between all dependencies.
  • If a package is updated in the future that is incompatible with other package version, then it should be flagged during build.
  • I also think PEP 582 dependency management is a lot simpler and reduces the need for virtual environments.

@spwoodcock
Copy link
Member Author

So it actually looks like you can't pip install dependencies directly from a pyproject.toml (I have never had to before).
There is a related issue pypa/pip#8049

I wanted to keep the edits to CI etc simple by pip installing the already solved dependency list.

To achieve this we can

pip install toml && python -c 'import toml; c = toml.load("pyproject.toml")
print("\n".join(c["project"]["dependencies"]))' | pip install -r /dev/stdin

I will update.

@spwoodcock
Copy link
Member Author

Everything should be complete to the best of my knowledge.

There seem to be quite a few Dockerfiles for the backend to maintain - I updated them all, as I wasn't sure which ones are redundant or still used by external processes.

@spwoodcock
Copy link
Member Author

Also on a side note: the Dockerfile using quay.io/hotosm/base-python-image as a base uses Python 3.8.7, while the Dockerfile.backend uses 3.7. Perhaps a refactor is required to manage a consistent Python version.

The dependency install worked without issue (no conflicts) using Python 3.9, so I feel like both could be updated to that anyway.

Copy link
Contributor

@Aadesh-Baral Aadesh-Baral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some dependencies failed to install while setting up locally without docker but no such issues while using docker, other than that this looks good to me. Also, we may need to update the doc mentioning that the python version set in pyproject.toml should be preinstalled on the system before setting up locally without docker.

@spwoodcock
Copy link
Member Author

Thanks for the feedback, I'm more than happy to go through the docs and update as requested.

As for the package installs, are you able to pip install the packages, but not pdm install? I would be surprised if this is the case.

I imagine the reason install failed is because of missing libraries on your Linux system, such as proj-dev.

A local install requires all the libs packaged into the docker image to be installed on your machine:

RUN apk update && \
apk add \
postgresql-dev \
gcc \
g++ \
python3-dev \
musl-dev \
libffi-dev \
geos-dev \
proj-util \
proj-dev \
make

The Dockerfile above uses Alpine, so the packages would be named differently for example on a Debian based distro (e.g. Ubuntu). proj-dev --> libproj-dev etc.

@spwoodcock
Copy link
Member Author

I rebased and added the suggested changes to the docs.

@sonarcloud
Copy link

sonarcloud bot commented Jul 21, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@eternaltyro eternaltyro self-assigned this Apr 13, 2023
@eternaltyro
Copy link
Collaborator

Related to #5705

@spwoodcock
Copy link
Member Author

I could update this PR and rebase with the current develop, if of interest.

@eternaltyro
Copy link
Collaborator

Thank you, @spwoodcock . We are currently swamped and unable to review your upgrades immediately. Can you please give us a week or two to get develop branch cleaned-up and in order? I will keep you posted or you can follow the issues here on GH.

@eternaltyro
Copy link
Collaborator

@ramyaragupathy @Aadesh-Baral let us discuss this in our next meeting and go over timelines and effort involved.

@spwoodcock
Copy link
Member Author

Of course @eternaltyro, no rush at all! Send me a message over any platform when you have time for this 👍

Copy link
Collaborator

@eternaltyro eternaltyro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spwoodcock I have left some comments here - mostly prompts and reminders to discuss these. Everything else looks good. This is a significant quality upgrade to the Backend infrastructure. Thank you very much for taking the time and effort to do this.

@Aadesh-Baral Let's discuss this once we meet next week. Please hold merging even after current merge conflicts have been resolved. cc: @dakotabenjamin @ramyaragupathy

Dockerfile Outdated
Comment on lines 3 to 4
LABEL hotosm.org.maintainer="HOT Sysadmin <[email protected]>"
LABEL hotosm.org.description="Builds docker image for TM Backend"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand MAINTAINER label is deprecated and we can simply use LABELS for it. But what's the reason for using hotosm.org.maintainer as key? Docker suggests tld.domain.sub.maintainer but I'm not sure if it is standard. If it is, we can switch this to org.hotosm.maintainer.

Documentation about this is sparse though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there are any standards around label keys as far as I know, we can definitely change it 👍

Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
scripts/docker/Dockerfile.backend Outdated Show resolved Hide resolved
scripts/docker/tasking-manager/Dockerfile Show resolved Hide resolved
@spwoodcock
Copy link
Member Author

I'll try to find time to revise this PR soon (particularly the dockerfile). There are a few refinements I could include, plus Python 3.10.

@eternaltyro
Copy link
Collaborator

I'll try to find time to revise this PR soon (particularly the dockerfile). There are a few refinements I could include, plus Python 3.10.

Thank you, @spwoodcock . Let me know once you do, and I'll give it another review.

.circleci/config.yml Outdated Show resolved Hide resolved
@spwoodcock
Copy link
Member Author

I should get around to updating this at the weekend - got a pretty full work week ahead 👍

docs_old/setup-development.md Outdated Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
@sonarcloud
Copy link

sonarcloud bot commented May 10, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 12 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@spwoodcock
Copy link
Member Author

I just rebased with develop and added the requirements.txt file back in for now.

The file can be deleted after merging #5642 and updating the pyproject.toml.

Ready to merge 👍

@eternaltyro eternaltyro merged commit 584f169 into hotosm:develop May 10, 2023
@eternaltyro
Copy link
Collaborator

Merged. All good so far. Thanks @spwoodcock for your time and labour. It's much appreciated 🙏🏼

eternaltyro added a commit that referenced this pull request May 17, 2023
The earlier PR #5184 switched the default TM_LOG_DIR from `logs` to
`/home/appuser` which is not present in circleCI. Switched it out for
/tmp/logs to make tests work.
@eternaltyro
Copy link
Collaborator

PEP 582 has been rejected! - https://discuss.python.org/t/pep-582-python-local-packages-directory/963/430

We will continue to use venv for now until our move to containers is done.

@spwoodcock
Copy link
Member Author

Yes it has, but that doesn't mean it's useless.
Here is a great summary: https://chriswarrick.com/blog/2023/01/15/how-to-improve-python-packaging/

PDM defaults to venv based installs. PEP582 support is optional, but is a much better alternative IMO.

The rejection shouldn't affect this PR.
PDM is used to export requirements for the dockerfile and to install via venv for running locally.

@eternaltyro
Copy link
Collaborator

Yes it has, but that doesn't mean it's useless. Here is a great summary: https://chriswarrick.com/blog/2023/01/15/how-to-improve-python-packaging/

PDM defaults to venv based installs. PEP582 support is optional, but is a much better alternative IMO.

I agree. But we have to play nice with standards and conventions that have buy in from the community also.

The rejection shouldn't affect this PR. PDM is used to export requirements for the dockerfile and to install via venv for running locally.

Agreed. I am only concerned about PEP582; the rest of the features PDM brings are still useful. Going forward, we will use venv, preferably managed by PDM and conforming to PDM defaults. All this is moot anyway with containers taking over.

@spwoodcock spwoodcock deleted the feature/backend-deps-pdm branch May 18, 2023 07:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update backend dependencies management
4 participants