Introduce logic for optional AI Service for optimized software stack for Python environments. #1084
Comments
Thoth sounds like quite a cool tool. It's not clear to me how it fits in with repo2docker though. The aim of repo2docker is to reproducibly build an environment from a git repo, using whatever dependency specifications are provided. How do you envisage Thoth fitting into this, and what are the pros and cons of integrating it in repo2docker vs running it as a standalone tool?
A small note on reproducibility: it can be obtained if the software stack is pinned down with all versions, direct and transitive, together with the runtime environment (OS and hardware). So unless the users provide
Given the reason explained above, Thoth can support actual reproducible builds. Thoth also has security features: it can check that what you are trying to install has the correct hashes, and that nothing was modified between Pipfile and Pipfile.lock, for example. Thoth can also install dependencies because it uses micropipenv under the hood: #1083
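To make the hash check mentioned above concrete, here is a minimal sketch of the general idea, not Thoth's or micropipenv's actual implementation. It assumes a lock entry shaped like a package entry in Pipfile.lock, with a "hashes" list of "sha256:<hex digest>" strings; the function name `artifact_matches_lock` and the sample data are hypothetical.

```python
import hashlib


def artifact_matches_lock(artifact: bytes, lock_entry: dict) -> bool:
    """Return True if the artifact's SHA-256 digest is listed in the lock entry."""
    digest = "sha256:" + hashlib.sha256(artifact).hexdigest()
    return digest in lock_entry.get("hashes", [])


# Hypothetical package entry, shaped like one under "default" in Pipfile.lock.
payload = b"example wheel contents"
entry = {
    "version": "==1.0.0",
    "hashes": ["sha256:" + hashlib.sha256(payload).hexdigest()],
}

print(artifact_matches_lock(payload, entry))      # the untouched artifact
print(artifact_matches_lock(b"tampered", entry))  # a modified artifact
```

An installer following this idea would refuse to install any artifact for which the check returns False.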
Proposed change
Project Thoth [1] uses Artificial Intelligence to analyze and recommend software stacks for Python applications, optimized for their specific runtime environment and recommendation type. The software stack is provided in the Pipfile / Pipfile.lock format, as supported by pypa (https://pipenv.pypa.io/en/latest/). Thoth aims to help developers (including data scientists) manage dependencies easily, letting them state their requirements and keep their projects reproducible and shareable.
The AI service is enabled with a .thoth.yaml file (https://github.com/thoth-station/thamos#using-custom-configuration-file-template) that states the runtime environment and recommendation type (https://thoth-station.ninja/recommendation-types/). This can help others find out which Python interpreter, OS and hardware were used to create a given software stack.
Project Thoth [1] has several integrations:
thamos [2], to handle any action involving .thoth.yaml and interaction with the Thoth service.
jupyterlab-requirements [3], to make notebooks reproducible and shareable: https://github.com/thoth-station/jupyterlab-requirements#usage. This could also help tools like repo2docker identify the pinned-down software stack that comes with a given notebook.
For this feature request, I would focus on the thamos library [2]. This could be an optional feature, enabled by users who want it by adding .thoth.yaml to their repo.
Thoth also has a provenance check feature that verifies where packages come from before installing them; it could be optionally enabled as well.
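The opt-in behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `thoth_enabled` and `choose_install_strategy` are hypothetical names, not part of repo2docker or thamos, and the sketch only checks for the presence of the file without reading its contents.

```python
import tempfile
from pathlib import Path

THOTH_CONFIG = ".thoth.yaml"


def thoth_enabled(repo_dir: str) -> bool:
    """Thoth is opt-in: enabled only when the repo ships a .thoth.yaml file."""
    return (Path(repo_dir) / THOTH_CONFIG).is_file()


def choose_install_strategy(repo_dir: str) -> str:
    """Hypothetical switch between the existing install path and a Thoth one."""
    return "thoth" if thoth_enabled(repo_dir) else "default"


# Demo on a temporary directory standing in for a cloned repo.
with tempfile.TemporaryDirectory() as repo:
    before = choose_install_strategy(repo)
    (Path(repo) / THOTH_CONFIG).write_text("# Thoth configuration\n")
    after = choose_install_strategy(repo)

print(before, after)  # prints "default thoth"
```

Repos without the file would keep the current behaviour unchanged, which is what makes the feature optional.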
Alternative options
Who would use this feature?
All users interested in having an optimized software stack to use for their projects.
How much effort will adding it take?
It should not take too much time to introduce optional Thoth logic that repo2docker enables only if users have .thoth.yaml in their repo.
Who can do this work?
I'm available to write the part that uses the thamos library [2] to optionally enable the Thoth service. I would need some support on testing those changes.
References