Support direct runner (Colab support) #749

Open
Tracked by #830
RobbeSneyders opened this issue Jan 2, 2024 · 4 comments
Labels: Infrastructure (Infrastructure and deployment)

Comments

@RobbeSneyders
Member

RobbeSneyders commented Jan 2, 2024

Fondant pipelines currently cannot be executed in Google Colab, which would be a great way for users to try out Fondant. The reason is that Docker cannot be run inside Google Colab.

We should investigate the best way to support this. Two options are:

  • Creating a VenvRunner which executes each component in a virtual environment. This should be doable for local custom components, but it currently won't work for reusable components. It would require changes to how we package and share reusable components: currently only the Docker container and component spec are shared, while we would need the original source files.
  • Using udocker as a Docker replacement. It's not a complete drop-in replacement though, so we should validate how feasible this is. I did a quick PoC and was able to execute a Fondant container using udocker directly. More changes would be needed to let Fondant use udocker.
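
To make the udocker option a bit more concrete, here is a rough sketch of the command sequence a runner might issue. It is not the PoC mentioned above. The helper name `udocker_commands`, the default container name, and the image reference in the usage note are hypothetical. It assumes udocker's `pull`, `create --name=` and `run` subcommands, plus the `--allow-root` option, which is typically needed because Colab sessions run as root.

```python
def udocker_commands(image, args=(), name="fondant-component", allow_root=True):
    """Build the udocker invocations needed to run one container image.

    Hypothetical helper: it only assembles argument lists and executes nothing.
    """
    # --allow-root is a global udocker option, so it goes before the subcommand.
    base = ["udocker"] + (["--allow-root"] if allow_root else [])
    return [
        base + ["pull", image],                     # download the image layers
        base + ["create", f"--name={name}", image],  # extract them into a named container
        base + ["run", name, *args],                # execute the container entrypoint
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` in order, for example `udocker_commands("example/component:latest", ["--help"])` with a placeholder image name. Volume mounts, environment variables and GPU access are left out here and would need separate validation.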

@RobbeSneyders RobbeSneyders added the Infrastructure Infrastructure and deployment label Jan 2, 2024
@RobbeSneyders RobbeSneyders moved this from Backlog to Breakdown in Fondant development Jan 2, 2024
@GeorgesLorre
Collaborator

GeorgesLorre commented Jan 3, 2024

What is the exact limitation of Colab that makes us unable to use Docker? Is it sudo rights?

@RobbeSneyders
Member Author

I believe the issue is that Colab itself already runs in a Docker container. As far as I know, Docker in Docker isn't really possible unless the host is configured in a specific way. And even then, you would be connecting to the host's Docker daemon, which is not something Google wants to enable (understandably).
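
Whatever the exact cause, a runner could simply probe whether Docker is usable instead of reasoning about the environment. A minimal sketch, assuming the standard `docker info` command exits with a non-zero status when no daemon is reachable; the function names and the `"docker"` / `"direct"` labels are hypothetical and not part of Fondant:

```python
import shutil
import subprocess


def docker_usable(timeout=5.0):
    """Return True only if the docker CLI exists and can reach a daemon."""
    if shutil.which("docker") is None:
        return False  # no docker binary on PATH
    try:
        result = subprocess.run(
            ["docker", "info"], capture_output=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # CLI could not be started or the daemon did not answer in time
    return result.returncode == 0


def choose_runner():
    """Hypothetical selection logic: fall back to direct execution without Docker."""
    return "docker" if docker_usable() else "direct"
```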

@RobbeSneyders RobbeSneyders moved this from Breakdown to Backlog in Fondant development Jan 29, 2024
@GeorgesLorre
Collaborator

I'm updating this ticket to better reflect the scope:

We want to support direct execution of components via a direct runner:

  • this will provide Colab support for running pipelines locally (since Docker is unsupported there)
  • this will power eager execution by offering a very fast way of running components

This does mean that this is a new type of runner and there are some things to be solved:

  • installation of dependencies (and some kind of cache?)
  • support for reusable components
  • it needs to work in Colab, since that is a prime way for users to start experimenting with Fondant
  • ...

Some ideas on how to proceed:

  • virtual environment runner where every component is run in a new venv
  • udocker (see above)
  • leverage KFP's SubprocessRunner
  • ...
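
As an illustration of the first idea, a minimal sketch using only the Python standard library. `run_component_in_venv` and `venv_python` are hypothetical names, and the sketch assumes a component can be started from a single source file, which is not how Fondant components are packaged today.

```python
import subprocess
import sys
import tempfile
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the interpreter path inside a virtual environment."""
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def run_component_in_venv(source_file, requirements=(), env_dir=None):
    """Hypothetical helper: run one component source file in its own venv."""
    env_dir = Path(env_dir or tempfile.mkdtemp(prefix="component-venv-"))
    # pip is only bootstrapped when there is something to install.
    venv.EnvBuilder(with_pip=bool(requirements)).create(env_dir)
    python = venv_python(env_dir)
    if requirements:
        subprocess.run(
            [str(python), "-m", "pip", "install", *requirements], check=True
        )
    # Run the component with the venv's interpreter and capture its output.
    return subprocess.run(
        [str(python), str(source_file)],
        check=True,
        capture_output=True,
        text=True,
    )
```

Passing a stable `env_dir` per component would be one simple way to reuse an environment between runs, which touches on the caching question above. Dependency resolution and reusable components are not addressed by this sketch.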

@GeorgesLorre GeorgesLorre changed the title Support running Fondant pipelines in Google Colab Support direct runner (Colab support) Feb 14, 2024
Projects
Status: Backlog
Development

No branches or pull requests

2 participants