GPU versioning #1
Comments
Yes, rstudio installs 2.7 and I think shiny may need it too (but possibly only when building from source?). I think that means we should install python in a virtualenv or with miniconda and set up Tensorflow + friends to use that environment by default? I think the only twist with CUDA versions is whether we want to support different hardware. Different CUDA versions are compatible with different hardware, so it may be worthwhile to do something like 3.6.2-cuda9 and 3.6.2-cuda10.
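A minimal sketch of that "use one environment by default" idea, assuming a virtualenv already built at a hypothetical path such as /opt/venv (the path and env var choice are placeholders, not anything decided here):

```r
# Sketch only: /opt/venv is a placeholder path for a pre-built virtualenv.
# Pointing reticulate at it makes tensorflow, keras, and friends resolve
# python from that environment by default.
library(reticulate)

# Alternatively, bake RETICULATE_PYTHON=/opt/venv/bin/python into the image.
use_virtualenv("/opt/venv", required = TRUE)

py_config()  # confirm which interpreter was actually selected
```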
Sounds good. Is there any documentation on the hardware-dependency side? I thought cuda 10 was still compatible with most older nvidia processors, and it looks like at least some of the new packages won't run on old cuda (maybe including current tensorflow?). For the python virtualenv setup, I played with that a whole bunch (though I think some env var handling is now improved in the tensorflow R package; it used to have the funny behavior of preferring its own virtualenv separate from reticulate's), so I'm partial to the config I have in https://github.com/rocker-org/ml/blob/master/ubuntu/shared/install_python.sh and https://github.com/rocker-org/ml/blob/master/ubuntu/shared/config_R_cuda.sh, but of course open to discussion.
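For checking whether reticulate and the tensorflow R package end up agreeing on the same environment (the funny behavior mentioned above), a quick diagnostic along these lines can help; it only uses the standard discovery helpers, not the linked scripts:

```r
# Inspect what each layer would resolve to before loading anything heavy.
reticulate::py_discover_config()   # interpreter reticulate would choose
tensorflow::tf_config()            # environment the tensorflow package binds to

# Environment variables that commonly steer the selection.
Sys.getenv(c("RETICULATE_PYTHON", "WORKON_HOME"))
```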
Will the TensorFlow version factor in here too, e.g. r3.6.2-tf2.1.0-cuda10? Do you plan to make a strategic subset of R version x TF version x CUDA version?
I agree. For what it's worth, here is what I have been using to set up a local venv/Miniconda build for an RStudio Cloud project. Maybe it could be one possible backup?

```r
install.packages("keras")
reticulate::install_miniconda("miniconda")
Sys.setenv(WORKON_HOME = "virtualenvs")
reticulate::virtualenv_create("r-reticulate", python = "miniconda/bin/python")
keras::install_keras(
  method = "virtualenv",
  conda = "miniconda/bin/conda",
  envname = "r-reticulate",
  version = "2.3.1", # keras
  tensorflow = "1.13.1",
  restart_session = FALSE
)
```
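If something like this becomes a documented fallback, a quick post-install check (plain keras/tensorflow R API, versions matching the pins above) could be:

```r
# Verify the install landed in the intended environment and versions.
reticulate::use_virtualenv("r-reticulate", required = TRUE)
keras::is_keras_available()   # TRUE if keras loads from that env
tensorflow::tf_version()      # should report the pinned 1.13.1
```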
I do think we need to support different cuda images. I've been running a few projects on my GPU machine using the 10.0 and 10.2 images, and I find each one needs its own python virtualenv anyway to support very particular versions of Tensorflow (one is only TF 2.1.0, several need TF 2.0.0, and a few need TF 1.14.0). I've found this pretty easy to manage with virtualenvs.

I think we'll still ship an ML image with a tensorflow installation in place 'out-of-the-box', probably matching the version that the R keras/tensorflow packages install by default.

I'm not sure I've found anything I'm currently working on that needs cuda 10.0 (or worse, say cuda 9.0, though I guess if I had anything still pinned at tensorflow 0.12.0 we would need cuda 9.0). Cuda lib updates are a bit of a bear since it is easy to create hardware mismatches; see notes here: rocker-org/ml#28
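For reference, one per-project layout along those lines (environment names here are illustrative, not anything shipped in the images) might be set up as:

```r
# One virtualenv per pinned Tensorflow version; names are made up.
for (ver in c("1.14.0", "2.0.0", "2.1.0")) {
  envname <- paste0("tf-", ver)
  reticulate::virtualenv_create(envname)
  tensorflow::install_tensorflow(method = "virtualenv",
                                 envname = envname,
                                 version = ver)
}

# A project then opts into the version it needs:
reticulate::use_virtualenv("tf-2.0.0", required = TRUE)
```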
Which versions of CUDA will we support, and how will we indicate which version we are on?
NVIDIA images have a lot of tags, and do all minor version releases. We're almost surely going to be pip installing python binaries for tensorflow, which are only available for certain CUDA versions anyway. It's probably best we just have a rule that pins the CUDA version to something else (the R version being the obvious choice, like we do for everything else). We would then probably follow the same sliding version scale that they are using at tensorflow, where they are on 10.0 for now.

(I suppose there's also the related question of which python version we use for the ML stack, though I think we can safely go all python 3, though I think the rstudio build recipe somehow installs python 2.7 anyway....)
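Since the pip wheels are only built against specific CUDA releases, a rough runtime check from R that the pinned stack actually lines up (standard tensorflow R package calls, assuming a GPU is present) would be:

```r
# Confirm the installed wheel was built with CUDA and that a GPU is visible.
library(tensorflow)
tf$test$is_built_with_cuda()   # was the pip binary compiled against CUDA?
tf_gpu_configured()            # can the R package see a usable GPU?
```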