Error in h2o.getConnection(): No active connection to an H2O cluster. #57
Comments
Looks like the "JAVA not found" error from h2o::h2o.init(). Also, some diagnostic messages read:
#> Note: In case of errors look at the following log files:
#> C:\Users\siava\AppData\Local\Temp\Rtmp08HkaV\filec5072151a2b/h2o_siava_started_from_r.out
#> C:\Users\siava\AppData\Local\Temp\Rtmp08HkaV\filec50cc92881/h2o_siava_started_from_r.err
Could you re-run |
Plenty Appreciated ...
h2o_siava_started_from_r.err
h2o_siava_started_from_r.out
|
This might be caused by an h2o-3 bug (h2oai/h2o-3#16360). Unfortunately, we don't have automated tests for Windows, and the bug is caused by trying to read a file from R while it is held open by the Java process. (This works on macOS and Linux but not on Windows.) If that is the case, you could downgrade h2o-3 to 3.44.0.x or wait for the next fix release (I already have a PR fixing that: h2oai/h2o-3#16369). |
I switched to a Linux machine and tested my own code. I still get the exact same problem. I decided to test by leaving out bits of the code one by one. In short, removing the following line from .Rprofile fixed it:
The fact that I have to remove parallelization capabilities as a whole is not good (say, There is nothing odd in the log file, though on the Linux system I noticed a warning:
Then again, it would be odd if both Ubuntu and Windows used libxml 209! So I am guessing that whatever it is, it has to do with parallel and future, or rather with foreach. Very puzzling ... NOTE 1: |
Ah, that makes sense that removing
|
In any case, the error message here should be more informative! |
Ah well, the problem persists on Windows. I guess I have to resort to using Linux for this. On Windows, you have to remove |
@siavash-babaei H2O-3 is written for big data and uses MapReduce under the hood, so it should use parallel processing unless you disable it. IIRC, H2O runs within the JVM with the exception of XGBoost, so if you train XGBoost models you should not let H2O use all the memory (that is the memory allocated for the JVM heap); our recommendation is to give H2O only 2/3 of the RAM. Model training is usually more memory-demanding than inference, so if you train on bigger data it's wise to do it sequentially, since that makes it much less likely to run out of memory. If you train on smaller data, it's possible to use If the problem persists on Windows, you can use an older H2O version (e.g. the one on CRAN) and that might help with the |
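The 2/3-of-RAM recommendation above can be turned into a heap-size string, roughly as in the sketch below. The helper name `h2o_heap_size` and the input `total_ram_gb` are hypothetical (the thread does not say how to detect installed RAM, so it is passed in by hand); `max_mem_size` is the heap-size argument of `h2o::h2o.init()`.

```r
# Hypothetical helper: turn total RAM (in GB) into a heap-size string
# that leaves roughly one third of memory outside the JVM heap.
h2o_heap_size <- function(total_ram_gb) {
  stopifnot(is.numeric(total_ram_gb), total_ram_gb > 0)
  # Round down, but never go below 1 GB.
  heap_gb <- max(1, floor(total_ram_gb * 2 / 3))
  paste0(heap_gb, "G")
}

# Assumed value for illustration only; replace with the machine's real RAM.
heap <- h2o_heap_size(16)
print(heap)

# The string can then be passed to H2O when starting the local server:
# h2o::h2o.init(max_mem_size = heap)
```

Rounding down is a deliberate choice here: it errs on the side of leaving more memory for XGBoost, which runs outside the JVM heap.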
Solved. It doesn't have anything to do with the R/H2O version, Java, or Linux/Windows. It was right there, but strangely I didn't see the obvious: you need to have a cluster object, and every node needs to be initialized. The future package doesn't have a function to return the cluster object, nor the ability to initialize nodes properly with complex code, but the parallel package does.
To use model parallelism for tidymodels and data parallelism for h2o and agua, you need to combine the parallel, future, and doFuture packages. The parallel package is used to detect cores, make a cluster, and then load and initialize every node. The future and doFuture packages are used to plan a future and register the foreach backend, which is required to enable model parallelism in tidymodels. Tidymodels specifically uses the future package, while agua can work with future, parallel, or mc backends (please correct me if wrong).
# H2O for machine learning
library(h2o)
# Initialize local H2O server and start
local_h2o <- h2o::h2o.init(startH2O = TRUE)
local_h2o |> print()
# The agua package provides tidymodels interface to the H2O platform and the
# H2O R package
library(agua)
# Adaptive Parallelism - Decided by H2O
h2o_thread_spec <- agua::agua_backend_options(parallelism = 0)
# To be used when using grid search, racing, or any of the iterative search
# methods in tidymodels.
grid_ctrl <- tune::control_grid(
  allow_par = TRUE,
  save_pred = TRUE,
  save_workflow = TRUE,
  event_level = "second",
  # chooses between "resamples" and "everything" automatically
  parallel_over = NULL,
  backend_options = h2o_thread_spec
)
# Parallel computation libraries
library(future)
library(doFuture)
# Speed up computation with parallel processing (optional)
n_cores <- parallel::detectCores(logical = TRUE)
cluster <- parallel::makeCluster(spec = n_cores)
doFuture::registerDoFuture()
strategy <- future::plan(strategy = future::cluster, workers = cluster)
# load, initialize, and check status of h2o on a cluster node
node_h2o_init <- function() {
  library(h2o)
  library(agua)
  # doesn't start a new server if you've already started one
  cluster_init <- h2o.init(startH2O = FALSE)
  c(cluster_init = cluster_init, cluster_status = h2o.clusterIsUp())
}
# initialize and check each cluster node
cluster |>
  parallel::clusterCall(cl = _, fun = node_h2o_init) |>
  purrr::list_c()
|
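The per-node check in the code above can be followed by a fail-early assertion and a teardown step. The sketch below is an illustration only, not part of the original solution: it uses a stand-in function `node_is_ready` (hypothetical) instead of a real H2O call so that it runs without an H2O server, and a two-worker `demo_cluster` instead of one worker per core.

```r
library(parallel)

# Stand-in for the per-node check. With H2O running, this could instead
# return h2o::h2o.clusterIsUp() after h2o::h2o.init(startH2O = FALSE).
node_is_ready <- function() {
  TRUE
}

# Two workers keep the sketch light.
demo_cluster <- makeCluster(2)

# clusterCall() runs the function once on every worker and returns a list
# with one element per worker.
node_ready <- unlist(clusterCall(demo_cluster, node_is_ready))

# Fail early if any worker is not ready, before any tuning starts.
stopifnot(length(node_ready) == 2, all(node_ready))

# Release the workers once the work is done.
stopCluster(demo_cluster)
```

In a real session, it may also make sense to reset the future plan with `future::plan(future::sequential)` before stopping the cluster, so that later code does not try to use workers that no longer exist.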
Thanks @siavash-babaei. We should consider adding a vignette for this, probably combined with #41 |
With tune::tune_grid, I bump into the "Warning: All models failed ...", which is due to "Error in h2o.getConnection(): No active connection to an H2O cluster ...", although the output of h2o.getConnection() immediately before and after tune_grid suggests everything should be OK. The code is pretty much the standard example on the reference site.
Latest RStudio is run with both normal and administrative privileges (same issue, although we should not need admin rights) on a Windows 11 Pro x64 machine. JRE (1.8.0_421) and JDK (22) are installed, and all Java paths (JRE_HOME, JDK_HOME, JAVA_HOME = JRE_HOME) are A-OK! I went so far as to use rJava to manually initialize a Java VM instance, with no effect. The command demo(h2o.kmeans) runs without a problem and produces the expected results.
The only other thing of note is that running agua::h2o_start() produces weird output that it should not. I get a "permission denied" error on the user folder, which should be accessible even without admin rights (the folder-access issue is the same with the h2o package), and I get a "no Java" error that shouldn't be there. Could this be the root of the issue: "the h2o cluster connection is there as far as the h2o package is concerned, but agua (which adds support for tidymodels) does not recognize Java and the existing h2o connection"?!