raydp.init_spark fails #350

Open
wctmanager opened this issue Jun 2, 2023 · 3 comments

Comments

@wctmanager

wctmanager commented Jun 2, 2023

I built a Docker image as described at https://github.com/oap-project/raydp/tree/master/docker. The only change is that it is based on rayproject/ray:latest-py38 (Python 3.8 instead of the default Python 3.7). I deployed the image with the Helm charts described at https://docs.ray.io/en/latest/cluster/kubernetes/getting-started.html#kuberay-quickstart. I use Azure Kubernetes Service (AKS) and access the Kubernetes cluster remotely.

Then

    import ray
    import raydp
    ray.init("ray://x.x.x.x:10001")

works fine and connects to the Ray cluster, but

    spark = raydp.init_spark(app_name='RayDP Example',
                             num_executors=1,
                             executor_cores=1,
                             executor_memory='1G')

fails with:

Traceback (most recent call last):
  File "python/ray/_raylet.pyx", line 870, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 921, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 877, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 881, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 821, in ray._raylet.execute_task.function_executor
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/_private/function_manager.py", line 670, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/util/tracing/tracing_helper.py", line 460, in _resume_span
    return method(self, *_args, **_kwargs)
  File "/opt/conda/lib/python3.8/site-packages/ray/util/tracing/tracing_helper.py", line 460, in _resume_span
  File "/opt/conda/lib/python3.8/site-packages/raydp/spark/ray_cluster_master.py", line 56, in start_up
  File "/home/ray/anaconda3/lib/python3.8/site-packages/py4j/java_gateway.py", line 1321, in __call__
    return_value = get_return_value(
  File "/home/ray/anaconda3/lib/python3.8/site-packages/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.deploy.raydp.RayAppMaster.setProperties.
: java.lang.NullPointerException
    at java.util.Hashtable.put(Hashtable.java:460)
    at java.util.Properties.setProperty(Properties.java:166)
    at java.lang.System.setProperty(System.java:812)
    at org.apache.spark.deploy.raydp.RayAppMaster$.$anonfun$setProperties$1(RayAppMaster.scala:336)
    at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:400)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:728)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:728)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:728)
    at org.apache.spark.deploy.raydp.RayAppMaster$.setProperties(RayAppMaster.scala:335)
    at org.apache.spark.deploy.raydp.RayAppMaster.setProperties(RayAppMaster.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:750)

Any ideas? Thank you very much.

@pang-wu
Contributor

pang-wu commented Jun 4, 2023

What versions of Ray and RayDP are you using?
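
For anyone hitting the same error, one way to collect this information is to read the installed package metadata on both the client and inside the cluster image. The snippet below is only a sketch: the helper name `installed_versions` is made up for illustration, and including `pyspark` in the list is an assumption (it is not mentioned in this thread).

```python
from importlib import metadata


def installed_versions(packages=("ray", "raydp", "pyspark")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is absent from this environment
    return versions


if __name__ == "__main__":
    # Run this once on the client and once inside the Ray head/worker image,
    # then compare the two outputs.
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running it in both places makes a client/cluster mismatch visible before `raydp.init_spark` is called.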

@wctmanager
Author

I tried Docker images built from ray:2.4.0 (currently the latest) and from ray:2.2.0, both for py38. The raydp version in the image is whatever the current Dockerfile installs, which is the latest release, currently 1.5.0. The same versions were used on the client side. Thank you for your help.

@wctmanager
Author

It looks like raydp v1.5.0 is built against ray 2.1.0 (see core/raydp-main/pom.xml), so I built an image from ray:2.1.0 instead. With that image, raydp.init_spark works.
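
A guard along the following lines could catch this kind of mismatch before calling `raydp.init_spark`. This is a sketch under stated assumptions, not part of RayDP: the table `RAY_VERSION_FOR_RAYDP` and the function names are hypothetical, the table holds only the single pairing reported in this thread, and comparing on major.minor is an assumed rule rather than a documented compatibility policy.

```python
# Hypothetical table: only the pairing observed in this thread (raydp 1.5.0 -> ray 2.1.0).
RAY_VERSION_FOR_RAYDP = {"1.5.0": "2.1.0"}


def major_minor(version):
    """Return (major, minor) as integers from a version string such as '2.1.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def ray_matches_raydp(raydp_version, ray_version, table=RAY_VERSION_FOR_RAYDP):
    """Return True/False if the pairing is known, or None if raydp_version is not in the table.

    Assumption: a match on major.minor is treated as compatible.
    """
    expected = table.get(raydp_version)
    if expected is None:
        return None  # no information for this raydp release
    return major_minor(ray_version) == major_minor(expected)


if __name__ == "__main__":
    # The combinations tried in this thread, all with raydp 1.5.0.
    for ray_version in ("2.1.0", "2.2.0", "2.4.0"):
        print(ray_version, ray_matches_raydp("1.5.0", ray_version))
```

In practice the two version strings would come from the installed packages, for example via `importlib.metadata.version("raydp")` and `importlib.metadata.version("ray")`.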
