
H3 functions missing methods on Databricks: java.lang.NoSuchMethodError: com.uber.h3core.H3Core... #1137

Open
remibaar opened this issue Dec 1, 2023 · 9 comments

Comments


remibaar commented Dec 1, 2023

Expected behavior

I followed the install instructions for using Sedona on Databricks and used ST_H3CellIDs.

I expect to get the H3 indices of the given polygon.

As an example, I run this SQL query:

SELECT ST_H3CellIDs(ST_GeomFromText('POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))'), 12, FALSE)

Actual behavior

I get this error:

NoSuchMethodException: java.lang.NoSuchMethodError: com.uber.h3core.H3Core.polygonToCells(Ljava/util/List;Ljava/util/List;I)Ljava/util/List;

Steps to reproduce the problem

I used both the pip installation route, and the pure SQL on Databricks.

Both result in the same error.

Settings

Environment Azure Databricks
Databricks runtime: 13.3 LTS

Operating System: Ubuntu 22.04.2 LTS
Java: Zulu 8.70.0.23-CA-linux64
Scala: 2.12.15
Python: 3.10.12
R: 4.2.2
Delta Lake: 2.4.0

Thoughts

I thought H3 might not be included in the shaded version, so I also tried adding h3-4.1.1.jar via the init script, but that didn't solve the issue either.

I finally used these scripts:

Download jars

# Create JAR directory for Sedona
mkdir -p /dbfs/FileStore/sedona/jars

# Remove contents of directory
rm -f /dbfs/FileStore/sedona/jars/*.jar

# Download the dependencies from Maven into DBFS
curl -o /dbfs/FileStore/sedona/jars/geotools-wrapper-1.5.0-28.2.jar "https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/1.5.0-28.2/geotools-wrapper-1.5.0-28.2.jar"

curl -o /dbfs/FileStore/sedona/jars/sedona-spark-shaded-3.4_2.12-1.5.0.jar "https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.4_2.12/1.5.0/sedona-spark-shaded-3.4_2.12-1.5.0.jar"

curl -o /dbfs/FileStore/sedona/jars/h3-4.1.1.jar "https://repo1.maven.org/maven2/com/uber/h3/4.1.1/h3-4.1.1.jar"

Create init script

# Create init script (make sure the scripts directory exists first)
mkdir -p /dbfs/FileStore/sedona/scripts
cat > /dbfs/FileStore/sedona/scripts/sedona-init.sh <<'EOF'
#!/bin/bash
#
# File: sedona-init.sh
# 
# On cluster startup, this script will copy the Sedona jars to the cluster's default jar directory.
# In order to activate Sedona functions, remember to add to your spark configuration the Sedona extensions: "spark.sql.extensions org.apache.sedona.viz.sql.SedonaVizExtensions,org.apache.sedona.sql.SedonaSqlExtensions"

cp /dbfs/FileStore/sedona/jars/*.jar /databricks/jars

EOF

All the other functions of Sedona do work. So Sedona is installed properly, I am only unable to use the H3 functions.

Did I miss a step in the set-up? I checked the documentation multiple times, but couldn't find any clue. I hope someone can help me out.

@remibaar remibaar changed the title H3 functions missing methods: java.lang.NoSuchMethodError: com.uber.h3core.H3Core H3 functions missing methods: java.lang.NoSuchMethodError: com.uber.h3core.H3Core... Dec 1, 2023

remibaar commented Dec 1, 2023

After some further investigation I see the Databricks runtime also contains H3 functionality. For this it uses com.uber h3 version 3.7.0. Could this be conflicting with the version 4.1.1 which is being used by Sedona? It would explain it as polygonToCells is not available in version 3.x of H3.
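To confirm which H3 jars the runtime actually puts on the classpath, a quick check like the following can help (a sketch: /databricks/jars is the default jar directory on Databricks runtimes, and the grep is just a filename heuristic):

```shell
# Sketch: list any H3 jars in the cluster's default jar directory.
JAR_DIR=/databricks/jars
ls "$JAR_DIR" 2>/dev/null | grep -i 'h3' || echo "no H3 jars found in $JAR_DIR"
```

If a 3.7.0 jar shows up alongside Sedona's shaded jar, the 3.x classes can win class loading and produce exactly this NoSuchMethodError, since polygonToCells only exists in H3 4.x.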

@remibaar remibaar changed the title H3 functions missing methods: java.lang.NoSuchMethodError: com.uber.h3core.H3Core... H3 functions missing methods on Databricks: java.lang.NoSuchMethodError: com.uber.h3core.H3Core... Dec 1, 2023

remibaar commented Dec 1, 2023

I managed to solve the issue! Indeed it was related to the version of H3 that was being installed in the Databricks runtime.

By adjusting the init script, I remove the older H3 jar from the Databricks jars. This solves the issue.
This is the code for my new init script:

%sh

# Create init script (make sure the scripts directory exists first)
mkdir -p /dbfs/FileStore/sedona/scripts
cat > /dbfs/FileStore/sedona/scripts/sedona-init.sh <<'EOF'
#!/bin/bash
#
# File: sedona-init.sh
# 
# On cluster startup, this script will copy the Sedona jars to the cluster's default jar directory.
# In order to activate Sedona functions, remember to add to your spark configuration the Sedona extensions: "spark.sql.extensions org.apache.sedona.viz.sql.SedonaVizExtensions,org.apache.sedona.sql.SedonaSqlExtensions"

# Remove Databricks' default H3 jar (3.7.0), as it is not compatible with Sedona 1.5.0+
rm -f /databricks/jars/*com.uber*h3*.jar

# Copy jars
cp /dbfs/FileStore/sedona/jars/*.jar /databricks/jars

EOF
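To sanity-check that the rm pattern in this init script removes the runtime's H3 jar but leaves Sedona's jars alone, here is a small local demo (the Databricks runtime jar filename below is an assumption for illustration; actual names differ per runtime):

```shell
# Demo in a temp dir: apply the init script's glob and see what survives.
demo=$(mktemp -d)
touch "$demo/----workspace--maven-trees--com.uber--h3--com.uber__h3__3.7.0.jar"  # assumed runtime jar name
touch "$demo/sedona-spark-shaded-3.4_2.12-1.5.0.jar"
rm -f "$demo"/*com.uber*h3*.jar   # same pattern as the init script
ls "$demo"                        # only the Sedona jar should remain
```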

Note: This will break the built-in H3 functionality of Databricks; those built-in H3 functions will now throw a NoClassDefFoundError. But I believe the H3 functions of Sedona supersede the built-in H3 of Databricks.

I will keep this issue open, because I am going to create a PR for a change in the docs.
https://github.com/apache/sedona/blob/master/docs/setup/databricks.md


jiayuasu commented Dec 2, 2023

The main reason is that we shade the uber-h3 jar into sedona-spark-shaded, which leads to conflicts. An alternative fix is to use the sedona-spark jar, which does not shade anything, and manually download all of Sedona's dependency jars: https://github.com/apache/sedona/blob/master/pom.xml#L139
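A sketch of that unshaded route (the coordinates below are only the jars already mentioned in this thread, not the full dependency list; see the linked pom.xml for the complete set, and the curl line is left commented out so the sketch runs without network access):

```shell
# mvn_url builds a Maven Central URL from group/artifact/version coordinates.
mvn_url() {
  echo "https://repo1.maven.org/maven2/$(echo "$1" | tr . /)/$2/$3/$2-$3.jar"
}

for gav in \
  "org.apache.sedona sedona-spark-3.4_2.12 1.5.0" \
  "com.uber h3 4.1.1" \
  "org.datasyslab geotools-wrapper 1.5.0-28.2"
do
  set -- $gav
  echo "would fetch: $(mvn_url "$1" "$2" "$3")"
  # curl -fL -o "/dbfs/FileStore/sedona/jars/$2-$3.jar" "$(mvn_url "$1" "$2" "$3")"
done
```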


remibaar commented Dec 2, 2023

An alternative fix is to use the sedona-spark jar, which does not shade anything, and manually download all of Sedona's dependency jars

Please correct me if I am wrong, but with this method you still cannot use both the H3 of Sedona and the H3 of Databricks, because they use different, incompatible major versions (Sedona uses 4.1.1, Databricks uses 3.7.0).

My personal recommendation would be to remove the H3 3.7.0 jar from the Databricks runtime. This disables the H3 functions of Databricks, but allows the use of the H3 functions of Sedona.
In my opinion, the H3 functions of Sedona are more feature-complete.

For example, one of the features I need is the fullCover option of ST_H3CellIDs, which is not available in the Databricks implementation but is in Sedona's.


jiayuasu commented Dec 3, 2023

@remibaar Makes sense to me. Would you please update the doc of Sedona website and create a PR? I am happy to accept it!

@jacob-talroo

On Databricks, I too am getting NoSuchMethodError: 'java.util.List com.uber.h3core.H3Core.polygonToCells(java.util.List, java.util.List, int)'. This is odd, since I am on a Databricks cluster that should NOT support H3: it is neither a SQL warehouse nor Photon-enabled.

The main reason is that we shade the uber-h3 jar into sedona-spark-shaded, which leads to conflicts. An alternative fix is to use the sedona-spark jar, which does not shade anything, and manually download all of Sedona's dependency jars: https://github.com/apache/sedona/blob/master/pom.xml#L139

I think the issue is that h3 is not actually shaded currently: Sedona still uses the com.uber package. If it were shaded, wouldn't it use a different package name?

I think the current "shaded" JAR might just be an uber jar (not to be confused with the company behind H3). To truly shade, I think we need some relocations.


jiayuasu commented May 10, 2024

@jacob-talroo if you are not planning to use Databricks' H3 functions, maybe you can delete the H3 jars from the Databricks jars folder, as described above (rm -f /databricks/jars/*com.uber*h3*.jar), and please use sedona-spark-shaded.

Sedona's H3 has been used extensively on AWS EMR and Glue. Relocations might solve the Databricks problem, but they could cause problems on other platforms.

@jacob-talroo

Thank you - that workaround is working for now.

Does the comment about the shaded jar actually being an Uber JAR make sense?

@oliverangelil

@jiayuasu @remibaar have the docs been updated to mention this workaround (i.e. adding rm -f /databricks/jars/*com.uber*h3*.jar to the init file)? Could you link the page here?
