Turning a raster (geotiff) into a H3 table? #1047
Comments
@JimShady I wonder what exactly you mean. With the new H3 functions added in Sedona, the following are now possible:
Your second example is, I think, the type of operation I have in mind. A raster is essentially a grid of values, so my plan is to use an H3 resolution similar to the raster resolution and then extract the values from the raster into an H3 table. In your example above, how is the H3 resolution chosen in example 2, please?
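Picking an H3 resolution to match a raster's pixel size can be done with a rough back-of-the-envelope calculation. The sketch below assumes the commonly cited average resolution-0 cell area of roughly 4.36 million km² and H3's aperture-7 subdivision (each finer resolution divides the average cell area by about 7); both figures are approximations, and the helper name is my own, not part of H3 or Sedona.

```python
# Rough average H3 hexagon area at resolution 0 (~4.36 million km^2,
# per the H3 documentation); each finer resolution divides the average
# cell area by ~7 (aperture-7 subdivision). These are approximations.
AVG_AREA_RES0_KM2 = 4.36e6

def avg_hex_area_km2(res: int) -> float:
    """Approximate average H3 cell area at a given resolution."""
    return AVG_AREA_RES0_KM2 / 7 ** res

def pick_resolution(pixel_size_m: float) -> int:
    """Pick the coarsest H3 resolution whose average cell area is no
    larger than the area of one square raster pixel."""
    pixel_area_km2 = (pixel_size_m / 1000.0) ** 2
    for res in range(16):
        if avg_hex_area_km2(res) <= pixel_area_km2:
            return res
    return 15  # finest H3 resolution

# e.g. a 100 m raster pixel has an area of 0.01 km^2
print(pick_resolution(100.0))
```

Choosing the coarsest resolution whose cells are no bigger than a pixel means each pixel maps to at least one cell, at the cost of some oversampling.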
I'm really wondering whether Apache Sedona has a similar function to this, because I've tried to use this mosaic function and the performance is really slow.
Here is the full doc of ST_H3CellIds in Sedona 1.5.0: https://github.com/apache/sedona/blob/master/docs/api/sql/Function.md#st_h3cellids. You can choose a level (resolution) of H3. It will be released next week. Speaking of the raster_to_grid function, we don't have an equivalent, because exploding a raster grid into individual rows would likely exhaust the memory of Sedona/Spark, and it might NOT be the best way to explore or manipulate raster images. However, we provide the following functions to let you directly manipulate the pixels of bands:
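Based on the linked doc, a query using ST_H3CellIds might be sketched like this; the `polygons` table and the `id` and `geom` columns are placeholder names of my own, and `ST_H3CellIds(geom, level, fullCover)` returns an array of cell ids that `explode()` turns into one row per cell.

```python
# Hypothetical Sedona SQL (Sedona >= 1.5.0) using ST_H3CellIds from the
# linked documentation; 'polygons', 'id', and 'geom' are placeholders.
# ST_H3CellIds(geom, level, fullCover) yields an array of H3 cell ids,
# and explode() produces one output row per cell.
query = """
SELECT id, explode(ST_H3CellIds(geom, 9, false)) AS h3_cell
FROM polygons
"""
print(query)
```

In a real session this string would be passed to `spark.sql(query)` on a Sedona-enabled SparkSession.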
Is raster_to_grid really needed? If you are comfortable describing your use case, we can probably suggest better alternatives.
I have a few use cases.
Hi @jiayuasu. Sorry for the delay in getting back to you. What I am trying to do is something like:
WHERE raster6 > 10 AND raster5 < 4 THEN (raster1 + raster2 + raster3) / raster4
Then I would write the result out to a new geotiff file. Is that kind of thing possible? I think so, but I'm struggling with the syntax. All the examples in the documentation presume that we are dealing with rasters with multiple bands (for example here). I understand them, but my rasters are separate files, not bands of the same file. My actual situation is this:
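The WHERE/THEN fragment above reads like part of a standard SQL CASE expression; spelled out in full it would look like the sketch below (the ELSE NULL branch is my assumption, since the fragment doesn't show one).

```python
# The fragment above, reconstructed as a complete SQL CASE expression.
# The ELSE NULL branch is an assumption not present in the original.
case_expr = """
CASE WHEN raster6 > 10 AND raster5 < 4
     THEN (raster1 + raster2 + raster3) / raster4
     ELSE NULL
END
"""
print(case_expr)
```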
@JimShady This is definitely possible in Sedona 1.5.0. We need to do this in 3 steps in PySpark. Please first use
Now the exciting part comes:
The resulting DF should be
The resulting df should be like this
Note that: now this raster is a single raster with 6 bands.
We are exploring the possibility of supporting multiple rasters (not just multiple bands) in the Jiffle script. If we add that support in the future, we can get rid of Step 2.
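The map-algebra step described above could be sketched as follows. This is unverified against a live cluster: Sedona's RS_MapAlgebra runs a Jiffle script over the merged 6-band raster, with bands addressed as rast[0]..rast[5], but the Jiffle ternary syntax and the `combined` view name are assumptions of mine; check the Sedona and Jiffle docs before relying on this.

```python
# Unverified sketch of the final map-algebra step. Bands of the merged
# raster are addressed as rast[0]..rast[5] inside the Jiffle script;
# the ternary syntax and the 'combined' view name are assumptions.
jiffle = (
    "out = (rast[5] > 10 && rast[4] < 4) ? "
    "(rast[0] + rast[1] + rast[2]) / rast[3] : 0;"
)
query = f"SELECT RS_MapAlgebra(rast, 'D', '{jiffle}') AS result FROM combined"
print(query)
```

In a Sedona-enabled session the string would go to `spark.sql(query)`, and the resulting raster could then be written back out as a geotiff.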
I'm a little stuck with Step 2, I think. This is my progress:
But I'm getting this error:
@JimShady The quoting is wrong in the last query. Try the following. In addition, I'm not sure whether Spark accepts a value like this
I am getting there I think. This is my complete code now.
It seems to run fine until I try to save to file, when I get "java.lang.OutOfMemoryError: Java heap space":
@JimShady It seems that your data is pretty large. Can you try giving more RAM to your executor?
I'll try that when I get to work. Is there no "memory-safe" way to do this, though? Maybe I should write tables instead of views?
Any thoughts on this, @jiayuasu?
@JimShady This is actually quite big, because the
I see. Do you have any suggestions as to how this should be approached using Apache Sedona, then? Or is this sort of operation not possible?
@jiayuasu -- 1.6GB x 10 = 160GB. Given my cluster has 256GB of memory, I'd have thought that it would be OK? Do you have any ideas about how to do this? Should the raster be tiled, maybe?
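The tiling idea raised above could take a shape like the sketch below. Hedged heavily: RS_TileExplode appears in recent Sedona releases, so verify it exists in your version before relying on it; the `rasters` view name is a placeholder of mine. The point of tiling is that no single Spark task has to materialize a full multi-GB image, so the band arithmetic runs per small tile instead.

```python
# Hedged sketch of a memory-safer route: split each large raster into
# small tiles before map algebra. Verify RS_TileExplode is available in
# your Sedona release; 'rasters' is a placeholder view name.
query = """
SELECT RS_TileExplode(rast, 256, 256) AS (x, y, tile)
FROM rasters
"""
print(query)
```

The per-tile results would then be processed band-by-band and stitched or written out tile-by-tile rather than as one giant raster.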
Hello,
I've been following the development of Apache Sedona with interest. Thanks for your work on it.
I was wondering if there are now sufficient tools/functions to use Apache Sedona to convert a geotiff into an H3 Spark table? How would I go about this?
Thanks.