Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[draft] tar_terra_rast: implement {gdalraster} SOZip preserve_metadata method #123

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

brownag
Copy link
Contributor

@brownag brownag commented Nov 28, 2024

This is a draft/example PR to adds an additional option "gdalraster_sozip" for the preserve_metadata argument to tar_terra_rast().

In the future it would be nice if the GDAL C API for working with SOZip files was exposed through terra--which could be a future option i.e. "terra_sozip" or just "sozip".

Using this utility requires GDAL >= 3.7. I have not added any version check to this yet.
Some additional questions I have pertain to the ability to write to ZIP with multiple threads. I have not tinkered with this on HPC/multiple worker type setups.

Comment on lines 152 to 154
gdalraster_sozip = function(path) {
terra::rast(paste0("/vsizip/{", path, "}/"), basename(path))
},
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to worry about file.path here, or is this syntax going to be generalisable to all OS's?

Copy link
Owner

@njtierney njtierney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this, @brownag ! A couple of small changes requested, and I'm also wondering if/how we should think about tests for this?


tmp <- withr::local_tempdir()

dir.create(file.path(tmp, dirname(path)), recursive = TRUE)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest moving file.path(tmp, dirname(path)) out into a separate variable, since it gets referred to a few times later on in this function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok this is done. Sorry for some commit noise on this, made a mistake that broke tests, but I think it is resolved now

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also #127 - the fix(es) for that issue are maybe also required here.

R/tar-terra-rast.R Outdated Show resolved Hide resolved
@Aariq
Copy link
Collaborator

Aariq commented Dec 2, 2024

What do you think about using just preserve_metadata = "sozip"? Simpler to type and the "engine" behind this option can change if better tooling becomes available

@brownag
Copy link
Contributor Author

brownag commented Dec 2, 2024

What do you think about using just preserve_metadata = "sozip"? Simpler to type and the "engine" behind this option can change if better tooling becomes available

That is a good idea.

However I think this doesn't actually resolve all of our metadata issues as it appears .aux.json is not read by terra when specifying the path this way (see tests added in 02ca360 following the "zip" example). Perhaps this PR was premature, but I wanted to try and kick the tires on it a bit. Will have some time to look closer at things this week, it may be possible for us to read from the .aux.json and apply metadata to the result SpatRaster ourselves in lieu of terra::rast() doing it automatically.

@Aariq
Copy link
Collaborator

Aariq commented Dec 2, 2024

However I think this doesn't actually resolve all of our metadata issues as it appears .aux.json is not read by terra when specifying the path this way (see tests added in 02ca360 following the "zip" example).

It might be good to get clarity on whether this is something that can be fixed in terra before putting too much more energy into this. It seems possible that it might not ever be fixed because GDAL is doing the reading of the /vsizip/ path and terra (with base R functions? some other mechanism?) is reading in the aux.json file. (see rspatial/terra#1629)

If this is something you can figure out how to do (add the aux.json metadata back in to the SpatRaster after a /vsizip/ path is read) it might make more sense as a contribution to terra.

 - use driver that stores metadata externally
 - include dataset and band user tags
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants