Improve arbitrary dataset performance #91

Open
ajnisbet opened this issue Dec 6, 2023 · 1 comment

Comments

Owner

ajnisbet commented Dec 6, 2023

GDAL is slow at reading VRTs that reference lots of files.

Unfortunately, a VRT file is the only way opentopodata supports datasets without SRTM-style file naming, so users of those datasets get a bad experience.

Some options for datasets that have more than one file and no SRTM-style filenames:

  • Build our own spatial index in memory.
    • ✅ Fastest response times
    • ✅ Supports mixed projections (any combination of raster files works: we just need to pick a common projection for the spatial index)
    • ❌ Will massively increase startup time (as it has to read the header of every file to determine bounds)
    • ❌ I'll have to make decisions about e.g., how to order files with overlapping bounds, rather than relying on a standard.
  • Make a tool for nested VRTs
    • ✅ No modifications to opentopodata needed.
    • ✅ The tool would be useful for other projects like gpxz
    • ❌ More work than a utility function in opentopodata
    • ❌ It's pretty hacky (might have issues with e.g., filesystem paths)
    • ❌ Slowest response time of the three options (GDAL still has to read 2 or 3 VRTs, and do dozens of bounds comparisons)
    • ❌ VRT restrictions (no mixed projections)
  • Parse VRTs in opentopodata: detect when a dataset contains a single VRT file, parse it manually on startup, and use it to build a spatial index (honestly, even a plain list of bounds would be faster than GDAL)
    • ✅ Transparent for user
    • ✅ Can fall back to regular (slow) mode if any issues are found.
    • ✅ Fast startup and response time
    • ❌ Parsing a VRT might be more complicated than I'm imagining
    • ❌ VRT restrictions (no mixed projections)
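
A rough sketch of what the third option could look like, to get a feel for how complicated the parsing is. It assumes a plain `gdalbuildvrt`-style file: a north-up `GeoTransform` with no rotation, and `SimpleSource`/`ComplexSource` entries that each carry a `DstRect`. The names `TileBounds`, `parse_vrt_bounds`, and `find_tile` are made up for illustration and are not part of opentopodata. It only reads the first band, and it does not resolve `relativeToVRT` paths or handle nodata, which a real implementation would need to.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple, Optional


class TileBounds(NamedTuple):
    """Bounds of one source raster, in the VRT's coordinate system."""
    path: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def parse_vrt_bounds(vrt_xml: str) -> List[TileBounds]:
    """Extract source filenames and their bounds from VRT XML text."""
    root = ET.fromstring(vrt_xml)

    # GeoTransform is six comma-separated numbers:
    # x origin, pixel width, x rotation, y origin, y rotation, pixel height.
    gt = [float(v) for v in root.findtext("GeoTransform").split(",")]
    x0, dx, rot_x, y0, rot_y, dy = gt
    if rot_x != 0 or rot_y != 0:
        # Assumption: bail out so the caller can fall back to the slow path.
        raise ValueError("rotated VRTs are not handled in this sketch")

    tiles = []
    band = root.find("VRTRasterBand")  # first band only
    for source in band:
        if source.tag not in ("SimpleSource", "ComplexSource"):
            continue
        # DstRect gives the source's position in VRT pixel coordinates.
        rect = source.find("DstRect")
        x_off = float(rect.get("xOff"))
        y_off = float(rect.get("yOff"))
        x_size = float(rect.get("xSize"))
        y_size = float(rect.get("ySize"))

        # Convert the pixel rectangle to georeferenced corners.
        xa = x0 + x_off * dx
        xb = x0 + (x_off + x_size) * dx
        ya = y0 + y_off * dy
        yb = y0 + (y_off + y_size) * dy
        tiles.append(
            TileBounds(
                path=source.findtext("SourceFilename"),
                xmin=min(xa, xb),
                ymin=min(ya, yb),
                xmax=max(xa, xb),
                ymax=max(ya, yb),
            )
        )
    return tiles


def find_tile(tiles: List[TileBounds], x: float, y: float) -> Optional[str]:
    """Return the path of the source covering (x, y), or None."""
    # Later sources overwrite earlier ones in a VRT, so search in reverse
    # to keep the same precedence for overlapping files.
    for tile in reversed(tiles):
        if tile.xmin <= x <= tile.xmax and tile.ymin <= y <= tile.ymax:
            return tile.path
    return None
```

This is a linear scan over a list of bounds rather than a real spatial index. Each check is only four float comparisons, so it may already be fast enough for datasets with a few thousand files; swapping in an R-tree later would not change the `find_tile` interface.
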
Owner Author

ajnisbet commented Dec 6, 2023

Would resolve #90
