[Bug] r.buildvrt: significant read performance issues with virtual raster maps #4345
Comments
Do I have to look here: https://github.com/OSGeo/grass/blob/main/lib/raster/vrt.c#L47 |
OK. The documentation says VRTs can also be built over raster data linked with r.external. I have now tried various GRASS GIS versions (7.8.8, 8.0.0, 8.2.1, all using Docker), and all show the same performance issue with GDAL-linked data, especially on file systems with latency. The individual linked files are read quite fast, but combined in a VRT things get really slow... @metzm do you have any idea whether this could be fixed somehow, or is it a format limitation that we should rather document in the manual? I would be willing to put some effort in here, but I lack C skills and would need some help to fix it, if it is possible at all... |
According to the r.buildvrt module man page: Reading the whole VRT is slower than reading the equivalent single raster map. Only reading small parts of the VRT provides a performance benefit. |
Thanks, @tmszi, for looking into this. The performance difference is not related to reading parts vs. the entire VRT, but to VRTs with GDAL-linked data vs. VRTs with native GRASS data... |
Does perhaps @rouault have a hint here? |
Not really, I'm not familiar with what r.univar does. It would be best to first try to reproduce this using only GDAL command line utilities, like gdal_translate |
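A minimal sketch of how such a GDAL-only reproduction could be timed, assuming `gdal_translate` is on the `PATH`. The helper names `time_command` and `gdal_translate_args` and the file paths in the usage note are hypothetical; only the `gdal_translate` utility and its `-q` (quiet) flag come from GDAL itself.

```python
import subprocess
import sys
import time


def time_command(args, repeats=3):
    """Run a command `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails
        subprocess.run(
            args, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        best = min(best, time.perf_counter() - start)
    return best


def gdal_translate_args(src, dst):
    """Argument list that makes gdal_translate read all of `src` and write it to `dst`."""
    return ["gdal_translate", "-q", src, dst]


if __name__ == "__main__":
    # Harmless self-check that needs neither GDAL nor data: time a no-op Python process.
    print(f"no-op: {time_command([sys.executable, '-c', 'pass'], repeats=1):.3f} s")
```

Calling `time_command(gdal_translate_args("tile_1.tif", "/tmp/out.tif"))` for a single source file and again for a VRT over the same files (placeholder paths) would show whether the slowdown also appears outside GRASS. The first run may include file system cache warm-up, which is why the sketch reports the best of several runs.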
Thanks, @rouault ! Will do that. Line 171 in 2356520
|
Oooooh, I now read in https://grass.osgeo.org/grass84/manuals/r.buildvrt.html that "A GRASS virtual raster can be regarded as a simplified version of GDAL's virtual raster format". So I'm mostly incompetent to comment on GRASS VRT specificities. What GRASS VRT might lack is a pool of opened VRT sources like GDAL has, which saves opening and closing them when doing repeated pixel requests in neighbouring windows of interest. Just guessing in the dark... |
Thanks again, @rouault! Sounds like a viable alternative / workaround! I will try that! |
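To illustrate the idea of a pool of opened sources, here is a toy sketch in Python. It is not GDAL's or GRASS's implementation: `SourcePool` and `count_opens` are hypothetical names, and `io.StringIO` stands in for an expensive-to-open raster source. It only shows why keeping handles open matters when rows are read alternately from several tiles.

```python
import io
from collections import OrderedDict


class SourcePool:
    """Keep up to `max_open` sources open; close the least recently used one when full."""

    def __init__(self, opener, max_open=4):
        self._opener = opener      # callable that opens a source by name
        self._max_open = max_open
        self._open = OrderedDict() # name -> handle, oldest first
        self.open_calls = 0        # how often the (expensive) opener was called

    def get(self, name):
        if name in self._open:
            # Already open: mark as most recently used and reuse the handle
            self._open.move_to_end(name)
            return self._open[name]
        handle = self._opener(name)
        self.open_calls += 1
        self._open[name] = handle
        if len(self._open) > self._max_open:
            # Evict and close the least recently used handle
            _, oldest = self._open.popitem(last=False)
            oldest.close()
        return handle

    def close_all(self):
        while self._open:
            _, handle = self._open.popitem()
            handle.close()


def count_opens(tile_names, rows, max_open):
    """Simulate reading `rows` rows that each touch every tile; return the open count."""
    pool = SourcePool(io.StringIO, max_open=max_open)
    for _ in range(rows):
        for name in tile_names:
            pool.get(name)
    pool.close_all()
    return pool.open_calls
```

With two tiles and 100 rows, `count_opens(["a", "b"], 100, max_open=2)` opens each tile once, while `max_open=1` reopens a tile on every access. On a file system with latency, every avoided reopen saves a round trip.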
In this particular case, there might be a mix of different reasons causing poor performance. The reasons here seem to be NFS + GDAL-linked raster maps + GRASS vrt, which in their combination might amplify performance degradation. The two main reasons might be
These two reasons combined with NFS could easily cause the observed performance degradation. In this case I suggest creating a GDAL VRT and linking that into GRASS. However, the fastest method should be to have GRASS native rasters (maybe in a mapset on an NFS mount) and optionally build a GRASS VRT from the native GRASS rasters. As so often, it's a compromise between data duplication and I/O optimization. |
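The suggested workaround can be sketched roughly as follows, assuming GDAL's `gdalbuildvrt` utility is installed and `r.external` is run inside an active GRASS session. The function names, file paths, and map name are hypothetical placeholders; the sketch only assembles and runs the two commands.

```python
import subprocess


def gdal_vrt_commands(tiles, vrt_path, grass_name):
    """Return the two commands: build a GDAL VRT over `tiles`, then link it into GRASS."""
    if not tiles:
        raise ValueError("at least one source raster is required")
    # gdalbuildvrt <output.vrt> <input files...>
    build = ["gdalbuildvrt", vrt_path, *tiles]
    # r.external links the GDAL VRT as a single GRASS raster map (no data is copied)
    link = ["r.external", f"input={vrt_path}", f"output={grass_name}"]
    return [build, link]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for args in commands:
        subprocess.run(args, check=True)
```

For example, `run_all(gdal_vrt_commands(["tile_1.tif", "tile_2.tif"], "mosaic.vrt", "mosaic"))` would leave GDAL in charge of the mosaic, so GRASS sees one linked raster instead of a GRASS VRT over several linked rasters.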
Thanks @metzm for your insights! Then I would suggest we close this issue once the known-issue for this corner case is documented in the manual. Using GDAL VRTs for GDAL linked data works actually quite well, facilitated with: |
Describe the bug
I am experiencing significant performance issues with virtual rasters built with r.buildvrt over GDAL-linked (r.external) raster maps (the source is in GeoTIFF format) on NFS. After more testing, it seems the NFS file system amplifies the issue, but there are significant performance issues also on local file systems and also with raster maps in native GRASS format...
Running r.univar on two GDAL-linked raster maps that cover my computational region takes less than a second. Using the same computational region, one r.univar run on a virtual raster of the same two raster maps is orders of magnitude slower (30 seconds to minutes).
Below you find a script to run performance tests on different file systems and with different formats. While VRT maps with raster maps in native GRASS format are sometimes faster than r.external linked GeoTiffs, performance is way worse compared to r.univar on the individual raster maps (= no VRT). So it seems the issue is reading GDAL linked raster maps through GRASS VRTs.
In debug=2 mode I see waaaaay more calls to:
when running r.univar on a VRT compared to reading the same maps not going through VRT. That is probably the main root cause...
Hints on how to identify or find possible remedy in the code would be very welcome...
To reproduce
Expected behavior
VRT raster maps should be at least comparable in read performance to the raster maps they are built from
System description
version=8.3.1
date=2023
revision=exported
build_date=2023-10-26
build_platform=x86_64-pc-linux-gnu
build_off_t_size=8
libgis_revision=8.3.1
libgis_date=2023-10-26T09:06:16+00:00
proj=9.1.1
gdal=3.6.4
geos=3.11.1
sqlite=3.37.2
Additional context
GRASS GIS version: 8.5.dev behaves the same...