Docs: clarify when the parquet reader will read from object store when using cached metadata #10909
Which issue does this PR close?
Part of #10580
Rationale for this change
While working on #10701 it was quite unclear to me why the parquet reader was making a second object store request even though I had passed it a pre-existing `ParquetMetadata`.
It turns out that this was because the cached `ParquetMetadata` didn't have the page index structures loaded, so the parquet exec loads them on demand if required.
What changes are included in this PR?
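The behavior described in the rationale can be illustrated with a small stand-alone model. This is only a sketch: `CachedMetadata`, `CountingStore`, and `open_with_cached_metadata` are hypothetical names invented for illustration, not types or functions from DataFusion or the `parquet` crate, and the real reader logic is more involved. It assumes only what is stated above: a reader given cached metadata goes back to the object store when the page index is needed but missing.

```rust
/// Hypothetical stand-in for cached Parquet metadata
/// (NOT the real type from the `parquet` crate).
#[derive(Clone, Debug, Default)]
struct CachedMetadata {
    /// Footer-level information, always present once cached.
    num_row_groups: usize,
    /// Page index structures; `None` means they were never loaded.
    page_index: Option<Vec<u64>>,
}

/// Hypothetical object store that only counts the requests it serves.
#[derive(Default)]
struct CountingStore {
    requests: usize,
}

impl CountingStore {
    /// Pretend to fetch the page index: one request, dummy contents.
    fn fetch_page_index(&mut self, num_row_groups: usize) -> Vec<u64> {
        self.requests += 1;
        vec![0; num_row_groups]
    }
}

/// Models a reader that is handed cached metadata. It only touches the
/// store when the page index is needed and the cache does not hold it.
fn open_with_cached_metadata(
    store: &mut CountingStore,
    mut meta: CachedMetadata,
    needs_page_index: bool,
) -> CachedMetadata {
    if needs_page_index && meta.page_index.is_none() {
        meta.page_index = Some(store.fetch_page_index(meta.num_row_groups));
    }
    meta
}

/// Helper: how many store requests does opening with `meta` cost?
fn requests_for(meta: CachedMetadata, needs_page_index: bool) -> usize {
    let mut store = CountingStore::default();
    let _ = open_with_cached_metadata(&mut store, meta, needs_page_index);
    store.requests
}
```

In this model, caching metadata that already includes the page index avoids the extra request, and so does a scan that never needs the page index.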
Are these changes tested?
Note: I documented this in arrow-rs too: apache/arrow-rs#5887
Are there any user-facing changes?