Commit 8f76ac5

Docs: clarify when the reader will read from object store when using cached metadata (#10909)
1 parent cc60278 commit 8f76ac5

1 file changed, +9 -2 lines changed

datafusion/core/src/datasource/physical_plan/parquet/reader.rs

@@ -16,7 +16,7 @@
 // under the License.
 
 //! [`ParquetFileReaderFactory`] and [`DefaultParquetFileReaderFactory`] for
-//! creating parquet file readers
+//! low level control of parquet file readers
 
 use crate::datasource::physical_plan::{FileMeta, ParquetFileMetrics};
 use bytes::Bytes;
@@ -33,12 +33,19 @@ use std::sync::Arc;
 ///
 /// The combined implementations of [`ParquetFileReaderFactory`] and
 /// [`AsyncFileReader`] can be used to provide custom data access operations
-/// such as pre-cached data, I/O coalescing, etc.
+/// such as pre-cached metadata, I/O coalescing, etc.
 ///
 /// See [`DefaultParquetFileReaderFactory`] for a simple implementation.
 pub trait ParquetFileReaderFactory: Debug + Send + Sync + 'static {
     /// Provides an `AsyncFileReader` for reading data from a parquet file specified
     ///
+    /// # Notes
+    ///
+    /// If the resulting [`AsyncFileReader`] returns `ParquetMetaData` without
+    /// page index information, the reader will load it on demand. Thus it is important
+    /// to ensure that the returned `ParquetMetaData` has the necessary information
+    /// if you wish to avoid a subsequent I/O
+    ///
     /// # Arguments
     /// * partition_index - Index of the partition (for reporting metrics)
     /// * file_meta - The file to be read
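
The behavior described in the new `# Notes` section can be modeled with a small standalone sketch. It does not use the DataFusion or parquet crates: `Metadata`, `CachedReader`, and `open_for_scan` are hypothetical stand-ins for `ParquetMetaData`, an `AsyncFileReader` implementation, and the scan logic, and the counter only simulates object store requests. The aim is to show why cached metadata that lacks the page index still causes a read.

```rust
use std::cell::Cell;
use std::sync::Arc;

/// Hypothetical stand-in for `ParquetMetaData` (not the real type).
#[derive(Debug, Clone)]
struct Metadata {
    num_row_groups: usize,
    /// `None` models metadata that was cached without the page index.
    page_index: Option<Vec<u64>>,
}

/// Hypothetical stand-in for an `AsyncFileReader` that serves cached metadata.
struct CachedReader {
    cached: Arc<Metadata>,
    /// Counts simulated object store requests.
    io_count: Cell<usize>,
}

impl CachedReader {
    fn new(cached: Arc<Metadata>) -> Self {
        Self { cached, io_count: Cell::new(0) }
    }

    /// Returns the cached metadata without touching the object store.
    fn get_metadata(&self) -> Arc<Metadata> {
        Arc::clone(&self.cached)
    }

    /// Simulates loading the page index from the object store (one request).
    fn fetch_page_index(&self) -> Vec<u64> {
        self.io_count.set(self.io_count.get() + 1);
        vec![0; self.cached.num_row_groups]
    }
}

/// Models the scan: use the page index from the metadata if present,
/// otherwise load it on demand, which costs an extra request.
fn open_for_scan(reader: &CachedReader) -> Vec<u64> {
    let metadata = reader.get_metadata();
    match &metadata.page_index {
        Some(index) => index.clone(),
        None => reader.fetch_page_index(),
    }
}
```

In this model, calling `open_for_scan` on a `CachedReader` whose `Metadata` has `page_index: None` increments `io_count`, while a reader whose cached `Metadata` already carries `Some(..)` leaves it at zero. A real factory would apply the same idea by caching `ParquetMetaData` that was loaded together with its page index.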
