pub struct RecordBatch {
    schema: SchemaRef,
    columns: Vec<Arc<dyn Array>>,
    /// The number of rows in this RecordBatch
    ///
    /// This is stored separately from the columns to handle the case of no columns
    row_count: usize,
}

RecordBatch has a row_count, and by default I get 1000 rows at a time when reading from a Parquet file using a SQL query. Is there any way to configure this?

let config = SessionConfig::new().with_batch_size(5000);
let ctx = SessionContext::with_config(config);

I've tried changing the ExecutionOptions batch_size, but it still returns 1000 rows per batch. I could not find any other option that configures this. However, changing the query to ...
Replies: 1 comment 6 replies
I think batch_size (https://docs.rs/datafusion/17.0.0/datafusion/config/struct.ExecutionOptions.html#structfield.batch_size) is the correct setting, but I don't think the size is guaranteed. The 1000 rows are likely coming from the underlying reader (how are you registering alltypes_plain?).
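
To make the "not guaranteed" point concrete, here is a toy model in plain Rust. It is a sketch under assumptions, not DataFusion's actual code: the function names `reader_chunks` and `cap_batches` are hypothetical, and it assumes the engine treats batch_size purely as an upper bound, splitting oversized chunks but never merging small ones. Under that assumption, a reader that emits 1000-row chunks keeps producing 1000-row batches even when batch_size is 5000.

```rust
/// Toy model (hypothetical, not DataFusion's code): a reader that hands out
/// rows in fixed-size chunks, as a file reader with its own internal batch
/// size might. Returns the row count of each chunk.
fn reader_chunks(total_rows: usize, reader_chunk: usize) -> Vec<usize> {
    assert!(reader_chunk > 0, "reader_chunk must be positive");
    let mut sizes = Vec::new();
    let mut remaining = total_rows;
    while remaining > 0 {
        let n = remaining.min(reader_chunk);
        sizes.push(n);
        remaining -= n;
    }
    sizes
}

/// Assumption: `batch_size` acts only as an upper bound. Chunks larger than
/// `batch_size` are split; smaller chunks pass through unchanged (no merging).
fn cap_batches(chunks: &[usize], batch_size: usize) -> Vec<usize> {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut out = Vec::new();
    for &chunk in chunks {
        let mut remaining = chunk;
        while remaining > 0 {
            let n = remaining.min(batch_size);
            out.push(n);
            remaining -= n;
        }
    }
    out
}
```

For example, `cap_batches(&reader_chunks(7300, 1000), 5000)` leaves the reader's chunks untouched in this model: seven batches of 1000 rows and one of 300. Only a batch_size below the reader's chunk size changes the output. Whether the real 1000 comes from the reader depends on how the table was registered, which is why that question matters.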