Commit 3bc7714

memory pool example (#12849)
1 parent 3b6aac2 commit 3bc7714

1 file changed, +25 -1 lines changed

datafusion/execution/src/memory_pool/mod.rs

Lines changed: 25 additions & 1 deletion
@@ -68,11 +68,35 @@ pub use pool::*;
 /// Note that a `MemoryPool` can be shared by concurrently executing plans,
 /// which can be used to control memory usage in a multi-tenant system.
 ///
+/// # How MemoryPool works by example
+///
+/// Scenario 1:
+/// For the `Filter` operator, `RecordBatch`es stream through it, so it
+/// doesn't have to keep track of memory usage through [`MemoryPool`].
+///
+/// Scenario 2:
+/// For the `CrossJoin` operator, the intermediate state grows as the input
+/// size gets larger, so `CrossJoin` uses [`MemoryPool`] to
+/// limit the memory usage.
+/// 2.1 `CrossJoin` reads a new batch and asks the memory pool for
+/// additional memory. The memory pool updates the usage and returns success.
+/// 2.2 `CrossJoin` reads another batch and tries to reserve more memory,
+/// but the memory pool does not have enough memory. Since `CrossJoin`
+/// has not implemented spilling, it stops execution and returns an error.
+///
+/// Scenario 3:
+/// For the `Aggregate` operator, the intermediate state also accumulates as
+/// the input size gets larger, but it has spilling capability. When it tries
+/// to reserve more memory and the memory pool has already
+/// reached its limit, the pool returns an error. The `Aggregate`
+/// operator then spills the intermediate buffers to disk, releases memory
+/// back to the memory pool, and retries the memory reservation.
+///
 /// # Implementing `MemoryPool`
 ///
 /// You can implement a custom allocation policy by implementing the
 /// [`MemoryPool`] trait and configuring a `SessionContext` appropriately.
-/// However, mDataFusion comes with the following simple memory pool implementations that
+/// However, DataFusion comes with the following simple memory pool implementations that
 /// handle many common cases:
 ///
 /// * [`UnboundedMemoryPool`]: no memory limits (the default)
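
The difference between Scenario 2 (fail on exhaustion) and Scenario 3 (spill, release, retry) can be sketched with a toy model using only the standard library. This is not DataFusion's API: `ToyPool`, `buffer_without_spill`, and `buffer_with_spill` are hypothetical names invented for illustration, batch sizes are plain byte counts, and "spilling" is simulated by releasing the reservation without writing anything to disk.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical bounded pool; a stand-in for a real `MemoryPool`, not DataFusion's type.
struct ToyPool {
    limit: usize,
    used: AtomicUsize,
}

impl ToyPool {
    fn new(limit: usize) -> Self {
        Self { limit, used: AtomicUsize::new(0) }
    }

    /// Reserve `bytes`, or fail and leave the usage unchanged if the limit would be exceeded.
    fn try_grow(&self, bytes: usize) -> Result<(), String> {
        self.used
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |used| {
                let new_used = used.checked_add(bytes)?;
                (new_used <= self.limit).then_some(new_used)
            })
            .map(|_| ())
            .map_err(|used| {
                format!("cannot reserve {bytes} bytes: {used} of {} in use", self.limit)
            })
    }

    /// Return `bytes` to the pool. Callers must only return what they reserved.
    fn shrink(&self, bytes: usize) {
        self.used.fetch_sub(bytes, Ordering::SeqCst);
    }

    fn used(&self) -> usize {
        self.used.load(Ordering::SeqCst)
    }
}

/// Scenario 2: buffer every batch; give up when the pool refuses a reservation.
/// On success, returns the number of bytes still reserved.
fn buffer_without_spill(pool: &ToyPool, batch_sizes: &[usize]) -> Result<usize, String> {
    let mut reserved = 0;
    for &size in batch_sizes {
        if let Err(e) = pool.try_grow(size) {
            pool.shrink(reserved); // release everything before reporting the error
            return Err(e);
        }
        reserved += size;
    }
    Ok(reserved)
}

/// Scenario 3: when a reservation fails, "spill" the buffered state, release its
/// memory, and retry. Returns (bytes still reserved, number of spills).
fn buffer_with_spill(pool: &ToyPool, batch_sizes: &[usize]) -> Result<(usize, usize), String> {
    let mut reserved = 0;
    let mut spills = 0;
    for &size in batch_sizes {
        if pool.try_grow(size).is_err() {
            // Simulated spill: pretend the buffers went to disk, then free them.
            pool.shrink(reserved);
            reserved = 0;
            spills += 1;
            // Retry once; a single batch larger than the whole pool still fails.
            pool.try_grow(size)?;
        }
        reserved += size;
    }
    Ok((reserved, spills))
}

fn main() {
    let pool = ToyPool::new(100);

    // Scenario 2: the first 60-byte batch fits, the second one does not.
    assert!(buffer_without_spill(&pool, &[60, 60]).is_err());
    assert_eq!(pool.used(), 0);

    // Scenario 3: the same input succeeds after one spill.
    let (reserved, spills) = buffer_with_spill(&pool, &[60, 60]).unwrap();
    assert_eq!((reserved, spills), (60, 1));
    pool.shrink(reserved);
    println!("spills: {spills}");
}
```

In this sketch the pool only tracks a counter. As in the scenarios above, it is the operator, not the pool, that decides whether a failed reservation ends the query or triggers a spill.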
