Split load partition and search partition into two different steps and parallelize based on different configs #3459
base: main
Conversation
ACTION NEEDED: The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error, please inspect the "PR Title Check" action.
Codecov Report
Attention: Patch coverage is

@@ Coverage Diff @@
##             main    #3459      +/-   ##
==========================================
+ Coverage   78.80%   78.84%   +0.03%
==========================================
  Files         251      251
  Lines       92834    92887      +53
  Branches    92834    92887      +53
==========================================
+ Hits        73156    73234      +78
+ Misses      16702    16677      -25
  Partials     2976     2976

Flags with carried forward coverage won't be shown.
this makes sense to me, just need to run some benchmarks to see the search performance
@@ -215,3 +226,53 @@ pub trait VectorIndex: Send + Sync + std::fmt::Debug + Index {
    /// the index type of this vector index.
    fn sub_index_type(&self) -> (SubIndexType, QuantizationType);
}

#[async_trait]
pub trait ParallelSearchInPartitionFunctions: Send + Sync {
Probably rename it to PartitionSearcher?
+1
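A hedged sketch of the `PartitionSearcher` rename suggested above. This is a synchronous, stdlib-only stand-in, not the lance API: the real trait is async and returns Arrow record batches, and everything here except the `io_parallelism` name is an illustrative assumption.

```rust
// Illustrative stand-in for a per-partition result batch; the real code
// would return Arrow RecordBatches.
#[derive(Debug, PartialEq)]
struct Batch(Vec<u32>);

// Hypothetical shape of the renamed trait: it owns the "search one
// partition" step, while a higher-level driver decides how many
// partitions to search concurrently via io_parallelism().
trait PartitionSearcher: Send + Sync {
    /// How many partitions may be searched concurrently.
    fn io_parallelism(&self) -> usize;

    /// Search a single partition (the real trait method is async).
    fn search_partition(&self, partition_id: usize) -> Batch;
}

struct FlatSearcher;

impl PartitionSearcher for FlatSearcher {
    fn io_parallelism(&self) -> usize {
        8
    }

    fn search_partition(&self, partition_id: usize) -> Batch {
        // Toy result: just echo the partition id.
        Batch(vec![partition_id as u32])
    }
}

fn main() {
    let s = FlatSearcher;
    assert_eq!(s.io_parallelism(), 8);
    assert_eq!(s.search_partition(3), Batch(vec![3]));
    println!("ok");
}
```

The noun-style name reads better than `ParallelSearchInPartitionFunctions` because the trait describes a capability (searching partitions), not a bag of functions.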
    fn io_parallelism(&self) -> usize;
}

pub async fn parallel_search_in_partitions<T: Send + Sync + std::fmt::Debug>(
this is nice and it looks like it can reduce the cold search latency. It would be nice if you could add some benchmarks for this; that can also help us figure out what the perf impact is! See rust/lance/benches for more details.
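The real benches under rust/lance/benches use a benchmark harness; as a hedged, stdlib-only illustration of the comparison the reviewer is asking for, here is a micro-benchmark skeleton that times serial vs. parallel partition search over a toy workload (all names and the workload itself are stand-ins):

```rust
use std::time::Instant;

// Stand-in for the CPU work of searching one IVF partition.
fn search_partition(p: usize) -> u64 {
    (0..1_000u64).map(|i| i.wrapping_mul(p as u64 + 1)).sum()
}

fn main() {
    let partitions: Vec<usize> = (0..32).collect();

    // Serial baseline: search partitions one after another.
    let t = Instant::now();
    let serial: u64 = partitions.iter().map(|&p| search_partition(p)).sum();
    println!("serial:   {:?}", t.elapsed());

    // Parallel version: one thread per partition (a real benchmark would
    // cap concurrency at io_parallelism()).
    let t = Instant::now();
    let handles: Vec<_> = partitions
        .iter()
        .map(|&p| std::thread::spawn(move || search_partition(p)))
        .collect();
    let parallel: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    println!("parallel: {:?}", t.elapsed());

    // Both strategies must produce identical results.
    assert_eq!(serial, parallel);
}
```

A proper benchmark in rust/lance/benches would also vary the partition count and dataset size, since the win from parallelizing the search step likely depends on both.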
yeah this is nice. we can now control dispatch of CPU-intensive tasks for IVF all in one place (on the warming path).
            .map_err(|e| {
                DataFusionError::Execution(format!(
                    "Failed to calculate KNN: {}",
                    e
                ))
            })
            .await
            .await?;
        concat_batches(&batches[0].schema(), &batches).map_err(|e| {
do we need to concat the batches? IIUC we didn't do this before this PR?
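For context on the question above: once each partition is searched independently, the caller ends up holding one result batch per partition, and the concat merges them back into a single batch. A minimal sketch of that shape, with plain vectors standing in for Arrow record batches (everything here is illustrative, not arrow's `concat_batches`):

```rust
fn main() {
    // One result "batch" per searched partition.
    let per_partition: Vec<Vec<i32>> = vec![vec![1, 2], vec![3], vec![4, 5]];

    // Analogue of concatenating the batches: merge into a single batch
    // so downstream operators see one stream of rows.
    let merged: Vec<i32> = per_partition.into_iter().flatten().collect();

    assert_eq!(merged, vec![1, 2, 3, 4, 5]);
    println!("ok");
}
```

Whether the concat is actually needed depends on whether downstream code can consume multiple batches; if it can, skipping the copy would be cheaper.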
@@ -215,3 +226,53 @@ pub trait VectorIndex: Send + Sync + std::fmt::Debug + Index {
    /// the index type of this vector index.
    fn sub_index_type(&self) -> (SubIndexType, QuantizationType);
}

#[async_trait]
@BubbleCal is this code used in V3 at all? I have async_trait-phobia now lol, because it makes some optimizations no longer possible
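For readers unfamiliar with the concern: `#[async_trait]` works by desugaring each `async fn` into a method returning a boxed, type-erased future, which costs a heap allocation and dynamic dispatch on every call and blocks optimizations like inlining the future's state machine. A stdlib-only sketch of that desugared shape (the `Search`/`Ivf` names and the minimal executor are illustrative assumptions, not lance code):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Minimal executor sufficient for futures that complete without waiting.
fn block_on<F: Future>(mut f: F) -> F::Output {
    fn raw() -> RawWaker {
        fn no_op(_: *const ()) {}
        fn clone(_: *const ()) -> RawWaker {
            raw()
        }
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    let mut pinned = unsafe { Pin::new_unchecked(&mut f) };
    loop {
        if let Poll::Ready(v) = pinned.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

// Roughly what #[async_trait] expands `async fn search(...)` into:
// every call returns Pin<Box<dyn Future>>, i.e. a fresh allocation
// plus vtable dispatch when the future is polled.
trait Search {
    fn search(&self, q: u32) -> Pin<Box<dyn Future<Output = u32> + Send + '_>>;
}

struct Ivf;

impl Search for Ivf {
    fn search(&self, q: u32) -> Pin<Box<dyn Future<Output = u32> + Send + '_>> {
        Box::pin(async move { q + 1 })
    }
}

fn main() {
    assert_eq!(block_on(Ivf.search(41)), 42);
    println!("ok");
}
```

Native `async fn` in traits avoids the per-call box when the concrete type is known, which is presumably the optimization being referred to.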
        pre_filter.wait_for_ready().await?;
        let query = self.preprocess_query(partition_id, query)?;

        spawn_cpu(move || {
nit: let's move spawn_cpu into parallel_search_in_partitions, so the execution topology decision is made at a higher layer.
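The refactor suggested in that nit can be sketched as follows, using plain threads as a stand-in for lance's `spawn_cpu` (all function bodies and names besides `spawn_cpu`/`parallel_search_in_partitions` are illustrative assumptions): the per-partition search stays a pure computation, and the high-level driver alone decides how work is dispatched.

```rust
use std::thread;

// Pure per-partition work: no knowledge of threads or executors,
// so the caller fully controls the execution topology.
fn search_partition(partition_id: usize, query: u32) -> u32 {
    partition_id as u32 * 10 + query
}

// High-level driver: the only place that decides how the CPU-intensive
// work is dispatched (thread::spawn here; spawn_cpu in lance).
fn parallel_search_in_partitions(partitions: &[usize], query: u32) -> Vec<u32> {
    let handles: Vec<_> = partitions
        .iter()
        .map(|&p| thread::spawn(move || search_partition(p, query)))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let out = parallel_search_in_partitions(&[0, 1, 2], 7);
    assert_eq!(out, vec![7, 17, 27]);
    println!("ok");
}
```

Keeping dispatch in one place also makes it straightforward to cap concurrency with `io_parallelism()` later without touching the per-partition code.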
No description provided.