Segment prefetcher #2613

Open
wants to merge 21 commits into base: develop
Contributor

onursatici commented Mar 6, 2025

No description provided.

Contributor

github-actions bot commented Mar 6, 2025

Benchmarks: TPC-H on NVME

Table of Results
name PR(0223df4) base(1da651f) ratio(PR/base) unit
tpch_q01/arrow 45993094 45878200 1.0025 ns
tpch_q02/arrow 50045271 51316854 0.975221 ns
tpch_q03/arrow 31541848 31972362 0.986535 ns
tpch_q04/arrow 22984335 23982446 0.958382 ns
tpch_q05/arrow 50172952 51163165 0.980646 ns
tpch_q06/arrow 10142713 9855101 1.02918 ns
tpch_q07/arrow 78791777 76744656 1.02667 ns
tpch_q08/arrow 57472358 58821869 0.977058 ns
tpch_q09/arrow 71523790 71986571 0.993571 ns
tpch_q10/arrow 46133059 46904505 0.983553 ns
tpch_q11/arrow 24973618 25652803 0.973524 ns
tpch_q12/arrow 27509430 29521343 0.931849 ns
tpch_q13/arrow 15950679 17129633 0.931175 ns
tpch_q14/arrow 14301217 14198328 1.00725 ns
tpch_q15/arrow 26801080 26248691 1.02104 ns
tpch_q16/arrow 21870679 22234645 0.983631 ns
tpch_q17/arrow 61360329 63089730 0.972588 ns
tpch_q18/arrow 100463126 101567374 0.989128 ns
tpch_q19/arrow 27911990 27907861 1.00015 ns
tpch_q20/arrow 34748187 35521459 0.978231 ns
tpch_q21/arrow 117263814 117231968 1.00027 ns
tpch_q22/arrow 16185467 15328996 1.05587 ns
tpch_q01/parquet 120520469 130971366 0.920205 ns
tpch_q02/parquet 117065821 122466029 0.955904 ns
tpch_q03/parquet 110887023 114415920 0.969157 ns
tpch_q04/parquet 65054831 65654388 0.990868 ns
tpch_q05/parquet 124725506 122994993 1.01407 ns
tpch_q06/parquet 26851043 27838054 0.964545 ns
tpch_q07/parquet 141380358 145606526 0.970975 ns
tpch_q08/parquet 170441544 171302984 0.994971 ns
tpch_q09/parquet 223163764 221804627 1.00613 ns
tpch_q10/parquet 136137652 140733694 0.967342 ns
tpch_q11/parquet 58014913 56992506 1.01794 ns
tpch_q12/parquet 96229551 99311413 0.968968 ns
tpch_q13/parquet 153181027 157387560 0.973273 ns
tpch_q14/parquet 46074094 48073267 0.958414 ns
tpch_q15/parquet 64634391 68847427 0.938806 ns
tpch_q16/parquet 52528929 53021129 0.990717 ns
tpch_q17/parquet 133829547 140518236 0.9524 ns
tpch_q18/parquet 194660037 201291831 0.967054 ns
tpch_q19/parquet 76787456 80173651 0.957764 ns
tpch_q20/parquet 94171183 104222206 0.903562 ns
tpch_q21/parquet 189871110 198138607 0.958274 ns
tpch_q22/parquet 51700697 53684605 0.963045 ns
tpch_q01/vortex-file-compressed 37242007 39064620 0.953344 ns
tpch_q02/vortex-file-compressed 55844330 61057031 0.914626 ns
tpch_q03/vortex-file-compressed 31042326 35878231 0.865213 ns
tpch_q04/vortex-file-compressed 18543689 22212947 0.834814 ns
tpch_q05/vortex-file-compressed 46203428 50360326 0.917457 ns
tpch_q06/vortex-file-compressed 10106862 10035759 1.00708 ns
tpch_q07/vortex-file-compressed 71177885 71152121 1.00036 ns
tpch_q08/vortex-file-compressed 55002909 58987048 0.932457 ns
tpch_q09/vortex-file-compressed 71064750 71911534 0.988225 ns
tpch_q10/vortex-file-compressed 56626913 55644444 1.01766 ns
tpch_q11/vortex-file-compressed 27348665 25084056 1.09028 ns
tpch_q12/vortex-file-compressed 21780358 22036299 0.988385 ns
tpch_q13/vortex-file-compressed 26156443 26626746 0.982337 ns
tpch_q14/vortex-file-compressed 13528166 16166786 0.836788 ns
tpch_q15/vortex-file-compressed 28520307 31461162 0.906524 ns
tpch_q16/vortex-file-compressed 28752043 30584088 0.940098 ns
tpch_q17/vortex-file-compressed 57294023 59045393 0.970339 ns
tpch_q18/vortex-file-compressed 88599574 88076577 1.00594 ns
tpch_q19/vortex-file-compressed 29674821 31841926 0.931942 ns
tpch_q20/vortex-file-compressed 36859209 40210776 0.91665 ns
tpch_q21/vortex-file-compressed 88308757 92592010 0.953741 ns
tpch_q22/vortex-file-compressed 27636702 31815486 0.868656 ns

Contributor

github-actions bot commented Mar 6, 2025

Benchmarks: Clickbench on NVME

Table of Results
name PR(0223df4) base(1da651f) ratio(PR/base) unit
clickbench_q00/parquet 2375353 2557915 0.928629 ns
clickbench_q01/parquet 33543799 32049024 1.04664 ns
clickbench_q02/parquet 63194207 65740255 0.961271 ns
clickbench_q03/parquet 53041657 51449467 1.03095 ns
clickbench_q04/parquet 318832194 325894840 0.978328 ns
clickbench_q05/parquet 304638309 315031281 0.96701 ns
clickbench_q06/parquet 2322985 2323399 0.999822 ns
clickbench_q07/parquet 30735400 35152599 0.874342 ns
clickbench_q08/parquet 379599869 395388476 0.960068 ns
clickbench_q09/parquet 567391806 561938047 1.00971 ns
clickbench_q10/parquet 115277615 120778633 0.954454 ns
clickbench_q11/parquet 140198922 142438542 0.984277 ns
clickbench_q12/parquet 312727226 312390317 1.00108 ns
clickbench_q13/parquet 485517681 478820207 1.01399 ns
clickbench_q14/parquet 318796098 319695397 0.997187 ns
clickbench_q15/parquet 348956925 353777893 0.986373 ns
clickbench_q16/parquet 748568144 768210056 0.974432 ns
clickbench_q17/parquet 664590344 663541735 1.00158 ns
clickbench_q18/parquet 1475999139 1553262830 0.950257 ns
clickbench_q19/parquet 43037033 42063166 1.02315 ns
clickbench_q20/parquet 551456090 540536122 1.0202 ns
clickbench_q21/parquet 626175103 625458321 1.00115 ns
clickbench_q22/parquet 961832942 967487158 0.994156 ns
clickbench_q23/parquet 3858353114 3864000390 0.998538 ns
clickbench_q24/parquet 192573508 196023788 0.982399 ns
clickbench_q25/parquet 165941510 174614026 0.950333 ns
clickbench_q26/parquet 218361301 223360314 0.977619 ns
clickbench_q27/parquet 746990933 754602509 0.989913 ns
clickbench_q28/parquet 4388353654 4490987832 0.977147 ns
clickbench_q29/parquet 240865873 242882836 0.991696 ns
clickbench_q30/parquet 322223601 322516185 0.999093 ns
clickbench_q31/parquet 368442773 373997932 0.985147 ns
clickbench_q32/parquet 1707242134 1753330707 0.973714 ns
clickbench_q33/parquet 1500823784 1498549010 1.00152 ns
clickbench_q34/parquet 1496643101 1472960836 1.01608 ns
clickbench_q35/parquet 503655683 505379169 0.99659 ns
clickbench_q36/parquet 149495409 149206152 1.00194 ns
clickbench_q37/parquet 69352282 70101999 0.989305 ns
clickbench_q38/parquet 92382737 96119590 0.961123 ns
clickbench_q39/parquet 275483974 278612211 0.988772 ns
clickbench_q40/parquet 42983994 46083031 0.932751 ns
clickbench_q41/parquet 42449737 43361221 0.978979 ns
clickbench_q42/parquet 52951710 51401499 1.03016 ns
clickbench_q00/vortex-file-compressed 4042190 4412330 0.916112 ns
clickbench_q01/vortex-file-compressed 19419739 19628127 0.989383 ns
clickbench_q02/vortex-file-compressed 31941352 31998282 0.998221 ns
clickbench_q03/vortex-file-compressed 40350195 41593280 0.970113 ns
clickbench_q04/vortex-file-compressed 297741010 303078841 0.982388 ns
clickbench_q05/vortex-file-compressed 336381640 321075395 1.04767 ns
clickbench_q06/vortex-file-compressed 4053703 4259348 0.951719 ns
clickbench_q07/vortex-file-compressed 21513789 19828215 1.08501 ns
clickbench_q08/vortex-file-compressed 368533180 369284279 0.997966 ns
clickbench_q09/vortex-file-compressed 491077143 496709350 0.988661 ns
clickbench_q10/vortex-file-compressed 64973747 68856527 0.943611 ns
clickbench_q11/vortex-file-compressed 69306186 77755028 0.89134 ns
clickbench_q12/vortex-file-compressed 258260165 254428160 1.01506 ns
clickbench_q13/vortex-file-compressed 359313623 364452994 0.985898 ns
clickbench_q14/vortex-file-compressed 259972240 248485136 1.04623 ns
clickbench_q15/vortex-file-compressed 364463512 374344917 0.973603 ns
clickbench_q16/vortex-file-compressed 761663848 776901978 0.980386 ns
clickbench_q17/vortex-file-compressed 750097315 755262662 0.993161 ns
clickbench_q18/vortex-file-compressed 1293044137 1290771956 1.00176 ns
clickbench_q19/vortex-file-compressed 28552387 30302768 0.942237 ns
clickbench_q20/vortex-file-compressed 244472030 239188855 1.02209 ns
clickbench_q21/vortex-file-compressed 282587028 276671604 1.02138 ns
clickbench_q22/vortex-file-compressed 465389489 460296729 1.01106 ns
clickbench_q23/vortex-file-compressed 841648512 773083146 1.08869 ns
clickbench_q24/vortex-file-compressed 89019207 87143606 1.02152 ns
clickbench_q25/vortex-file-compressed 95657398 95990729 0.996527 ns
clickbench_q26/vortex-file-compressed 122366184 118823373 1.02982 ns
clickbench_q27/vortex-file-compressed 534729619 565038389 0.94636 ns
clickbench_q28/vortex-file-compressed 5036116870 5181550228 0.971932 ns
clickbench_q29/vortex-file-compressed 235911720 271526545 0.868835 ns
clickbench_q30/vortex-file-compressed 234089589 220405097 1.06209 ns
clickbench_q31/vortex-file-compressed 233705178 235514332 0.992318 ns
clickbench_q32/vortex-file-compressed 1321641576 1288460932 1.02575 ns
clickbench_q33/vortex-file-compressed 1246627210 1201841697 1.03726 ns
clickbench_q34/vortex-file-compressed 1259230385 1206333992 1.04385 ns
clickbench_q35/vortex-file-compressed 593807205 614798892 0.965856 ns
clickbench_q36/vortex-file-compressed 88145661 59073880 1.49213 ns
clickbench_q37/vortex-file-compressed 62309427 36091101 1.72645 ns
clickbench_q38/vortex-file-compressed 63264813 25052349 2.5253 ns
clickbench_q39/vortex-file-compressed 186918991 106362396 1.75738 ns
clickbench_q40/vortex-file-compressed 44668914 25493864 1.75214 ns
clickbench_q41/vortex-file-compressed 36918298 23184944 1.59234 ns
clickbench_q42/vortex-file-compressed 33778122 32922194 1.026 ns

Contributor

github-actions bot commented Mar 6, 2025

Benchmarks: TPC-H on S3

Table of Results
name PR(0223df4) base(1da651f) ratio(PR/base) unit
tpch_q01/parquet 256496086 259018588 0.990261 ns
tpch_q02/parquet 661926168 689823252 0.959559 ns
tpch_q03/parquet 411461013 427461083 0.96257 ns
tpch_q04/parquet 222507909 240765164 0.92417 ns
tpch_q05/parquet 569029335 574779118 0.989997 ns
tpch_q06/parquet 183692281 182620347 1.00587 ns
tpch_q07/parquet 608946111 623449640 0.976737 ns
tpch_q08/parquet 782042062 812644930 0.962342 ns
tpch_q09/parquet 694942481 694398116 1.00078 ns
tpch_q10/parquet 532012844 535751905 0.993021 ns
tpch_q11/parquet 273098730 289954110 0.941869 ns
tpch_q12/parquet 291436403 281701731 1.03456 ns
tpch_q13/parquet 394199885 392842115 1.00346 ns
tpch_q14/parquet 250966406 250670787 1.00118 ns
tpch_q15/parquet 476486284 476525759 0.999917 ns
tpch_q16/parquet 262725672 268808099 0.977373 ns
tpch_q17/parquet 386207919 402959123 0.95843 ns
tpch_q18/parquet 540249522 549294854 0.983533 ns
tpch_q19/parquet 277391945 275656407 1.0063 ns
tpch_q20/parquet 518112214 515901297 1.00429 ns
tpch_q21/parquet 618235341 602160861 1.02669 ns
tpch_q22/parquet 264368825 264812871 0.998323 ns
tpch_q01/vortex-file-compressed 127103105 144112018 0.881974 ns
tpch_q02/vortex-file-compressed 314836835 429859049 0.732419 ns
tpch_q03/vortex-file-compressed 202740322 288509000 0.702717 ns
tpch_q04/vortex-file-compressed 118648077 183838176 0.645394 ns
tpch_q05/vortex-file-compressed 260584008 315606108 0.825662 ns
tpch_q06/vortex-file-compressed 99656945 116466258 0.855672 ns
tpch_q07/vortex-file-compressed 314006479 390220806 0.804689 ns
tpch_q08/vortex-file-compressed 361954125 451942147 0.800886 ns
tpch_q09/vortex-file-compressed 385957655 404126522 0.955042 ns
tpch_q10/vortex-file-compressed 337331080 388778055 0.86767 ns
tpch_q11/vortex-file-compressed 127054005 165025146 0.769907 ns
tpch_q12/vortex-file-compressed 165649137 215244307 0.769587 ns
tpch_q13/vortex-file-compressed 205328860 216862084 0.946818 ns
tpch_q14/vortex-file-compressed 117797994 133623308 0.881568 ns
tpch_q15/vortex-file-compressed 269927144 304316651 0.886994 ns
tpch_q16/vortex-file-compressed 106793514 194740457 0.548389 ns
tpch_q17/vortex-file-compressed 164033881 216434921 0.75789 ns
tpch_q18/vortex-file-compressed 279097447 288497309 0.967418 ns
tpch_q19/vortex-file-compressed 156674434 192538225 0.813732 ns
tpch_q20/vortex-file-compressed 228473707 338538809 0.674882 ns
tpch_q21/vortex-file-compressed 336132523 494462621 0.679794 ns
tpch_q22/vortex-file-compressed 126805137 159175652 0.796637 ns
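
The ratio column in these tables is the PR time divided by the base time, so values below 1 mean the PR ran faster. A minimal sketch of that arithmetic follows, checked against rows copied from the tables. The function names and the 10% noise threshold are illustrative assumptions and are not part of the benchmark tooling.

```rust
/// Ratio of PR time to base time; values below 1.0 mean the PR was faster.
fn ratio(pr_ns: u64, base_ns: u64) -> f64 {
    pr_ns as f64 / base_ns as f64
}

/// Coarse label for a ratio given a relative noise threshold.
/// Hypothetical helper: the threshold is an assumption, not a project setting.
fn classify(ratio: f64, threshold: f64) -> &'static str {
    if ratio < 1.0 - threshold {
        "faster"
    } else if ratio > 1.0 + threshold {
        "slower"
    } else {
        "within noise"
    }
}
```

For example, `ratio(106793514, 194740457)` reproduces the tpch_q16/vortex-file-compressed row on S3, and `ratio(63264813, 25052349)` reproduces the clickbench_q38/vortex-file-compressed row on NVME.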

@@ -275,6 +275,22 @@ impl Layout {
            .register_splits(self, field_mask, row_offset, splits)
    }

    pub fn required_segments(
Contributor

Please add docs to public functions.
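
As a rough illustration of the kind of documentation being requested, here is a doc-commented public function. `byte_range` and its parameters are hypothetical and are not taken from this PR; the full signature of `required_segments` is not visible in the diff context, so it is not reproduced here.

```rust
use std::ops::Range;

/// Returns the half-open byte range `offset..offset + length` covered by a segment.
///
/// `offset` is the segment's starting byte position and `length` is its size in bytes.
/// The length is widened to `u64` before the addition so the sum is computed in 64 bits.
pub fn byte_range(offset: u64, length: u32) -> Range<u64> {
    offset..offset + u64::from(length)
}
```

The `missing_docs` lint (for example `#![warn(missing_docs)]` at the crate root) is one way to have the compiler flag undocumented public items.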

Member

robert3005 left a comment

Did a first pass. I think I need to spend more time looking at ordering and cancellations.

location: location.clone(),
callback: request.callback,
})
let inflight_segments: InflightSegments = Default::default();
Member

You could just do

Suggested change
- let inflight_segments: InflightSegments = Default::default();
+ let inflight_segments = InflightSegments::default();
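
Both forms call the same `Default` implementation; the suggested one names the type once, on the right-hand side. A small sketch showing that they produce equal values. The `InflightSegments` alias below is a stand-in, since the PR's real definition is not shown in this conversation.

```rust
use std::collections::HashMap;

// Stand-in for the PR's `InflightSegments`; the real definition is not shown
// in this conversation, so this alias is an assumption for illustration only.
type InflightSegments = HashMap<u32, Vec<u8>>;

fn both_forms() -> (InflightSegments, InflightSegments) {
    // Form from the PR: type annotation plus the trait-level call.
    let annotated: InflightSegments = Default::default();
    // Suggested form: the type is named once, on the right-hand side.
    let direct = InflightSegments::default();
    (annotated, direct)
}
```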

Comment on lines +290 to +292
if items.is_empty() {
    items.reserve(*this.requests_ready_chunks);
}
Member

You can pull this out of the loop and into the initialization of items.

Comment on lines +309 to +311
if items.is_empty() {
    items.reserve(*this.requests_ready_chunks);
}
Member

Then kill this.
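
One way to read these two comments together: reserve the buffer's capacity once, where `items` is created, so neither branch needs the `is_empty` check. A minimal synchronous sketch of that shape follows. `batch` and its arguments are hypothetical stand-ins for the stream's polling loop, which is not shown in full here.

```rust
/// Groups `input` into batches of at most `cap` items (hypothetical helper).
fn batch<T>(input: impl IntoIterator<Item = T>, cap: usize) -> Vec<Vec<T>> {
    assert!(cap > 0, "batch size must be non-zero");
    let mut out = Vec::new();
    // Capacity is reserved once, at initialization, instead of inside the loop.
    let mut items = Vec::with_capacity(cap);
    for item in input {
        items.push(item);
        if items.len() == cap {
            // Hand off the full batch and start a new buffer with the same capacity.
            out.push(std::mem::replace(&mut items, Vec::with_capacity(cap)));
        }
    }
    // Flush whatever is left once the input is exhausted.
    if !items.is_empty() {
        out.push(items);
    }
    out
}
```

Whether this matches the PR's loop depends on how `items` is handed off between polls; if the buffer is taken with `std::mem::take`, the replacement buffer starts with zero capacity and would need `Vec::with_capacity` at the hand-off point as sketched.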

    fn range(&self) -> Range<u64> {
        self.location.offset..self.location.offset + self.location.length as u64
    }
}

impl From<(SegmentId, Segment, oneshot::Receiver<()>)> for SegmentRequest {
Member

This From impl is way too cute; we should add a method directly on SegmentRequest.
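
A sketch of what a method on `SegmentRequest` could look like in place of the tuple conversion. All types, field names, and the constructor name below are simplified assumptions: the PR's real `SegmentId`, `Segment`, and oneshot receiver are not reproduced, and `std::sync::mpsc::Receiver` is swapped in so the example needs only the standard library.

```rust
use std::sync::mpsc::{channel, Receiver};

// All types below are simplified stand-ins for the PR's real definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SegmentId(pub u32);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Segment {
    pub offset: u64,
    pub length: u32,
}

pub struct SegmentRequest {
    pub id: SegmentId,
    pub location: Segment,
    // `std::sync::mpsc::Receiver` stands in for a oneshot receiver.
    pub cancel_handle: Receiver<()>,
}

impl SegmentRequest {
    /// Named constructor: each argument is labeled at the definition site,
    /// which a `From<(A, B, C)>` tuple conversion does not offer.
    pub fn new(id: SegmentId, location: Segment, cancel_handle: Receiver<()>) -> Self {
        Self { id, location, cancel_handle }
    }
}

/// Small usage helper (hypothetical) that builds one request.
pub fn example_request() -> SegmentRequest {
    let (_tx, rx) = channel();
    SegmentRequest::new(SegmentId(7), Segment { offset: 64, length: 16 }, rx)
}
```

Call sites then read `SegmentRequest::new(id, segment, rx)` rather than `(id, segment, rx).into()`, which keeps the conversion discoverable through the type's own API.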

let nchunks = layout.nchildren() - (if layout.metadata().is_some() { 1 } else { 0 });
let mut offset = row_offset;
for i in 0..nchunks {
    let child = layout.child(i, layout.dtype().clone(), format!("[{}]", i))?;
Member

Suggested change
- let child = layout.child(i, layout.dtype().clone(), format!("[{}]", i))?;
+ let child = layout.child(i, layout.dtype().clone(), format!("[{i}]"))?;
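
The suggestion relies on inline captured identifiers in format strings, available since Rust 1.58. A short sketch showing that the two spellings produce the same string; the function names are made up for the example.

```rust
/// Builds a child name such as `[3]` with an inline format argument.
fn child_name(i: usize) -> String {
    format!("[{i}]")
}

/// Same output using the positional form from the original line.
fn child_name_positional(i: usize) -> String {
    format!("[{}]", i)
}
```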

};

let cancelled_segments: Vec<_> = {
    let mut store = self.store.write().vortex_expect("poisoned lock");
Member

Suggested change
- let mut store = self.store.write().vortex_expect("poisoned lock");
+ let mut store = self.store.write()?;
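
The suggested `?` only compiles if the enclosing function returns a `Result` whose error type can be built from the lock's `PoisonError`; whether the project's error type provides that conversion is not visible in this conversation. A hedged sketch of the same idea with an explicit `map_err`, using a hypothetical `StoreError` and a plain `RwLock<Vec<u32>>` as the store:

```rust
use std::fmt;
use std::sync::RwLock;

/// Hypothetical error type; the project's real error type is not shown here.
#[derive(Debug, PartialEq, Eq)]
pub enum StoreError {
    Poisoned,
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "lock poisoned")
    }
}

impl std::error::Error for StoreError {}

/// Pushes a value, returning an error instead of panicking if the lock is poisoned.
pub fn push_segment(store: &RwLock<Vec<u32>>, id: u32) -> Result<usize, StoreError> {
    // `PoisonError` carries the lock guard, so it is mapped to an owned error
    // before `?` propagates it.
    let mut guard = store.write().map_err(|_| StoreError::Poisoned)?;
    guard.push(id);
    Ok(guard.len())
}
```

The trade-off is that callers now have to handle a poisoned lock as an ordinary error, where the `expect`-style call would panic at the lock site.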

3 participants