Upgrade to datafusion 47 #3016

Merged
merged 16 commits into develop from adamg/datafusion-47
Apr 30, 2025
Conversation

AdamGS
Contributor

@AdamGS AdamGS commented Apr 15, 2025

Getting ready for the datafusion 47 release.

Current issues

@robert3005
Contributor

They have an interesting feature, per-partition stats (apache/datafusion#15495), which would be useful for us.

@AdamGS AdamGS force-pushed the adamg/datafusion-47 branch from 4e9a4c3 to 702bb5e Compare April 15, 2025 14:19
@AdamGS AdamGS force-pushed the adamg/datafusion-47 branch from 702bb5e to 4385d01 Compare April 15, 2025 14:21
cloudflare-workers-and-pages bot commented Apr 18, 2025

Deploying vortex-bench with Cloudflare Pages

Latest commit: 1ee98ab
Status: ✅  Deploy successful!
Preview URL: https://f86d97a0.vortex-bench.pages.dev
Branch Preview URL: https://adamg-datafusion-47.vortex-bench.pages.dev

codspeed-hq bot commented Apr 18, 2025

CodSpeed Performance Report

Merging #3016 will not alter performance

Comparing adamg/datafusion-47 (1ee98ab) with develop (6c19dbd)

Summary

✅ 811 untouched benchmarks

@robert3005
Contributor

it's out #3061

@AdamGS
Contributor Author

AdamGS commented Apr 20, 2025

duckdb-rs still uses an older arrow version. I assume I can patch it here, given that we don't currently release vortex-duckdb, but maybe the better long-term solution is to go through FFI-like objects (the same is true for the datafusion <> vortex boundary). That way we're not dependent on a specific crate dependency, which shouldn't matter in these cases.

@robert3005
Copy link
Contributor

We have our own fork of duckdb-rs, so you can bump things there for now: https://github.com/spiraldb/duckdb-rs. But I agree that not having tight coupling is nice.

@AdamGS
Contributor Author

AdamGS commented Apr 20, 2025

that's a very good point :)

@AdamGS
Contributor Author

AdamGS commented Apr 20, 2025

there's some submodule issue here that I don't want to debug over the weekend, I'm sure alex/joe will be able to fix it in two seconds on tuesday 🦆

@AdamGS AdamGS changed the title from "[WIP] Datafusion 47 upgrade" to "Upgrade to datafusion 47" Apr 22, 2025
use reqwest::Url;

#[derive(Debug)]
pub struct SlowObjectStore {
Contributor Author

@AdamGS AdamGS Apr 22, 2025

I don't think anyone ever uses it, so we might as well remove it. If we ever want to go back to using it, it's always available here.

@AdamGS AdamGS marked this pull request as ready for review April 22, 2025 10:19
@AdamGS AdamGS requested a review from onursatici April 22, 2025 10:19
@AdamGS
Contributor Author

AdamGS commented Apr 22, 2025

@a10y I would love to merge this, but the Azure fix hasn't been released yet. Is that a blocker for you right now? I think their timeline is by the end of the month.

@AdamGS AdamGS requested a review from a10y April 22, 2025 10:21
@a10y
Contributor

a10y commented Apr 22, 2025

What if we pin the object_store version for vortex-jni explicitly to 0.11.2 for now, and then go back to the workspace dep after they release? Then it shouldn't be an issue. The JNI crate doesn't depend on datafusion.

@AdamGS
Contributor Author

AdamGS commented Apr 22, 2025

SGTM, I'll do that in a second

@AdamGS
Contributor Author

AdamGS commented Apr 22, 2025

Actually, that doesn't work, because other crates do depend on 0.11.2, and vortex-jni enables the object_store feature.
I'm not sure I have a good idea for how to fix this issue. I'll keep thinking about it.

@AdamGS AdamGS enabled auto-merge (squash) April 30, 2025 09:58
@gatesn gatesn disabled auto-merge April 30, 2025 10:23
@gatesn gatesn merged commit f0323f2 into develop Apr 30, 2025
32 of 33 checks passed
@gatesn gatesn deleted the adamg/datafusion-47 branch April 30, 2025 10:23
@gatesn gatesn added the chore label May 15, 2025

5 participants