Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: initial FFI crate + Java bindings #2516

Merged
merged 16 commits into from
Mar 7, 2025
Merged

feat: initial FFI crate + Java bindings #2516

merged 16 commits into from
Mar 7, 2025

Conversation

a10y
Copy link
Contributor

@a10y a10y commented Feb 25, 2025

This PR adds a new vortex-ffi crate, which builds a shared library that is then consumed by the vortex-jni java project. This PR previously included some initial Spark bindings but to prevent this getting any larger I've decided to offload that to another PR.

Some things to note

  • We're using JavaLanguageVersion.of(17) b/c that is the maximum version supported by Spark 3.5
  • I based the FFI bindings mostly off of vibes, some similar code I've written before to bridge HF Tokenizers library to Java, and arrow-rs's FFI functions. I have tests in a number of the FFI files but once we're in raw pointers land there be some unavoidable dragons.
  • The Java project is not hooked up to CI right now, but if you feel inclined to pull and run the tests locally, then as long as you have a JDK 17 installed and the TPCH data generated, you can run ./gradlew test from the java directory to run the basic scan test.

FLUPs:

  • Hook Java up to CI
  • Complete v0 Spark data source
  • FFI compress/write APIs
  • Iceberg connection

@a10y a10y changed the title [WIP] feat: Java bindings + Spark V2 datasource feat: initial FFI crate + Java bindings Feb 26, 2025
@a10y a10y marked this pull request as ready for review February 26, 2025 22:42
Copy link

codspeed-hq bot commented Feb 26, 2025

CodSpeed Performance Report

Merging #2516 will degrade performances by 26.52%

Comparing aduffy/java (1f173b9) with develop (8ba5f27)

Summary

⚡ 42 improvements
❌ 13 regressions
✅ 720 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
chunked_dict_fsst_into_canonical[(1000, 1000, 10)] 1.3 ms 1.2 ms +10.02%
chunked_dict_primitive_into_canonical[f32, (1000, 10, 10)] 186.4 µs 86.7 µs ×2.2
chunked_dict_primitive_into_canonical[f32, (1000, 10, 100)] 1,593.1 µs 717.1 µs ×2.2
chunked_dict_primitive_into_canonical[f32, (1000, 100, 10)] 186.1 µs 88.5 µs ×2.1
chunked_dict_primitive_into_canonical[f32, (1000, 100, 100)] 1,628.5 µs 733.4 µs ×2.2
chunked_dict_primitive_into_canonical[f32, (1000, 1000, 10)] 205.6 µs 105 µs +95.69%
chunked_dict_primitive_into_canonical[f32, (1000, 1000, 100)] 1,779.2 µs 896.6 µs +98.44%
chunked_dict_primitive_into_canonical[f64, (1000, 10, 10)] 212.2 µs 105.2 µs ×2
chunked_dict_primitive_into_canonical[f64, (1000, 10, 100)] 1,849.1 µs 901.2 µs ×2.1
chunked_dict_primitive_into_canonical[f64, (1000, 100, 10)] 215.9 µs 108.4 µs +99.05%
chunked_dict_primitive_into_canonical[f64, (1000, 100, 100)] 1,903.9 µs 933.6 µs ×2
chunked_dict_primitive_into_canonical[f64, (1000, 1000, 10)] 250.2 µs 141.2 µs +77.19%
chunked_dict_primitive_into_canonical[f64, (1000, 1000, 100)] 2.2 ms 1.3 ms +75.82%
chunked_dict_primitive_into_canonical[u32, (1000, 10, 10)] 184.1 µs 86.4 µs ×2.1
chunked_dict_primitive_into_canonical[u32, (1000, 10, 100)] 1,611.1 µs 717.2 µs ×2.2
chunked_dict_primitive_into_canonical[u32, (1000, 100, 10)] 188.1 µs 88.1 µs ×2.1
chunked_dict_primitive_into_canonical[u32, (1000, 100, 100)] 1,627.7 µs 733.2 µs ×2.2
chunked_dict_primitive_into_canonical[u32, (1000, 1000, 10)] 206.2 µs 104.6 µs +97.12%
chunked_dict_primitive_into_canonical[u32, (1000, 1000, 100)] 1,778.5 µs 896.3 µs +98.43%
chunked_dict_primitive_into_canonical[u64, (1000, 10, 10)] 217.1 µs 105 µs ×2.1
... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

@@ -0,0 +1,251 @@
#!/bin/sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello darkness my old friend

@danking
Copy link
Member

danking commented Feb 26, 2025

I tried cd java && ./gradlew test and it failed; am I doing something obviously wrong?

# RUST_BACKTRACE=1 ./gradlew test

> Task :vortex-jni:cargoCheck
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.14s

> Task :vortex-jni:cargoBuild
    Finished `release` profile [optimized] target(s) in 0.12s

thread '<unnamed>' panicked at /Users/danielking/projects/vortex/vortex-error/src/lib.rs:291:33:
element: Expected 1 buffers, got 0
Backtrace:
   0: std::backtrace::Backtrace::create
   1: vortex_fastlanes::for::serde::<impl vortex_array::vtable::serde::SerdeVTable<&vortex_fastlanes::for::FoRArray> for vortex_fastlanes::for::FoREncoding>::decode
   2: vortex_array::serde::ArrayParts::decode
   3: vortex_array::arrays::extension::serde::<impl vortex_array::vtable::serde::SerdeVTable<&vortex_array::arrays::extension::ExtensionArray> for vortex_array::arrays::extension::ExtensionEncoding>::decode
   4: vortex_array::serde::ArrayParts::decode
   5: <async_once_cell::InitFuture<T,F> as core::future::future::Future>::poll::{{closure}}
   6: vortex_layout::layouts::flat::eval_expr::<impl vortex_layout::reader::ExprEvaluator for vortex_layout::layouts::flat::reader::FlatReader>::evaluate_expr::{{closure}}
   7: <alloc::sync::Arc<dyn vortex_layout::reader::LayoutReader> as vortex_layout::reader::ExprEvaluator>::evaluate_expr::{{closure}}
   8: vortex_layout::layouts::chunked::eval_expr::<impl vortex_layout::reader::ExprEvaluator for vortex_layout::layouts::chunked::reader::ChunkedReader>::evaluate_expr::{{closure}}
   9: <alloc::sync::Arc<dyn vortex_layout::reader::LayoutReader> as vortex_layout::reader::ExprEvaluator>::evaluate_expr::{{closure}}
  10: vortex_layout::layouts::stats::eval_expr::<impl vortex_layout::reader::ExprEvaluator for vortex_layout::layouts::stats::reader::StatsReader>::evaluate_expr::{{closure}}
  11: <alloc::sync::Arc<dyn vortex_layout::reader::LayoutReader> as vortex_layout::reader::ExprEvaluator>::evaluate_expr::{{closure}}
  12: <futures_util::future::try_join_all::TryJoinAll<F> as core::future::future::Future>::poll
  13: vortex_layout::layouts::struct_::eval_expr::<impl vortex_layout::reader::ExprEvaluator for vortex_layout::layouts::struct_::reader::StructReader>::evaluate_expr::{{closure}}
  14: <alloc::sync::Arc<dyn vortex_layout::reader::LayoutReader> as vortex_layout::reader::ExprEvaluator>::evaluate_expr::{{closure}}
  15: <vortex_layout::scan::executor::threads::ExecutorTask<F,R> as vortex_layout::scan::executor::threads::Task>::run
  16: std::sys::backtrace::__rust_begin_short_backtrace
  17: core::ops::function::FnOnce::call_once{{vtable.shim}}
  18: std::sys::pal::unix::thread::Thread::new::thread_start
  19: __pthread_joiner_wake

stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: <core::result::Result<T,E> as vortex_error::VortexExpect>::vortex_expect::{{closure}}::panic_cold_display
   3: <core::result::Result<T,E> as vortex_error::VortexExpect>::vortex_expect::{{closure}}
   4: _FFIArrayStream_next
   5: <unknown>
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
fatal runtime error: failed to initiate panic, error 5

> Task :vortex-jni:test FAILED

[Incubating] Problems report is available at: file:///Users/danielking/projects/vortex/java/build/reports/problems/problems-report.html

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':vortex-jni:test'.
> Process 'Gradle Test Executor 3' finished with non-zero exit value 134
  This problem might be caused by incorrect test process configuration.
  For more on test execution, please refer to https://docs.gradle.org/8.12/userguide/java_testing.html#sec:test_execution in the Gradle documentation.

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.12/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD FAILED in 1s
7 actionable tasks: 3 executed, 4 up-to-date

@a10y
Copy link
Contributor Author

a10y commented Feb 27, 2025

It sounds like you need to regenerate the TPC-H files? So maybe there was a break sometime in the past day or so from when I branched

@a10y a10y force-pushed the aduffy/java branch 2 times, most recently from 77d5f2a to e6a5d1c Compare February 27, 2025 00:58
@a10y
Copy link
Contributor Author

a10y commented Feb 27, 2025

I've rebased to incorporate any format breaks from the past 24 hours

import com.sun.jna.Pointer;
import com.sun.jna.PointerType;

/**
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll note that every type generally has an associated XYZ_free() function. My follow-on goal is to add nicer Java builders/wrappers that are AutoCloseable with @MustClose to make it harder to accidentally leak these things.

@danking
Copy link
Member

danking commented Feb 27, 2025

My error is gone with latest push!

@a10y a10y mentioned this pull request Feb 27, 2025
@a10y a10y force-pushed the aduffy/java branch 2 times, most recently from f209ae7 to 055da4f Compare March 5, 2025 19:38
@a10y
Copy link
Contributor Author

a10y commented Mar 5, 2025

Iceberg actually builds to Java 11 so had to bring target language level here down from 17 -> 11. Honestly didn't change much, only thing I was using was the switch expressions

Copy link
Member

@robert3005 robert3005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use AutoCloseable already to avoid further breaks, @MustClose and wrappers can come later imho. AutoCloseable is the correct interace to implement in java after java 7

@a10y a10y requested a review from robert3005 March 7, 2025 16:40
@a10y a10y merged commit 41f90a7 into develop Mar 7, 2025
26 of 27 checks passed
@a10y a10y deleted the aduffy/java branch March 7, 2025 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants