Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add gRPC reflection support to arrow flight #7009

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jacobsimpson
Copy link

Which issue does this PR close?

Closes #1706.

Rationale for this change

I'd like to include the Flight service in the reflection service in one of my projects.

What changes are included in this PR?

  • Added generation of the necessary binary data to the protobuf compile steps.
  • Added constants representing the generated binary data in the compiled library.
  • Rustdoc for the new constants.
  • Added tonic-reflection to the dev dependencies so cargo test --doc and cargo test --doc --features "flight-sql-experimental" work.

Are there any user-facing changes?

New constants should be visible to library consumers. Should be backwards compatible.

Testing

Using a simple test service initialised using the reflection code in the rustdocs, grpcurl says:

$ grpcurl --plaintext localhost:50051 list
arrow.flight.protocol.FlightService
grpc.reflection.v1.ServerReflection```

Addresses apache#1706, adding a
constant to the gRPC generated code so it is possible to include the
Flight and Flight SQL APIs in a `tonic-reflection` service.

Testing:

```
grpcurl --plaintext localhost:8082 list
```
@github-actions github-actions bot added arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate labels Jan 22, 2025
Addresses apache#1706, adding a
constant to the gRPC generated code so it is possible to include the
Flight and Flight SQL APIs in a `tonic-reflection` service.

Testing:

```
grpcurl --plaintext localhost:8082 list
```
@@ -78,6 +78,7 @@ tracing-log = { version = "0.2" }
tracing-subscriber = { version = "0.3.1", default-features = false, features = ["ansi", "env-filter", "fmt"] }
tokio = { version = "1.0", default-features = false, features = ["macros", "rt", "rt-multi-thread"] }
tokio-stream = { version = "0.1", features = ["net"] }
tonic-reflection = "0.12.3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should put the addition in this PR behind a feature flag and mark this dependency as optional?

Copy link
Contributor

@alamb alamb Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes please

Update: I see this is only a dev dependency -- so I think it is ok not to feature flag it

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jacobsimpson

When I tried to re-create the checked in binrary (on a mac), it seems to be different

How did you generate the binary in this PR?

$ rm arrow-flight/src/sql/flight_sql_descriptor.bin
$ ./arrow-flight/regen.sh
...
$ git status
On branch add-file-descriptor-set
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   arrow-flight/src/sql/flight_sql_descriptor.bin

(I think this is the root cause of the CI failure as well: https://github.com/apache/arrow-rs/actions/runs/12917873056/job/36070514891?pr=7009)

@jacobsimpson
Copy link
Author

I used ./regen.sh to generate it. I'll have access to a Mac tomorrow and I can try this procedure again and look at the file differences.

In the mean time, following that test procedure locally I see:

$ uname -a
Linux silverfox 5.15.0-53-generic #59-Ubuntu SMP Mon Oct 17 18:53:30 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ cd arrow-rs/arrow-flight
$  git status
On branch add-file-descriptor-set
Your branch is up to date with 'origin/add-file-descriptor-set'.

nothing to commit, working tree clean
$ rm src/sql/flight_sql_descriptor.bin
$ ./regen.sh
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.05s
     Running `/home/jsimpson/src/arrow-rs/target/debug/gen`
cargo:rerun-if-changed=../format/Flight.proto
cargo:rerun-if-changed=../format
cargo:rerun-if-changed=../format/FlightSql.proto
cargo:rerun-if-changed=../format
$  git status
On branch add-file-descriptor-set
Your branch is up to date with 'origin/add-file-descriptor-set'.

nothing to commit, working tree clean

@jacobsimpson
Copy link
Author

I used ./regen.sh to generate it. I'll have access to a Mac tomorrow and I can try this procedure again and look at the file differences.

There might be a challenge here. From what I can tell, the file descriptor set is itself a serialized proto message.

protoc --decode_raw < src/flight_descriptor.bin

And serialized proto messages are not intended to be cannonical. The docs have a list of example conditions that can cause the serialization to change, however it is intentionally incomplete.

I'll continue to look into this, however at the moment it appears that we can't depend on one machine being able to produce the same file descriptor sets as another machine.

@alamb
Copy link
Contributor

alamb commented Jan 24, 2025

I'll continue to look into this, however at the moment it appears that we can't depend on one machine being able to produce the same file descriptor sets as another machine.

Can the descriptor be used by a machine on which it wasn't created 🤔

Maybe we would have to always regenerate this file on build (rather than have it checked in)

That comes with its own problems (e.g. increasing the reuqired depenednecies to build arrow-flight)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add gRPC reflection support to arrow flight
3 participants