Improve protobuf support for BigQuery Storage Write API #13873

Open
alvioshki opened this issue Nov 25, 2024 · 1 comment
Assignees
Labels
api: bigquerystorage Issues related to the BigQuery Storage API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@alvioshki

The feature request in short: improve protobuf support in the C# client for the BigQuery Storage Write API, or remove its protobuf restrictions.

I've tried to migrate from the BigQuery legacy streaming API to the BigQuery Storage Write API using the C# client library.
However, the protobuf handling restrictions have prevented me from completing the migration:
https://cloud.google.com/bigquery/docs/write-api#proto_buffer_handling

I have existing proto3 contracts. They use package specifiers, imports for wrapper types, and other features. According to the documentation, however, these are restricted. I tried using them anyway and hit limitations with imports, enums, and default values (I managed to work around the last of these).

Changing the proto3 contracts is not an option for me, as I have too many of them. Writing custom proto2 contracts and mapping from one to the other is a no-go as well. I've considered implementing a dynamic mapper, but I am not confident the investment would pay off in the end.

The documentation states: "The Java and Go clients support arbitrary protocol buffers, because the client library normalizes the protocol buffer schema."

What does this mean? My guess is that these client libraries have built-in mechanisms to handle and adapt arbitrary protocol buffer schemas, which gives them more flexibility than the clients for other languages. Perhaps they adapt the schema to conform to the restrictions imposed by the BigQuery Storage Write API, for example by removing or adjusting package specifiers, flattening nested messages or enums to meet the top-level definition requirements, and resolving external references and embedding them into a single schema.
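To make that guess concrete, here is a minimal sketch of what such a normalization step might look like. It is a toy model written in Python with plain dataclasses, not the protobuf runtime and not the actual Java or Go client code. The names `Field`, `Message`, `flat_name`, and `normalize`, and the `shop.v1` example types, are all hypothetical. The sketch assumes that "normalizing" means copying every message reachable from the root into one self-contained descriptor and replacing package-qualified names with package-free ones.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Field:
    name: str
    # Either a scalar type such as "int64", or the fully qualified name
    # of another message, such as "shop.v1.Address".
    type_name: str


@dataclass
class Message:
    full_name: str
    fields: list[Field] = field(default_factory=list)
    nested: list[Message] = field(default_factory=list)


def flat_name(full_name: str) -> str:
    """Drop the package structure: 'shop.v1.Order' -> 'shop_v1_Order'."""
    return full_name.replace(".", "_")


def normalize(root_name: str, registry: dict[str, Message]) -> Message:
    """Build one self-contained message for root_name.

    Every message reachable from the root (including ones that would come
    from imported files) is copied into root.nested under a package-free
    name, and field references are rewritten to point at those copies.
    """
    root = Message(full_name=flat_name(root_name))
    seen = {root_name}

    def rewrite(src: Message) -> list[Field]:
        out = []
        for f in src.fields:
            if f.type_name in registry:  # message-typed field
                if f.type_name not in seen:
                    seen.add(f.type_name)
                    dep = Message(full_name=flat_name(f.type_name))
                    root.nested.append(dep)
                    dep.fields = rewrite(registry[f.type_name])
                out.append(Field(f.name, flat_name(f.type_name)))
            else:  # scalar field, copied unchanged
                out.append(Field(f.name, f.type_name))
        return out

    root.fields = rewrite(registry[root_name])
    return root


# Hypothetical contracts spread over several "files", keyed by full name.
registry = {
    "google.protobuf.StringValue": Message(
        "google.protobuf.StringValue", [Field("value", "string")]
    ),
    "shop.v1.Address": Message(
        "shop.v1.Address", [Field("city", "google.protobuf.StringValue")]
    ),
    "shop.v1.Order": Message(
        "shop.v1.Order",
        [
            Field("id", "int64"),
            Field("ship_to", "shop.v1.Address"),
            Field("note", "google.protobuf.StringValue"),
        ],
    ),
}

normalized = normalize("shop.v1.Order", registry)
print(normalized.full_name, [m.full_name for m in normalized.nested])
```

In this toy model, the result is a single `shop_v1_Order` message that carries its own copies of `shop_v1_Address` and `google_protobuf_StringValue`, so nothing in it depends on a package specifier or an import. Whether the real clients handle wrapper types and enums this way is not something the sketch claims.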

If I understand this correctly, could it be implemented in the C# client?

I would like to know if and when such an improvement arrives. Until then, I will keep using the legacy streaming API.

@jskeet jskeet assigned amanda-tarafa and unassigned jskeet Nov 25, 2024
@jskeet jskeet added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Nov 25, 2024
@jskeet
Collaborator

jskeet commented Nov 25, 2024

Thanks for the clear description. We'll talk to the BigQuery team internally and discuss the next steps.

@amanda-tarafa amanda-tarafa added priority: p2 Moderately-important priority. Fix may not be included in next release. api: bigquerystorage Issues related to the BigQuery Storage API. labels Nov 25, 2024