[Docs improvement] Piping data between clusters (including topic keys, Avro topics) #52

Open
whatsupbros opened this issue May 18, 2022 · 0 comments


whatsupbros commented May 18, 2022

Thanks again for this great CLI tool, made for humans!

I really love the feature of piping data between topics in different clusters with a command like this:

zoe -c remote topics consume input --continuously | zoe -c local topics produce -t output --from-stdin --streaming

And it works perfectly fine for this simple case.

But would it be possible to elaborate on some more complex, but still common, use cases in the examples/documentation?

Piping Avro data between clusters

In this use case, it is important to understand exactly how to migrate topic schemas together with the data when the schemas do not yet exist in the target cluster.
Ideally, this should be as transparent to the user as possible (zoe should be able to publish record schemas from the source cluster to the target cluster's Schema Registry automatically and without issues).
But even with the manual approach, it is not quite clear how to accomplish this task when there are multiple schemas for the source topic and the topic contains records written with more than one of them.
Currently, it doesn't seem to be possible, because zoe can only request the latest topic schema (see issue #50). Also, piping of schemas doesn't work for now (see issue #49).
So, the only option left is to use curl or Postman to publish all topic schemas upfront in the target cluster, and then start piping the data. However, I am not sure this is the proper way to do it at all.
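
To make the manual approach concrete, here is a rough sketch of what publishing all schema versions upfront could look like. It assumes both clusters use a Confluent-compatible Schema Registry REST API (`GET /subjects/{subject}/versions`, `GET /subjects/{subject}/versions/{version}`, `POST /subjects/{subject}/versions`). The helper names (`copy_subject`, `build_registration`) and the registry URLs in the usage note are hypothetical, and this is not something zoe does itself.

```python
import json
import urllib.request

# Content type used by the Schema Registry REST API (assumed Confluent-compatible).
CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"


def get_json(url):
    """GET a URL and decode the JSON response."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def post_json(url, payload):
    """POST a JSON payload and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": CONTENT_TYPE},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def build_registration(version_doc):
    """Build the POST body for one schema version fetched from the source."""
    body = {"schema": version_doc["schema"]}
    # These fields are only returned for some schemas; pass them through if present.
    for optional in ("schemaType", "references"):
        if optional in version_doc:
            body[optional] = version_doc[optional]
    return body


def copy_subject(source_url, target_url, subject, get=get_json, post=post_json):
    """Register every version of `subject` in the target registry, oldest first.

    Returns the schema ids assigned by the target registry, in version order.
    """
    versions_url = f"{source_url}/subjects/{subject}/versions"
    new_ids = []
    for version in sorted(get(versions_url)):
        doc = get(f"{versions_url}/{version}")
        result = post(
            f"{target_url}/subjects/{subject}/versions",
            build_registration(doc),
        )
        new_ids.append(result["id"])
    return new_ids
```

A call such as `copy_subject("http://remote-registry:8081", "http://local-registry:8081", "my-topic-value")` (both URLs are placeholders) would register the versions in ascending order, so that compatibility checks in the target see them in the same sequence as in the source. Note that the ids assigned by the target registry may differ from the source ids, so this only helps if the records are re-serialized against the target registry when they are produced.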

Piping data with keys between clusters

The trivial case described in the examples doesn't consider topic keys.
As far as I understand, the only way to print keys with zoe is to use the --expose-metadata option for the consumer.
However, when you do so, you implicitly change the topic schema.
I tried to use the following command to "fix" the issue and extract the key:

zoe --cluster remote topics consume my-topic --continuously --expose-metadata \
| zoe --cluster local topics produce --from-stdin --topic my-topic --subject my-topic-value --key-path ".__metadata__.key" --value-path "del(.__metadata__)" --streaming

But unfortunately it didn't work for me (see issue #51).
So, a recommendation on piping data between topics, including keys, with zoe would be really beneficial.
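
To spell out what the two path expressions in the command above are meant to do, here is a small sketch of the intended split. It assumes that each line of consumer output is a JSON object and that --expose-metadata adds a `__metadata__` object containing a `key` field (inferred from the paths used in the command, not from the zoe documentation). The `split_record` helper and the sample record, including its `offset` field, are made up for illustration.

```python
import json


def split_record(line):
    """Return (key, value) for one JSON line that may carry a __metadata__ field.

    The key is taken from __metadata__.key; the value is the record with the
    __metadata__ field removed, i.e. the original record shape.
    """
    record = json.loads(line)
    metadata = record.pop("__metadata__", {})
    return metadata.get("key"), record


# Hypothetical consumer output line; the real metadata fields may differ.
sample = '{"id": 7, "name": "a", "__metadata__": {"key": "k-7", "offset": 42}}'
key, value = split_record(sample)
print(key, value)
```

Running a few consumed lines through a check like this before producing can help confirm whether the key is actually present in the metadata and whether the remaining value still matches the original topic schema.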

Thank you!
