-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Preset.io source (Superset SaaS) #8188
Conversation
Deployment failed with the following error:
|
0769851
to
15fe267
Compare
@mayurinehate could you take a look at this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @betodealmeida , thanks for the PR.
I'd like to share this guide on adding a metadata ingestion source - https://datahubproject.io/docs/metadata-ingestion/adding-source/ that'll help you through the process. Especially please refer steps 4,5 (making the new plugin discoverable) and 9 (adding new platform).
Also, I have a couple of initial comments on the PR -
- Let's move new source to separate file preset.py. (you said it !)
- For lint, I believe you'd need to inherit PresetConfig from SupersetConfig. If Preset Config does not need username/password - you can refractor common parts from existing SupersetConfig to BaseSupersetConfig and make SupersetConfig and PresetConfig inherit BaseSupersetConfig.
Lastly, excited to hear results from your end-to-end preset testing when you have them.
self.config = config | ||
self.report = StaleEntityRemovalSourceReport() | ||
|
||
login_response = requests.post( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we refractor get token piece into separate get_access_token
method in SupersetSource and only override create
and get_access_token
method in PresetSource ?
|
||
pipeline.run() | ||
pipeline.raise_from_status() | ||
golden_file = "golden_test_ingest.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll need separate preset golden file as it will have platform preset in entity urns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can help you generate golden file from scratch - https://datahubproject.io/docs/metadata-ingestion/developing/#updating-golden-test-files
Let me know if you need help testing end-to-end, I can get you a Preset workspace to run tests. Thank you! |
Thanks for the review, @mayurinehate! I'll address the comments in the next couple days! |
Hello @betodealmeida, would you have any idea when you might have time to work on it? thanks a lot |
@betodealmeida @mayurinehate, @hsheth2 is there still work to do on this? We just ran up against this requirement and would be willing to do the last mile work to get this in. |
I think i have a rebased version of this more or less sussed out. I'm hitting a snag testing it out in the local env. the ingest job fails to pull install the preset subpackage becuase its calling to the acryl pip package. It feels like I missed a step or im operating off of old docs. I can push that PR once I get the functionality ready. I've incorporated the suggestions on this PR. |
The class creation more or less mirrors this PR. I did also add the logo and all the registration stuff in the typescript files. I am using the
I'm guessing there's maybe an env var or something im supposed to set that overrides this? subsequent runs faile with:
|
@byndcivilization when developing on sources, you'll need to use ingestion from the CLI e.g. |
@coderabbitai full review |
Preset.io is a commercial SaaS provider for Apache Superset. This PR introduces a new ingestion source for Preset "workspaces", which are essentially Superset instances. The only difference is the authentication mechanism that needs to be done against an API gateway using a key/secret pair.
I implemented the source as a derived class of
SupersetSource
in the same file; let me know if it would be better to have it in a separate file.I added a new integration test, but I haven't tested it end-to-end yet. I'm in the process of setting up a local development to test it with a real workspace. But I thought it would be good to get early feedback in the PR since this is my first time contributing to DataHub.
Checklist