[REQUEST] Support Neo4j Aura API #2

Open
aaronWass-neo opened this issue Mar 5, 2024 · 10 comments
Assignees
Labels
enhancement New feature or request

Comments

@aaronWass-neo

Is your feature request related to a problem? Please describe.

It would be nice to be able to create a Neo4j Aura instance as part of a nodestream pipeline. This would further reduce the barrier to entry for newcomers to graph databases. Neo4j Aura (Neo4j's fully managed DBaaS) exposes an API that allows you to run CRUD operations on your Aura instances and tenants that you have access to.

Describe the solution you'd like

My current thinking would be to accept the Aura API parameters when defining a target in nodestream.yaml like this:

targets:
  db-one:
    database: neo4j
    database_name: neo4j
    username: neo4j
    password: neo4j123
    uri: bolt://localhost:7687
  new-aura-instance:
    database: neo4j
    database_name: neo4j
    memory: 16G
    aura-instance-type: professional-db
    cloud-provider: aws
    region: us-west-1
    aura-tenant-id: !env AURA_TENANT_ID
    aura-client-id: !env AURA_CLIENT_ID
    aura-client-secret: !env AURA_CLIENT_SECRET

Then you could run a pipeline like this:

nodestream run assetPolicyPipeline --target new-aura-instance

This would mean adding the logic for the Aura API to nodestream/databases/neo4j

Describe alternatives you've considered

An alternative approach could be to create a separate nodestream plugin that would handle the Aura API logic. My current understanding of nodestream plugins is that they are primarily for defining ingestion and schema modeling like in nodestream-plugin-akamai. I would love any feedback on if creating a separate plugin for this Aura API management would be a superior alternative.

@aaronWass-neo aaronWass-neo added the enhancement New feature or request label Mar 5, 2024
@zprobst zprobst transferred this issue from nodestream-proj/nodestream Mar 5, 2024
@zprobst
Contributor

zprobst commented Mar 5, 2024

@aaronWass-neo

I've transferred this issue to the neo4j specific repo. Overall, I like this suggestion and agree it does remove the barrier to entry. Here are some of my initial thoughts...

This would mean adding the logic for the Aura API to nodestream/databases/neo4j

I think a lot of the nuance in this is here. I do have a few things I think we need to iron out.

  1. How do we handle persisting the username and password of the user's database? From what I see, we can make a POST request to the service to provision an instance that returns the user/pass. Do we "convert" the configuration for new-aura-instance into the appropriate configuration for aura in nodestream.yaml? Do we maintain some state file? How do we prevent people from easily checking in secrets?
  2. Does aura have specific "recommended" connection options or any specific settings in the driver that should be used in that environment that we'd need to remember to configure?
  3. Super minor, but are we okay with converting things like aura-instance-type to aura_instance_type? Most of the nodestream configuration standardizes around _ vs -.
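
For point 3, a minimal sketch of the key conversion, assuming the target is available as a plain dict once nodestream.yaml is loaded. The helper name normalize_keys is hypothetical and not part of nodestream:

```python
def normalize_keys(target: dict) -> dict:
    """Return a copy of a target mapping with '-' in each key replaced by '_'.

    Hypothetical helper for illustration; values are left untouched.
    """
    return {key.replace("-", "_"): value for key, value in target.items()}


# Keys taken from the example target in the original request.
target = {
    "aura-instance-type": "professional-db",
    "cloud-provider": "aws",
    "database_name": "neo4j",
}
print(normalize_keys(target))
```

Keys that already use underscores pass through unchanged, so the conversion could be applied to every target regardless of how it was written.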

My current understanding of nodestream plugins is that they are primarily for defining ingestion and schema modeling like in nodestream-plugin-akamai.

Plugins in nodestream are weird. We often talk about plugins like the one you describe because that's what builds an "ecosystem" the most, but tons of things are pluggable in nodestream, right down to what file formats are handled. Notably for this proposal, commands are pluggable as well.

I would love any feedback on if creating a separate plugin for this Aura API management would be a superior alternative.

I don't think we need a completely separate plugin for aura as it relates to API connectivity. Since we've moved the neo4j code here, I think it is reasonable to keep this logic here. But I do have a possible approach that may resolve some of the concerns I have. Just throwing it out to see what we think.

What if we plugged in some commands:

  • nodestream neo4j create-aura
  • nodestream neo4j remove-aura

Or maybe...

  • nodestream aura create
  • nodestream aura remove

...etc.

Then these commands could take CLI arguments to provision the database how the user wants and wait for it to come up before a user even runs a pipeline. It could also then add a preconfigured target that sets the values correctly. I know the idea is super rough, but hopefully it's enough to get the concept across?
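
The "wait for it to come up" step could be sketched as a simple polling loop. This is only an illustration: the status check is injected as a callable, and the "running" status string, the timeout, and the interval are assumptions rather than values confirmed in this thread.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll get_status() until it reports "running"; return the number of polls.

    get_status is a hypothetical callable that would wrap the instance lookup.
    Raises TimeoutError if the instance is not running before the deadline.
    """
    deadline = time.monotonic() + timeout_s
    polls = 0
    while True:
        polls += 1
        if get_status() == "running":  # assumed status value
            return polls
        if time.monotonic() >= deadline:
            raise TimeoutError("instance did not reach 'running' in time")
        sleep(interval_s)
```

Passing the status check and the sleep function in as arguments keeps the loop testable without touching the network.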

Given these two suggestions, which do you think fits better with the aura product @aaronWass-neo?

@aaronWass-neo
Author

I think a lot of the nuance in this is here. I do have a few things I think we need to iron out.

1. How do we handle persisting the username and password of the user's database? From what I see, we can make a POST request to the service to provision an instance that returns the user/pass. Do we "convert" the configuration for new-aura-instance into the appropriate configuration for aura in nodestream.yaml? Do we maintain some state file? How do we prevent people from easily checking in secrets?

This is a good callout. My original thinking was to not store the credentials of the user database locally. We would just return the username and password for the new aura instance to the user through the console. At that point they would be responsible for copying these credentials and setting up a secure way to pass them back when creating nodestream targets for the new database.

I really like your idea of having the aura commands in nodestream and auto-generating the target in nodestream.yaml for the newly created Aura instance. It feels potentially unsafe to store the user/pass for this new instance here without the user being aware that we are saving the Aura password locally. One option could be to have an option to store that password or not.

Using your suggested commands, you could do

  • nodestream aura create ...

which would create the target in nodestream.yaml with the user & password section blank. The user & password from the new Aura instance would be passed to the user via the console, and they would be responsible for determining if/how they want to add it to the target.

or

  • nodestream aura create ... --save_password

which would create the target in nodestream.yaml with the user and password filled in

Does aura have specific "recommended" connection options or any specific settings in the driver that should be used in that environment that we'd need to remember to configure?

It is a best practice to specify which user database to connect to when connecting to Aura, or any Neo4j database. In Aura the user database is named 'neo4j'. Nodestream already sets this by default, so we don't need to worry about that. There aren't any other Aura specific settings to keep in mind here.

Super minor, but are we okay with converting things like aura-instance-type to aura_instance_type? Most of the nodestream configuration standardizes around _ vs -.

Definitely!

What if we plugged in some commands:

nodestream neo4j create-aura
nodestream neo4j remove-aura
Or maybe...

nodestream aura create
nodestream aura remove
...etc.

Then these commands could take CLI arguments to provision the database how the user wants and wait for it to come up before a user even runs a pipeline. It could also then add a preconfigured target that sets the values correctly. I know the idea is super rough, but hopefully it's enough to get the concept across?

Given these two suggestions, which do you think fits better with the aura product @aaronWass-neo?

I really like this idea. I like nodestream aura create ... more than nodestream neo4j create-aura ...

Above I talked about the possibility of having two different options for this. One where we save the new aura instance password locally, and one where we just pass it to the user via the console. Do you like this idea? Other thoughts on the security aspect here?

We should be able to reuse a lot of what has been done in aura-cli for the API interface. Automatically setting up the target for the newly created instance should further streamline the aura ingestion process.

@zprobst
Contributor

zprobst commented Mar 8, 2024

Yeah, this is roughly my thinking. One difference is that we could configure the target's password to come from an environment variable, while the password itself is still output to the console as you say. That way it sets users on a course toward some form of secret management. How do you feel about that?
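
As a rough sketch of that idea, the generated target could reference an environment variable through the same !env tag used in the example at the top of this issue, so the password itself never lands in nodestream.yaml. The function name render_target_yaml and the variable name NEO4J_PASSWORD are made up for illustration:

```python
def render_target_yaml(
    name: str,
    connection_url: str,
    username: str = "neo4j",
    database_name: str = "neo4j",
    password_env_var: str = "NEO4J_PASSWORD",  # hypothetical variable name
) -> str:
    """Render a nodestream.yaml target entry whose password is read from the environment.

    Only the name of the environment variable is written out, never the secret.
    """
    lines = [
        f"  {name}:",
        "    database: neo4j",
        f"    database_name: {database_name}",
        f"    username: {username}",
        f"    password: !env {password_env_var}",
        f"    uri: {connection_url}",
    ]
    return "\n".join(lines)


print(render_target_yaml("new-aura-instance", "neo4j+s://example.neo4j.io"))
```

The user would then export the password printed by the create command under that variable name before running a pipeline against the target.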

@aaronWass-neo
Copy link
Author

Yea I think that's a good idea.

@grantleehoffman

I think we should be consistent with the current command noun/verb pattern so maybe nodestream create aura

@zprobst
Contributor

zprobst commented Mar 8, 2024

I think we should be consistent with the current command noun/verb pattern so maybe nodestream create aura

Hmm interesting suggestion. There is some precedent though with the new nodestream migrations command like nodestream migrations make.

My original thinking was to namespace like concepts so that commands follow noun-verb. This leaves some of today's commands as "shorthand", like "nodestream run" being shorthand for "nodestream pipeline run", but it feels like you're thinking about it the opposite way, which makes perfect sense in retrospect.

I'm not strongly for or against either way but it is obviously important to be consistent. I feel like I can cite examples both ways from other clis.

@aaronWass-neo
Author

Something else here is that the aura api can perform actions on different nouns... For example in aura-cli you can perform actions on tenants, instances or snapshots.

So commands look like:

  • aura tenants list
  • aura instances create ...
  • aura snapshots restore ...

I don't think it is necessary to have all of this functionality in nodestream, but remapping this might get confusing.

  • nodestream create aura instance
  • vs
  • nodestream aura instances create

I don't have a strong preference one way or the other. Maybe the first option does flow more naturally.

@aaronWass-neo
Author

I added the first iteration of this on my fork here

I went with create aura terminology. It currently takes all of the parameters through the command line, which we can definitely improve on.

poetry run nodestream create aura --help

Description:
  Create neo4j Aura instances via the Aura API

Usage:
  create aura [options] [--] <name> <region> <instance_type> <memory> <cloud_provider> <tenant_id> <aura_client_id> <aura_client_secret>

Running this with real arguments looks like this (environment variables saved for tenant id, aura client id and aura client secret):

poetry run nodestream create aura testInstance1 us-central1 enterprise-db 2GB gcp $TENANT_ID $AURA_CLIENT_ID $AURA_CLIENT_SECRET

This calls the Aura API and returns information about the newly created Aura instance:

{
    "data": {
        "cloud_provider": "gcp",
        "connection_url": "neo4j+s://example.neo4j.io",
        "id": "exampleID",
        "name": "testInstance1",
        "password": "yourNewPassword",
        "region": "us-central1",
        "tenant_id": "yourTenantId",
        "type": "enterprise-db",
        "username": "neo4j"
    }
}
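
A minimal sketch of how a response shaped like the one above could be handled, separating the password from the fields that are safe to write to a target or log. The function name split_instance_response is hypothetical, and the field names are taken only from the sample output shown here:

```python
import json
from typing import Tuple


def split_instance_response(raw: str) -> Tuple[dict, str]:
    """Split a create-instance response into (non-secret fields, password).

    Assumes the payload has a top-level "data" object containing a "password"
    key, as in the sample response above.
    """
    data = json.loads(raw)["data"]
    public = {key: value for key, value in data.items() if key != "password"}
    return public, data["password"]


sample = """
{
    "data": {
        "connection_url": "neo4j+s://example.neo4j.io",
        "name": "testInstance1",
        "password": "yourNewPassword",
        "username": "neo4j"
    }
}
"""
public, password = split_instance_response(sample)
print(public)  # the password is intentionally not printed here
```

Keeping the password out of the dict that gets persisted makes it harder to write the secret to nodestream.yaml by accident.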

@zprobst
Contributor

zprobst commented Aug 15, 2024

I've started a feature branch for this so we can aggregate changes in this space before a release:

https://github.com/nodestream-proj/nodestream-plugin-neo4j/tree/aura-integration

@zprobst
Contributor

zprobst commented Aug 15, 2024

With #20, we're going to land some basic command support.

I think before I am comfortable releasing it, I'd like to see the following:

  • Add some better options for handling the infamous password problem. Ideally we can provide some option that safely
  • Add some tests to the commands

These features can be added to the aura-integration branch in separate PRs so we don't blow the scope.


3 participants