[Core feature] Add option to the Databricks Agent to provide a default databricks instance #6253

Open
andrei-trandafir opened this issue Feb 17, 2025 · 2 comments
Labels
enhancement New feature or request

Comments


andrei-trandafir commented Feb 17, 2025

Motivation: Why do you think this is important?

Today, users running Databricks tasks via the databricks webapi plugin can omit the databricks instance and token when registering their tasks, because the Flyte propeller deployment already carries this information in its config. This works well in multi-cluster scenarios where each propeller deployment points to a different Databricks instance, since users don't need to be concerned with platform-specific details.
The Databricks agent, on the other hand, requires the user to provide the databricks instance (link):

class DatabricksAgent(AsyncAgentBase):
    ...
    async def create(
        self, task_template: TaskTemplate, inputs: Optional[LiteralMap] = None, **kwargs
    ) -> DatabricksJobMetadata:
        data = json.dumps(_get_databricks_job_spec(task_template))
        databricks_instance = task_template.custom["databricksInstance"] # Value enforced here
        ...
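The hard `["databricksInstance"]` lookup above is what enforces the value: a task template registered without it fails inside `create()` with a `KeyError`. A minimal sketch (using a plain dict as a stand-in for `task_template.custom`) illustrates the difference between the enforcing lookup and a defaulting `.get()`:

```python
# Sketch only: a plain dict stands in for task_template.custom.
custom_with_instance = {"databricksInstance": "my-databricks-instance"}
custom_without_instance = {}  # user omitted the instance, as the plugin allows

print(custom_with_instance["databricksInstance"])  # my-databricks-instance

try:
    custom_without_instance["databricksInstance"]  # what the agent does today
except KeyError:
    print("create() fails: no databricksInstance in the task template")

# A .get() lookup instead returns None (or a supplied default):
print(custom_without_instance.get("databricksInstance"))  # None
```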

This results in the following scenarios:

  • Users who want to move their workflows to the agent implementation must both update the task config they import and start providing a databricks instance.
  • Users who want to reuse the same logic across multiple databricks instances must register multiple task versions with duplicated logic.

Furthermore, considering that databricks tokens are usually created per user/instance combination, and that the databricks token is already provided by the flyte agent, it makes sense for the agent to also provide the instance associated with it (unless overridden by the workflow writers).

Goal: What should the final outcome look like, ideally?

Platform engineers should be able to provide a default databricks instance to the flyte agent. One simple implementation would be an environment variable supplied through the agent chart (env variables on the agent deployment are already supported in the chart):

podEnv:
  - name: FLYTE_DATABRICKS_INSTANCE
    value: "my-databricks-instance"
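Assuming the chart wires `podEnv` straight into the agent Deployment, the rendered pod spec would carry the variable roughly as follows (an illustrative sketch, not verified chart output; the container name is hypothetical):

```yaml
# Sketch: how the podEnv entry above would typically surface in the
# rendered agent Deployment (names are illustrative).
spec:
  template:
    spec:
      containers:
        - name: flyteagent
          env:
            - name: FLYTE_DATABRICKS_INSTANCE
              value: "my-databricks-instance"
```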

which could then be read like so in the agent code:

class DatabricksAgent(AsyncAgentBase):
    ...
    async def create(
        self, task_template: TaskTemplate, inputs: Optional[LiteralMap] = None, **kwargs
    ) -> DatabricksJobMetadata:
        data = json.dumps(_get_databricks_job_spec(task_template))
        databricks_instance = task_template.custom.get("databricksInstance", os.getenv("FLYTE_DATABRICKS_INSTANCE"))
        ...
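The one-liner above implies a precedence: a task-level `databricksInstance` wins, and the environment variable only fills the gap. That logic could be factored into a small helper; the sketch below is illustrative, not actual flytekit code, and `resolve_databricks_instance` is a hypothetical name:

```python
import os

def resolve_databricks_instance(custom: dict) -> str:
    """Return the task-level instance if set, else the platform default.

    Sketch only: mirrors the proposed .get()/os.getenv fallback and adds an
    explicit error when neither source provides a value.
    """
    instance = custom.get("databricksInstance", os.getenv("FLYTE_DATABRICKS_INSTANCE"))
    if instance is None:
        raise ValueError(
            "No databricksInstance in the task template and "
            "FLYTE_DATABRICKS_INSTANCE is not set"
        )
    return instance

# Task-level value takes precedence over the environment default:
os.environ["FLYTE_DATABRICKS_INSTANCE"] = "platform-default.cloud.databricks.com"
print(resolve_databricks_instance({"databricksInstance": "team-a.cloud.databricks.com"}))
# team-a.cloud.databricks.com
print(resolve_databricks_instance({}))
# platform-default.cloud.databricks.com
```

Raising an explicit error when neither source is set would also give a clearer failure message than the `KeyError` users see today.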

For users, this means the migration from the databricks plugin to the databricks agent is as simple as updating the import from:

from flytekitplugins.spark import Databricks

to:

from flytekitplugins.spark import DatabricksV2 as Databricks

given that the instance and token are set up by the platform engineers ahead of time.

Describe alternatives you've considered

N/A

Propose: Link/Inline OR Additional context

It would also be useful if this feature could be backported to flytekit 1.14, as that matches the version we're currently upgrading to.

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
andrei-trandafir added the enhancement (New feature or request) and untriaged (This issue has not yet been looked at by the Maintainers) labels on Feb 17, 2025

welcome bot commented Feb 17, 2025

Thank you for opening your first issue here! 🛠

eapolinario (Contributor) commented:

Overriding the databricks instance via an environment variable makes a lot of sense. I'd be more than happy to review a PR.

eapolinario removed the untriaged label on Feb 21, 2025