[Feature] Support running SQL models on Google Cloud Dataproc Serverless #1131

gddezero · 2024-10-29T06:33:42Z

Is this your first time submitting a feature request?

I have read the expectations for open source contributors
I have searched the existing issues, and I could not find an existing issue for this feature
I am requesting a straightforward extension of existing dbt-spark functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Context

Google Cloud Dataproc Serverless lets you run Spark workloads without requiring you to provision and manage your own Dataproc cluster. Use the Google Cloud console, Google Cloud CLI, or Dataproc API to submit a batch workload to the Dataproc Serverless service. The service will run the workload on a managed compute infrastructure, autoscaling resources as needed.
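To make the CLI path concrete, here is a minimal sketch that only assembles the argument list for `gcloud dataproc batches submit spark-sql`; it does not call Google Cloud. The helper name `build_batch_submit_command`, the bucket path, the region, and the project ID are hypothetical placeholders, not part of dbt or of this request.

```python
import shlex


def build_batch_submit_command(query_file, region, project=None):
    """Assemble a gcloud argv that submits a Spark SQL file as a Dataproc Serverless batch.

    query_file is assumed to be a URI (for example a gs:// path) to a file of
    Spark SQL statements. Nothing is executed here; the list is only built.
    """
    cmd = [
        "gcloud", "dataproc", "batches", "submit", "spark-sql",
        query_file,
        f"--region={region}",
    ]
    if project:
        # --project is the standard gcloud flag for choosing the target project.
        cmd.append(f"--project={project}")
    return cmd


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    argv = build_batch_submit_command(
        "gs://example-bucket/models/orders.sql",
        region="us-central1",
        project="example-project",
    )
    print(shlex.join(argv))
```

The resulting list could be passed to `subprocess.run` on a machine with an authenticated gcloud CLI. How dbt would actually drive Dataproc Serverless is left open by this request.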

Dataproc Serverless is widely used by GCP customers to build data pipelines. A typical use case is submitting Spark SQL jobs to Dataproc Serverless to transform data and build a data warehouse.

Current Status

dbt currently supports submitting SQL models only through the Spark Thrift server. Users need to deploy a Dataproc cluster, start the Thrift server, and manage the underlying infrastructure.

Request

Support running SQL models on Dataproc Serverless.

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

gddezero added the enhancement (New feature or request) and triage labels Oct 29, 2024

gddezero mentioned this issue Oct 30, 2024

[Feature] Support running SQL models on Google Cloud Dataproc Serverless dbt-labs/dbt-bigquery#1353

Closed

3 tasks

amychen1776 removed the triage label Nov 4, 2024
