[Feature] Support running SQL models on Google Cloud Dataproc Serverless #1353
Comments
Hello @gddezero, could you provide more context about why you prefer Dataproc for SQL rather than running directly on BQ?
At this time, we will not be prioritizing this work (and it wouldn't be done on the BigQuery adapter if it was), so I'm closing this issue for now.
@amychen1776 With Serverless Spark, users do not need to deploy a Spark cluster and Thrift server, which greatly reduces the infrastructure management effort. I should have created this feature request in dbt-spark rather than dbt-bigquery.
@gddezero that would make the most sense :) For dbt-spark, it will require supporting a new connection method, since we currently expect a Thrift server, an ODBC driver, or HTTP.
@amychen1776 Thanks for your advice. I created the feature request in dbt-spark: dbt-labs/dbt-spark#1131
Is this your first time submitting a feature request?
Describe the feature
Context
Google Cloud Dataproc Serverless lets you run Spark workloads without requiring you to provision and manage your own Dataproc cluster. Use the Google Cloud console, Google Cloud CLI, or Dataproc API to submit a batch workload to the Dataproc Serverless service. The service will run the workload on a managed compute infrastructure, autoscaling resources as needed.
Dataproc Serverless is widely used by GCP customers to build data pipelines. A typical use case is submitting Spark SQL jobs to Dataproc Serverless to transform data and build a data warehouse.
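To make the use case concrete, here is a rough sketch of what submitting a Spark SQL batch through the Dataproc API could look like. It only builds the URL and JSON body for a `batches.create` REST call and does not send anything. The helper name `build_spark_sql_batch_request` and the project, region, bucket, and variable values are hypothetical placeholders, and authentication and the HTTP call itself are left out.

```python
import json


def build_spark_sql_batch_request(project, region, batch_id,
                                  query_file_uri, query_variables=None):
    """Return (url, body) for a Dataproc Serverless batches.create call.

    Hypothetical helper for illustration: it assembles the request only,
    it does not authenticate or send it.
    """
    url = (
        "https://dataproc.googleapis.com/v1/"
        f"projects/{project}/locations/{region}/batches?batchId={batch_id}"
    )
    # A Spark SQL batch points at a SQL script stored in Cloud Storage.
    body = {"sparkSqlBatch": {"queryFileUri": query_file_uri}}
    if query_variables:
        # Optional name/value pairs made available to the SQL script.
        body["sparkSqlBatch"]["queryVariables"] = dict(query_variables)
    return url, body


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    url, body = build_spark_sql_batch_request(
        project="my-project",
        region="us-central1",
        batch_id="daily-transform-001",
        query_file_uri="gs://my-bucket/sql/transform.sql",
        query_variables={"run_date": "2024-01-01"},
    )
    print(url)
    print(json.dumps(body, indent=2))
```

An adapter that supported SQL models this way would presumably need to upload the compiled SQL to Cloud Storage first and then poll the batch for completion, which is part of why this differs from the existing Thrift/ODBC/HTTP connection methods.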
Current Status
dbt currently supports only Python models on Dataproc Serverless, as a companion service to BigQuery:
https://docs.getdbt.com/docs/core/connect-data-platform/bigquery-setup#running-python-models-on-dataproc
Request
Support running SQL models on Dataproc Serverless
Describe alternatives you've considered
No response
Who will this benefit?
Customers using Google Cloud
Are you interested in contributing this feature?
No response
Anything else?
No response