Standard flow doesn't work in Databricks with Unity Catalog #190

Closed
1 of 3 tasks
bochkarevnv opened this issue Mar 2, 2023 · 8 comments
Labels
bug Something isn't working Stale triage

Comments

@bochkarevnv

Describe the bug

When I use the plugin in Databricks with Unity Catalog, external tables are dropped and recreated on every run.

Steps to reproduce

Create a source spec like this:

version: 2

sources:
  - name: my_external_table
    catalog: foo
    schema: bar
    tables:
      - name: external_table
        external:
          location: '...'
          using: parquet

Run it the first time with dbt run-operation stage_external_sources
Run it a second time

Expected results

On the first run the external table is created; on the second run nothing needs to be done.

Actual results

On the first run the external table is created; on the second run it is dropped and recreated.

Screenshots and log output

Log (only main lines)

/* {"app": "dbt", "dbt_version": "1.4.1", "dbt_databricks_version": "1.4.1", "databricks_sql_connector_version": "2.3.0", "profile_name": "main_one", "target_name": "prod", "connection_name": "macro_stage_external_sources"} */
show tables in `bar`
Databricks adapter: <class 'databricks.sql.exc.ServerOperationError'>: [SCHEMA_NOT_FOUND] The schema `bar` cannot be found
Databricks adapter: Error while running:
macro show_tables

with database=, schema=bar, relations=[]
1 of 1 (1) drop table if exists `foo`.`bar`.`external_table`

1 of 10 (2) create table `foo`.`bar`.`external_table`

System information

The contents of your packages.yml file:

Which database are you using dbt with?

  • redshift
  • snowflake
  • other (specify: spark, databricks)

The output of dbt --version:

Core:
  - installed: 1.4.1
  - latest:    1.4.4 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - databricks: 1.4.1 - Update available!
  - spark:      1.4.1 - Up to date!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

The operating system you're using:
Windows/Linux

The output of python --version:
Python 3.11.1

Additional context

I think the problem is near https://github.com/dbt-labs/dbt-external-tables/blob/main/macros/plugins/spark/get_external_build_plan.sql#L6

@bochkarevnv bochkarevnv added bug Something isn't working triage labels Mar 2, 2023
@grindheim

@bochkarevnv We also experienced this behaviour. The cause is that the spark plugin's implementation of get_external_build_plan.sql passes none for database. With Unity Catalog, however, the database (catalog) does exist, so it should be set to source_node.database.
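
To illustrate why a missing database leads to a drop and create on every run, here is a minimal Python sketch that models the lookup. It is not the plugin's code: `EXISTING`, `DEFAULT_CATALOG`, `get_relation` and `build_plan` are hypothetical names. It also assumes that a lookup without a catalog lands in some default catalog where schema `bar` does not exist, which is what the SCHEMA_NOT_FOUND line in the log suggests.

```python
from typing import Optional

# Hypothetical in-memory stand-in for a Unity Catalog metastore:
# tables are addressed by (catalog, schema, table).
EXISTING = {("foo", "bar", "external_table")}

# Assumption: unqualified lookups are resolved against some default catalog.
DEFAULT_CATALOG = "hive_metastore"


def get_relation(database: Optional[str], schema: str, identifier: str) -> Optional[tuple]:
    """Return the relation key if it exists, else None (loosely models adapter.get_relation)."""
    catalog = database if database is not None else DEFAULT_CATALOG
    key = (catalog, schema, identifier)
    return key if key in EXISTING else None


def build_plan(database: Optional[str], schema: str, identifier: str,
               full_refresh: bool = False) -> list:
    """Mirror the macro's decision: drop + create when no relation is found."""
    old_relation = get_relation(database, schema, identifier)
    if old_relation is None or full_refresh:
        return ["drop", "create"]
    return ["refresh"]


# database=None (the behaviour described above): the existing table is never found.
plan_without_catalog = build_plan(None, "bar", "external_table")

# database="foo" (i.e. source_node.database): the existing table is found.
plan_with_catalog = build_plan("foo", "bar", "external_table")

print(plan_without_catalog, plan_with_catalog)
```

In this model the table already exists in `foo`.`bar`, yet the lookup without a catalog still yields a drop and create, matching the second-run behaviour reported in the issue.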

This seems to be the only change required to make it work as expected. One could implement a full new plugin for Databricks here, but that might be overkill given that the rest of the files are exactly the same.
Alternatively, one could add a databricks__get_external_build_plan macro to the spark plugin's existing get_external_build_plan.sql file.
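
The reason a databricks__-prefixed macro would be picked up is dbt's adapter dispatch. As far as I understand it, dbt looks for a macro prefixed with the current adapter, then with its parent adapters, then with default__. The Python sketch below only models that search order; `resolve_macro` and the adapter chain `["databricks", "spark"]` are illustrative assumptions, not dbt's actual implementation.

```python
def resolve_macro(name: str, adapter_chain: list, available: set) -> str:
    """Return the first '<adapter>__<name>' found, falling back to 'default__<name>'."""
    for adapter in list(adapter_chain) + ["default"]:
        candidate = f"{adapter}__{name}"
        if candidate in available:
            return candidate
    raise LookupError(f"no implementation found for {name}")


# Assumption: the Databricks adapter falls back to the Spark implementations.
chain = ["databricks", "spark"]

# Today: only spark__ and default__ implementations exist in the package.
macros = {"spark__get_external_build_plan", "default__get_external_build_plan"}
before = resolve_macro("get_external_build_plan", chain, macros)

# After adding a Databricks-specific macro, it takes precedence.
macros.add("databricks__get_external_build_plan")
after = resolve_macro("get_external_build_plan", chain, macros)

print(before, after)
```

Under these assumptions, adding one macro changes the behaviour for Databricks only and leaves plain Spark targets untouched.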

In the short term, you can override the macro locally by creating a new macro SQL file in your project and adding the code below:

{% macro databricks__get_external_build_plan(source_node) %}

    {% set build_plan = [] %}

    {% set old_relation = adapter.get_relation(
        database = source_node.database,
        schema = source_node.schema,
        identifier = source_node.identifier
    ) %}

    {% set create_or_replace = (old_relation is none or var('ext_full_refresh', false)) %}

    {% if create_or_replace %}
        {% set build_plan = build_plan + [
            dbt_external_tables.dropif(source_node), 
            dbt_external_tables.create_external_table(source_node)
        ] %}
    {% else %}
        {% set build_plan = build_plan + dbt_external_tables.refresh_external_table(source_node) %}
    {% endif %}

    {% set recover_partitions = dbt_external_tables.recover_partitions(source_node) %}
    {% if recover_partitions %}
        {% set build_plan = build_plan + [
            recover_partitions
        ] %}
    {% endif %}

    {% do return(build_plan) %}

{% endmacro %}

@github-actions

github-actions bot commented Oct 2, 2023

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 2, 2023
@bochkarevnv
Author

Still relevant

@github-actions github-actions bot removed the Stale label Oct 3, 2023
@github-actions github-actions bot added the Stale label Mar 31, 2024
@bochkarevnv
Author

Still relevant

@github-actions github-actions bot removed the Stale label Apr 2, 2024
@grindheim

grindheim commented Apr 18, 2024

This should be fixed by #236

@github-actions github-actions bot added the Stale label Oct 16, 2024

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Oct 24, 2024
2 participants