[ISSUE] databricks bundle deploy fails to sync files due to 504's/stream timeouts #1102

Open
Solita-VillePuuska opened this issue Dec 16, 2024 · 1 comment
Solita-VillePuuska commented Dec 16, 2024

Description
When we deploy our bundle with the Databricks CLI, syncing files fails with 504 responses and "stream timeout" messages from the server.

This doesn't always happen. So far, deploying a new bundle has never hit this issue, but once it starts, the only fix we have found is to destroy the bundle, after which syncing the bundle's files works again. The issue has come up in two workspaces, both when deploying from an Azure DevOps agent and when deploying from a local machine. In one case, only one of the two workspaces failed while syncing the same files.

Not sure if the issue is in the CLI, SDK, or the REST API (or something completely different), but we were asked to report the issue to the SDK here.

Reproduction
We haven't found a reliable repro.

Expected behavior
Syncing files should not suddenly start failing for a bundle that previously deployed fine.

Is it a regression?
Unknown. We have only tried cli/0.235.0 with databricks-sdk-go/0.51.0.

Debug Logs
The debug logs contain a fair amount of our code, so we can't post them publicly in full. Here are the most relevant parts of the output of databricks bundle deploy -t $(environment) --force-lock --debug

...
< HTTP/2.0 200 OK pid=2808969 mutator=seq mutator=deploy mutator=seq mutator=seq mutator=deferred mutator=seq mutator=files.Upload sdk=true
07:36:38 DEBUG POST /api/2.0/workspace-files/import-file/Workspace/databricks_bundles/.bundle/data_platform/staging/files/<file_A>.py?overwrite=true
> # Databricks notebook source
> # MAGIC %md
> # MAGIC # Test 3:
> # MAGIC 1. Get meters with different ... (488 more bytes)
< HTTP/2.0 200 OK pid=2808969 mutator=seq mutator=deploy mutator=seq mutator=seq mutator=deferred mutator=seq mutator=files.Upload sdk=true
07:36:38 DEBUG POST /api/2.0/workspace-files/import-file/Workspace/databricks_bundles/.bundle/data_platform/staging/files/<file_B>.py?overwrite=true
> # Databricks notebook source
> # MAGIC %md ## Base table (Gold level)
> 
> # COMMAND ----------
> 
> # MAG... (39547 more bytes)
< HTTP/2.0 504 Gateway Timeout
< stream timeout pid=2808969 mutator=seq mutator=deploy mutator=seq mutator=seq mutator=deferred mutator=seq mutator=files.Upload sdk=true
07:36:38 DEBUG non-retriable error: unable to parse response. This is likely a bug in the Databricks SDK for Go or the underlying REST API. Please report this issue with the following debugging information to the SDK issue tracker at https://github.com/databricks/databricks-sdk-go/issues. Request log:
```
POST /api/2.0/workspace-files/import-file/Workspace/databricks_bundles/.bundle/data_platform/staging/files/<file_B>.py?overwrite=true
> * Host: 
> * Accept: application/json
> * Authorization: REDACTED
> * Traceparent: 00-cdaa737d75b3180f2593864c5d56c879-7f6cad4313ca2dff-01
> * User-Agent: cli/0.235.0 databricks-sdk-go/0.51.0 go/1.23.2 os/linux cmd/bundle_deploy cmd-exec-id/b5c42a34-6867-4450-aab1-ad3657f3bfa8 auth/oauth-m2m cicd/azure-devops
> # Databricks notebook source
...
<notebook code>
...
> # MAGIC   on a.Meteringpoint_id = d.Meteringpoint_i... (29403 more bytes)
< HTTP/2.0 504 Gateway Timeout
< * Content-Length: 14
< * Content-Type: text/plain
< * Date: Mon, 16 Dec 2024 07:36:38 GMT
< * Server: databricks
< * Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
< * X-Content-Type-Options: nosniff
< * X-Databricks-Org-Id: <redacted, just in case>
< * X-Request-Id: e473eb16-ba9f-4fe8-bea1-e754af6f5d63
< stream timeout
``` pid=2808969 mutator=seq mutator=deploy mutator=seq mutator=seq mutator=deferred mutator=seq mutator=files.Upload sdk=true
07:36:38 DEBUG POST /api/2.0/workspace-files/import-file/Workspace/databricks_bundles/.bundle/data_platform/staging/files/<file_C>.sql?overwrite=true
> -- Databricks notebook source
> CREATE WIDGET TEXT environment DEFAULT "";
> CREATE WIDGET TEXT sche... (24537 more bytes)
< Error: Post "https://<workspace>/api/2.0/workspace-files/import-file/Workspace%2Fdatabricks_bundles%2F.bundle%2Fdata_platform%2Fstaging%2Ffiles%2F<file_C>.sql?overwrite=true": context canceled pid=2808969 mutator=seq mutator=deploy mutator=seq mutator=seq mutator=deferred mutator=seq mutator=files.Upload sdk=true
07:36:38 DEBUG non-retriable error: failed in rate limiter: context canceled pid=2808969 mutator=seq mutator=deploy mutator=seq mutator=seq mutator=deferred mutator=seq mutator=files.Upload sdk=true
07:36:38 DEBUG POST /api/2.0/workspace-files/import-file/Workspace/databricks_bundles/.bundle/data_platform/staging/files/<file_D>.sql?overwrite=true
> -- Databricks notebook source
> CREATE WIDGET TEXT environment DEFAULT "cosdpdev";
> 
> -- COMMAND ---... (36799 more bytes)
< Error: Post "https://<workspace>/api/2.0/workspace-files/import-file/Workspace%2Fdatabricks_bundles%2F.bundle%2Fdata_platform%2Fstaging%2Ffiles%2F<file_D>.sql?overwrite=true": context canceled pid=2808969 mutator=seq mutator=deploy mutator=seq mutator=seq mutator=deferred mutator=seq mutator=files.Upload sdk=true
07:36:38 DEBUG POST /api/2.0/workspace-files/import-file/Workspace/databricks_bundles/.bundle/data_platform/staging/files/<file_E>.sql?overwrite=true
...
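
Since the debug output embeds the uploaded file contents, a small filter could make a full log shareable. The following is a minimal sketch, not part of the CLI or the SDK: it assumes the format visible in the excerpt above, where request lines start with `>`, request headers look like `> * Name: value`, and every other `>` line is request body. The function name `redact_request_bodies` is hypothetical.

```python
import re

# Assumed format, inferred from the excerpt above:
#   "> * Name: value"  -> request header (kept)
#   any other ">" line -> request body (redacted)
HEADER_RE = re.compile(r"^> \* [A-Za-z0-9-]+:")
# The CLI truncates bodies and reports the remainder as "... (N more bytes)".
MORE_BYTES_RE = re.compile(r"\((\d+) more bytes\)\s*$")


def redact_request_bodies(lines):
    """Collapse each run of request-body lines into one placeholder line."""
    out = []
    run_length = 0     # body lines seen in the current run
    more_bytes = None  # truncated byte count reported in the run, if any

    def flush():
        nonlocal run_length, more_bytes
        if run_length:
            note = f"> [request body redacted: {run_length} line(s)"
            if more_bytes is not None:
                note += f", {more_bytes} more bytes"
            out.append(note + "]")
        run_length, more_bytes = 0, None

    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">") and not HEADER_RE.match(line):
            run_length += 1
            match = MORE_BYTES_RE.search(line)
            if match:
                more_bytes = int(match.group(1))
        else:
            # Timestamps, headers, and "<" response lines pass through as-is.
            flush()
            out.append(line)
    flush()
    return out
```

Passing the lines of a saved log through this function and joining the result with newlines keeps the request paths, status codes, `X-Request-Id` values, and body sizes while dropping the notebook source. A body line that happens to look like `> * Name:` would slip through, so the output still needs a manual check before posting.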

Other Information

  • cli/0.235.0
  • databricks-sdk-go/0.51.0
  • go/1.23.2

Additional context

@shreyas-goenka
Contributor

Thanks for reaching out, @Solita-VillePuuska. I'm not sure what's going wrong here, but I have reached out internally to the team that owns the import-file functionality to better understand why you are seeing the 504 errors.
