Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clarify ssh tunnel dbt connection error #4913

Merged
merged 21 commits into from
Mar 1, 2024
Merged
Changes from 7 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
b2b349c
clarify ssh tunnel dbt connection error
mirnawong1 Feb 14, 2024
cb392dc
Merge branch 'current' into mirnawong1-patch-27
mirnawong1 Feb 16, 2024
3af1d9e
Merge branch 'current' into mirnawong1-patch-27
mirnawong1 Feb 29, 2024
6081433
This branch was auto-updated!
github-actions[bot] Feb 29, 2024
40cd091
This branch was auto-updated!
github-actions[bot] Feb 29, 2024
9dc77cc
This branch was auto-updated!
github-actions[bot] Feb 29, 2024
ffae41d
This branch was auto-updated!
github-actions[bot] Feb 29, 2024
b9230f0
This branch was auto-updated!
github-actions[bot] Mar 1, 2024
e7fc7bd
This branch was auto-updated!
github-actions[bot] Mar 1, 2024
1350fce
This branch was auto-updated!
github-actions[bot] Mar 1, 2024
7e5d033
Update website/docs/docs/cloud/connect-data-platform/connect-redshift…
mirnawong1 Mar 1, 2024
01d7df0
Update website/docs/docs/cloud/connect-data-platform/connect-redshift…
mirnawong1 Mar 1, 2024
4b0be49
Update website/docs/docs/cloud/connect-data-platform/connect-redshift…
mirnawong1 Mar 1, 2024
f794f0b
Update website/docs/docs/cloud/connect-data-platform/connect-redshift…
mirnawong1 Mar 1, 2024
ea71e1a
Update website/docs/docs/cloud/connect-data-platform/connect-redshift…
mirnawong1 Mar 1, 2024
7f8aa09
Update website/docs/docs/cloud/connect-data-platform/connect-redshift…
mirnawong1 Mar 1, 2024
5f3829e
Update website/docs/docs/cloud/connect-data-platform/connect-redshift…
mirnawong1 Mar 1, 2024
13d68a2
Update website/docs/docs/cloud/connect-data-platform/connect-redshift…
mirnawong1 Mar 1, 2024
de1501d
This branch was auto-updated!
github-actions[bot] Mar 1, 2024
75f98e6
This branch was auto-updated!
github-actions[bot] Mar 1, 2024
3023390
This branch was auto-updated!
github-actions[bot] Mar 1, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -63,39 +63,36 @@ chmod 600 ~/.ssh/authorized_keys

The Bastion server should now be ready for dbt Cloud to use as a tunnel into the Redshift environment.

#### Intermittent connection issues

<details>
<summary>Database Error - could not connect to server: Connection timed out</summary>
You will have the following components when you configure a connection to a database using an SSH tunnel: <br />
- An Elastic Load Balancer (ELB) or Network Load Balancing (NLB) instance.<br />
- A bastion host (or jump server) running the <code>sshd</code> process.<br />
- A Database (such as Redshift cluster)<br /><br />

dbt Cloud establishes an SSH tunnel connection through the ELB or NLB to the <code>sshd</code> process. This is responsible for routing traffic to the database. When dbt initiates a job run, an SSH tunnel is created at the start of the run. If this SSH tunnel fails at any point, the job will also fail.<br />

The most common causes of tunnel failures are:<br />
- The SSH daemon terminates the session due to an idle timeout.<br />
- The ELB or NLB terminates the connection when it's idle.<br />

dbt Cloud sets a value for its SSH tunnel called `ServerAliveInterval` and `ServerAliveCountMax` that polls the connection every 30 seconds and the underlying OS in our run "pods" will terminate the connection if the `sshd` process fails to respond after 300s. This will, in many cases, prevent an idle timeout entirely so long as the customer is not using ELB with a firewall-level idle timeout of less than 30 seconds. However, if the customer is using ELB and is using an Idle Connection Timeout of less than 30s, this will be insufficient to prevent tunnels from being terminated. <br />

Some versions of Linux used on bastion hosts use a version of `sshd` with additional idle timeout settings `ClientAliveCountMax`. This value sets the number of client alive messages that may be sent without `sshd` receiving any messages back from the client. If this threshold is reached while client alive messages are being sent, `sshd` will disconnect the client, terminating the session. The client-alive mechanism is helpful when the client or server needs to know when a connection has become inactive. The default value is 3.<br /><br />

`ClientAliveInterval`:<br />
This value sets a timeout interval in seconds after which if no data has been received from the client, `sshd` will send a message through the encrypted channel to request a response from the client. The default is 0, indicating that these messages will not be sent to the client.<br />

Using default values, tunnels could be terminated prematurely by `sshd`. To solve this problem, the `/etc/ssh/sshd_config` file on the bastion host can be configured with the following values:<br /><br />
- `ClientAliveCountMax` 10<br />
- `ClientAliveInterval` 30<br />
where `ClientAliveCountMax` should be set to a non-zero value and `ClientAliveInterval` should be a value less than the ELB or NLB idle timeout value.<br />

With these settings, unresponsive SSH clients will be disconnected after approximately 300 seconds, helping to prevent tunnel failures.
</details>


## Configuration

To learn how to optimize performance with data platform-specific configurations in dbt Cloud, refer to [Redshift-specific configuration](/reference/resource-configs/redshift-configs).
To optimize performance with data platform-specific configurations in dbt Cloud, refer to [Redshift-specific configuration](/reference/resource-configs/redshift-configs).

To grant users or roles database permissions (access rights and privileges), refer to the [Redshift permissions](/reference/database-permissions/redshift-permissions) page or [Postgres permissions](/reference/database-permissions/postgres-permissions) page.

## FAQs

<detailsToggle alt_header="Database Error - could not connect to server: Connection timed out">
When setting up a database connection using an SSH tunnel, you typically need these parts:
mirnawong1 marked this conversation as resolved.
Show resolved Hide resolved

- A Load Balancer (like ELB or NLB) to manage traffic.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may be better served keeping the initialisms spelled out first.

mirnawong1 marked this conversation as resolved.
Show resolved Hide resolved
- A bastion host (or jump server) that runs the SSH protocol, acting as a secure entry point.
- The database itself (such as a Redshift cluster).

dbt Cloud uses an SSH tunnel to connect through the Load Balancer to the database. This tunnel is set up at the start of any dbt job run. If the tunnel drops, the job fails.
mirnawong1 marked this conversation as resolved.
Show resolved Hide resolved

Tunnel failures usually happen because:

- The SSH service stops if there's no activity for too long.
mirnawong1 marked this conversation as resolved.
Show resolved Hide resolved
- The Load Balancer cuts off the connection if it's idle.
mirnawong1 marked this conversation as resolved.
Show resolved Hide resolved
- dbt Cloud tries to keep the connection alive by checking in every 30 seconds, and the system will end the connection if there's no response from the SSH service after 300 seconds. This helps avoid drops due to inactivity unless the Load Balancer's timeout is less than 30 seconds.

Bastion hosts might have additional SSH settings to disconnect inactive clients after several checks without a response. By default, it checks three times.

To prevent premature disconnections, you can adjust the settings on the bastion host:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are recommended settings, we should state that, otherwise leave it a little more open ended for customers to configure what's right for them


- `ClientAliveCountMax 10` &mdash; Allows more checks before deciding the client is inactive.
mirnawong1 marked this conversation as resolved.
Show resolved Hide resolved
- `ClientAliveInterval 30` &mdash; Checks for client activity every 30 seconds.
mirnawong1 marked this conversation as resolved.
Show resolved Hide resolved
These adjustments ensure that inactive SSH clients are disconnected after about 300 seconds, reducing the chance of tunnel failures.
mirnawong1 marked this conversation as resolved.
Show resolved Hide resolved

</detailsToggle>
Loading