
S3 requests after the hyper s3 timeout connection duration fail over large table operations #2301

Closed
liamphmurphy opened this issue Mar 18, 2024 · 39 comments
Labels
binding/python (Issues for the Python package), binding/rust (Issues for the Rust crate), bug (Something isn't working), storage/aws (AWS S3 storage related)

Comments

@liamphmurphy
Contributor

liamphmurphy commented Mar 18, 2024

Environment

Delta-rs version: python v0.16.0

Binding: Python

Environment:

  • Cloud provider: AWS

Bug

What happened:

On three of our large Delta tables (out of 500; most of the others are a lot smaller) I've repeatedly been seeing errors related to an S3 timeout. I get two kinds of errors: one indicates an error reading data from a parquet file, the other is much more generic. Here they are:

Failed to parse parquet: Parquet error: AsyncChunkReader::get_bytes error: Generic S3 error: Error after 10 retries in 174.881515313s, max_retries:10, retry_timeout:180s, source:error sending request for url (https://s3.us-east-1.amazonaws.com/...redacted...some-data-file.parquet): operation timed out

AND

ERROR - Error deleting events from s3a://path-to-delta-table: Failed to parse parquet: Parquet error: AsyncChunkReader::get_bytes error: Generic S3 error: request or response body error: operation timed out

This can happen in various contexts, but the prominent example in my case is needing to scan the entire Delta table for records matching a value and deleting those matches. I've added a log line right before this delete op runs, and the time delta between that log and the error is usually around 5 minutes. This time delta is relevant.

The second error in particular comes up a lot. This is on a table that gets around 250,000 rows added every hour and is partitioned by date and hour (e.g. 2024-03-18, hour 15).

My initial thought was that this was due to a failure to download a larger file such as a checkpoint, but those are "only" 668 MB, and that size is certainly not unheard of; the various AWS SDKs will gladly move larger files.

Our current theory is instead that the hyper client makes a request, the library then does some in-memory operations (such as scanning a large Delta table) that take a long time, and by the time delta-rs is ready to reach out to S3 again it has exceeded some kind of keep-alive timeout and fails. This is why the roughly five-minute time delta I mentioned earlier is important.

This seems related to this comment:

/// The `pool_idle_timeout` option of aws http client. Has to be lower than 20 seconds, which is
/// default S3 server timeout <https://aws.amazon.com/premiumsupport/knowledge-center/s3-socket-connection-timeout-error/>.
/// However, since rusoto uses hyper as a client, its default timeout is 90 seconds
/// <https://docs.rs/hyper/0.13.2/hyper/client/struct.Builder.html#method.keep_alive_timeout>.
/// Hence, the `connection closed before message completed` could occur.
/// To avoid that, the default value of this setting is 15 seconds if it's not set otherwise.
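For reference, here is a minimal sketch of how these knobs could be tuned from the Python side, assuming storage_options keys such as timeout and pool_idle_timeout are forwarded to the underlying HTTP client config (the table URI and values are placeholders, not recommendations):

from deltalake import DeltaTable

# Placeholder URI and illustrative values; whether pool_idle_timeout is honored
# by the current client is an assumption I have not verified.
storage_options = {
    "AWS_REGION": "us-east-1",
    "timeout": "300s",            # overall per-request timeout
    "pool_idle_timeout": "15s",   # drop pooled connections before the server does
}
dt = DeltaTable("s3://my-bucket/my-large-table", storage_options=storage_options)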

And finally, I think this sort of timeout is expected from the hyper client, per this (admittedly old) ticket, which explains this exact situation in another context: hyperium/hyper#763

What you expected to happen:

Even on long-running operations, such as a scan over the whole table for a deletion, the operation would eventually succeed barring any other issues.

How to reproduce it:

IF I AM CORRECT: perform a long-running Delta operation, such as a delete on a table with millions of rows, so that the final request to S3 happens only after a long in-memory scan.
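A hypothetical sketch of that reproduction (the table path, region, and predicate are placeholders; DeltaTable.delete takes a SQL-style predicate string):

from deltalake import DeltaTable

# Placeholder table location and predicate.
dt = DeltaTable(
    "s3a://my-bucket/path/to/large-delta-table",
    storage_options={"AWS_REGION": "us-east-1"},
)

# The full-table scan for matching rows can take several minutes; if the theory
# above holds, the S3 request issued after that scan reuses a stale pooled
# connection and fails with "operation timed out".
dt.delete("event_id = 'value-to-purge'")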

More details:

Note this is just a theory. I wanted to get this issue out there to see what others think; I'm going to dig in and see if I can figure out whether it's an issue with the library or something else.

liamphmurphy added the bug label Mar 18, 2024
@ion-elgreco
Collaborator

@liamphmurphy I'm happy I wasn't the only one facing this. Was questioning our infra setup

@liamphmurphy
Contributor Author

@liamphmurphy I'm happy I wasn't the only one facing this. Was questioning our infra setup

Haha I'm also glad to know I'm not completely crazy!

This may be relevant: awslabs/aws-sdk-rust#1079 (comment)

If this fix is related, note that the hyper version delta-rs uses is quite old, so it would be a big version jump.

@ion-elgreco
Collaborator

@liamphmurphy just to double check, do you also see the behavior on 0.15.3? That's what we use at the moment

@ion-elgreco
Collaborator

The linked code you shared is actually not out yet in a Python release. It was recently refactored by @mightyshazam to use the AWS SDKs.

@liamphmurphy
Contributor Author

@ion-elgreco Tested with 0.15.3 and yep I get that second more generic error still.

And ahh okay understood, guess that explains why changing that env var wasn't doing anything haha. I'll bring that down locally and see if that fixes it

@ion-elgreco
Collaborator

@liamphmurphy we actually see it happen within 1.5 minutes already:

The above exception was caused by the following exception:
_internal.DeltaError: Generic DeltaTable error: External error: External error: Arrow error: External error: Parquet error: ParquetObjectReader::get_byte_ranges error: Generic S3 error: request or response body error: operation timed out

@liamphmurphy
Contributor Author

liamphmurphy commented Mar 18, 2024

I did some debugging and found where this error occurs (at least for me)

It happens here:

The checkpoint_stream.next() call hangs for some time, and then eventually returns an error:

Err(Parquet { source: General("ParquetObjectReader::get_byte_ranges error: Generic S3 error: request or response body error: operation timed out") })

So it appears that it finds the right checkpoint file (which for me is around 800 MB) and detects any unaccounted-for log entries made after that checkpoint, but then fails while scanning this stream.

ion-elgreco added the binding/python, binding/rust, and storage/aws labels Mar 19, 2024
@liamphmurphy
Contributor Author

And here are some additional logs in case they're helpful (the "checkpoint files" print line is my own); this is the last snippet of RUST_LOG=debug output leading up to the error.

checkpoint files: [ObjectMeta { location: Path { raw: "_delta_log/00000000000000108548.checkpoint.parquet" }, last_modified: 2024-03-19T13:01:23Z, size: 953533417, e_tag: Some("\"tag\""), version: None }]
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] reuse idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] flushed 1575 bytes
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] parsed 12 headers
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body is content-length (8 bytes)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] pooling idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] reuse idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] flushed 1575 bytes
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] parsed 12 headers
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body is content-length (86293 bytes)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] pooling idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] reuse idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] flushed 1567 bytes
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] parsed 12 headers
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body is content-length (953409770 bytes)
[2024-03-19T14:23:58Z DEBUG hyper::client::client] client connection error: connection closed before message completed

@ion-elgreco
Collaborator

@liamphmurphy very useful! So for you it's reading these huge checkpoints, then.

I think in our case I've figured out what's happening: sometimes our incoming data has 0 rows, which was causing the merge to scan the whole table. I will create a separate issue for that, but there is crossover in the connection issues.

@liamphmurphy
Contributor Author

@ion-elgreco Yeah, it appears to be. I presume a stream is used here to accommodate large checkpoints like this without loading them entirely into memory, but then that hyper client connection closes before it can finish. 🤔

@thomasfrederikhoeck
Contributor

thomasfrederikhoeck commented Apr 3, 2024

@ion-elgreco we're seeing a similar issue using Azure. When we run the following code it fails with _internal.DeltaError: Failed to parse parquet: Parquet error: Z-order failed while scanning data: ArrowError(ExternalError(General("ParquetObjectReader::get_byte_ranges error: Generic MicrosoftAzure error: request or response body error: operation timed out")), None)

import os
os.environ["RUST_LOG"] = "debug"

from deltalake import DeltaTable

blob_path = "az://<redacted path>"
storage_options = {
    "AZURE_STORAGE_ACCOUNT_NAME": "<redacted sa>",
    "AZURE_CONTAINER_NAME": "<redacted container>",
    "use_azure_cli": "true",
}

dt = DeltaTable(blob_path, storage_options=storage_options)
dt.optimize.z_order(["StatusDateTime"])

The table consists of 12 parquet files of roughly 400 MB each.

You can see the full error log here:
https://gist.githubusercontent.com/thomasfrederikhoeck/8db799b3fe4fbc2414626a97da9249a9/raw/658e6cbcf090fba424e7eafae99eea30c4ae7231/txt

@ion-elgreco
Collaborator

@thomasfrederikhoeck can you try this? I believe the default is quite low:

storage_options = {"timeout": "120s"}

@thomasfrederikhoeck
Contributor

thomasfrederikhoeck commented Apr 3, 2024

@ion-elgreco I just ran it with "timeout": "120s". It ran for much longer (15+ minutes) but failed with the following error. I noticed that after a few minutes the network throughput dropped to zero, indicating that nothing was happening.

DeltaError: Failed to parse parquet: Parquet error: Z-order failed while scanning data: ArrowError(ExternalError(General("ParquetObjectReader::get_byte_ranges error: Generic MicrosoftAzure error: Error after 0 retries in 86.1024429s, max_retries:10, retry_timeout:180s, source:error sending request for url (https://<redacted sa>.blob.core.windows.net/<redacted path>/DWCreatedDate=2024-02-04%2001:17:58.000000/Year=2019/part-00001-560c8d94-ed4e-4047-b268-ec11ab58100e-c000.snappy.parquet): connection error: An existing connection was forcibly closed by the remote host. (os error 10054)")), None)
The logs go like this:

[2024-04-03T15:03:08Z DEBUG hyper::proto::h1::io] flushed 10960 bytes
[2024-04-03T15:03:08Z DEBUG datafusion_physical_plan::sorts::sort] Spilling sort data of ExternalSorter to disk whilst inserting
[2024-04-03T15:03:08Z DEBUG hyper::client::client] client connection error: connection error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:03:17Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:37Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:38Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:55Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:56Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:03:56Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:56Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:03:56Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:21Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:22Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:22Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:22Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:22Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:39Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:39Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:56Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:56Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:56Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:56Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:11Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:23Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)

@ion-elgreco
Collaborator

ion-elgreco commented Apr 3, 2024

@liamphmurphy @thomasfrederikhoeck can you try this branch? #2378

Likely not the issue but worthwhile to try

@thomasfrederikhoeck on your side, can you check the logs of your object store? Your error message indicates the connection got closed from that side.

@thomasfrederikhoeck
Contributor

thomasfrederikhoeck commented Apr 3, 2024

@ion-elgreco just tried #2378. I still get the error with the following logs. Might this not be related to something like apache/arrow-rs#5366 and pola-rs/polars#14384 (comment)?

With regards to the closing, it only happens when I set the timeout to 120s. I'm not 100% sure how to get logs out of the Azure Storage Account.


[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body is content-length (948749 bytes)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:04Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:04Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:04Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:04Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:05Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:05Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:06Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:06Z DEBUG hyper::proto::h1::io] flushed 10958 bytes
[2024-04-03T20:40:07Z DEBUG hyper::proto::h1::io] parsed 21 headers
[2024-04-03T20:40:07Z DEBUG hyper::proto::h1::conn] incoming body is content-length (883496 bytes)
[2024-04-03T20:40:08Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:08Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:09Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:09Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:10Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:11Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:12Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:13Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:14Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:15Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:18Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:19Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:19Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:19Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:22Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:22Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:23Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:24Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:26Z DEBUG hyper::client::pool] reuse idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:27Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:28Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:29Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:30Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:34Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:34Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:34Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:34Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:35Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:35Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:36Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:38Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:40Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:40Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:52Z DEBUG reqwest::connect] starting new connection: https://<redacted sa>.blob.core.windows.net/
[2024-04-03T20:40:52Z INFO  object_store::client::retry] Encountered transport error backing off for 0.1 seconds, retry 1 of 10: error sending request for url (https://<redacted sa>.blob.core.windows.net/<redacted path>/DWCreatedDate=2024-02-04%2001:17:58.000000/Year=2013/part-00001-de222443-0674-41b8-bdb4-53050a65fa47-c000.snappy.parquet): error trying to connect: dns error: task 1748 was cancelled
Traceback (most recent call last):
  File "c:\projects\ion-elgreco\delta-rs\test.py", line 14, in <module>
    dt.optimize.z_order(["StatusDateTime"])
  File "C:\projects\ion-elgreco\delta-rs\python\deltalake\table.py", line 2022, in z_order
    metrics = self.table._table.z_order_optimize(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_internal.DeltaError: Failed to parse parquet: Parquet error: Z-order failed while scanning data: ArrowError(ExternalError(General("ParquetObjectReader::get_byte_ranges error: Generic MicrosoftAzure error: request or response body error: operation timed out")), None)

@ion-elgreco
Collaborator

@thomasfrederikhoeck @liamphmurphy can you guys please try out this PR: #2392

@thomasfrederikhoeck
Contributor

thomasfrederikhoeck commented Apr 7, 2024

@ion-elgreco it did not solve it. :-/ I see the following 5 times:

[2024-04-07T19:26:24Z INFO object_store::client::retry] Encountered transport error backing off for 0.1 seconds, retry 1 of 10: error sending request for url (https://<redacted sa>.blob.core.windows.net/<redacted path>/DWCreatedDate=2024-02-04%2001:17:58.000000/Year=2013/part-00002-de222443-0674-41b8-bdb4-53050a65fa47-c000.snappy.parquet): operation timed out

@ion-elgreco
Collaborator

@ion-elgreco it did not solve it. :-/ I see the following 5 times:

[2024-04-07T19:26:24Z INFO object_store::client::retry] Encountered transport error backing off for 0.1 seconds, retry 1 of 10: error sending request for url (https://<redacted sa>.blob.core.windows.net/<redacted path>/DWCreatedDate=2024-02-04%2001:17:58.000000/Year=2013/part-00002-de222443-0674-41b8-bdb4-53050a65fa47-c000.snappy.parquet): operation timed out

How long is your timeout set to?

@thomasfrederikhoeck
Contributor

@ion-elgreco That is without the timeout explicitly set. If I set the timeout to 120s it is the same, it just takes longer. I notice a pattern where CPU and RAM are almost fully used until I start seeing [2024-04-08T07:10:44Z DEBUG rustls::common_state] Sending warning alert CloseNotify, after which RAM starts being released and CPU load drops. This happens roughly 5 minutes after the script starts. After I see this log there is no network activity. Do you have an idea of where we can add more debug logging to try and understand why and where this happens? :-)

@ion-elgreco
Collaborator

@thomasfrederikhoeck I am not really sure what the issue could be, because your previous errors showed "closed by the remote host", so that sounds more like a server-side kill of the connection.

@thomasfrederikhoeck
Contributor

thomasfrederikhoeck commented Apr 8, 2024

@ion-elgreco if I use the default timeout I don't see the error about the remote closing the connection, but in both cases I see object_store reporting operation timed out.

I just tried from an Azure VM to maximize network throughput and that was successful. Is there some way to limit the concurrency, to test whether it is because I'm overloading my network when I try from my laptop? I used to have similar issues in polars until this was merged: pola-rs/polars#15083 (see pola-rs/polars#14384 (comment) and pola-rs/polars#14384 (comment))

I'm not familiar with how polars has opted to hook up tokio, but it is important to keep CPU-bound tasks such as decoding parquet off the threadpool used for performing IO.

@ion-elgreco
Collaborator

ion-elgreco commented Apr 8, 2024

@thomasfrederikhoeck do you mind checking if you see the timeout behavior with 0.15.1?

In 0.15.2 the log replay introduced a new replay-stream concept using futures; it might be the cause.

Ah, I see now that it happens in Optimize for you, which makes sense I guess. The RecordBatchStream is created using a futures::stream.

Let me see if I can stretch my knowledge of async Rust here...

@thomasfrederikhoeck
Contributor

@ion-elgreco I tried 0.15.1 but that unfortunately didn't help.

I wish I could assist, but my knowledge of Rust is very limited, async or not :-D Let me know if there is anything I can test out.

I also want to mention that when running the script above (#2301 (comment)) I sometimes see the following in the logs:

thread 'tokio-runtime-worker' panicked at C:\Users\runneradmin\.cargo\registry\src\index.crates.io-6f17d22bba15001f\bytes-1.5.0\src\bytes.rs:261:9:
range end out of bounds: 38957 <= 32012
stack backtrace:
   0:     0x7ffa3ae120ba - BrotliDecoderVersion
   1:     0x7ffa3ae39f7b - BrotliDecoderVersion
   2:     0x7ffa3ae0d271 - BrotliDecoderVersion
   3:     0x7ffa3ae11e3a - BrotliDecoderVersion
   4:     0x7ffa3ae147da - BrotliDecoderVersion
   5:     0x7ffa3ae14448 - BrotliDecoderVersion
   6:     0x7ffa3ae14e88 - BrotliDecoderVersion
   7:     0x7ffa3ae14d6d - BrotliDecoderVersion
   8:     0x7ffa3ae12d79 - BrotliDecoderVersion
   9:     0x7ffa3ae14a80 - BrotliDecoderVersion
  10:     0x7ffa3af78617 - BrotliDecoderVersion
  11:     0x7ffa382b4d0b - PyInit__internal
  12:     0x7ffa382e4779 - PyInit__internal
  13:     0x7ffa382c07e7 - PyInit__internal
  14:     0x7ffa382aefa9 - PyInit__internal
  15:     0x7ffa3a06bffd - BrotliDecoderVersion
  16:     0x7ffa382e76a8 - PyInit__internal
  17:     0x7ffa39fdc91f - BrotliDecoderVersion
  18:     0x7ffa39d2f2e5 - bz_internal_error
  19:     0x7ffa38fb479c - PyInit__internal
  20:     0x7ffa38fb5755 - PyInit__internal
  21:     0x7ffa38fb01cf - PyInit__internal
  22:     0x7ffa38d04962 - PyInit__internal
  23:     0x7ffa38fc0e4f - PyInit__internal
  24:     0x7ffa393490c9 - bz_internal_error
  25:     0x7ffa393ad49c - bz_internal_error
  26:     0x7ffa393e4aee - bz_internal_error
  27:     0x7ffa394413ed - bz_internal_error
  28:     0x7ffa3930240a - bz_internal_error
  29:     0x7ffa392fcf09 - bz_internal_error
  30:     0x7ffa392936e9 - bz_internal_error
  31:     0x7ffa39432fbb - bz_internal_error
  32:     0x7ffa3ad9d1b0 - BrotliDecoderVersion
  33:     0x7ffa3ad9b0be - BrotliDecoderVersion
  34:     0x7ffa3ad8f84e - BrotliDecoderVersion
  35:     0x7ffa3ad9a8c3 - BrotliDecoderVersion
  36:     0x7ffa3ad818fe - BrotliDecoderVersion
  37:     0x7ffa3ada2dc0 - BrotliDecoderVersion
  38:     0x7ffa3ad8ca2c - BrotliDecoderVersion
  39:     0x7ffa3ad84f4e - BrotliDecoderVersion
  40:     0x7ffa3ad85d58 - BrotliDecoderVersion
  41:     0x7ffa3ae2312c - BrotliDecoderVersion
  42:     0x7ffb3de07344 - BaseThreadInitThunk
  43:     0x7ffb3e6026b1 - RtlUserThreadStart

@thomasfrederikhoeck
Contributor

@ion-elgreco I can maybe hire a consultant to look at this, if you would have the time to review a PR, given that I can't really contribute myself.

@ion-elgreco
Collaborator

@thomasfrederikhoeck for sure, I am happy to free up time to review a PR :)

@thomasfrederikhoeck
Contributor

@ion-elgreco I have someone working on it now, but given that object_store is changing things around AsyncWrite, do you think it makes sense to continue? Wouldn't it maybe be better to wait for #2436, since it looks like a new release of the arrow crate might not be too far away: apache/arrow-rs#5688

For reference:
https://github.com/apache/arrow-rs/blob/cd3331989d65f6d56830f9ffa758b4c96d10f4be/object_store/CHANGELOG.md

@ion-elgreco
Collaborator

ion-elgreco commented Apr 25, 2024

@ion-elgreco I have someone working on it now, but given that object_store is changing things around AsyncWrite, do you think it makes sense to continue? Wouldn't it maybe be better to wait for #2436, since it looks like a new release of the arrow crate might not be too far away: apache/arrow-rs#5688

For reference:
https://github.com/apache/arrow-rs/blob/cd3331989d65f6d56830f9ffa758b4c96d10f4be/object_store/CHANGELOG.md

Yes, it's better to first upgrade the object store to use the Upload trait and see how much of an impact that has. Perhaps that person can help with refactoring to those APIs.

We are, however, stuck because we can't upgrade beyond DataFusion 37; see #2395 and apache/datafusion#10181

@abhiaagarwal
Contributor

@thomasfrederikhoeck can you try #2581, which bumps the object_store dep? I refactored the object store wrapper on the Python side to use an internal buffer with the Upload trait that flushes periodically; it may solve this issue.

@thomasfrederikhoeck
Contributor

@abhiaagarwal Nice! I will test today or tomorrow :-)

@abhiaagarwal
Contributor

@thomasfrederikhoeck it's merged, try main :)

@ion-elgreco
Collaborator

@abhiaagarwal it's actually already released 😉

@thomasfrederikhoeck
Contributor

thomasfrederikhoeck commented Jun 12, 2024

@abhiaagarwal @ion-elgreco It looks like that has solved it (I'm not getting any timeouts). NICE! However, instead I got the following:

_internal.DeltaError: Failed to parse parquet: Parquet error: Z-order failed while scanning data: ParquetError(General("AsyncChunkReader::get_bytes error: Generic MicrosoftAzure error: error decoding response body"))

I will create a new issue :-)

Edit: #2592

@abhiaagarwal
Contributor

Hmm, @liamphmurphy can you check whether this has been solved on the AWS side? The error you're getting, @thomasfrederikhoeck, is the exact same one that was reported above; the timeouts seem to be solved.

Digging a bit deeper, the error is originating from here, where it's performing an HTTP range request to get the parquet metadata without downloading the whole file (DuckDB does something similar). I wonder if the error still lies with object_store's handling of HTTP range requests?

@ion-elgreco
Collaborator

@abhiaagarwal you didn't update the Rust writer IIRC?

@abhiaagarwal
Contributor

@ion-elgreco yep, but the failure has to do with a read. Perhaps there's a path where it reads the parquet metadata before it performs the delete operation?

@thomasfrederikhoeck
Contributor

Hmm, @liamphmurphy can you check whether this has been solved on the AWS side? The error you're getting, @thomasfrederikhoeck, is the exact same one that was reported above; the timeouts seem to be solved.

Digging a bit deeper, the error is originating from here, where it's performing an HTTP range request to get the parquet metadata without downloading the whole file (DuckDB does something similar). I wonder if the error still lies with object_store's handling of HTTP range requests?

@abhiaagarwal the new error I'm getting now is not the one reported above. Previously I was getting the timeouts as reported above, but your PR has fixed this (I tried a few times and didn't see it a single time). So I believe #2592 is new :-)

@ion-elgreco
Collaborator

@ion-elgreco yep, but the failure has to do with a read. Perhaps there's a path where it reads the parquet metadata before it performs the delete operation?

It might be related to reading the checkpoint; we also often scan the parquet files.

@abhiaagarwal
Contributor

@thomasfrederikhoeck yeah, I'm just curious to see whether the same error is observed on the AWS side. The error here seems to be "the internal parquet reader can't decode the metadata returned by the HTTP range request" (where it previously would time out even trying to execute the range request), which could point to an implementation flaw in the upstream arrow dependency or maybe in the Azure object_store implementation.

@kszlim

kszlim commented Aug 12, 2024

I'm curious if there's a PR that closed this? Was it the object store bump?
