Skip to content

Improve reqwests error display #48

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
crepererum opened this issue Sep 10, 2024 · 15 comments
Open

Improve reqwests error display #48

crepererum opened this issue Sep 10, 2024 · 15 comments
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@crepererum
Copy link
Contributor

Originally reported in #272 :

For #272 we only report Generic S3 error: error decoding response body. We should try to improve the error message. It seems that the Display impl. is not super helpful and we might wanna use Debug instead, see seanmonstar/reqwest#2373

@tustvold
Copy link
Contributor

tustvold commented Sep 10, 2024

One thing to be aware of is exposing sensitive headers, or URL path segments.

We could also provide information on common causes of this error - e.g. network/runtime instability

@crepererum crepererum self-assigned this Sep 10, 2024
@itsjunetime
Copy link

take

@crepererum crepererum removed their assignment Sep 15, 2024
@erratic-pattern
Copy link

take

@itsjunetime itsjunetime removed their assignment Sep 30, 2024
@erratic-pattern
Copy link

One thing to be aware of is exposing sensitive headers, or URL path segments. We could also provide information on common causes of this error - e.g. network/runtime instability

The documentation for reqwest::Error only mentions URLs as a concern. Hopefully this means that there are no headers in the output, but I do not know for certain.

There is reqwest::Error::without_url which can be used to scrub the URL from the reqwest error. Either the user could explicitly call this when they have sensitive information in URLs, or we could provide a helper method to accomplish this.

Another option is to have for each object storage API provide its own scrubbing method to eliminate specific sensitive headers from AWS, Azure, etc. I do not know enough about these object storage APIs to make this change.

For now I am just going to change to using the Debug information to see if it provides enough information.

erratic-pattern referenced this issue in erratic-pattern/arrow-rs Oct 6, 2024
see https://github.com/apache/arrow-rs/issues/6377

This should help in the interim to identify specific causes of, sometimes confusing, network
related errors. I think this change should be temporary until we can find a better way to
improve error messages.
erratic-pattern referenced this issue in erratic-pattern/arrow-rs Oct 6, 2024
see https://github.com/apache/arrow-rs/issues/6377

This should help in the interim to identify specific causes of, sometimes confusing, network
related errors. I think this change should be temporary until we can find a better way to
improve error messages.
erratic-pattern referenced this issue in erratic-pattern/arrow-rs Oct 6, 2024
see https://github.com/apache/arrow-rs/issues/6377

This should help in the interim to identify specific causes of, sometimes confusing, network
related errors. I think this change should be temporary until we can find a better way to
improve error messages.
@crepererum
Copy link
Contributor Author

Regarding the argument that @tustvold brought up here: apache/arrow-rs#6518 (comment) and that basically boils down to "people should walk the error chain":

I think the issue here is that object_store is somewhat inconsistent: for some errors, Display prints the entire chain e1: e2: e3 but for reqwest errors it does not. That also make the proposal to "walk the error chain" somewhat impractical, because if you walk the chain and use Display to print the chain parts, you gonna repeat some errors. So our standpoint is to really tell people that the chain should be walked, then stuff like this

https://github.com/apache/arrow-rs/blob/4adbeee14c40c080f8fca3bd42c0c3856bbb151c/object_store/src/aws/client.rs#L69-L70

is wrong and

#[snafu(display("Error bla bla: {}", source))]

should be

#[snafu(display("Error bla bla"))]

@tustvold
Copy link
Contributor

I personally agree that walking the error chain should be the approach, it's the pattern used by the vast majority of other languages, but unfortunately support for this in the ecosystem is limited. If Rust had supported this from day one, perhaps we wouldn't be in this mess, but alas it did not. As it stands I'm honestly not even sure what the consensus is...

Moving to this particular issue, I think we need to surface enough information within the default display implementation to diagnose issues, however, this does not mean we should just expose everything, including what may be sensitive or internal details. My point w.r.t walking the error chain is that in the absence of a clearly articulated error scenario the only option is to either expose everything, which I don't agree with, or instruct people to walk the error chain to get the more detailed information they require.

Perhaps we could reframe this issue as a concrete, what would I like to be in the output that currently is not

@crepererum
Copy link
Contributor Author

Perhaps we could reframe this issue as a concrete, what would I like to be in the output that currently is not

Fine with me. I think #272 provides a good example. The reqwest error that is stringified only says error decoding response body which is somewhat misleading w/o looking at the error chain. It sounds like the body data is malformed or the data got scrambled, but if you walk the error chain you'll often see it's simply a timeout that occurred while fetching the body bytes. So I wonder if we should wrap reqwest::Error into another error type that tries to extract SOME useful information (it could itself walk the chain and find the underlying IO error for example).

@tustvold
Copy link
Contributor

tustvold commented Oct 11, 2024

This sounds reasonable to me, it certainly would help users to make more clear that error means that the request was interrupted mid-stream, even if there isn't really anything more that can be said about why.

@erratic-pattern
Copy link

I feel that cramming more error sources into the Display is a step in the wrong direction. Despite the half-baked edges of Rust error handling making this a tricky decision for library maintainers, following standard practices should lead to a better outcome for both application developers and library developers.

As @crepererum pointed out, right now for a application developer, there is a confusing incompatibility in error messaging between Arrow ecosystem crates and (most?) other third-party crates. Using Display seems to "just work" for getting detailed information for Arrow crates, until the error goes into a third-party crate and then it doesn't.

"So just use an error reporting library, "as the reqwest maintainers suggest. Well, no. Even if you follow "best practice" you're still hindered as an application developer because using any of the available error reporting crates with arrow-rs will produce very redundant and ugly error reports.

To highlight this visually I've made an example repo that creates a simple "application" using object_store with a popular error reporting crate called color_eyre. It attempts to connect to an S3 bucket that doesn't exist and then displays an error.

Here is the output:

Error:
   0: Generic S3 error: Error after 10 retries in 5.723990834s, max_retries:10, retry_timeout:180s, source:error sending request for url (http://169.254.169.254/latest/api/token)
   1: Error after 10 retries in 5.723990834s, max_retries:10, retry_timeout:180s, source:error sending request for url (http://169.254.169.254/latest/api/token)
   2: error sending request for url (http://169.254.169.254/latest/api/token)
   3: client error (Connect)
   4: tcp connect error: Host is down (os error 64)
   5: Host is down (os error 64)

Location:
   examples/object_store.rs:11

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Maybe this is tolerable, but it's definitely ugly. You are at least getting as much information about the error chain as possible. Additionally, by cramming more information into the top-level error message, you're helping users in the default case to discover useful errors beyond just "Generic S3 error". So there is an argument to be made that the current state is acceptable, even if not ideal..

I think this discussion is beyond the scope of just "improve reqwests error display" and we should create an issue to discuss a more broad improvement to error messaging with both the default rust panic handler as well as the color_eyre (or similar crate) panic handler. If that sounds reasonable, I can write one up with points made from this discussion.

I would also like to create a similar example repo for DataFusion to bring up the discussion there, since they also use the same approach to error messaging, but I think the discussion needs to happen here primarily since DataFusion is largely required to follow whatever convention arrow-rs takes.

@erratic-pattern
Copy link

erratic-pattern commented Oct 14, 2024

From our (InfluxData's) side, we are most likely going to unroll the cause chain for object_store errors manually.

@kylebarron
Copy link
Contributor

From our (InfluxData's) side, we are most likely going to unroll the cause chain for object_store errors manually.

Would you be able to link to how you do this?

@crepererum
Copy link
Contributor Author

Roughly speaking, you just call Error::source in a loop and add some string handling:

#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
#[repr(transparent)]
pub struct DisplaySourceChain<T>(T);

impl<T: Error + 'static> Display for DisplaySourceChain<T> {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        // walk the source chain and collect error messages
        let mut err_msgs = Vec::new();
        let mut current_err = Some(&self.0 as &(dyn Error + 'static));
        while let Some(err) = current_err {
            let err_msg = err.to_string();
            err_msgs.push(err_msg);
            current_err = err.source();
        }
        // produce output message parts from source error messages
        // message parts are delimited by the substring ": "
        let mut out_parts = Vec::with_capacity(err_msgs.capacity());
        for err_msg in &err_msgs {
            // not very clean but std lib doesn't easily support splitting on two substrings
            for err_part in err_msg.split(": ").flat_map(|s| s.split("\ncaused by\n")) {
                if !err_part.is_empty() && !out_parts.contains(&err_part) {
                    out_parts.push(err_part);
                }
            }
        }
        write!(f, "{}", out_parts.join(": "))?;
        Ok(())
    }
}

Probably not the most elegant code, but it works for us.

You can use a similar trick + downcasting to convert errors into HTTP status codes, in case you're running the code in an REST or gRPC API.

@kylebarron
Copy link
Contributor

kylebarron commented Mar 6, 2025

Thanks! I wasn't aware of Error::source.

I think in obstore for simplicity I'll just print both the Display and Debug for errors

format!("{err}\n\nDebug source:\n{err:#?}")

@ion-elgreco
Copy link
Contributor

ion-elgreco commented Mar 15, 2025

I ended up with a slightly modified version of @crepererum to get this:

Image

#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
pub struct DisplaySourceChain<T> {
    err: T,
    error_name: String,
}

impl<T: Error + 'static> Display for DisplaySourceChain<T> {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        // walk the source chain and collect error messages
        let mut err_msgs = Vec::new();
        let mut current_err = Some(&self.err as &(dyn Error + 'static));
        while let Some(err) = current_err {
            let err_msg = err.to_string();
            err_msgs.push(err_msg);
            current_err = err.source();
        }
        // produce output message parts from source error messages
        // message parts are delimited by the substring ": "
        let mut out_parts = Vec::with_capacity(err_msgs.capacity());
        for err_msg in &err_msgs {
            // not very clean but std lib doesn't easily support splitting on two substrings
            for err_part in err_msg.split(": ").flat_map(|s| s.split("\ncaused by\n")) {
                if !err_part.is_empty()
                    && !out_parts.contains(&err_part)
                    && !out_parts.iter().map(|p| p.contains(&err_part)).any(|v| v)
                {
                    out_parts.push(err_part);
                }
            }
        }
        for (i, part) in out_parts.iter().enumerate() {
            if i == 0 {
                write!(f, "{}\n", part)?;
            } else {
                write!(
                    f,
                    "{}\x1b[31m{}\x1b[0m {}\n",
                    " ".repeat(self.error_name.len() + ": ".len() + i),
                    "↳",
                    part
                )?;
            }
        }
        Ok(())
    }
}

@alamb
Copy link
Contributor

alamb commented Mar 20, 2025

Migrating from arrow-rs issue #6377

@alamb alamb transferred this issue from apache/arrow-rs Mar 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants