Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Errors more "narrow" #811

Merged
merged 10 commits into from
Oct 12, 2024
Merged

Make Errors more "narrow" #811

merged 10 commits into from
Oct 12, 2024

Conversation

RedPhoenixQ
Copy link
Contributor

This was discussed in #810 here.

This splits up error types so that there is almost one type for each module, which narrows the amount of error variants that is returned from each function.

For the functions where two or more error types may be returned the new combined_error! macro is used to create a new error enum that holds these variants with most error impl's automatically done.

The old errors::Error is still present and public. All other new errors implement From<_> for errors::Error so that the provided errors::Result type can still be used to try (?) any function from this crate.

I haven't spent much time adding/editing docs for these changes since most of them still apply without changes. Maybe the docs for errors::Error should be changed to make it clear that it will not be given out as an error from anywhere directly. There are also some names of the new error types that might need changing

@RedPhoenixQ RedPhoenixQ force-pushed the error-narrowing branch 2 times, most recently from a94ee2a to 004442c Compare September 29, 2024 19:31
@codecov-commenter
Copy link

codecov-commenter commented Sep 29, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 48.20513% with 101 lines in your changes missing coverage. Please review.

Project coverage is 60.14%. Comparing base (39b5905) to head (00a0962).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
src/errors.rs 4.25% 45 Missing ⚠️
src/name.rs 69.13% 25 Missing ⚠️
src/encoding.rs 47.05% 18 Missing ⚠️
src/events/mod.rs 57.14% 6 Missing ⚠️
examples/read_nodes.rs 0.00% 4 Missing ⚠️
src/escape.rs 0.00% 3 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #811      +/-   ##
==========================================
+ Coverage   60.08%   60.14%   +0.05%     
==========================================
  Files          41       41              
  Lines       15975    15985      +10     
==========================================
+ Hits         9599     9614      +15     
+ Misses       6376     6371       -5     
Flag Coverage Δ
unittests 60.14% <48.20%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@RedPhoenixQ RedPhoenixQ force-pushed the error-narrowing branch 2 times, most recently from 8fcf6fd to a0ab37f Compare September 29, 2024 20:13
@RedPhoenixQ
Copy link
Contributor Author

Sorry for the force pushes, it keeps running my usage example as a doc test and the annotations (like '''rust) mean different things in different rust version apparently

Copy link
Collaborator

@Mingun Mingun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some doc links became broken, that need to be fixed

> cargo doc --all-features
warning: unresolved link to `encoding`
  --> src\encoding.rs:22:18
   |
22 | /// If feature [`encoding`] is disabled, the EncodingError is always `DecodeError::Utf8`:
   |                  ^^^^^^^^ no item named `encoding` in scope
   |
   = help: to escape `[` and `]` characters, add '\' before them like `\[` or `\]`
   = note: `#[warn(rustdoc::broken_intra_doc_links)]` on by default

warning: unresolved link to `Error::IllFormed`
   --> src\reader\buffered_reader.rs:308:75
    |
308 |     /// If a corresponding [`End`] event is not found, an error of type [`Error::IllFormed`]
    |                                                                           ^^^^^^^^^^^^^^^^ no item named `Error` in scope

warning: unresolved link to `Error::IllFormed`
  --> src\reader\slice_reader.rs:88:75
   |
88 |     /// If a corresponding [`End`] event is not found, an error of type [`Error::IllFormed`]
   |                                                                           ^^^^^^^^^^^^^^^^ no item named `Error` in scope

warning: `quick-xml` (lib doc) generated 3 warnings

I'm fine with the first 3 commits, but I'm unsure about the 4th (other two commits just the follow-ups fixing the compilation errors). I'm not sure that such a fine breakdown of errors will be convenient for work. In most cases, you will not be interested in a specific type of error and it will still be thrown higher, but the presence of many types of errors will complicate such throwing (it's not for nothing that things like anyhow::Error appeared).

Do you have opposite experience?


You can fix things and make a force push, GitHub UI provides a way to compare between force pushes, so it is fine). I would prefer to have a history without fix-up commits, so it would be great if each commit:

  • in a compilable state on CI (which means it is compiled fine with all CI tested combinations of flags, in practice usually check of cargo test and cargo test --all-features is enough)
  • cargo doc --all-features does not report warnings
  • each commit updates changelog with the relevant changes (so it is easely to understand later in which commit that change was made)

src/errors.rs Outdated
impl From<Utf8Error> for Error {
/// Creates a new `Error::NonDecodable` from the given error
impl From<EncodingError> for Error {
/// Creates a new `Error::DecodeError` from the given error
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You named this variant EncodingError

Suggested change
/// Creates a new `Error::DecodeError` from the given error
/// Creates a new [`Error::EncodingError`] from the given error

Or do the opposite: name the variant DecodeError. Maybe this is preferred, because this error is possible only when reading

src/encoding.rs Outdated
fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
match self {
Self::Utf8(e) => Some(e),
#[allow(unreachable_patterns)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not

Suggested change
#[allow(unreachable_patterns)]
#[cfg(feature = "encoding")]

?

///
/// If feature [`encoding`] is disabled, the EncodingError is always `DecodeError::Utf8`:
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum EncodingError {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to add #[non_exhaustive] so the consumers forced to explicitly handle wildcard variant. Otherwise if some other crate in the dependency tree activates the encoding feature, the crates without wildcard handling and without encoding feature will fail to compile.

Suggested change
pub enum EncodingError {
#[non_exhaustive]
pub enum EncodingError {

src/encoding.rs Outdated
Comment on lines 51 to 53
Self::Utf8(e) => write!(f, "UTF-8 error: {}", e),
#[cfg(feature = "encoding")]
Self::Other(encoding) => write!(f, "Error occured when decoding {}", encoding.name()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make texts start with lower-case letter and unify them. I assume that error is used only when decoding, otherwise it is needed to tweak messages

Suggested change
Self::Utf8(e) => write!(f, "UTF-8 error: {}", e),
#[cfg(feature = "encoding")]
Self::Other(encoding) => write!(f, "Error occured when decoding {}", encoding.name()),
Self::Utf8(e) => write!(f, "cannot decode input using UTF-8: {}", e),
#[cfg(feature = "encoding")]
Self::Other(encoding) => write!(f, "cannot decode input using {}", encoding.name()),

src/encoding.rs Outdated
@@ -1,14 +1,10 @@
//! A module for wrappers that encode / decode data.

use std::borrow::Cow;
use std::{borrow::Cow, str::Utf8Error};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to not have nested imports:

Suggested change
use std::{borrow::Cow, str::Utf8Error};
use std::borrow::Cow;
use std::str::Utf8Error;

src/name.rs Outdated
Comment on lines 17 to 27
/// Error for when a reserved namespace is set incorrectly.
///
/// This error returned in following cases:
/// - the XML document attempts to bind `xml` prefix to something other than
/// `http://www.w3.org/XML/1998/namespace`
/// - the XML document attempts to bind `xmlns` prefix
/// - the XML document attempts to bind some prefix (except `xml`) to
/// `http://www.w3.org/XML/1998/namespace`
/// - the XML document attempts to bind some prefix to
/// `http://www.w3.org/2000/xmlns/`
InvalidPrefixBind {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we split error into small parts, maybe make a dedicated variant for each listed variant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we split error into small parts, maybe make a dedicated variant for each listed variant?

I will add a commit for this soon. Need to understand the code and the linked standard to see which error applies were which may take a bit longer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit 8002cc6

src/name.rs Outdated
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Self::UnknownPrefix(prefix) => {
f.write_str("Unknown namespace prefix '")?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already mentioned, if we bring order in errors, make all texts with a small letter:

Suggested change
f.write_str("Unknown namespace prefix '")?;
f.write_str("unknown namespace prefix '")?;

src/name.rs Outdated
f.write_str("'")
}
Self::InvalidPrefixBind { prefix, namespace } => {
f.write_str("The namespace prefix '")?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Suggested change
f.write_str("The namespace prefix '")?;
f.write_str("the namespace prefix '")?;

src/errors.rs Outdated
$($variant:ident($error:path $(, $inner_type:path)?) $fmt_str:literal),+ $(,)?
) => {
#[derive(Debug)]
#[allow(missing_docs)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to update macro and write a documentation for each generated variant under which circumstances it will be returned. That is not always obvious from the name and description of the inner error.

Suggested change
#[allow(missing_docs)]

Also, I would prefer to have more traditional syntax for defining enums, in order to in the end the written code will look like:

combined_error! {
  /// Doc
  // derives
  pub enum SpecificError {
    /// Variant 1 doc
    Variant1(Variant1Error) => "display 1 text",
    /// Variant 2 doc
    Variant2(Variant2Error) => "display 2 text",
  }
}

Please also derive Debug for each error type.

@RedPhoenixQ
Copy link
Contributor Author

Some doc links became broken, that need to be fixed

Compiling the docs was an oversight. Will fix.

I'm fine with the first 3 commits, but I'm unsure about the 4th (other two commits just the follow-ups fixing the compilation errors). I'm not sure that such a fine breakdown of errors will be convenient for work. In most cases, you will not be interested in a specific type of error and it will still be thrown higher, but the presence of many types of errors will complicate such throwing (it's not for nothing that things like anyhow::Error appeared).

Do you have opposite experience?

I'm mostly interested in not having irrelevant options when calling lower level apis. I will agree that the attributes methods are inconvenient when returning ReadError (like the read_node example). I still think that having these methods return AttrError is better for when only using those apis.

I'm torn on whether this is better, especially when it's common for ReadError and AttrError appear in the same function. I still prefer returning AttrError but I agree it may not be worth it.

If this split beyond EncodingError and NamespaceError is not desired, the combined_error! macro could also be removed entirely.

You can fix things and make a force push, GitHub UI provides a way to compare between force pushes, so it is fine). I would prefer to have a history without fix-up commits, so it would be great if each commit:

  • in a compilable state on CI (which means it is compiled fine with all CI tested combinations of flags, in practice usually check of cargo test and cargo test --all-features is enough)

  • cargo doc --all-features does not report warnings

  • each commit updates changelog with the relevant changes (so it is easely to understand later in which commit that change was made)

Absolutely. The docs was an oversight and I know that one commit doesn't pass the tests. I will go back and rework all of these commits.

@RedPhoenixQ
Copy link
Contributor Author

When the errors are being changed anyway, there's an opportunity to be consistent about whether Error enum variants should be named SomeError::Io or SomeError::IoError. Any preference here?

@Mingun
Copy link
Collaborator

Mingun commented Oct 1, 2024

I still think that having these methods return AttrError is better for when only using those apis.

When we declare in API a more wide error that is really could happen, I'm fine to narrow the result type, but introducing new fine-granulated error types for that, I think, would be overkill.

The problem is that the API may not be well-established, and with the introduction of validation checks, it may turn out that some functions will return more errors. Some Linux package maintainers already complained, that quick-xml API changes too quickly :).

If this split beyond EncodingError and NamespaceError is not desired, the combined_error! macro could also be removed entirely.

Yes. So for now please left only the changes from the first 3 commits. Maybe in time the 4th commit also would be welcomed, who knows :)?

Any preference here?

SomeError::Io

@RedPhoenixQ
Copy link
Contributor Author

I belive I have fixed all issues from the previous review. The changelog is now incrementally updated every commit and all commits pass cargo test and cargo doc (with --all-features).

I have still included the AttrError change as I didn't understand if it should be discarded or not. Will remove it if you want.

Copy link
Collaborator

@Mingun Mingun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I polished the PR slightly:

  • Normalized all error messages to start from lowercase
  • Made error-related code more consistent: removed unnecessary usages of write! macro, used Self instead of type name where possible (that two changes was related to the errors so it was logical to made them in that PR)
  • Implemented PartialEq and Eq for NamespaceError to simplify testing of the methods returning results with that error
  • That allowed to change tests from manual matching on result to use assert_eq! which will provide nice diff if failed
  • You forgot to return Some from Error::source for Error::Namespace -- fixed
  • Used tuple form instead of struct form for NamespaceError errors. Usually struct form with one field is not used
  • Removed BangType from changelog. It is internal type not visible to users

@Mingun Mingun merged commit 6eea6bb into tafia:master Oct 12, 2024
7 checks passed
@Mingun
Copy link
Collaborator

Mingun commented Oct 12, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants