`read_from_stream_buffered` produces different results than other readers #57

tjallingt · 2023-09-06T09:26:58Z

See the following example:

use speedy::Readable;
use std::io;

#[derive(Debug, Readable)]
struct Test {
    #[speedy(default_on_eof)]
    list: Vec<u8>,
    #[speedy(default_on_eof)]
    value_kind: ValueKind,
}

#[derive(Debug, Default, Readable)]
#[speedy(tag_type = u8)]
enum ValueKind {
    #[default]
    Default = 0,
    Other = 1,
}

fn main() {
    let test_buffer: Vec<u8> = vec![
        0, 32, 0, 0, // list length
        // list data (missing)
        2, // value_kind
    ];

    let test_stream1 = io::Cursor::new(&test_buffer);
    let test_stream2 = io::Cursor::new(&test_buffer);

    let result_buffer = Test::read_from_buffer(&test_buffer);
    dbg!(result_buffer); // Err(Error { kind: InvalidEnumVariant })

    let result_stream_unbuffered = Test::read_from_stream_unbuffered(test_stream1);
    dbg!(result_stream_unbuffered); // Err(Error { kind: InvalidEnumVariant })

    let result_stream_buffered = Test::read_from_stream_buffered(test_stream2);
    dbg!(result_stream_buffered); // Ok(Test { list: [], value_kind: Default })
}

Because read_from_file will switch between read_from_stream_buffered and read_from_buffer_copying depending on the target (Linux x86 or not) this can also cause differences in parsing results between platforms. I happened to run into this when my tests would error on CI but not on my Windows dev machine.

Couple of notes:

All readers return the same error if the list length is smaller, for example vec![0, 31, 0, 0, 2]
The list length is consumed even though default_on_eof was used. I would have expected the reader to reset to the position before it errored.

The text was updated successfully, but these errors were encountered:

koute · 2023-09-06T09:30:34Z

So this discrepancy only happens when default_on_eof is used?

tjallingt · 2023-09-06T09:34:54Z

EDIT: used wrong data, updated to show what i used.

Just tested, they all produce different errors if i remove default_on_eof from both fields

[src/main.rs:29] &test_buffer = [
    0,
    31,
    0,
    0,
]
[src/main.rs:32] result_buffer = Err(
    Error {
        kind: InputBufferIsTooSmall {
            actual_size: 4,
            expected_size: 5,
        },
    },
)
[src/main.rs:35] result_stream_unbuffered = Err(
    Error {
        kind: IoError(
            Error {
                kind: UnexpectedEof,
                message: "failed to fill whole buffer",
            },
        ),
    },
)
[src/main.rs:38] result_stream_buffered = Err(
    Error {
        kind: UnexpectedEndOfInput,
    },
)

This is also a bit strange though, both the streaming readers error with assert_eq!(error.is_eof(), true), however read_from_buffer would not...

Changing the data to make the "array length" bigger again changes the error for the "stream buffered" reader.

[src/main.rs:29] &test_buffer = [
    0,
    32,
    0,
    0,
]
[src/main.rs:32] result_buffer = Err(
    Error {
        kind: InputBufferIsTooSmall {
            actual_size: 4,
            expected_size: 5,
        },
    },
)
[src/main.rs:35] result_stream_unbuffered = Err(
    Error {
        kind: IoError(
            Error {
                kind: UnexpectedEof,
                message: "failed to fill whole buffer",
            },
        ),
    },
)
[src/main.rs:38] result_stream_buffered = Err(
    Error {
        kind: IoError(
            Error {
                kind: UnexpectedEof,
                message: "failed to fill whole buffer",
            },
        ),
    },
)

koute · 2023-09-06T09:39:11Z

Right, so this needs to be mostly unified, and at the very least they should definitely be consistent whether they return Ok or Err. Thanks for the report.

tjallingt · 2023-09-06T09:41:36Z

Wait sorry my test data was wrong (i was changing it to see if i could understand more).
This is the actual output for the data in the issue:

[src/main.rs:28] &test_buffer = [
    0,
    32,
    0,
    0,
    2,
]
[src/main.rs:31] result_buffer = Err(
    Error {
        kind: UnexpectedEndOfInput,
    },
)
[src/main.rs:34] result_stream_unbuffered = Err(
    Error {
        kind: IoError(
            Error {
                kind: UnexpectedEof,
                message: "failed to fill whole buffer",
            },
        ),
    },
)
[src/main.rs:37] result_stream_buffered = Err(
    Error {
        kind: IoError(
            Error {
                kind: UnexpectedEof,
                message: "failed to fill whole buffer",
            },
        ),
    },
)

tjallingt · 2023-09-06T09:55:11Z

Right, so this needs to be mostly unified, and at the very least they should definitely be consistent whether they return Ok or Err. Thanks for the report.

Yea I agree, if some data causes an error when deserializing it doesn't matter what the error is exactly but at least the parsed output should always be consistent. Maybe you have an idea of that the issue could be? I'll investigate a bit further :)

On another note; I wonder what you think of default_on_eof consuming bytes when the data will be discarded.
I do imagine it could be tricky to solve (resetting the reader to the location where it "entered" the default_on_eof may not be possible?). Also a change in behaviour like that could be breaking probably.

Luckily i think its possible to work around it by implementing Readable myself for the parent struct. I can then just grab the remaining bytes as a vec and try to parse it with the different sub-types. So maybe all this needs is a note in the documentation?

tjallingt linked a pull request Feb 14, 2024 that will close this issue

fix: return consistent results from readers #66

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`read_from_stream_buffered` produces different results than other readers #57

`read_from_stream_buffered` produces different results than other readers #57

tjallingt commented Sep 6, 2023 •

edited

Loading

koute commented Sep 6, 2023

tjallingt commented Sep 6, 2023 •

edited

Loading

koute commented Sep 6, 2023

tjallingt commented Sep 6, 2023

tjallingt commented Sep 6, 2023 •

edited

Loading

read_from_stream_buffered produces different results than other readers #57

read_from_stream_buffered produces different results than other readers #57

Comments

tjallingt commented Sep 6, 2023 • edited Loading

koute commented Sep 6, 2023

tjallingt commented Sep 6, 2023 • edited Loading

koute commented Sep 6, 2023

tjallingt commented Sep 6, 2023

tjallingt commented Sep 6, 2023 • edited Loading

`read_from_stream_buffered` produces different results than other readers #57

`read_from_stream_buffered` produces different results than other readers #57

tjallingt commented Sep 6, 2023 •

edited

Loading

tjallingt commented Sep 6, 2023 •

edited

Loading

tjallingt commented Sep 6, 2023 •

edited

Loading