Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse error falls back to Invalid #1312

Merged
merged 43 commits into from
Jul 18, 2023
Merged

Parse error falls back to Invalid #1312

merged 43 commits into from
Jul 18, 2023

Conversation

adrian-kong
Copy link
Contributor

@adrian-kong adrian-kong commented Mar 29, 2023

Description

This PR makes it so that if at any point in the parsing of sbp messages the parsing fails, it has the option of falling back to an InvalidMessage which can then be written to the output. The way this PR acheives this goal is through some major breaking changes. The idea is that when errors are thrown throughout the process, we store all that local information in the constructed error. That way, choosing how to write the output can become as simple as matching on the thrown error and then basically writing SbpMessage::from(error). This enables us to keep the previous behavior of skipping errors or returning on errors simply by refusing to handle the errors.

sbp2json, json2sbp & json2json now have a new optional arg --error_handler which allows the following options skip, return or ToInvalid. that skip and return are the equivalent of the false and true passed to fatal-errors arg which as been removed.

Care was taken to remove some panics in the parsing of SBP which could cause some types of invalid messages to be lost.

Finally this PR does not do things like try to recover from being passed in malformed json or given an I/O error. In case of those error, if the ToInvalid option is passed, the code will panic.

Some issues with this PR --

  1. it has a lot of breaking changes, and it kind of does a lot. Just want to make sure people like the direction i went in.
  2. the lack of recovering from malformed json means I left json2json to panic if given ToInvalid as an option.
  3. It has no unit tests. I wanted some eyes on this code and proposed solution before totally getting it over the finish line. (edit This PR added unit tests when going from an invalid string of bytes -> a message that was parsed as an Sbp::Invalid add unit tests for recovering from parsing errors of sbp #1347)

JIRA Reference

https://swift-nav.atlassian.net/browse/DEVINFRA-1220

)?)
let sender_id = self.sender_id();
let mut payload = self.payload();
let msg = Sbp::from_parts(self.msg_type(), sender_id, payload).or_else(|_| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would mask malformed messages though? Like if the message type is a known message but the payload is too short. Then you'd have an Unknown message with a message type that corresponds to a known message which is a little odd. But maybe that's what we want, I guess it's a trade off between being lenient and surfacing errors better

Copy link
Contributor Author

@adrian-kong adrian-kong Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea... think we're just treating Unknown as the Frame at this point to serialize as json :/

although that makes sense for the purpose of the Unknown message i guess.

maybe a better compromise would to just add a flag like strict to address dropping payload errors?

or maybe a 'from payload' like 'from fields' for json

issue with the raw decoder or persisting broken data is where do we actually trim the unnecessary data - which thought everything would propagate up to sbp converters and they would drop those - or maybe we should just never drop them 😱

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Decoder should only output valid SBP (as in CRC is good and the payload is correct), if a user wants access to potentially invalid messages they can go with the Framer I think. So for example sbp2json should probably be doing something like

for frame in Framer::new(rdr) {
    if let Ok(msg) = frame.to_sbp() { 
        // valid crc/payload, possibly an Unknown message
        json_encoder.write(&msg)?;
    } else {
        // invalid crc or payload, could also be an Unknown message
        // need some json representation of a frame
        json_encoder.write(&frame)?;
    }
} 

@@ -78,7 +79,13 @@ where
W: Write,
F: Formatter + Clone,
{
let source = maybe_fatal_errors(sbp::iter_messages(input), fatal_errors);
let source = maybe_fatal_errors(sbp::iter_frames(input), fatal_errors).map(|frame| {
frame.to_sbp().unwrap_or(Sbp::Unknown(Unknown {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably implement Serialize/Deserialize for Frame instead of using Unknown for the case when the frame is invalid so we can distinguish between an Unknown message type and in invalid message. It should look the same (as in serialize/deserialize to a struct with msg_type, sender_id and payload fields) but then you can tell whether or not its an issue with an outdated version of libsbp or a malformed/corrupted message.

Also you'll need to handle the other direction for this is json2sbp

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although I guess we are using an untagged enum so you wouldn't be able to tell the difference between a Frame and an Unknown hmmm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmmm, is unknown actually used for anything tho
could just use unknown for the types of msg that we dont known about

this would make it so we dont need to do anything to json2sbp since the msgs they produce would just be kinda the payload except wrapped with unknown 🤔

unless we want exact raw corrupt packet without wrapping anything then might be a bit more complicated than this.

Copy link
Contributor Author

@adrian-kong adrian-kong Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@notoriaga i.e. would it be ok to simply wrap around as an unknown message (since these msgs are indeed 'unknown') or do we want some special form of deserialization in JSON for corrupt messages in addition to the sbp?
i.e.

{
    "msg_type": "...",
    "payload": "...",
    "sender_id": "...."
},
{
    "corrupt_data": "..."
}

or simply just handling corrupt_data inside Unknown packets like

{
    "msg_type": "0",
    "sender_id": "0",
    "payload": "the corrupt msg"
}

which would then address going from json2sbp - but would serialize as an unknown msg from there onwards which means

  1. it would be handled as a legit unknown message (not sure when unknowns are used)
  2. it would be serialized as a real message, (no modification needed on the sbp2json side - downside is it would be serialized as unknown)

^ tradeoffs without introducing an additional way to serialize corrupt data as JSON format but it would make some sense since these messages are labelled as unknown

@jbangelo @silverjam

Copy link
Contributor

@notoriaga notoriaga Apr 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also add a msg_name field to the json representation of messages like with rtcm2json so then we'd have

{
    "msg_type": 123,
    "msg_name": UNKNOWN,
    "sender_id": 4,
    "payload": [...]
}
{
    "msg_type": 321,
    "msg_name": FRAME,
    "sender_id": 4,
    "payload": [...]
}

could also be nice to have for jq stuff, the message names are bit easier to remember than the ids

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think the message names are alread yhtere for sbp2json, making this field necessary would mean json2sbp only uses this field to check specific case where msg_name == "FRAME" 🤔 not sure if theres any better way or...
^^ this also seems much more complicated / invovled tho #1317 WIP
hmmmm....

rust/sbp2json/src/lib.rs Outdated Show resolved Hide resolved
@silverjam
Copy link
Contributor

Changes I pushed make the JSON output "look" better for random garbage data:

{"crc":44120,"length":132,"msg_type":17681,"msg_name":"UNKNOWN","payload":"QhFF87pNdUgVvuMBbQj/naRyVCUVbhxuPfguaBx5/tgK2rImLA7g74QTHUg+mRdgUCaUL5PnZUtAO+Gumc0xt1OQwVQpsBhSZcbOPMJlImu4a7j33UA9DhL/X1bbS7nKYwjZISrhWcJM+WMHakvH4bOELQ5igBFSAWPWF6mIi2Vdcsot","preamble":85,"sender":47859}
{"crc":51746,"length":116,"msg_type":32385,"msg_name":"UNKNOWN","payload":"JIF+/Uq4ENNpq+uECTleTOXMbPgZ2w0SWeUh4fs6XzumFAgA2kI3vCduAcdeOe35PfI1ejoSkKJoZ9r/lKujjxbUH1yIX73GO0/DGRw/WpHMwUi0y1y/CdbTcxx50ZAZqrh3vOfeUVCb/g5bpwTxEX1x3bA=","preamble":85,"sender":19197}
{"crc":56169,"length":125,"msg_type":35244,"msg_name":"UNKNOWN","payload":"PKyJ6OgEII2cFCgtjjAz+fNOCt3ikYjqkXzgj8sMqslc2pszXn5j4+ZK66zfkf3JKDIE/b80v+iVCZVSL+V0FBBHTAa8z9v7sG1EzFpQvg8ISDdkvLYtcjgqzCZCACHbmRWeHz0yV3hUkJjUH25neHYWDGTWtK91CGrE13A=","preamble":85,"sender":59624}

But tests are having issues and the inverse json2sbp doesn't work on this new output:

data did not match any variant of untagged enum JsonInput
data did not match any variant of untagged enum JsonInput
error parsing payload
error parsing payload
error parsing payload
error parsing payload
error parsing payload
data did not match any variant of untagged enum JsonInput
error parsing payload

@silverjam silverjam marked this pull request as draft June 1, 2023 16:09
@silverjam
Copy link
Contributor

Converted to draft to remove from review list until we can make more progress on this.

@pcrumley
Copy link
Contributor

pcrumley commented Jun 1, 2023

@silverjam I'm picking this up now, but i don't know what you mean by inverse json2sbp. can you explain that workflow to me? I'd like to capture it as a unit test if possible

But tests are having issues and the inverse json2sbp doesn't work on this new output:

@silverjam
Copy link
Contributor

@silverjam I'm picking this up now, but i don't know what you mean by inverse json2sbp. can you explain that workflow to me? I'd like to capture it as a unit test if possible

But tests are having issues and the inverse json2sbp doesn't work on this new output:

I think I just meant that json2sbp isn't handling the output from sbp2json anymore. The main idea with this PR is we wanted sbp2json | json2sbp to be "idempotent" even in the face of bad/corrupted data. I think we've figured out a reasonable way to represent corrupted data in the JSON output, but now need to do the work to get json2sbp to understand it.

@silverjam
Copy link
Contributor

Looks like we might need to bump the compiler version and MSRV that we're using or remove these let-else blocks:

error[E0658]: `let...else` statements are unstable
  --> sbp/src/ser.rs:92:5
   |
92 | /     let Some(msg_type) = msg.message_type() else {
93 | |         // this should be unreachable because things
94 | |         // without message ids should be caught by invalid
95 | |         // messages
96 | |         return Err(WriteFrameError::NoMessageType);
97 | |     };
   | |______^
   |
   = note: see issue #87335 <https://github.com/rust-lang/rust/issues/87335> for more information

error[E0658]: `let...else` statements are unstable
  --> sbp/src/ser.rs:86:5
   |
86 | /     let Some(sender_id) = msg.sender_id() else {
87 | |         // this should be unreachable because things
88 | |         // without sender ids should be written as invalid
89 | |         // messages
90 | |         return Err(WriteFrameError::NoSenderId);
91 | |     };
   | |______^
   |
   = note: see issue #87335 <https://github.com/rust-lang/rust/issues/87335> for more information

@pcrumley
Copy link
Contributor

I am very close to getting this within spec, it just seems like a CRC is added somewhere, e.g this is what
./target/release/sbp2json --error-handler ToInvalid ~/Downloads/corrupted.sbp > out
write invalid messages as:

{"msg_name":"INVALID","payload":"VQoCw5UibB1EFL/0ELSA4EJAzKTjmU2eXsCYqanZWkCiAAwBDwl23lUO","crc":3669}

I'm not sure where that extra crc is coming from. I also removed a few other places where panics could happen by replacing read_u16_le() with something that tries to cast the slice to a 2 byte array and then calls u16::from_bytes_le.

@silverjam
Copy link
Contributor

silverjam commented Jul 12, 2023

Seeing some unexpected results when just using garbage data, these might be "expected" given the current parsing behavior, but not what I initially expected would happen (given the intent of this PR).

If I test with the short.sbp I see INVALID messages being surfaced, which is good:

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <test_data/short.sbp 2>/dev/null | grep INVALID | wc -l
9

However, if I start by creating a bit of garbage data:

❯ dd if=/dev/urandom count=1 bs=64 of=corrupt.bin
1+0 records in
1+0 records out
64 bytes copied, 7.3007e-05 s, 877 kB/s

Contents of the corrupt blob:

❯ od -tx1 corrupt.bin
0000000 0a 28 61 b2 99 ba d6 1d 42 15 f7 9f 36 f8 ca 72
0000020 97 41 af 29 b9 0b f1 0e d3 1d ef ab d4 70 ac 1e
0000040 02 71 a0 59 c1 78 95 5d f9 e5 f9 00 6a 38 06 69
0000060 06 8d f1 80 a3 e2 a1 3c 6b ab 42 e8 18 0a f0 70
0000100

Then attempt to decode that I don't get an invalid message:

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <corrupt.bin | grep INVALID | wc -l
0

Putting this data at the start and end yields similar results:

❯ cat corrupt.bin test_data/roundtrip.sbp >corrupt_start.bin

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <corrupt_start.bin | grep INVALID | wc -l
0

❯ cat test_data/roundtrip.sbp corrupt.bin >corrupt_end.bin

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <corrupt_end.bin | grep INVALID | wc -l
0

Similar results for the middle:

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <test_data/roundtrip.sbp | head -2500 >head.json

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <test_data/roundtrip.sbp | tail -n +2501 >tail.json

❯ wc -l head.json -l tail.json
   2500 head.json
   2500 tail.json
   5000 total

❯ cargo run --bin json2sbp --quiet <tail.json >tail.bin

❯ cargo run --bin json2sbp --quiet <head.json >head.bin

❯ cat head.bin tail.bin | cargo run --quiet --bin sbp2json | wc -l
5000

❯ cat head.bin corrupt.bin tail.bin >corrupt_middle.bin

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <corrupt_middle.bin | grep INVALID | wc -l
0

If I use a corrupt blob that happens to start with an SBP preamble byte I get better results. But only in certain cases.

To create the bad blob, you can run xxd in reverse:

xxd corrupt.bin >corrupt.txt
nvim corrupt.txt # change start byte to 55
xxd -r <corrupt.txt >corrupt_sbp_start.bin

This should yield:

❯ od -tx1 corrupt_sbp_start.bin
0000000 55 28 61 b2 99 ba d6 1d 42 15 f7 9f 36 f8 ca 72
0000020 97 41 af 29 b9 0b f1 0e d3 1d ef ab d4 70 ac 1e
0000040 02 71 a0 59 c1 78 95 5d f9 e5 f9 00 6a 38 06 69
0000060 06 8d f1 80 a3 e2 a1 3c 6b ab 42 e8 18 0a f0 70
0000100

As the only chunk of data we don't get anything back:

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <corrupt_sbp_start.bin | wc -l
0

But in the start and middle test cases we do get invalid messages back:

❯ cat corrupt_sbp_start.bin test_data/roundtrip.sbp >corrupt_start.bin

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <corrupt_start.bin 2>/dev/null | grep INVALID | wc -l
1

❯ cat head.bin corrupt_sbp_start.bin tail.bin >corrupt_middle.bin

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <corrupt_middle.bin 2>/dev/null | grep INVALID | wc -l
1

We don't get anything if the bad data is at the end though:

❯ cat test_data/roundtrip.sbp corrupt_sbp_start.bin >corrupt_end.bin

❯ cargo run --bin sbp2json --quiet -- --error-handler to-invalid <corrupt_end.bin | grep INVALID | wc -l
0

Ultimately I think we want these test case to pass, but we might be getting into the weeds of how the SBP parser currently works, so we'll probably want to fix these issues with a new PR.


The above blobs are attached here: corrupt_blobs.zip

@pcrumley
Copy link
Contributor

There is something else, where data aren't round tripping from sbp -> json -> sbp properly, even when it seems to create invalid messages.

This PR is at the point now where it needs unit tests on this behavior, but it is so large I am going to do a stacked PR to it, because it is getting too unwieldy. I am hopeful that with unit tests these bugs will be easy to fix.

@silverjam
Copy link
Contributor

As long as we can still round trip valid data I'm not sure I want to invest more time on this right now, once we get the new tests in let's think about moving on. We've made a lot of good progress but this is starting to be a bigger time sink than I expected.

This PR adds a unit test showing that parsing errors fallsback to an
invalid message.

Seeing this unit test makes it clear why some of the questions
@silverjam was having about invoking this on certain inputs causes it to
pass and others cause it to fail. Additionally the round-tripping on
arbitrary bytes is not possible yet because both the iter_frame &
iter_messages will only return a message if there is something that
starts with the prelude byte, and it returns the bytes that exist
between that prelude and HEADER_LEN, whatever the length of the payload
is on byte 5 + the CRC_LEN. This means that if there are extra bytes
sent in the stream for aliasing for instance, the framer ignores those
bytes, and they are never even tried to be parsed.

@silverjam gave two examples that do not throw an error as the unknown
fallback is currently implemented. One of those has been added as a unit
test here. But the main takeaway is the invalid-fallback as currently
written only works on messages that are properly framed but perhaps
wrongly labeled or with a CRC error.
@@ -19,7 +20,7 @@ const BASE64_BUFLEN: usize = BUFLEN * 4;
pub fn to_writer<W, M>(mut writer: W, msg: &M) -> Result<(), JsonError>
where
W: io::Write,
M: SbpMessage + Serialize,
M: SbpMessage + Serialize + WireFormat + Clone,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think WireFormat was introduced in all of these methods because it was removed form SbpMessage? Are we sure that all of these methods really need WireFormat?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to crate::ser::to_buffer on line 239 requires WireFormat, which then has to be propagated up to pretty much all of the functions in json::ser. I took out WireFormat from the SbpMessage trait in an abandoned refactoring, so if you prefer i can re-add it.

Finally regarding benchmarks, it seems like the rust version is too fast and I'll need to bump some of the expected values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took out WireFormat from the SbpMessage trait in an abandoned refactoring, so if you prefer i can re-add it.

Let's re-add it if it's no longer needed

@pcrumley pcrumley marked this pull request as ready for review July 13, 2023 17:44
@pcrumley pcrumley requested a review from a team as a code owner July 13, 2023 17:44
@silverjam
Copy link
Contributor

@pcrumley Can try cherry-picking ef262b4 to see if it fixes benchmarks here too.

@pcrumley
Copy link
Contributor

@silverjam it passes now! 🎉

@silverjam
Copy link
Contributor

silverjam commented Jul 14, 2023

@adrian-kong FYI, after this PR is merged, the next release cut of libsbp should be 5.0.0.

@pcrumley
Copy link
Contributor

I'm just waiting to merge to let other small changes in if we want to cut two releases. Otherwise I think it is g2g

@silverjam
Copy link
Contributor

silverjam commented Jul 14, 2023

I'm just waiting to merge to let other small changes in if we want to cut two releases. Otherwise I think it is g2g

Two releases is a good idea, let's wait for Adrian to cut 4.17.0, then we can merge this and cut 5.0.0.

cc @pcrumley @adrian-kong

@silverjam
Copy link
Contributor

@pcrumley 4.17.0 was cut I think we're good to merge this

@pcrumley pcrumley merged commit 5f3909c into master Jul 18, 2023
13 checks passed
@pcrumley pcrumley deleted the adrian/unknown-fallback branch July 18, 2023 20:29
@pcrumley
Copy link
Contributor

Thanks for letting me push the button

@silverjam
Copy link
Contributor

3. It has no unit tests. I wanted some eyes on this code and proposed solution before totally getting it over the finish line.

Just noticed this, this isn't true right? I think I saw some unit test updates.

@pcrumley
Copy link
Contributor

No, It has unit test from bytes to Sbp::Invalid -> #1347

In terms of unit test coverage, I think it is OK but there should be one that shows an invalid JSON message is parsed properly. I will do that today.

@pcrumley
Copy link
Contributor

pcrumley commented Jul 18, 2023

see this pr which adds more test coverage: #1352

pcrumley added a commit that referenced this pull request Jul 18, 2023
# Description

@swift-nav/devinfra

This adds an integration test (with some unit tests inside of it) which
test the round tripping of an invalid message from invalid sbp -> json
with msg_name "INVALID" -> invalid sbp. Covers a gap in test coverage
from an earlier PR: #1312

Please note, there are no non-test code changes here, so it passed as
is.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants