Merge branch 'topic/bbannier/issue-1667'
bbannier committed Feb 21, 2024
2 parents c39bdb5 + 10e99e9 commit a39dcea
Showing 7 changed files with 135 additions and 52 deletions.
28 changes: 28 additions & 0 deletions CHANGES
@@ -1,3 +1,31 @@
1.10.0-dev.147 | 2024-02-21 10:30:22 +0100

* GH-1667: Always advance input before attempting resynchronization. (Benjamin Bannier, Corelight)

When we enter resynchronization after hitting a parse error we
previously would have left the input alone, even though we know it fails
to parse. We then relied fully on resynchronization to advance the
input.

While this just pushed work downstream when synchronizing on literals,
it could cause us to lose input when synchronizing on regular expressions
if we happened to fail parsing due to a gap which is now at the front of
the input (parse errors from gaps are the most likely resynchronization
scenario when parsing genuine traffic); in this case the regular
expression would synchronize at the second byte after the input and we
would synchronize only at a later position.

With this patch we always forcibly advance the input to the next non-gap
position. This has no effect for synchronization on literals, but allows
it to happen earlier for regular expressions.

Closes #1667.

* Refactor test `spicy.types.unit.synchronize-on-gap`. (Benjamin Bannier, Corelight)

This refactoring cleans up how we feed gaps into the parser to make testing
with more inputs simpler.

1.10.0-dev.144 | 2024-02-14 15:55:35 +0100

* GH-1652: Fix filters consuming too much data. (Benjamin Bannier, Corelight)
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-1.10.0-dev.144
+1.10.0-dev.147
4 changes: 4 additions & 0 deletions spicy/toolchain/src/compiler/codegen/parser-builder.cc
@@ -1242,6 +1242,10 @@ struct ProductionVisitor
pushBuilder(builder()->addWhile(search_start, builder::bool_(true)), [&]() {
// Generate code which synchronizes the input. This will throw a parse error
// if we hit EOD which will implicitly break from the loop.

// The current input has failed, either since it does not match or since
// data was missing. Advance the input to go to the next data.
pb->advanceToNextData();
syncProduction(p);

pushBuilder(builder()->addIf(builder::equal(builder::id("search_start"), state().cur)), [&]() {
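
To illustrate what advancing the input "to the next non-gap position" means, here is a minimal, self-contained sketch. It is not the HILTI runtime implementation: the `Chunk` type, the `totalSize` helper, and this free-function form of `advanceToNextData` are hypothetical, and the sketch assumes that the helper always moves forward by at least one byte before skipping gaps.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one chunk of a stream; not a HILTI runtime type.
struct Chunk {
    std::size_t size; // number of bytes covered by this chunk
    bool is_gap;      // true if the bytes are missing (a gap)
};

// Returns the total number of bytes covered by all chunks.
std::size_t totalSize(const std::vector<Chunk>& chunks) {
    std::size_t n = 0;
    for ( const auto& c : chunks )
        n += c.size;
    return n;
}

// Assumed semantics: step forward by at least one byte, then skip over
// any gap so that the returned offset points at real data. If no data
// follows, the end-of-stream offset is returned.
std::size_t advanceToNextData(const std::vector<Chunk>& chunks, std::size_t pos) {
    assert(pos <= totalSize(chunks));

    std::size_t next = pos + 1; // always make progress
    std::size_t start = 0;      // offset at which the current chunk begins

    for ( const auto& c : chunks ) {
        const std::size_t end = start + c.size;

        if ( next < end ) {
            if ( ! c.is_gap )
                return next; // landed inside a data chunk

            next = end; // inside a gap: skip to its end
        }

        start = end;
    }

    return start; // no further data: end of stream
}
```

With the `gap_between_matches` layout used by the test in this commit (1 byte of data, a 1024-byte gap, 2 bytes of data), the sketch moves from offset 0 or 1 to offset 1025, which is the offset at which the baseline reports the input `BC`.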
@@ -1,33 +1,37 @@
### BTest baseline data generated by btest-diff. Do not edit. Use "btest -U/-u" to update. Requires BTest >= 0.63.
[spicy-verbose] - state: type=sync::Xs input="A<gap>..." stream=0xXXXXXXXX offsets=0/0/0/1027 chunks=3 frozen=no mode=default trim=yes lah=n/a lah_token="n/a" recovering=no
[spicy-verbose] - parsing production: Unit: sync_Xs -> xs
[spicy] sync::Xs
[spicy-verbose] - state: type=sync::Xs input="A<gap>..." stream=0xXXXXXXXX offsets=0/0/0/1027 chunks=3 frozen=no mode=default trim=yes lah=n/a lah_token="n/a" recovering=no
[spicy-verbose] - state: type=sync::X1 input="A" stream=0xXXXXXXXX offsets=0/0/0/1 chunks=1 frozen=no mode=default trim=yes lah=n/a lah_token="n/a" recovering=no
[spicy-verbose] - parsing production: Unit: sync_X1 -> xs
[spicy-verbose] - state: type=sync::X1 input="A" stream=0xXXXXXXXX offsets=0/0/0/1 chunks=1 frozen=no mode=default trim=yes lah=n/a lah_token="n/a" recovering=no
[spicy-verbose] - parsing production: While: xs -> while(<look-ahead-found>): anon
[spicy-verbose] - state: type=sync::Xs input="A<gap>..." stream=0xXXXXXXXX offsets=0/0/0/1027 chunks=3 frozen=no mode=default trim=yes lah=1 lah_token="A" recovering=no
[spicy-verbose] - state: type=sync::Xs input="A<gap>..." stream=0xXXXXXXXX offsets=0/0/0/1027 chunks=3 frozen=no mode=default trim=yes lah=1 lah_token="A" recovering=no
[spicy-verbose] - state: type=sync::X1 input="A" stream=0xXXXXXXXX offsets=0/0/0/1 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="A" recovering=no
[spicy-verbose] - state: type=sync::X1 input="A" stream=0xXXXXXXXX offsets=0/0/0/1 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="A" recovering=no
[spicy-verbose] - parsing production: Ctor: anon -> /(A|B|C)/ (regexp) (container 'xs')
[spicy-verbose] - consuming look-ahead token
[spicy-verbose] - trimming input
[spicy-verbose] - trimming input
[spicy-verbose] - got container item
[spicy-verbose] suspending to wait for more input for stream 0xXXXXXXXX, currently have 0
[spicy-verbose] resuming after insufficient input, now have 1024 for stream 0xXXXXXXXX
[spicy-verbose] failed to parse list element, will try to synchronize at next possible element
[spicy-verbose] - state: type=sync::Xs input="<gap>..." stream=0xXXXXXXXX offsets=1/0/1/1027 chunks=2 frozen=no mode=default trim=yes lah=n/a lah_token="n/a" recovering=yes
[spicy-verbose] - trimming input
[spicy-verbose] - state: type=sync::Xs input="BC" stream=0xXXXXXXXX offsets=1025/0/1025/1027 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="B" recovering=yes
[spicy-verbose] - state: type=sync::X1 input="" stream=0xXXXXXXXX offsets=1025/0/1025/1025 chunks=0 frozen=no mode=default trim=yes lah=n/a lah_token="n/a" recovering=yes
[spicy-verbose] suspending to wait for more input for stream 0xXXXXXXXX, currently have 0
[spicy-verbose] resuming after insufficient input, now have 2 for stream 0xXXXXXXXX
[spicy-verbose] - state: type=sync::X1 input="BC" stream=0xXXXXXXXX offsets=1025/0/1025/1027 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="B" recovering=yes
[spicy-verbose] successfully synchronized
[spicy-verbose] - state: type=sync::Xs input="BC" stream=0xXXXXXXXX offsets=1025/0/1025/1027 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="B" recovering=no
[spicy-verbose] - state: type=sync::X1 input="BC" stream=0xXXXXXXXX offsets=1025/0/1025/1027 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="B" recovering=no
[spicy-verbose] - parsing production: Ctor: anon -> /(A|B|C)/ (regexp) (container 'xs')
[spicy-verbose] - consuming look-ahead token
[spicy-verbose] - trimming input
[spicy-verbose] - trimming input
[spicy-verbose] - got container item
[spicy-verbose] - state: type=sync::Xs input="C" stream=0xXXXXXXXX offsets=1026/0/1026/1027 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="C" recovering=no
[spicy-verbose] - state: type=sync::Xs input="C" stream=0xXXXXXXXX offsets=1026/0/1026/1027 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="C" recovering=no
[spicy-verbose] - state: type=sync::X1 input="C" stream=0xXXXXXXXX offsets=1026/0/1026/1027 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="C" recovering=no
[spicy-verbose] - state: type=sync::X1 input="C" stream=0xXXXXXXXX offsets=1026/0/1026/1027 chunks=1 frozen=no mode=default trim=yes lah=1 lah_token="C" recovering=no
[spicy-verbose] - parsing production: Ctor: anon -> /(A|B|C)/ (regexp) (container 'xs')
[spicy-verbose] - consuming look-ahead token
[spicy-verbose] - trimming input
[spicy-verbose] - trimming input
[spicy-verbose] - got container item
[spicy-verbose] suspending to wait for more input for stream 0xXXXXXXXX, currently have 0
[$xs=[b"A", b"B", b"C"]]
[spicy-verbose] resuming after insufficient input, now have 0 for stream 0xXXXXXXXX
[spicy-verbose] - setting field 'xs' to '[b"A", b"B", b"C"]'
@@ -0,0 +1,32 @@
### BTest baseline data generated by btest-diff. Do not edit. Use "btest -U/-u" to update. Requires BTest >= 0.63.
[spicy-verbose] - state: type=sync::X2 input="A" stream=0xXXXXXXXX offsets=0/0/0/1 chunks=1 frozen=no mode=default trim=yes lah=n/a lah_token="n/a" recovering=no
[spicy-verbose] - parsing production: Unit: sync_X2 -> xs_2
[spicy-verbose] - state: type=sync::X2 input="A" stream=0xXXXXXXXX offsets=0/0/0/1 chunks=1 frozen=no mode=default trim=yes lah=n/a lah_token="n/a" recovering=no
[spicy-verbose] - parsing production: While: xs_2 -> while(<look-ahead-found>): anon_2
[spicy-verbose] suspending to wait for more input for stream 0xXXXXXXXX, currently have 1
[spicy-verbose] resuming after insufficient input, now have 1025 for stream 0xXXXXXXXX
[spicy-verbose] failed to parse list element, will try to synchronize at next possible element
[spicy-verbose] - trimming input
[spicy-verbose] - state: type=sync::X2 input="" stream=0xXXXXXXXX offsets=1025/0/1025/1025 chunks=0 frozen=no mode=default trim=yes lah=n/a lah_token="n/a" recovering=yes
[spicy-verbose] suspending to wait for more input for stream 0xXXXXXXXX, currently have 0
[spicy-verbose] resuming after insufficient input, now have 2 for stream 0xXXXXXXXX
[spicy-verbose] - state: type=sync::X2 input="AB" stream=0xXXXXXXXX offsets=1025/0/1025/1027 chunks=1 frozen=no mode=default trim=yes lah=2 lah_token="AB" recovering=yes
[spicy-verbose] successfully synchronized
[spicy-verbose] - state: type=sync::X2 input="AB" stream=0xXXXXXXXX offsets=1025/0/1025/1027 chunks=1 frozen=no mode=default trim=yes lah=2 lah_token="AB" recovering=no
[spicy-verbose] - parsing production: Ctor: anon_2 -> /AB/ (regexp) (container 'xs')
[spicy-verbose] - consuming look-ahead token
[spicy-verbose] - trimming input
[spicy-verbose] - trimming input
[spicy-verbose] - got container item
[spicy-verbose] suspending to wait for more input for stream 0xXXXXXXXX, currently have 0
[spicy-verbose] resuming after insufficient input, now have 2 for stream 0xXXXXXXXX
[spicy-verbose] - state: type=sync::X2 input="AB" stream=0xXXXXXXXX offsets=1027/0/1027/1029 chunks=1 frozen=no mode=default trim=yes lah=2 lah_token="AB" recovering=no
[spicy-verbose] - state: type=sync::X2 input="AB" stream=0xXXXXXXXX offsets=1027/0/1027/1029 chunks=1 frozen=no mode=default trim=yes lah=2 lah_token="AB" recovering=no
[spicy-verbose] - parsing production: Ctor: anon_2 -> /AB/ (regexp) (container 'xs')
[spicy-verbose] - consuming look-ahead token
[spicy-verbose] - trimming input
[spicy-verbose] - trimming input
[spicy-verbose] - got container item
[spicy-verbose] suspending to wait for more input for stream 0xXXXXXXXX, currently have 0
[spicy-verbose] resuming after insufficient input, now have 0 for stream 0xXXXXXXXX
[spicy-verbose] - setting field 'xs' to '[b"AB", b"AB"]'
54 changes: 54 additions & 0 deletions tests/spicy/types/unit/synchronize-on-gap.spicy
@@ -0,0 +1,54 @@
# @TEST-DOC: Validates that if a gap is encountered during recovery we can still resynchronize.
#
# @TEST-EXEC: spicyc -dj -o sync.hlto sync.spicy

# @TEST-EXEC: HILTI_DEBUG=spicy-verbose spicy-driver -p sync::X1 -F gap_between_matches sync.hlto >gap_between_matches.log 2>&1
# @TEST-EXEC: TEST_DIFF_CANONIFIER=${SCRIPTS}/canonify-spicy-debug btest-diff gap_between_matches.log

# @TEST-EXEC: HILTI_DEBUG=spicy-verbose spicy-driver -p sync::X2 -F gap_while_matching sync.hlto >gap_while_matching.log 2>&1
# @TEST-EXEC: TEST_DIFF_CANONIFIER=${SCRIPTS}/canonify-spicy-debug btest-diff gap_while_matching.log

# @TEST-START-FILE sync.spicy
module sync;

public type X1 = unit {
%port = 80/tcp;
xs: (/(A|B|C)/ &synchronize)[];
on %synced {
confirm;
}
};

# Test gap during regex match, regression test for #1667.
public type X2 = unit {
%port = 81/tcp;
xs: (/AB/ &synchronize)[];
on %synced {
confirm;
}
};
# @TEST-END-FILE

# @TEST-START-FILE gap_between_matches
!spicy-batch v2
@begin-flow id1 stream 80/tcp
@data id1 1
A
@gap id1 1024
@data id1 2
BC
@end-flow id1
# @TEST-END-FILE

# @TEST-START-FILE gap_while_matching
!spicy-batch v2
@begin-flow id1 stream 81/tcp
@data id1 1
A
@gap id1 1024
@data id1 2
AB
@data id1 2
AB
@end-flow id1
# @TEST-END-FILE
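
The offsets in the `gap_while_matching` baseline follow from the record sizes in the batch input. As a small illustration (the `recordOffsets` helper is hypothetical and not part of Spicy or `spicy-driver`), summing the `@data` and `@gap` sizes gives the stream offset at which each record starts:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: given the sizes of consecutive `@data`/`@gap`
// records, returns the stream offset at which each record starts,
// followed by the offset of the end of the stream.
std::vector<std::size_t> recordOffsets(const std::vector<std::size_t>& sizes) {
    std::vector<std::size_t> offsets;
    std::size_t pos = 0;

    for ( const auto size : sizes ) {
        offsets.push_back(pos); // this record starts where the previous one ended
        pos += size;
    }

    offsets.push_back(pos); // end of stream
    return offsets;
}
```

For the `gap_while_matching` input (sizes 1, 1024, 2, 2) this yields start offsets 0, 1, 1025 and 1027 and an end offset of 1029, matching the `offsets=1025/0/1025/1027` and `offsets=1027/0/1027/1029` states in the baseline.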
39 changes: 0 additions & 39 deletions tests/spicy/types/unit/synchronize-on-gap.test

This file was deleted.
