Skip to content

ci: added miri job #1103

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from
Closed

ci: added miri job #1103

wants to merge 9 commits into from

Conversation

xqft
Copy link
Contributor

@xqft xqft commented Jan 31, 2023

Closes #1088

Co-authored-by: lambdaclass-user <[email protected]>
@xqft xqft requested a review from gakonst as a code owner January 31, 2023 14:33
@codecov-commenter
Copy link

codecov-commenter commented Jan 31, 2023

Codecov Report

Merging #1103 (dabe11c) into main (e0dbcae) will increase coverage by 0.13%.
The diff coverage is 100.00%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##             main    #1103      +/-   ##
==========================================
+ Coverage   74.63%   74.77%   +0.13%     
==========================================
  Files         321      333      +12     
  Lines       34954    35518     +564     
==========================================
+ Hits        26088    26558     +470     
- Misses       8866     8960      +94     
Flag Coverage Δ
unit-tests 74.77% <100.00%> (+0.13%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
crates/primitives/src/bits.rs 80.00% <ø> (ø)
crates/primitives/src/hex_bytes.rs 93.39% <ø> (ø)
crates/storage/codecs/derive/src/arbitrary.rs 98.33% <100.00%> (+0.02%) ⬆️
crates/primitives/src/peer.rs 47.36% <0.00%> (-15.79%) ⬇️
crates/net/network/src/swarm.rs 61.63% <0.00%> (-0.79%) ⬇️
crates/net/network/src/manager.rs 47.44% <0.00%> (-0.57%) ⬇️
crates/stages/src/stages/bodies.rs 92.68% <0.00%> (-0.45%) ⬇️
crates/net/network/src/peers/manager.rs 83.64% <0.00%> (-0.02%) ⬇️
bin/reth/src/node/mod.rs 0.00% <0.00%> (ø)
bin/reth/src/stage/mod.rs 0.00% <0.00%> (ø)
... and 163 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like it has problems with the fuzz tests, doesn't look like UB.
so perhaps we annotate them with miri ignore?

maybe we don't want to run this on every PR
wdyt @onbjerg ?

@mattsse mattsse added the A-observability Related to tracing, metrics, logs and other observability tools label Jan 31, 2023
@xqft
Copy link
Contributor Author

xqft commented Jan 31, 2023

Yep, there's unsupported functionality within #[test_fuzz] and #[tokio::test] as specified by tokio's miri job. I've been debugging this for a while and the solution doesn't seem so trivial, I'm not sure how tokio solved this, they just seem to accepted the fact that some tests may fail by using --no-fail-fast.

@mattsse
Copy link
Collaborator

mattsse commented Jan 31, 2023

I believe the tokio crate has some comments regarding miri and tokio::test tokio::main in its codebase

xqft and others added 2 commits January 31, 2023 14:39
Co-authored-by: lambdaclass-user <[email protected]>
Co-authored-by: lambdaclass-user <[email protected]>
@xqft
Copy link
Contributor Author

xqft commented Jan 31, 2023

I did some investigation with the help of @fkrause98 and we don't see a way out from this right now:

  • The general problem is that miri doesn't support many functions and syscalls which are implemented in libraries like tokio and test-fuzz. In particular macros like #[test_fuzz] and #[tokio::test] derive unsupported functionality by miri. This means that any test executed with these will fail.
  • We think tokio's way of dealing with this is by only testing lib tests by using the --lib flag, not the bin crate one's which is where they implement the most unsupported functionality. There's only one test in their lib which would fail and they make miri ignore it by the use of #[cfg_attr(miri, ignore)]. Other two tests use this macro too but it's because they run too slow on miri. They also use the --no-fail-fast flag I think because of convenience, so only one failed miri test doesn't stop the whole job. I don't think it's related to the problem.

We could theoretically implement #[cfg_attr(miri, ignore)] in every test that fails because of unsupported, but:

  • They're too many. Every async tokio test will need to be ignored, also every fuzz test, and I found some of these which are generated by other functions, something that adds complexity.
  • Everytime a new test is implemented, we need to be careful to not use unsupported functionality or to implement the ignore macro. This creates burden for people who didn't jump into this rabbit hole yet.

Possible ways of action:

  • Using the -Zmiri-panic-on-unsupported flag and find a way to filter out every test which fails because of unsupported functionality (this is relevant)
  • Could it be possible to make miri ignore every test and enable it only on those we think could have UB?

New ideas and insight about this are welcome, we may be missing something

@xqft xqft marked this pull request as draft January 31, 2023 20:12
@gakonst
Copy link
Member

gakonst commented Jan 31, 2023

Thanks for investigating this @xqft - appreciate the detailed writeup & sharing of the link above.

The basic idea seems to be to pass -Z unstable-options --format json to the test runner, run Miri with -Zmiri-panic-on-unsupported, and then parse the resulting JSON to basically ignore tests that failed due to a panic. Though even without that processing, just MIRIFLAGS=-Zmiri-panic-on-unsupported cargo miri test should already tell you if there is any UB anywhere (which leads to Miri errors) or just unsupported operations (which lead to panics, that the test harness will complain about in its summary).

This seems like a good direction. Run the tests and parse the results.

I'm seeing the following on testfuzz, which seems easy to ignore. What is the equivalent for tokio tests? We could filter these out.

error: unsupported operation: can't call foreign function pipe2 on OS linux
--> /home/runner/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/unix/pipe.rs:30:21
|
30 | cvt(libc::pipe2(fds.as_mut_ptr(), libc::O_CLOEXEC))?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't call foreign function pipe2 on OS linux
|

@onbjerg
Copy link
Collaborator

onbjerg commented Feb 1, 2023

I think in extension to above, we should probably not fail CI if this job fails seeing as miri is unstable/experimental? @mattsse

@xqft
Copy link
Contributor Author

xqft commented Feb 1, 2023

@gakonst using the -Zmiri-panic-on-unsupported flag every test that fails because of a panic should be about unsupported functionality, and every other which fails with an error should be because of UB. Also we can filter just by the unsupported operation string in every test's stdout.

xqft and others added 4 commits February 3, 2023 10:56
Co-authored-by: lambdaclass-user <[email protected]>
Co-authored-by: lambdaclass-user <[email protected]>
Co-authored-by: lambdaclass-user <[email protected]>
Co-authored-by: lambdaclass-user <[email protected]>
@xqft
Copy link
Contributor Author

xqft commented Feb 3, 2023

I arrived to this solution by creating a custom github action, it works by:

  • Outputting the tests results in json, parsing and formatting these in a readable way with jq.
  • Tests status can be ok, failed or unsupported. Unsupported tests' errors are supressed by the use of -Zmiri-panic-on-unsupported, although some tests seem to bypass this and throw errors on stdout anyway instead of just panicking.
  • Because of the unsupported tests, the job would always fail, so I piped the output to a script that exits with status 101 only if a regex matches that a test has status failed (because of UB). Else it exits with 0.
  • Proptests seem to hang with miri and get sigkilled

This is all really crumbly, the action facilitates things a bit but miri is slow and unstable (oficially experimental), I would only recommend to run this job maybe in a set apart branch that's up to date with main. Maybe a CI job with miri should be scraped entirely and just run it locally with a limited amount of tests each time.

@xqft xqft marked this pull request as ready for review February 3, 2023 15:43
Co-authored-by: lambdaclass-user <[email protected]>
@xqft xqft requested review from rakita and joshieDo as code owners February 3, 2023 16:39
@gakonst gakonst removed request for rakita and joshieDo February 3, 2023 20:08
Co-authored-by: lambdaclass-user <[email protected]>
@gakonst
Copy link
Member

gakonst commented Feb 5, 2023

oof it looks like it doesn't like FFI either

@gakonst gakonst added the S-needs-investigation This issue requires detective work to figure out what's going wrong label Feb 9, 2023
@onbjerg
Copy link
Collaborator

onbjerg commented Feb 14, 2023

We can integrate this with nextest which is pretty nice: https://nexte.st/book/miri.html

@xqft
Copy link
Contributor Author

xqft commented Feb 14, 2023

I discarded nextest because it doesn't yet support tests run in json output

@gakonst
Copy link
Member

gakonst commented Mar 9, 2023

@xqft what's the plan here? I think if we don't have a "clean" way of integrating with success, let's close and revisit later.

@xqft
Copy link
Contributor Author

xqft commented Mar 10, 2023

@gakonst I'm currently not active in reth because of other priorities, but I think miri for now is too incompatible with reth to be added to the CI, and ignoring those incompatibilities doesn't work too well. Agreed with revisiting later.

@xqft xqft closed this Mar 10, 2023
@gakonst
Copy link
Member

gakonst commented Mar 10, 2023

agreed - thank you for taking the time in digging into this, I learned something.

@gakonst gakonst deleted the add-miri-ci branch March 10, 2023 23:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-observability Related to tracing, metrics, logs and other observability tools S-needs-investigation This issue requires detective work to figure out what's going wrong
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ci: Integrate miri job
5 participants