Add qemu style nix checks for hydra-cluster, hydra-node, hydra-tui #1647
Conversation
Force-pushed f057c61 to 8ff741e (compare)
Transaction costs

Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters are currently using

Script summary

Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 5094 | 5.75 | 2.27 | 0.44 |
2 | 5297 | 7.09 | 2.80 | 0.46 |
3 | 5496 | 8.56 | 3.39 | 0.49 |
5 | 5902 | 11.12 | 4.39 | 0.53 |
10 | 6906 | 18.11 | 7.16 | 0.65 |
57 | 16355 | 82.81 | 32.75 | 1.77 |
Commit transaction costs
This uses ada-only outputs for better comparability.
UTxO | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 567 | 10.52 | 4.15 | 0.29 |
2 | 759 | 13.86 | 5.65 | 0.34 |
3 | 944 | 17.33 | 7.20 | 0.38 |
5 | 1322 | 24.65 | 10.44 | 0.48 |
10 | 2253 | 45.22 | 19.36 | 0.75 |
20 | 4123 | 95.99 | 40.76 | 1.40 |
CollectCom transaction costs
Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|---|
1 | 57 | 560 | 22.14 | 8.66 | 0.42 |
2 | 114 | 671 | 33.89 | 13.40 | 0.55 |
3 | 170 | 786 | 46.27 | 18.50 | 0.69 |
4 | 227 | 893 | 62.56 | 25.17 | 0.88 |
5 | 284 | 1004 | 78.06 | 31.64 | 1.05 |
6 | 337 | 1116 | 93.57 | 38.25 | 1.23 |
Cost of Decrement Transaction
Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 632 | 17.95 | 7.88 | 0.38 |
2 | 785 | 19.09 | 9.07 | 0.40 |
3 | 982 | 20.81 | 10.38 | 0.44 |
5 | 1350 | 26.05 | 13.92 | 0.52 |
10 | 2105 | 32.64 | 20.08 | 0.65 |
50 | 8036 | 99.81 | 75.38 | 1.85 |
Close transaction costs
Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 644 | 20.03 | 8.99 | 0.41 |
2 | 809 | 21.53 | 10.43 | 0.44 |
3 | 946 | 23.03 | 11.85 | 0.46 |
5 | 1373 | 26.98 | 15.60 | 0.54 |
10 | 1888 | 33.48 | 21.90 | 0.66 |
50 | 8021 | 96.90 | 82.84 | 1.88 |
Contest transaction costs
Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 683 | 25.86 | 11.12 | 0.47 |
2 | 798 | 27.69 | 12.65 | 0.50 |
3 | 996 | 29.72 | 14.45 | 0.54 |
5 | 1401 | 34.20 | 18.23 | 0.62 |
10 | 2082 | 43.45 | 26.07 | 0.78 |
40 | 6607 | 98.74 | 74.17 | 1.77 |
Abort transaction costs
There is some variation due to the random mixture of initial and already committed outputs.
Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|
1 | 4971 | 17.47 | 7.59 | 0.56 |
2 | 5061 | 24.98 | 10.81 | 0.65 |
3 | 5166 | 37.83 | 16.52 | 0.80 |
4 | 5285 | 55.83 | 24.61 | 1.01 |
5 | 5444 | 74.54 | 33.04 | 1.23 |
6 | 5620 | 91.81 | 40.80 | 1.43 |
FanOut transaction costs
Involves spending head output and burning head tokens. Uses ada-only UTxO for better comparability.
Parties | UTxO | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
---|---|---|---|---|---|---|
5 | 0 | 0 | 4934 | 7.89 | 3.34 | 0.46 |
5 | 1 | 57 | 4968 | 9.02 | 4.05 | 0.47 |
5 | 5 | 284 | 5103 | 13.15 | 6.73 | 0.53 |
5 | 10 | 570 | 5275 | 19.01 | 10.37 | 0.61 |
5 | 20 | 1137 | 5611 | 30.52 | 17.57 | 0.77 |
5 | 30 | 1707 | 5954 | 41.25 | 24.44 | 0.92 |
5 | 40 | 2278 | 6295 | 53.17 | 31.82 | 1.09 |
5 | 50 | 2844 | 6631 | 64.51 | 38.94 | 1.24 |
5 | 81 | 4612 | 7685 | 99.47 | 60.99 | 1.73 |
End-to-end benchmark results
This page is intended to collect the latest end-to-end benchmark results produced by Hydra's continuous integration (CI) system from the latest master code.
Please note that these results are approximate as they are currently produced from limited cloud VMs and not controlled hardware. Rather than focusing on the absolute results, the emphasis should be on relative results, such as how the timings for a scenario evolve as the code changes.
Generated at 2024-09-21 10:23:02.463197046 UTC
Baseline Scenario
Number of nodes | 1 |
---|---|
Number of txs | 3000 |
Avg. Confirmation Time (ms) | 4.184044471 |
P99 | 6.971361249999921ms |
P95 | 4.57936845ms |
P50 | 3.7388875ms |
Number of Invalid txs | 0 |
Three local nodes
Number of nodes | 3 |
---|---|
Number of txs | 9000 |
Avg. Confirmation Time (ms) | 21.969437453 |
P99 | 107.73078102000024ms |
P95 | 28.43345175ms |
P50 | 19.708637000000003ms |
Number of Invalid txs | 0 |
```diff
 validateJSON "does-not-matter.json" id Null
-  `shouldThrow` exceptionContaining @IOException "installed"
+  `shouldThrow` exceptionContaining @IOException ""
```
Why is this "installed" removed?
Actually no idea. It throws, but it's a different message in the VM apparently.
What's the different message / common denominator to assert for then? Asserting that it contains "" degenerates this statement.
```
vm-test-run-hydra-node> 1) Hydra.JSONSchema, validateJSON withJsonSpecifications, fails with missing tool
vm-test-run-hydra-node> predicate failed on expected exception: IOException
vm-test-run-hydra-node> does-not-matter.json: withBinaryFile: does not exist (No such file or directory)
```
Ah, so this is indeed not testing what it should (it "fails with missing tool").
So before, this would fail with a tool-missing error, but it seems like withClearedPATH behaves differently in the NixOS VM test?
Not saying this is the best test ever, and maybe we should rather not test this at all, but setting this to "" certainly is not the solution here.
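One hedged option for the assertion discussed above is a custom `shouldThrow` selector that accepts either of the two messages observed so far; the helper names and message fragments below are illustrative assumptions, not project API:

```haskell
import Control.Exception (IOException)
import Data.List (isInfixOf)

-- Hypothetical helper: True for either failure message observed so far
-- (the local "installed" error and the VM's "does not exist" error).
matchesKnownFailure :: String -> Bool
matchesKnownFailure msg =
  "installed" `isInfixOf` msg || "does not exist" `isInfixOf` msg

-- Selector usable with hspec's `shouldThrow`.
missingToolOrFile :: IOException -> Bool
missingToolOrFile = matchesKnownFailure . show
```

Usage would then be `validateJSON "does-not-matter.json" id Null \`shouldThrow\` missingToolOrFile`, which, unlike asserting containment of "", still fails on unrelated IOExceptions.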
I see this error locally:
and, with extreme sadness, I note that this is not the error I see in the CI here.
I think, at least, if we're going to go down this path we need to hard-code the seeds into the tests; otherwise we're just going to be in a very strange world of non-reproducible-error hell. That said, getting this error makes me feel a bit mixed on even this approach, given that the tests are now run a bit non-idiomatically compared with how development is done; meaning that if I want to replicate that exact error I either have to adjust the nix derivation (terrible) or build the cabal project and run it that way (but, of course, I expect I won't get that error again in that case).
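Hard-coding the seed is straightforward with hspec's runner configuration; a sketch, assuming the suite uses hspec's default runner, where the seed value and the example `spec` are purely illustrative:

```haskell
import Test.Hspec (Spec, describe, it, shouldBe)
import Test.Hspec.Runner (Config (..), defaultConfig, hspecWith)

spec :: Spec
spec = describe "example" $
  it "runs with a pinned QuickCheck seed" $ (1 + 1) `shouldBe` (2 :: Int)

main :: IO ()
main =
  -- 1647 is an arbitrary hard-coded seed; the point is that the VM run
  -- and a local cabal run then generate the same counterexamples.
  hspecWith defaultConfig { configQuickCheckSeed = Just 1647 } spec
```

The same effect is available without code changes via hspec's `--seed` command-line option, which may be easier to thread through both CI and local invocations.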
Force-pushed 0d4497d to 6c363c1 (compare)
This is a normal flaky test.
Force-pushed 0e1885e to e2bdc27 (compare)
Force-pushed e2bdc27 to 510b894 (compare)
Thanks for putting this up as a talking point :) It's good to see. I think ultimately I'm against this style of testing here. Here's my reasoning:
Overall, I feel like committing to a "CI as the developer runs it" style of work seems very reasonable, and moreover just a bit easier to think about day-to-day. Some avenues I can think of for exploring this:
Overall, though, I think the pain of vmTests just doesn't make sense here for our kind of day-to-day workflow. I think there could definitely be a place for vmTests that would, say, set up a full hydra node environment and test some set of transactions (i.e. serve as a bit of a demo of how to build and run a hydra node from scratch, or something? Not sure exactly, but it could be fun to think about); but that should be additional, not a replacement for what we have. I mean, I can see that this PR only adds tests, so in some sense it's fine, but I just don't want it to replace our existing testing approach, merely add something that is additionally useful. What are your thoughts?
510b894
to
51e8148
Compare
This doesn't bother me. Haskell derivations aren't reproducible either. The flakiness of test depth hasn't shown up here either. The flakiness we saw on this and the other branch was due to race conditions which are randomly present, and we should just drop those tests.
This is an extremely normal practice that I am used to. Nobody develops by building Haskell applications as a derivation; everyone uses cabal in a shell. But we use derivations in CI because they finish instantly in the case of a no-op.
This is odd but doesn't bother me either. It's OK to allocate one core per spun-up service within the machine, more or less.
This is a good thing. It highlights that we should put tests that require the internet in a separate test entirely, and put tests that don't in a non-sandboxed VM so we get instant finality on those derivations.
I think this is an anti-pattern. CI is a massive sledgehammer that you should not be running on a 2-second feedback loop. It's something I would use as a checkpoint every 30 minutes to say "What does CI have to say?", or if I want to link a specific error.
Except hydraJobs, which requires derivations.
I do this with Haskell scripts sometimes as well, but derivations are better.
Not sure how this helps us.
Anything nix will expect derivations.
Requires derivations.
Ideally I would have one github pipeline that consumes the flake and renders it in individual jobs like gitlab dynamic pipeline.
Requires derivations.
My final thought is that I would actually use this locally. Running cabal test all jams up ports, so I don't use it; I test specific packages with cabal. But when finalising the branch, I want to use nix flake check so that anything that is done doesn't even show up in the terminal a second time: I'm just left with the derivations that are still a problem. VM tests are annoyingly un-granular, but that can be improved if we commit to the style.
It has shown up, as I myself was testing out this very commit; see my comment earlier in the thread! :)
I don't see that it is, in general, because of the poor use of resources. Recall that the central idea of this issue was speeding things up; this doesn't accomplish that in general, i.e. it requires care and curation, which is difficult and time-consuming.
I can be convinced; but how do we make it easy to keep this consistent and fast between CI and local dev, so we're not maintaining two different ways of running the same tests? Re: "Some avenues I can think of for exploring this: ..." What I was hoping to see is an exploration of how some of these approaches would help our central goal: faster CI.
I think if it's useful for you it's fine to add these extra derivations; but I just don't want to switch our CI to it without doing some more investigation into other ways to speed up the CI (see above comment) and then, if we do decide this is best, carefully resolving the problems with the seed/randomness and the dual-maintenance/dev problems. Maybe to help make some progress here: do you have some example projects out there that use VM tests really nicely? It would be great to see/learn from!
There are lots of examples in nixpkgs itself. https://github.com/search?q=repo%3ANixOS%2Fnixpkgs+makeTest&type=code
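For orientation, the minimal shape of such a check, following the nixpkgs pattern searched for above, might look like the sketch below; the attribute names, the `hydra-node` package argument, and the test script are illustrative assumptions, not this PR's actual code:

```nix
# Hypothetical qemu-style check using the nixpkgs NixOS test framework.
{ pkgs, hydra-node }:
pkgs.testers.runNixOSTest {
  name = "hydra-node-vm-check";
  nodes.machine = { ... }: {
    environment.systemPackages = [ hydra-node ];
    # VM tests are resource-hungry; size the guest explicitly.
    virtualisation.memorySize = 4096;
    virtualisation.cores = 2;
  };
  testScript = ''
    machine.wait_for_unit("multi-user.target")
    machine.succeed("hydra-node --version")
  '';
}
```

Wired into a flake's `checks`, each such derivation can then be built individually (rather than via a full `nix flake check`), which is one way to address the granularity complaint above.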
Force-pushed 51e8148 to 8071acb (compare)
I would like to second this. Allowing contributions without needing nix for typical development workflows is a great trait to retain for the project (is it currently true?)
I think your points are orthogonal to each other. From my viewpoint, I would like the continuous integration workflow to
For 1, IMO the current state is too much nix already (see above).
I think any kind of fault testing that requires full machines (instead of containers), e.g. using
Force-pushed a0a57ee to 611e4a6 (compare)
Force-pushed 611e4a6 to 961f844 (compare)
Force-pushed 961f844 to a3fa77a (compare)
Closing (as discussed).
Allows running node, cluster, and tui tests in an isolated environment with the sandbox off for internet access.