-
Notifications
You must be signed in to change notification settings - Fork 752
chore(tests/load): C-chain load testing #3914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
tests/load/ginkgo_test.go
Outdated
flagVars = e2e.RegisterFlagsWithDefaultOwner("avalanchego-load") | ||
} | ||
|
||
var _ = ginkgo.SynchronizedBeforeSuite(func() []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that it isn't necessary to use ginkgo if there is only going to be a single test entrypoint and there is no need for test parallelism. For details as to how this can work, follow the call chain in one of the antithesis tests starting from here:
https://github.com/ava-labs/avalanchego/blob/master/tests/antithesis/avalanchego/main.go#L53
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would love to but my attempts have failed so far, given the tight coupling of e2e tests code with ginkgo. For example NewTestEnvironment
requires a bunch of ginkgo things, and it's rather hard to separate. Any tip on this I am missing out on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It requires TestContext, not ginkgo. As per the antithesis testing link, it's possible to substitute tests.SimpleTestContext for tests.fixture.e2e.GinkgoTestContext to be able to use NewTestEnvironment without ginkgo. The main requirement of SimpleTestFixture is to invoke defer tc.Cleanup()
to ensure the calls to tc.DeferCleanup(...)
get executed before exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed usage of ginkgo
in favor of UTs: 3b5e69a
@@ -72,6 +73,23 @@ func GetNodeURIs(nodes []*Node) []NodeURI { | |||
return uris | |||
} | |||
|
|||
// GetNodeWebsocketURIs returns a list of websocket URIs for the given nodes and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per the example of TestEnvironment.GetNodeURIs
, a valid deferCleanup method is required that ensures forwarding tunnels are scoped to the lifetime of a test.
--out="${CONTRACTS_DIR}/${CONTRACT_NAME}.bindings.go" | ||
echo "Generated ${CONTRACT_NAME}.bindings.go" | ||
done | ||
rm -r "${TEMPDIR}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intended that this temporary directory persist in the event of any error in generation? (vs using a cleanup function configured to run on exit e.g. trap cleanup EXIT
)
err = result.err | ||
cancel() | ||
default: | ||
panic("unreachable") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(No action required) Why isn't it appropriate to just return an error here?
return agents, nil | ||
} | ||
|
||
func createAgent(ctx context.Context, endpoint string, key *secp256k1.PrivateKey, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(No action required) Arbitrary line break? The convention appears to be all or none of the arguments are separated by line break.
@@ -0,0 +1,17 @@ | |||
global: | |||
scrape_interval: 10s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe configure this scrape interval to tmpnet.prometheusScrapeInterval (exporting in the process) rather than hard-coding? That would ensure a value compatible with the network shutdown delay ensuring a final metrics scrape.
@@ -1073,9 +1078,58 @@ func MetricsLinkForNetwork(networkUUID string, startTime string, endTime string) | |||
endTime = "now" | |||
} | |||
return fmt.Sprintf( | |||
"https://grafana-poc.avax-dev.network/d/kBQpRdWnk/avalanche-main-dashboard?&var-filter=network_uuid%%7C%%3D%%7C%s&var-filter=is_ephemeral_node%%7C%%3D%%7Cfalse&from=%s&to=%s", | |||
"https://%s/d/kBQpRdWnk/avalanche-main-dashboard?&var-filter=network_uuid%%7C%%3D%%7C%s&var-filter=is_ephemeral_node%%7C%%3D%%7Cfalse&from=%s&to=%s", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, this should be composed with BuildMonitoringURLForNetwork. Maybe a separable change too.
tests/load/prometheus.go
Outdated
"targets": []string{s.addr}, | ||
"labels": map[string]string{ | ||
"network_uuid": networkUUID, | ||
"network_owner": networkOwner, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the owner relevant when there is only ever a single network for a load test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this when I was testing my hypothesis, it's not relevant anymore.
Taskfile.yml
Outdated
cmds: | ||
- task: generate-load-contract-bindings | ||
- task: build | ||
- cmd: ./bin/ginkgo -v ./tests/load/c -- --avalanchego-path=$PWD/build/avalanchego --start-metrics-collector --start-logs-collector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- run-monitored-tmpnnet-cmd automatically configures collector start and checks if the monitoring credentials are available
- none of the other tasks in this file explicitly enable collector start because doing so when credentials are not set in the environment will result in failure
- this task should not be enabling collector start - that is the responsibility of the caller (i.e. to start collectors manually or invoke the task with arguments (e.g.
task [my task] -- --start-metrics-collector
) - That suggests test-load-local is unnecessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merged test-load-local
into test-load
and removed arguments to enable collector start: 3d211a7
|
||
func createAgent(ctx context.Context, endpoint string, key *secp256k1.PrivateKey, | ||
tracker *load.Tracker[common.Hash], | ||
) (load.Agent[common.Hash], error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(No action required) Not really loving the readability of a generic interface like this... But also not sure how long this pattern will need to persist - hopefully we're moving towards a design where the transaction ID doesn't have to be known to the orchestrator.
As discussed offline, going to take over development of this PR. My first tasks will be the following:
|
Why this should be merged
How this works
Configuration is in Go code for now, in the tests/load/c/ginkgo_test.go file
How this was tested
Need to be documented in RELEASES.md?