Skip to content

chore(tests/load): C-chain load testing #3914

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 191 commits into
base: master
Choose a base branch
from
Draft

chore(tests/load): C-chain load testing #3914

wants to merge 191 commits into from

Conversation

qdm12
Copy link
Contributor

@qdm12 qdm12 commented Apr 28, 2025

Why this should be merged

How this works

./bin/ginkgo -v ./tests/load/c -- --avalanchego-path=$PWD/build/avalanchego

Configuration is in Go code for now, in the tests/load/c/ginkgo_test.go file

How this was tested

Need to be documented in RELEASES.md?

flagVars = e2e.RegisterFlagsWithDefaultOwner("avalanchego-load")
}

var _ = ginkgo.SynchronizedBeforeSuite(func() []byte {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it isn't necessary to use ginkgo if there is only going to be a single test entrypoint and there is no need for test parallelism. For details as to how this can work, follow the call chain in one of the antithesis tests starting from here:

https://github.com/ava-labs/avalanchego/blob/master/tests/antithesis/avalanchego/main.go#L53

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would love to but my attempts have failed so far, given the tight coupling of e2e tests code with ginkgo. For example NewTestEnvironment requires a bunch of ginkgo things, and it's rather hard to separate. Any tip on this I am missing out on?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It requires TestContext, not ginkgo. As per the antithesis testing link, it's possible to substitute tests.SimpleTestContext for tests.fixture.e2e.GinkgoTestContext to be able to use NewTestEnvironment without ginkgo. The main requirement of SimpleTestFixture is to invoke defer tc.Cleanup() to ensure the calls to tc.DeferCleanup(...) get executed before exit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed usage of ginkgo in favor of UTs: 3b5e69a

@qdm12 qdm12 force-pushed the qdm12/tests/load branch from 1f7fd9d to e2adf7b Compare April 30, 2025 08:23
@@ -72,6 +73,23 @@ func GetNodeURIs(nodes []*Node) []NodeURI {
return uris
}

// GetNodeWebsocketURIs returns a list of websocket URIs for the given nodes and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the example of TestEnvironment.GetNodeURIs, a valid deferCleanup method is required that ensures forwarding tunnels are scoped to the lifetime of a test.

--out="${CONTRACTS_DIR}/${CONTRACT_NAME}.bindings.go"
echo "Generated ${CONTRACT_NAME}.bindings.go"
done
rm -r "${TEMPDIR}"
Copy link
Contributor

@maru-ava maru-ava May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intended that this temporary directory persist in the event of any error in generation? (vs using a cleanup function configured to run on exit e.g. trap cleanup EXIT)

err = result.err
cancel()
default:
panic("unreachable")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(No action required) Why isn't it appropriate to just return an error here?

return agents, nil
}

func createAgent(ctx context.Context, endpoint string, key *secp256k1.PrivateKey,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(No action required) Arbitrary line break? The convention appears to be all or none of the arguments are separated by line break.

@@ -0,0 +1,17 @@
global:
scrape_interval: 10s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe configure this scrape interval to tmpnet.prometheusScrapeInterval (exporting in the process) rather than hard-coding? That would ensure a value compatible with the network shutdown delay ensuring a final metrics scrape.

@@ -1073,9 +1078,58 @@ func MetricsLinkForNetwork(networkUUID string, startTime string, endTime string)
endTime = "now"
}
return fmt.Sprintf(
"https://grafana-poc.avax-dev.network/d/kBQpRdWnk/avalanche-main-dashboard?&var-filter=network_uuid%%7C%%3D%%7C%s&var-filter=is_ephemeral_node%%7C%%3D%%7Cfalse&from=%s&to=%s",
"https://%s/d/kBQpRdWnk/avalanche-main-dashboard?&var-filter=network_uuid%%7C%%3D%%7C%s&var-filter=is_ephemeral_node%%7C%%3D%%7Cfalse&from=%s&to=%s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this should be composed with BuildMonitoringURLForNetwork. Maybe a separable change too.

"targets": []string{s.addr},
"labels": map[string]string{
"network_uuid": networkUUID,
"network_owner": networkOwner,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the owner relevant when there is only ever a single network for a load test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this when I was testing my hypothesis, it's not relevant anymore.

Taskfile.yml Outdated
cmds:
- task: generate-load-contract-bindings
- task: build
- cmd: ./bin/ginkgo -v ./tests/load/c -- --avalanchego-path=$PWD/build/avalanchego --start-metrics-collector --start-logs-collector
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • run-monitored-tmpnnet-cmd automatically configures collector start and checks if the monitoring credentials are available
  • none of the other tasks in this file explicitly enable collector start because doing so when credentials are not set in the environment will result in failure
  • this task should not be enabling collector start - that is the responsibility of the caller (i.e. to start collectors manually or invoke the task with arguments (e.g. task [my task] -- --start-metrics-collector)
  • That suggests test-load-local is unnecessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merged test-load-local into test-load and removed arguments to enable collector start: 3d211a7


func createAgent(ctx context.Context, endpoint string, key *secp256k1.PrivateKey,
tracker *load.Tracker[common.Hash],
) (load.Agent[common.Hash], error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(No action required) Not really loving the readability of a generic interface like this... But also not sure how long this pattern will need to persist - hopefully we're moving towards a design where the transaction ID doesn't have to be known to the orchestrator.

@RodrigoVillar
Copy link
Contributor

As discussed offline, going to take over development of this PR. My first tasks will be the following:

  1. Switching from using ginkgo to having the load tests be unit tests (as referenced here, there isn't a need to use ginkgo).
  2. Addressing minor nits from reviews above

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

5 participants