Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go.mod: upgrade opentelemetry deps #13888

Merged
merged 1 commit into from
Apr 13, 2022

Conversation

willbeason
Copy link
Contributor

Downstream users of etcd experience build issues when using dependencies
which require more recent (incompatible) versions of opentelemetry. This
commit upgrades the dependencies so that downstream users stop
experiencing these issues.

This will resolve:

@serathius
Copy link
Member

FYI this is third PR that tried to upgrade opentelemetry, however others didn't pass tests and seem abandoned. #13723 #13808.

@willbeason
Copy link
Contributor Author

FYI this is third PR that tried to upgrade opentelemetry, however others didn't pass tests and seem abandoned. #13723 #13808.

Ok. I'm having trouble seeing the failed test results. I only see:

"Error: The log was not found. It may have been deleted based on retention settings."

@willbeason
Copy link
Contributor Author

Is there a way to run the workflow locally?

@serathius
Copy link
Member

Error in Static analysis:

FAIL: inconsistent versions for depencency: google.golang.org/grpc
  - go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/[email protected] from: go.etcd.io/etcd/server/v3
  - google.golang.org/[email protected] from: go.etcd.io/etcd/api/v3
  - google.golang.org/[email protected] from: go.etcd.io/etcd/client/v3
  - google.golang.org/[email protected] from: go.etcd.io/etcd/etcdctl/v3
FAIL: inconsistent versions for depencency: google.golang.org/protobuf
FAIL: inconsistent dependencies
  - google.golang.org/[email protected] from: go.etcd.io/etcd/pkg/v3
FAIL: 'dep' failed at Tue Apr  5 18:27:09 UTC 2022

@willbeason
Copy link
Contributor Author

Thanks! I'll take a look tomorrow and correct that

@willbeason
Copy link
Contributor Author

If I'm reading that correctly, I just need to update go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc. The commit I've just pushed will hopefully do that.

@willbeason
Copy link
Contributor Author

For future people: you can run static analysis yourself. For the above failure (inconsistent dependencies), for example, I can use this command:

PASSES='dep' ./scripts/test.sh

This isn't guaranteed to catch all failures as there could be architecture-specific failures, but it will likely catch most of them.

@willbeason
Copy link
Contributor Author

I see - the main intent of these failures is that:

  • this repository contains multiple Go modules
  • the 'dep' check ensures that every external dependency has exactly the same version for all of these modules

@willbeason willbeason force-pushed the upgrade-opentelemetry branch from c7fcde7 to dfcd6f8 Compare April 6, 2022 14:20
@willbeason
Copy link
Contributor Author

willbeason commented Apr 6, 2022

Ok - new failure this time, so making progress.

Here is the failure: https://gist.github.com/willbeason/c774057148da7dc69fb1bf0fbc5d166c

This failure can be recreated with the command:

PASSES='bom' ./scripts/test.sh

It looks like a check that ensures the various tools used by scripts are the ones installed by a default invocation of go install. In this case, bill-of-materials.

This seems to ensure that various files that list dependencies all contain the appropriate records

@willbeason
Copy link
Contributor Author

The solution here was to run this script which automatically fixes these issues:

./scripts/fix.sh

@willbeason
Copy link
Contributor Author

That's odd - the only change in the code is running ./scripts/fix.sh (I verified there were no spurious changes), but now Semaphore reports that the build fails for i386 architecture, whereas for the previous commit it passed. The only changes are to bill-of-materials files and to go.sum. Could it be a flake, or have I missed something?

@willbeason
Copy link
Contributor Author

Ah, I see from the failure that it is a flake:

Step 1/14 : FROM ubuntu:21.10
toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
make[1]: *** [build-docker-test] Error 1
make[1]: Leaving directory `/home/runner/etcd'
make: *** [ensure-docker-test-image-exists] Error 2

@serathius
Copy link
Member

Yep, it's flaky. #13448

@willbeason
Copy link
Contributor Author

Hooray, static analysis passed!

@serathius
Copy link
Member

@willbeason appreciate any feedback on how we could make contributions easier :P.

@willbeason
Copy link
Contributor Author

@serathius What would have made things faster for me would be:

  • In CONTRIBUTING.md, list scripts in scripts/ that a first-time contributor would need (and what they're for). Or - have a short explanation at the top of each script which briefly says what it's for. (Doesn't have to be a full usage guide, just "Updates license information in modules and submodules. CI checks that running this causes no changes." would go a long way)
  • For CI failures, suggest the appropriate script to fix the corresponding failure. For example, if the bom test fails, most likely the contributor just needs to run ./scripts/fix.sh

@serathius
Copy link
Member

Created #13896, feel free to add more.

@willbeason
Copy link
Contributor Author

Thanks!

I'm really hoping that this PR doesn't run into the same issues that the previous PRs ran into. The deps here are higher versions than the others, but the related issues haven't been closed, so fingers crossed.

@willbeason
Copy link
Contributor Author

Drat! Integration tests failing with timeouts like previous PRs.

@willbeason
Copy link
Contributor Author

willbeason commented Apr 6, 2022

Heads up: I'll be intermittently available tomorrow, and I'm OOO Friday. I'll be back Monday. I'll still be on for a few more hours today.

Looking at first failure: https://gist.github.com/willbeason/e6b372803acbf58d16bc27a533d338ac

So for the TestV3StorageQuotaApply, there's a goroutine leak before the test even starts. That should help narrow down when the goroutine is leaked.

@willbeason
Copy link
Contributor Author

That's odd - running the test in isolation locally passes:

tests$ go test ./integration/ --run TestV3StorageQuotaApply --race --count=10
ok  	go.etcd.io/etcd/tests/v3/integration	14.924s

(note that since I ran the test 10x, the runtime is 10x longer than usual)

So this means that a test which runs before TestV3StorageQuotaApply causes the problem

@willbeason
Copy link
Contributor Author

@serathius I see that while many tests in tests/integration run with integration2.BeforeTest(t) as the first call, not all do. Is that intentional? Since the goroutine is leaked before TestV3StorageQuotaApply is run, then another test must be leaking the goroutine (since we know TestMain does not). Tests which call integration2.BeforeTest(t) check no goroutines are leaked as a cleanup step, so IIUC then one of the few tests which does not call integration2.BeforeTest(t) must be leaking the goroutine.

Do I have this right?

@codecov-commenter
Copy link

codecov-commenter commented Apr 6, 2022

Codecov Report

Merging #13888 (4730691) into main (a5b9f72) will decrease coverage by 0.09%.
The diff coverage is n/a.

❗ Current head 4730691 differs from pull request most recent head 8fab77c. Consider uploading reports for the commit 8fab77c to get more accurate results

@@            Coverage Diff             @@
##             main   #13888      +/-   ##
==========================================
- Coverage   72.39%   72.30%   -0.10%     
==========================================
  Files         469      469              
  Lines       38402    38402              
==========================================
- Hits        27802    27765      -37     
- Misses       8813     8855      +42     
+ Partials     1787     1782       -5     
Flag Coverage Δ
all 72.30% <ø> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
client/v3/snapshot/v3_snapshot.go 0.00% <0.00%> (-54.35%) ⬇️
client/pkg/v3/tlsutil/tlsutil.go 83.33% <0.00%> (-8.34%) ⬇️
server/storage/schema/membership.go 51.82% <0.00%> (-8.03%) ⬇️
server/etcdserver/api/v3rpc/lease.go 74.68% <0.00%> (-7.60%) ⬇️
server/etcdserver/cindex/cindex.go 84.90% <0.00%> (-7.55%) ⬇️
server/etcdserver/api/v3rpc/watch.go 83.55% <0.00%> (-4.37%) ⬇️
etcdutl/snapshot/v3_snapshot.go 61.86% <0.00%> (-4.24%) ⬇️
server/etcdserver/cluster_util.go 70.35% <0.00%> (-3.17%) ⬇️
client/v3/concurrency/election.go 79.68% <0.00%> (-2.35%) ⬇️
server/proxy/grpcproxy/watch.go 93.64% <0.00%> (-1.74%) ⬇️
... and 25 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a5b9f72...8fab77c. Read the comment docs.

@serathius
Copy link
Member

@willbeason yes, we have a lot of flaky tests and have effort to track and fix them. Please ignore the test failures if they are flakes. I will rerun the test if needed.

Please fix the conflict and rebase the PR.

@willbeason
Copy link
Contributor Author

@serathius Done!

@ptabor
Copy link
Contributor

ptabor commented Apr 9, 2022

go: golang.org/x/[email protected][79](https://github.com/etcd-io/etcd/runs/5956315508?check_suite_focus=true#step:5:79)3c2f9a: missing go.sum entry; to add it:
	go mod download golang.org/x/crypto
must be run from 'go.etcd.io/etcd/v3' module directory

@willbeason
Copy link
Contributor Author

Updated, and manually verified locally that "dep" and "bom" pass

client/pkg/testutil/leak.go Outdated Show resolved Hide resolved
Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a nit

@willbeason
Copy link
Contributor Author

@ahrtr Fixed! I don't think I have permissions to merge.

@ptabor
Copy link
Contributor

ptabor commented Apr 13, 2022

Please squash the commits and I will merge as soon as the tests are green.

Downstream users of etcd experience build issues when using dependencies
which require more recent (incompatible) versions of opentelemetry. This
commit upgrades the dependencies so that downstream users stop
experiencing these issues.
@willbeason willbeason force-pushed the upgrade-opentelemetry branch from aabf426 to eab1e0c Compare April 13, 2022 14:14
@willbeason
Copy link
Contributor Author

@ptabor Ok - squashed commits

@ptabor ptabor merged commit 2e034d2 into etcd-io:main Apr 13, 2022
@willbeason willbeason deleted the upgrade-opentelemetry branch April 13, 2022 17:19
This was referenced Apr 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants