Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(diagnostics) add diff endpoints #6131

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open

feat(diagnostics) add diff endpoints #6131

wants to merge 13 commits into from

Conversation

rainest
Copy link
Contributor

@rainest rainest commented Jun 4, 2024

What this PR does / why we need it:

Adds a diff endpoint to the diagnostic server and sets up infrastructure for DB mode updates to ship their diffs to the diagnostic server. This endpoint is available when sensitive config dumps are enabled, as it includes diffs of sensitive resources. We could add filters to exclude certs/passwords/etc. while retaining other items, but for the initial MVP, easiest to just require an opt in to sensitive diagnostic output.

Which issue this PR fixes:

Fix #5289

Special notes for your reviewer:

This requires Kong/go-database-reconciler#119. My original work in GDR was only sending error events; it should have been sending everything. This currently has a tmp commit against to update the module to the branch. That GDR PR needs to merged and tagged as fix release before merging this.

Per Kong/go-database-reconciler#120 I observed an odd bug with mysterious empty events 👻 Dunno where these are from, but trying to figure that out will take more time than is available. As a workaround, the event handler on the KIC side discards any event with no action, as such events should never be of interest.

Manual testing via running TEST_DATABASE_MODE=postgres KONG_TEST_CLUSTER=kind:kind GOFLAGS= dlv test -- -test.run SomeTestThatModifiesConfig. Almost any test will do, since they almost all modify config. You can either set up a watch curl -sv localhost:10256/debug/config/diff-report to see it over time or debugger pause after the diff send to see an individual diff. Output looks something like example.txt.

Unit testing isn't exhaustive, but covers most of the endpoint functionality. It checks the KIC-side behavior, not content, with the expectation that if content's broken, that's GDR's problem.

PR Readiness Checklist:

Complete these before marking the PR as ready to review:

  • the CHANGELOG.md release notes have been updated to reflect any significant (and particularly user-facing) changes introduced by this PR

@rainest rainest added this to the KIC v3.3.x milestone Jun 24, 2024
@rainest
Copy link
Contributor Author

rainest commented Jun 24, 2024

Moving into 3.3 as a stretch. IIRC this should be 50% done, just dropped from 3.2 between time constraints and low priority.

Rename the diagnostics.ConfigDump struct to ClientDiagnostic to reflect
that it now handles multiple types of diagnostic data (config dumps,
config diffs, and fallback status) from the Kong client.
Add a struct to hold diff history in the diagnostic server. It holds a
fixed number of diffs (5, arbitrarily) and provides functions to update
and view history by config hash.

Add a diagnostics function to generate diagnostic diff structs from GDR
event data.

Add a diagnostics update loop case to handle inbound diffs.

Call the diagnostic diff generator when processing an event and append
the diagnostic diff to the diff list to the DB mode event handler.
Dump sensitive config during integration, mostly to support debugging
the diff endpoint. While there's no diff diagnostic integration test,
having it available allows for easier interactive debugging of the
endpoint, or viewing diff states when running DB-backed tests.
Add timestamps and a list of available diffs to the diff endpoint.

Add unit tests for the diff endpoint.
Remove scaffolding comments and unfinished wishlist items. Add missing
doc comments.

Clean up some garbage the linter caught.
@rainest rainest removed the do not merge let the author merge this, don't merge for them. label Aug 2, 2024
@rainest
Copy link
Contributor Author

rainest commented Aug 3, 2024

Fix one bug, find another:

This has somehow broken the envtest for DB mode error event generation. That test should normally function by creating a consumer and tossing it at a mock admin API that always returns an error, which should in turn make GDR bubble up an error event through the result channel, which KIC will then turn into a Kubernetes Event.

Looking at the event handler only shows yet more of the wonderful 👻 events described in Kong/go-database-reconciler#120, though unfortunately I can't get the debugger to stop anywhere in the GDR portion, even at the low level of setting a watchpoint on the channel buffer itself. IDK what the heck's going there, since that's not something the envtest mock stuff should skip.

Tried to pin down the problem to either this or the GDR change:

so whatever causes it appears to be in this PR's new code, but IDK what. The new skip phantom events case isn't on the error handling side and shouldn't affect it, and git bisect isn't giving me good clues ☹️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Output of entity changes in KIC's logs instead of Println in database-reconciler repo
2 participants