-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: jaeger traces not persisted across container restarts #82
Conversation
Your org has enabled the Graphite merge queue for merging into mainAdd the label “graphite/merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link. |
This stack of pull requests is managed by Graphite. Learn more about stacking. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #82 +/- ##
==========================================
+ Coverage 45.72% 48.09% +2.36%
==========================================
Files 56 56
Lines 3427 3541 +114
==========================================
+ Hits 1567 1703 +136
+ Misses 1655 1625 -30
- Partials 205 213 +8 ☔ View full report in Codecov by Sentry. |
WalkthroughThe changes in this pull request involve modifications to the Changes
Assessment against linked issues
Possibly related PRs
Suggested reviewers
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
15f5e42
to
238e710
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range comments (1)
common/docker/service/nakama.go (1)
Line range hint
39-54
: Consider revising the telemetry architecture.While enabling traces by default helps ensure they are captured for persistence (addressing WORLD-1208), the current changes create an inconsistent architecture where telemetry sub-features (tracing, metrics) can be active even when the main telemetry feature is disabled. This could lead to:
- Confusion in configuration management
- Unexpected resource usage
- Security implications from always-on metrics endpoints
Consider implementing a clearer hierarchy:
cfg.Telemetry
as the master switch- Individual features (tracing, metrics) configurable only when telemetry is enabled
- Document the architecture decisions in comments
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
📒 Files selected for processing (4)
- .gitignore (1 hunks)
- common/docker/service/evm.go (1 hunks)
- common/docker/service/jaeger.go (2 hunks)
- common/docker/service/nakama.go (1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
common/docker/service/evm.go
[warning] 31-31: common/docker/service/evm.go#L31
Added line #L31 was not covered by tests
🔇 Additional comments (7)
.gitignore (1)
37-37
: LGTM! Verify build scripts compatibility.The explicit relative path
./world
is clearer than the implicit path. However, let's ensure this doesn't affect any build scripts.✅ Verification successful
Let me check the content of README.md to understand how the binary is referenced there.
Let me check if there are any other gitignore patterns in the repository that might need standardization, and also verify if there are any build scripts that we might have missed in our previous search.
Change to
./world
in .gitignore is safe and consistentThe change to use
./world
is safe and follows good practices:
- No build scripts found that could be affected
- README.md only references the binary in documentation context
- Other binary ignores in .gitignore (like
world-cli
) follow similar relative path patterns- The change makes the path resolution more explicit without changing functionality
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check if any build scripts or documentation reference the binary path # Test: Search for references to the binary. Expect: Consistent usage of the path. rg -l 'world' --type sh rg -l 'world' --type mdLength of output: 206
Script:
#!/bin/bash # Check how the binary is referenced in README.md rg 'world' README.md -C 2Length of output: 1375
Script:
#!/bin/bash # Check the content of .gitignore for similar patterns cat .gitignore # Search for build scripts in any directory fd -e sh -e bashLength of output: 618
common/docker/service/jaeger.go (2)
7-7
: LGTM: Required import for volume mount configuration.The added mount package import is necessary for implementing persistent storage.
23-33
: LGTM: Proper configuration for persistent storage using Badger.The environment variables correctly configure Badger as a persistent storage backend, addressing the issue of trace data loss during container restarts.
Security consideration: Verify CORS configuration.
The
Access-Control-Allow-Origin:*
header allows requests from any origin. While this might be acceptable for local development, it could pose security risks if accidentally used in production.✅ Verification successful
The previous search might have been too specific. Let's broaden our search to understand the deployment context and verify if this configuration is indeed only used for local development as commented in the code.
Let's examine how this Jaeger configuration is used in the broader context and if there are any production-specific overrides.
We need to verify if this is strictly a development environment configuration or if it could potentially be used in production.
CORS configuration is safe as it's strictly for local development
The CORS setting is part of a local development configuration, as evidenced by:
- The code is explicitly commented as "not recommended to be used for production environment"
- This configuration is part of the development tooling (
dev.go
andstart.go
) used for local development and testing- No production deployment configurations were found that could inherit this CORS setting
The permissive CORS header is acceptable in this context as it's isolated to local development environments.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check if there are any production deployment configurations that might inherit this CORS setting rg -l "QUERY_ADDITIONAL_HEADERS.*Access-Control-Allow-Origin" --type yaml --type jsonLength of output: 87
Script:
#!/bin/bash # Let's check for any deployment or configuration files that might contain Jaeger settings fd -e yaml -e yml -e json | xargs rg -l "jaeger|JAEGER" # Also check for any Docker Compose or Kubernetes deployment files fd "docker-compose|deployment" -e yaml -e yml | xargs catLength of output: 181
Script:
#!/bin/bash # Let's check the content of nakama.go to understand how Jaeger is used cat common/docker/service/nakama.go # Also check if there are any other files that might reference this Docker service configuration rg -l "docker/service" --type goLength of output: 3692
Script:
#!/bin/bash # Let's check the dev and start commands to understand the deployment context cat cmd/world/cardinal/dev.go cat cmd/world/cardinal/start.go # Also check if there are any production deployment related files fd "prod|production" -e go -e yaml -e yml | xargs catLength of output: 13207
common/docker/service/evm.go (2)
31-31
: Verify the relevance of this change to the PR objective.This change (removing the linting directive and modifying the default value assignment) appears unrelated to the PR's objective of fixing Jaeger trace persistence. Please clarify if this is an intentional cleanup or should be moved to a separate PR.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 31-31: common/docker/service/evm.go#L31
Added line #L31 was not covered by tests
31-31
: 🧹 Nitpick (assertive)Add test coverage for default faucet configuration.
This line sets a critical default configuration for the faucet feature but lacks test coverage. Consider adding tests to verify the default behavior.
Would you like me to help generate test cases for the default configuration handling?
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 31-31: common/docker/service/evm.go#L31
Added line #L31 was not covered by testscommon/docker/service/nakama.go (2)
39-41
: Review the telemetry and tracing logic consistency.The change makes tracing enabled by default, even when telemetry is disabled (
!cfg.Telemetry
). This creates a potentially confusing situation where a disabled telemetry feature still produces traces.Consider one of these approaches:
- Keep telemetry as the master switch:
if traceEnabled == "" || !cfg.Telemetry { - traceEnabled = "true" + traceEnabled = strconv.FormatBool(cfg.Telemetry) }
- Document why traces should be collected independently of the telemetry setting
Line range hint
48-54
: Review metrics configuration behavior and security implications.The changes introduce several concerns:
- Metrics are now always enabled by default, potentially consuming resources unnecessarily
- The NAKAMA_METRICS_ENABLED environment variable is only checked when telemetry is enabled, creating an inconsistent configuration pattern
- The Prometheus endpoint (port 9100) will be exposed by default, which could be a security consideration
Consider this alternative implementation:
- metricsEnabled := true + metricsEnabled := cfg.Telemetry if cfg.Telemetry { cfgMetricsEnabled, err := strconv.ParseBool(cfg.DockerEnv["NAKAMA_METRICS_ENABLED"]) if err == nil { metricsEnabled = cfgMetricsEnabled } }Let's verify the metrics configuration in other services:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
📒 Files selected for processing (1)
- common/docker/service/jaeger.go (2 hunks)
🔇 Additional comments (3)
common/docker/service/jaeger.go (3)
7-7
: LGTM: Import added for Docker volume mount configuration.
45-46
: Volume mount configuration implements persistence correctly.
23-33
: Storage configuration aligns with persistence requirements.The environment variables correctly configure Badger as a persistent storage backend, addressing the trace persistence issue. The comments effectively document the local development context.
Let's verify the Badger configuration matches Jaeger's recommendations:
✅ Verification successful
Storage configuration is appropriate for local development
The Badger storage configuration is used exclusively in the local development context, as evidenced by:
- The configuration exists only in
common/docker/service/jaeger.go
- The code includes clear comments about its intended use for local development
- No production configurations were found that would conflict with this setup
- The configuration aligns with basic Badger persistence requirements
The environment variables are properly isolated to the development environment, with separate Jaeger configurations (
JAEGER_ADDR
andJAEGER_SAMPLE_RATE
) being used in the Nakama service configuration.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify Badger configuration against Jaeger documentation # Expected: Should find matching configuration in Jaeger's documentation or examples rg -l "SPAN_STORAGE_TYPE=badger" . rg -l "BADGER_EPHEMERAL=false" .Length of output: 135
Script:
#!/bin/bash # Let's search for any other Badger-related configurations or documentation in the codebase rg -i "badger.*storage" . # Also search for any comments or documentation about storage configuration rg -i "storage.*configuration" . # Check if there are any other environment variables related to Jaeger rg "JAEGER_" .Length of output: 399
Closes: WORLD-1208
Overview
Previously the Jaeger container only stores the spans in memory, which is lost when the container is restarted. This PR fixes this by configuring Jaeger to use Badger as the span storage engine with a Docker volume.
Brief Changelog
Testing and Verifying
Manually tested and verified.
Summary by CodeRabbit
New Features
Chores
.gitignore
to refine binary path specifications.