chore(tests): Add end-to-end tests with the Datadog Agent #18538
Conversation
check-spelling found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
Regression Detector Results
Run ID: cc081c6f-e1e2-4c2d-9770-9fd6f61ac059. Performance changes are noted in the perf column of each table.
No significant changes in experiment optimization goals (confidence level: 90.00%). There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | syslog_log2metric_humio_metrics | ingress throughput | +1.47 | [+1.38, +1.56] |
➖ | otlp_http_to_blackhole | ingress throughput | +0.85 | [+0.69, +1.01] |
➖ | splunk_hec_route_s3 | ingress throughput | +0.81 | [+0.31, +1.30] |
➖ | datadog_agent_remap_datadog_logs_acks | ingress throughput | +0.60 | [+0.50, +0.71] |
➖ | datadog_agent_remap_blackhole_acks | ingress throughput | +0.48 | [+0.38, +0.59] |
➖ | http_elasticsearch | ingress throughput | +0.38 | [+0.30, +0.45] |
➖ | http_to_http_noack | ingress throughput | +0.23 | [+0.14, +0.33] |
➖ | http_to_s3 | ingress throughput | +0.13 | [-0.15, +0.41] |
➖ | socket_to_socket_blackhole | ingress throughput | +0.10 | [+0.03, +0.17] |
➖ | http_to_http_json | ingress throughput | +0.03 | [-0.05, +0.10] |
➖ | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | +0.00 | [-0.16, +0.16] |
➖ | splunk_hec_indexer_ack_blackhole | ingress throughput | -0.00 | [-0.15, +0.15] |
➖ | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -0.01 | [-0.12, +0.10] |
➖ | enterprise_http_to_http | ingress throughput | -0.06 | [-0.11, -0.01] |
➖ | http_text_to_http_json | ingress throughput | -0.31 | [-0.44, -0.19] |
➖ | syslog_humio_logs | ingress throughput | -0.48 | [-0.59, -0.38] |
➖ | syslog_log2metric_splunk_hec_metrics | ingress throughput | -0.55 | [-0.69, -0.40] |
➖ | otlp_grpc_to_blackhole | ingress throughput | -0.64 | [-0.73, -0.55] |
➖ | http_to_http_acks | ingress throughput | -0.64 | [-1.95, +0.67] |
➖ | datadog_agent_remap_blackhole | ingress throughput | -0.70 | [-0.78, -0.61] |
➖ | syslog_loki | ingress throughput | -0.99 | [-1.06, -0.92] |
➖ | fluent_elasticsearch | ingress throughput | -1.16 | [-1.62, -0.69] |
➖ | syslog_regex_logs2metric_ddmetrics | ingress throughput | -1.20 | [-1.31, -1.08] |
➖ | datadog_agent_remap_datadog_logs | ingress throughput | -1.38 | [-1.46, -1.29] |
➖ | syslog_splunk_hec_logs | ingress throughput | -1.81 | [-1.89, -1.73] |
➖ | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | -1.94 | [-2.07, -1.81] |
➖ | file_to_blackhole | egress throughput | -2.82 | [-5.25, -0.39] |
Explanation
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we consider a change in performance a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
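The three criteria above can be captured in a small sketch (a hypothetical helper for illustration, not code from the regression detector itself):

```python
def is_regression(delta_mean_pct: float,
                  ci_low: float,
                  ci_high: float,
                  erratic: bool = False,
                  effect_size_pct: float = 5.0) -> bool:
    """Apply the regression-detector decision rule to one experiment.

    delta_mean_pct   -- estimated Δ mean % for the optimization goal
    ci_low, ci_high  -- bounds of the 90% confidence interval "Δ mean % CI"
    erratic          -- whether the experiment's config marks it erratic
    """
    # Criterion 1: effect size is big enough to merit a closer look.
    big_enough = abs(delta_mean_pct) >= effect_size_pct
    # Criterion 2: the CI excludes zero (a real difference is likely).
    ci_excludes_zero = ci_low > 0 or ci_high < 0
    # Criterion 3: the experiment is not marked erratic.
    return big_enough and ci_excludes_zero and not erratic
```

For example, `file_to_blackhole`'s -2.82% change with CI [-5.25, -0.39] excludes zero but falls under the 5% effect-size threshold, so it is not flagged.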
Regression Detector Results
Run ID: 600a602d-bc6f-49dc-b507-3ea8dc3cbfa1. Performance changes are noted in the perf column of each table.
No significant changes in experiment optimization goals (confidence level: 90.00%). There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | datadog_agent_remap_blackhole_acks | ingress throughput | +2.07 | [+1.96, +2.19] |
➖ | syslog_splunk_hec_logs | ingress throughput | +1.41 | [+1.36, +1.45] |
➖ | fluent_elasticsearch | ingress throughput | +0.66 | [+0.19, +1.13] |
➖ | datadog_agent_remap_datadog_logs | ingress throughput | +0.65 | [+0.56, +0.73] |
➖ | syslog_log2metric_humio_metrics | ingress throughput | +0.51 | [+0.40, +0.62] |
➖ | syslog_regex_logs2metric_ddmetrics | ingress throughput | +0.46 | [+0.38, +0.53] |
➖ | http_to_s3 | ingress throughput | +0.33 | [+0.05, +0.61] |
➖ | otlp_http_to_blackhole | ingress throughput | +0.32 | [+0.16, +0.49] |
➖ | socket_to_socket_blackhole | ingress throughput | +0.18 | [+0.09, +0.26] |
➖ | http_to_http_noack | ingress throughput | +0.18 | [+0.07, +0.28] |
➖ | datadog_agent_remap_blackhole | ingress throughput | +0.11 | [+0.01, +0.22] |
➖ | http_to_http_json | ingress throughput | +0.05 | [-0.03, +0.13] |
➖ | http_elasticsearch | ingress throughput | +0.02 | [-0.04, +0.09] |
➖ | splunk_hec_indexer_ack_blackhole | ingress throughput | +0.00 | [-0.14, +0.15] |
➖ | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | +0.00 | [-0.16, +0.16] |
➖ | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -0.03 | [-0.14, +0.09] |
➖ | enterprise_http_to_http | ingress throughput | -0.06 | [-0.14, +0.02] |
➖ | syslog_loki | ingress throughput | -0.16 | [-0.23, -0.09] |
➖ | file_to_blackhole | egress throughput | -0.17 | [-2.62, +2.28] |
➖ | splunk_hec_route_s3 | ingress throughput | -0.29 | [-0.79, +0.21] |
➖ | otlp_grpc_to_blackhole | ingress throughput | -0.36 | [-0.44, -0.27] |
➖ | datadog_agent_remap_datadog_logs_acks | ingress throughput | -0.37 | [-0.45, -0.30] |
➖ | syslog_log2metric_splunk_hec_metrics | ingress throughput | -0.64 | [-0.78, -0.50] |
➖ | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | -0.92 | [-1.05, -0.79] |
➖ | syslog_humio_logs | ingress throughput | -1.25 | [-1.38, -1.12] |
➖ | http_to_http_acks | ingress throughput | -1.74 | [-3.05, -0.44] |
➖ | http_text_to_http_json | ingress throughput | -2.46 | [-2.59, -2.32] |
I realized what was causing the integration test suite to fail in the MQ: the change to the integration tests' Dockerfile added the `FEATURES` arg and additional build steps, resulting in a net increase in compilation time for the images used. This should solve the issue with the integration tests failing in the MQ.
Regression Detector Results
Run ID: 0b4465eb-67d3-4027-8cf1-e1f39c81fea9. Performance changes are noted in the perf column of each table.
No significant changes in experiment optimization goals (confidence level: 90.00%). There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.
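A build-arg change of the kind described here can be sketched as follows; the stage names, paths, and default feature set below are illustrative assumptions, not the PR's actual Dockerfile:

```dockerfile
# Hypothetical sketch of a test-runner image taking a FEATURES build arg.
ARG RUST_VERSION=1.72
FROM docker.io/rust:${RUST_VERSION}-slim AS builder

WORKDIR /vector
COPY . .

# Feature flags chosen at build time; each distinct FEATURES value
# invalidates the layer cache for the compile step below, which is
# what lengthens CI image builds.
ARG FEATURES=default
RUN cargo build --tests --no-default-features --features "${FEATURES}"
```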
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | syslog_log2metric_splunk_hec_metrics | ingress throughput | +1.95 | [+1.80, +2.10] |
➖ | syslog_regex_logs2metric_ddmetrics | ingress throughput | +1.14 | [+1.07, +1.22] |
➖ | datadog_agent_remap_blackhole_acks | ingress throughput | +0.76 | [+0.67, +0.85] |
➖ | datadog_agent_remap_datadog_logs | ingress throughput | +0.50 | [+0.42, +0.58] |
➖ | http_to_http_acks | ingress throughput | +0.48 | [-0.84, +1.79] |
➖ | socket_to_socket_blackhole | ingress throughput | +0.22 | [+0.16, +0.28] |
➖ | syslog_log2metric_humio_metrics | ingress throughput | +0.18 | [+0.06, +0.29] |
➖ | http_to_http_noack | ingress throughput | +0.17 | [+0.08, +0.27] |
➖ | syslog_humio_logs | ingress throughput | +0.17 | [+0.07, +0.27] |
➖ | datadog_agent_remap_datadog_logs_acks | ingress throughput | +0.16 | [+0.07, +0.24] |
➖ | file_to_blackhole | egress throughput | +0.11 | [-2.35, +2.57] |
➖ | http_to_http_json | ingress throughput | +0.07 | [-0.01, +0.15] |
➖ | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | +0.00 | [-0.16, +0.16] |
➖ | splunk_hec_indexer_ack_blackhole | ingress throughput | -0.00 | [-0.15, +0.14] |
➖ | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -0.03 | [-0.14, +0.09] |
➖ | http_to_s3 | ingress throughput | -0.07 | [-0.35, +0.21] |
➖ | splunk_hec_route_s3 | ingress throughput | -0.08 | [-0.57, +0.42] |
➖ | enterprise_http_to_http | ingress throughput | -0.10 | [-0.18, -0.01] |
➖ | otlp_grpc_to_blackhole | ingress throughput | -0.19 | [-0.28, -0.10] |
➖ | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | -0.22 | [-0.34, -0.09] |
➖ | otlp_http_to_blackhole | ingress throughput | -0.26 | [-0.40, -0.11] |
➖ | syslog_splunk_hec_logs | ingress throughput | -0.28 | [-0.33, -0.23] |
➖ | http_elasticsearch | ingress throughput | -0.53 | [-0.60, -0.47] |
➖ | fluent_elasticsearch | ingress throughput | -0.90 | [-1.36, -0.43] |
➖ | datadog_agent_remap_blackhole | ingress throughput | -0.93 | [-1.02, -0.85] |
➖ | syslog_loki | ingress throughput | -1.03 | [-1.08, -0.97] |
➖ | http_text_to_http_json | ingress throughput | -1.56 | [-1.68, -1.45] |
Regression Detector Results
Run ID: b1d11b08-aaac-4266-8e73-05d2d50f02f5. Performance changes are noted in the perf column of each table.
No significant changes in experiment optimization goals (confidence level: 90.00%). There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | syslog_log2metric_humio_metrics | ingress throughput | +2.05 | [+1.93, +2.17] |
➖ | otlp_grpc_to_blackhole | ingress throughput | +1.45 | [+1.36, +1.55] |
➖ | splunk_hec_route_s3 | ingress throughput | +1.41 | [+0.90, +1.91] |
➖ | syslog_log2metric_splunk_hec_metrics | ingress throughput | +1.38 | [+1.24, +1.53] |
➖ | datadog_agent_remap_blackhole | ingress throughput | +1.27 | [+1.19, +1.35] |
➖ | fluent_elasticsearch | ingress throughput | +0.62 | [+0.15, +1.09] |
➖ | syslog_loki | ingress throughput | +0.22 | [+0.17, +0.27] |
➖ | http_to_http_noack | ingress throughput | +0.18 | [+0.09, +0.28] |
➖ | syslog_regex_logs2metric_ddmetrics | ingress throughput | +0.07 | [-0.02, +0.16] |
➖ | http_to_s3 | ingress throughput | +0.06 | [-0.22, +0.33] |
➖ | http_to_http_json | ingress throughput | +0.05 | [-0.02, +0.13] |
➖ | http_elasticsearch | ingress throughput | +0.00 | [-0.06, +0.07] |
➖ | splunk_hec_indexer_ack_blackhole | ingress throughput | +0.00 | [-0.14, +0.15] |
➖ | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | +0.00 | [-0.16, +0.16] |
➖ | datadog_agent_remap_datadog_logs | ingress throughput | -0.01 | [-0.09, +0.08] |
➖ | enterprise_http_to_http | ingress throughput | -0.03 | [-0.12, +0.05] |
➖ | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -0.06 | [-0.17, +0.05] |
➖ | datadog_agent_remap_datadog_logs_acks | ingress throughput | -0.28 | [-0.36, -0.19] |
➖ | otlp_http_to_blackhole | ingress throughput | -0.33 | [-0.48, -0.18] |
➖ | socket_to_socket_blackhole | ingress throughput | -0.38 | [-0.44, -0.32] |
➖ | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | -0.40 | [-0.53, -0.26] |
➖ | syslog_humio_logs | ingress throughput | -0.44 | [-0.55, -0.34] |
➖ | datadog_agent_remap_blackhole_acks | ingress throughput | -0.46 | [-0.57, -0.35] |
➖ | syslog_splunk_hec_logs | ingress throughput | -0.48 | [-0.53, -0.42] |
➖ | http_text_to_http_json | ingress throughput | -0.52 | [-0.65, -0.39] |
➖ | file_to_blackhole | egress throughput | -0.59 | [-3.07, +1.90] |
➖ | http_to_http_acks | ingress throughput | -1.10 | [-2.41, +0.22] |
…ev#18538)
* start
* check spelling allow
* reorg
* try consume a log
* trying on logs
* saving progress on vdev changes
* more vdev drafts
* Revert "more vdev drafts" (reverts commit 758a06a)
* Revert "saving progress on vdev changes" (reverts commit 7a69290)
* extract Vector instance to its own container
* cleanup the compose file
* Generate logs on fly
* cleanup
* cleanup
* cleanup
* remove sleep
* add to CI
* spell checker
* fix issue with DD changes detection
* more workflow fix
* increase timeout as experiment
* update vector config for log endpoint field
* touch up
* start
* CI
* feedback bg
* add scaffolding for the differentiation of series API versions
* fix(datadog_metrics sink): fix the integration tests which weren't actually validating anything
* fix workflows
* clippy
* fix filter for traces
* add first pass
* add testing coverage
* cargo.lock
* reduce duplicated code
* cleanup
* clippy
* feedback ds: remove check for sort by name
* feedback ds: extend unit tests for v2
* feedback ds: extend the int test coverage
* Revert "feedback ds: remove check for sort by name" (reverts commit c95a326)
* add explicit sort check
* add env var for v1 support
* check events
* add note in deprecations
* feedback ds
* ds feedback logs PR
* config pain; start aggregation
* feedback ds: move to scripts/e2e
* add v6 and v7 agent versions to test against
* remove dead code allow
* move to e2e dir
* improve comparison logic
* code reduction
* add sketches, some reorg
* cleanup
* add notes about the e2e prefix needed
* re-use runner image
* fixes
* e2e tests now run in CI
* clippy
* remove unused struct
* true up the int test comment changes
* cleanup
* rework the massaging and comparison model
* compare to all versions
* cleanup and address TODO for sketches
* feedback ds: workflows
* feedback ds: simpler compose
* feedback ds: agent versions clarity
* feedback ds: comments
* clippy
* touchups
* feedback bg: assert_eq
* increase timeout workflow
* run in 8 core runner
* check
* separate e2e workflow
* comment
* fix workflows
* prune timing
* allow build_all option to start subcommand
* hack to bypass this temporary issue with CI
* feedback bg
* feedback bg
* feedback ds: metric type check
* add descriptive comment
* spelling
* chore(vdev): refactor e2e tests into own subcommand (vectordotdev#19666)
* merge conflict
* update workflows
* rename
* fix e2e logic
* fix e2e
* fix e2e
* script usage
* touchups to workflows
* separate dockerfile for the e2e tests
* add the Dockerfile
closes: #18534
closes: #18829
closes: #18535
epic: #18533
Adds complete e2e tests with the Datadog Agent and the `datadog_agent` source, `datadog_logs` sink, and `datadog_metrics` sink.
The `vdev` infrastructure for running integration tests was refactored to be re-used for e2e tests, and a new subcommand, `vdev e2e`, was added to execute them.
The main aspect of this setup that differs from the existing integration tests is that the Vector components under test are run in a more black-box fashion: a Vector instance runs in one of the containers alongside the other services in the compose files. To accomplish this, the same image that is built to execute the test runner is re-used for the vector compose service.
For emission of events, an "Emitter" is spun up that either generates fake logs or, in the case of metrics, generates metrics with DogStatsD.
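Emitting a metric with DogStatsD amounts to sending a small plain-text datagram over UDP to the Agent. A minimal sketch of that idea (the metric name, tags, and endpoint are hypothetical; the PR's actual Emitter is separate code):

```python
import socket

def dogstatsd_datagram(name, value, metric_type="c", tags=None):
    # DogStatsD plain-text wire format: <name>:<value>|<type>[|#tag1,tag2]
    # e.g. "c" = counter, "g" = gauge.
    msg = f"{name}:{value}|{metric_type}"
    if tags:
        msg += "|#" + ",".join(tags)
    return msg.encode("utf-8")

def emit(name, value, metric_type="c", tags=None,
         host="127.0.0.1", port=8125):
    # Fire-and-forget UDP send to the Agent's DogStatsD port (8125 by default).
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(dogstatsd_datagram(name, value, metric_type, tags),
                    (host, port))
    finally:
        sock.close()
```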
On the receiving end, a `fakeintake` instance receives the events sent from either the Agent or Vector.
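The black-box topology described in this PR could look roughly like the compose fragment below; the service names, images, and Agent wiring are illustrative assumptions, not the PR's actual compose files:

```yaml
# Illustrative topology only: Vector runs as a peer service,
# receiving events from the Datadog Agent and forwarding them
# to a fakeintake instance that the test runner then queries.
services:
  datadog-agent:
    image: datadog/agent:7
  vector:
    image: vector-e2e-runner:latest   # same image as the test runner
  fakeintake:
    image: datadog/fakeintake:latest
  runner:
    image: vector-e2e-runner:latest
    depends_on:
      - datadog-agent
      - vector
      - fakeintake
```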