tests: fail tests which write too much data #9537

Draft
wants to merge 3 commits into
base: main
from

Conversation

jcsp
Collaborator

@jcsp jcsp commented Oct 28, 2024

Problem

We see tests fail on service startup in a way that suggests some other rogue tests might be loading the test machines too heavily.

Summary of changes

  • Fail a test during teardown if it wrote more than 128MB of data to disk. This should give us a shortlist of tests that may be overloading the system.
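
As a rough illustration of the teardown check described above, here is a minimal sketch, not the code from this PR. It assumes the test's working directory is known at teardown and uses the size of the files left on disk as a proxy for "data written", which undercounts data that was written and later deleted. The names `MAX_TEST_DATA_BYTES`, `directory_size_bytes` and `assert_test_data_within_limit` are hypothetical.

```python
import os
import stat
from pathlib import Path

# Threshold taken from the PR description; the constant name is hypothetical.
MAX_TEST_DATA_BYTES = 128 * 1024 * 1024


def directory_size_bytes(root: Path) -> int:
    """Sum the apparent sizes of all regular files under root (symlinks are not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except FileNotFoundError:
                # A file may vanish while services are shutting down.
                continue
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def assert_test_data_within_limit(root: Path, limit: int = MAX_TEST_DATA_BYTES) -> None:
    """Raise AssertionError if the files under root exceed limit bytes in total."""
    size = directory_size_bytes(root)
    assert size <= limit, (
        f"test left {size} bytes under {root}, exceeding the limit of {limit} bytes"
    )


# Assumed usage: call assert_test_data_within_limit(test_output_dir) from the
# teardown half of a pytest fixture, i.e. after its `yield` statement.
```

Measuring at teardown keeps the check out of the test's hot path, at the cost of only seeing the final on-disk state.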

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@jcsp jcsp force-pushed the jcsp/test-fail-on-too-much-data branch from 23a37c3 to 7635345 Compare October 28, 2024 12:39

github-actions bot commented Oct 28, 2024

7282 tests run: 6749 passed, 169 failed, 364 skipped (full report)


Failures on Postgres 17

Failures on Postgres 16

Failures on Postgres 15

Failures on Postgres 14

# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_branching_with_pgbench[release-pg14-flat-1-10] or test_branching_with_pgbench[release-pg14-flat-1-10] or test_branching_with_pgbench[release-pg14-cascade-1-10] or test_branching_with_pgbench[release-pg14-cascade-1-10] or test_pageserver_compaction_smoke[release-pg14-vanilla] or test_pageserver_compaction_smoke[release-pg14-vanilla] or test_pageserver_compaction_smoke[release-pg14-interpreted] or test_pageserver_compaction_smoke[release-pg14-interpreted] or test_hot_standby_feedback[release-pg14] or test_hot_standby_feedback[release-pg14] or test_pgdata_import_smoke[release-pg14-8-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_pgdata_import_smoke[release-pg14-8-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_pgdata_import_smoke[release-pg14-None-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_pgdata_import_smoke[release-pg14-None-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_basic_eviction[release-pg14] or test_basic_eviction[release-pg14] or test_on_demand_wal_download[release-pg14] or test_on_demand_wal_download[release-pg14] or test_pg_regress[release-pg14-None] or test_pg_regress[release-pg14-None] or test_tx_abort_with_many_relations[release-pg14] or test_tx_abort_with_many_relations[release-pg14] or test_pg_regress[release-pg14-4] or test_pg_regress[release-pg14-4] or test_check_visibility_map[release-pg14] or test_check_visibility_map[release-pg14] or test_peer_recovery[release-pg14] or test_branching_with_pgbench[release-pg15-flat-1-10] or test_branching_with_pgbench[release-pg15-flat-1-10] or test_branching_with_pgbench[release-pg15-cascade-1-10] or test_branching_with_pgbench[release-pg15-cascade-1-10] or test_pageserver_compaction_smoke[release-pg15-vanilla] or test_pageserver_compaction_smoke[release-pg15-vanilla] or test_pageserver_compaction_smoke[release-pg15-interpreted] or test_pageserver_compaction_smoke[release-pg15-interpreted] or test_hot_standby_feedback[release-pg15] or 
test_hot_standby_feedback[release-pg15] or test_pgdata_import_smoke[release-pg15-8-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_pgdata_import_smoke[release-pg15-8-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_pgdata_import_smoke[release-pg15-None-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_pgdata_import_smoke[release-pg15-None-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_basic_eviction[release-pg15] or test_basic_eviction[release-pg15] or test_lfc_resize[release-pg15] or test_on_demand_wal_download[release-pg15] or test_on_demand_wal_download[release-pg15] or test_pg_regress[release-pg15-None] or test_pg_regress[release-pg15-None] or test_tx_abort_with_many_relations[release-pg15] or test_tx_abort_with_many_relations[release-pg15] or test_pg_regress[release-pg15-4] or test_pg_regress[release-pg15-4] or test_check_visibility_map[release-pg15] or test_check_visibility_map[release-pg15] or test_branching_with_pgbench[release-pg16-flat-1-10] or test_branching_with_pgbench[release-pg16-flat-1-10] or test_branching_with_pgbench[release-pg16-cascade-1-10] or test_branching_with_pgbench[release-pg16-cascade-1-10] or test_pageserver_compaction_smoke[release-pg16-vanilla] or test_pageserver_compaction_smoke[release-pg16-vanilla] or test_pageserver_compaction_smoke[release-pg16-interpreted] or test_pageserver_compaction_smoke[release-pg16-interpreted] or test_hot_standby_feedback[release-pg16] or test_hot_standby_feedback[release-pg16] or test_pgdata_import_smoke[release-pg16-None-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_pgdata_import_smoke[release-pg16-None-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_pgdata_import_smoke[release-pg16-8-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_pgdata_import_smoke[release-pg16-8-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS] or test_basic_eviction[release-pg16] or test_basic_eviction[release-pg16] or test_on_demand_wal_download[release-pg16] or 
test_on_demand_wal_download[release-pg16] or test_pg_regress[release-pg16-4] or test_pg_regress[release-pg16-4] or test_pg_regress[release-pg16-None] or test_pg_regress[release-pg16-None] or test_tx_abort_with_many_relations[release-pg16] or test_tx_abort_with_many_relations[release-pg16] or test_check_visibility_map[release-pg16] or test_check_visibility_map[release-pg16] or test_branching_with_pgbench[release-pg17-flat-1-10] or test_branching_with_pgbench[release-pg17-flat-1-10] or test_branching_with_pgbench[debug-pg17-flat-1-10] or test_branching_with_pgbench[release-pg17-flat-1-10] or test_branching_with_pgbench[release-pg17-flat-1-10] or test_branching_with_pgbench[release-pg17-cascade-1-10] or test_branching_with_pgbench[release-pg17-cascade-1-10] or test_branching_with_pgbench[debug-pg17-cascade-1-10] or test_branching_with_pgbench[release-pg17-cascade-1-10] or test_branching_with_pgbench[release-pg17-cascade-1-10] or test_pageserver_compaction_smoke[release-pg17-interpreted] or test_pageserver_compaction_smoke[release-pg17-interpreted] or test_pageserver_compaction_smoke[release-pg17-interpreted] or test_pageserver_compaction_smoke[release-pg17-interpreted] or test_pageserver_compaction_smoke[release-pg17-vanilla] or test_pageserver_compaction_smoke[release-pg17-vanilla] or test_pageserver_compaction_smoke[release-pg17-vanilla] or test_pageserver_compaction_smoke[release-pg17-vanilla] or test_combocid[debug-pg17] or test_backward_compatibility[debug-pg17] or test_forward_compatibility[debug-pg17] or test_versions_mismatch[debug-pg17-combination_nnnnn] or test_versions_mismatch[debug-pg17-combination_ooonn] or test_versions_mismatch[debug-pg17-combination_ononn] or test_versions_mismatch[debug-pg17-combination_onnnn] or test_versions_mismatch[debug-pg17-combination_nnnoo] or test_compute_catalog[debug-pg17] or test_fast_growing_tenant[debug-pg17-relative_spare] or test_fast_growing_tenant[debug-pg17-relative_equal] or 
test_pageserver_respects_overridden_resident_size[debug-pg17-relative_equal] or test_broken_tenants_are_skipped[debug-pg17] or test_pageserver_evicts_until_pressure_is_relieved[debug-pg17-relative_equal] or test_partial_evict_tenant[debug-pg17-relative_spare] or test_partial_evict_tenant[debug-pg17-relative_equal] or test_pageserver_falls_back_to_global_lru[debug-pg17-relative_equal] or test_secondary_mode_eviction[debug-pg17] or test_gc_aggressive[debug-pg17] or test_hot_standby_feedback[release-pg17] or test_hot_standby_feedback[release-pg17] or test_hot_standby_feedback[debug-pg17] or test_hot_standby_feedback[release-pg17] or test_hot_standby_feedback[release-pg17] or test_large_schema[debug-pg17] or test_basic_eviction[release-pg17] or test_basic_eviction[release-pg17] or test_basic_eviction[release-pg17] or test_basic_eviction[release-pg17] or test_gc_of_remote_layers[debug-pg17] or test_lfc_resize[release-pg17] or test_ondemand_download_timetravel[debug-pg17] or test_on_demand_wal_download[release-pg17] or test_on_demand_wal_download[release-pg17] or test_on_demand_wal_download[debug-pg17] or test_on_demand_wal_download[release-pg17] or test_on_demand_wal_download[release-pg17] or test_pageserver_reconnect[debug-pg17] or test_pageserver_restarts_under_worload[debug-pg17] or test_pg_regress[release-pg17-None] or test_pg_regress[release-pg17-None] or test_pg_regress[debug-pg17-None] or test_pg_regress[release-pg17-None] or test_pg_regress[release-pg17-None] or test_tx_abort_with_many_relations[release-pg17] or test_tx_abort_with_many_relations[release-pg17] or test_tx_abort_with_many_relations[release-pg17] or test_tx_abort_with_many_relations[release-pg17] or test_pg_regress[release-pg17-4] or test_pg_regress[release-pg17-4] or test_pg_regress[debug-pg17-4] or test_pg_regress[release-pg17-4] or test_pg_regress[release-pg17-4] or test_isolation[debug-pg17-4] or test_isolation[debug-pg17-None] or test_timetravel[debug-pg17] or 
test_replica_start_with_too_many_unused_xids[debug-pg17] or test_background_operation_cancellation[debug-pg17] or test_subscriber_restart[debug-pg17] or test_timelines_parallel_endpoints[debug-pg17] or test_tenants_many[debug-pg17] or test_threshold_based_eviction[debug-pg17] or test_check_visibility_map[release-pg17] or test_check_visibility_map[release-pg17] or test_check_visibility_map[release-pg17] or test_check_visibility_map[release-pg17] or test_check_visibility_map[debug-pg17] or test_replace_safekeeper[debug-pg17] or test_pull_timeline[debug-pg17-False] or test_pull_timeline[debug-pg17-True] or test_concurrent_computes[debug-pg17]"

Test coverage report is not available

The comment gets automatically updated with the latest test results
cc14a84 at 2025-01-10T18:07:25.706Z :recycle:

@jcsp jcsp force-pushed the jcsp/test-fail-on-too-much-data branch 3 times, most recently from afd4404 to 7ef9d33 Compare December 19, 2024 10:16
@jcsp jcsp force-pushed the jcsp/test-fail-on-too-much-data branch from 7ef9d33 to 434b0aa Compare January 10, 2025 11:57
github-merge-queue bot pushed a commit that referenced this pull request Jan 10, 2025
## Problem

I noticed in #9537 that tests
which work with compat snapshots were writing several hundred MB of
data, which isn't really necessary.

Also, the snapshots are large but lack a proper variety of
storage format features, e.g. they might contain only L0 deltas.

## Summary of changes

- Use a smaller scale factor and shorter runtime to generate less data
- Configure a small layer size and force image layer generation, so
that our output contains L1 deltas and image layers and has a decent
number of entries in the layer map
github-merge-queue bot pushed a commit that referenced this pull request Jan 10, 2025
## Problem

This test writes ~5GB of data. It is not suitable to run in parallel
with all the other small tests in test_runner/regress.

via #9537 

## Summary of changes

- Move test_parallel_copy into the performance directory, so that it
does not run in parallel with other tests
github-merge-queue bot pushed a commit that referenced this pull request Jan 10, 2025
## Problem

These two tests came up in #9537 as doing multi-gigabyte I/O, and from
inspection of the tests it doesn't seem like they need that to fulfil
their purpose.

## Summary of changes

- In test_local_file_cache_unlink, run fewer background threads with a
smaller number of rows. These background threads AFAICT exist to make
sure some I/O is going on while we unlink the LFC directory, but 5
threads should be enough for "some".
- In test_lfc_resize, tweak the test to validate that the cache size is
larger than the final size before resizing it, so that we're sure we're
writing enough data to really be doing something. Then decrease the
pgbench scale.