Compute release 2025-01-07 #10288

Merged 89 commits on Jan 8, 2025
Commits
2451969
fix(ci): Allow github-action-script to post reports (#10136)
rahulinux Dec 13, 2024
7dc3826
Fix pg_regress tests on a cloud staging instance (#10134)
a-masterov Dec 13, 2024
ce8eb08
Extract public sk types to safekeeper_api (#10137)
arssher Dec 13, 2024
2c91062
test_prefetch: reduce timeout to default 5m from 10m (#10105)
bayandin Dec 13, 2024
fcff752
fix(test_timeline_archival_chaos): flakiness caused by orphan layers …
problame Dec 13, 2024
eeabecd
Correctly update LFC used_pages in case of LFC resize (#10128)
knizhnik Dec 13, 2024
07d1db5
Improve comments and log messages in the logical replication monitor …
tristan957 Dec 13, 2024
7ee5dca
fix(pageserver): race between gc-compaction and repartition (#10127)
skyzh Dec 13, 2024
d56fea6
CI: always require aws-oicd-role-arn input to be set (#10145)
bayandin Dec 13, 2024
2521eba
Check for invalid down link while prefetching B-Tree leave pages for …
knizhnik Dec 13, 2024
cf161e1
fix(adapter): password not set in role drop (#10130)
myrrc Dec 14, 2024
f3ecd5d
pageserver: revert flush backpressure (#8550) (#10135)
erikgrinaker Dec 15, 2024
117c1b5
Do not perform prefetch for temp relations (#10146)
knizhnik Dec 16, 2024
ebcbc1a
pageserver: tighten up code around SLRU dir key handling (#10082)
jcsp Dec 16, 2024
24d6587
chore(proxy): refactor self-signed config (#10154)
conradludgate Dec 16, 2024
1ed0e52
Extract safekeeper http client to separate crate. (#10140)
arssher Dec 16, 2024
c5e3314
Add test restarting compute at WAL page boundary (#10111)
arssher Dec 16, 2024
6565fd4
chore: fix clippy lints 2024-12-06 (#10138)
conradludgate Dec 16, 2024
3d30a7a
pageserver: make `RemoteTimelineClient::schedule_index_upload` infall…
erikgrinaker Dec 16, 2024
2e4c9c5
chore(proxy): remove allow_self_signed from regular proxy (#10157)
conradludgate Dec 16, 2024
59b7ff8
chore(proxy): disallow unwrap and unimplemented (#10142)
conradludgate Dec 16, 2024
28ccda0
test_runner: ignore error in `test_timeline_archival_chaos` (#10161)
erikgrinaker Dec 16, 2024
aa7ab9b
proxy: Allow dumping TLS session keys for debugging (#10163)
cloneable Dec 16, 2024
e226d7a
Fix docker compose with PG17 (#10165)
a-masterov Dec 17, 2024
b0e43c2
postgres_ffi: add `WalStreamDecoder::complete_record()` benchmark (#1…
erikgrinaker Dec 17, 2024
b5833ef
remote_storage: configurable connection pooling for ABS (#10169)
jcsp Dec 17, 2024
2dfd3ca
fix(compute): Report compute_backpressure_throttling_seconds as count…
ololobus Dec 17, 2024
007b13b
Don't build tests in compute image, use ninja (#10149)
myrrc Dec 17, 2024
a55853f
utils: symbolize heap profiles (#10153)
erikgrinaker Dec 17, 2024
7dddbb9
Add pg_repack extension (#10100)
tristan957 Dec 17, 2024
93e9583
[proxy]: Use TLS for cancellation queries (#10152)
awarus Dec 17, 2024
fd23022
storcon: include preferred AZ in compute notifications (#9953)
jcsp Dec 17, 2024
2ee6bc5
chore(proxy): update vendored postgres libs to edition 2021 (#10139)
conradludgate Dec 17, 2024
c52514a
Fix allure report creation on periodic `pg_regress` testing (#10171)
a-masterov Dec 17, 2024
aaf980f
Online checkpoint replication state (#9976)
knizhnik Dec 18, 2024
8569629
Add safekeepers command to storcon_cli for listing (#10151)
arpad-m Dec 18, 2024
1d12efc
fix(pageserver): allow repartition errors during gc-compaction smoke …
skyzh Dec 18, 2024
1668d39
safekeeper: fix typo in allowlist for `/profile/heap` (#10186)
erikgrinaker Dec 18, 2024
d63602c
chore(proxy): fully remove allow-self-signed-compute flag (#10168)
conradludgate Dec 18, 2024
835287b
neon_local: add a `flock` to protect against concurrent execution (#1…
jcsp Dec 18, 2024
3d1c3a8
feat(pageserver): add compact queue http endpoint (#10173)
skyzh Dec 18, 2024
6d3e809
refactor(test): tighten up test_gc_feedback (#10126)
skyzh Dec 18, 2024
61fcf64
Fix flukyness of test_physical_and_logical_replicaiton.py (#10176)
knizhnik Dec 18, 2024
cc138b5
fix(pageserver): run psql in thread to avoid blocking (#10177)
skyzh Dec 19, 2024
a1b0558
fast import: importer: use aws s3 cli (#10162)
problame Dec 19, 2024
43dc034
Run pgbench on 10 GB scale factor on database with n relations (e.g. …
Bodobolero Dec 19, 2024
b135194
proxy: Delay SASL complete message until auth is done (#10189)
cloneable Dec 19, 2024
65042cb
tests: use high IO concurrency in `test_pgdata_import_smoke`, use `ef…
jcsp Dec 19, 2024
afda6d4
storage_scrubber: don't report half-created timelines as corruption (…
jcsp Dec 19, 2024
502d512
safekeeper: lift benchmarking utils into safekeeper crate (#10200)
VladLazar Dec 19, 2024
628451d
safekeeper: short-circuit interpreted wal sender (#10202)
VladLazar Dec 19, 2024
04517c6
Do not reload config file on PS reconnect (#10204)
knizhnik Dec 19, 2024
b89e02f
fix(pageserver): consider partial compaction layer map in layer check…
skyzh Dec 19, 2024
197a89a
Increase default stotrage controller heartbeat interval from 100msec …
knizhnik Dec 19, 2024
9c53b41
fix(pageserver): update remote latest_gc_cutoff after gc-compaction (…
skyzh Dec 19, 2024
f94248a
chore(libs/proxy): refactor tokio-postgres connection control flow (#…
conradludgate Jan 2, 2025
38c7a2a
chore(proxy): pre-load native tls certificates and propagate compute …
conradludgate Jan 2, 2025
b3cd883
Unlock LFC mutex when LFC cache is disabled (#10235)
knizhnik Jan 2, 2025
26600f2
Skip running clippy without default features (#10098)
jcgruenhage Jan 2, 2025
ee22d4c
proxy: Set TCP_NODELAY for compute connections (#10240)
cloneable Jan 2, 2025
8c7dcd2
Set heartbeat interval for chaos test (#10222)
knizhnik Jan 2, 2025
1622fd8
proxy: recognize but ignore the 3 new redis message types (#10197)
knz Jan 2, 2025
56e6ebf
chore: building compute_tools and local_proxy together (#10257)
conradludgate Jan 2, 2025
363ea97
Add more substantial tests for compute migrations (#9811)
tristan957 Jan 2, 2025
cd10c71
compute: Add spec support for disabling LFC resizing (#10132)
sharnoff Jan 2, 2025
eefad27
Inline various migration queries (#10231)
tristan957 Jan 2, 2025
7a598b9
[proxy/docs]imprv: Add local testing section to proxy README (#10230)
awarus Jan 3, 2025
2d4f267
cargo: update diesel, pq-sys (#10256)
jcsp Jan 3, 2025
ba9722a
tests: add upload wait in test_scrubber_physical_gc_ancestors (#10260)
jcsp Jan 3, 2025
c08759f
storcon: verbose logs in rare case of shards not attached yet (#10262)
jcsp Jan 3, 2025
1303cd5
Fix defusing race between Tenant::shutdown and offload_timeline (#10150)
arpad-m Jan 3, 2025
e9d30ed
pageserver: fix a 500 during timeline creation + shutdown (#10259)
jcsp Jan 3, 2025
b33299d
pageserver,safekeeper: disable heap profiling (#10268)
erikgrinaker Jan 3, 2025
1393cc6
Revert "pageserver: revert flush backpressure (#8550) (#10135)" (#10270)
erikgrinaker Jan 3, 2025
a77e87a
pageserver: assert that uploads don't modify indexed layers (#10228)
erikgrinaker Jan 3, 2025
4b2f568
docker: include vanilla debian postgres client (#10269)
jcsp Jan 3, 2025
b368e62
build(deps): bump jinja2 from 3.1.4 to 3.1.5 in the pip group (#10236)
dependabot[bot] Jan 4, 2025
406cca6
Update neon_fixtures.py - remove logs (#10219)
areyou1or0 Jan 6, 2025
fda52a0
feat(proxy): dont trigger error alerts for unknown topics (#10266)
conradludgate Jan 6, 2025
95f1920
cargo: build with frame pointers (#10226)
erikgrinaker Jan 6, 2025
4a6556e
fix(pageserver): ensure GC computes time cutoff using the same start …
skyzh Jan 6, 2025
b342a02
Dockerfile: build with `force-frame-pointers=yes` (#10286)
erikgrinaker Jan 6, 2025
ad7f14d
test_runner: update packages for Python 3.13 (#10285)
bayandin Jan 6, 2025
02f81b6
Fix clippy warning on macOS (#10282)
bayandin Jan 6, 2025
30863c0
libpagestore: timeout = max(0, difference), not min(0, difference) (#…
MMeent Jan 7, 2025
ea84ec3
Split promote-images into promote-images-dev and promote-images-prod …
jcgruenhage Jan 7, 2025
be38123
Fix accounting of dropped prefetched GetPage requests (#10276)
MMeent Jan 7, 2025
6292d93
Compute release 2025-01-07
github-actions[bot] Jan 7, 2025
31bd2dc
Fix promote-images-prod after splitting it out (#10292)
jcgruenhage Jan 7, 2025
10 changes: 10 additions & 0 deletions .cargo/config.toml
@@ -3,6 +3,16 @@
 # by the RUSTDOCFLAGS env var in CI.
 rustdocflags = ["-Arustdoc::private_intra_doc_links"]

+# Enable frame pointers. This may have a minor performance overhead, but makes it easier and more
+# efficient to obtain stack traces (and thus CPU/heap profiles). It may also avoid seg faults that
+# we've seen with libunwind-based profiling. See also:
+#
+# * <https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html>
+# * <https://github.com/rust-lang/rust/pull/122646>
+#
+# NB: the RUSTFLAGS envvar will replace this. Make sure to update e.g. Dockerfile as well.
+rustflags = ["-Cforce-frame-pointers=yes"]
+
 [alias]
 build_testing = ["build", "--features", "testing"]
 neon = ["run", "--bin", "neon_local"]
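
Note on the `NB` in the comment above: Cargo treats the `RUSTFLAGS` environment variable and `build.rustflags` in `.cargo/config.toml` as mutually exclusive sources, so a build that sets `RUSTFLAGS` drops the config value rather than appending to it. A minimal sketch of a wrapper that keeps frame pointers in that case follows; `with_frame_pointers` is a hypothetical helper for illustration and is not part of this change.

```shell
# Hypothetical helper (an assumption, not from this PR): print the given
# flag string with -Cforce-frame-pointers=yes appended unless it is
# already present as a separate word.
with_frame_pointers() {
  case " ${1:-} " in
    *" -Cforce-frame-pointers=yes "*)
      # Flag already present: pass the input through unchanged.
      printf '%s\n' "${1:-}"
      ;;
    *)
      # Append the flag, inserting a separating space only when the
      # input is non-empty.
      printf '%s\n' "${1:+$1 }-Cforce-frame-pointers=yes"
      ;;
  esac
}

# Usage (not run here):
#   RUSTFLAGS="$(with_frame_pointers "${RUSTFLAGS:-}")" cargo build
```

Any script or Dockerfile that exports `RUSTFLAGS` would need the flag added in some such way, which is presumably why the comment asks for the Dockerfile to be updated alongside this file.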
14 changes: 6 additions & 8 deletions .github/actions/allure-report-generate/action.yml
@@ -7,10 +7,9 @@ inputs:
 type: boolean
 required: false
 default: false
-aws_oicd_role_arn:
-description: 'the OIDC role arn to (re-)acquire for allure report upload - if not set call must acquire OIDC role'
-required: false
-default: ''
+aws-oicd-role-arn:
+description: 'OIDC role arn to interract with S3'
+required: true

 outputs:
 base-url:
@@ -84,12 +83,11 @@ runs:
 ALLURE_VERSION: 2.27.0
 ALLURE_ZIP_SHA256: b071858fb2fa542c65d8f152c5c40d26267b2dfb74df1f1608a589ecca38e777

-- name: (Re-)configure AWS credentials # necessary to upload reports to S3 after a long-running test
-if: ${{ !cancelled() && (inputs.aws_oicd_role_arn != '') }}
-uses: aws-actions/configure-aws-credentials@v4
+- uses: aws-actions/configure-aws-credentials@v4
+if: ${{ !cancelled() }}
 with:
 aws-region: eu-central-1
-role-to-assume: ${{ inputs.aws_oicd_role_arn }}
+role-to-assume: ${{ inputs.aws-oicd-role-arn }}
 role-duration-seconds: 3600 # 1 hour should be more than enough to upload report

 # Potentially we could have several running build for the same key (for example, for the main branch), so we use improvised lock for this
14 changes: 6 additions & 8 deletions .github/actions/allure-report-store/action.yml
@@ -8,10 +8,9 @@ inputs:
 unique-key:
 description: 'string to distinguish different results in the same run'
 required: true
-aws_oicd_role_arn:
-description: 'the OIDC role arn to (re-)acquire for allure report upload - if not set call must acquire OIDC role'
-required: false
-default: ''
+aws-oicd-role-arn:
+description: 'OIDC role arn to interract with S3'
+required: true

 runs:
 using: "composite"
@@ -36,12 +35,11 @@ runs:
 env:
 REPORT_DIR: ${{ inputs.report-dir }}

-- name: (Re-)configure AWS credentials # necessary to upload reports to S3 after a long-running test
-if: ${{ !cancelled() && (inputs.aws_oicd_role_arn != '') }}
-uses: aws-actions/configure-aws-credentials@v4
+- uses: aws-actions/configure-aws-credentials@v4
+if: ${{ !cancelled() }}
 with:
 aws-region: eu-central-1
-role-to-assume: ${{ inputs.aws_oicd_role_arn }}
+role-to-assume: ${{ inputs.aws-oicd-role-arn }}
 role-duration-seconds: 3600 # 1 hour should be more than enough to upload report

 - name: Upload test results
12 changes: 5 additions & 7 deletions .github/actions/download/action.yml
@@ -15,19 +15,17 @@ inputs:
 prefix:
 description: "S3 prefix. Default is '${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"
 required: false
-aws_oicd_role_arn:
-description: "the OIDC role arn for aws auth"
-required: false
-default: ""
+aws-oicd-role-arn:
+description: 'OIDC role arn to interract with S3'
+required: true

 runs:
 using: "composite"
 steps:
-- name: Configure AWS credentials
-uses: aws-actions/configure-aws-credentials@v4
+- uses: aws-actions/configure-aws-credentials@v4
 with:
 aws-region: eu-central-1
-role-to-assume: ${{ inputs.aws_oicd_role_arn }}
+role-to-assume: ${{ inputs.aws-oicd-role-arn }}
 role-duration-seconds: 3600

 - name: Download artifact
25 changes: 12 additions & 13 deletions .github/actions/run-python-test-set/action.yml
@@ -48,10 +48,9 @@ inputs:
 description: 'benchmark durations JSON'
 required: false
 default: '{}'
-aws_oicd_role_arn:
-description: 'the OIDC role arn to (re-)acquire for allure report upload - if not set call must acquire OIDC role'
-required: false
-default: ''
+aws-oicd-role-arn:
+description: 'OIDC role arn to interract with S3'
+required: true

 runs:
 using: "composite"
@@ -62,7 +61,7 @@ runs:
 with:
 name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build_type }}-artifact
 path: /tmp/neon
-aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
+aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}

 - name: Download Neon binaries for the previous release
 if: inputs.build_type != 'remote'
@@ -71,7 +70,7 @@ runs:
 name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build_type }}-artifact
 path: /tmp/neon-previous
 prefix: latest
-aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
+aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}

 - name: Download compatibility snapshot
 if: inputs.build_type != 'remote'
@@ -83,7 +82,7 @@ runs:
 # The lack of compatibility snapshot (for example, for the new Postgres version)
 # shouldn't fail the whole job. Only relevant test should fail.
 skip-if-does-not-exist: true
-aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
+aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}

 - name: Checkout
 if: inputs.needs_postgres_source == 'true'
@@ -221,19 +220,19 @@ runs:
# The lack of compatibility snapshot shouldn't fail the job
# (for example if we didn't run the test for non build-and-test workflow)
skip-if-does-not-exist: true
aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}

- name: (Re-)configure AWS credentials # necessary to upload reports to S3 after a long-running test
if: ${{ !cancelled() && (inputs.aws_oicd_role_arn != '') }}
uses: aws-actions/configure-aws-credentials@v4
- uses: aws-actions/configure-aws-credentials@v4
if: ${{ !cancelled() }}
with:
aws-region: eu-central-1
role-to-assume: ${{ inputs.aws_oicd_role_arn }}
role-to-assume: ${{ inputs.aws-oicd-role-arn }}
role-duration-seconds: 3600 # 1 hour should be more than enough to upload report

- name: Upload test results
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-store
with:
report-dir: /tmp/test_output/allure/results
unique-key: ${{ inputs.build_type }}-${{ inputs.pg_version }}
aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}
4 changes: 2 additions & 2 deletions .github/actions/save-coverage-data/action.yml
@@ -14,11 +14,11 @@ runs:
 name: coverage-data-artifact
 path: /tmp/coverage
 skip-if-does-not-exist: true # skip if there's no previous coverage to download
-aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
+aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}

 - name: Upload coverage data
 uses: ./.github/actions/upload
 with:
 name: coverage-data-artifact
 path: /tmp/coverage
-aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
+aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}
4 changes: 2 additions & 2 deletions .github/actions/upload/action.yml
@@ -14,7 +14,7 @@ inputs:
 prefix:
 description: "S3 prefix. Default is '${GITHUB_SHA}/${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"
 required: false
-aws_oicd_role_arn:
+aws-oicd-role-arn:
 description: "the OIDC role arn for aws auth"
 required: false
 default: ""
@@ -61,7 +61,7 @@ runs:
 uses: aws-actions/configure-aws-credentials@v4
 with:
 aws-region: eu-central-1
-role-to-assume: ${{ inputs.aws_oicd_role_arn }}
+role-to-assume: ${{ inputs.aws-oicd-role-arn }}
 role-duration-seconds: 3600

 - name: Upload artifact
2 changes: 1 addition & 1 deletion .github/workflows/_benchmarking_preparation.yml
@@ -70,7 +70,7 @@ jobs:
 name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
 path: /tmp/neon/
 prefix: latest
-aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
+aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

 # we create a table that has one row for each database that we want to restore with the status whether the restore is done
 - name: Create benchmark_restore_status table if it does not exist
4 changes: 2 additions & 2 deletions .github/workflows/_build-and-test-locally.yml
@@ -264,7 +264,7 @@ jobs:
 with:
 name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-artifact
 path: /tmp/neon
-aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
+aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

 # XXX: keep this after the binaries.list is formed, so the coverage can properly work later
 - name: Merge and upload coverage data
@@ -308,7 +308,7 @@ jobs:
 real_s3_region: eu-central-1
 rerun_failed: true
 pg_version: ${{ matrix.pg_version }}
-aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
+aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
 env:
 TEST_RESULT_CONNSTR: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}
 CHECK_ONDISK_DATA_COMPATIBILITY: nonempty
2 changes: 1 addition & 1 deletion .github/workflows/actionlint.yml
@@ -33,7 +33,7 @@ jobs:
 # SC2086 - Double quote to prevent globbing and word splitting. - https://www.shellcheck.net/wiki/SC2086
 SHELLCHECK_OPTS: --exclude=SC2046,SC2086
 with:
-fail_on_error: true
+fail_level: error
 filter_mode: nofilter
 level: error