Ao related commits cherry-pick #884

reshke · 2025-01-21T19:36:26Z

Fixes #ISSUE_Number

What does this PR do?

Type of Change

Bug fix (non-breaking change)
New feature (non-breaking change)
Breaking change (fix or feature with breaking changes)
Documentation update

Breaking Changes

Test Plan

Unit tests added/updated
Integration tests added/updated
Passed make installcheck
Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Followed contribution guide
Added/updated documentation
Reviewed code for security implications
Requested review from cloudberry committers

Additional Context

CI Skip Instructions

This change is an extension of the PR #13246. The commit c121bcd83c9 fixes the bug for ALTER TABLE EXPAND TABLE and ALTER TABLE SET DISTRIBUTED BY operations on AO tables with indexes. These operations on AO tables with indexes are executed with CT + IIS whereas on AO tables without indexes which are executed with CTAS The bug about not preserving storage options is also present for ALTER TABLE on AO tables without indexes case. This change fixes that bug by adding the storage options preservation logic to AT calls for AO tables which go through an internal CTAS. Co-authored-by: Soumyadeep Chakraborty <[email protected]>

follow up #13213 and #13221. dispatch temp namespace oid in every query. on writer gang skips set temp namespace oid, the writer gang will do InitTempTableNamespace(). on reader gang always use the oid from query message, allow temp NS oid change dynamically. temp namespace oid will stale after creating temp NS rollback. it should be fine, new oid will be dispatched in the next sql.

Since this PR #11377 has fixed bitmap index scan when concurrent insert update full bitmap page, which resolved concurrent read on bitmap index when there's insert running in the backend may cause the bitmap scan read wrong tid: 1. Query on bitmap: A query starts and reads all bitmap pages to PAGE_FULL, increases the next tid to fetch, and releases the lock after reading each page. 2. Concurrent insert: insert a tid into PAGE_FULL cause expand compressed words to new words, and rearrange words into PAGE_NEXT. 3. Query on bitmap: fetch PAGE_NEXT and expect the first tid in it should equal the saved next tid. But actually PAGE_NEXT now contains words used to belong in PAGE_FULL. This causes the real next tid less than the expected next tid. But our scan keeps increasing the wrong tid. And then this leads to a wrong result. The PR used _bitmap_catchup_to_next_tid() function to adjust result->lastScanWordNo to the correct position if there has concurrent read/write and causes rearranging words into the next bitmap index page, it used the value of words->firstTid and result->nextTid to judge whether rearranging words into the next bitmap index page happened or not, this is not entirely right. This is because BMIterateResult can only store 16*1024=16384 TIDs, but BMBatchWords can save 3968 words by default, a words is 64bit, even in the worst case (all words not compressed) , BMBatchWords can hold 3968 * 64 = 253952 TIDs. So if there has a PAGE_FULL, the PAGE_FULL must scan it more than one time, but the value of BMBatchWords->firstTid will not be updated during each scan, it will only be updated when new bitmap index pages are read. Therefore, in the absence of concurrent read/write, if we need to scan the same BMBatchWords multiple times, it will lead to the wrong scan position, resulting in wrong output results, as #13446. To summarize, we just need to check for a rearranged condition when the new bitmap index page is read from disk, we should do nothing when we scan the same BMBatchWords.

The case prevent_ao_wal will check XLOG entries and before the check it invokes VACUUM without table names so this vacuum will operates on tables in the whole database. If previous cases leave some Append Only Tables, then there might be more XLOG entries to make the case failure. This commit fixes the issue by adding table names in the case.

In commit a0684da we made a change to use the fixed TOAST_TUPLE_TARGET instead of custom toast_tuple_target that's calculated based on the the blocksize of the AO table for toasting AO table tuples. This caused an issue that some tuples that are well beyond the toast_tuple_target are not being toasted (because they're smaller than TOAST_TUPLE_TARGET). This is ok when the tuples can still fit into the AO table's blocksize. But if not, an error would occur. E.g.: postgres=# create table t (a int, b int[]) WITH (appendonly=true, blocksize=8192); CREATE TABLE postgres=# insert into t select 1, array_agg(x) from generate_series(1, 2030) x; ERROR: item too long (check #1): length 8160, maxBufferLen 8168 CONTEXT: Append-Only table 't' Therefore, we should revert those changes, including: 1. Still use the toast_tuple_target for toasting 'x' and 'e' columns for AO tables. There's also a GPDB_12_MERGE_FIXME in the original comment suggesting to use RelationGetToastTupleTarget for AO table (in order to unify the logic to calculate maxDataLen for heap and AO tuples, I suppose). But that's not possible currently because the calculated toast_tuple_target is stored in AppendOnlyInsertDesc and we cannot get it from the Relation struct. And it seems to me that we won't have a way to do that easily. Therefore, keep this FIXME comment being removed. 2. Still use the toast_tuple_target for toasting 'm' columns for AO tables. Also restore the FIXME comment suggesting that we should use a bigger target size for 'm' columns. This should be something that worth investigating more into. Commit a0684da also includes a change to use custom toast_tuple_target for heap tables, which should be a valid change. I think that fixed an oversight when merging the upstream commit c251336.

divyeshddv and others added 5 commits January 21, 2025 19:19

reshke added the cherry-pick cherry-pick upstream commts label Jan 21, 2025

my-ship-it approved these changes Jan 23, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ao related commits cherry-pick #884

Ao related commits cherry-pick #884

reshke commented Jan 21, 2025

Ao related commits cherry-pick #884

Are you sure you want to change the base?

Ao related commits cherry-pick #884

Conversation

reshke commented Jan 21, 2025

What does this PR do?

Type of Change

Breaking Changes

Test Plan

Impact

Checklist

Additional Context

CI Skip Instructions