Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-35151 DDL recovery for high-level indexes #3520

Conversation

svoj
Copy link
Contributor

@svoj svoj commented Sep 13, 2024

  • [] The Jira issue number for this PR is: MDEV-35151

Description

Various fixes to make DDL recovery work with high-level indexes.

Release Notes

None.

How can this PR be tested?

mtr

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.
  • This is a bug fix to a new feature and the PR is based against the feature tree.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@svoj svoj requested a review from vuvova September 13, 2024 16:03
@CLAassistant
Copy link

CLAassistant commented Sep 13, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

file->ha_rename_table(idx_from, idx_to);
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is a good approach. It feels like you put too much inside one DDL_RENAME_PHASE_TABLE entry. There can be crashes between renames of individual vector indexes (presuming there will be many some day).

Perhaps, DDL_RENAME_PHASE_TABLE should be one ha_rename_table only? That is, a table with hlindexes would generate one DDL_RENAME_PHASE_TABLE entry per index (and one for the main table, of course)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for DROP, etc.

Copy link
Contributor Author

@svoj svoj Sep 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be crashes between renames of individual vector indexes (presuming there will be many some day).

Yes, sure. And my code handles it inline with what original code have been doing, that is ignores non_existing_table_errors(), e.g. if handler files were moved and .frm not. Or as you suggested above crash between different high-level indexes rename.

... a table with hlindexes would generate one DDL_RENAME_PHASE_TABLE entry per index (and one for the main table, of course)?

  1. PHASE is just an enum member of an ACTION, it can't have stuff like table name or whatever attached.
  2. ACTION is more logical (or high-level) entity, which may do many things like binlogging, handling stat tables, triggers, handler files and of course .frm.

Here're options:

  1. We could add special phases to handle high-level indexes, but there's going to be no much difference between my solution. And it will introduce compatibility issues. Though it is very true that older versions won't be able to handle high-level indexes properly even with this patch.
  2. We can't re-use existing actions like DDL_LOG_RENAME_TABLE_ACTION, as I mentioned before they're way to sophisticated for high-level indexes.
  3. In theory we could try re-using special actions like DDL_LOG_RENAME_ACTION for high level indexes. That is issue one DDL_LOG_RENAME_TABLE_ACTION and then DDL_LOG_RENAME_ACTION for each high-level index. It may work, it has no compatibility concerns, but it is a question if we're going to be able to integrate it with e.g. ALTER.

Your thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I'm much of an expert in ddl logging. Let's use your approach if it works.

file->ha_rename_table(idx_from, idx_to);
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for DROP, etc.

mysql-test/main/AAA.test Outdated Show resolved Hide resolved
storage/maria/ha_maria.cc Outdated Show resolved Hide resolved
@vuvova vuvova force-pushed the bb-11.6-MDEV-32887-vector branch 11 times, most recently from e231991 to 4557941 Compare September 25, 2024 19:07
@vuvova
Copy link
Member

vuvova commented Sep 28, 2024

could you please rebase it on top of the latest bb-11.6-MDEV-32887-vector ? It's impossible to review now, it looks like it includes basically everything

@vuvova vuvova force-pushed the bb-11.6-MDEV-32887-vector branch 2 times, most recently from cadf2be to fe9092a Compare September 29, 2024 19:54
@svoj svoj force-pushed the bb-11.6-MDEV-32887-vector-ddl branch 2 times, most recently from 3d2f51e to 5bea61a Compare October 2, 2024 15:22
@vuvova vuvova force-pushed the bb-11.6-MDEV-32887-vector branch 2 times, most recently from 1d2a5fc to b29dca9 Compare October 6, 2024 19:59
@vuvova vuvova force-pushed the bb-11.6-MDEV-32887-vector branch 2 times, most recently from a8a5206 to 8bb7868 Compare October 16, 2024 14:34
vuvova and others added 4 commits October 17, 2024 09:42
…e upon using non-default table options and system variables

extend the option_list expicitly on CREATE/ALTER, not implicitly
on parsing.
…th vector key and unique key

InnoDB rename needs the same workaround for hlindexes
as it has for partitions
… DISTINCT

don't apply distinct optimization to order by a vector index
When using the default innodb_log_buffer_size=2m, mariadb-backup --backup
would spend a lot of time re-reading and re-parsing the log. For reads,
it would be beneficial to memory-map the entire ib_logfile0 to the
address space (typically 48 bits or 256 TiB) and read it from there,
both during --backup and --prepare.

We will introduce the Boolean read-only parameter innodb_log_file_mmap
that will be OFF by default on most platforms, to avoid aggressive
read-ahead of the entire ib_logfile0 in when only a tiny portion would be
accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON,
because those platforms define a specific mmap(2) option for enabling
such read-ahead and therefore it can be assumed that the default would
be on-demand paging. This parameter will only have impact on the initial
InnoDB startup and recovery. Any writes to the log will use regular I/O,
except when the ib_logfile0 is stored in a specially configured file system
that is backed by persistent memory (Linux "mount -o dax").

We also experimented with allowing writes of the ib_logfile0 via a
memory mapping and decided against it. A fundamental problem would be
unnecessary read-before-write in case of a major page fault, that is,
when a new, not yet cached, virtual memory page in the circular
ib_logfile0 is being written to. There appears to be no way to tell
the operating system that we do not care about the previous contents of
the page, or that the page fault handler should just zero it out.

Many references to HAVE_PMEM have been replaced with references to
HAVE_INNODB_MMAP.

The predicate log_sys.is_pmem() has been replaced with
log_sys.is_mmap() && !log_sys.is_opened().

Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in the
way that an open file handle to ib_logfile0 will be retained. In both
code paths, log_sys.is_mmap() will hold. Holding a file handle open will
allow log_t::clear_mmap() to disable the interface with fewer operations.

It should be noted that ever since
commit 685d958 (MDEV-14425)
most 64-bit Linux platforms on our CI platforms
(s390x a.k.a. IBM System Z being a notable exception) read and write
/dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is
persistent memory (mount -o dax). So, the memory mapping based log
parsing that this change is enabling by default on Linux and FreeBSD
has already been extensively tested on Linux.

::log_mmap(): If a log cannot be opened as PMEM and the desired access
is read-only, try to open a read-only memory mapping.

xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile():
Copy the InnoDB log in mariadb-backup --backup from a memory
mapped file.
vuvova and others added 10 commits October 19, 2024 20:05
… DISTINCT MariaDB#2

mhnsw index can be used only for ORDER BY. not for GROUP BY
…_key_to_innobase upon query from table with usual key on vector field

add test
with streaming implemened mhnsw no longer needs to know
the LIMIT in advance. let's just cap it to avoid allocating
too much memory for the one step result set
…s and VEC_ToText accepts

let VEC_FromText validate that the vector l2squared isn't NaN.
VEC_ToText still prints everything.
…ue of mhnsw_min_limit

mhnsw_min_limit must not be larger than candidates queue size
Since high-level index tables do not participate in thr_multi_lock(), added
explicit call to THR_LOCK::start_trans(). This is needed mostly for Aria to
handle transaction logging.
@svoj svoj force-pushed the bb-11.6-MDEV-32887-vector-ddl branch from 5bea61a to 3869f7a Compare October 21, 2024 14:32
@svoj svoj changed the title DDL recovery for high-level indexes (CREATE, DROP and RENAME) DDL recovery for high-level indexes Oct 21, 2024
@vuvova vuvova changed the title DDL recovery for high-level indexes MDEV-35222 DDL recovery for high-level indexes Oct 21, 2024
@vuvova vuvova changed the title MDEV-35222 DDL recovery for high-level indexes MDEV-35151 DDL recovery for high-level indexes Oct 21, 2024
@vuvova vuvova force-pushed the bb-11.6-MDEV-32887-vector branch 2 times, most recently from 1b6db26 to 62afa2f Compare October 23, 2024 18:27
@vuvova
Copy link
Member

vuvova commented Oct 25, 2024

cherry-picked into the vector branch, thanks! closing.

@vuvova vuvova closed this Oct 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants