sql, opt: avoid full scans in mutation queries with cost flags #137984

Draft
wants to merge 4 commits into
base: master
from

Conversation

rytaft
Collaborator

@rytaft rytaft commented Dec 25, 2024

opt: convert memo.Cost to a struct with CostFlags

This is a mechanical change that will set up the ability to use
CostFlags in a future commit.

Release note: None

opt,sql: support hint to avoid full scan

Added a hint to avoid full scans (see release note below for details).
To support this change, added a field to memo.Cost with a new type
memo.CostFlags, which contains a number of boolean flags and supports
"multi-dimensional costing". This allows the optimizer to compare plans
based on the flags set in addition to the single-dimensional float64
cost. For example, plans with the new FullScanPenalty cost flag enabled
will always be more expensive than plans without any cost flags, even
if the base float64 cost is lower.

The new CostFlags type also includes a flag for HugeCostPenalty, which
must be set for plans with hugeCost. This ensures that existing
hints that use hugeCost still work if some other cost flags are set,
since HugeCostPenalty takes precedence over other cost flags.
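
As a rough illustration of the comparison described above, here is a minimal sketch. It assumes a simplified `Cost` struct and a hypothetical `rank` helper; the field names `C` and `Flags` and the method names are illustrative and may not match the actual `memo.Cost` implementation in this PR.

```go
package main

import "fmt"

// CostFlags is a hypothetical stand-in for memo.CostFlags: boolean
// penalties that are compared before the numeric cost.
type CostFlags struct {
	FullScanPenalty bool
	HugeCostPenalty bool
}

// Cost is a hypothetical stand-in for memo.Cost.
type Cost struct {
	C     float64   // single-dimensional base cost
	Flags CostFlags // penalties that dominate the base cost
}

// rank maps the flags to an integer so that HugeCostPenalty outranks
// FullScanPenalty, and any flag outranks no flags at all.
func (f CostFlags) rank() int {
	r := 0
	if f.FullScanPenalty {
		r |= 1
	}
	if f.HugeCostPenalty {
		r |= 2
	}
	return r
}

// Less reports whether c is cheaper than other. Flags are compared first;
// the float64 cost only breaks ties between plans with equal flags.
func (c Cost) Less(other Cost) bool {
	if a, b := c.Flags.rank(), other.Flags.rank(); a != b {
		return a < b
	}
	return c.C < other.C
}

func main() {
	indexScan := Cost{C: 5000}
	fullScan := Cost{C: 100, Flags: CostFlags{FullScanPenalty: true}}
	forbidden := Cost{C: 1, Flags: CostFlags{HugeCostPenalty: true}}

	fmt.Println(indexScan.Less(fullScan)) // true: no flags beats FullScanPenalty
	fmt.Println(fullScan.Less(forbidden)) // true: HugeCostPenalty takes precedence
}
```

In this sketch, two plans that both carry FullScanPenalty are still ordered by their base cost, which is what lets the optimizer pick the cheapest full scan when no other plan exists.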

The new CostFlags field is needed to support hints that do not cause an
error when the optimizer cannot find a plan complying with the hint. The
previous approach of simply using hugeCost to avoid certain plans meant
that, if such plans were unavoidable, we could not effectively compare
plans with cost greater than hugeCost due to loss of floating point
precision.
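
The precision problem can be reproduced in a few lines. This sketch assumes a hugeCost of 1e100 purely for illustration, and `addPenalty` is a hypothetical helper rather than a function from the optimizer.

```go
package main

import "fmt"

// hugeCost is assumed to be 1e100 for this illustration.
const hugeCost = 1e100

// addPenalty adds the penalty to a plan's base cost in float64 arithmetic.
func addPenalty(base float64) float64 {
	return base + hugeCost
}

func main() {
	cheap := addPenalty(50)
	expensive := addPenalty(5000)

	// The gap between adjacent float64 values near 1e100 is far larger than
	// either base cost, so both sums round to the same value and the
	// optimizer can no longer tell the two plans apart.
	fmt.Println(cheap == expensive) // true
}
```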

Informs #79683

Release note (sql change): Added support for a new index hint,
AVOID_FULL_SCAN, which will prevent the optimizer from planning a
full scan for the specified table if any other plan is possible. The
hint can be used in the same way as other existing index hints. For
example, SELECT * FROM table_name@{AVOID_FULL_SCAN};. This hint is
similar to NO_FULL_SCAN, but will not error if a full scan cannot be
avoided. Note that normally a full scan of a partial index would not
be considered a "full scan" for the purposes of the NO_FULL_SCAN and
AVOID_FULL_SCAN hints, but if the user has explicitly forced the
partial index via FORCE_INDEX=index_name, we do consider it a full
scan.
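
The partial index rule in the release note above can be read as a small decision function. The following is only a sketch of that rule as worded here; `scanInfo` and `countsAsFullScan` are hypothetical names, not identifiers from this PR.

```go
package main

import "fmt"

// scanInfo is a hypothetical summary of a planned scan.
type scanInfo struct {
	Unconstrained bool // the scan reads every row of the chosen index
	PartialIndex  bool // the chosen index is a partial index
	IndexForced   bool // the user wrote FORCE_INDEX=index_name
}

// countsAsFullScan applies the rule from the release note: an unconstrained
// scan of a partial index is treated as a full scan only when the user
// explicitly forced that index.
func countsAsFullScan(s scanInfo) bool {
	if !s.Unconstrained {
		return false
	}
	if s.PartialIndex {
		return s.IndexForced
	}
	return true
}

func main() {
	fmt.Println(countsAsFullScan(scanInfo{Unconstrained: true}))
	fmt.Println(countsAsFullScan(scanInfo{Unconstrained: true, PartialIndex: true}))
	fmt.Println(countsAsFullScan(scanInfo{Unconstrained: true, PartialIndex: true, IndexForced: true}))
}
```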

sql,opt: add setting avoid_full_table_scans_in_mutations

Fixes #79683

Release note (sql change): Added a new session setting,
avoid_full_table_scans_in_mutations, which, when set to true, causes
the optimizer to avoid planning full table scans for mutation queries
if any other plan is possible. The setting defaults to true.

opt: remove a stale comment above optbuilder.buildScan

Removed a comment that references a function parameter that no longer
exists.

Release note: None

blathers-crl bot commented Dec 25, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable

@rytaft rytaft force-pushed the avoid-full-scans-flags branch 3 times, most recently from 1d476e3 to 1a0d73c Compare December 25, 2024 21:37
@rytaft rytaft force-pushed the avoid-full-scans-flags branch from 1a0d73c to d55cf61 Compare December 26, 2024 01:53
Development

Successfully merging this pull request may close these issues.

opt: the optimizer cost model should consider contention
2 participants