-
Notifications
You must be signed in to change notification settings - Fork 80
feat(clp-s)!: Reserve the $
, !
, and #
namespaces for future use.
#827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
## Walkthrough
This pull request introduces three new namespace constants in the core constants header and integrates them into the existing tokenisation and unescaping logic. The changes modify the `tokenize_column_descriptor` function to use these extra constants, expand the KQL grammar to recognise them as special characters, and update test cases alongside documentation to reflect the new escaping requirements for keys starting with these characters.
## Changes
| File(s) | Change Summary |
|-----------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `components/core/.../archive_constants.hpp`, <br> `components/core/.../SearchUtils.cpp`, <br> `components/core/.../Kql.g4` | - Added `constexpr` constants: `cRangeIndexNamespace` (`"$"`), `cReservedNamespace1` (`"!"`), and `cReservedNamespace2` (`"#"`). <br> - Modified the `tokenize_column_descriptor` function to use a switch statement for multiple namespace characters. <br> - Expanded the `SPECIAL_CHARACTER` fragment in the KQL grammar to include the new characters. |
| `components/core/.../test-kql.cpp`, <br> `docs/.../reference-json-search-syntax.md` | - Updated test cases to support namespaced columns with the additional characters and used dynamic queries for multiple scenarios. <br> - Revised documentation to include escaping requirements for keys beginning with `@`, `$`, `!`, and `#`. |
## Sequence Diagram(s)
```mermaid
sequenceDiagram
participant Caller
participant Tokenizer as tokenize_column_descriptor
Note over Tokenizer: Begin processing descriptor string
Caller ->> Tokenizer: Pass descriptor string
Tokenizer ->> Tokenizer: Switch on first character
alt Character is one of '@', '$', '!', '#'
Tokenizer ->> Tokenizer: Set descriptor_namespace & increment start_index
else No matching namespace character
Tokenizer ->> Tokenizer: Use default handling
end
Tokenizer -->> Caller: Return tokenised output Possibly Related PRs
Suggested Reviewers
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs review.
Co-authored-by: kirkrodrigues <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/src/user-guide/reference-json-search-syntax.md (2)
195-195
: Use consistent emphasis style
The phrase “keys that start with …” uses underscores for emphasis, but the document otherwise uses asterisks. Please switch to*start*
to satisfy MD049 and maintain consistency.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
198-198
: Correct the bullet list entry for backslash
The linea `\` (backslash):isn’t formatted as a list item. It should read:
* `\` (backslash)to match the other entries in the list.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-guide/reference-json-search-syntax.md
(1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/src/user-guide/reference-json-search-syntax.md
195-195: Emphasis style
Expected: asterisk; Actual: underscore
(MD049, emphasis-style)
195-195: Emphasis style
Expected: asterisk; Actual: underscore
(MD049, emphasis-style)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: build-macos (macos-13, true)
🔇 Additional comments (1)
docs/src/user-guide/reference-json-search-syntax.md (1)
194-203
: Escaping rules update accurately reflects code changes
The documentation now correctly requires escaping of leading\
,@
,$
,!
, and#
in keys, aligning perfectly with the newly reserved namespaces in the codebase.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm generally okay with the changes in this PR. I just have one question: do we have any guideline that restricts the use of char constants? If not, we could store those namespace characters as char
constants. That way, we can directly use constants::cAutogenNamespace
in the switch statement and replace the @
and !
literals with constants::cAutogenNamespace
and constants::cReservedNamespace1
, respectively.
Co-authored-by: wraymo <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/src/user-guide/reference-json-search-syntax.md (2)
195-195
: Use asterisk style for emphasis
To comply with our Markdown style (MD049), replace the underscores around start with asterisks, e.g.,*start*
instead of_start_
.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
198-198
: Maintain consistent bullet punctuation
The first list item ends with a colon (:
) while the others don’t. For consistency, remove the trailing colon from`\` (backslash):
or add a colon to the remaining items.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-guide/reference-json-search-syntax.md
(1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/src/user-guide/reference-json-search-syntax.md
195-195: Emphasis style
Expected: asterisk; Actual: underscore
(MD049, emphasis-style)
195-195: Emphasis style
Expected: asterisk; Actual: underscore
(MD049, emphasis-style)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: build-macos (macos-13, true)
- GitHub Check: lint-check (macos-latest)
🔇 Additional comments (1)
docs/src/user-guide/reference-json-search-syntax.md (1)
194-202
: Clarify namespace escape requirements for$
,!
, and#
The documentation accurately patches the escaping rules to include the new$
,!
, and#
namespaces. This aligns with the parser and unescape logic updates in code.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
Actually most of the reason I'm using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
docs/src/user-guide/reference-json-search-syntax.md (1)
195-196
: Document escaping rules for keys starting with reserved namespace characters
This matches the earlier suggestion to clarify that keys beginning with special characters (@
,$
,!
,#
) must be escaped.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
🧹 Nitpick comments (1)
docs/src/user-guide/reference-json-search-syntax.md (1)
195-196
: Use asterisk for emphasis instead of underscores
Markdown lint (MD049) flags the use of underscores for emphasis. Please use*start*
rather than_start_
to comply with the project's Markdown style.🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
195-195: Emphasis style
Expected: asterisk; Actual: underscore(MD049, emphasis-style)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-guide/reference-json-search-syntax.md
(1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/src/user-guide/reference-json-search-syntax.md
195-195: Emphasis style
Expected: asterisk; Actual: underscore
(MD049, emphasis-style)
195-195: Emphasis style
Expected: asterisk; Actual: underscore
(MD049, emphasis-style)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-13, true)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: build-macos (macos-14, false)
🔇 Additional comments (1)
docs/src/user-guide/reference-json-search-syntax.md (1)
198-201
: Reserved namespace characters documented correctly
The newly added list of reserved namespace characters (@
,$
,!
,#
) is clear and aligns with the implementation changes in parsing and unescaping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs review.
y-scope#827) Co-authored-by: kirkrodrigues <[email protected]> Co-authored-by: wraymo <[email protected]>
Description
This PR reserves the
$
,!
, and#
namespaces for future use. Specifically we update the parsing and unescaping rules for columns to treat columns beginning with$
,!
, and#
as belonging to those respective namespaces unless escaped with a backslash.Note that each change to the column parsing and unescaping rules is a breaking archive format change because it affects the interpretation of the authoritative timestamp column stored in the archive. Even though this is a breaking change we don't increment the archive version because if this PR gets merged before the next release we get to batch it together with a similar column format change for which we did increment the archive version.
We have near-term plans to use the
$
namespace to allow users to reference columns in the archive range index (see #737). The!
and#
namespaces have no immediate use, but they are reserved in this PR to avoid having to introduce more breaking changes of this type for the foreseeable future.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Documentation
Tests