Document C string literal tokens. #1423
Note: this feature is being stabilized in rust-lang/rust#117472 -- CI will fail until run with I ran
Thanks!
Can you also include a section that indicates that C-string literals are only available in Edition 2021 or newer? Edition differences are specified in blockquotes (search for "Edition Differences" for the format).
I believe the Reserved prefixes section needs to be updated with `c` and `cr` being excluded.
I believe Literal patterns will need to be updated since C-strings are accepted there syntactically. (They can't really be used since CStr doesn't implement Eq/PartialEq, though.)
5d19507 to 2481014
Thank you for the quick review! I've made the recommended changes, fixed the ASCII vs Unicode misunderstanding, and tried to clarify the wording around
A _C string literal_ is a sequence of Unicode characters and _escapes_,
preceded by the characters `U+0063` (`c`) and `U+0022` (double-quote), and
followed by the character `U+0022`. If the character `U+0022` is present within
the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character.
Alternatively, a C string literal can be a _raw C string literal_, defined
below. The type of a C string literal is [`&core::ffi::CStr`][CStr].
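For readers following the thread, the quoted wording can be illustrated with a small sketch. It is not part of the PR's diff: the constant names `PLAIN`, `QUOTED`, and `RAW` are made up for this example, and it assumes a toolchain on which C string literals are stable (edition 2021 or later).

```rust
use core::ffi::CStr;

// Ordinary C string literal: `c`, a double quote, the contents, a double quote.
const PLAIN: &CStr = c"hello";
// A double quote inside the literal must be escaped with a backslash.
const QUOTED: &CStr = c"say \"hi\"";
// Raw C string literal: no escapes are processed, so the quotes appear as-is.
const RAW: &CStr = cr#"say "hi""#;

fn main() {
    // `to_bytes` returns the contents without the terminating NUL.
    assert_eq!(PLAIN.to_bytes(), b"hello");
    // The escaped and raw spellings denote the same bytes.
    assert_eq!(QUOTED.to_bytes(), RAW.to_bytes());
    assert_eq!(QUOTED.to_bytes(), b"say \"hi\"");
}
```

The comparisons go through `to_bytes` so that the example only relies on byte-slice equality.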
Whilst below it is mentioned that code point escapes are encoded as UTF-8, nowhere is it stated how the Unicode characters contained within the C string literal are encoded in the ensuing `CStr`: I presume also UTF-8? Perhaps this should be stated explicitly for the avoidance of any doubt.
Done -- I added a section about encoding after the escapes.
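As a rough illustration of the encoding question raised above (a sketch, not text from the PR): a literal character, a Unicode escape, and explicit byte escapes for the same code point should all produce the same UTF-8 bytes. The helper name `ae_spellings` is hypothetical.

```rust
use core::ffi::CStr;

/// Returns the bytes (without the trailing NUL) of three spellings of "æ".
fn ae_spellings() -> [&'static [u8]; 3] {
    // The character written directly in the source text.
    let literal: &'static CStr = c"æ";
    // The same code point written as a Unicode escape.
    let unicode_escape: &'static CStr = c"\u{00E6}";
    // The UTF-8 encoding of that code point written as two byte escapes.
    let byte_escapes: &'static CStr = c"\xC3\xA6";
    [
        literal.to_bytes(),
        unicode_escape.to_bytes(),
        byte_escapes.to_bytes(),
    ]
}

fn main() {
    // U+00E6 encodes to the two bytes 0xC3 0xA6 in UTF-8.
    for bytes in ae_spellings() {
        assert_eq!(bytes, &[0xC3u8, 0xA6][..]);
    }
}
```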
2481014 to c86fa19
…ilstrieb

Stabilize C string literals

RFC: https://rust-lang.github.io/rfcs/3348-c-str-literal.html

Tracking issue: rust-lang#105723

Documentation PR (reference manual): rust-lang/reference#1423

# Stabilization report

Stabilizes C string and raw C string literals (`c"..."` and `cr#"..."#`), which are expressions of type [`&CStr`](https://doc.rust-lang.org/stable/core/ffi/struct.CStr.html). Both new literals require Rust edition 2021 or later.

```rust
const HELLO: &core::ffi::CStr = c"Hello, world!";
```

C strings may contain any byte other than `NUL` (`b'\x00'`), and their in-memory representation is guaranteed to end with `NUL`.

## Implementation

Originally implemented by PR rust-lang#108801, which was reverted due to unintentional changes to lexer behavior in Rust editions < 2021.

The current implementation landed in PR rust-lang#113476, which restricts C string literals to Rust edition >= 2021.

## Resolutions to open questions from the RFC

* Adding C character literals (`c'.'`) of type `c_char` is not part of this feature.
  * Support for `c"..."` literals does not prevent `c'.'` literals from being added in the future.
* C string literals should not be blocked on making `&CStr` a thin pointer.
  * It's possible to declare constant expressions of type `&'static CStr` in stable Rust (as of v1.59), so C string literals are not adding additional coupling on the internal representation of `CStr`.
* The unstable `concat_bytes!` macro should not accept `c"..."` literals.
  * C strings have two equally valid `&[u8]` representations (with or without terminal `NUL`), so allowing them to be used in `concat_bytes!` would be ambiguous.
* Adding a type to represent C strings containing valid UTF-8 is not part of this feature.
  * Support for a hypothetical `&Utf8CStr` may be explored in the future, should such a type be added to Rust.
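Two points in the report above lend themselves to a short sketch: the two `&[u8]` views of a C string (with and without the terminal `NUL`), and declaring a `&'static CStr` constant without the literal syntax. The report does not say which API it has in mind for the latter; `CStr::from_bytes_with_nul_unchecked` is assumed here as one possible way, and `HELLO_MANUAL` is a name invented for this example.

```rust
use core::ffi::CStr;

// The literal form shown in the stabilization report.
const HELLO: &CStr = c"Hello, world!";

// One possible pre-literal spelling (an assumption, not taken from the report).
// SAFETY: the byte string ends with a single NUL and has no interior NUL.
const HELLO_MANUAL: &CStr =
    unsafe { CStr::from_bytes_with_nul_unchecked(b"Hello, world!\0") };

fn main() {
    // View without the terminator: the 13 bytes of "Hello, world!".
    assert_eq!(HELLO.to_bytes().len(), 13);
    // View with the terminator: one extra byte, which is NUL.
    assert_eq!(HELLO.to_bytes_with_nul().len(), 14);
    assert_eq!(HELLO.to_bytes_with_nul().last(), Some(&0u8));
    // Both constants hold the same contents.
    assert_eq!(HELLO.to_bytes(), HELLO_MANUAL.to_bytes());
}
```

Because both `to_bytes` and `to_bytes_with_nul` are reasonable answers to "what are the bytes of this C string", a macro such as `concat_bytes!` would have to pick one, which is the ambiguity the report mentions.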
c86fa19 to ae1eb71
The stabilization PR has merged and this PR's CI build is now green.
Thanks!
Update books

## rust-lang/nomicon

1 commits in 1842257814919fa62e81bdecd5e8f95be2839dbb..83d015105e6d490fc30d6c95da1e56152a50e228
2023-11-22 15:35:31 UTC to 2023-11-22 15:35:31 UTC

- Reword the section on general race conditions (rust-lang/nomicon#431)

## rust-lang/reference

5 commits in cd8193e972f61b92117095fc73b67af767b4d6bc..692d216f5a1151e8852ddb308ba64040e634c876
2023-12-04 09:45:06 UTC to 2023-11-21 17:57:18 UTC

- Fix note on `self` coercion (rust-lang/reference#1431)
- Document C string literal tokens. (rust-lang/reference#1423)
- type-layout.md: Warn about repr(align)/repr(packed) and field order (rust-lang/reference#1430)
- Lone `self` in a method body resolves to the self parameter (rust-lang/reference#1427)
- Reference wildcard patterns from underscore expr (rust-lang/reference#1428)

## rust-lang/rust-by-example

4 commits in a6581246f96837113968c02187db24f742af3908..da0a06aada31a324ae84a9eaee344f6a944b9683
2023-11-27 12:50:49 UTC to 2023-11-21 11:58:19 UTC

- fix tiny typo in string conversion docs (rust-lang/rust-by-example#1776)
- fix(arg): Remove reference to Rust Cookbook in arg parsing (rust-lang/rust-by-example#1775)
- fix:typo error (rust-lang/rust-by-example#1774)
- Remove space between `&` and `self` (rust-lang/rust-by-example#1772)

## rust-lang/rustc-dev-guide

5 commits in ddb8b13..904bb5a
2023-11-28 13:13:36 UTC to 2023-11-22 06:13:00 UTC

- Update how-to-build-and-run.md (rust-lang/rustc-dev-guide#1828)
- notification groups: add information about how to ping them (rust-lang/rustc-dev-guide#1818)
- Add explanations on how to run rustc_codegen_gcc tests (rust-lang/rustc-dev-guide#1821)
- Add back the `canonicalization` chapter. (rust-lang/rustc-dev-guide#1532)
- Emphasize that the experts map is not up to date (rust-lang/rustc-dev-guide#1826)
Rollup merge of rust-lang#118614 - rustbot:docs-update, r=ehuss

Update books
This reverts commit 21a27e1, reversing changes made to 01a12f2. This is being reverted in rust-lang/rust#119528