Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce formatting width and precision to 16 bits #136932

Merged
merged 6 commits into from
Mar 11, 2025

Conversation

m-ou-se
Copy link
Member

@m-ou-se m-ou-se commented Feb 12, 2025

This is part of #99012

This is reduces the width and precision fields in format strings to 16 bits. They are currently full usizes, but it's a bit nonsensical that we need to support the case where someone wants to pad their value to eighteen quintillion spaces and/or have eighteen quintillion digits of precision.

By reducing these fields to 16 bit, we can reduce FormattingOptions to 64 bits (see #136974) and improve the in memory representation of format_args!(). (See additional context below.)

This also fixes a bug where the width or precision is silently truncated when cross-compiling to a target with a smaller usize. By reducing the width and precision fields to the minimum guaranteed size of usize, 16 bits, this bug is eliminated.

This is a breaking change, but affects almost no existing code.


Details of this change:

There are three ways to set a width or precision today:

  1. Directly a formatting string, e.g. println!("{a:1234}")
  2. Indirectly in a formatting string, e.g. println!("{a:width$}", width=1234)
  3. Through the unstable FormattingOptions::width method.

This PR:

  • Adds a compiler error for 1. (println!("{a:9999999}") no longer compiles and gives a clear error.)
  • Adds a runtime check for 2. (println!("{a:width$}, width=9999999) will panic.)
  • Changes the signatures of the (unstable) FormattingOptions::[get_]width methods to use a u16 instead.

Additional context for improving FormattingOptions and fmt::Arguments:

All the formatting flags and options are currently:

  • The + flag (1 bit)
  • The - flag (1 bit)
  • The # flag (1 bit)
  • The 0 flag (1 bit)
  • The x? flag (1 bit)
  • The X? flag (1 bit)
  • The alignment (2 bits)
  • The fill character (21 bits)
  • Whether a width is specified (1 bit)
  • Whether a precision is specified (1 bit)
  • If used, the width (a full usize)
  • If used, the precision (a full usize)

Everything except the last two can simply fit in a u32 (those add up to 31 bits in total).

If we can accept a max width and precision of u16::MAX, we can make a FormattingOptions that is exactly 64 bits in size; the same size as a thin reference on most platforms.

If, additionally, we also limit the number of formatting arguments, we can also reduce the size of fmt::Arguments (that is, of a format_args!() expression).

@m-ou-se m-ou-se added T-libs Relevant to the library team, which will review and decide on the PR/issue. A-fmt Area: `core::fmt` S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. labels Feb 12, 2025
@m-ou-se m-ou-se self-assigned this Feb 12, 2025
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 12, 2025
@m-ou-se m-ou-se removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 12, 2025
@m-ou-se
Copy link
Member Author

m-ou-se commented Feb 12, 2025

@bors try

@bors
Copy link
Contributor

bors commented Feb 12, 2025

⌛ Trying commit 7355738 with merge 7af7790...

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 12, 2025
…try>

Reduce formatting `width` and `precision` to 16 bits

This is reduces the `width` and `precision` fields in format strings to 16 bits. They are currently full `usize`s, but it's a bit nonsensical that we need to support the case where someone wants to pad their value to eighteen quintillion spaces and/or have eighteen quintillion digits of precision.

This is a breaking change, but probably affects virtually no code. Let's do a crater run to find out. Marking this as experiment for now.

---

There are three ways to set a width or precision today:

1. Directly a formatting string, e.g. `println!("{a:1234}")`
2. Indirectly in a formatting string, e.g. `println!("{a:width$}", width=1234)`
3. Through the unstable `FormattingOptions::width` method.

This PR:

- Adds a compiler error for 1. (`println!("{a:9999999}")` no longer compiles and gives a clear error.)
- Adds a runtime check for 2. (`println!("{a:width$}, width=9999999)` will panic.)
- Changes the signature of `FormattingOptions::width` to take a `u16` instead.

---

Additional context:

All the formatting flags and options are currently:

- The `+` flag (1 bit)
- The `-` flag (1 bit)
- The `#` flag (1 bit)
- The `0` flag (1 bit)
- The `x?` flag (1 bit)
- The `X?` flag (1 bit)
- The alignment (2 bits)
- The fill character (21 bits)
- Whether a width is specified (1 bit)
- Whether a precision is specified (1 bit)
- If used, the width (a full usize)
- If used, the precision (a full usize)

Everything except the last two can simply fit in a `u32` (those add up to 31 bits in total).

If we can accept a max width and precision of u16::MAX, we can make a `FormattingOptions` that is exactly 64 bits in size; the same size as a thin reference on most platforms.

If, additionally, we also limit the number of formatting arguments to u16::MAX, we can also reduce the size of `fmt::Arguments` (that is, of a `format_args!()` expression).
@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Feb 12, 2025

☀️ Try build successful - checks-actions
Build commit: 7af7790 (7af779037716ae4125ceabb429791b4cf5dd0a43)

@m-ou-se
Copy link
Member Author

m-ou-se commented Feb 12, 2025

@craterbot run mode=build-and-test

@craterbot
Copy link
Collaborator

👌 Experiment pr-136932 created and queued.
🤖 Automatically detected try build 7af7790
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. labels Feb 12, 2025
@craterbot
Copy link
Collaborator

🚧 Experiment pr-136932 is now running

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@m-ou-se m-ou-se force-pushed the fmt-width-precision-u16 branch from 7355738 to ccb9429 Compare February 12, 2025 20:05
@oxalica
Copy link
Contributor

oxalica commented Feb 13, 2025

This is a breaking change, but probably affects virtually no code. Let's do a crater run to find out. Marking this as experiment for now.

As a data point, I do use usize width for indentation generation of pretty-printer in some of my crates.
It's used as a shortcut for " ".repeat(width) without allocation (at least I think so?). I can accept to enforce u16 here because nobody actually use 65536 indentation levels, but usize is definitely a more natural choice for "length" semantic than u16.

write!(self.w, "\n{:1$}]", "", self.cur_indent)?;
write!(f, "{:len$}", "", len = (stride - 1) * 4)?;

@m-ou-se
Copy link
Member Author

m-ou-se commented Feb 13, 2025

The type will stay a usize; changing that would break a lot of things. This PR adds a runtime check (panic) that the usize you give it isn't above u16::MAX. (And then stores everything internally as a u16.)

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 13, 2025
Reduce FormattingOptions to 64 bits

This reduces FormattingOptions from 6-7 machine words (384 bits on 64-bit platforms, 224 bits on 32-bit platforms) to just 64 bits (a single register on 64-bit platforms).

This PR includes rust-lang#136932, which reduces the width and precision options to 16 bits, to make it all fit.

Before:

```rust
pub struct FormattingOptions {
    flags: u32, // only 6 bits used
    fill: char,
    align: Option<Alignment>,
    width: Option<usize>,
    precision: Option<usize>,
}
```

After:

```rust
pub struct FormattingOptions {
    /// Bits:
    ///  - 0: `+` flag [rt::Flag::SignPlus]
    ///  - 1: `-` flag [rt::Flag::SignMinus]
    ///  - 2: `#` flag [rt::Flag::Alternate]
    ///  - 3: `0` flag [rt::Flag::SignAwareZeroPad]
    ///  - 4: `x?` flag [rt::Flag::DebugLowerHex]
    ///  - 5: `X?` flag [rt::Flag::DebugUpperHex]
    ///  - 6-7: Alignment (0: Left, 1: Right, 2: Center, 3: Unknown)
    ///  - 8: Width flag (if set, the width field below is used)
    ///  - 9: Precision flag (if set, the precision field below is used)
    ///  - 10: unused
    ///  - 11-31: fill character (21 bits, a full `char`)
    flags: u32,
    /// Width if width flag above is set. Otherwise, always 0.
    width: u16,
    /// Precision if precision flag above is set. Otherwise, always 0.
    precision: u16,
}
```
@craterbot
Copy link
Collaborator

🎉 Experiment pr-136932 is completed!
📊 172 regressed and 113 fixed (582049 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the denylist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels Feb 13, 2025
@m-ou-se
Copy link
Member Author

m-ou-se commented Feb 14, 2025

Crater results: Only two regressions are caused by an error or panic from this change.

  1. gh/vasekp/stream-rust - Runtime panic - Using a value close to usize::MAX as precision - Update: fixed
  2. gh/uutils/coreutils - Compiler error in a unit test - Update: fixed

@m-ou-se m-ou-se added I-libs-nominated Nominated for discussion during a libs team meeting. I-libs-api-nominated Nominated for discussion during a libs-api team meeting. labels Feb 14, 2025
@m-ou-se m-ou-se marked this pull request as ready for review February 14, 2025 08:45
@m-ou-se m-ou-se force-pushed the fmt-width-precision-u16 branch from 1fe3f1b to 2ce0205 Compare March 10, 2025 12:58
@scottmcm
Copy link
Member

@bors r+ rollup=iffy (might want to bisect here)

@bors
Copy link
Contributor

bors commented Mar 10, 2025

📌 Commit 2ce0205 has been approved by scottmcm

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 10, 2025
@scottmcm scottmcm dismissed their stale review March 10, 2025 17:07

Feedback was addressed.

jieyouxu added a commit to jieyouxu/rust that referenced this pull request Mar 10, 2025
…=scottmcm

Reduce formatting `width` and `precision` to 16 bits

This is part of rust-lang#99012

This is reduces the `width` and `precision` fields in format strings to 16 bits. They are currently full `usize`s, but it's a bit nonsensical that we need to support the case where someone wants to pad their value to eighteen quintillion spaces and/or have eighteen quintillion digits of precision.

By reducing these fields to 16 bit, we can reduce `FormattingOptions` to 64 bits (see rust-lang#136974) and improve the in memory representation of `format_args!()`. (See additional context below.)

This also fixes a bug where the width or precision is silently truncated when cross-compiling to a target with a smaller `usize`. By reducing the width and precision fields to the minimum guaranteed size of `usize`, 16 bits, this bug is eliminated.

This is a breaking change, but affects almost no existing code.

---

Details of this change:

There are three ways to set a width or precision today:

1. Directly a formatting string, e.g. `println!("{a:1234}")`
2. Indirectly in a formatting string, e.g. `println!("{a:width$}", width=1234)`
3. Through the unstable `FormattingOptions::width` method.

This PR:

- Adds a compiler error for 1. (`println!("{a:9999999}")` no longer compiles and gives a clear error.)
- Adds a runtime check for 2. (`println!("{a:width$}, width=9999999)` will panic.)
- Changes the signatures of the (unstable) `FormattingOptions::[get_]width` methods to use a `u16` instead.

---

Additional context for improving `FormattingOptions` and `fmt::Arguments`:

All the formatting flags and options are currently:

- The `+` flag (1 bit)
- The `-` flag (1 bit)
- The `#` flag (1 bit)
- The `0` flag (1 bit)
- The `x?` flag (1 bit)
- The `X?` flag (1 bit)
- The alignment (2 bits)
- The fill character (21 bits)
- Whether a width is specified (1 bit)
- Whether a precision is specified (1 bit)
- If used, the width (a full usize)
- If used, the precision (a full usize)

Everything except the last two can simply fit in a `u32` (those add up to 31 bits in total).

If we can accept a max width and precision of u16::MAX, we can make a `FormattingOptions` that is exactly 64 bits in size; the same size as a thin reference on most platforms.

If, additionally, we also limit the number of formatting arguments, we can also reduce the size of `fmt::Arguments` (that is, of a `format_args!()` expression).
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Mar 10, 2025
…=scottmcm

Reduce formatting `width` and `precision` to 16 bits

This is part of rust-lang#99012

This is reduces the `width` and `precision` fields in format strings to 16 bits. They are currently full `usize`s, but it's a bit nonsensical that we need to support the case where someone wants to pad their value to eighteen quintillion spaces and/or have eighteen quintillion digits of precision.

By reducing these fields to 16 bit, we can reduce `FormattingOptions` to 64 bits (see rust-lang#136974) and improve the in memory representation of `format_args!()`. (See additional context below.)

This also fixes a bug where the width or precision is silently truncated when cross-compiling to a target with a smaller `usize`. By reducing the width and precision fields to the minimum guaranteed size of `usize`, 16 bits, this bug is eliminated.

This is a breaking change, but affects almost no existing code.

---

Details of this change:

There are three ways to set a width or precision today:

1. Directly a formatting string, e.g. `println!("{a:1234}")`
2. Indirectly in a formatting string, e.g. `println!("{a:width$}", width=1234)`
3. Through the unstable `FormattingOptions::width` method.

This PR:

- Adds a compiler error for 1. (`println!("{a:9999999}")` no longer compiles and gives a clear error.)
- Adds a runtime check for 2. (`println!("{a:width$}, width=9999999)` will panic.)
- Changes the signatures of the (unstable) `FormattingOptions::[get_]width` methods to use a `u16` instead.

---

Additional context for improving `FormattingOptions` and `fmt::Arguments`:

All the formatting flags and options are currently:

- The `+` flag (1 bit)
- The `-` flag (1 bit)
- The `#` flag (1 bit)
- The `0` flag (1 bit)
- The `x?` flag (1 bit)
- The `X?` flag (1 bit)
- The alignment (2 bits)
- The fill character (21 bits)
- Whether a width is specified (1 bit)
- Whether a precision is specified (1 bit)
- If used, the width (a full usize)
- If used, the precision (a full usize)

Everything except the last two can simply fit in a `u32` (those add up to 31 bits in total).

If we can accept a max width and precision of u16::MAX, we can make a `FormattingOptions` that is exactly 64 bits in size; the same size as a thin reference on most platforms.

If, additionally, we also limit the number of formatting arguments, we can also reduce the size of `fmt::Arguments` (that is, of a `format_args!()` expression).
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Mar 11, 2025
…=scottmcm

Reduce formatting `width` and `precision` to 16 bits

This is part of rust-lang#99012

This is reduces the `width` and `precision` fields in format strings to 16 bits. They are currently full `usize`s, but it's a bit nonsensical that we need to support the case where someone wants to pad their value to eighteen quintillion spaces and/or have eighteen quintillion digits of precision.

By reducing these fields to 16 bit, we can reduce `FormattingOptions` to 64 bits (see rust-lang#136974) and improve the in memory representation of `format_args!()`. (See additional context below.)

This also fixes a bug where the width or precision is silently truncated when cross-compiling to a target with a smaller `usize`. By reducing the width and precision fields to the minimum guaranteed size of `usize`, 16 bits, this bug is eliminated.

This is a breaking change, but affects almost no existing code.

---

Details of this change:

There are three ways to set a width or precision today:

1. Directly a formatting string, e.g. `println!("{a:1234}")`
2. Indirectly in a formatting string, e.g. `println!("{a:width$}", width=1234)`
3. Through the unstable `FormattingOptions::width` method.

This PR:

- Adds a compiler error for 1. (`println!("{a:9999999}")` no longer compiles and gives a clear error.)
- Adds a runtime check for 2. (`println!("{a:width$}, width=9999999)` will panic.)
- Changes the signatures of the (unstable) `FormattingOptions::[get_]width` methods to use a `u16` instead.

---

Additional context for improving `FormattingOptions` and `fmt::Arguments`:

All the formatting flags and options are currently:

- The `+` flag (1 bit)
- The `-` flag (1 bit)
- The `#` flag (1 bit)
- The `0` flag (1 bit)
- The `x?` flag (1 bit)
- The `X?` flag (1 bit)
- The alignment (2 bits)
- The fill character (21 bits)
- Whether a width is specified (1 bit)
- Whether a precision is specified (1 bit)
- If used, the width (a full usize)
- If used, the precision (a full usize)

Everything except the last two can simply fit in a `u32` (those add up to 31 bits in total).

If we can accept a max width and precision of u16::MAX, we can make a `FormattingOptions` that is exactly 64 bits in size; the same size as a thin reference on most platforms.

If, additionally, we also limit the number of formatting arguments, we can also reduce the size of `fmt::Arguments` (that is, of a `format_args!()` expression).
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2025
Rollup of 16 pull requests

Successful merges:

 - rust-lang#126856 (remove deprecated tool `rls`)
 - rust-lang#136932 (Reduce formatting `width` and `precision` to 16 bits)
 - rust-lang#137314 (change definitely unproductive cycles to error)
 - rust-lang#137612 (Update bootstrap to edition 2024)
 - rust-lang#138002 (Disable CFI for weakly linked syscalls)
 - rust-lang#138052 (strip `-Wlinker-messages` wrappers from `rust-lld` rmake test)
 - rust-lang#138063 (Improve `-Zunpretty=hir` for parsed attrs)
 - rust-lang#138109 (make precise capturing args in rustdoc Json typed)
 - rust-lang#138147 (Add maintainers for powerpc64le-unknown-linux-gnu)
 - rust-lang#138245 (stabilize `ci_rustc_if_unchanged_logic` test for local environments)
 - rust-lang#138296 (Remove `AdtFlags::IS_ANONYMOUS` and `Copy`/`Clone` condition for anonymous ADT)
 - rust-lang#138300 (add tracking issue for unqualified_local_imports)
 - rust-lang#138307 (Allow specifying glob patterns for try jobs)
 - rust-lang#138313 (Update books)
 - rust-lang#138315 (use next_back() instead of last() on DoubleEndedIterator)
 - rust-lang#138318 (Rustdoc: remove a bunch of `@ts-expect-error` from main.js)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors
Copy link
Contributor

bors commented Mar 11, 2025

⌛ Testing commit 2ce0205 with merge 374ce1f...

@matthiaskrgr
Copy link
Member

@jieyouxu
image
this means we should NOT burry this in a rollup of 25 PRs ..

@jieyouxu
Copy link
Member

this means we should NOT burry this in a rollup of 25 PRs ..

If this is perf-sensitive, then it should be marked as rollup=never, or it can be perf-build specifically post rollup. The size of the rollup has no bearing to this, you would have to perf build separately in a 5 PR rollup anyway.

@jieyouxu
Copy link
Member

For future bisection,
@bors rollup=never

@bors
Copy link
Contributor

bors commented Mar 11, 2025

☀️ Test successful - checks-actions
Approved by: scottmcm
Pushing 374ce1f to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 11, 2025
@bors bors merged commit 374ce1f into rust-lang:master Mar 11, 2025
7 checks passed
@rustbot rustbot added this to the 1.87.0 milestone Mar 11, 2025
@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
   Compiling clap v4.5.28
   Compiling build_helper v0.1.0 (/home/runner/work/rust/rust/src/build_helper)
   Compiling citool v0.1.0 (/home/runner/work/rust/rust/src/ci/citool)
    Finished `release` profile [optimized] target(s) in 33.53s
     Running `target/release/citool post-merge-analysis 90384941aae4ea38de00e4702b50757e9b882a19 374ce1f90951b4dd1c8789c6a5905abe8ea99ef8`
error: unrecognized subcommand 'post-merge-analysis'

  tip: a similar subcommand exists: 'post-merge-report'

Usage: citool <COMMAND>

For more information, try '--help'.
##[error]Process completed with exit code 2.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (374ce1f): comparison URL.

Overall result: ❌ regressions - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

mean range count
Regressions ❌
(primary)
0.4% [0.4%, 0.5%] 3
Regressions ❌
(secondary)
1.2% [1.2%, 1.2%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.4% [0.4%, 0.5%] 3

Max RSS (memory usage)

Results (primary -1.1%, secondary 2.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.3% [2.1%, 2.4%] 2
Regressions ❌
(secondary)
2.8% [2.1%, 4.1%] 6
Improvements ✅
(primary)
-7.8% [-7.8%, -7.8%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.1% [-7.8%, 2.4%] 3

Cycles

Results (secondary -1.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
5.3% [5.3%, 5.4%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-3.8% [-4.9%, -2.2%] 6
All ❌✅ (primary) - - 0

Binary size

Results (primary -0.0%, secondary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.0% [0.0%, 0.2%] 32
Regressions ❌
(secondary)
0.0% [0.0%, 0.0%] 3
Improvements ✅
(primary)
-0.1% [-0.5%, -0.1%] 10
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.0% [-0.5%, 0.2%] 42

Bootstrap: 782.162s -> 782.387s (0.03%)
Artifact size: 365.19 MiB -> 365.21 MiB (0.00%)

@rustbot rustbot added the perf-regression Performance regression. label Mar 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-fmt Area: `core::fmt` disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. to-announce Announce this issue on triage meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.