| 1 | +- Feature Name: (not applicable) |
| 2 | +- Start Date: 2016-05-17 |
| 3 | +- RFC PR: (leave this empty) |
| 4 | +- Rust Issue: (leave this empty) |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +Removes the one-type-only restriction on `format_args!` arguments. |
| 10 | +Expressions like `format_args!("{0:x} {0:o}", foo)` now work as intended, |
| 11 | +where each argument is still evaluated only once, in order of appearance |
| 12 | +(i.e. left-to-right). |
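As a small illustration of the intended behavior, the sketch below formats the same argument with several different format traits. It assumes a compiler that implements this proposal; the helper name `render_rgb` is made up for the example and is not part of the RFC.

```rust
/// Hypothetical helper: formats each named argument twice, once as a
/// decimal number and once as two-digit lowercase hex.
fn render_rgb(r: u8, g: u8, b: u8) -> String {
    format!(
        "rgb({r}, {g}, {b}) is #{r:02x}{g:02x}{b:02x}",
        r = r,
        g = g,
        b = b
    )
}

fn main() {
    println!("{}", render_rgb(0x66, 0xcc, 0xff));

    // The argument expression runs once even though it is referenced twice.
    let mut evaluations = 0;
    let text = format!("{0:x} {0:o}", {
        evaluations += 1;
        255u32
    });
    assert_eq!(text, "ff 377");
    assert_eq!(evaluations, 1);
}
```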
| 13 | + |
| 14 | +# Motivation |
| 15 | +[motivation]: #motivation |
| 16 | + |
| 17 | +The `format_args!` macro and its friends historically only allowed a single |
| 18 | +type per argument, such that trivial format strings like `"{0:?} == {0:x}"` or |
| 19 | +`"rgb({r}, {g}, {b}) is #{r:02x}{g:02x}{b:02x}"` are illegal. This is |
| 20 | +massively inconvenient and counter-intuitive, especially considering the |
| 21 | +formatting syntax is borrowed from Python where such things are perfectly |
| 22 | +valid. |
| 23 | + |
| 24 | +Upon closer investigation, the restriction is in fact an artificial |
| 25 | +implementation detail. For mapping format placeholders to macro arguments the |
| 26 | +`format_args!` implementation did not bother to record type information for |
| 27 | +all the placeholders sequentially, but rather chose to remember only one type |
| 28 | +per argument. The formatting logic has also received little attention |
| 29 | +since its conception, while its uses have greatly expanded over the years, |
| 30 | +so the mechanism as a whole certainly needs more love. |
| 31 | + |
| 32 | +# Detailed design |
| 33 | +[design]: #detailed-design |
| 34 | + |
| 35 | +In Rust, formatting is done at both compile time (expansion time, to be |
| 36 | +pedantic) and runtime. As we are concerned with format string parsing, not |
| 37 | +output, this RFC only touches the compile-time side of the existing |
| 38 | +formatting mechanism, namely `libsyntax_ext` and `libfmt_macros`. |
| 39 | + |
| 40 | +Before continuing with the details, it is worth noting that the core flow of |
| 41 | +current Rust formatting is *mapping arguments to placeholders to format specs*. |
| 42 | +For clarity, we distinguish among *placeholders*, *macro arguments* and |
| 43 | +*argument objects*. They are all *italicized* to provide some |
| 44 | +visual hint for distinction. |
| 45 | + |
| 46 | +To implement the proposed design, the following changes in behavior are made: |
| 47 | + |
| 48 | +* implicit references are resolved while parsing the format string; |
| 49 | +* named *macro arguments* are resolved into positional ones; |
| 50 | +* placeholder types are remembered and de-duplicated for each *macro argument*; |
| 51 | +* the *argument objects* are emitted with information gathered in steps above. |
| 52 | + |
| 53 | +As most of the details are best described in the code itself, we only |
| 54 | +illustrate some of the high-level changes below. |
| 55 | + |
| 56 | +## Implicit reference resolution |
| 57 | + |
| 58 | +Currently two forms of implicit references exist: `ArgumentNext` and |
| 59 | +`CountIsNextParam`. Both take a positional *macro argument* and advance the |
| 60 | +same internal pointer, but the format spec is resolved before the position, as |
| 61 | +shown by the format string `"{foo:.*} {} {:.*}"`, which is in every way |
| 62 | +equivalent to `"{foo:.0$} {1} {3:.2$}"`. |
| 63 | + |
| 64 | +As the rule is already known even at compile-time, and does not require the |
| 65 | +whole format string to be known beforehand, the resolution can happen just |
| 66 | +inside the parser after a *placeholder* is successfully parsed. As a natural |
| 67 | +consequence, both forms can be removed from the rest of the compiler, |
| 68 | +simplifying work later. |
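As a rough illustration of the rule above, the following sketch resolves implicit references with a single shared counter. The types are simplified stand-ins rather than the actual `libfmt_macros` definitions: only `ArgumentNext` and `CountIsNextParam` are named in this RFC, while the other variant names, the `Placeholder` struct and `resolve_implicit` are assumptions made for the example.

```rust
// Simplified, hypothetical stand-ins for the parser's types.
#[derive(Debug, Clone, PartialEq)]
enum Position {
    ArgumentNext,          // `{}`
    ArgumentIs(usize),     // `{0}`
    ArgumentNamed(String), // `{foo}`
}

#[derive(Debug, Clone, PartialEq)]
enum Count {
    CountImplied,        // no precision given
    CountIsNextParam,    // `.*`
    CountIsParam(usize), // `.N$`
}

#[derive(Debug, Clone, PartialEq)]
struct Placeholder {
    position: Position,
    precision: Count,
}

/// Rewrites implicit references into explicit ones, in parse order.
/// Within one placeholder, `.*` takes its index before an implicit
/// position does, and both advance the same counter.
fn resolve_implicit(placeholders: &mut [Placeholder]) {
    let mut next = 0;
    for p in placeholders.iter_mut() {
        if p.precision == Count::CountIsNextParam {
            p.precision = Count::CountIsParam(next);
            next += 1;
        }
        if p.position == Position::ArgumentNext {
            p.position = Position::ArgumentIs(next);
            next += 1;
        }
    }
}

fn main() {
    // Parsed form of "{foo:.*} {} {:.*}".
    let mut parsed = vec![
        Placeholder {
            position: Position::ArgumentNamed("foo".to_string()),
            precision: Count::CountIsNextParam,
        },
        Placeholder {
            position: Position::ArgumentNext,
            precision: Count::CountImplied,
        },
        Placeholder {
            position: Position::ArgumentNext,
            precision: Count::CountIsNextParam,
        },
    ];
    resolve_implicit(&mut parsed);

    // Matches "{foo:.0$} {1} {3:.2$}".
    assert_eq!(parsed[0].precision, Count::CountIsParam(0));
    assert_eq!(parsed[1].position, Position::ArgumentIs(1));
    assert_eq!(parsed[2].precision, Count::CountIsParam(2));
    assert_eq!(parsed[2].position, Position::ArgumentIs(3));
    println!("{:?}", parsed);
}
```

Because the counter only ever looks at placeholders already parsed, the same logic could run incrementally inside the parser instead of as a separate pass.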
| 69 | + |
| 70 | +## Named argument resolution |
| 71 | + |
| 72 | +Not seen elsewhere in Rust, named arguments in format macros are best seen as |
| 73 | +syntactic sugar, and should be treated as such. Just after |
| 74 | +successfully parsing the *macro arguments*, we immediately rewrite every name |
| 75 | +to its respective position in the argument list, which again simplifies the |
| 76 | +process. |
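The rewrite described above could look roughly like the sketch below. It assumes the *macro arguments* have already been parsed into a list where each entry optionally carries a name; `Position`, `name_to_index` and `resolve_named` are hypothetical names for this example, not the compiler's actual items.

```rust
use std::collections::HashMap;

// Simplified, hypothetical stand-in for a placeholder's argument reference.
#[derive(Debug, Clone, PartialEq)]
enum Position {
    ArgumentIs(usize),
    ArgumentNamed(String),
}

/// Builds a name -> position table. `arg_names[i]` is `Some(name)` when the
/// i-th macro argument was written as `name = expr`.
fn name_to_index(arg_names: &[Option<&str>]) -> HashMap<String, usize> {
    let mut table = HashMap::new();
    for (index, name) in arg_names.iter().enumerate() {
        if let Some(name) = name {
            table.insert(name.to_string(), index);
        }
    }
    table
}

/// Replaces every named reference with its absolute position, or reports
/// the first name that has no matching macro argument.
fn resolve_named(
    positions: &mut [Position],
    names: &HashMap<String, usize>,
) -> Result<(), String> {
    for pos in positions.iter_mut() {
        if let Position::ArgumentNamed(name) = pos {
            let index = match names.get(name.as_str()) {
                Some(&i) => i,
                None => return Err(format!("no argument named `{}`", name)),
            };
            *pos = Position::ArgumentIs(index);
        }
    }
    Ok(())
}

fn main() {
    // Models `format_args!("{foo} {0}", 1, foo = 2)`.
    let names = name_to_index(&[None, Some("foo")]);
    let mut positions = vec![
        Position::ArgumentNamed("foo".to_string()),
        Position::ArgumentIs(0),
    ];
    resolve_named(&mut positions, &names).unwrap();
    assert_eq!(
        positions,
        vec![Position::ArgumentIs(1), Position::ArgumentIs(0)]
    );
}
```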
| 77 | + |
| 78 | +## Processing and expansion |
| 79 | + |
| 80 | +We only have absolute positional references to *macro arguments* at this point, |
| 81 | +and it's straightforward to remember all unique *placeholders* encountered for |
| 82 | +each. The unique *placeholders* are emitted into *argument objects* in order, |
| 83 | +to preserve evaluation order; behavior is otherwise unchanged. |
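One possible shape for this step is sketched below, under the assumption that every *placeholder* has already been reduced to a pair of absolute argument index and requested format trait. The names `FormatTrait` and `build_argument_objects` are invented for the example, and the real expansion code may organize its bookkeeping differently.

```rust
// Hypothetical: the formatting trait a placeholder asks for.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FormatTrait {
    Display,
    LowerHex,
    Octal,
}

/// Returns the de-duplicated argument objects as (argument index, trait)
/// pairs, plus, for every placeholder, the index of the object it uses.
/// Panics if a placeholder refers to an argument index >= `arg_count`.
fn build_argument_objects(
    placeholders: &[(usize, FormatTrait)],
    arg_count: usize,
) -> (Vec<(usize, FormatTrait)>, Vec<usize>) {
    // Unique traits per macro argument, in order of first appearance.
    let mut per_arg: Vec<Vec<FormatTrait>> = vec![Vec::new(); arg_count];
    for &(arg, ty) in placeholders {
        if !per_arg[arg].contains(&ty) {
            per_arg[arg].push(ty);
        }
    }

    // Emit objects argument by argument so the arguments keep their order.
    let mut objects = Vec::new();
    let mut offsets = Vec::with_capacity(arg_count);
    for (arg, traits) in per_arg.iter().enumerate() {
        offsets.push(objects.len());
        for &ty in traits {
            objects.push((arg, ty));
        }
    }

    // Point every placeholder at the object that matches its trait.
    let mapping = placeholders
        .iter()
        .map(|&(arg, ty)| {
            let within = per_arg[arg]
                .iter()
                .position(|&t| t == ty)
                .expect("every placeholder trait was recorded above");
            offsets[arg] + within
        })
        .collect();

    (objects, mapping)
}

fn main() {
    // Models "{1} {0:x} {0:o} {0:x}" with two macro arguments.
    let placeholders = [
        (1, FormatTrait::Display),
        (0, FormatTrait::LowerHex),
        (0, FormatTrait::Octal),
        (0, FormatTrait::LowerHex),
    ];
    let (objects, mapping) = build_argument_objects(&placeholders, 2);
    assert_eq!(objects.len(), 3); // the repeated `{0:x}` shares one object
    assert_eq!(mapping, vec![2, 0, 1, 0]);
}
```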
| 84 | + |
| 85 | +# Drawbacks |
| 86 | +[drawbacks]: #drawbacks |
| 87 | + |
| 88 | +Due to the added data structures and processing, the time and memory costs of |
| 89 | +compilation may increase slightly. However, this is mere speculation without |
| 90 | +actual profiling and benchmarks. Also, the ergonomic benefits alone justify |
| 91 | +the additional costs. |
| 92 | + |
| 93 | +# Alternatives |
| 94 | +[alternatives]: #alternatives |
| 95 | + |
| 96 | +## Do nothing |
| 97 | + |
| 98 | +One can always write a little more code to simulate the proposed behavior, |
| 99 | +and this is what people have most likely been doing under today's constraints. |
| 100 | +As in: |
| 101 | + |
| 102 | +```rust |
| 103 | +fn main() { |
| 104 | +    let r = 0x66; |
| 105 | +    let g = 0xcc; |
| 106 | +    let b = 0xff; |
| 107 | + |
| 108 | +    // rgb(102, 204, 255) == #66ccff |
| 109 | +    // println!("rgb({r}, {g}, {b}) == #{r:02x}{g:02x}{b:02x}", r=r, g=g, b=b); |
| 110 | +    println!("rgb({}, {}, {}) == #{:02x}{:02x}{:02x}", r, g, b, r, g, b); |
| 111 | +} |
| 112 | +``` |
| 113 | + |
| 114 | +Or slightly more verbose when side effects are in play: |
| 115 | + |
| 116 | +```rust |
| 117 | +fn do_something(i: &mut usize) -> usize { |
| 118 | +    let result = *i; |
| 119 | +    *i += 1; |
| 120 | +    result |
| 121 | +} |
| 122 | + |
| 123 | +fn main() { |
| 124 | +    let mut i = 0x1234usize; |
| 125 | + |
| 126 | +    // 0b1001000110100 0o11064 0x1234 |
| 127 | +    // 0x1235 |
| 128 | +    // println!("{0:#b} {0:#o} {0:#x}", do_something(&mut i)); |
| 129 | +    // println!("{:#x}", i); |
| 130 | + |
| 131 | +    // need to consider side effects, hence a temp var |
| 132 | +    { |
| 133 | +        let r = do_something(&mut i); |
| 134 | +        println!("{:#b} {:#o} {:#x}", r, r, r); |
| 135 | +        println!("{:#x}", i); |
| 136 | +    } |
| 137 | +} |
| 138 | +``` |
| 139 | + |
| 140 | +While the effects are the same and nothing requires modification, the |
| 141 | +ergonomics are simply bad and the code becomes unnecessarily convoluted. |
| 142 | + |
| 143 | +# Unresolved questions |
| 144 | +[unresolved]: #unresolved-questions |
| 145 | + |
| 146 | +None. |