RFC: a new formatting API #1127
That is a good use case that is not yet covered by the solution in #902.
While far less common, I think there is some use to keeping the formatting string dynamic. But we can get pretty much the same with const fns?
If we want to allow for an API that doesn't need allocations (in theory) such as we have now it seems to me we have to stick with iterators. Because where can you store a slice of unknown length?
The type that implements
I do think the idea of a method that binds an iterator of formatting items to an input type is good. And also, at some point, to make going through the resulting type the only way to create a
struct Formatter<I, T> {
items_iter: I,
value: PhantomData<T>,
} |
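To make the binding idea concrete, here is a minimal sketch, not the proposed chrono API. The `Item` enum, the `Date` struct and the `format` method are hypothetical, heavily simplified stand-ins; only the `Formatter<I, T>` shape with a `PhantomData<T>` marker comes from the comment above.

```rust
use std::fmt::Write;
use std::marker::PhantomData;

// Hypothetical, heavily simplified formatting item (not chrono's `Item`).
#[derive(Clone, Debug, PartialEq)]
enum Item<'a> {
    Literal(&'a str),
    Year,
    Month,
    Day,
}

// Hypothetical stand-in for a date type.
#[derive(Clone, Copy, Debug)]
struct Date {
    year: i32,
    month: u32,
    day: u32,
}

// An iterator of items bound to an input type `T`. `T` is only a marker.
struct Formatter<I, T> {
    items_iter: I,
    value: PhantomData<T>,
}

impl<'a, I> Formatter<I, Date>
where
    I: Iterator<Item = Item<'a>> + Clone,
{
    fn new(items_iter: I) -> Self {
        Formatter { items_iter, value: PhantomData }
    }

    // Only a `Date` is accepted here; passing another type does not compile.
    fn format(&self, date: &Date) -> String {
        let mut out = String::new();
        // The iterator is cloned so the same formatter can be reused.
        for item in self.items_iter.clone() {
            match item {
                Item::Literal(s) => out.push_str(s),
                Item::Year => write!(out, "{:04}", date.year).unwrap(),
                Item::Month => write!(out, "{:02}", date.month).unwrap(),
                Item::Day => write!(out, "{:02}", date.day).unwrap(),
            }
        }
        out
    }
}
```

Because the items are stored as a cloneable iterator rather than a collection, nothing in this sketch requires an allocation for the items themselves, which is the point made about slices of unknown length.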
I'm not sure what you mean by "keeping the formatting string dynamic"?
Yeah, eventually the point is iterating over it, and we could have a version that abstracts over roughly something like
Right, I guess what I'm saying is that
So yes, one idea is binding the allowed input type into a value. But I think the other core aspect is really crisply separating the parsing and formatting stages at least in the internal API. |
So I guess there is a third failure mode here, which is that some values are outside the range of (some interpretation of) some format items -- in particular, RFC 3339 does not allow years outside the range 0000-9999. |
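A small sketch of what this third failure mode could look like when surfaced as an error. The `FormatError` type and the `rfc3339_date` function are hypothetical names for illustration only; whether such a check belongs in the formatter at all is exactly what is being discussed here.

```rust
// Hypothetical error type for values a format cannot represent.
#[derive(Debug, PartialEq)]
enum FormatError {
    YearOutOfRange(i32),
}

// Formats the date part of an RFC 3339 timestamp, which uses a four-digit year.
fn rfc3339_date(year: i32, month: u32, day: u32) -> Result<String, FormatError> {
    // Years outside 0000-9999 cannot be written with exactly four digits.
    if !(0..=9999).contains(&year) {
        return Err(FormatError::YearOutOfRange(year));
    }
    Ok(format!("{:04}-{:02}-{:02}", year, month, day))
}
```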
I have arrived at an API that seems to do everything I want.
Part 1: format string parsing
Add two new fallible methods to
#[derive(Clone, Debug)]
pub struct StrftimeItems<'a> { /* ... */ }
impl<'a> StrftimeItems<'a> {
/// Creates a new parsing iterator from the `strftime`-like format string.
/// The iterator will return `Item::Error` when the format string is invalid.
pub const fn new(s: &'a str) -> StrftimeItems<'a>;
/// Creates a new parsing iterator from the `strftime`-like format string.
#[cfg(feature = "unstable-locales")]
pub const fn new_with_locale(s: &'a str, locale: Locale) -> StrftimeItems<'a>;
/// Parse into a `Vec` with formatting items.
/// Returns `Error` if the format string is invalid.
pub fn parse(self) -> Result<Vec<Item<'a>>, ParseError>;
/// Parse formatting items into an existing `[Item]` slice.
/// Returns `Error` if the format string is invalid or if the slice can't fit all elements.
pub fn parse_into_slice<'b>(self, buf: &'b mut [Item<'a>]) -> Result<&'b [Item<'a>], ParseError>;
}
impl<'a> Iterator for StrftimeItems<'a> {
type Item = Item<'a>;
fn next(&mut self) -> Option<Item<'a>>;
}
Part 2: bind to input type
Add a new
#[derive(Clone, Debug)]
pub struct FormattingSpec<I, T> {
items: I,
date_time_type: PhantomData<T>,
}
The methods to construct it live on the input types, because from there they can be called without type annotations. Example for
impl NaiveDate {
/// Create a new `FormattingSpec` that can be used for multiple `NaiveDate`s.
pub fn formatter<'a, I, J, B>(items: I) -> Result<FormattingSpec<J, Self>, ParseError>
where
I: IntoIterator<Item = B, IntoIter = J> + Clone,
J: Iterator<Item = B> + Clone,
B: Borrow<Item<'a>>;
}
impl DateTime<Utc> {
/// Create a new `FormattingSpec` that can be used for multiple `DateTime`s.
pub fn formatter<'a, I, J, B>(items: I) -> Result<FormattingSpec<'a, J, Self>, ParseError>
where
I: IntoIterator<Item = B, IntoIter = J>,
J: Iterator<Item = B> + Clone,
B: Borrow<Item<'a>>;
}
Part 3: new formatter
I took this opportunity to add a new I see no harm in making the methods public like they are on
#[derive(Debug)]
pub struct Formatter<I, Off> { /* ... */ }
impl<'a, I, B, Off> Formatter<I, Off>
where
I: Iterator<Item = B> + Clone,
B: Borrow<Item<'a>>,
Off: Offset + fmt::Display,
{
/// Makes a new `Formatter` value out of local date and time and UTC offset.
pub fn new(
date: Option<NaiveDate>,
time: Option<NaiveTime>,
offset: Option<Off>,
items: I,
) -> Formatter<I, Off>;
/// Makes a new `Formatter` value out of local date and time, UTC offset and locale.
#[cfg(feature = "unstable-locales")]
pub fn new_with_locale(
date: Option<NaiveDate>,
time: Option<NaiveTime>,
offset: Option<Off>,
items: I,
locale: Locale,
) -> Formatter<I, Off>;
}
impl<'a, I, B, Off> fmt::Display for Formatter<I, Off>
where
I: Iterator<Item = B> + Clone,
B: Borrow<Item<'a>>,
Off: Offset + fmt::Display;
Part 4: new format methods
Add new Example for
impl<Tz: TimeZone> DateTime<Tz>
where
Tz::Offset: fmt::Display,
{
/// Format using a `FormattingSpec` created with `DateTime::formatter`.
pub fn format_with<'a, I, B, Tz2>(
&self,
formatter: &FormattingSpec<I, DateTime<Tz2>>,
) -> Formatter<I, Tz::Offset>
where
I: Iterator<Item = B> + Clone,
B: Borrow<Item<'a>>,
Tz2: TimeZone;
/// Format using a `FormattingSpec` created with `DateTime::formatter` and a `locale`.
#[cfg(feature = "unstable-locales")]
pub fn format_locale_with<'a, I, B, Tz2>(
&self,
formatter: &FormattingSpec<'a, I, DateTime<Tz2>>,
locale: Locale,
) -> Formatter<I, Tz::Offset>
where
I: Iterator<Item = B> + Clone,
B: Borrow<Item<'a>>,
Tz2: TimeZone;
}
Part 5: convenience formatting method
Add new fallible
impl<Tz: TimeZone> DateTime<Tz>
where
Tz::Offset: fmt::Display,
{
pub fn try_format<'a>(&self, fmt: &'a str)
-> Result<Formatter<StrftimeItems<'a>, Tz::Offset>, ParseError>;
}
Owned vs borrowed arguments
The If you pass an owned
Examples
Example of using
fn test_format_with() -> Result<(), ParseError> {
let fmt_str = "%a %Y-%m-%d";
let dt1 = NaiveDate::from_ymd_opt(2023, 4, 18).unwrap();
let dt2 = NaiveDate::from_ymd_opt(2023, 9, 2).unwrap();
// Parses format string once, allocates
let fmt_items = StrftimeItems::new(&fmt_str).parse()?;
let formatter = NaiveDate::formatter(&fmt_items)?;
assert_eq!(dt1.format_with(&formatter).to_string(), "Tue 2023-04-18");
assert_eq!(dt2.format_with(&formatter).to_string(), "Sat 2023-09-02");
// Reparses format string on creation and every use
let fmt_items = StrftimeItems::new(&fmt_str);
let formatter = NaiveDate::formatter(fmt_items)?;
assert_eq!(dt1.format_with(&formatter).to_string(), "Tue 2023-04-18");
assert_eq!(dt2.format_with(&formatter).to_string(), "Sat 2023-09-02");
let mut buf = [
Item::Error,
Item::Error,
Item::Error,
Item::Error,
Item::Error,
Item::Error,
Item::Error,
];
// parses format string once, into existing slice
let fmt_items = StrftimeItems::new(fmt_str).parse_into_slice(&mut buf)?;
let formatter = NaiveDate::formatter(fmt_items)?;
assert_eq!(dt1.format_with(&formatter).to_string(), "Tue 2023-04-18");
assert_eq!(dt2.format_with(&formatter).to_string(), "Sat 2023-09-02");
Ok(())
}
@djc What do you prefer? If this seems okay, shall I make one big PR or split it up into pieces (that may not make much sense by themselves)? |
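To illustrate the `parse_into_slice` idea from Part 1, here is a minimal sketch under strong assumptions: a toy `Item` enum, a toy `ParseError`, and a free function that only understands `%Y`, `%m` and `%d`. It is not how `StrftimeItems` parses; it only shows how items can be written into a caller-supplied buffer without allocating, failing when the format string is invalid or the buffer is too small.

```rust
// Hypothetical, simplified item type (not chrono's `Item`).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Item<'a> {
    Literal(&'a str),
    Year,
    Month,
    Day,
    Error,
}

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ParseError {
    InvalidSpecifier,
    BufferTooSmall,
}

// Parses `fmt` into `buf` and returns the filled prefix of the buffer.
fn parse_into_slice<'a, 'b>(
    fmt: &'a str,
    buf: &'b mut [Item<'a>],
) -> Result<&'b [Item<'a>], ParseError> {
    let mut rest = fmt;
    let mut len = 0;
    while !rest.is_empty() {
        let item = if let Some(after) = rest.strip_prefix('%') {
            // A specifier: one character following the '%'.
            let mut chars = after.chars();
            let item = match chars.next() {
                Some('Y') => Item::Year,
                Some('m') => Item::Month,
                Some('d') => Item::Day,
                _ => return Err(ParseError::InvalidSpecifier),
            };
            rest = chars.as_str();
            item
        } else {
            // A literal: everything up to the next '%' (or the end).
            let end = rest.find('%').unwrap_or(rest.len());
            let (lit, tail) = rest.split_at(end);
            rest = tail;
            Item::Literal(lit)
        };
        // Fail instead of allocating when the buffer is full.
        let slot = buf.get_mut(len).ok_or(ParseError::BufferTooSmall)?;
        *slot = item;
        len += 1;
    }
    Ok(&buf[..len])
}
```

As in the third example above, the caller pre-fills the buffer with a placeholder such as `Item::Error` and receives back only the slice that was actually written.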
Some things I have looked at related to creating a new formatting API:
Are parsing errors in a format string and unavailable fields on the input type the only reasons for errors?
→ #1144 Dates with a year outside of 0..=9999 are difficult, as they are not supported by some specifications. In my opinion making sure a value is valid should be the concern of a higher-level API, especially if you consider that both problematic formatting items already have dual purposes (see #1144 for details).
Can we parse format strings at compile time?
→ This is possible but pretty hard with the current limitations on const methods. We would need to add workarounds for the following limitations:
Can we make
|
I wonder if we should require a different |
Found one more interesting issue while reviewing #1058: not all locales have a 12-hour clock. In that case the local format string for The formatting specifiers Even without deciding on these issues, it may be better to change the API a little so we can catch issues caused by a locale before formatting. I.e. supply a
impl DateTime<Utc> {
/// Create a new `FormattingSpec` that can be used for multiple `DateTime`s.
pub fn formatter<'a, I, J, B>(items: I) -> Result<FormattingSpec<'a, J, Self>, ParseError>
where
I: IntoIterator<Item = B, IntoIter = J>,
J: Iterator<Item = B> + Clone,
B: Borrow<Item<'a>>;
/// Create a new `FormattingSpec` that can be used for multiple `DateTime`s for a `locale`.
#[cfg(feature = "unstable-locales")]
pub fn formatter_with_locale<'a, I, J, B>(
items: I,
locale: Locale,
) -> Result<FormattingSpec<'a, J, Self>, ParseError>
where
I: IntoIterator<Item = B, IntoIter = J>,
J: Iterator<Item = B> + Clone,
B: Borrow<Item<'a>>;
}
impl<Tz: TimeZone> DateTime<Tz>
where
Tz::Offset: fmt::Display,
{
/// Format using a `FormattingSpec` created with `DateTime::formatter`.
pub fn format_with<'a, I, B, Tz2>(
&self,
formatter: &FormattingSpec<I, DateTime<Tz2>>,
) -> Formatter<I, Tz::Offset>
where
I: Iterator<Item = B> + Clone,
B: Borrow<Item<'a>>,
Tz2: TimeZone;
} |
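As a rough illustration of catching locale problems when the spec is created rather than when a value is formatted, consider the sketch below. Everything in it is hypothetical: the `Locale` struct with a `has_12_hour_clock` flag, the toy `Item` enum, and the policy of returning an error (the thread has not decided what should happen for such locales).

```rust
// Hypothetical, simplified item type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Item {
    Hour24,
    Hour12,
    AmPm,
    Minute,
}

// Hypothetical locale description; real locale data is far richer.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Locale {
    has_12_hour_clock: bool,
}

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ParseError {
    UnsupportedByLocale(Item),
}

#[derive(Debug, PartialEq)]
struct FormattingSpec {
    items: Vec<Item>,
    locale: Locale,
}

// Validates the items against the locale once, before any value is formatted.
fn formatter_with_locale(items: &[Item], locale: Locale) -> Result<FormattingSpec, ParseError> {
    for &item in items {
        let needs_12h = matches!(item, Item::Hour12 | Item::AmPm);
        if needs_12h && !locale.has_12_hour_clock {
            return Err(ParseError::UnsupportedByLocale(item));
        }
    }
    Ok(FormattingSpec { items: items.to_vec(), locale })
}
```

With the locale supplied up front, the later `format_with` call no longer needs a locale argument and has one less way to fail.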
Yes, with a procedural macro The procedural macro would take a string slice ( With this approach, it would also be possible to support runtime formatting by making the format function have a signature of |
I believe this RFC has the potential to make the formatting API much better. After reading through the comments and a couple of issues and pull requests, it would be nice to step back for a second and look at the big picture. Once we agree on the problem and the design goals, finding an appropriate implementation should be easier, so I wrote an executive summary of the discussion and what I understood to be the desired design. Please let me know if I missed something @pitdicker and @djc, happy to have your input.
RFC: A new formatting API
Why?
Currently some methods for formatting timestamps panic or fail in undesirable ways (#956, #575, #419). While the current formatting API is inspired by the classic strftime/strptime found in languages like C and Python and works generally well, handling failures in an idiomatic way or preventing them altogether by using the type system would be preferred.
Failure modes
The existing formatting API has the following failure modes:
Prior work
Some solutions have been proposed so far, like returning a Result on formatting (#902) or silently skipping improper formatting directives (#614). None of them tries to split the problem into its fundamental parts, which is what's proposed below.
Summary of the discussion so far
Approximate design:
|
@kamadorueda You write more clearly than I do, good summary.
To add to that: Currently the only In #1144 I argued that we can make the items work with any year. I.e. make
I don't see how we can bind to multiple types in the type system. I currently just bind to the one that is to be formatted. (Only the
Just yesterday I was trying to write such a method 😆. I am just hitting one problem: With my current implementation, if you bind an owned value to an input type ( I'll open a PR in a couple of days so you and @djc have something concrete to shoot at. |
@kamadorueda great summary, I think you covered all the bases. Your proposal to have the types bind the format sequence makes sense to me but is also slightly less expressive, I think? For example, ideally the outcome of parsing (This might not be a big downside, but it is a trade-off to be aware of.) |
With an intermediate trait... Something like |
I don't think #1121 is the right approach. The fallibility here is with (a) parsing the format string, which might be invalid, and (b) whether the type we're formatting from contains enough information to fill out the fields the format string has.
Making something work on 0.4.x is "trivial" in a sense because we can just invent new API that does whatever we want. Postponing the failures of parsing from the parsing stage to the formatting stage doesn't sound like a great option; in particular, a reasonable approach is to use a long-lived formatter (parsed from a string) for many values. For getting enough data out of the input type, that seems like something we could assert at compile time using generics.
As such, I think the high-level structure here should be:
- Parse the format string into a Vec of items (reusing the existing item type if we can)
- Bind the Vec or slice (for the non-alloc case) to a particular input type, failing if the type does not contain enough data for the input type
- Yield a struct Widget<'a, I> { items: Cow<'a, [Items]>, input: PhantomData<I> } (replacing the Cow with just a slice if alloc is not enabled, I think?) which implements Display and will only fail (as you explained in Make StrftimeItems::new return Result #902 (comment)) if the inner Formatter fails
To the extent that we can't implement this (efficiently) within the current API (from what I've seen, this will just get ugly with the approach chosen in #902), we should invent a new entry point. The new entry point should be able to satisfy all the use cases for the existing entry points so that we can, at some point in the near future, start deprecating the existing entry points. In particular, there should be affordances for the non-alloc use case and there should be a clean way (that is, with minimal code duplication) to factor in the localization stuff.
cc @jaggededgejustice @pitdicker
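The bind and display steps of the structure outlined above could be sketched roughly as follows. This is an illustration under assumptions, not a proposed implementation: the `Input` trait, `bind`, `Spec`, `BindError` and the toy `Item` enum are hypothetical names, the items are taken as already parsed, and a plain slice is used instead of a `Cow` to keep the sketch allocation-free.

```rust
use std::fmt;
use std::marker::PhantomData;

// Hypothetical, simplified item type; parsing is assumed to have happened already.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Item {
    Literal(&'static str),
    Year,
    Month,
    Hour,
}

#[derive(Debug, PartialEq)]
enum BindError {
    MissingField(Item),
}

// Hypothetical trait describing which fields an input type can supply.
trait Input {
    fn supports(item: Item) -> bool;
    fn field(&self, item: Item) -> Option<i64>;
}

// A date-only type: it has no hour.
struct Date {
    year: i64,
    month: i64,
}

impl Input for Date {
    fn supports(item: Item) -> bool {
        !matches!(item, Item::Hour)
    }
    fn field(&self, item: Item) -> Option<i64> {
        match item {
            Item::Year => Some(self.year),
            Item::Month => Some(self.month),
            _ => None,
        }
    }
}

// Items bound to an input type `T`.
struct Spec<'a, T> {
    items: &'a [Item],
    input: PhantomData<T>,
}

// Binding fails if `T` cannot supply a field that the items need.
fn bind<T: Input>(items: &[Item]) -> Result<Spec<'_, T>, BindError> {
    match items.iter().find(|&&item| !T::supports(item)) {
        Some(&item) => Err(BindError::MissingField(item)),
        None => Ok(Spec { items, input: PhantomData }),
    }
}

// A value paired with its items; implements `Display`.
struct Widget<'a, T> {
    items: &'a [Item],
    value: &'a T,
}

impl<'a, T: Input> Spec<'a, T> {
    fn format<'b>(&'b self, value: &'b T) -> Widget<'b, T> {
        Widget { items: self.items, value }
    }
}

impl<T: Input> fmt::Display for Widget<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for &item in self.items {
            match item {
                Item::Literal(s) => f.write_str(s)?,
                other => match self.value.field(other) {
                    Some(v) => write!(f, "{:02}", v)?,
                    // Unreachable after a successful `bind`; kept as a safety net.
                    None => return Err(fmt::Error),
                },
            }
        }
        Ok(())
    }
}
```

In this sketch the missing-field check happens at run time inside `bind`, once per spec. Moving it to compile time, as suggested above, would need the set of required fields to be visible in the type system, which this sketch does not attempt.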