Skip to content

Add mechanism for linking resource keys with data structs #570

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sffc opened this issue Mar 24, 2021 · 14 comments · Fixed by #1554
Closed

Add mechanism for linking resource keys with data structs #570

sffc opened this issue Mar 24, 2021 · 14 comments · Fixed by #1554
Assignees
Labels
C-data-infra Component: provider, datagen, fallback, adapters S-small Size: One afternoon (small bug fix or enhancement) T-docs-tests Type: Code change outside core library

Comments

@sffc
Copy link
Member

sffc commented Mar 24, 2021

There is currently no documentation that links resource keys, like icu_decimal::provider::key::SYMBOLS_V1, to data structs, like icu_decimal::provider::DecimalSymbolsV1. We should figure out a way to document and enforce this mapping.

@sffc sffc added discuss Discuss at a future ICU4X-SC meeting C-numbers Component: Numbers, units, currencies C-data-infra Component: provider, datagen, fallback, adapters and removed C-numbers Component: Numbers, units, currencies labels Mar 24, 2021
@sffc
Copy link
Member Author

sffc commented Apr 29, 2021

  • @nordzilla: Could we use associated constants? icu_decimal::provider::DecimalSymbolsV1::key::SYMBOLS_V1

@sffc sffc self-assigned this Apr 29, 2021
@sffc sffc added S-small Size: One afternoon (small bug fix or enhancement) T-docs-tests Type: Code change outside core library and removed discuss Discuss at a future ICU4X-SC meeting labels Apr 29, 2021
@sffc sffc added this to the 2021 Q2-m3 milestone Apr 29, 2021
@sffc sffc modified the milestones: 2021 Q3-m1, 2021 Q3 0.4 Sprint B Aug 11, 2021
@sffc
Copy link
Member Author

sffc commented Sep 14, 2021

Here's the approach I'd like to try. I can add a new field, hidden behind a feature flag, to ResourceKey: type_id: core::any::TypeId. This field would store the type information of the struct that we expect to see when the key is used. I can then check at runtime whether the type signature matches the key.

However, I am hitting rust-lang/rust#77125, because we always create ResourceKey in a const context. I'm therefore going to consider this issue blocked on upstream stabilizing const TypeId::of().

@sffc sffc modified the milestones: 2021 Q3 0.4 Sprint B, ICU4X 0.5 Sep 14, 2021
@sffc sffc added the blocked A dependency must be resolved before this is actionable label Sep 14, 2021
@sffc
Copy link
Member Author

sffc commented Sep 14, 2021

CC @Manishearth ^^ see above comment

@sffc sffc changed the title Add docs for linking resource keys with data structs Add mechanism for linking resource keys with data structs Sep 14, 2021
@sffc
Copy link
Member Author

sffc commented Sep 14, 2021

  • @nordzilla: Could we use associated constants? icu_decimal::provider::DecimalSymbolsV1::key::SYMBOLS_V1

No, because there is not a 1-to-1 mapping. A single data struct could have multiple keys. I think it is a good thing that multiple keys could share the same data struct, because it helps unify code and reduce code size. I did a little experimentation and found that duplicating a data struct for two keys results in slightly higher code size than the two keys using the same data struct. I'm willing to share my benchmark if interested. Note to self: it is in the stash named "dup/dry benchmark bin"

sffc added a commit to sffc/omnicu that referenced this issue Sep 14, 2021
@Manishearth
Copy link
Member

We could still use non-trait associated consts

impl FooDataProvider {
   const SYMBOLS_V1 = ...;
}

@sffc
Copy link
Member Author

sffc commented Sep 14, 2021

The problem is there's no impl to put that. To clarify, here is how things currently stand:

  1. ResourceKey is a struct. There are many const instances of that struct; for example, icu_provider::hello_world::key::HELLO_V1 is a const ResourceKey.
  2. There is a one-to-many mapping from ResourceKey to data struct
  3. The data structs all use #[icu_provider::data_struct]

Here are things I've considered:

  • Associated constant on the data struct: does not work because there is a many-to-one mapping from data struct to resource key
  • PhantomData on the resource key: does not work because resource keys can be used in type-erased contexts where generic types are not available
  • type_id field on the resource key: should work, but not the most elegant solution, since it relies on runtime checking, plus const type_id is not yet stable
  • New trait ResourceKeyMarker that combines an associated const ResourceKey with an associated type: maybe? It would require refactoring a bunch of code, so I would want to make sure this is a good idea before doing it

@Manishearth
Copy link
Member

The problem is there's no impl to put that.

I don't understand why this is true. We can write one, or autogenerate one.

Associated constant on the data struct: does not work because there is a many-to-one mapping from data struct to resource key

If it's many-to-one we don't even need to worry about doing inherent impls, we can make this a part of DataMarker or DataStruct or something. But we can always do inherent impls for this.

The goal is simply documentation, right?

@sffc
Copy link
Member Author

sffc commented Sep 14, 2021

There are many keys that map to a single DataMarker. The problem this ticket is trying to solve is how to enforce that a data request for a particular key results in the correct data struct.

The goal is simply documentation, right?

At least documentation, but better would be if we could enforce that keys and structs match each other.

For example, if I write provider.load_payload::<HelloWorldV1>(key::HELLO_WORLD_V1) (not real code), I want to enforce that the type parameter and the key argument agree with each other.

@sffc sffc added discuss Discuss at a future ICU4X-SC meeting and removed discuss Discuss at a future ICU4X-SC meeting labels Sep 22, 2021
@Manishearth
Copy link
Member

@sffc we could have a DataKeyFor trait with an associated const but we won't be able to enforce it at compile time unless we make key::HELLO_WORLD_V1 into a marker type. That could work with a DataKey trait that has a KEY_VALUE associated constant, and you instead do provider.load_payload::<HelloWorldV1, HelloWorldV1Key>() or something. But that seems like overkill to me, and also would be annoying to work with from FFI with runtime keys.

@sffc
Copy link
Member Author

sffc commented Oct 1, 2021

Should we consider something like this?

trait DataMarker<'data> {
    type Yokeable: ...;
    type Cart: ...;
    const key: DataKey;
}

struct HelloWorldV1Marker {}
impl<'data> DataMarker<'data> for HelloWorldV1Marker {
    type Yokeable: HelloWorldV1<'static>;
    type Cart: HelloWorldV1<'data>;
    const key: icu_resource_key!("core", "hello_world", 1);
}

// Make a new marker struct for each key using the same struct
struct HelloWorldV1AltMarker {}
impl<'data> DataMarker<'data> for HelloWorldV1Marker {
    type Yokeable: HelloWorldV1<'static>;
    type Cart: HelloWorldV1<'data>;
    const key: icu_resource_key!("core", "hello_world_alt", 1);
}

and then you have just

provider.load_payload::<HelloWorldV1Marker>()

and we don't need to carry the key in the request because you can get it from the type parameter.

Not sure what this means for FFI, as you said.

@sffc
Copy link
Member Author

sffc commented Oct 1, 2021

also, what I suggested above basically does not work with ErasedDataProvider the way we have it; ErasedDataProvider would have to become its own trait again, with a signature more similar to DataProvider today. But maybe that's actually fine.

@Manishearth
Copy link
Member

I think this works. Yes, this won't work well for FFI.

@Manishearth
Copy link
Member

@sffc actually, what about cases where we want the same data type but multiple key kinds: i think we'll want this for calendar formatting where we'll have a GREGORY and JAPANESE and INDIAN etc keys for the data, even though they're all the same payload type.

@sffc
Copy link
Member Author

sffc commented Oct 1, 2021

The request should carry runtime data, which is the locale and the variant. GREGORY, JAPANESE, and INDIAN are variants. That doesn't affect the DataKey. So it should work.

And for FFI, we don't actually care, because most FFI functions take a concrete type or the ErasedDataProvider type. We would only care if we expose an FFI that directly returns the data structs... which we don't have right now and which I think we can support by other means (such as FFI mutating JSON buffers).

@sffc sffc added needs-approval One or more stakeholders need to approve proposal and removed blocked A dependency must be resolved before this is actionable labels Oct 1, 2021
@sffc sffc modified the milestones: ICU4X 0.5, 2021 Q4 0.5 Sprint A Oct 21, 2021
@sffc sffc added discuss Discuss at a future ICU4X-SC meeting and removed needs-approval One or more stakeholders need to approve proposal labels Nov 5, 2021
@sffc sffc modified the milestones: 2021 Q4 0.5 Sprint B, ICU4X 0.5 Nov 18, 2021
@sffc sffc modified the milestones: ICU4X 0.5, 2021 Q4 0.5 Sprint F Jan 18, 2022
@sffc sffc removed the discuss Discuss at a future ICU4X-SC meeting label Jan 20, 2022
@sffc sffc closed this as completed in #1554 Feb 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-data-infra Component: provider, datagen, fallback, adapters S-small Size: One afternoon (small bug fix or enhancement) T-docs-tests Type: Code change outside core library
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants