-
Notifications
You must be signed in to change notification settings - Fork 213
Adding LruDataCache and overlay examples #2914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a good example. It ignores all the zero-copying that we've put a lot of effort into, and even needs to clone the locale to do lookup. Let's restrict the example to wrapping an AnyProvider
, and put more work into AnyProvider
<-> BufferProvider
interaction so that we don't allocate left and right.
I disagree on the following points:
|
docs/tutorials/data_provider.md
Outdated
ICU4X has no internal caches because there is no one-size-fits-all solution. It is easy for clients to implement their own cache for ICU4X, and although this is not generally required or recommended, it may be beneficial when latency is of utmost importance and: | ||
|
||
1. A less-efficient data provider is being used, such as JSON | ||
2. Lookups make heavy use of vertical fallback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: I don't think we should suggest caching the data structs when fallback is the problem. In that case one should cache resolved locales
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resolved locales could be different for each key. But yes, we could cache (DataKey, DataLocale) -> DataLocale
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the use case of vertical fallback since this is a good point and we could even consider offering this as an official solution if needed.
@@ -121,6 +145,7 @@ assert_eq!(provider.cache.lock().unwrap().len(), 0); | |||
|
|||
assert_eq!( | |||
"こんにちは世界", | |||
// Note: It is necessary to use `try_new_unstable` with LruDataCache. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: this seems weird from client perspective. If we provided a cache we could actually make it implement AnyProvider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to hear how you think we could implement a cache implementing AnyProvider
on top of an inner provider that implements BufferProvider
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the registry we can do it, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly; using the registry is the only way one could do it. Maybe we don't align on whether the registry is a good solution; I think it's a solution that is generally quite bad and should be avoided if there is another viable option. In this case, we can put the cache into userland to circumvent the need for the registry.
Changed this PR to "Fixes #1246" |
docs/tutorials/data_provider.md
Outdated
} | ||
} | ||
// Release the lock to invoke the inner provider | ||
let response: DataResponse<M> = self.provider.load(req)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this type needed? self.provider
is only DataProvider<M>
, so there shouldn't be any ambiguity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Primarily for clarity. The audience for this tutorial doesn't know what all these functions do or what they return. But if you think it's clear enough I'll remove it.
docs/tutorials/data_provider.md
Outdated
let mut res: DataResponse<DecimalSymbolsV1Marker> = any_res.downcast()?; | ||
if let Some(payload) = &mut res.payload.as_mut() { | ||
payload.with_mut(|data| { | ||
// Change the decimal separators for all Swiss locales to '.' and ',' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You seem to be implying that our data needs to be fixed. Maybe replace them by something fun, like 🐮
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mooooo 🐮
@@ -61,3 +61,169 @@ impl AdditiveIdentity { | |||
} | |||
} | |||
``` | |||
|
|||
## Caching Data Provider |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: kinda want this to be a separate page because it's so complicated
though perhaps that can be kept in mind for mdbookifying?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #2929 for mdbook, and we can track this there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't exactly sure the best place to put this: it's a bit low-level/detailed for a tutorial, but it's also too big/off-topic for a docs page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i think in the mdbook world we can make this a subchapter
struct CacheKeyWrap(CacheKey<'static>); | ||
|
||
#[derive(Debug, PartialEq, Eq, Hash)] | ||
struct CacheKey<'a>(DataKey, Cow<'a, DataLocale>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: comment explanation on this section about how the borrow works here to make it possible to do non-cloning gets (probably pair CacheKey with the Borrow impl in code organization too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I'm not proud of this code because it uses a #[doc(hidden)]
export, lru::KeyRef
. It's confusing and distracting to have it in the example. I originally skipped this and just cloned the key for lookup, but @robertbastian's review feedback was that my code was cloning too much. I can labor over explaining how the key borrowing works, but it's not the point of this example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think we shouldn't explain it in depth but every line of code in an example is something the reader may focus on: I think we should move the Borrow impl and the borrowed key type below the main stuff with a one-liner comment that just says it's to avoid clones on lookup , so that the reader doesn't spend too much time trying to figure out how it fits in to the rest
let mut any_res = self.0.load_any(key, req)?; | ||
if key == DecimalSymbolsV1Marker::KEY && req.locale.region() == Some(region!("CH")) { | ||
let mut res: DataResponse<DecimalSymbolsV1Marker> = any_res.downcast()?; | ||
if let Some(payload) = &mut res.payload.as_mut() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering what the error case here is, but there doesn't seem to be one. The design decision to make payload an Option<DataPayload>
doesn't seem to have paid off, it's never None
. Maybe unwrap
here so readers don't have to figure this out themselves.
We should consider making the payload non-optional for 2.0 if there's still no use case for an absent payload by then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't expect a review on any PRs I send this week until January
Fixes #1246