-
Notifications
You must be signed in to change notification settings - Fork 206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(WIP) rustdoc redirect caching in the CDN #1262
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I got halfway through this and then realized that SMaxAge
is the cache time for all proxies, not just Cloudfront, which means there's no way to invalidate any intermediate caches between us and the user. So my thought is we can either:
- Replace the indefinite cache from
/regex/1.3.1
->/regex/1.3.1/regex/
with just a very long time, say a week, or - Add the caching in Cloudfront directly, so we know there will be no caching by anyone else.
I would be happy with either, but I don't think we should have indefinite caches without a way to invalidate them.
475e753
to
39db3b1
Compare
src/web/rustdoc.rs
Outdated
Ok(super::cached_redirect(url, config.cache_rustdoc_redirects)) | ||
Ok(super::cached_redirect( | ||
url, | ||
config.cache_rustdoc_redirects_version, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not always correct, for MatchVersion::Semver
it should be cache_crate
, not cache_version
. Can you take this as a parameter instead?
Maybe we should consider changing the name of the parameter too 🤔 I think cache_release
might be easier to think about than cache_version
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jyn514 I was already working on exactly this :) stumbled on the same thing.
Parameter I'll rename.
@jyn514 thanks for the reviews and the support :) I'll continue working on this and will ping you when it's ready for another review |
Won't have time to review the actual changes, but I have a few comments:
While I think it's true that for the warnings on the top bar being out of date for 15-30 minutes is fine, I don't think the same applies for the redirects themselves. People often link to https://docs.rs/crate-name when announcing new releases of their crate, and having inconsistent redirects for it would be a bad user experience.
No, we're in |
@pietroalbini thanks for the input, in my mind there is already a delay due to the queue and build, so adding to that wouldn't be too bad. But up to you :) |
My main worry is inconsistency: someone could see that docs.rs/crate-name redirects to the correct versions and starts sending the link around, while people in other parts of the world (served by different POPs) will get a different link. |
yeah, we will always have that when we start caching without invalidation. In the end, it's speed vs stronger consistency guarantees (and edge-case vs the regular case). Only solvable in a good way with a better CDN. To me that makes the impression that there are different opinions among the owners here which is more important. (no hard feelings :) ) I would actually leave this PR for others until that decision is made. |
the latest-version redirect is the one I'm hitting the most, and likely I'm not the only one. (but that's personal opinion, your constraints for improvements might be different, which is ok) |
After having slept over it this feels like a micro-optimization with the constraints that exist and the effort of finishing this up. I'll drop it for now. I've done multiple setups with Fastly as CDN, where we could have worldwide full page caching with worldwide, free invalidation in < 1 second (and then stable response times ~20ms). That means the pages would be updated <1s after the build finished. |
regularly using docs.rs there was another possible performance improvement that I saw.
I feel currently the server-side performance is (for now) not the issue any more, the next big lever would be working on caching the S3 file downloads. In #1004 if we combine it with local caching on the server, or just a webserver-local LRU cache, or perhaps a proxy between the webserver and S3.
From what I see, for users outside the US most of the time is not spent on the webserver any more, but on the roundtrip to the US, or data-download from the US.
Both could be solved by starting to cache more things on the CDN level.
Coming from previous discussion in the channel, I would rule out:
The remaining thing are (IMHO) the redirects, which are (according to @jyn514 ) also heavily used. When I test these from Berlin, it's about 150-200ms only spent in the redirect.
A simple strategy which has often more impact that we think is setting cache-headers for the CDN to a short amount of time. Even caching a thing for 15, 30, 60 minutes will gain quite much performance for pages that are regularly used, at least because the response is returned from a local POP.
In a discussion with @pietroalbini I got the impression (correct me if I'm wrong please 😄 ) that we can actually assume old version builds won't change any more. Also in that discussion a label like this version is yanked or go to latest version was ok to be outdated for 15 minutes (or so).
This would actually lead me to the conclusion that we can
under the assumption that we cache only on the CDN level, which enables us to manually invalidate the whole cache if we want to (up to 1000 invalidations are free in CloudFront, which likely is fine for manual invalidations/new code, but not for automatic invalidations for every crate release).
If in general this is fine for you, I can continue on working out more details, likely
This is just an offer and I won't take it personally if it doesn't match your strategy for the platform. Also my assumptions or my understand might be wrong, please point me to mistakes I made.
I definitely would be happy to invest more time in this to improve the site.
links
eu-central-1
(Frankfurt) tous-east-1
(are we located there?) are still ~100ms