Optimise shops page: Enable injected enterprise data to be scoped to specific enterprise ids #12755
Conversation
Force-pushed from 1bec8fd to cdfa30c
let!(:p1) {
  create(:simple_product, primary_taxon_id: t1.id, supplier_id: e.id)
}
RSpec.describe Spree::Taxon do
There are a lot of whitespace changes in this file (changed from module Spree nesting to RSpec.describe Spree::Taxon to avoid linting problems) - I would recommend hiding whitespace in the diff view, to distinguish the functional differences.
That's awesome! Thanks @johansenja
Keep them coming 😄
I've often noticed how slow it was and thought "surely we can improve this..". So thanks for taking on this big task to improve the performance!
Firstly, I have a few general notes:
- When adding specs to cover existing code, it's best to add them in an earlier commit (you can re-arrange commits with git). Then, when you add your changes we can more clearly see what behaviour was existing, and what features you've added.
- It's worth running the linter each time you commit, so that you can catch these adjustments earlier, and keep the git history simple
I also have a suggestion, for bonus points:
We have an rspec matcher to check what DB queries are running. This seems like a great opportunity to test that it's running only the necessary queries. See here for examples, or a real example here
This could be added in a separate PR though; I think it's worth proceeding with manual testing.
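To illustrate the idea, here is a rough sketch of a query-count assertion (this is not OFN's actual matcher; the helper, constructor and method names below are placeholders):

```ruby
# Illustrative only: count the SQL queries a block triggers via ActiveSupport
# notifications, ignoring schema queries.
def count_sql_queries(&block)
  count = 0
  callback = lambda do |_name, _start, _finish, _id, payload|
    count += 1 unless payload[:name] == "SCHEMA"
  end
  ActiveSupport::Notifications.subscribed(callback, "sql.active_record", &block)
  count
end

it "builds injected enterprise data with a bounded number of queries" do
  # `active_distributor_ids` is a placeholder method name for illustration.
  data = OpenFoodNetwork::EnterpriseInjectionData.new

  expect(count_sql_queries { data.active_distributor_ids }).to be <= 2
end
```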
@shipping_method_services ||= CacheService.cached_data_by_class("shipping_method_services",
                                                                Spree::ShippingMethod) do
@shipping_method_services ||= CacheService.cached_data_by_class(
  "shipping_method_services_#{@enterprise_ids.hash}",
I looked up the hash method to make sure I understood it, and discovered that it only creates a reproducible hash within the current process (ref https://stackoverflow.com/questions/2505824/how-does-rubys-string-hash-method-work).
For example:
~/projects/openfoodnetwork $ ruby -e "puts [1,2,3].hash"
1856927473914424695
~/projects/openfoodnetwork $ ruby -e "puts [1,2,3].hash"
3622279804335312461
As far as I can tell, we have shared cache between server processes (with redis_cache_store), so the benefit of caching will be reduced. It means we'll end up with perhaps 4 times the cache entries (assuming 4 puma processes which don't often crash and restart).
Hmm it should be fine :)
It's probably more performant than generating a more persistent hash.
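For comparison, a process-independent key could look something like this (just a sketch; the key format is illustrative):

```ruby
# Sketch of a cache key suffix that is stable across processes, unlike
# Object#hash (which is seeded per Ruby process). Digest ships with Ruby.
require "digest"

enterprise_ids = [42, 7, 13]
suffix = Digest::SHA1.hexdigest(enterprise_ids.sort.join("-"))
cache_key = "shipping_method_services_#{suffix}"
# Every Puma worker computes the same key for the same set of ids,
# so they share one Redis entry instead of one entry per process.
```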
Ah, that's a good point. My hope here would be that caching wouldn't even be needed if fetching and processing the data is fast enough - currently, CacheService.cached_data_by_class performs an extra DB query just to generate the cache key (cached_class.maximum(:updated_at).to_f); on my machine it normally takes a few ms but has taken up to 45ms (though maybe it would be more consistent with an index on updated_at), but it's a similar time to actually pulling all the rows anyway:
Though I'm sure production data would be a bit different!
As far as I can tell, this is the only code that uses CacheService.cached_data_by_class, so there could be an opportunity to simplify the CacheService if EnterpriseInjectionData no longer needed it.
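For context, the pattern under discussion is roughly this (a simplified sketch, not necessarily the exact CacheService code):

```ruby
# Simplified sketch: the cache key is derived from the table's latest
# updated_at, so even a cache hit costs one extra DB query.
def cached_data_by_class(cache_key, cached_class)
  Rails.cache.fetch("#{cache_key}-#{cached_class.maximum(:updated_at).to_f}") do
    yield
  end
end

# e.g. cached_data_by_class("shipping_method_services", Spree::ShippingMethod) { ... }
```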
Ah, I see. It seems pointless making an extra DB query to find out if we need to make the real query, when we could just make the real query. Perhaps with further optimisation, the real query might be quite efficient (I just added a comment to Spree::Taxon.supplied_taxons).

Thanks for the review. It's useful to know about the rspec matchers, because I have a few other changes in mind which could benefit from them. With these changes, I don't think I'll be changing the number of queries, though there are a few areas which might (see #12724 - there are definitely some n+2s in there but I think I have a better solution for it).
Great, thank you!
Just needs a conflict fix, then we can do a manual test.
Force-pushed from 797afe6 to ce9c5b6
Hi @johansenja,
Force-pushed from ce9c5b6 to 18c3bc2
Yup, no problem - I have just rebased against master and pushed again. Many thanks!
I am not sure about the new test failure now - I was wondering if it could be a flake, but it seems suspicious that it is in the order cycles area, though I'm not yet sure how my changes here would affect it. I'll take another look in due course.
Yeah, I guess they are flaky. It's all green now.
Co-authored-by: Gaetan Craig-Riou <[email protected]>
Force-pushed from 18c3bc2 to 4718fdb
Hi @johansenja, for now I will move it to Ready To Go, to double check if we need another approval after you fixed the conflict. Thanks again! 🥳
To test performance, roughly, I logged into uk-staging and did a few requests:
time for i in `seq 10`; do curl 'https://staging.openfoodnetwork.org.uk/shops#/' > /dev/null; done
The results are between 2.5s and 3.0s for ten requests but most of the time 2.9s. That's on another pull request.
After staging this pull request, the ten requests finish in 2.5s to 2.7s. That's an improvement.
What? Why?
When visiting https://openfoodnetwork.org.uk/shops, the page load is very slow. Given that this page is linked to from the CTA above the fold on https://openfoodnetwork.org.uk/, it seems ideal that it should load quickly.

It appears that part of the problem is that the InjectionHelper (via OpenFoodNetwork::EnterpriseInjectionData) fetches information about all enterprises (even ones which don't need to be serialised). In some cases there could be a lot of rows pulled from the DB into memory, especially where there are joins (eg. Spree::Taxon.supplied_taxons pulls in all taxons of all products for all enterprises) - and then sometimes there are application-side operations happening on them too (eg. looping or creating hashes). I opted to make this scoping optional where possible, to try to ensure the changes are non-breaking for other areas.
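As a rough illustration of the approach (not the exact code in this PR; the class internals, argument and scope below are placeholders):

```ruby
# Illustrative only: the injected data is built from a relation that can be
# narrowed to the enterprise ids a given page actually needs.
class EnterpriseInjectionData
  def initialize(enterprise_ids = nil)
    @enterprise_ids = enterprise_ids
  end

  private

  # When ids are supplied, only those enterprises (and their associated rows,
  # e.g. taxons pulled in via joins) are loaded from the database.
  def enterprise_scope
    scope = Enterprise.all
    scope = scope.where(id: @enterprise_ids) if @enterprise_ids.present?
    scope
  end
end
```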
I did a bit of benchmarking locally using
for this branch against master, and these were the results:
master (1940ms page load average):
this branch (686ms page load average):
What should we test?
- inject_group_enterprises
- inject_enterprise_and_relatives

The other 2 pages incidentally also seem to be quicker for me, by a similar amount.
Release notes
Changelog Category (reviewers may add a label for the release notes):
Dependencies
Documentation updates