-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable URL Metric storage locking by default for administrators #1835
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## trunk #1835 +/- ##
==========================================
+ Coverage 65.86% 65.92% +0.05%
==========================================
Files 88 88
Lines 6873 6885 +12
==========================================
+ Hits 4527 4539 +12
Misses 2346 2346
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.
To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
@westonruter I'm not sure about this change. This seems to be something that helps development, so I think having it in the OD Dev Mode plugin makes more sense to me than making it the default for any administrator. Many administrators of production sites are probably not developers. On another note, I think the code snippet referenced in the PR description is for something else than described (it mentions viewport, rather than storage lock)? |
Oops. I copied the wrong code. This has been corrected.
Actually, it's not specifically to help development. If someone creates a new post and then views it on the frontend, this should result in a URL Metric being collected right away without having to wait for any TTL storage lock to expire for the author or for another visitor to come by to view the post. A post author will almost always want to view the published post on the frontend after publishing, so this will result in a URL Metric being stored right away to give the collection a head start. Additionally, an administrator browsing around their site should be able to submit URL Metrics as they go around without having to wait for the storage lock TTL to expire. The purpose of the storage lock is to guard against unlimited DB writes from a frontend visitor. This, however, is not relevant to administrators (or even other roles above subscriber) since they can write to the DB at will in the admin. Actually, come to think of it, |
Your rationale makes sense to me. In that case, maybe a better implementation to achieve that goal would be to invalidate the current user's storage lock transient whenever a post is updated? That way, any fresh content would be able to receive updates immediately, without e.g. "blasting" the server with tons of URL metrics (in case of a very unpopular site, where the administrator visits e.g. 15 URLs quickly to test stuff). |
I'm not sure that would work. The storage lock TTL is sent to the client to prevent the client from even attempting to make a I don't see a problem with allowing an administrator to make multiple DB writes quickly. Consider bulk actions in the admin or quickly moderating through comments, for example. |
Oh, I just realized this won't work as-is since the |
There, this is fixed by 9f89cd4. Note that this could eventually mitigate the need for the Site Health test to check whether anonymous REST API requests are allowed (#1731). We could potentially add a new hook which disables URL Metric collection for anonymous users. This could be attractive to high-traffic sites (cf. #1655) which don't want to expose the REST API to anonymous visitors. Such sites would require logged-in users to collect the URL Metrics by browsing around, but also some automated collection could be implemented in the background, such as by opening the frontend in an |
@felixarntz Anything else you want to see here? |
@westonruter Sorry I didn't get to responding back to this sooner.
But if the storage lock was cleared, the lock wouldn't be there anymore until they send a URL metric again, correct? You mentioned previously as a rationale for this change:
For that reason, I suggested to clear the TTL for the current user when a post is updated. If they then visit the post in the frontend, URL metric data for that post would be sent, no? So I'm thinking that would satisfy this use-case without making broader changes as proposed here.
To me it's less about "allowing" than about the implications on low-quality servers. The examples you're mentioning are still actual actions taking by the user, so there's naturally more of a limitation on how fast the user can perform those. There's a slightly different implication on simply visiting pages, and the administrator (in many cases probably) not knowing that each page load will be a database write. FWIW I'm not strongly opposed to this change, but based on what you described as rationale, I'm thinking there may be a less risky way to make it possible, as outlined above. |
Storage locks are not bound to specific users. They are bound to a hash of the IP address. So if that lock were to be cleared, then anyone from that IP address could then submit a URL Metric, regardless of whether they are an admin or not. Even in the case of an admin, however, clearing the storage lock wouldn't necessarily help because the storage lock TTL is still 60 seconds. This lock is stored client-side in Lastly, if the storage lock were only cleared in response to a post update in the DB, then this wouldn't address the goal of an administrator being able to cause URL Metrics to be collected just by browsing the site. Let's say a user goes to their site, goes to create a post and publishes it. Once they do so and view the post on the frontend, a URL Metric should be collected. If the storage lock were cleared in response to editing a post, then good, the URL Metric should successfully be collected. But what if they then go to the homepage to see that post showing up there? The URL Metric wouldn't be collected because the storage lock would be re-imposed. If, however, the storage lock is disabled by default for admins, then URL Metrics will be collected as needed while navigating around the site without restriction. Given that DB writes happen all the time for admin users, I don't see this as an issue. DB writes happen for admins just by visiting the New Post screen and then they happen in the background for autosaves for that screen too. |
Good point, I didn't consider that - of course not guaranteed that the IP address is the same person. Fair point also on the
I see how that can be valuable, but I'm still wary of unexpected side effects on scalability, depending on a few factors:
|
Good points.
Yeah, I withdraw my idea to have the capability be
This is something I thought about too, where it would be nice if a given user only was allowed to contribute one fresh URL Metric for a given URL. However, in order to do this, we'd have to store some identifier about the user session in the URL Metric to know whether or not we should reject additional URL Metrics from that client. This would be problematic from a privacy perspective, since then the URL Metrics would essentially become a way to track users. We could possibly have a flag in the URL Metric for whether it was was collected from an authenticated user or not (e.g. For the sake of diversity of URL Metric collection, if an anonymous user visits the homepage, clicks to view a post for a minute, then goes back to the homepage, and then reads another post for a minute, before going back to the homepage for a third time: this will result in 3 URL Metrics being collected from that same user. Ideally we wouldn't collect multiple URL Metrics from the same user, but this isn't being thwarted currently anyway. I think it's more important for URL Metrics to be collected from the same user over the alternative of not collecting any URL Metric at all. Nevertheless, here's an alternative approach extending on top of what this PR currently implements. What if we use |
@felixarntz There, what do you think about a57e080? The hashing of the URL in the |
I didn't even think that far - so currently, this PR would result in an administrator being able to send all URL metrics necessary to fill a group? That would indeed be undesirable then, since e.g. refreshing the page a few times would already result in an entirely populated group (based on sample size). I guess I hadn't thought of that scenario yet, but that makes the concern even greater.
This would address the above concern right? Looks like a great solution. I think it's okay that a single administrator can provide a single metric per group per URL, so with this additional commit in place this should be a good approach. |
Yes, the current state of the PR prevents any client (either an admin or not) from filling up the sample of URL Metrics. Previously, an anonymous user could have done so by waiting 60 seconds between page reloads, and initially this PR allowed an admin to do so by reloading multiple times even without waiting. Now only one URL Metric will be submitted for both. The difference is that when navigating on to a second URL, the admin will be able to submit a URL Metric for that one right away, whereas for the anonymous user they'll have to wait 60 seconds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@westonruter This looks good to me now, just one small note that would be good to address. It's doing a bit too much in the map_meta_cap
callback.
plugins/optimization-detective/storage/class-od-storage-lock.php
Outdated
Show resolved
Hide resolved
Co-authored-by: felixarntz <[email protected]>
@felixarntz I added one more improvement in 78a266e which accounts for the case where another plugin may add a secondary capability to require for users who can |
I don't think we should do that: Since If we want to make that possible, we need to instead handle If you agree with that, we could simply solve it like this: add_filter(
'user_has_cap',
function ( $allcaps ) {
if ( isset( $allcaps['manage_options'] ) ) {
$allcaps['od_store_url_metric_now'] = $allcaps['manage_options'];
}
return $allcaps;
}
); FWIW, even WordPress Core has historically been doing it wrong for a bunch of cases - for some reason, |
This reverts commit 78a266e.
I'm not sure I understand. A meta capability maps to one or more primitive capabilities. Developers can add an arbitrary number of additional primitive capabilities to be required for a given meta capability. For example, to prevent a user from being able to do something, a developer can just push the In any case, I reverted my commit.
I don't think that the presence of args is really necessary for Update: After discussing offline, I am largely won over and I've replaced |
Co-authored-by: felixarntz <[email protected]>
@westonruter We chatted about this via DM, but replying here too for visibility:
Yes, but if a developer wanted to rely their own meta capability on
All of these capabilities should really be primitive capabilities. They were probably made to use Obviously most of this can be worked around, but distinguishing between primitive and meta capabilities helps other developers using a plugin to be able to finetune how the (primitive) capabilities are granted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@westonruter Thanks for bearing with me through the discussions and clarifying your points, this looks good to me now.
Just one non-blocking comment.
if ( ! is_array( $allcaps ) ) { | ||
$allcaps = array(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point to include this, although technically someone changing this to not be an array would be doing_it_wrong
.
It's fine to keep this, but if we already cater for it, I wonder whether we should instead just return the same value in such a scenario. It's basically broken, and we know the condition below won't be satisfied anyway.
If some code does something unexpected, I'd rather just ignore it and pass through, than tamper with what they did wrong and possibly cause more unexpected issues. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this in many other places actually. It's all for the sake of type safety, as otherwise a fatal error would occur since the function returns an array
. This scenario happens when another plugin incorrectly returns the wrong type on a filter callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant it mostly as: If someone else returns something other than an array, we could also return that same thing. The documented filter input type is basically always the same as the documented filter output type, so from that perspective we could also change the return type here to array<string, bool>|mixed
and pass through any value that's not array<string, bool>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, although that would require removing the PHP type hint from the function. All of the other filter callbacks in the repo have taken this approach, so if we want to pass through the value unchanged, then all the other callbacks should be updated.
In any case, this is a safeguard for an edge condition for the sake of type safety.
Fixes #1829
This changes the default value passed into the
od_url_metric_storage_lock_ttl
from60
to0
when the current user is an administrator. Additionally, a newod_store_url_metric_now
meta capability is introduced which maps tomanage_options
by default. This gives sites the ability to customize which users have access to that capability.In practice this will eliminate the need for the following code in the Optimization Detective Dev Mode plugin as long as the developer is testing while being logged-in as an administrator: