Clarification on Retry mechanisms #260

Open
nicjansma opened this issue Jan 13, 2023 · 3 comments

@nicjansma

Hi!

I'm looking for some clarification on Reporting API retry mechanisms, both from a spec and current-implementation point of view.

Specifically, what happens if the reporting endpoint returns a non-200-level response?

Per the latest spec, there's one note about future clarifications needed:

> We don’t specify any retry mechanism here for failed reports. We may want to add one here, or provide some indication that the delivery failed.
>
> https://w3c.github.io/reporting/#send-reports

And later, it states that a 200 is a Success, while a 410 Gone possibly suggests "no retry":

> It returns "Success" if that delivery succeeds, "Remove Endpoint" if the endpoint explicitly removes itself as a reporting endpoint by sending a 410 response, and "Failure" otherwise.
>
> https://w3c.github.io/reporting/#try-delivery
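
To make the three outcomes in that passage concrete, here is a minimal sketch of the classification it describes. The names `DeliveryResult` and `classify_delivery` are hypothetical, and treating any 2xx status as "delivery succeeds" is an assumption on my part, since the quoted text does not spell out the status range. A network-level error is modeled as `None`.

```python
from enum import Enum
from typing import Optional


class DeliveryResult(Enum):
    # The three result strings named in the quoted spec text.
    SUCCESS = "Success"
    REMOVE_ENDPOINT = "Remove Endpoint"
    FAILURE = "Failure"


def classify_delivery(status: Optional[int]) -> DeliveryResult:
    """Map an HTTP status (or None for a network error) to a delivery result.

    Assumption: "delivery succeeds" means any 2xx status.
    """
    if status is None:
        return DeliveryResult.FAILURE
    if status == 410:
        # The endpoint explicitly removes itself.
        return DeliveryResult.REMOVE_ENDPOINT
    if 200 <= status <= 299:
        return DeliveryResult.SUCCESS
    # Everything else, including 404, falls into "Failure".
    return DeliveryResult.FAILURE
```

Note that under this reading a 404 is just a "Failure", and the quoted text says nothing about what a user agent does with a "Failure" afterwards, which is the gap this issue is asking about.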

From what I've gathered in practice, Chrome won't retry for 2xx, but will retry for 404 and not for 410. There are probably other cases, like network-level errors, where retries may happen?

Ideally, I would like to see some of the expected behavior in the spec, so we (as an endpoint implementor) can plan better.

(For some background: a misconfigured URL from one of our customers hit our services and caused a retry storm, because the URL was wrong, we returned a 404, and Chrome retried a bunch of times.)

@yoavweiss
Contributor

^^ @clelland

@yoavweiss
Contributor

Seems like we should define a retry mechanism, as well as define limits for the number and cadence of retries.
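
For illustration only, a bounded retry schedule with a limit on both the number and the cadence of retries might look like the sketch below. Nothing here comes from the spec or from any browser: the function name `retry_delays`, the exponential backoff, and every default value are hypothetical, chosen just to show what "limits" could mean in practice.

```python
from typing import List


def retry_delays(
    max_retries: int = 3,
    base_seconds: float = 60.0,
    factor: float = 2.0,
    max_delay_seconds: float = 3600.0,
) -> List[float]:
    """Return the wait before each retry, in seconds.

    All parameters are hypothetical: the list length caps the number of
    retries, and max_delay_seconds caps how far apart retries can drift.
    """
    delays = []
    delay = base_seconds
    for _ in range(max_retries):
        delays.append(min(delay, max_delay_seconds))
        delay *= factor
    return delays
```

With these made-up defaults, `retry_delays()` yields three waits of 60, 120, and 240 seconds, after which the report would be dropped. An endpoint operator could then reason about the worst-case request volume a misconfigured URL produces.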

@clelland clelland self-assigned this Feb 13, 2023
@nicjansma
Author

A small clarification: as a deployer of a Reporting API endpoint, we don't have a strong opinion on whether retry mechanisms should be included in the spec. They could be helpful in some cases of transient network outages (e.g. for NEL reports), but if retries are enabled, it would be most helpful for deployers of Reporting API endpoints to be made aware of the cases where a retry will happen.

In our case specifically, we hadn't thought through the side effects of returning 404 for unknown URLs. With Chrome's behavior (it retries), if a client of ours mistypes an endpoint URL, reports to that endpoint will be retried a significant number of times (because we always return a 404) before Chrome gives up.

So the suggestion is to either clarify in the spec what the retry conditions and mechanism are, or remove retries altogether.
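
Until the spec says more, one possible endpoint-side workaround is sketched below. It relies only on what is reported in this thread (Chrome retries on 404 but not on 410) and on the spec text that a 410 maps to "Remove Endpoint"; it has not been verified against any browser. The function `status_for_report` and the set of known paths are hypothetical, and whether answering 410 is appropriate for a given deployment is a judgment call, since it asks the client to drop the endpoint.

```python
from typing import Set


def status_for_report(path: str, known_paths: Set[str]) -> int:
    """Pick the HTTP status a (hypothetical) reporting endpoint responds with.

    Unknown paths get 410 instead of 404, on the assumption (taken from the
    behavior described in this thread) that a 410 is not retried.
    """
    if path in known_paths:
        return 200  # report accepted
    return 410  # unknown or mistyped endpoint URL: ask the client to stop


# Example with made-up paths:
known = {"/reports/customer-a", "/reports/customer-b"}
print(status_for_report("/reports/customer-a", known))
print(status_for_report("/reports/custmer-a", known))
```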
