Clarification on Retry mechanisms #260

nicjansma · 2023-01-13T19:24:55Z

Hi!

I'm looking for some clarification on Reporting API retry mechanisms, both from a spec and current-implementation point of view.

Specifically, what happens if the reporting endpoint returns a non-200-level response?

Per the latest spec, there's one note about future clarifications needed:

We don’t specify any retry mechanism here for failed reports. We may want to add one here, or provide some indication that the delivery failed.
https://w3c.github.io/reporting/#send-reports

And later, it says that a 200 counts as "Success", while a 410 Gone possibly implies "no retry":

It returns "Success" if that delivery succeeds, "Remove Endpoint" if the endpoint explicitly removes itself as a reporting endpoint by sending a 410 response, and "Failure" otherwise.
https://w3c.github.io/reporting/#try-delivery
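
To make the three outcomes quoted above concrete, here is a minimal sketch of how a status code might map onto them. The names `DeliveryResult` and `classify_delivery` are hypothetical, and treating every 2xx status (rather than only 200) as a success is an assumption. The sketch says nothing about retries, since the spec text quoted above does not define any.

```python
from enum import Enum
from typing import Optional


class DeliveryResult(Enum):
    """The three outcomes named in the quoted try-delivery text."""
    SUCCESS = "Success"
    REMOVE_ENDPOINT = "Remove Endpoint"
    FAILURE = "Failure"


def classify_delivery(status: Optional[int]) -> DeliveryResult:
    """Map an HTTP status to a delivery outcome.

    `status` is None when no response was received at all
    (e.g. a network-level error); that is modeled here as a failure.
    """
    if status is None:
        return DeliveryResult.FAILURE
    if status == 410:
        # The endpoint explicitly removes itself.
        return DeliveryResult.REMOVE_ENDPOINT
    if 200 <= status <= 299:
        # Assumption: any 2xx counts as a successful delivery.
        return DeliveryResult.SUCCESS
    # Everything else, including 404, falls through to "Failure".
    return DeliveryResult.FAILURE
```

Under this reading a 404 is just a "Failure", and whether a failure is retried is exactly what is left unspecified.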

From what I've gathered in practice, Chrome won't retry for 2xx, but will retry for 404 and not for 410. There are probably other cases, like network-level errors, where retries may happen?

Ideally, I would like to see some of the expected behavior in the spec, so that we, as endpoint implementors, can plan better.

(For some background: a misconfigured URL from one of our customers hit our services and caused a retry storm, because the URL was wrong (404) and Chrome retried many times.)

yoavweiss · 2023-01-13T20:01:34Z

^^ @clelland

yoavweiss · 2023-01-13T20:05:05Z

Seems like we should define a retry mechanism, as well as define limits for the number and cadence of retries.
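
As an illustration of what "limits for the number and cadence of retries" could look like, here is a minimal sketch of a capped exponential backoff schedule. The function name `retry_delays` and all default values are hypothetical; they come neither from the spec nor from any browser implementation.

```python
from typing import List


def retry_delays(
    max_retries: int = 3,
    base_seconds: float = 60.0,
    cap_seconds: float = 3600.0,
) -> List[float]:
    """Return the wait time before each retry attempt.

    The delay doubles on every attempt, starting at `base_seconds`,
    and never exceeds `cap_seconds`. The list length is the hard
    limit on the number of retries. All defaults are illustrative.
    """
    return [
        min(base_seconds * (2 ** attempt), cap_seconds)
        for attempt in range(max_retries)
    ]


# With the illustrative defaults: retry after 1, 2, then 4 minutes, then give up.
print(retry_delays())  # [60.0, 120.0, 240.0]
```

A schedule of this kind bounds both how many requests a failing endpoint receives per report and how quickly they arrive, which is the information an endpoint operator would need in order to plan.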

nicjansma · 2023-02-13T18:53:27Z

A small clarification: as a deployer of a Reporting API endpoint, we don't have a strong opinion on whether retry mechanisms should be included in the spec. They could be helpful in some cases of transient network outages (e.g. for NEL reports), but if retries are enabled, it would be most helpful for deployers of Reporting API endpoints to be made aware of the cases in which a retry will happen.

In our case specifically, we hadn't thought through the side effects of returning 404 for unknown URLs. With Chrome's behavior (it retries), if a client of ours mistypes an endpoint URL, reports to that endpoint will be retried a significant number of times (because we always return 404s) before Chrome gives up.

So the suggestion is either to clarify in the spec what the retry conditions and mechanism are, or to remove retries altogether.

clelland self-assigned this Feb 13, 2023