Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MPIC: CAA rechecks should be multi-perspective #7808

Open
jsha opened this issue Nov 14, 2024 · 5 comments
Open

MPIC: CAA rechecks should be multi-perspective #7808

jsha opened this issue Nov 14, 2024 · 5 comments

Comments

@jsha
Copy link
Contributor

jsha commented Nov 14, 2024

Right now, VA.PerformValidation checks both the challenge and CAA, for both local and remote perspectives. However, if issuance happens more than 7 hours after validation, the RA rechecks CAA by calling VA.IsCAAValid. This function only checks the local perspective.

We want two things:

  1. Both initial CAA checks and CAA rechecks should check remote perspectives.
  2. Initial CAA checks and CAA rechecks should use very similar code paths.

One way to achieve (1):

  • Add remote perspective checking to VA.IsCAAValid.

Another way to achieve (1) and (2) together:

  • Add a new RPC method VA.IsCAAValidMPIC that does both local and remote.
  • Add a new RPC method VA.PerformValidationWithoutCAA that checks the challenge locally and remotely, but not CAA.
  • Add a feature flag in the RA, features.CheckCAASeparately, that switches from the current behavior to calling VA.PerformValidationWithoutCAA / VA.IsCAAValidMPIC sequentially.

(If we go this route, we should share as much code as possible between PerformValidation and PerformValidationWithoutCAA)

Another approach that achieves (1) and (2) together:

  • Add remote perspective checking to VA.IsCAAValid.
  • Always check CAA at issuance time, regardless of how old the authorization is.

This approach will increase latency for HTTP requests to /acme/finalize/. Currently that endpoint only incurs the overhead of checking DNS for a fraction of requests (those that reuse old authzs). If we take this approach it would incur that overhead for all requests. But it would bring our code more closely in alignment with how the BRs are written. CAA checking is specified as an issuance-time requirement, not a validation-time requirement. The mismatch between our implementation and the BRs led to an incident in 2020.

@jsha
Copy link
Contributor Author

jsha commented Nov 14, 2024

Hah, I missed that we in fact implemented "Add remote perspective checking to VA.IsCAAValid" in #7061 / #7221.

So I think that just leaves us with wanting to achieve:

Initial CAA checks and CAA rechecks should use very similar code paths.

I do still think it will be simplest to move all CAA checking to issuance time.

@aarongable
Copy link
Contributor

I do still think it will be simplest to move all CAA checking to issuance time.

I'd prefer to do the opposite -- move all CAA checking to validation time, and shorted authorization lifetimes to be 7 hours and thereby remove the need for rechecking.

@jsha
Copy link
Contributor Author

jsha commented Nov 14, 2024

They aren't mutually exclusive. We could do one, then switch to the other. An important difference is that "All CAA checking happens at issuance time" is not as user-impacting as "maximum authz reuse is 7 hours," which means we can do it sooner. Given that we're adding complexity for MPIC, I think there's a need in the short term for an offsetting decrease in complexity and risk.

@aarongable
Copy link
Contributor

I worry that the user impact of CAA-during-finalize will be surprisingly large, since CAA can be slow and finalize is (unfortunately) synchronous. I suspect that the number of finalize calls actually doing rechecking today is such a vanishingly small fraction that any failures or timeouts due to it are lost in the noise -- and that any such failures would suddenly represent a large fraction of issuance if every finalize did CAA checks. But hopefully a dig into the data will lay these fears to rest.

@aarongable
Copy link
Contributor

Regardless, I think this exact discussion is why we decided to move forward with the middle option described in the top post here: create new "ValidateChallenge" and "CheckCAA" RPCs which perform only their named function (no validate-and-caa), and then let the RA decide when to call each of them. Then delete the old PerformValidation and IsCAAValid methods. This refactoring simplifies the VA code and allows us to do either caa-at-finalize or short-lived-authzs without having to continue to worry about the VA code: such changes would only be at the RA level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants