Feature request: Token policies for subtrees #1007

Open
s-hamann opened this issue Jan 3, 2025 · 5 comments

Comments

@s-hamann

s-hamann commented Jan 3, 2025

I would like to automate DKIM key rotation and create a suitably restricted token for this purpose, i.e. allow writing TXT records under any subname like <selector>._domainkey but nothing else. Unfortunately, token policies currently do not allow that.

Setting a policy for subname _domainkey does not allow writing to its "children".
Setting a policy for *._domainkey only allows writing the wildcard record.
Setting a policy for every selector I need in advance would work but is infeasible (DKIM selectors are commonly dates or random strings).
Setting a policy without a subname works, but obviously also allows writing unrelated TXT records (e.g. DMARC policy, SPF, site verifications, ACME and whatnot). I can block some other subnames with additional policies, but that seems error-prone.

It'd be great if token policies could be extended with an "and children" bit. That way I could set up a policy that allows writing to _domainkey "and children".
Given the hierarchical structure of DNS, I believe this might have sensible applications besides DKIM key rotation.
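
To make the requested semantics concrete, here is a minimal sketch of how an "and children" bit could be evaluated. The function name `policy_covers` and the `and_children` parameter are hypothetical and not part of the existing API; the sketch only illustrates the label-wise comparison and says nothing about how the lookup would be done in the database.

```python
def policy_covers(policy_subname: str, rr_subname: str, and_children: bool = False) -> bool:
    """Return True if a policy for `policy_subname` applies to `rr_subname`.

    Hypothetical helper: `and_children` models the proposed "and children" bit.
    """
    policy_labels = policy_subname.lower().split(".") if policy_subname else []
    rr_labels = rr_subname.lower().split(".") if rr_subname else []

    # Exact match always applies, with or without the bit.
    if policy_labels == rr_labels:
        return True
    if not and_children:
        return False

    # A child has strictly more labels and ends with all of the policy's labels.
    # Comparing whole labels avoids treating "x_domainkey" as a child of "_domainkey".
    if len(rr_labels) <= len(policy_labels):
        return False
    return rr_labels[len(rr_labels) - len(policy_labels):] == policy_labels


print(policy_covers("_domainkey", "20250103._domainkey"))                     # bit not set
print(policy_covers("_domainkey", "20250103._domainkey", and_children=True))  # bit set
```

With the bit set, a single policy on _domainkey would cover every selector without listing them in advance, while unrelated names such as _dmarc stay outside its scope.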

@s-hamann
Author

s-hamann commented Jan 3, 2025

A closely related use case would be to have a token restricted to ACME challenges only. As they are located at _acme-challenge.<subname>, the "and children"-bit would not help here.
Maybe there is a better solution that fits both use cases?

@peterthomassen
Member

The question is how to efficiently select the correct policy in the database, matching the subname of the RRset that is being modified. We don't want to pull all potential policies from the database with filtering done in Python.

Currently, we have a single database query that, given (domain, subname, type), will return the applicable policy according to their priority order. We'd like to keep it that way.

Just brainstorming, one way would be to use Postgres' LIKE matching, and allow subnames to contain % for wildcard matching. This would also render _ special (in Postgres, this is a single-character wildcard), but we could escape it. This starts to smell somewhat risky, though -- such stuff is often where bugs lie.
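
A minimal sketch of the escaping step, to show why it matters. The helper names `to_like_pattern` and `like` are made up for illustration, and in-memory SQLite stands in for Postgres here (both support `LIKE ... ESCAPE`, but their behavior is not identical in every detail, e.g. case sensitivity).

```python
import sqlite3

ESCAPE = "\\"  # a single backslash


def to_like_pattern(subname: str) -> str:
    """Escape '_' (and the escape character itself) but keep '%' as the wildcard."""
    return subname.replace(ESCAPE, ESCAPE * 2).replace("_", ESCAPE + "_")


def like(value: str, pattern: str) -> bool:
    """Evaluate `value LIKE pattern` with a backslash escape, using in-memory SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute("SELECT ? LIKE ? ESCAPE '\\'", (value, pattern)).fetchone()
    finally:
        conn.close()
    return bool(row[0])


raw = "_acme-challenge.%"
# Unescaped, the leading '_' matches any single character, so an unrelated name slips through.
print(like("xacme-challenge.www", raw))
# Escaped, the '_' only matches a literal underscore.
print(like("xacme-challenge.www", to_like_pattern(raw)))
print(like("_acme-challenge.www", to_like_pattern(raw)))
```

The unescaped case is exactly the kind of silent over-match that makes this approach feel risky: a policy meant for _acme-challenge names would also apply to any name that differs only in its first character.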

Further, if we went this route, the question is then how to apply the LIKE operator in Django. Django does have support via lookups like __contains, __startswith, __endswith, etc., but those only work for patterns such as %a%, a%, and %a, respectively. However, we might want to support things like subname = "_acme-challenge.%.staging", for which there is no lookup in Django.

Finally, more complications arise if multiple wildcards match, e.g., when a policy for _acme-challenge.%.staging is defined, and another one for _acme-challenge.%. Maybe the more specific one should win. But how to decide between _acme-challenge.% and %.staging?

And how about multiple wildcards, such as %._domainkey.%? (Maybe it's time for separate zones here ...)

Questions over questions. Input appreciated!

@s-hamann
Author

> The question is how to efficiently select the correct policy in the database, matching the subname of the RRset that is being modified. We don't want to pull all potential policies from the database with filtering done in Python.
>
> Currently, we have a single database query that, given (domain, subname, type), will return the applicable policy according to their priority order. We'd like to keep it that way.

Makes sense. I already had a brief look at the code and wondered how to approach this without resorting to filtering in Python.

> Just brainstorming, one way would be to use Postgres' LIKE matching, and allow subnames to contain % for wildcard matching. This would also render _ special (in Postgres, this is a single-character wildcard), but we could escape it. This starts to smell somewhat risky, though -- such stuff is often where bugs lie.

There's also POSIX regex matching, which does not seem to treat _ specially.
Of course, . is special in regex and common in DNS. Maybe require a "regex bit" before interpreting a policy subname as a regex? That would clearly communicate the responsibility for escaping . and other fun characters to the user.

> Further, if we went this route, the question is then how to apply the LIKE operator in Django. Django does have support via lookups like __contains, __startswith, __endswith, etc., but those only work for patterns such as %a%, a%, and %a, respectively. However, we might want to support things like subname = "_acme-challenge.%.staging", for which there is no lookup in Django.

Django has a regex lookup. Implementing a generic LIKE would probably require a custom lookup.

But LIKE, regex or even SIMILAR TO: I believe they all look up something in the database that matches a given pattern. We have the reverse use case here, right? We have a value (subname) and need to look up something in the database that is a matching pattern for this value. I'm not a database expert, but that does not sound like something that can be done in an SQL query. Or did I misunderstand how this works?

> Finally, more complications arise if multiple wildcards match, e.g., when a policy for _acme-challenge.%.staging is defined, and another one for _acme-challenge.%. Maybe the more specific one should win. But how to decide between _acme-challenge.% and %.staging?

Tricky, indeed. It's a bit of an edge case though, so it's probably OK to implement it in any way that seems feasible and document it.

> And how about multiple wildcards, such as %._domainkey.%? (Maybe it's time for separate zones here ...)

An easy way out would be to forbid multiple wildcards. I think a single wildcard (or even only "wildcard-labels") would be useful enough to have, if it simplifies implementation or eliminates some corner cases.

@peterthomassen
Member

peterthomassen commented Jan 21, 2025

> > Just brainstorming, one way would be to use Postgres' LIKE matching, and allow subnames to contain % for wildcard matching. This would also render _ special (in Postgres, this is a single-character wildcard), but we could escape it. This starts to smell somewhat risky, though -- such stuff is often where bugs lie.
>
> There's also POSIX regex matching, which does not seem to treat _ specially. Of course, . is special in regex and common in DNS. Maybe require a "regex bit" before interpreting a policy subname as a regex? That would clearly communicate the responsibility for escaping . and other fun characters to the user.

We'd prefer not to go full-scale regex, due to DoS risk from exponential backtracking and other things, including perhaps ways that haven't been identified yet.

> > Further, if we went this route, the question is then how to apply the LIKE operator in Django. Django does have support via lookups like __contains, __startswith, __endswith, etc., but those only work for patterns such as %a%, a%, and %a, respectively. However, we might want to support things like subname = "_acme-challenge.%.staging", for which there is no lookup in Django.
>
> Django has a regex lookup. Implementing a generic LIKE would probably require a custom lookup.

True, and it's simple enough.

> But LIKE, regex or even SIMILAR TO: I believe they all look up something in the database that matches a given pattern. We have the reverse use case here, right? We have a value (subname) and need to look up something in the database that is a matching pattern for this value. I'm not a database expert, but that does not sound like something that can be done in an SQL query. Or did I misunderstand how this works?

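The pattern is on the right-hand side of the LIKE operator, but nothing requires it to be a constant: the stored column can serve as the pattern and the incoming value as the left-hand side. We could make a custom lookup, say rlike, such that filter(subname__rlike='_acme-challenge.intranet') matches a row which has subname="_acme-challenge.*".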
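
The reverse direction can be illustrated with plain SQL. This is only a sketch: the `policy` table and the `matching_policies` helper are hypothetical, in-memory SQLite stands in for Postgres, and the stored patterns use LIKE syntax (% as the wildcard, backslash-escaped underscores) rather than the * shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policy (id INTEGER PRIMARY KEY, subname TEXT)")
conn.executemany(
    "INSERT INTO policy (subname) VALUES (?)",
    [
        ("www",),                   # plain policy, no wildcard
        ("\\_acme-challenge.%",),   # stored LIKE pattern with escaped underscore
        ("%.\\_domainkey",),        # wildcard on the left
    ],
)


def matching_policies(subname):
    """Return stored patterns that match `subname`.

    The value is the left-hand operand and the column is the pattern,
    i.e. the reverse of the usual `column LIKE 'pattern'` query.
    """
    rows = conn.execute(
        "SELECT subname FROM policy WHERE ? LIKE subname ESCAPE '\\' ORDER BY id",
        (subname,),
    ).fetchall()
    return [row[0] for row in rows]


print(matching_policies("_acme-challenge.intranet"))
print(matching_policies("20250103._domainkey"))
```

A Django custom lookup would essentially emit the same `WHERE <value> LIKE <column>` condition, with the operands swapped relative to the built-in lookups.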

Thinking about this more, I am now realizing that this kind of query probably won't be able to use indexes (at least not the default type), so it may not scale well. Hmm, we have to think harder. I'll look into the possibility of specialized indexes a bit.

> > And how about multiple wildcards, such as %._domainkey.%? (Maybe it's time for separate zones here ...)
>
> An easy way out would be to forbid multiple wildcards. I think a single wildcard (or even only "wildcard-labels") would be useful enough to have, if it simplifies implementation or eliminates some corner cases.

Sure; OTOH, it's always best to think it over a few more times to see if there isn't a generalized solution (which often also is simpler in code). So, I'd like to let this particular point sit for a bit, to be decided later.

Exchanges like this are helpful, thanks! (For example, I only realized the indexes problem just now. Who knows what else will come to mind as we discuss further.)

@s-hamann
Author

I've given the user-visible design some more thought (leaving implementation challenges aside) and the following seems clean to me:

Providing full wildcard support to the user leads to the questions about precedence that you raised and for which there does not seem to be an obvious and intuitive answer.
Allowing only wildcard-labels, on the other hand, is much simpler and fits well into the DNS world, where we have wildcard records, wildcard X.509 certificates, etc. but no "half-a-label-wildcards".
So, users could use the following two wildcards in the subnames of their policies:

  • *, e.g. _acme-challenge.*.subdomain or *._domainkey - meaning: substitute anything in this label.
  • **, e.g. _acme-challenge.** - meaning: substitute any number of labels
    Syntax is borrowed from shell globbing and may therefore be familiar to some users.
    Catch: If * is the wildcard character, we'd need an additional "wildcard-bit" in the policy to distinguish between a "wildcard-policy" and a policy that explicitly targets the wildcard record. Using some DNS-illegal character (or string) would make that bit unnecessary (at least on the API side, it may or may not be beneficial to have it in the database).

If only a single wildcard-label is allowed, precedence of policies seems straightforward: Non-wildcard policies first, otherwise prefer most specific one, i.e. left-most wildcard. Single-label wildcards (*) should have precedence over multi-label wildcards (**).
The table from the docs would turn into something like this (which unfortunately fails to visualize the wildcard precedences):

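| Priority | domain | subname  | type  |
|----------|--------|----------|-------|
| 1        | match  | match    | match |
| 2        | match  | wildcard | match |
| 3        | match  | match    | null  |
| 4        | match  | wildcard | null  |
| 5        | match  | null     | match |
| 6        | match  | null     | null  |
| 7        | null   | match    | match |
| 8        | null   | wildcard | match |
| 9        | null   | match    | null  |
| 10       | null   | wildcard | null  |
| 11       | null   | null     | match |
| 12       | null   | null     | null  |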
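
The subname part of these precedence rules (which the table does not visualize) could be sketched in plain Python as follows. Everything here is an assumption made for illustration: the function names are made up, ** is taken to absorb at least one label, "left-most wildcard" is read as "smaller label index wins", and * is always treated as a wildcard (the wildcard-record ambiguity mentioned in the catch above is ignored). It ranks subnames only, not domain or type.

```python
def matches(pattern: str, subname: str) -> bool:
    """Label-wise match; the pattern may contain one wildcard label ('*' or '**')."""
    p, s = pattern.split("."), subname.split(".")
    if "**" in p:
        i = p.index("**")
        head, tail = p[:i], p[i + 1:]
        # Assumption: '**' must absorb at least one label.
        if len(s) < len(head) + len(tail) + 1:
            return False
        return s[:len(head)] == head and s[len(s) - len(tail):] == tail
    if len(p) != len(s):
        return False
    # '*' stands for exactly one label; all other labels must be equal.
    return all(pl == "*" or pl == sl for pl, sl in zip(p, s))


def specificity(pattern: str) -> tuple:
    """Sort key: exact names first, then '*', then '**'; left-most wildcard first."""
    labels = pattern.split(".")
    if "**" in labels:
        return (2, labels.index("**"))
    if "*" in labels:
        return (1, labels.index("*"))
    return (0, 0)


def select_policy(patterns, subname):
    """Pick the highest-precedence pattern that matches, or None."""
    candidates = [p for p in patterns if matches(p, subname)]
    return min(candidates, key=specificity, default=None)


policies = ["_acme-challenge.**", "_acme-challenge.*", "_acme-challenge.www", "*._domainkey"]
for name in ["_acme-challenge.www", "_acme-challenge.mail", "_acme-challenge.a.b"]:
    print(name, "->", select_policy(policies, name))
```

This does the filtering in Python, so it only pins down the intended behavior; it does not address how to express the same ordering in a single database query.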

Allowing multiple wildcard labels (e.g. *._domainkey.*) might be useful in some cases, but seems rather difficult to design. Fewer wildcards might get priority over more wildcards for being (probably?) more specific. But I'm at a loss on how to determine which of two two-wildcard-policies is more specific.

Implementation-wise, this looks rather tricky, unfortunately.

With these simplified rules, you could translate between "user-rules" and "database-rules" (i.e. patterns or regexes, whichever you prefer) on input and translate back when returning the policies to the user. Regex-escaping is readily available, and LIKE-pattern escaping looks easy enough to do without a proven library function. Unescaping, however, is a less common use case. This whole two-way translation seems easy to mess up.

Using regex matching seems rather safe in this context since you don't have to deal with untrusted regexes (assuming the regex-escaping works as advertised). So you could use them if PostgreSQL has support for left-hand-side matching and if they provide any benefit over the simpler RLIKE.

I do not think the distinction between * and ** can be made using RLIKE alone. It could be done in regex. I'm not sure the benefit of that feature outweighs the costs, though.
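
A rough sketch of the input-side translation into a regex, showing how * and ** could be told apart. The function name `to_regex` is hypothetical, ** is assumed to mean "one or more labels", and the result is checked with Python's `re` module; PostgreSQL's regex flavor differs in some details, so the generated pattern would need to be verified there.

```python
import re


def to_regex(pattern: str) -> str:
    """Translate a label-wildcard subname into an anchored regex string."""
    parts = []
    for label in pattern.split("."):
        if label == "**":
            parts.append(r"[^.]+(?:\.[^.]+)*")  # one or more whole labels
        elif label == "*":
            parts.append(r"[^.]+")              # exactly one label
        else:
            parts.append(re.escape(label))      # literal label, metacharacters escaped
    # Labels are joined by an escaped dot so that '.' only matches a literal dot.
    return "^" + r"\.".join(parts) + "$"


single = to_regex("_acme-challenge.*")
multi = to_regex("_acme-challenge.**")
print(bool(re.search(single, "_acme-challenge.a.b")))  # '*' does not span two labels
print(bool(re.search(multi, "_acme-challenge.a.b")))   # '**' does
```

Because the regex is generated from a restricted input syntax, the user never supplies regex metacharacters directly, which is what keeps untrusted patterns out of the database.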

I don't have any idea how to get the precedence rules into a single database query like the one you currently have.

The more I think about this idea, the more I realize how difficult this gets. But I still think it'd be awesome to have it :)
