Allow rule matching using path prefixes #652

dadrus · 2023-05-29T11:27:18Z

Preflight checklist

I agree to follow this project's Code of Conduct.
I have read and am following this repository's Contribution Guidelines."
I have discussed this feature request with the community.

Describe the background of your feature request

NOTE": This FR has been raised for Ory Oathkeeper in ory/oathkeeper#1089. Since heimdall implements the same logic for request to rule matching, it is suffering the same challenges. For that reason, implementation of the suggested functionality would be very beneficial.

The original FR description from @davidspek (all credits to him) slightly adjusted:

Currently rules are matched to an incoming request based on regex or glob pattern matching. Since only a single rule is allowed to match a request, the regex or glob patterns must be very precise and become increasingly complex as different paths need to be matched. While the used regex library supports negative lookahead, creating rules where requests are matched based on path prefixes is still difficult since you'd need to provide negative matches for all other rule regexes with the same base path. As glob doesn't support negative lookahead doing path prefix based routing is not possible. Even when using regex patterns with negative lookahead, the issue that arises is that the end user ends up needing to manage system with the state of all rules to then be able to create rules with the appropriate regexes including negative lookahead patterns for all other rules. Since the most common type of routing people likely would want to implement is prefixed based routing, the fact that doing so with heimdall is difficult and error prone is in my views its biggest drawback.

Describe your idea

Goals and non-goals
Goals:

Allow for easily matching routing rules using path prefixes.

Non-goals:

Allow for using regex in the scheme section of the URL, since that would make it much more difficult to enter rules into the Trie.
Allow for regex or glob patterns in places other in the host section of the URL, since making a Trie compatible with pattern matching is likely very difficult if not impossible.
Allow for regex or glob patterns in places other than the last section of the path, since making a Trie compatible with pattern matching is likely very difficult if not impossible.

The design

In order to allow for easily matching route prefixes we can make use of a Trie Data Structure. These are commonly used for things like search engines where you'd want to quickly find entries that contain a word without looping through all entries within the database. We will use a Trie to find the heimdall rule with the longest matching prefix. Since the time complexity is O(n) the matching of rules should be very fast, and likely much faster than what is currently being done since currently heimdall loops through each rule to compile a list of matches.

TODO: Draw a diagram

Entering rules into the Trie

To simplify and speed up matching, the first node in our Trie will be the http scheme used (for example: https, as parsed by url.Parse), followed by a node containing the host (as parsed by `url.Parse) and then the paths.

Now for the paths. First, we will cleanup the path and keep anything up to the first regex or glob pattern. Since we are only interested in full paths that match, we will not be entering each character of the path into the Trie. Rather, we will split the path of a rule by the / character and enter each complete path into the Trie as a node. On the last path we will add the rule object to the Trie node so we can retrieve it. Technically a Trie node can contain multiple rules if patterns are used somewhere in the paths.

Matching incoming requests

When a request comes in and prefix matching is enabled we will first search the Trie for the rule(s) that match the protocol, method, sheme, hostname and path. Next, we will evaluate the rule(s) for regex or glob pattern matching the same way that is done currently. If after the pattern matching there are still multiple rules that are matched an error will occur the same way this happens currently.

Examples

Using the above example Trie diagram the following requests would en up at the following nodes in the Trie.

Scheme	Hostname	Path	Node Number
https	example.com	/some/path	9
https	example.com	/some/other/path	12
https	example.com	/something/else	8
https	example.com	/some/thing	5
https	example.com	/unknown/path	1
https	example.com	/something/else/and/very/long	8
https	example.com	/	2
https	example.com	/some/long/path	7
https	example.com	/some/path	11

Are there any workarounds or alternatives?

Use the currently supported matching Methode with the challenges described above.

Version

v0.7.0-alpha

Additional Context

No response

davidspek · 2023-06-21T15:04:39Z

For reference the upstream PR that implements this is ory/oathkeeper#1073. @dadrus sorry for not looking at implementing this here yet, I've been very busy lately and I'm not sure when I'd have the time to implement it here.

dadrus · 2023-06-22T08:02:03Z

You don't have to feel sorry ;)
And thank you very much for your comment!

davidspek · 2023-07-10T11:45:17Z

FYI, you have a mistake in referencing me in the OP :).

dadrus · 2023-07-10T12:34:35Z

Oh... Sorry! Fixed.

dadrus added the feature Used for new features label May 29, 2023

dadrus added this to the v0.8.0-alpha milestone May 29, 2023

dadrus modified the milestones: v0.9.0-alpha, v0.10.0-alpha Jun 23, 2023

dadrus modified the milestones: v0.11.0-alpha, v0.12.0-alpha Aug 4, 2023

dadrus modified the milestones: v0.12.0-alpha, v0.13.0-alpha Sep 15, 2023

dadrus modified the milestones: v0.13.0-alpha, v0.14.0-alpha Nov 15, 2023

dadrus modified the milestones: v0.14.0-alpha, next Jan 3, 2024

dadrus mentioned this issue Apr 12, 2024

refactor!: Rules are loaded into a radix trie instead of a list #1358

Merged

6 tasks

dadrus closed this as completed in #1358 Apr 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow rule matching using path prefixes #652

Allow rule matching using path prefixes #652

dadrus commented May 29, 2023 •

edited

Loading

davidspek commented Jun 21, 2023

dadrus commented Jun 22, 2023

davidspek commented Jul 10, 2023

dadrus commented Jul 10, 2023

Allow rule matching using path prefixes #652

Allow rule matching using path prefixes #652

Comments

dadrus commented May 29, 2023 • edited Loading

Preflight checklist

Describe the background of your feature request

Describe your idea

Are there any workarounds or alternatives?

Version

Additional Context

davidspek commented Jun 21, 2023

dadrus commented Jun 22, 2023

davidspek commented Jul 10, 2023

dadrus commented Jul 10, 2023

dadrus commented May 29, 2023 •

edited

Loading