Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow rule matching using path prefixes #652

Closed
3 tasks done
dadrus opened this issue May 29, 2023 · 4 comments · Fixed by #1358
Closed
3 tasks done

Allow rule matching using path prefixes #652

dadrus opened this issue May 29, 2023 · 4 comments · Fixed by #1358
Labels
feature Used for new features
Milestone

Comments

@dadrus
Copy link
Owner

dadrus commented May 29, 2023

Preflight checklist

Describe the background of your feature request

NOTE": This FR has been raised for Ory Oathkeeper in ory/oathkeeper#1089. Since heimdall implements the same logic for request to rule matching, it is suffering the same challenges. For that reason, implementation of the suggested functionality would be very beneficial.

The original FR description from @davidspek (all credits to him) slightly adjusted:

Currently rules are matched to an incoming request based on regex or glob pattern matching. Since only a single rule is allowed to match a request, the regex or glob patterns must be very precise and become increasingly complex as different paths need to be matched. While the used regex library supports negative lookahead, creating rules where requests are matched based on path prefixes is still difficult since you'd need to provide negative matches for all other rule regexes with the same base path. As glob doesn't support negative lookahead doing path prefix based routing is not possible. Even when using regex patterns with negative lookahead, the issue that arises is that the end user ends up needing to manage system with the state of all rules to then be able to create rules with the appropriate regexes including negative lookahead patterns for all other rules. Since the most common type of routing people likely would want to implement is prefixed based routing, the fact that doing so with heimdall is difficult and error prone is in my views its biggest drawback.

Describe your idea

Goals and non-goals
Goals:

  • Allow for easily matching routing rules using path prefixes.

Non-goals:

  • Allow for using regex in the scheme section of the URL, since that would make it much more difficult to enter rules into the Trie.
  • Allow for regex or glob patterns in places other in the host section of the URL, since making a Trie compatible with pattern matching is likely very difficult if not impossible.
  • Allow for regex or glob patterns in places other than the last section of the path, since making a Trie compatible with pattern matching is likely very difficult if not impossible.

The design

In order to allow for easily matching route prefixes we can make use of a Trie Data Structure. These are commonly used for things like search engines where you'd want to quickly find entries that contain a word without looping through all entries within the database. We will use a Trie to find the heimdall rule with the longest matching prefix. Since the time complexity is O(n) the matching of rules should be very fast, and likely much faster than what is currently being done since currently heimdall loops through each rule to compile a list of matches.

TODO: Draw a diagram

Entering rules into the Trie

To simplify and speed up matching, the first node in our Trie will be the http scheme used (for example: https, as parsed by url.Parse), followed by a node containing the host (as parsed by `url.Parse) and then the paths.

Now for the paths. First, we will cleanup the path and keep anything up to the first regex or glob pattern. Since we are only interested in full paths that match, we will not be entering each character of the path into the Trie. Rather, we will split the path of a rule by the / character and enter each complete path into the Trie as a node. On the last path we will add the rule object to the Trie node so we can retrieve it. Technically a Trie node can contain multiple rules if patterns are used somewhere in the paths.

Matching incoming requests

When a request comes in and prefix matching is enabled we will first search the Trie for the rule(s) that match the protocol, method, sheme, hostname and path. Next, we will evaluate the rule(s) for regex or glob pattern matching the same way that is done currently. If after the pattern matching there are still multiple rules that are matched an error will occur the same way this happens currently.

Examples

Using the above example Trie diagram the following requests would en up at the following nodes in the Trie.

Scheme Hostname Path Node Number
https example.com /some/path 9
https example.com /some/other/path 12
https example.com /something/else 8
https example.com /some/thing 5
https example.com /unknown/path 1
https example.com /something/else/and/very/long 8
https example.com / 2
https example.com /some/long/path 7
https example.com /some/path 11

Are there any workarounds or alternatives?

Use the currently supported matching Methode with the challenges described above.

Version

v0.7.0-alpha

Additional Context

No response

@dadrus dadrus added the feature Used for new features label May 29, 2023
@dadrus dadrus added this to the v0.8.0-alpha milestone May 29, 2023
@davidspek
Copy link

For reference the upstream PR that implements this is ory/oathkeeper#1073. @dadrus sorry for not looking at implementing this here yet, I've been very busy lately and I'm not sure when I'd have the time to implement it here.

@dadrus
Copy link
Owner Author

dadrus commented Jun 22, 2023

You don't have to feel sorry ;)
And thank you very much for your comment!

@dadrus dadrus modified the milestones: v0.9.0-alpha, v0.10.0-alpha Jun 23, 2023
@davidspek
Copy link

FYI, you have a mistake in referencing me in the OP :).

@dadrus
Copy link
Owner Author

dadrus commented Jul 10, 2023

Oh... Sorry! Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Used for new features
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants