Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

caddyhttp: Escaping placeholders in CEL, add vars and vars_regexp #6594

Merged
merged 9 commits into from
Oct 2, 2024

Conversation

francislavoie
Copy link
Member

@francislavoie francislavoie commented Sep 27, 2024

Related to #6584, prior work in #4715 (comment)

Our current CEL placeholder regexp is too naive. It replaces everything that looks like a placeholder, and offers no good option to opt out if a literal placeholder-looking input is given to some CEL functions as input.

To solve this, we can do the placeholder replacement in two steps.

  • First, we match whether the placeholder is not preceded by a \ (or start of a line) and replace it and preserving the preceding character which is matched (we now have two capture groups so ${1} has to go to the start)
  • Second, we match for a placeholder which does have a preceding \, and drop the \.

That gives a way to escape through the caddyPlaceholder() replacement and get an input directly to the given matcher.

One interesting quirk: CEL itself has its own \\ escape sequence, so depending on whether we want the placeholder to be replaced by the matcher (e.g. header matcher does do placeholder replacements on its inputs at runtime) or we want the value to be raw determines how many backslashes we need. The test shows this pretty well I think:

  • If the header matcher should take the value as-is and not perform placeholder replacement (because the match value is also placeholder-like), then we need three backslashes, like \\\{foobar}. This is because the last one of the three is for escaping caddyPlaceholder(), then the prior two are collapsed into one by CEL's parsing itself, then the last one is to escape the placeholder replacer, and the result is a clean {foobar} matching value.
  • If the header matcher itself should perform placeholder replacement (not done in the CEL matcher, but deeper in the matcher itself) then a single backslash is used, like \{http.request.uri.path} (or \{path} in the Caddyfile). This only escapes past the caddyPlaceholder() regexp, but does not escape past the placeholder replacer which runs inside header.

We'll need to document this, but it's tricky 😅

After doing the above, I decided to also implement vars and vars_regexp support in CEL, which we skipped implementing because of the above placeholder limitations.

@francislavoie francislavoie added the feature ⚙️ New feature or request label Sep 27, 2024
@francislavoie francislavoie changed the title caddyhttp: Escaping placeholders in CEL caddyhttp: Escaping placeholders in CEL, add vars and vars_regexp Sep 27, 2024
@francislavoie
Copy link
Member Author

Couple more commits:

  • I noticed we were repeating "request" a lot, i.e. the variable we push into the CEL context. I made a const for that.
  • Similarly, I noticed we were using "caddyPlaceholder" in a couple places outside of the caddyhttp package (file matcher) so I renamed the const and exported it.
  • I renamed request to req, and caddyPlaceholder to placeholder, because actually users can use it in their own CEL without the magic shortcut, so making it more ergonomic is nice.
  • Bumped the CEL version to latest
  • Registered a few CEL extensions to provide some more functions to users (see https://github.com/google/cel-go/blob/master/ext/README.md)

@francislavoie francislavoie force-pushed the cel-placeholder-escape branch 7 times, most recently from 1fb98c3 to 7178c48 Compare October 1, 2024 05:48
Comment on lines 159 to 184
# To shorten the following lines
ssh_opts="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"
ssh_host="[email protected]"

# The environment is fresh, so there's no point in keeping accepting and adding the key.
rsync -arz -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --progress --delete --exclude '.git' . "$CI_USER"@ci-s390x.caddyserver.com:/var/tmp/"$short_sha"
ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -t "$CI_USER"@ci-s390x.caddyserver.com "cd /var/tmp/$short_sha; go version; go env; printf "\n\n";CGO_ENABLED=0 go test -p 1 -tags nobadger -v ./..."
rsync -arz -e "ssh $ssh_opts" --progress --delete --exclude '.git' . "$ssh_host":/var/tmp/"$short_sha"
ssh $ssh_opts -t "$ssh_host" bash <<EOF
cd /var/tmp/$short_sha
go version
go env
printf "\n\n"
CGO_ENABLED=0 go test -p 1 -tags nobadger -v ./...
EOF
test_result=$?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this! 😀

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, nicer to read it out multiline!

The point of this is I was trying to implement a while loop to retry the tests (to smooth over the flakiness short-term) but it wasn't working 🤦‍♂️ just a mess of dealing with the exit code coming through the shell script. I'll paste you what I had in slack if you wanna take a crack at it.

Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM I guess!

Did you still want to shorten things to ph and req?

.github/workflows/ci.yml Show resolved Hide resolved
@francislavoie
Copy link
Member Author

req yes, it's not ph anymore I went with placeholder... but get was suggested, idk what you think of that.

@francislavoie
Copy link
Member Author

After a quick chat with Matt on slack, decided to go with ph(req, <string>) as the shortcut (FYI @polarathene). Also added a matching shortcut to templates cause I noticed we had placeholder as a function in there as well. Cause why not. Some semblance of consistency I guess.

@mholt mholt merged commit 792f1c7 into master Oct 2, 2024
33 checks passed
@mholt mholt deleted the cel-placeholder-escape branch October 2, 2024 12:34
@mholt mholt added this to the v2.9.0-beta.1 milestone Oct 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature ⚙️ New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants