Skip to content

mgmt: Validate VpcPeering & friends #445

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 4, 2025
Merged

Conversation

qmonnet
Copy link
Member

@qmonnet qmonnet commented May 8, 2025

This Pull Request implements some validation for VpcExpose, VpcManifest, VpcPeering, and VpcTable objects received from the gRPC server.

Fixes: #442

Please refer to individual commit descriptions for details.

@qmonnet qmonnet self-assigned this May 8, 2025
@qmonnet qmonnet requested a review from a team as a code owner May 8, 2025 02:44
@qmonnet qmonnet added the area/nat Related to Network Address Translation (NAT) label May 8, 2025
@Fredi-raspall
Copy link
Contributor

@qmonnet I would strongly recommend rebasing this work on #441
Otherwise you're going to be fixing nits already fixed and add changes to code that has been significantly modified.

Copy link
Contributor

@Fredi-raspall Fredi-raspall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is not useful to review this PR until rebased.

@qmonnet
Copy link
Member Author

qmonnet commented May 8, 2025

I'm not rebasing on the top of three different PRs. Rebases in any of the three will cause me to need new rebases. (Although, there doesn't seem to be that many conflicts between your code and mine - mostly the changes and renaming of ApiError, I think).

If you prefer your sets to go in first - totally understandable - I'll wait for them to get merged, and then rebase.

@Fredi-raspall Fredi-raspall force-pushed the pr/qmonnet/nat-validation branch from 4da325d to 3898e9c Compare May 8, 2025 12:54
@qmonnet
Copy link
Member Author

qmonnet commented May 8, 2025

Oh, you did rebase 🤯
Thank you!

@Fredi-raspall Fredi-raspall self-requested a review May 8, 2025 13:00
@Fredi-raspall Fredi-raspall force-pushed the pr/qmonnet/nat-validation branch 4 times, most recently from 31b64ef to 6c1bc96 Compare May 8, 2025 14:17
@Fredi-raspall Fredi-raspall changed the base branch from main to pr/fredi/mgmt-cleanup May 8, 2025 14:18
@Fredi-raspall Fredi-raspall force-pushed the pr/fredi/mgmt-cleanup branch from 0a28eb3 to e382a20 Compare May 8, 2025 18:11
@qmonnet qmonnet marked this pull request as draft May 8, 2025 19:43
@qmonnet qmonnet force-pushed the pr/qmonnet/nat-validation branch from 6c1bc96 to 6ea7f0b Compare May 9, 2025 13:47
@qmonnet qmonnet requested a review from Fredi-raspall May 9, 2025 13:47
@qmonnet qmonnet marked this pull request as ready for review May 9, 2025 13:47
@qmonnet
Copy link
Member Author

qmonnet commented May 9, 2025

OK, we should be good for a second review

Copy link
Collaborator

@daniel-noland daniel-noland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work!

I especially appreciate the design comments.

@@ -187,6 +281,144 @@ mod tests {
);
}

#[test]
fn test_validate_expose() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two things

  1. why not make these distinct tests?
  2. why not use bolero?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Not sure what you mean, you propose to move each of the different case to a separate function? They felt close enough (all about testing the validation for VpcExpose) that I could group them in a single function, but if you think it's better to have them separated into many smaller functions, I guess that's also a possibility.

  2. Because I'm not familiar with bolero and haven't looked into it (or even thought about it, for that matter 🫣). I can, if you think it's worth it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I won't reject the PR on this basis.

I just think that bolero would be notably easier and more effective than this path.

Perhaps I should do a tutorial on how to use it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you did a presentation on it at some point? I need to find the link again. If I'm getting confused and you haven't, then yes, I'd gladly watch/read it!

@@ -144,10 +144,21 @@ impl VpcManifest {
}
}
pub fn add_expose(&mut self, expose: VpcExpose) -> ConfigResult {
expose.validate()?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it your intention to make it possible to construct invalid configurations?

Why remove the validation check here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is based on Fredi's feedback from my first version:

The gist is that we prefer having validation in a single place, and that, yes, it's acceptable to build a broken config that we will reject at validation time, if only to show it back to the user.

Copy link
Collaborator

@daniel-noland daniel-noland May 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

I can live with that.

It might be good to have a validated type which wraps this one in that case to express the static requirement that you can't feed an invalid config to the dataplane worker

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can look into it as a follow-up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should be able to construct objects that are not necessarily validate and then centralize the validation logic. It makes it much easier to fuzz test the gRPC converter for example because the fuzzer doesn't need a globally valid configuration.

Copy link
Member Author

@qmonnet qmonnet Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fredi-raspall Fredi-raspall force-pushed the pr/fredi/mgmt-cleanup branch 2 times, most recently from a2d87d8 to fac46e6 Compare May 10, 2025 16:08
@qmonnet qmonnet force-pushed the pr/qmonnet/nat-validation branch 2 times, most recently from db6bb9f to dae4cfd Compare May 12, 2025 10:23
@qmonnet qmonnet force-pushed the pr/qmonnet/nat-validation branch from f9f7223 to fe9eb86 Compare May 16, 2025 16:42
@qmonnet qmonnet marked this pull request as ready for review May 16, 2025 16:43
@qmonnet qmonnet requested a review from Fredi-raspall May 16, 2025 16:43
@qmonnet
Copy link
Member Author

qmonnet commented May 16, 2025

I've incorporated Fredi's feedback I think, this is ready for another review.

@qmonnet qmonnet force-pushed the pr/qmonnet/nat-validation branch from fe9eb86 to f536525 Compare May 20, 2025 14:27
@qmonnet qmonnet added this to the GW R1 milestone May 21, 2025
@qmonnet qmonnet force-pushed the pr/qmonnet/nat-validation branch 3 times, most recently from a89f119 to 3f13088 Compare May 23, 2025 16:32
@qmonnet qmonnet force-pushed the pr/qmonnet/nat-validation branch from 3f13088 to c500427 Compare June 3, 2025 11:50
@qmonnet qmonnet force-pushed the pr/qmonnet/nat-validation branch 2 times, most recently from 08b228b to 7318736 Compare June 4, 2025 12:19
Implement the skeleton function to validate a VpcExpose object. This
validation function is automatically called when adding a VpcExpose to a
VpcManifest object. The function performs the following steps:

 1. Make sure that all prefixes and exclusion prefixes for this
    VpcExpose are of the same IP version.
 2. Make sure that all prefixes (or exclusion prefixes) in each list
    (ips/nots/as_range/not_as) don't overlap with other prefixes (or
    exclusion prefixes, respectively) of this list.
 3. Make sure that all exclusion prefixes are contained within existing
    prefixes, unless the list of allowed elements is empty.
 4. Make sure exclusion prefixes in a list don't exclude all of the
    prefixes in the associated prefixes list.
 5. Make sure we have the same number of addresses available on each
    side (public/private), taking exclusion prefixes into account.

Add related ConfigError enum variants, as necessary to cover the
different error cases.

Add unit tests to check validation for a number of VpcExpose objects.

Some other unit tests in the crate need minor adjustments, because they
use incorrect VpcExpose objects with exclusion prefixes not covering
allowed prefixes, or with a mismatch between the number of addresses in
the public and private sets (which we reject because of static NAT).

Signed-off-by: Quentin Monnet <[email protected]>
@qmonnet qmonnet force-pushed the pr/qmonnet/nat-validation branch from 7318736 to 2fcbe18 Compare June 4, 2025 12:52
Copy link
Contributor

@Fredi-raspall Fredi-raspall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd see if we can make the validations less heavy and ponder whether all are required. In general, I'd not "parent" users. We should just reject stuff that will cause failures and not forbid configs which, while seemingly useless, are legitimate.
Requesting @mvachhar a second opinion on the complexity / performance cost of the validations. Other than that, looks good to me @qmonnet !

qmonnet added 4 commits June 4, 2025 15:11
Just like we did for VpcExpose, implement validation for VpcManifest.
There's not much to do this time, because most of the work consists in
validating the VpcExpose objects. Make sure the name is not empty, and
done.

Note: We should also prevent prefix overlap between the different
exposes in the manifest; this is not trivial, and will be added as a
separate commit.

Signed-off-by: Quentin Monnet <[email protected]>
When validating a VpcExpose object, we ensure that the prefixes in each
category (ips, as_range, nots, not_as) do not overlap. However, within a
VpcManifest, we must also ensure that prefixes from different VpcExpose
objects do not collide. In particular, we want to make sure that the
available addresses in the "ips" lists do not collide with "ips" lists
from other VpcExpose (or if the prefixes collide, they should also be
covered by exclusion prefixes). Same applies to the "as" lists.

Introduce a function to check overlap between two sets of addresses (one
set = allowed prefixes + exclusion prefixes). See comments in the code
for details on the implementation. Call this function as part of the
validation process for a VpcManifest, to ensure there is no collision.

Signed-off-by: Quentin Monnet <[email protected]>
When we try to insert a peering with a name that already exists for
another peering in a VpcPeeringTable, this results in an error, but it
also results in the peering being inserted anyway, as we only detect the
name collision when inserting (and we do not implement rollback).

As a consequence, we keep an incorrect peering entry in the table. This
can cause unexpected collisions on prefixes when trying to add
subsequent, legitimate peerings to the table, for example.

Instead of inserting unconditionally, let's check first for the
existence of the peering name in the table.

Fixes: e45c0ef (feat(mgmt): define external model)
Signed-off-by: Quentin Monnet <[email protected]>
In previous commits, we introduced validation for VpcExpose and
VpcManifest objects; now we do the same for a VpcTable.

Here, we simply need to ensure, after building the table, that for any
two given Vpcs, we have at most a single Peering object between the two:
having more than one is not currently supported.

Signed-off-by: Quentin Monnet <[email protected]>
@qmonnet qmonnet force-pushed the pr/qmonnet/nat-validation branch from 2fcbe18 to 58e2c60 Compare June 4, 2025 14:18
@qmonnet qmonnet requested a review from Fredi-raspall June 4, 2025 14:18
@qmonnet
Copy link
Member Author

qmonnet commented Jun 4, 2025

Both comments from Fredi should be addressed now.

Copy link
Contributor

@Fredi-raspall Fredi-raspall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @qmonnet , thanks for addressing my comments!

Copy link
Contributor

@mvachhar mvachhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, but please see issue #554 to have a validated and unvalidated set of types here in the future. The unvalidated types make it much easier to test the gRPC conversion where we want to make sure the data gets translated but aren't as worried about higher level semantics.

@mvachhar mvachhar added this pull request to the merge queue Jun 4, 2025
Merged via the queue into main with commit db09bc3 Jun 4, 2025
17 checks passed
@mvachhar mvachhar deleted the pr/qmonnet/nat-validation branch June 4, 2025 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/nat Related to Network Address Translation (NAT)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Validate gRPC objects
4 participants