Routed ingress primary udn with static ips #1793

Open
wants to merge 23 commits into
base: master
from

maiqueb
Contributor

@maiqueb maiqueb commented May 8, 2025

No description provided.

@openshift-ci openshift-ci bot requested review from abhat and trozet May 8, 2025 06:55
Contributor

openshift-ci bot commented May 8, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign trozet for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@maiqueb
Contributor Author

maiqueb commented May 8, 2025

/cc @kyrtapz

@openshift-ci openshift-ci bot requested a review from kyrtapz May 8, 2025 08:03
Comment on lines 133 to 136
A new CRD - named `IPPool`, or `DHCPLeaseConfig` (or the like) - will be
created, and is associated to a UDN (both UDN, and C-UDN). This CRD holds the
association of MAC address to IPs for a UDN. When importing the VM into
OpenShift Virt, MTV will provision / update this object with this information.
Contributor

This could also be a modifiable field on the UDN/CUDN, since it implicitly affects the network.

Contributor Author

it surely can, but I think:

  • that'll cause marshal/unmarshal of the (c)UDN object to cost more
  • there's value in separating the concepts
  • (c) UDN is an OVN-K entity. I don't see why this needs to be an OVN-K entity - OVN-K doesn't need write / read access to it. This is for the admin (or MTV, or any other introspection tool) to configure, and for the ipam-extensions to read / act upon its data

The `ipam-extensions` mutating webhook will kick in whenever a virt launcher
pod is created - it will identify when the VM has a primary UDN attachment
(already happens today), and will also identify when the pod network attachment
has a MAC address configuration request.
Contributor

btw we have seen that qemu also gets configured with the macAddress field on some envs; we should tackle that

Contributor Author

IIUC you're saying there's value outside of this enhancement (preserve IP / MAC / GW net config of imported VMs to L2 overlays w/ IPAM) in allowing the MAC address of the primary UDN attachment to be configurable.

Contributor

I mean that configuring macAddress at the VMI interface has consequences, that we should be aware of.

Comment on lines 164 to 230
OVN-Kubernetes will then act upon this information, by configuring the
requested MAC and IPs in the pod. If the allocation of the IP is successful,
said IPs will be persisted in the corresponding `IPAMClaim` CR (which already
happens today). If it fails (e.g. that IP address is already in use in the
subnet), the CNI will fail, crash-looping the pod. The error condition will be
reported in the associated `IPAMClaim` CR, and an event logged in the pod.

This flow is described in the following sequence diagram:
```mermaid
Contributor

Also for live migration/restart we are going to have both NSE address and the ipamclaims address, we should take care of that, maybe just checking that both are the same.

Contributor Author

that's a good point. First thing that comes to mind is we could report an error condition in the IPAMClaim + event on the pod when they differ.

Contributor Author

This will be handled in detail in the upstream OVN-Kubernetes OKEP.

Contributor Author

@maiqueb maiqueb left a comment

@qinqon tks for the review.

Comment on lines 164 to 230
OVN-Kubernetes will then act upon this information, by configuring the
requested MAC and IPs in the pod. If the allocation of the IP is successful,
said IPs will be persisted in the corresponding `IPAMClaim` CR (which already
happens today). If it fails (e.g. that IP address is already in use in the
subnet), the CNI will fail, crash-looping the pod. The error condition will be
reported in the associated `IPAMClaim` CR, and an event logged in the pod.

This flow is described in the following sequence diagram:
```mermaid
Contributor Author

that's a good point. First thing that comes to mind is we could report an error condition in the IPAMClaim + event on the pod when they differ.
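
A minimal Go sketch of the consistency check discussed in this thread, assuming the IPs requested through the network selection element and the IPs persisted in the `IPAMClaim` status are both available as strings. The helper names are hypothetical, and only the addresses are compared (prefix lengths are ignored); the actual behaviour is left to the upstream OKEP.

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
)

// normalize returns the sorted, canonical form of the given addresses.
// Each input may be a bare IP ("10.1.1.11") or a CIDR ("10.1.1.11/24");
// the prefix length is dropped, so only the addresses are compared.
func normalize(addrs []string) ([]string, error) {
	out := make([]string, 0, len(addrs))
	for _, a := range addrs {
		if p, err := netip.ParsePrefix(a); err == nil {
			out = append(out, p.Addr().String())
			continue
		}
		ip, err := netip.ParseAddr(a)
		if err != nil {
			return nil, fmt.Errorf("invalid address %q: %w", a, err)
		}
		out = append(out, ip.String())
	}
	sort.Strings(out)
	return out, nil
}

// checkConsistent (hypothetical helper) reports an error when the IPs
// requested for the pod differ from the IPs recorded in the IPAMClaim.
func checkConsistent(requested, claimed []string) error {
	r, err := normalize(requested)
	if err != nil {
		return err
	}
	c, err := normalize(claimed)
	if err != nil {
		return err
	}
	if len(r) != len(c) {
		return fmt.Errorf("requested IPs %v differ from claimed IPs %v", r, c)
	}
	for i := range r {
		if r[i] != c[i] {
			return fmt.Errorf("requested IPs %v differ from claimed IPs %v", r, c)
		}
	}
	return nil
}

func main() {
	// Same address on both sides: no error.
	fmt.Println(checkConsistent([]string{"10.1.1.11/24"}, []string{"10.1.1.11/24"}))
	// Different addresses: an error that could back a condition or a pod event.
	fmt.Println(checkConsistent([]string{"10.1.1.11/24"}, []string{"10.1.1.12/24"}))
}
```

The returned error is the kind of signal that could be surfaced as an `IPAMClaim` error condition and a pod event, as suggested above.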

pod is created - it will identify when the VM has a primary UDN attachment
(already happens today), and will also identify when the pod network attachment
has a MAC address configuration request.
It will then access the `IPPool` (or `DHCPLeaseConfig` for the UDN) to extract
Contributor

btw this CRD should be in the same "namespace" as IPAMClaim; it should not be an OVN thing

Contributor Author

hm, not 100% sure yet.

How will it work for C-UDNs ? We need the MAC:IPs association per network, which is a non-namespaced object.

Furthermore, I envision associating this pool to the network using the NAD, enabling this feature to work outside OVN-K (i.e. making it a part of k8snetworkplumbingwg, as IPAMClaims are).

Contributor

Sorry, I mean the kind namespace, not the resource namespace, like multus.io/IPAMClaim or the like.

Contributor Author

OK, then I agree. I don't see a reason for this CRD to be added on OVN-K. I think we can get it on k8snetworkplumbing - as IPAMClaim is.

@maiqueb maiqueb force-pushed the routed-ingress-primary-udn-with-static-ips branch from 4a7aa05 to 3b854af Compare May 8, 2025 14:28
Contributor

@tssurya tssurya left a comment

didn't review fully , stopped at the API starting point so that i can have context for today's meeting

solution, and OpenShift Virtualization currently supports these features on its
primary UserDefinedNetworks (UDNs).

These users have additional requirements, like routed ingress into their VMs,
Contributor

Is reachability to the VM what we are after when you say "routed"?
It would be good to clarify, here and everywhere, whether routed ingress is simply external to pod or something else.

Contributor Author

reachability to the VM without NAT, yes.


## Motivation

Some users are running VMs in virtualization platforms having a managed IP
Contributor

same here their own managed IP configuration rather than the CNI doing it

Contributor Author

I read the words, but I'm afraid I don't follow what the sentence means ...
Could you clarify ?

address (the one assigned to the VM).
- As the owner of a VM running in a traditional virtualization platform, I want
to import said VM into Kubernetes, attaching it to an overlay network. I want to
have managed IPs for my VMs, since that feature was already available in my
Contributor

I imagine this would be the IPAM disabled mode? Because once migration is done we don't let them do any IPAM management; new ones will get random IPs from IPAM
TBD to be revisited as spoken with @maiqueb on slack

Contributor Author

No, not at all.

IPAM disabled mode means OVN-K is 100% unaware of what IPs are in the workloads.
IP spoofing protection is off, network policies will only allow IPBlock selectors, and so forth.

association of MAC address to IPs for a UDN. When importing the VM into
OpenShift Virt, MTV will provision / update this object with this information.
This object provides the admin user with a single place to check the
MAC to IPs mapping. In a first implementation phase, we can have the
Contributor

hmm i don't think manual population like that will be acceptable will it? for ppl bringing in 5000 vms is this even possible?

Contributor Author

@maiqueb maiqueb May 9, 2025

a later implementation can focus on that. E.g. introspect the src cluster (check out ARP requests / replies on the switch) and provision this on behalf of the user.

... or something like that. That should be covered on an MTV enhancement.

required MAC address. We would need MTV to somehow also figure out what IP
addresses are on the aforementioned interfaces.

A new CRD - named `IPPool`, or `DHCPLeaseConfig` (or the like) - will be
Contributor

hmm why not make IPAMClaim's v2alpha2 API AddressClaims so enhance IPAMClaim ? :D

Contributor Author

  1. that's not a centralized place with the MAC => IPs association. There's value in that. Plus, it's closer to what existing platforms provide.
  2. I feel we'd do that because "there's a box we're opening already. Let's put this disjoint information into the same box, and retrieve both". I.e. we're bending what we have so we reach the goal we think we have faster.

Having said that, re-purposing IPAMClaims to something more generic is an option. I just feel we're not doing it for the right reasons. I'm afraid we might be doing it because it fits the time-frame rather than the use case.

FWIW, the issue of "importing 5000" VMs is still there if we take this route. How will the 5000 AddressClaims be provisioned ? By whom ? With what data ?

@maiqueb maiqueb force-pushed the routed-ingress-primary-udn-with-static-ips branch 2 times, most recently from 067607d to cc333a0 Compare May 12, 2025 14:40
to import said VM - whose IPs were statically configured - into Kubernetes,
attaching it to an overlay network. I want to ingress/egress using the same IP
address (the one assigned to the VM).
- As the owner of a VM running in a traditional virtualization platform, I want
Contributor

Is this localnet ipamful ?

Contributor Author

No - "As the owner of a VM running in a traditional virtualization platform, I want to import said VM into Kubernetes, attaching it to an overlay network"

Contributor

But it said

I want to
have managed IPs for my VMs

Contributor Author

Yes, but I still don't see the implication.

If it helps, this is about importing a VM into a layer2 overlay with IPAM.


- Preserve the original VM's MAC address
- Preserve the original VM's IP address
- Specify the gateway IP address of the imported VM so it can keep the same
Contributor

Do we know if the IP address is in the same subnet?

Contributor Author

Yeah, we'll only take care of that scenario.

I'll add it to the list of non-goals.

Contributor

Also add non-default gateway routes to the non-goals.

Contributor Author

Done.

be able to consume Kubernetes features like network policies, and services, to
benefit from the Kubernetes experience.

### Goals
Contributor

Do we also need to preserve the other DHCP stuff, like DNS servers and MTU? Imagine that we configure DHCPOptions with a bad MTU: after the lease we may break stuff.

Contributor Author

I don't understand what you mean, but I think we need to keep advertising MTU / DNS / other properties as we are already doing for primary UDN.

Contributor

Ahh, after migration the virtual machine gets restarted, so it will pick up the new MTU and the new DNS servers; nah, all good here.


### Non-Goals

Handle importing VMs without a managed IP experience - i.e. IPs were defined
Contributor

The enhancement is about static IPs, but we talk about managed IPs all over; it is quite confusing for me, since static IPs do not feel like managed IPs.

Contributor Author

they're not. Managed IPs are about having IPAM in the UDN, while static IPs are about having the ability to specify which IPs your workload is going to have.

I'll add a disclaimer / explanation to a nomenclature section.

OpenShift Virt, MTV (or a separate, dedicated component) will provision /
update this object with the required information (MAC to IPs association).
This object provides the admin user with a single place to check the
MAC to IPs mapping. In a first implementation phase, we can have the
Contributor

This is missing the step about MTV creating the VirtualMachine with macAddress field so ipam extensions can map it to IPs.

Contributor Author

Done.

Comment on lines 174 to 177
This approach requires the `IPAMClaim` CRD to be updated, specifically its
status sub-resource - we need to introduce `Conditions` so we can report errors
when allocating IPs which were requested by the user - what if the address is
already in use within the UDN ?
Contributor

This is useful always I think, for centralized and not centralized.

Contributor Author

I'm mentioning it somewhere else that this is needed for both options, as you're pointing out.

OVN-Kubernetes would read the CR, attempt to reserve the requested MAC and IP,
and then persist that information in the IPAMClaim status - reporting a
successful sync in the `IPAMClaim` status - or a failure otherwise.

Contributor

I am missing the mapping mechanism between VM and IPAMClaim

Contributor Author

This will be elaborated on the implementation details section. Does it make sense to split it like this ?

Contributor

ahh ok, implementation details is the place.


type IPPoolSpec struct {
NetworkName string `json:"network-name"`
Entries map[net.HardwareAddr][]net.IP `json:"entries"`
Contributor

Do we need this to be a CIDR ?

Suggested change
- Entries map[net.HardwareAddr][]net.IP `json:"entries"`
+ Entries map[net.HardwareAddr][]net.IPNet `json:"entries"`

The NSE ip request is a cidr

annotations:
    k8s.v1.cni.cncf.io/networks: '[
            {
                "name": "macvlan1-config",
                "ips": [ "10.1.1.11/24" ],
                "interface": "net1"
            }
    ]'

Maybe it is dangerous to do so.
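
A hedged Go sketch of how the spec under discussion could carry CIDRs. One thing worth noting: `net.HardwareAddr` is a byte slice, and Go map keys must be comparable, so neither form of `Entries` quoted above compiles as a map key type. The variant below is an assumption, not the proposed API: it stores MACs and CIDRs as strings and parses them on read. `parseEntries` and `ParsedEntry` are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
)

// IPPoolSpec is a hypothetical variant of the spec quoted in the review.
// The MAC is kept as a string key because net.HardwareAddr ([]byte) is not
// a valid map key type; IPs are kept as CIDR strings, as in the NSE "ips".
type IPPoolSpec struct {
	NetworkName string              `json:"network-name"`
	Entries     map[string][]string `json:"entries"`
}

// ParsedEntry is the validated form of one MAC => IPs association.
type ParsedEntry struct {
	MAC net.HardwareAddr
	IPs []*net.IPNet
}

// parseEntries validates every MAC and CIDR in the spec.
func parseEntries(spec IPPoolSpec) ([]ParsedEntry, error) {
	parsed := make([]ParsedEntry, 0, len(spec.Entries))
	for mac, cidrs := range spec.Entries {
		hw, err := net.ParseMAC(mac)
		if err != nil {
			return nil, fmt.Errorf("entry %q: %w", mac, err)
		}
		entry := ParsedEntry{MAC: hw}
		for _, cidr := range cidrs {
			ip, ipNet, err := net.ParseCIDR(cidr)
			if err != nil {
				return nil, fmt.Errorf("entry %q: %w", mac, err)
			}
			// ParseCIDR puts the network address in ipNet.IP;
			// keep the host address together with the mask instead.
			ipNet.IP = ip
			entry.IPs = append(entry.IPs, ipNet)
		}
		parsed = append(parsed, entry)
	}
	return parsed, nil
}

func main() {
	spec := IPPoolSpec{
		NetworkName: "example-network", // placeholder name
		Entries: map[string][]string{
			"02:03:04:05:06:07": {"10.1.1.11/24"},
		},
	}
	entries, err := parseEntries(spec)
	if err != nil {
		fmt.Println("invalid pool:", err)
		return
	}
	for _, e := range entries {
		fmt.Println(e.MAC, e.IPs)
	}
}
```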


type IPPoolStatus struct {
Conditions []Condition
AssociatedNADs []NADInfo
Contributor

That means we will need to reconcile on NAD creation even if it's not attached to any VM. Not sure it's worth it; I would just not include that, and add it only if needed.

Contributor Author

I think it is good information to have and present to the admin.

Still, I might flag it for an optional future improvement - since the feature itself does not require it.

Does that work for you @qinqon ?

Contributor

Yep, if it's not really needed, let's mark it as optional; we already have a lot of stuff on our plate.

Comment on lines 422 to 427
The `IPPool` CRD will have at least the following conditions:
- DuplicateMACAddresses: will indicate to the admin that a MAC address appears
multiple times in the `Entries` list
- DuplicateIPAddresses: will indicate to the admin that an IP address appears
multiple times associated to different MAC addresses in the `Entries` list
- Success: the data present in the spec is valid (no duplicate MACs or IPs)
Contributor

Show YAML with them to see the type, reason and message.

Contributor Author

Done.
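
As an illustration of the three conditions listed in this thread, a small Go sketch that validates a list of entries and derives `DuplicateMACAddresses`, `DuplicateIPAddresses`, or `Success`. The `Entry` and `Condition` types, and every reason and message string, are assumptions made for the example rather than the values used in the enhancement.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical list item pairing one MAC with its IPs.
type Entry struct {
	MAC string
	IPs []string
}

// Condition carries the fields the review asks to see; the reason and
// message strings below are placeholders, not the final API values.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// validate reports duplicate MACs, and IPs listed under different MACs.
func validate(entries []Entry) []Condition {
	seenMAC := map[string]bool{}
	ipOwner := map[string]string{} // IP -> MAC that first listed it
	var dupMACs, dupIPs []string
	for _, e := range entries {
		mac := strings.ToLower(e.MAC)
		if seenMAC[mac] {
			dupMACs = append(dupMACs, mac)
		}
		seenMAC[mac] = true
		for _, ip := range e.IPs {
			if owner, ok := ipOwner[ip]; ok && owner != mac {
				dupIPs = append(dupIPs, ip)
				continue
			}
			ipOwner[ip] = mac
		}
	}
	var conds []Condition
	if len(dupMACs) > 0 {
		conds = append(conds, Condition{
			Type: "DuplicateMACAddresses", Status: "True", Reason: "DuplicatesFound",
			Message: fmt.Sprintf("MAC addresses listed more than once: %v", dupMACs),
		})
	}
	if len(dupIPs) > 0 {
		conds = append(conds, Condition{
			Type: "DuplicateIPAddresses", Status: "True", Reason: "DuplicatesFound",
			Message: fmt.Sprintf("IPs associated to different MACs: %v", dupIPs),
		})
	}
	if len(conds) == 0 {
		conds = append(conds, Condition{
			Type: "Success", Status: "True", Reason: "SpecValid",
			Message: "no duplicate MACs or IPs",
		})
	}
	return conds
}

func main() {
	entries := []Entry{
		{MAC: "02:00:00:00:00:01", IPs: []string{"10.0.0.5"}},
		{MAC: "02:00:00:00:00:02", IPs: []string{"10.0.0.5"}},
	}
	for _, c := range validate(entries) {
		fmt.Printf("%+v\n", c)
	}
}
```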

Comment on lines 429 to 431
We plan on reporting in the `IPPool` the name of the NADs which are holding the
configuration for the network which this pool stores the MAC <=> IPs
associations.
Contributor

As said, this is unneeded extra work that needs new reconcile logic to keep it up to date.

Contributor Author

Listed these as optional improvements for improving the UX.

aligned with existing user expectations? Will it be a significant maintenance
burden? Is it likely to be superseded by something else in the near future?

## Alternatives (Not Implemented)
Contributor

Maybe we can put here the alternative of changing the KubeVirt VMI API, stating the problems with IP pool depletion and so on?

Contributor Author

Good point. I'll reference that alternative. FWIW, I assume you're mentioning adding an IP / IPRequests attribute to the KubeVirt Interface definition.

Contributor

yep, thanks.

Contributor Author

@maiqueb maiqueb left a comment

@qinqon please take another look


The `ipam-extensions` mutating webhook will kick in whenever a virt launcher
pod is created - it will identify when the VM has a primary UDN attachment
(already happens today), and will also identify when the pod network attachment
Contributor Author

Tried to disambiguate by pointing at the KubeVirt API reference attribute where the MAC is actually defined.

Comment on lines 164 to 230
OVN-Kubernetes will then act upon this information, by configuring the
requested MAC and IPs in the pod. If the allocation of the IP is successful,
said IPs will be persisted in the corresponding `IPAMClaim` CR (which already
happens today). If it fails (e.g. that IP address is already in use in the
subnet), the CNI will fail, crash-looping the pod. The error condition will be
reported in the associated `IPAMClaim` CR, and an event logged in the pod.

This flow is described in the following sequence diagram:
```mermaid
Contributor Author

This will be handled in detail in the upstream OVN-Kubernetes OKEP.


#### New IPPool CRD

The IPPool CRD will operate as a place to store the MAC to IP addresses
Contributor Author

Done.

address from the range that VMs have already been assigned outside of the
cluster (or for secondary IP addresses assigned to the VM's interfaces)

### Non-Goals
Contributor Author

Done.

MTV -->> VM Owner: OK
```

Hence, the required changes would be:
Contributor Author

Done.

Comment on lines 388 to 391
- SuccessfulAllocation: reports the IP address was successfully allocated for
the workload
- AllocationConflict: reports the requested allocation was not successful - i.e.
the requested IP address is already present in the network
Contributor Author

Done.
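
To make the two `IPAMClaim` conditions above concrete, a minimal in-memory Go sketch of an all-or-nothing reservation that yields either `SuccessfulAllocation` or `AllocationConflict`. The `allocator` type and the reason and message strings are hypothetical; the real allocation logic lives in OVN-Kubernetes and is not reproduced here.

```go
package main

import "fmt"

// Condition mirrors the usual type/status/reason/message shape.
type Condition struct {
	Type, Status, Reason, Message string
}

// allocator is a hypothetical in-memory view of one network: IP -> owner.
type allocator struct {
	inUse map[string]string
}

// reserve tries to reserve every requested IP for owner. Nothing is
// recorded unless all of the requested IPs are free (or already owned).
func (a *allocator) reserve(owner string, ips []string) Condition {
	for _, ip := range ips {
		if holder, taken := a.inUse[ip]; taken && holder != owner {
			return Condition{
				Type: "AllocationConflict", Status: "True", Reason: "IPAlreadyInUse",
				Message: fmt.Sprintf("IP %s is already present in the network", ip),
			}
		}
	}
	for _, ip := range ips {
		a.inUse[ip] = owner
	}
	return Condition{
		Type: "SuccessfulAllocation", Status: "True", Reason: "Allocated",
		Message: fmt.Sprintf("allocated %v", ips),
	}
}

func main() {
	a := &allocator{inUse: map[string]string{}}
	fmt.Printf("%+v\n", a.reserve("vm-a", []string{"10.0.0.5"}))
	// A second workload asking for the same IP gets a conflict.
	fmt.Printf("%+v\n", a.reserve("vm-b", []string{"10.0.0.5"}))
}
```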

Signed-off-by: Miguel Duarte Barroso <[email protected]>
@maiqueb maiqueb force-pushed the routed-ingress-primary-udn-with-static-ips branch 2 times, most recently from d695f23 to fac8ab7 Compare May 13, 2025 15:09
Comment on lines +595 to +611
There should be no hypershift platform-specific considerations with this
feature.
Contributor Author

@qinqon please keep me honest here.

Contributor

@qinqon qinqon left a comment

VMs + ipv6 + default gw is a big issue.

Comment on lines 70 to 73
- As the owner of a VM running in a traditional virtualization platform, I want
to import said VM into Kubernetes, attaching it to an overlay network. I want
to have managed IPs (IPAM) for my VMs, since that feature was already available
in my previous platform.
Contributor

This is still very confusing to me: in this case we import the VM, but users are fine with ovn-kubernetes assigning IPs?

Contributor Author

No - users want to specify their IPs.

But OVN-K is fully aware of which IPs are in which pods, thus it can have network policies (with pod / namespace selectors) and have IP spoofing protection enabled.

I'll try to reword the user story.

Contributor Author

@maiqueb maiqueb May 14, 2025

I think I'll drop it, I think this user story is represented in the other two:

  • As the owner of a VM running in a traditional virtualization platform, I want
    to import said VM - whose IPs were statically configured - into Kubernetes,
    attaching it to an overlay network. I want to ingress/egress using the same IP
    address (the one assigned to the VM).
  • As the owner of a VM running in a traditional virtualization platform, I want
    to import said VM into Kubernetes, attaching it to an overlay network. I want to
    be able to consume Kubernetes features like network policies, and services, to
    benefit from the Kubernetes experience.

This object gives the admin user a single place to check the MAC to IPs
mapping. In a first implementation phase, we can have the
admin provision these CRs manually. Later on, MTV (or any other cluster
introspection tool) can provision these on behalf of the admin.
Contributor

Nah all good, I was just pointing out stuff that was missing, but if it's going to be clarified I am all good.


The `ipam-extensions` mutating webhook will kick in whenever a virt launcher
pod is created - it will identify when the VM has a primary UDN attachment
(already happens today), and will also identify when the pod network attachment
Contributor

Don't use the word "attachment" since it's confusing with the multus idiom.

Comment on lines +329 to +340
- ipam-extensions (CNV component) will now also read the `IPPool` CRs for VMs
having primary UDNs in their namespaces, and requesting a specific MAC address
in their specs. These CRs will be used to generate the multus default network
annotation, which will be set in the pods by the mutating webhook.
Contributor

Should we fail if the VMI has a macAddress but there is no entry in the MAC -> IP mapping? Maybe the admin forgot it, so the VM keeps retrying, and once the admin fixes the issue the VM is corrected.

Also, we have to think about IPPool mutability: maybe we should be able to add new entries, but not modify or remove them?

Contributor Author

Should we fail if the VMI has a macAddress but there is no entry in the MAC -> IP mapping? Maybe the admin forgot it, so the VM keeps retrying, and once the admin fixes the issue the VM is corrected.

For this scenario, I think we should just create the request for the MAC to be preserved. I.e. you'd get the original VM MAC, but an IP that was assigned by the primary UDN IP allocator.

This is getting into the behavior we expect, we should iron this out ASAP.

@tssurya / @kyrtapz thoughts ?
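
To make the proposed fallback concrete, here is a minimal Go sketch, not the actual ipam-extensions code. The `poolEntries` and `Request` types and the `buildRequest` helper are hypothetical names invented for illustration, and the string-keyed map is an assumption about how the pool data might be held in memory. The MAC from the VM spec is always kept; static IPs are attached only when the pool has an entry for it, otherwise the IP list stays empty so the primary UDN IP allocator can choose.

```go
package main

import (
	"fmt"
	"net"
)

// poolEntries maps a normalized MAC string to the IPs recorded for it.
// Hypothetical shape, assumed only for this sketch.
type poolEntries map[string][]string

// Request is a hypothetical stand-in for what the webhook would ask for.
type Request struct {
	MAC string   // always preserved from the VM spec
	IPs []string // empty means the primary UDN IP allocator picks the IP
}

// buildRequest keeps the VM MAC and adds static IPs only when the pool
// holds an entry for that MAC.
func buildRequest(entries poolEntries, vmMAC string) (Request, error) {
	hw, err := net.ParseMAC(vmMAC)
	if err != nil {
		return Request{}, fmt.Errorf("invalid MAC %q: %w", vmMAC, err)
	}
	key := hw.String() // canonical lower-case, colon-separated form
	return Request{MAC: key, IPs: entries[key]}, nil
}

func main() {
	entries := poolEntries{"02:03:04:05:06:07": {"192.0.2.10/24"}}
	for _, mac := range []string{"02:03:04:05:06:07", "02:AA:BB:CC:DD:EE"} {
		req, err := buildRequest(entries, mac)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("mac=%s static-ips=%d\n", req.MAC, len(req.IPs))
	}
}
```

The alternative raised in the question (failing the VM until the admin adds the entry) would replace the empty-IPs case with an error return.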

Also, we have to think about IPPool mutability: maybe we should be able to add new entries, but not modify or remove them?

For that we should consider what we want the pool for: e.g. do we want to sync the pool also w/ the dynamic allocations ? That would be good for observability - but it is IMHO a follow-up feature we may want to pursue.

That would require the pool to be subject to updates.

Having the pool immutable (only data provisioned during creation) would only allow you to fulfill the importing use case.

We should also consider that if we want to be able to create a VM attached to a primary UDN with dedicated MAC / IP addresses, we need the pool to be mutable.

We need to get our goals set up ASAP.

What are your thoughts @tssurya @kyrtapz ?
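
For reference, the "add new entries, but never modify or remove" option could be enforced with a check along these lines. This is only a sketch under assumptions: `validateAppendOnly` is a hypothetical helper, the pool entries are modelled as a plain `map[string][]string`, and where such a check would run (for example, an admission webhook) is not decided in this thread.

```go
package main

import (
	"fmt"
	"reflect"
)

// validateAppendOnly rejects an update that removes or changes an existing
// MAC entry, while allowing brand new entries to be added.
// Hypothetical helper; IP lists are compared in order.
func validateAppendOnly(oldEntries, newEntries map[string][]string) error {
	for mac, oldIPs := range oldEntries {
		newIPs, found := newEntries[mac]
		if !found {
			return fmt.Errorf("entry for MAC %s was removed", mac)
		}
		if !reflect.DeepEqual(oldIPs, newIPs) {
			return fmt.Errorf("entry for MAC %s was modified", mac)
		}
	}
	return nil
}

func main() {
	current := map[string][]string{"02:03:04:05:06:07": {"192.0.2.10/24"}}
	added := map[string][]string{
		"02:03:04:05:06:07": {"192.0.2.10/24"},
		"02:03:04:05:06:08": {"192.0.2.11/24"},
	}
	changed := map[string][]string{"02:03:04:05:06:07": {"192.0.2.99/24"}}

	fmt.Println("add entry:", validateAppendOnly(current, added))
	fmt.Println("change entry:", validateAppendOnly(current, changed))
}
```

Syncing dynamic allocations into the pool, as mentioned above, would need a looser rule than this one, since released IPs would have to be removed again.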

Comment on lines +238 to +243
The IPPool CRD is a cluster-scoped object associated to a UDN via the logical
network name (`NAD.Spec.Config.Name` attribute), since we want to have this
feature upstream in the k8snetworkplumbingwg, rather than in OVN-Kubernetes.

The `IPPool` spec will have an attribute via which the admin can point to a
cluster UDN - by the logical network name. The admin (which is the only actor
able to create the `IPPool`) has read access to all NADs in all namespaces,
hence they can inspect the NAD object to extract the network name. We could
even update the cluster UDN type to feature the generated network name in its
status sub-resource, to simplify the UX of the admin user.
Contributor

Is it a problem that the IPPool networkName field is not indexed? In the end, ipam-extensions will have to iterate over the pools instead of using "List", though I understand they will be in the informer's cache.

Contributor Author

Is it a problem that the IPPool networkName field is not indexed? In the end, ipam-extensions will have to iterate over the pools instead of using "List", though I understand they will be in the informer's cache.

It is what it is ... we either use label selectors, list then filter, or associate using different means - i.e. point at the NAD or cluster UDN name (both are bad IMHO).

I'm going with what I think is the least bad option. Maybe relying on selectors is not that bad though ...

Waiting for more feedback.
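
As a rough illustration of the trade-off, the Go sketch below contrasts the "list then filter" scan with an index built once from the cached objects. The `IPPool` struct here is a trimmed-down, hypothetical stand-in for the proposed CR, and both helper functions are invented for this example. With client-go informers, a custom indexer (`cache.Indexers`) could presumably play the role of `indexByNetwork`, though nothing in this PR commits to that.

```go
package main

import "fmt"

// IPPool is a trimmed-down, hypothetical stand-in for the proposed CR.
type IPPool struct {
	Name        string
	NetworkName string
}

// filterByNetwork is the "list then filter" option: a linear scan over
// every pool already present in the local cache.
func filterByNetwork(cached []IPPool, networkName string) []IPPool {
	var matches []IPPool
	for _, pool := range cached {
		if pool.NetworkName == networkName {
			matches = append(matches, pool)
		}
	}
	return matches
}

// indexByNetwork builds a lookup table once, so later lookups by network
// name do not need to rescan the whole list.
func indexByNetwork(cached []IPPool) map[string][]IPPool {
	index := make(map[string][]IPPool)
	for _, pool := range cached {
		index[pool.NetworkName] = append(index[pool.NetworkName], pool)
	}
	return index
}

func main() {
	cached := []IPPool{
		{Name: "pool-a", NetworkName: "tenant-blue"},
		{Name: "pool-b", NetworkName: "tenant-red"},
	}
	fmt.Println(len(filterByNetwork(cached, "tenant-blue")))
	fmt.Println(len(indexByNetwork(cached)["tenant-red"]))
}
```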

Preserving the gateway will require changes to the OVN-Kubernetes API. The
cluster UDN CRD should be updated - adding a gateway definition - and the
OVN-Kubernetes
[NetConf](https://github.com/ovn-kubernetes/ovn-kubernetes/blob/2643dabe165bcb2d4564866ee1476a891c316fe3/go-controller/pkg/cni/types/types.go#L10)
Contributor

Maybe we can show an example with it ?

apiVersion: v1
kind: Pod
metadata:
  name: pod-example
  annotations:
    v1.multus-cni.io/default-network: '{
      "name": "isolated-net",
      "namespace": "myisolatedns",
      ...
      "default-route": [
        "192.0.2.5",
        "fd90:1234::5"
      ]
    }'

Contributor Author

that's really not what I mean here - you're showing the network selection element for the multus default network.

I'm saying the NAD should have a place for the user to indicate which GW to use on all attachments to that network. I don't see how showing an NSE can be an example of that.
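
Whichever API object ends up carrying the network-wide gateway, the non-goal of importing VMs whose gateway is outside the network's subnet suggests a validation step. A minimal sketch in Go, assuming a hypothetical `validateGateway` helper that receives the subnet and the gateway as strings (the real field names and where the check would live are not defined in this PR):

```go
package main

import (
	"fmt"
	"net"
)

// validateGateway checks that the gateway is a valid IP inside the subnet.
// Hypothetical helper, written only to illustrate the constraint.
func validateGateway(subnetCIDR, gateway string) error {
	_, subnet, err := net.ParseCIDR(subnetCIDR)
	if err != nil {
		return fmt.Errorf("invalid subnet %q: %w", subnetCIDR, err)
	}
	gw := net.ParseIP(gateway)
	if gw == nil {
		return fmt.Errorf("invalid gateway IP %q", gateway)
	}
	if !subnet.Contains(gw) {
		return fmt.Errorf("gateway %s is outside subnet %s", gateway, subnetCIDR)
	}
	return nil
}

func main() {
	fmt.Println(validateGateway("192.0.2.0/24", "192.0.2.5"))
	fmt.Println(validateGateway("192.0.2.0/24", "198.51.100.1"))
	fmt.Println(validateGateway("fd90:1234::/64", "fd90:1234::5"))
}
```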

@maiqueb maiqueb force-pushed the routed-ingress-primary-udn-with-static-ips branch from 5c37dfe to 12b1a2c Compare May 14, 2025 12:10
maiqueb added 5 commits May 14, 2025 17:45
This commits fills out the following sections:
- summary
- motivation
- user stories
- goals
- non-goals

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
This commit describes how an imported VM's gateway will be preserved,
which is required for the network configuration on the guests to be
identical on both the source/destination clusters.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
maiqueb added 17 commits May 14, 2025 17:45
Indicate the changes to the `IPAMClaim` CRD, as well as the definition
of the proposed new `IPPool` CRD.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
@maiqueb maiqueb force-pushed the routed-ingress-primary-udn-with-static-ips branch from 01ad846 to c117013 Compare May 14, 2025 16:45
Contributor

openshift-ci bot commented May 14, 2025

@maiqueb: all tests passed!


- Preserve the original VM's IP address
- Specify the gateway IP address of the imported VM so it can keep the same
default route
- Allow excludeSubnets to be used with L2 UDNs to ensure OVNK does not use an IP
Contributor

@trozet @tssurya @maiqueb
We need to get clarity on whether this is a requirement for this enhancement.
We talked about it offline and we do not see this as a must for importing VMs.


- Preserve the original VM's MAC address
- Preserve the original VM's IP address
- Specify the gateway IP address of the imported VM so it can keep the same
Contributor

My understanding is that the gateway IP we want to allow the user to specify is the one of the UDN; we do not want to allow each imported VM to specify its own. Could you clarify that here?


### Non-Goals

- Handle importing VMs without a managed IP experience - i.e. IPs were defined
Contributor

Is this a limitation of MTV?

- Importing a VM whose gateway is outside the subnet of the network.
- Adding non default routes to the VM when importing it into OpenShift Virt.
- Modifying the default gateway and management IPs of a primary UDN after it was created.
- Modifying a pod's network configuration after the pod was created.
Contributor

Do you think it is worth specifying that we are not going to support "live" migrating VMs, i.e. importing VMs without a reboot?

- Modifying the default gateway and management IPs of a primary UDN after it was created.
- Modifying a pod's network configuration after the pod was created.

**NOTE:** implementing support on UDNs (achieving the namespace isolation
Contributor

This approach requires having N CRs with a 1:1 association between a primary
UDN attachment and the MAC and IPs it had on the original platform.

We could either introduce a new CRD with IPs and MAC association (which would
Contributor

The de-centralized approach described here changes how CNV integrates with OVN-K.
There would no longer be a need to pass the IP/MAC requests through the v1.multus-cni.io/default-network annotation.


#### Centralized IP management workflow

The [centralized IP management](#centralized-ip-management) flow is described
Contributor

Can you add a step for UDN provisioning?

#### Centralized IP management

A new CRD - named `IPPool`, or `DHCPLeaseConfig` (or the like) - will be
created, and is associated to a cluster UDN. This CRD holds the association of
Contributor

This CRD would be cluster scoped and limited to work with cluster UDNs only?
Are there any specific reasons limiting it to cluster UDNs?

Interface string `json:"interface"`
+ // The IPs requested by the user
+ // +optional
+ IPRequests []CIDR `json:"ipRequests,omitempty"`
Contributor

The de-centralized IP management chapter states that the IPAMClaimSpec would also include the MAC address; it's missing here.


type IPPoolSpec struct {
NetworkName string `json:"network-name"`
Entries map[net.HardwareAddr][]net.IPNet `json:"entries"`
Contributor

NIT: Indentation is broken in this code snippet
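
Beyond indentation, one more thing may be worth checking in that snippet: `net.HardwareAddr` is defined as a byte slice, and Go does not allow slice types as map keys, so `map[net.HardwareAddr][]net.IPNet` would not compile. A minimal sketch of one possible alternative, assuming string keys and string CIDR values that are parsed on lookup (the `Lookup` method is a hypothetical helper, not part of the proposal):

```go
package main

import (
	"fmt"
	"net"
)

// IPPoolSpec keeps MACs and IPs as strings so the type is a valid Go map
// and serializes to JSON without custom marshalling.
// The string-based Entries field is an assumption made for this sketch.
type IPPoolSpec struct {
	NetworkName string              `json:"network-name"`
	Entries     map[string][]string `json:"entries"`
}

// Lookup normalizes the MAC and returns the parsed IPs recorded for it.
// Hypothetical helper; returns an empty result when the MAC has no entry.
func (s IPPoolSpec) Lookup(mac string) ([]*net.IPNet, error) {
	hw, err := net.ParseMAC(mac)
	if err != nil {
		return nil, fmt.Errorf("invalid MAC %q: %w", mac, err)
	}
	var result []*net.IPNet
	for _, cidr := range s.Entries[hw.String()] {
		ip, ipNet, err := net.ParseCIDR(cidr)
		if err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", cidr, err)
		}
		ipNet.IP = ip // keep the host address rather than the network address
		result = append(result, ipNet)
	}
	return result, nil
}

func main() {
	spec := IPPoolSpec{
		NetworkName: "tenant-blue",
		Entries:     map[string][]string{"02:03:04:05:06:07": {"192.0.2.10/24"}},
	}
	ips, err := spec.Lookup("02:03:04:05:06:07")
	fmt.Println(ips, err)
}
```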
