-
Notifications
You must be signed in to change notification settings - Fork 499
Routed ingress primary udn with static ips #1793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Routed ingress primary udn with static ips #1793
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cc @kyrtapz |
A new CRD - named `IPPool`, or `DHCPLeaseConfig` (or the like) - will be | ||
created, and is associated to a UDN (both UDN, and C-UDN). This CRD holds the | ||
association of MAC address to IPs for a UDN. When importing the VM into | ||
OpenShift Virt, MTV will provision / update this object with this information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can also be a modificable field at the UDN/CUDN, since implicitly is affecting the network
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it surely can, but I think:
- that'll cause marshal/unmarshal of the (c)UDN object to cost more
- there's value in separating the concepts
- (c) UDN is an OVN-K entity. I don't see why this needs to be an OVN-K entity - OVN-K doesn't need write / read access to it. This is for the admin (or MTV, or any other introspection tool) to configure, and for the ipam-extensions to read / act upon its data
The `ipam-extensions` mutating webhook will kick in whenever a virt launcher | ||
pod is created - it will identify when the VM has a primary UDN attachment | ||
(already happens today), and will also identify when the pod network attachment | ||
has a MAC address configuration request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw we have see that qemu do also get configured with macAddress field on some envs, we should tackle that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC you're saying there's value outside of this enhancement (preserve IP / MAC / GW net config of imported VMs to L2 overlays w/ IPAM) in allowing the MAC address of the primary UDN attachment to be configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that configuring macAddress at the VMI interface has consequences, that we should be aware of.
OVN-Kubernetes will then act upon this information, by configuring the | ||
requested MAC and IPs in the pod. If the allocation of the IP is successful, | ||
said IPs will be persisted in the corresponding `IPAMClaim` CR (which already | ||
happens today). If it fails (e.g. that IP address is already in use in the | ||
subnet), the CNI will fail, crash-looping the pod. The error condition will be | ||
reported in the associated `IPAMClaim` CR, and an event logged in the pod. | ||
|
||
This flow is described in the following sequence diagram: | ||
```mermaid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also for live migration/restart we are going to have both NSE address and the ipamclaims address, we should take care of that, maybe just checking that both are the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good point. First thing that comes to mind is we could report an error condition in the IPAMClaim + event on the pod when they differ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be handled in detail in the upstream OVN-Kubernetes OKEP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qinqon tks for the review.
A new CRD - named `IPPool`, or `DHCPLeaseConfig` (or the like) - will be | ||
created, and is associated to a UDN (both UDN, and C-UDN). This CRD holds the | ||
association of MAC address to IPs for a UDN. When importing the VM into | ||
OpenShift Virt, MTV will provision / update this object with this information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it surely can, but I think:
- that'll cause marshal/unmarshal of the (c)UDN object to cost more
- there's value in separating the concepts
- (c) UDN is an OVN-K entity. I don't see why this needs to be an OVN-K entity - OVN-K doesn't need write / read access to it. This is for the admin (or MTV, or any other introspection tool) to configure, and for the ipam-extensions to read / act upon its data
The `ipam-extensions` mutating webhook will kick in whenever a virt launcher | ||
pod is created - it will identify when the VM has a primary UDN attachment | ||
(already happens today), and will also identify when the pod network attachment | ||
has a MAC address configuration request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC you're saying there's value outside of this enhancement (preserve IP / MAC / GW net config of imported VMs to L2 overlays w/ IPAM) in allowing the MAC address of the primary UDN attachment to be configurable.
OVN-Kubernetes will then act upon this information, by configuring the | ||
requested MAC and IPs in the pod. If the allocation of the IP is successful, | ||
said IPs will be persisted in the corresponding `IPAMClaim` CR (which already | ||
happens today). If it fails (e.g. that IP address is already in use in the | ||
subnet), the CNI will fail, crash-looping the pod. The error condition will be | ||
reported in the associated `IPAMClaim` CR, and an event logged in the pod. | ||
|
||
This flow is described in the following sequence diagram: | ||
```mermaid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good point. First thing that comes to mind is we could report an error condition in the IPAMClaim + event on the pod when they differ.
pod is created - it will identify when the VM has a primary UDN attachment | ||
(already happens today), and will also identify when the pod network attachment | ||
has a MAC address configuration request. | ||
It will then access the `IPPool` (or `DHCPLeaseConfig` for the UDN) to extract |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw this CRD should be at the same "namespace" as IPAMClaim it should not be a OVN thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, not 100% sure yet.
How will it work for C-UDNs ? We need the MAC:IPs association per network, which is a non-namespaced object.
Furthermore, I envision associating this pool to the network using the NAD, enabling this feature to work outside OVN-K (i.e. making it a part of k8snetworkplumbingwg, as IPAMClaims
are).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I mean the kind namespace, not the resource namespace like muiltus.io/IPAMClaim or the like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, then I agree. I don't see a reason for this CRD to be added on OVN-K. I think we can get it on k8snetworkplumbing - as IPAMClaim is.
4a7aa05
to
3b854af
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
didn't review fully , stopped at the API starting point so that i can have context for today's meeting
solution, and OpenShift Virtualization currently supports these features on its | ||
primary UserDefinedNetworks (UDNs). | ||
|
||
These users have additional requirements, like routed ingress into their VMs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reachability to the VM is what we are after when you say "routed" ?
routed ingress here and everywhere would be good to clarify if its simple external to pod or something else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reachability to the VM without NAT, yes.
|
||
## Motivation | ||
|
||
Some users are running VMs in virtualization platforms having a managed IP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here their own managed IP configuration rather than the CNI doing it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the words, but I'm afraid I don't follow what the sentence means ...
Could you clarify ?
address (the one assgined to the VM). | ||
- As the owner of a VM running in a traditional virtualization platform, I want | ||
to import said VM into Kubernetes, attaching it to an overlay network. I want to | ||
have managed IPs for my VMs, since that feature was already available in my |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine this would be the IPAM disable mode? cause once migraton is done we don't let them do any IPAM management, new ones will get random IPs from IPAM
TBD to be revisited as spoken with @maiqueb on slack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not at all.
IPAM disabled mode is OVN-K is 100% unaware of what IPs are in the workloads.
IP spoofing protection is off, network policies will only allow IPBlock
selectors, and so forth.
association of MAC address to IPs for a UDN. When importing the VM into | ||
OpenShift Virt, MTV will provision / update this object with this information. | ||
This object is providing to the admin user a single place to check the IP | ||
address MAC to IPs mapping. On an first implementation phase, we can have the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm i don't think manual population like that will be acceptable will it? for ppl bringing in 5000 vms is this even possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a later implementation can focus on that. E.g. introspect the src cluster (check out ARP requests / replies on the switch) and provision this on behalf of the user.
... or something like that. That should be covered on an MTV enhancement.
required MAC address. We would need MTV to somehow also figure out what IP | ||
addresses are on the aforementioned interfaces. | ||
|
||
A new CRD - named `IPPool`, or `DHCPLeaseConfig` (or the like) - will be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm why not make IPAMClaim's v2alpha2 API AddressClaims
so enhance IPAMClaim ? :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- that's not a centralized place with the MAC => IPs association. There's value in that. Plus, it's closer to what existing platforms provide.
- I feel we'd do that because "there's a box we're opening already. Let's put this disjoint information into the same box, and retrieve both". I.e. we're bending what we have so we reach the goal we think we have faster.
Having said that, re-purposing IPAMClaims to something more generic is an option. I just feel we're not doing it for the right reasons. I'm afraid we might be doing it because it fits the time-frame rather than the use case.
FWIW, the issue of "importing 5000" VMs is still there if we take this route. How will the 5000 AddressClaims
be provisioned ? By whom ? With what data ?
067607d
to
cc333a0
Compare
to import said VM - whose IPs were statically configured - into Kubernetes, | ||
attaching it to an overlay network. I want to ingress/egress using the same IP | ||
address (the one assgined to the VM). | ||
- As the owner of a VM running in a traditional virtualization platform, I want |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this localnet ipamful ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - "As the owner of a VM running in a traditional virtualization platform, I want to import said VM into Kubernetes, attaching it to an overlay network"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But is said
I want to
have managed IPs for my VMs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I still don't see the implication.
If it helps, this is about importing a VM into a layer2 overlay with IPAM.
|
||
- Preserve the original VM's MAC address | ||
- Preserve the original VM's IP address | ||
- Specify the gateway IP address of the imported VM so it can keep the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know if the ip address is at the same subnet ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we'll only take care of that scenario.
I'll add it to the list of non-goals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also put at the non goals adding non default gw routes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
be able to consume Kubernetes features like network policies, and services, to | ||
benefit from the Kubernetes experience. | ||
|
||
### Goals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need to preserve the other DHCP stuff like DNS servers and MTU ? like imaging that we configure DHCPOPtions with bad MTU, after the lease we may break stuff ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what you mean, but I think we need to keep advertising MTU / DNS / other properties as we are already doing for primary UDN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh after migration virtual machine get restarted so it will take the new MTU for new system also the new DNS servers, nah all good here.
|
||
### Non-Goals | ||
|
||
Handle importing VMs without a managed IP experience - i.e. IPs were defined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enhacement is about static IPs but we talk all over about managed IPs, is quite confusing for me, static IPs does not feel like managed IPs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they're not. Managed IPs are about having IPAM in the UDN, while static IPs are about having the ability of specifying which IPs is your workload going to have.
I'll add a disclaimer / explanation to a nomenclature section.
OpenShift Virt, MTV (or a separate, dedicated component) will provision / | ||
update this object with the required information (MAC to IPs association). | ||
This object is providing to the admin user a single place to check the IP | ||
address MAC to IPs mapping. On an first implementation phase, we can have the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is missing the step about MTV creating the VirtualMachine with macAddress
field so ipam extensions can map it to IPs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
This approach requires the `IPAMClaim` CRD to be updated, specifically its | ||
status sub-resource - we need to introduce `Conditions` so we can report errors | ||
when allocating IPs which were requested by the user - what if the address is | ||
already in use within the UDN ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is useful always I think, for centralized and not centralized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm mentioning it somewhere else that this is needed for both options, as you're pointing out.
OVN-Kubernetes would read the CR, attempt to reserve the requested MAC and IP, | ||
and then persist that information in the IPAMClaim status - reporting a | ||
successful sync in the `IPAMClaim` status - or a failure otherwise. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am missing the mapping mechanism between VM and IPAMClaim
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be elaborated on the implementation details section. Does it make sense to split it like this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh ok, implementation details is the place.
|
||
type IPPoolSpec struct { | ||
NetworkName string `json:"network-name"` | ||
Entries map[net.HardwareAddr][]net.IP `json:"entries"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this to be a CIDR ?
Entries map[net.HardwareAddr][]net.IP `json:"entries"` | |
Entries map[net.HardwareAddr][]net.IPNet `json:"entries"` |
The NSE ip request is a cidr
annotations:
k8s.v1.cni.cncf.io/networks: '[
{
"name": "macvlan1-config",
"ips": [ "10.1.1.11/24" ],
"interface": "net1"
}
]'
Maybe is dangerous to do so.
|
||
type IPPoolStatus struct { | ||
Conditions []Condition | ||
AssociatedNADs []NADInfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That means we will need to reconcile on nad creation even if i's not attached to any VM, not sure it's worth it, I would just not include that and add if only if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is good information to have and present to the admin.
Still, I might flag it for an optional future improvement - since the feature itself does not require it.
Does that work for you @qinqon ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, if it's not really needed, let's mark it as opìonal, we have already a lot of stuff at our plate.
The `IPPool` CRD will have at least the following conditions: | ||
- DuplicateMACAddresses: will indicate to the admin that a MAC address appears | ||
multiple times in the `Entries` list | ||
- DuplicateIPAddresses: will indicate to the admin that an IP address appears | ||
multiple times associated to different MAC addresses in the `Entries` list | ||
- Success: the data present in the spec is valid (no duplicate MACs or IPs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Show yaml with them to see typoe, reason and message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
We plan on reporting in the `IPPool` the name of the NADs which are holding the | ||
configuration for the network which this pool stores the MAC <=> IPs | ||
associations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said this is a not needed extra work that need a new reconcile logic to keep this up to date.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Listed these as optional improvements for improving the UX.
aligned with existing user expectations? Will it be a significant maintenance | ||
burden? Is it likely to be superceded by something else in the near future? | ||
|
||
## Alternatives (Not Implemented) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can put here the alternative of chaning kubevirt VMI api ? stating the problems with ip pool depletion and stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'll refer that alternative. FWIW, I assume you're mentioning adding an IP
/ IPRequest
s attribute to the KubeVirt Interface
definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qinqon please take another look
|
||
- Preserve the original VM's MAC address | ||
- Preserve the original VM's IP address | ||
- Specify the gateway IP address of the imported VM so it can keep the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
OVN-Kubernetes would read the CR, attempt to reserve the requested MAC and IP, | ||
and then persist that information in the IPAMClaim status - reporting a | ||
successful sync in the `IPAMClaim` status - or a failure otherwise. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be elaborated on the implementation details section. Does it make sense to split it like this ?
|
||
The `ipam-extensions` mutating webhook will kick in whenever a virt launcher | ||
pod is created - it will identify when the VM has a primary UDN attachment | ||
(already happens today), and will also identify when the pod network attachment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to disambiguate by pointing at the KubeVirt API reference attribute where the MAC is actually defined.
OVN-Kubernetes will then act upon this information, by configuring the | ||
requested MAC and IPs in the pod. If the allocation of the IP is successful, | ||
said IPs will be persisted in the corresponding `IPAMClaim` CR (which already | ||
happens today). If it fails (e.g. that IP address is already in use in the | ||
subnet), the CNI will fail, crash-looping the pod. The error condition will be | ||
reported in the associated `IPAMClaim` CR, and an event logged in the pod. | ||
|
||
This flow is described in the following sequence diagram: | ||
```mermaid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be handled in detail in the upstream OVN-Kubernetes OKEP.
|
||
#### New IPPool CRD | ||
|
||
The IPPool CRD will operate as a place to store the MAC to IP addresses |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
address from the range that VMs have already been assigned outside of the | ||
cluster (or for secondary IP addresses assigned to the VM's interfaces) | ||
|
||
### Non-Goals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
address from the range that VMs have already been assigned outside of the | ||
cluster (or for secondary IP addresses assigned to the VM's interfaces) | ||
|
||
### Non-Goals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
OpenShift Virt, MTV (or a separate, dedicated component) will provision / | ||
update this object with the required information (MAC to IPs association). | ||
This object is providing to the admin user a single place to check the IP | ||
address MAC to IPs mapping. On an first implementation phase, we can have the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
MTV -->> VM Owner: OK | ||
``` | ||
|
||
Hence, the required changes would be: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
- SuccessfulAllocation: reports the IP address was successfully allocated for | ||
the workload | ||
- AllocationConflict: reports the requested allocation was not successful - i.e. | ||
the requested IP address is already present in the network |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Signed-off-by: Miguel Duarte Barroso <[email protected]>
d695f23
to
fac8ab7
Compare
There should be no hypershift platform-specific considerations with this | ||
feature. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qinqon please keep me honest here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VMs + ipv6 + default gw is a big issue.
- As the owner of a VM running in a traditional virtualization platform, I want | ||
to import said VM into Kubernetes, attaching it to an overlay network. I want | ||
to have managed IPs (IPAM) for my VMs, since that feature was already available | ||
in my previous platform. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still very confusing to me, like this case we import the VM, but users are fine with ovn-kubernetes assigning IPs ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - users want to specify their IPs.
But OVN-K is fully aware of which IPs are in which pods, thus it can have network policies (with pod / namespace selectors) and have IP spoofing protection enabled.
I'll try to reword the user story.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll drop it, I think this user story is represented in the other two:
- As the owner of a VM running in a traditional virtualization platform, I want
to import said VM - whose IPs were statically configured - into Kubernetes,
attaching it to an overlay network. I want to ingress/egress using the same IP
address (the one assigned to the VM). - As the owner of a VM running in a traditional virtualization platform, I want
to import said VM into Kubernetes, attaching it to an overlay network. I want to
be able to consume Kubernetes features like network policies, and services, to
benefit from the Kubernetes experience.
be able to consume Kubernetes features like network policies, and services, to | ||
benefit from the Kubernetes experience. | ||
|
||
### Goals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh after migration virtual machine get restarted so it will take the new MTU for new system also the new DNS servers, nah all good here.
OVN-Kubernetes would read the CR, attempt to reserve the requested MAC and IP, | ||
and then persist that information in the IPAMClaim status - reporting a | ||
successful sync in the `IPAMClaim` status - or a failure otherwise. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh ok, implementation details is the place.
This object is providing to the admin user a single place to check the IP | ||
address MAC to IPs mapping. On an first implementation phase, we can have the | ||
admin provision these CRs manually. Later on, MTV (or any other cluster | ||
introspection tool) can provision these on behalf of the admin. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah all good, I was just pointing out stuff that was missing, but if it's going to be clarified I am all good.
|
||
The `ipam-extensions` mutating webhook will kick in whenever a virt launcher | ||
pod is created - it will identify when the VM has a primary UDN attachment | ||
(already happens today), and will also identify when the pod network attachment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use the word "attachment" since it's confusing with the multus idiom.
enhancements/network/routed-ingress-support-for-primary-udn-attached-vms-with-static-ips.md
Show resolved
Hide resolved
- ipam-extensions (CNV component) will now also read the `IPPool` CRs for VMs | ||
having primary UDNs in their namespaces, and requesting a specific MAC address | ||
in their specs. These CRs will be used to generate the multus default network | ||
annotation, which will be set in the pods by the mutating webhook. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we fail if the VMI has macAddress but there is no entry at the mac -> ip mapping ?, like maybe the admin forgot so VM keeps trying and then after admin fix the issue vm is corrected.
Also we have to think about ippool mutability, like maybe we should be able to add new entries but not to modify or remve them ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we fail if the VMI has macAddress but there is no entry at the mac -> ip mapping ?, like maybe the admin forgot so VM keeps trying and then after admin fix the issue vm is corrected.
For this scenario, I think we should just create the request for the MAC to be preserved. I.e. you'd get the original VM MAC, but an IP that was assigned by the primary UDN IP allocator.
This is getting into the behavior we expect, we should iron this out ASAP.
@tssurya / @kyrtapz thoughts ?
Also we have to think about ippool mutability, like maybe we should be able to add new entries but not to modify or remve them ?
For that we should consider what we want the pool for: e.g. do we want to sync the pool also w/ the dynamic allocations ? That would be good for observability - but it is IMHO a follow-up feature we may want to pursue.
That would require the pool to be subject to updates.
Having the pool immutable (only data provisioned during creation) would only allow you to fulfill the importing use case.
We should also consider that if we want to be able to create a VM attached to a primary UDN with a dedicated MAC / IP addresses we need the pool to be mutable.
We need to get our goals set up ASAP.
The IPPool CRD is a cluster-scoped object associated to a UDN via the logical | ||
network name (`NAD.Spec.Config.Name` attribute), since we want to have this | ||
feature upstream in the k8snetworkplumbingwg, rather than in OVN-Kubernetes. | ||
|
||
The `IPPool` spec will have an attribute via which the admin can point to a | ||
cluster UDN - by the logical network name. The admin (which is the only actor | ||
able to create the `IPPool`) has read access to all NADs in all namespaces, | ||
hence they can inspect the NAD object to extract the network name. We could | ||
even update the cluster UDN type to feature the generated network name in its | ||
status sub-resource, to simplify the UX of the admin user. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the fact that networkName ippool field is not indexed a problem ? at the end ipam extensions will have to iterate them instead of using "List", but I understand that they will be at the informer's cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the fact that networkName ippool field is not indexed a problem ? at the end ipam extensions will have to iterate them instead of using "List", but I understand that they will be at the informer's cache.
It is what it is ... we either use label selectors, we list then filter, or we associate using a different mean - i.e. point at the NAD or cluster UDN name (both are bad IMHO).
I'm going with what I think is the least worse option. Maybe relying on selectors is not that bad though ...
Waiting for more feedback.
Preserving the gateway will require changes to the OVN-Kubernetes API. The | ||
cluster UDN CRD should be updated - adding a gateway definition - and the | ||
OVN-Kubernetes | ||
[NetConf](https://github.com/ovn-kubernetes/ovn-kubernetes/blob/2643dabe165bcb2d4564866ee1476a891c316fe3/go-controller/pkg/cni/types/types.go#L10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can show an example with it ?
apiVersion: v1
kind: Pod
metadata:
name: pod-example
annotations:
v1.multus-cni.io/default-network: '{
"name": "isolated-net",
"namespace": "myisolatedns",
...
"default-route":[
"192.0.2.5",
"fd90:1234::5"
]
}'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's really not what I mean here - you're showing the network selection element for the multus default network.
I'm saying the NAD should have a place for the user to indicate which GW to use on all attachments to that network. I don't see how showing an NSE can be an example of that.
enhancements/network/routed-ingress-support-for-primary-udn-attached-vms-with-static-ips.md
Show resolved
Hide resolved
5c37dfe
to
12b1a2c
Compare
This commits fills out the following sections: - summary - motivation - user stories - goals - non-goals Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
This commit describes how an imported VM's gateway will be preserved, which is required for the network configuration on the guests to be identical on both the source/destination clusters. Signed-off-by: Miguel Duarte Barroso <[email protected]>
…tion Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Indicate the changes to the `IPAMClaim` CRD, as well as the definition of the proposed new `IPPool` CRD. Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
…ternative Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
…ay" section Signed-off-by: Miguel Duarte Barroso <[email protected]>
01ad846
to
c117013
Compare
@maiqueb: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
- Preserve the original VM's IP address | ||
- Specify the gateway IP address of the imported VM so it can keep the same | ||
default route | ||
- Allow excludeSubnets to be used with L2 UDNs to ensure OVNK does not use an IP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
- Preserve the original VM's MAC address | ||
- Preserve the original VM's IP address | ||
- Specify the gateway IP address of the imported VM so it can keep the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that the gateway IP we want to allow the user to specify is the one of the UDN, we do not want to allow each imported VM to specify it's own. Could you clarify that here?
|
||
### Non-Goals | ||
|
||
- Handle importing VMs without a managed IP experience - i.e. IPs were defined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a limitation of MTV?
- Importing a VM whose gateway is outside the subnet of the network. | ||
- Adding non default routes to the VM when importing it into OpenShift Virt. | ||
- Modifying the default gateway and management IPs of a primary UDN after it was created. | ||
- Modifying a pod's network configuration after the pod was created. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it is worth it to specify that we are not going to support "live" migrating VMs? i.e importing VMs without a reboot.
- Modifying the default gateway and management IPs of a primary UDN after it was created. | ||
- Modifying a pod's network configuration after the pod was created. | ||
|
||
**NOTE:** implementing support on UDNs (achieving the namespace isolation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand this note, is it the same as https://github.com/openshift/enhancements/pull/1793/files#diff-ef7ea92bb7f0198174529787f912ec7898b84fa65e3e1b102ce30ed76735ab43R90-R91?
This approach requires having N CRs with a 1:1 association between a primary | ||
UDN attachment and the MAC and IPs it had on the original platform. | ||
|
||
We could either introduce a new CRD with IPs and MAC association (which would |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The de-cetralized approach described here changes how CNV integrates with OVN-K.
There would no longer be a need to pass the IP/MAC requests through the v1.multus-cni.io/default-network
annotation.
|
||
#### Centralized IP management workflow | ||
|
||
The [centralized IP management](#centralized-ip-management) flow is described |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a step for UDN provisioning?
#### Centralized IP management | ||
|
||
A new CRD - named `IPPool`, or `DHCPLeaseConfig` (or the like) - will be | ||
created, and is associated to a cluster UDN. This CRD holds the association of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This CRD would be cluster scoped and limited to work with cluster UDNs only?
Are there any specific reasons limiting it to cluster UDNs?
Interface string `json:"interface"` | ||
+ // The IPs requested by the user | ||
+ // +optional | ||
+ IPRequests []CIDR `json:"ipRequests,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
De-centralized IP management chapter states that the IPAMClaimSpec would also include the MAC address, it's missing here.
|
||
type IPPoolSpec struct { | ||
NetworkName string `json:"network-name"` | ||
Entries map[net.HardwareAddr][]net.IPNet `json:"entries"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Indentation is broken in this code snippet
No description provided.