SR Linux asymmetric routing in evpn vxlan #173
This post dives deeper into the asymmetric routing model on SR Linux.
The topology in use is a 3-stage Clos fabric with BGP EVPN and VXLAN, with
Keeping this out for now, as I feel it bloats the post and raises more questions in the reader's mind.
When routing between VNIs in a VXLAN fabric, there are two major routing models that can be used: asymmetric and symmetric. Asymmetric routing, which is the focus of this post, uses a `bridge-route-bridge` model, implying that the ingress leaf bridges the packet into the Layer 2 domain, routes it from one VLAN/VNI to another, and then bridges the packet across the VXLAN fabric to the destination.
Such a design naturally implies that both the source and the destination IRBs (and the corresponding Layer 2 domains and bridge tables) must exist on all leafs hosting servers that need to communicate with each other. While this increases the operational state on the leafs themselves (ARP state and MAC address state are stored everywhere), it does offer configuration and operational simplicity.
I admit I did a quick read, but I failed to see the explanation why this mode is simpler config and oper-wise?
Would be good to mention that this mode has scale considerations as you duplicate state of each BD, which, in larger fabrics, may play a critical role.
already added how mac and arp state is compounded in such designs.
# Asymmetric routing with SR Linux in EVPN VXLAN fabrics
This post dives deeper into the asymmetric routing model on SR Linux.
comment from Jorge
In the intro, "bridge-route-bridge model, implying that the ingress leaf bridges the packet into the Layer 2 domain, routes it from one VLAN/VNI to another and then bridges the packet across the VXLAN fabric to the destination." --> asymmetric really implies a different number and type of lookups on ingress leaf when compared to egress leaf. The former doing mac-lookup, ip-lookup, mac-lookup and the latter (egress leaf) just doing mac-lookup. I think you mean the same, but it would be good to elaborate.
Also, please refer to RFC 9135, which is the RFC that introduces the concepts of asymmetric and symmetric.
added clarification.
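The lookup asymmetry described in the comment above can be sketched in a few lines of Python. This is only an illustration of the lookup sequence, not SR Linux behavior or code: the table layouts, the gateway MAC, the bridge domain names, and every address are hypothetical.

```python
import ipaddress

# All names and addresses below are hypothetical, chosen only for illustration.
GW_MAC = "00:00:5e:00:01:01"  # assumed gateway MAC on the IRB


def ingress_leaf(dst_mac, dst_ip, routes, arp, mac_tables):
    """Bridge-route-bridge on the ingress leaf: mac-lookup, ip-lookup, mac-lookup."""
    if dst_mac != GW_MAC:
        raise ValueError("frame is not addressed to the gateway, so it is bridged, not routed")
    lookups = ["mac-lookup"]  # source bridge domain: the frame hits the IRB MAC

    addr = ipaddress.ip_address(dst_ip)
    dst_bd = next(bd for net, bd in routes.items() if addr in ipaddress.ip_network(net))
    lookups.append("ip-lookup")  # route towards the destination subnet's IRB

    host_mac = arp[dst_ip]  # ARP resolution on the destination IRB
    vtep = mac_tables[dst_bd][host_mac]
    lookups.append("mac-lookup")  # destination bridge domain: find the remote VTEP
    return lookups, host_mac, vtep


def egress_leaf(dst_mac, mac_table):
    """The egress leaf only bridges: a single mac-lookup."""
    return ["mac-lookup"], mac_table[dst_mac]


routes = {"172.16.10.0/24": "bd-v10", "172.16.20.0/24": "bd-v20"}
arp = {"172.16.20.3": "aa:c1:ab:00:00:03"}
mac_tables = {"bd-v20": {"aa:c1:ab:00:00:03": "192.0.2.13"}}

ingress_steps, host_mac, vtep = ingress_leaf(GW_MAC, "172.16.20.3", routes, arp, mac_tables)
egress_steps, port = egress_leaf(host_mac, {"aa:c1:ab:00:00:03": "ethernet-1/1"})
```

The ingress leaf needs the ARP entry and the MAC entry of the destination host in its own tables, which is why both bridge domains must exist on every leaf in this model.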
Check out a separate post on [CLI Ranges and Wildcards](../2023/cli-ranges.md).
///
Remember, by default, there is no global routing instance/table in SR Linux. A `network-instance` of type `default` must be configured, and these interfaces, including the `system0` interface, need to be added to this network instance for point-to-point connectivity.
from Jorge
about this: "Remember, by default, there is no global routing instance/table in SR Linux" -> this is confusing since the default network instance is really equivalent to the global routing table.
Agreed, but there is no default network instance defined as of 24.7.1. Only the mgmt network-instance exists by default, which is why I made a note of ensuring the user creates this default network-instance.
I'd change it to "A network-instance named default must be configured...", since the default type will be implicitly set for a netinst named default.
Similar to how ranges can be used to pull configuration state from multiple interfaces as an example, in this case a wildcard `*` is used to select multiple routing-policies. The wildcard `spine-*` matches both policies named `spine-import` and `spine-export`.
///
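The matching behavior of the `spine-*` wildcard can be approximated with shell-style globbing in Python. This is a rough analogy using the standard `fnmatch` module, not the SR Linux CLI implementation, and the `leaf-export` policy name is a hypothetical addition to show a non-match.

```python
from fnmatch import fnmatchcase

# "leaf-export" is a hypothetical policy name, added only to show a non-match.
policies = ["spine-import", "spine-export", "leaf-export"]

# Keep only the policy names that match the wildcard pattern.
matched = [name for name in policies if fnmatchcase(name, "spine-*")]
print(matched)
```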
### Host connectivity and ESI LAG
from Jorge
"ESI LAG" -> we don't use this in the documentation. I'd call it "LAG Ethernet Segment"
added LAG Ethernet Segment but kept ESI LAG as well for familiarity with operators used to other vendors.
There is a lot going on here, so let's break down some of the configuration options:
`anycast-gw [true|false]`
from Jorge
..derived from the VRRP MAC address group range, as specified by RFC9135.
ack, added.
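For readers unfamiliar with the VRRP MAC address range mentioned above, here is a small sketch of how an IPv4 VRRP virtual MAC is formed: the fixed prefix `00:00:5E:00:01` followed by the one-byte VRID. The helper name `vrrp_ipv4_mac` is hypothetical, and which VRID a given platform uses when deriving its anycast gateway MAC is an assumption not stated in this thread.

```python
def vrrp_ipv4_mac(vrid: int) -> str:
    """Build the IPv4 VRRP virtual MAC for a VRID: 00:00:5E:00:01:{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    return f"00:00:5E:00:01:{vrid:02X}"


# VRID 1 is used here purely as an example value.
example_mac = vrrp_ipv4_mac(1)
print(example_mac)
```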
: This enables the node to learn the IP-to-MAC binding from any ARP packet and not just ARP requests.
`arp host-route populate [dynamic|static|evpn]`
from Jorge
This command is really not adding anything in asymmetric mode:
"arp host-route populate [dynamic|static|evpn]
This enables the node to insert a host route (/32 for IPv4 and /128 for IPv6) in the routing table from dynamic, static or EVPN-learnt ARP entries."
The reason is that the routing is always done at ingress based on the destination subnet, and then based on the ARP resolution. Having /32s serves no purpose. In symmetric mode it is important to avoid tromboning if multiple leaves are attached to the same subnet, but I don't see the use in asymmetric mode.
agreed, I will remove this.
* The corresponding IRB subinterface is bound to the MAC VRF using the `interface` configuration option.
* The VXLAN tunnel subinterface is bound to the MAC VRF using the `vxlan-interface` configuration option.
* BGP EVPN learning is enabled for the MAC VRF using the `protocols bgp-evpn` hierarchy and the MAC VRF is bound to an EVI (EVPN virtual instance).
* The `ecmp` configuration option determines how many VTEPs can be considered for load-balancing by the local VTEP (more on this in the validation section).
from Jorge
please mention this ecmp refers to the overlay ecmp-set for multihoming aliasing (so that people do not confuse with the underlay ecmp setting)
modified.
* Route distinguishers and route targets are configured for the MAC VRF using the `protocols bgp-vpn` hierarchy.
from Jorge
--> would be nice to let the system autoderive RDs (RTs are manually configured if the ASN is different on each leaf)
not making this change for now but adding a comment that RDs can be auto-derived as well.