SR Linux asymmetric routing in evpn vxlan #173

Merged
merged 11 commits into from
Oct 16, 2024


aninchat
Contributor

No description provided.

@hellt hellt changed the title aninchat added srlinux-asymmetric blog post SR Linux asymmetric routing in evpn vxlan Oct 14, 2024
Comment on lines 12 to 13
This post dives deeper into the asymmetric routing model on SR Linux.
The topology in use is a 3-stage Clos fabric with BGP EVPN and VXLAN, with
Member

I think it would be great to provide some intro here.
Maybe move it from the Reviewing the asymmetric routing model section.

e.g. why the two modes exist, what are the differences between them.

you can use this info
image

Contributor Author

keeping this out for now as I feel it bloats the post and raises more questions in the mind of the reader.


When routing between VNIs in a VXLAN fabric, there are two major routing models that can be used: asymmetric and symmetric. Asymmetric routing, which is the focus of this post, uses a `bridge-route-bridge` model, implying that the ingress leaf bridges the packet into the Layer 2 domain, routes it from one VLAN/VNI to another and then bridges the packet across the VXLAN fabric to the destination.

Such a design naturally implies that both the source and the destination IRBs (and the corresponding Layer 2 domains and bridge tables) must exist on all leafs hosting servers that need to communicate with each other. While this increases the operational state on the leafs themselves (ARP state and MAC address state is stored everywhere), it does offer configuration and operational simplicity.
Member

I admit I did a quick read, but I failed to see the explanation of why this mode is simpler config- and oper-wise.

Would be good to mention that this mode has scale considerations as you duplicate state of each BD, which, in larger fabrics, may play a critical role.

Contributor Author

already added how mac and arp state is compounded in such designs.


# Asymmetric routing with SR Linux in EVPN VXLAN fabrics

This post dives deeper into the asymmetric routing model on SR Linux.
Member

comment from Jorge

In the intro, "bridge-route-bridge model, implying that the ingress leaf bridges the packet into the Layer 2 domain, routes it from one VLAN/VNI to another and then bridges the packet across the VXLAN fabric to the destination." --> asymmetric really implies a different number and type of lookups on ingress leaf when compared to egress leaf. The former doing mac-lookup, ip-lookup, mac-lookup and the latter (egress leaf) just doing mac-lookup. I think you mean the same, but it would be good to elaborate.
also please refer to RFC 9135, which is the RFC that introduces the concepts of asymmetric and symmetric

Contributor Author

added clarification.
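
To make the lookup asymmetry raised in this thread concrete, here is a minimal Python sketch. It is only a toy model: the table layouts, the gateway MAC, the host addresses and the names `ingress_leaf` and `egress_leaf` are all hypothetical and are not taken from the post or from SR Linux. It shows the ingress leaf performing mac-lookup, ip-lookup, mac-lookup, while the egress leaf performs a single mac-lookup.

```python
from ipaddress import ip_address, ip_network

# All names, addresses and table contents below are hypothetical.
ANYCAST_GW_MAC = "00:00:5e:00:01:01"


def ingress_leaf(dst_mac, dst_ip, routes, arp, mac_tables):
    """Bridge-route-bridge on the ingress leaf; returns (lookups, vtep)."""
    lookups = []
    # 1. MAC lookup in the source MAC VRF: the frame is addressed to the IRB.
    if dst_mac != ANYCAST_GW_MAC:
        raise ValueError("frame is not addressed to the gateway")
    lookups.append("mac-lookup")
    # 2. IP lookup in the IP VRF: the destination subnet is a local IRB.
    dst_bd = next(bd for prefix, bd in routes.items()
                  if ip_address(dst_ip) in ip_network(prefix))
    lookups.append("ip-lookup")
    # 3. ARP gives the host MAC; a MAC lookup in the destination MAC VRF
    #    gives the remote VTEP that the frame is bridged to.
    host_mac = arp[dst_ip]
    vtep = mac_tables[dst_bd][host_mac]
    lookups.append("mac-lookup")
    return lookups, vtep


def egress_leaf(host_mac, mac_table):
    """The egress leaf only bridges: one MAC lookup to find the local port."""
    return ["mac-lookup"], mac_table[host_mac]


routes = {"10.0.10.0/24": "bd10", "10.0.20.0/24": "bd20"}
arp = {"10.0.20.5": "aa:c1:ab:00:00:05"}
ingress_macs = {"bd20": {"aa:c1:ab:00:00:05": "vtep-leaf3"}}
egress_macs = {"aa:c1:ab:00:00:05": "ethernet-1/1"}

print(ingress_leaf(ANYCAST_GW_MAC, "10.0.20.5", routes, arp, ingress_macs))
print(egress_leaf("aa:c1:ab:00:00:05", egress_macs))
```

The sketch also hints at why state is duplicated in this model: the ingress leaf needs the route, the ARP entry and the MAC entry for the destination bridge domain, even though the host lives behind another leaf.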

Check out a separate post on [CLI Ranges and Wildcards](../2023/cli-ranges.md).
///

Remember, by default, there is no global routing instance/table in SR Linux. A `network-instance` of type `default` must be configured and these interfaces, including the `system0` interface, need to be added to this network instance for point-to-point connectivity.
Member

from Jorge

about this: "Remember, by default, there is no global routing instance/table in SR Linux" -> this is confusing since the default network instance is really equivalent to the global routing table.

Contributor Author

Agreed, but there is no default network instance defined as of 24.7.1. Only the mgmt network-instance exists by default, which is why I made a note of ensuring the user creates this default network-instance.

Member

I'd change it to

A network-instance named default must be configured...

since the default type will be implicitly set for a netinst named default

Similar to how ranges can be used to pull configuration state from multiple interfaces as an example, in this case a wildcard `*` is used to select multiple routing-policies. The wildcard `spine-*` matches both policies named `spine-import` and `spine-export`.
///

### Host connectivity and ESI LAG
Member

from Jorge

"ESI LAG" -> we don't use this in the documentation. I'd call it "LAG Ethernet Segment"

Contributor Author

added LAG Ethernet Segment but kept ESI LAG as well for familiarity with operators used to other vendors.


There is a lot going on here, so let's break down some of the configuration options:

`anycast-gw [true|false]`
Member

from Jorge

..derived from the VRRP MAC address group range, as specified by RFC9135.

Contributor Author

ack, added.
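
For readers unfamiliar with the VRRP MAC address range mentioned above: the VRRP specification (RFC 5798) reserves `00-00-5E-00-01-{VRID}` for IPv4 virtual routers, where the last octet is the virtual router ID. The short Python sketch below derives such a MAC. The function name `vrrp_ipv4_mac` is hypothetical, and which VRID SR Linux uses for the anycast gateway by default is not stated in this thread, so the value `1` in the example is an assumption.

```python
def vrrp_ipv4_mac(vrid: int) -> str:
    """Return the IPv4 VRRP virtual MAC for a virtual router ID (1-255)."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    # The first five octets are fixed; the last octet carries the VRID.
    return f"00:00:5E:00:01:{vrid:02X}"


# VRID 1 is an assumed example value, not a confirmed SR Linux default.
print(vrrp_ipv4_mac(1))
```

Because every leaf derives the same MAC from the same VRID, a host keeps a valid gateway ARP entry regardless of which leaf it is attached to.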


: This enables the node to learn the IP-to-MAC binding from any ARP packet and not just ARP requests.

`arp host-route populate [dynamic|static|evpn]`
Member

from Jorge

This command is really not adding anything in asymmetric mode:
"arp host-route populate [dynamic|static|evpn]
This enables the node to insert a host route (/32 for IPv4 and /128 for IPv6) in the routing table from dynamic, static or EVPN-learnt ARP entries."

the reason being that the routing is always done at ingress based on the destination subnet, and then based on the arp resolution. Having /32s serve no purpose. In symmetric mode it is important to avoid tromboning if multiple leaves are attached to the same subnet, but I don't see the use in asymmetric mode.

Contributor Author

agreed, I will remove this.
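
The reasoning in this thread can be illustrated with a small longest-prefix-match sketch in Python. It is a toy model under stated assumptions: the prefixes, the leaf names and the helper `longest_prefix_match` are hypothetical and do not reflect SR Linux internals. In the asymmetric case the destination subnet is a local IRB on the ingress leaf, so a host route resolves to the same interface as the subnet route. In the symmetric case the subnet may be advertised by several leafs, and only the host route pins traffic to the leaf that actually hosts the destination.

```python
from ipaddress import ip_address, ip_network


def longest_prefix_match(dst, routes):
    """Return the next hops of the most specific route covering dst."""
    addr = ip_address(dst)
    matches = [(ip_network(prefix), next_hops)
               for prefix, next_hops in routes.items()
               if addr in ip_network(prefix)]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Asymmetric: the destination subnet is a local IRB on the ingress leaf.
asym = {"10.0.20.0/24": ["irb0.20"]}
asym_host = {**asym, "10.0.20.5/32": ["irb0.20"]}

# Symmetric: the subnet is reachable via two leafs, the host sits on leaf2.
sym = {"10.0.20.0/24": ["leaf2", "leaf3"]}
sym_host = {**sym, "10.0.20.5/32": ["leaf2"]}

print(longest_prefix_match("10.0.20.5", asym))
print(longest_prefix_match("10.0.20.5", asym_host))
print(longest_prefix_match("10.0.20.5", sym))
print(longest_prefix_match("10.0.20.5", sym_host))
```

In this model the host route changes the outcome only in the symmetric tables, which matches the point made above that /32 routes avoid tromboning in symmetric mode but add nothing in asymmetric mode.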

* The corresponding IRB subinterface is bound to the MAC VRF using the `interface` configuration option.
* The VXLAN tunnel subinterface is bound to the MAC VRF using the `vxlan-interface` configuration option.
* BGP EVPN learning is enabled for the MAC VRF using the `protocols bgp-evpn` hierarchy and the MAC VRF is bound to an EVI (EVPN virtual instance).
* The `ecmp` configuration option determines how many VTEPs can be considered for load-balancing by the local VTEP (more on this in the validation section).
Member

from Jorge

please mention this ecmp refers to the overlay ecmp-set for multihoming aliasing (so that people do not confuse with the underlay ecmp setting)

Contributor Author

modified.

* The VXLAN tunnel subinterface is bound to the MAC VRF using the `vxlan-interface` configuration option.
* BGP EVPN learning is enabled for the MAC VRF using the `protocols bgp-evpn` hierarchy and the MAC VRF is bound to an EVI (EVPN virtual instance).
* The `ecmp` configuration option determines how many VTEPs can be considered for load-balancing by the local VTEP (more on this in the validation section).
* Route distinguishers and route targets are configured for the MAC VRF using the `protocols bgp-vpn` hierarchy.
Member

from Jorge

--> would be nice to let the system autoderive RDs (RTs are manually configured if the ASN is different on each leaf)

Contributor Author

not making this change for now but adding a comment that RDs can be auto-derived as well.

@hellt hellt merged commit 0f829fa into srl-labs:main Oct 16, 2024
1 of 2 checks passed