Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mismatched local address when peering with external router on difference network #1371

Closed
pmint93 opened this issue Oct 11, 2022 · 18 comments
Closed
Labels

Comments

@pmint93
Copy link

pmint93 commented Oct 11, 2022

What happened?

kube-router v1.5 did not peer with external router on difference network (Mismatched local address)
My cluster setup has worker nodes join 2 difference networks (see figure bellow) and kube-router was configured to peer with external router https://github.com/cloudnativelabs/kube-router/blob/master/docs/bgp.md#peering-outside-the-cluster

What did you expect to happen?

Peer state is Established

How can we reproduce the behavior you experienced?

Steps to reproduce the behavior:

  1. Step 1: Setup k8s cluster with workers join 2 difference networks as diagram bellow
  2. Step 2: Install kube-router with params --peer-router-ips --peer-router-asns to peer with external router
  3. Step 3: Check BGP neighbors on kube-router gobgp neighbor to see that the peer state did not Established

Screenshots / Architecture Diagrams / Network Topologies

                      ----|-----------------|--------------- network A (10.21.136.0/24)
                      workerA             workerB                 
--------|-----------------|-----------------|--------------- network B (10.99.0.0/24)
    External Peer

System Information (please complete the following information):

  • Kube-Router Version: v1.5.0
  • Kube-Router Parameters: --peer-router-ips=<EXTERNAL_PEER_IP>
  • Kubernetes Version: v1.22.10
  • Cloud Type: on premise
  • Kubernetes Deployment Type: custom
  • Kube-Router Deployment Type: DaemonSet
  • Cluster Size: 2

Logs, other output, metrics

time="2022-10-10T09:45:09Z" level=info msg="Mismatched local address" Addr=10.99.0.6 BindInterface= Configured addr=10.21.136.204 Key=10.99.0.2 Topic=Peer
Peer         AS Up/Down State       |#Received  Accepted
10.99.0.2 64513   never Active      |        0         0
10.99.0.3 64513   never Active      |        0         0

Additional context

This PR seem to be the reason #777

@pmint93 pmint93 added the bug label Oct 11, 2022
@aauren
Copy link
Collaborator

aauren commented Oct 20, 2022

@pmint93 - Thanks for doing some leg work on tracking down the issue. I agree with you, that #777 is likely is stopping you from peering on all interfaces.

One of the questions that I had while looking at your scenario, is how valuable it would be to actually be able to create this peering? I ask this because there may be places in the code that makes assumptions about the external peers being able to reach the network of the primary node IP like this one: https://github.com/cloudnativelabs/kube-router/blob/master/pkg/controllers/routing/ecmp_vip.go#L32 So, even if you are able to peer from a different address, if your node isn't capable of routing to the network of the primary IP you may still run into routing issues.

Have you tried reverting #777 and building locally to ensure that kube-router will meet your expectations with the local-address restriction not applied?

Going back to the expectation in this issue, while I could maybe see that the current setting is perhaps a bit restrictive, I'm not a huge fan of going back to allowing BGP to bind to all addresses on the node, that seems a bit over-permissive. Do you have a recommendation on what you think would be a fair compromise in order to fix your use-case (preferably something that didn't introduce a new global flag to kube-router)?

@tamihiro
Copy link
Contributor

tamihiro commented Oct 20, 2022

@pmint93 How about running a dynamic routing daemon and advertising a loopback address from both interfaces on each worker? You can use their loopback address as nodeIP, so workerA and workerB peer with each other from their loopback address, and they peer with the external router from their loopback address.

@pmint93
Copy link
Author

pmint93 commented Oct 24, 2022

@aauren
What I'm trying to archive is a bit of complicated, but in short i'm trying to separate k8s control plane from it's nodes network and those control planes are external peers i've mention earlier.

the code that makes assumptions about the external peers being able to reach the network of the primary node IP like this one: https://github.com/cloudnativelabs/kube-router/blob/master/pkg/controllers/routing/ecmp_vip.go#L32

I'm using --override-nexthop to overcome issue with POD CIDR, and not using service VIP. So, v1.4.0 works well for me

Have you tried reverting #777 and building locally to ensure that kube-router will meet your expectations with the local-address restriction not applied?

Not yet, but i can confirm v1.4.0 is working as expected

Do you have a recommendation on what you think would be a fair compromise in order to fix your use-case

I already use kube-router.io/bgp-local-addresses=0.0.0.0 annotation to tell BGP to listen on all interfaces. I know peer local-address we discussing now is difference but when it come to peer on multi-homed environment, both setting should be 0.0.0.0. So, perhaps adding a case if bgp-local-addresses=0.0.0.0 then peer local-address should be the same ?

@pmint93
Copy link
Author

pmint93 commented Oct 24, 2022

@tamihiro Perhaps it will work but i think that just a workaround to abstract away actual network topology

@pmint93
Copy link
Author

pmint93 commented Oct 24, 2022

After reading code changes in PR #777, the exactly lines of code that run me into this issue are:

// Create and set Global Peer Router complete configs
nrc.globalPeerRouters, err = newGlobalPeers(peerIPs, peerPorts, peerASNs, peerPasswords, nrc.bgpHoldtime,
nrc.nodeIP.String())
if err != nil {

That assuming external peers (globalPeerRouters) can reach nrc.nodeIP.String() which is not always correct. I agree that internal peers should explicit set local-address to nodeIP, but for external peers i suggest:

a. Detect the correct interfaces which can route to --peer-router-ips (set to Transport.BindInterface) or local-address (set to Transport.LocalAddress)

OR

b. Introduce new flags --peer-router-ifs which tell bgp to use correct interface for a list of external peer ips in --peer-router-ips

OR

c. Revert those changes except for the internal peers. I think issue #778 only related to internal peers

@tamihiro
Copy link
Contributor

@pmint93

What I'm trying to archive is a bit of complicated, but in short i'm trying to separate k8s control plane from it's nodes network and those control planes are external peers i've mention earlier.

Since I'm yet to understand exactly what you mean by "trying to separate k8s control plane" so I don't have further suggestion, but

Perhaps it will work but i think that just a workaround to abstract away actual network topology

peering from a loopback address is not really an uncommon practice even when it's external. It could keep network design and configurations relatively simple in many cases besides offering L3 redundancy.

@rkojedzinszky
Copy link
Contributor

rkojedzinszky commented Nov 2, 2022

@aauren
Copy link
Collaborator

aauren commented Nov 5, 2022

Thanks for putting together a PR @rkojedzinszky

@pmint93 / @rkojedzinszky - as it concerns the change though, from a user perspective, it seems a bit confusing to me that we would have both kube-router.io/peer.localips and kube-router.io/bgp-local-addresses there seems to be a lot of overlap in the two.

For instance, what would be the purpose of allowing GoBGP to listen on an address with kube-router.io/bgp-local-addresses, but not able to peer with it if it didn't match the same address in kube-router.io/peer.localips? My thought here is that the same option kube-router.io/bgp-local-addresses should likely be used for both configuring which IPs to listen on as well as which IPs to peer from.

Any thoughts here?

@aauren
Copy link
Collaborator

aauren commented Nov 5, 2022

@pmint93 Additionally, I still really like your option a above:

a. Detect the correct interfaces which can route to --peer-router-ips (set to Transport.BindInterface) or local-address (set to Transport.LocalAddress)

This seems like a pretty natural way to infer what the bind address should be. Although, it should probably also be constrained by kube-router.io/bgp-local-addresses or nodeIP. So essentially:

  1. We always listen on either the NodeIP or bgp-local-addresses when starting GoBGP
  2. Attempt to lookup available interfaces whose subnets contain the peer's address
  3. Then cross-check it against either bgp-local-addresses or the primary node IP to see if we are configured to peer from that address
  4. If we are not configured to peer from that address or we don't have an interface that contains the peer IP within its subnet, then we emit warnings
  5. Otherwise we configure the address as the Transport.LocalAddress

Thoughts?

@rkojedzinszky
Copy link
Contributor

Thanks for putting together a PR @rkojedzinszky

@pmint93 / @rkojedzinszky - as it concerns the change though, from a user perspective, it seems a bit confusing to me that we would have both kube-router.io/peer.localips and kube-router.io/bgp-local-addresses there seems to be a lot of overlap in the two.

For instance, what would be the purpose of allowing GoBGP to listen on an address with kube-router.io/bgp-local-addresses, but not able to peer with it if it didn't match the same address in kube-router.io/peer.localips? My thought here is that the same option kube-router.io/bgp-local-addresses should likely be used for both configuring which IPs to listen on as well as which IPs to peer from.

Any thoughts here?

As I understand, the two configuration options serves different purposes. Of course, they may overlap, but there also might be cases when one will list 2 addresses in kube-router.io/bgp-local-addresses, and having more external peers will require that many peer.localips entries.

As I understand originally #777 intended to fix internal bgp connections, howewer, it also did set localaddress for external networks. So, perhaps we could relax this a bit, not enforcing nodeIP as localaddress for external peers.

@rkojedzinszky
Copy link
Contributor

@pmint93 Additionally, I still really like your option a above:

a. Detect the correct interfaces which can route to --peer-router-ips (set to Transport.BindInterface) or local-address (set to Transport.LocalAddress)

This seems like a pretty natural way to infer what the bind address should be. Although, it should probably also be constrained by kube-router.io/bgp-local-addresses or nodeIP. So essentially:

  1. We always listen on either the NodeIP or bgp-local-addresses when starting GoBGP
  2. Attempt to lookup available interfaces whose subnets contain the peer's address
    That might be a complex task, as we might have multihop peers too.
  3. Then cross-check it against either bgp-local-addresses or the primary node IP to see if we are configured to peer from that address
  4. If we are not configured to peer from that address or we don't have an interface that contains the peer IP within its subnet, then we emit warnings
  5. Otherwise we configure the address as the Transport.LocalAddress

Thoughts?

I would prefer simpler solutions where less logic is involved.
Even gobgp could have been started with binding to 0.0.0.0, and on specific cases (e.g. internal connections) specify localaddress implicitly to nodeIP (which would fix #778), and just let the user choose localaddress for their peers if they have a reason for it.

@aauren
Copy link
Collaborator

aauren commented Nov 15, 2022

@rkojedzinszky - Sorry it has taken me a while to come back to this.

I see your point and think that the PR that you've suggested is probably a good enough compromise and allows us to stay true to the configuration syntax in GoBGP. I personally don't like starting another list based peering annotation, but I can see that it would give some users the flexibility they desire and it doesn't really introduce all that much overhead for the project to maintain one more annotation.

I left one comment on your PR, once that is cleared up, we should be able to merge it.

@pmint93
Copy link
Author

pmint93 commented Nov 15, 2022

@aauren Sorry for late reply, i missed your last mention.

IMO,:

  1. The two configuration kube-router.io/bgp-local-addresses and kube-router.io/peer.localips may overlapping each other but it serve difference purposes. So using only kube-router.io/bgp-local-addresses is not so good choice, but acceptable if we want simplicity
  2. My option a above actually complicated, especially when dynamic routing, NAT-ing, ... etc
  3. Explicit set local-ip for external peers in start peering connection to neighbors from node's advertise-ip #777 has no meaning for gobgp's neighbor config needs local-address  #778 (which only related to internal peers), and cause another issue (this issue). So why bother keep it ?
  4. I prefer new flag over annotations, because we must annotate the node before running kube-router. When adding new node to the cluster, there is a short time period to do so

@aauren
Copy link
Collaborator

aauren commented Nov 15, 2022

This should be fixed with the merging of #1392

This should be test-able using the kube-router-git container from Docker Hub.

@pmint93 in regards to annotations not being ideal, I think that this will get fixed other ways in the future. There are already two open issues / comments for this issue:

I intend to look into both of them once I'm done finishing up the dual-stack work that the project has been working towards.

In the meantime, not upgrading to v1.5.0 or ensuring that the annotation is present before kube-router starts, or doing a rollout restart should fix the issue.

Closing this issue for now just as a matter of house-cleaning, but @rkojedzinszky / @pmint93 feel free to comment if you discover further issues after testing the fix and I'll re-open.

@aauren aauren closed this as completed Nov 15, 2022
@aauren
Copy link
Collaborator

aauren commented Nov 20, 2022

This was released as part of v1.5.3: https://github.com/cloudnativelabs/kube-router/releases/tag/v1.5.3

@pmint93
Copy link
Author

pmint93 commented Nov 29, 2022

Seem v1.5.3 did not working. I keep seeing error message Mismatched local address. After digging into the code, it seem:

This line of code (which was introduced in #1392) did not get called

nrc.globalPeerRouters, err = newGlobalPeers(peerIPs, peerPorts, peerASNs, peerPasswords, peerLocalIPs,
nrc.bgpHoldtime, nrc.nodeIP.String())

due to precondition

if len(nrc.globalPeerRouters) == 0 {

because nrc.globalPeerRouters already assigned in

nrc.globalPeerRouters, err = newGlobalPeers(kubeRouterConfig.PeerRouters, peerPorts,
peerASNs, peerPasswords, nil, nrc.bgpHoldtime, nrc.nodeIP.String())

and with localips set to nil

@rkojedzinszky Does v1.5.3 working for you ?

@rkojedzinszky
Copy link
Contributor

@pmint93 This is intended to work when only annotations are used to configure external peers. Does your setup use external connections configured from command line?

Right now we are not using v1.5.3 anywhere. I need some time to test it.

@pmint93
Copy link
Author

pmint93 commented Nov 29, 2022

@rkojedzinszky Yes, I did config kube-router with --peer-router-ips and make sure annotation kube-router.io/peer.localips appeared before running it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants