Cilium Cluster Scope IPAM melts DRBD storage fabric #705
Comments
I guess my first question here is this: does LINSTOR/DRBD support operating two separate instances in the same subnet? (In my case it seems like two separate clusters' pod subnets accidentally ended up able to talk to each other through some Cilium config quirk.)
Sure, we don't do anything fancy with the network. For LINSTOR/DRBD there are just the other cluster nodes, identified by their IP addresses. What you seem to be experiencing is that multiple Pods are assigned the same IP address. My guess is that a Pod in cluster A is assigned the same IP address as a Pod in cluster B. That would explain the "unexpected connection from..." messages, and it would then put DRBD into some weird states, I guess...
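One way to check that hypothesis is to dump the pod IPs of both clusters and look for duplicates. A minimal sketch, assuming kubeconfig contexts named `cluster-a` and `cluster-b` (hypothetical names):

```sh
# List every pod IP in both clusters and print any address that appears in both.
# The context names cluster-a / cluster-b are placeholders for your own kubeconfigs.
kubectl --context cluster-a get pods -A -o jsonpath='{range .items[*]}{.status.podIP}{"\n"}{end}' | sort -u > /tmp/ips-a
kubectl --context cluster-b get pods -A -o jsonpath='{range .items[*]}{.status.podIP}{"\n"}{end}' | sort -u > /tmp/ips-b
comm -12 /tmp/ips-a /tmp/ips-b   # any line printed here is an IP assigned in both clusters
```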
Update:
Note: these problems don't appear in a virtual-environment deployment of Cilium (in my case Proxmox/QEMU machines), only on bare metal. Anyone reading this and trying to deploy it in prod: you've been warned. Re: conflicting IPs: I don't think two nodes had the same IP, though... I think they were just assigned IPs from the wrong pool, but that's it. I can test again if I re-create another cluster on bare metal, but for now I have to move on.
Is there an existing issue for this?
Version
Cilium equal or higher than v1.16.0 and lower than v1.17.0
Piraeus Operator: v2.6.0
What happened?
Cross-posting cilium/cilium#34745 here, because maybe it is a DRBD config issue more than a Cilium issue.
Anyway, when I create a cluster with `ipam.mode` left at its default (not set), the interfaces on my nodes get assigned seemingly random IPs in 10.0.0.0/8. Another cluster exists on the same subnet (which is fine on its own), but when that other cluster also has DRBD nodes, DRBD somehow goes off the rails with a lot of errors and the nodes get tainted with quorum lost. In dmesg I see errors like "unexpected connection from...".
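For reference, this is roughly how the cluster-pool (cluster-scope) IPAM mode is selected when installing Cilium through Helm. A minimal sketch, assuming the current chart keys `ipam.mode` and `ipam.operator.clusterPoolIPv4PodCIDRList`; the CIDR below is only an example, not the values from this cluster:

```sh
# Install Cilium with cluster-scope (cluster-pool) IPAM and an explicit,
# non-overlapping pod CIDR instead of relying on the 10.0.0.0/8 default pool.
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set ipam.mode=cluster-pool \
  --set ipam.operator.clusterPoolIPv4PodCIDRList='{10.42.0.0/16}' \
  --set ipam.operator.clusterPoolIPv4MaskSize=24
```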
When I check the node pool in DRBD I see weird things: the nodes have picked up Cilium clusterpool IPs (?) instead of an IP from the services CIDR (?).
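One way to see which addresses LINSTOR/DRBD actually registered for the nodes is via the standard CLIs; a sketch, assuming the `linstor` client is available on the controller and `drbdadm` on the satellites ("my-node" is a placeholder):

```sh
# Show each LINSTOR node with the address it was registered under.
linstor node list

# Show the network interfaces LINSTOR knows for a given node.
linstor node interface list my-node

# On a node itself, show the DRBD view of peers and their connection state.
drbdadm status
```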
FYI my cluster is configured with
and the other cluster has completely different IPAM CIDRs. The two clusters can simply talk to each other (two switches apart, no firewall in between).
Logs on the storage controller of the melted nodes are an infinite loop of:
If I use `ipam.mode=kubernetes` instead, I see the correct values from the pod CIDR, and no quorum is lost.
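For comparison, switching an existing installation to Kubernetes host-scope IPAM looks roughly like this; a sketch, assuming the chart value `ipam.mode=kubernetes` and an existing release named `cilium`. The CiliumNode check is one way to confirm which pod CIDRs each node actually got:

```sh
# Switch the IPAM mode of an existing Cilium release to kubernetes host-scope.
helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set ipam.mode=kubernetes

# Verify which pod CIDR ranges each node is actually using.
kubectl get ciliumnodes -o custom-columns=NAME:.metadata.name,PODCIDRS:.spec.ipam.podCIDRs
```

Note that changing IPAM modes on a running cluster is usually disruptive (agents need to restart and pods may be renumbered), so this is not a casual toggle.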
I actually started noticing this when I turned on encryption strict mode in the prod cluster, but after digging a bit I think there is an issue with `ipam.mode=clusterpool` somehow. Apparently this severely breaks storage across ALL clusters in 10.0.0.0/8. You have been warned. Right now I am trying to figure out what is going on, so this bug report will be very vague; I am posting here to see if anyone has any idea what is up with this.
How can we reproduce the issue?
Deploy with `ipam.mode=clusterpool`, roughly as sketched below.
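A rough reproduction sketch based on the report above; the exact Helm values and the Piraeus install command are assumptions on my part, not verified repro steps:

```sh
# Two bare-metal clusters on the same routable network segment, both left on
# Cilium's default cluster-pool IPAM (pod IPs allocated out of 10.0.0.0/8).
helm install cilium cilium/cilium --namespace kube-system --set ipam.mode=cluster-pool

# Install the Piraeus Operator (v2.6.0 in this report) and create DRBD-backed
# storage in each cluster, then watch dmesg / LINSTOR for
# "unexpected connection from ..." and quorum-lost errors.
kubectl apply --server-side -k "https://github.com/piraeusdatastore/piraeus-operator//config/default?ref=v2.6.0"
```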
Kernel Version
6.6.43-talos
Kubernetes Version
1.30.3