Add scripts for migrating from other CNI to Antrea
Add script and antctl subcommand "migrate" to migrate from other
CNI (Calico, Flannel) to Antrea. It also supports conditional
NetworkPolicy conversion.

Signed-off-by: hjiajing <[email protected]>
hjiajing committed Nov 8, 2023
1 parent 82627e3 commit 3f8c169
Showing 14 changed files with 1,253 additions and 8 deletions.
61 changes: 61 additions & 0 deletions build/images/scripts/restart_sandbox
@@ -0,0 +1,61 @@
#!/usr/bin/env bash

# Copyright 2022 Antrea Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

set -e

SANDBOX_ID_ANNOTATION="io.kubernetes.cri.sandbox-id"
CONTAINERD_BINARIES_URL="https://github.com/containerd/containerd/releases/download/v1.7.7/containerd-1.7.7-linux-amd64.tar.gz"

# Remove rules of other CNI in CHAIN CNI-HOSTPORT-DNAT
iptables -t nat -F CNI-HOSTPORT-DNAT || true
chains=$(iptables -t nat -L | grep CNI-DN | grep -v "antrea" | awk '{print $2}')
for chain in $chains; do
    iptables -t nat -X "$chain" || true
done

wget $CONTAINERD_BINARIES_URL
tar xvf containerd-1.7.7-linux-amd64.tar.gz
mv ./bin/ctr /usr/local/bin/ctr

pause_container_ids=$(ctr -n k8s.io containers ls | grep -v "CONTAINER" | grep "registry.k8s.io/pause" | awk '{print $1}')
# If the container's linux.namespaces fields contains "network", it means that the container uses CNI network.
# In order to switch CNI, we need to kill the corresponding tasks to restart the Pods.
for container_id in $pause_container_ids; do
    container_info=$(ctr -n k8s.io containers info "$container_id" --spec)
    sandbox_id=$(echo "$container_info" | jq -r .annotations.\"$SANDBOX_ID_ANNOTATION\")
    namespaces=$(echo "$container_info" | jq .linux.namespaces | jq -c -r '.[]')
    for namespace in $namespaces; do
        if [[ "$(echo "$namespace" | jq .type)" == "\"network\"" ]]; then
            echo "Container $container_id uses CNI network, kill the corresponding task"
            ctr -n k8s.io tasks kill "$sandbox_id" || true
            ctr -n k8s.io container remove "$container_id" || true
        fi
    done
done

# After restart, some containers may be dangling without a pause container, remove them
tasks=$(ctr -n k8s.io tasks list -q)
container_ids=$(ctr -n k8s.io containers ls | grep -v "CONTAINER" | grep -v "registry.k8s.io/pause" | awk '{print $1}')
for container_id in $container_ids; do
    sandbox_id=$(ctr -n k8s.io containers info "$container_id" --spec | jq -r .annotations.\"$SANDBOX_ID_ANNOTATION\")
    if [[ ! "$tasks" =~ $sandbox_id ]]; then
        echo "Container $container_id has no corresponding task, remove it"
        ctr -n k8s.io tasks kill "$container_id" || true
    fi
done

rm /host/etc/cni/net.d/10-calico.conflist || true
rm /host/etc/cni/net.d/10-flannel.conflist || true
59 changes: 59 additions & 0 deletions build/yamls/antrea-migrator.yml
@@ -0,0 +1,59 @@
kind: DaemonSet
apiVersion: apps/v1
metadata:
  labels:
    app: antrea
    component: antrea-migrator
  name: antrea-migrator
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: antrea
      component: antrea-migrator
  template:
    metadata:
      labels:
        app: antrea
        component: antrea-migrator
    spec:
      hostPID: true
      hostNetwork: true
      nodeSelector:
        kubernetes.io/os: linux
      tolerations:
        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoSchedule
          operator: Exists
        - effect: NoExecute
          operator: Exists
      serviceAccountName: antrea-agent
      volumes:
        - name: containerd
          hostPath:
            path: /run/containerd
      initContainers:
        # The initContainer kills all sandbox containers on the Node.
        - name: antrea-migrator-init
          image: antrea/antrea-ubuntu:latest
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true
            capabilities:
              add:
                # SYS_MODULE is required to load the OVS kernel module.
                - SYS_MODULE
          command:
            - restart_sandbox
          volumeMounts:
            - mountPath: /run/containerd
              name: containerd
      containers:
        # If all antrea-migrator Pods are running, all initContainers have completed.
        - image: antrea/antrea-ubuntu:latest
          imagePullPolicy: IfNotPresent
          name: antrea-migrator
          command:
            - "sleep"
            - "infinity"
85 changes: 85 additions & 0 deletions docs/migrate-to-antrea.md
@@ -0,0 +1,85 @@
# Migrate from other CNI to Antrea

This document describes how to migrate from other CNIs to Antrea. Currently, migration from Calico and Flannel is
supported. For Calico, NetworkPolicy conversion is conditionally supported. For Flannel, only Pod networking migration
is supported.

With the help of the migration [script](../hack/migrate-to-antrea.sh) and the `antctl` subcommand `migrate`, the
migration process is fully automated. The migration process is divided into four steps:

1. Install Antrea in the cluster.
2. If necessary, convert NetworkPolicy from other CNI to Antrea.
3. Restart all Pods in the cluster in-place.
4. Uninstall the old CNI.

## Install Antrea

The script first checks the requirements for CNI migration. When migrating from Calico, it lists all Calico
NetworkPolicies to check whether any of them uses an unsupported feature. When migrating from Flannel, it checks
whether Antrea is installed in the cluster. If Antrea is already installed, the script exits with an error message.
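
The "already installed" check can be pictured as a name lookup over the DaemonSets in `kube-system`. The sketch below
is an assumption about how such a check could look, not the code in the migration script: `antrea_already_installed`
is a hypothetical helper, it assumes the Antrea agent runs as a DaemonSet named `antrea-agent`, and the input here is
sample data rather than live `kubectl` output.

```bash
#!/usr/bin/env bash

# Hypothetical helper: reads DaemonSet names, one per line, on stdin and
# succeeds if a DaemonSet named exactly antrea-agent is among them.
antrea_already_installed() {
    grep -qx 'antrea-agent'
}

# Sample names (assumed data, not taken from a real cluster).
names='kube-proxy
example-ds'

if printf '%s\n' "$names" | antrea_already_installed; then
    echo "Antrea is already installed, aborting" >&2
else
    echo "Antrea not found, continuing"
fi
```

Matching the whole line (`-x`) avoids treating a name that merely contains `antrea-agent` as a hit.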

## Convert NetworkPolicy

If the old CNI is Calico, the script converts Calico NetworkPolicies to Antrea NetworkPolicies. The conversion is based
on the Calico NetworkPolicy CRD, so the script first checks whether the Calico API server is running in the cluster. If
it is not running, the script exits with an error message. Otherwise, the script converts the policies using
`antctl migrate convert-networkpolicy`. If the conversion fails, the script exits with an error message.
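
One way to picture the API server check is to look for the Calico aggregated API in the output of
`kubectl get apiservice`. This is a minimal sketch under assumptions: `calico_apiserver_available` is a hypothetical
helper, it assumes the Calico API server is registered as the APIService `v3.projectcalico.org`, and the check
performed by the actual script may differ. The input is sample text, not output captured from a cluster.

```bash
#!/usr/bin/env bash

# Hypothetical helper: reads `kubectl get apiservice` output on stdin and
# succeeds only if v3.projectcalico.org is listed with AVAILABLE set to True.
calico_apiserver_available() {
    awk '$1 == "v3.projectcalico.org" && $3 == "True" { found = 1 }
         END { exit (found ? 0 : 1) }'
}

# Sample listing (assumed data) with the columns NAME, SERVICE, AVAILABLE, AGE.
sample='NAME SERVICE AVAILABLE AGE
v1.apps Local True 30d
v3.projectcalico.org calico-apiserver/calico-api True 5d'

if printf '%s\n' "$sample" | calico_apiserver_available; then
    echo "Calico API server is available"
else
    echo "Calico API server is not available" >&2
fi
```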

Note: not all Calico NetworkPolicy features are supported by Antrea NetworkPolicy. The following table lists the
unsupported features:

| Calico NetworkPolicy Feature                            | Antrea NetworkPolicy Support |
|---------------------------------------------------------|------------------------------|
| "or" expression in any selector                         | No                           |
| "()" expression in any selector                         | No                           |
| "starts with" expression in any selector                | No                           |
| "ends with" expression in any selector                  | No                           |
| `spec.PreDNAT`                                          | No                           |
| `spec.ApplyOnForward`                                   | No                           |
| `spec.DoNotTrack`                                       | No                           |
| `spec.Ingress.ICMP` or `spec.Egress.ICMP`               | No                           |
| `spec.Ingress.NotICMP` or `spec.Egress.NotICMP`         | No                           |
| `spec.Ingress.NotProtocol` or `spec.Egress.NotProtocol` | No                           |
| `spec.Ingress.NotPorts` or `spec.Egress.NotPorts`       | No                           |
| `spec.Ingress.Metadata` or `spec.Egress.Metadata`       | No                           |
| `spec.Ingress.NotSelector` or `spec.Egress.NotSelector` | No                           |
| `spec.Ingress.HTTP` or `spec.Egress.HTTP`               | No                           |
| `spec.Ingress.Nets` or `spec.Egress.Nets`               | No                           |
| `spec.Ingress.NotNets` or `spec.Egress.NotNets`         | No                           |
| All other features                                      | Yes                          |
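
The selector rows of the table can be pre-checked with a few string tests before running the conversion. The sketch
below is a rough heuristic and not the logic used by `antctl migrate convert-networkpolicy`: the function name
`selector_is_convertible` is hypothetical, it assumes `||` is the "or" operator and that `has(...)`, `all()` and
`global()` are the only built-in calls that use parentheses, and it misreports selectors whose quoted values contain
`||`, parentheses, or the words "starts with" or "ends with".

```bash
#!/usr/bin/env bash

# Hypothetical helper: returns 0 if a Calico selector avoids the selector
# expressions listed as unsupported in the table, 1 otherwise.
selector_is_convertible() {
    local selector="$1"
    local stripped
    # Drop the built-in calls has(...), all() and global() so that only
    # grouping parentheses remain in the string.
    stripped=$(printf '%s' "$selector" | sed -E 's/(has|all|global)\([^()]*\)//g')
    case "$stripped" in
        *'||'*|*'('*|*')'*|*'starts with'*|*'ends with'*) return 1 ;;
    esac
    return 0
}

selector_is_convertible 'app == "web" && has(tier)' && echo "convertible"
selector_is_convertible 'app == "web" || app == "db"' || echo "needs manual rewrite"
```

A policy flagged by this check would have to be rewritten by hand (for example, split into several policies) before
migration.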

## Restart Pods

After Antrea is installed in the cluster, the script restarts all Pods in the cluster in-place by deploying a
DaemonSet named `antrea-migrator`, which runs a Pod on each Node. That Pod kills the containerd task of every Pod on
the Node, and the containerd tasks are then restarted by the containerd service. In this way, all Pods in the cluster
are restarted in-place and do not need to be rescheduled and recreated.

The restart result is as follows:

```bash
$ kubectl get pod -n migrate-test -o wide
NAME                               READY   STATUS    RESTARTS      AGE    IP          NODE          NOMINATED NODE   READINESS GATES
migrate-exmaple-6d6b97f96b-29qbq   1/1     Running   1 (24s ago)   2m5s   10.10.1.3   test-worker   <none>           <none>
migrate-exmaple-6d6b97f96b-dqx2g   1/1     Running   1 (23s ago)   2m5s   10.10.1.6   test-worker   <none>           <none>
migrate-exmaple-6d6b97f96b-jpflg   1/1     Running   1 (23s ago)   2m5s   10.10.1.5   test-worker   <none>           <none>
```
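
A script can check a listing like the one above by comparing the two numbers in the `READY` column and the `STATUS`
column of each row. This is a minimal sketch, not part of the migration script: `all_pods_ready` is a hypothetical
helper that parses `kubectl get pod` text output from standard input and expects the header row to be present.

```bash
#!/usr/bin/env bash

# Hypothetical helper: reads `kubectl get pod` output on stdin and succeeds
# only if every Pod is Running with all of its containers ready.
all_pods_ready() {
    awk 'NR > 1 {
             split($2, ready, "/")
             if (ready[1] != ready[2] || $3 != "Running") bad = 1
         }
         END { exit (bad ? 1 : 0) }'
}

# Sample input in the same format as the listing above.
sample='NAME READY STATUS RESTARTS AGE
migrate-exmaple-6d6b97f96b-29qbq 1/1 Running 1 (24s ago) 2m5s
migrate-exmaple-6d6b97f96b-dqx2g 1/1 Running 1 (23s ago) 2m5s'

if printf '%s\n' "$sample" | all_pods_ready; then
    echo "all Pods are ready"
fi
```

Against a live cluster the same function could be fed with `kubectl get pod -n migrate-test | all_pods_ready`. An
empty listing passes the check, so a caller may also want to confirm that the expected Pods exist.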

## Uninstall old CNI

After all Pods are restarted, the script uninstalls the old CNI using `kubectl delete -f <old-cni-yaml>`. If the old
CNI is Calico, the script also deletes the Calico iptables rules on each Node with the following command:

```bash
kubectl exec -n kube-system {ANTREA_AGENT} --kubeconfig "$KUBECONFIG" -- /bin/bash -c 'iptables-save | grep -v cali | iptables-restore'
```
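
To see what the filter in the command above keeps and drops, it can be run against a saved ruleset instead of the live
tables. The sketch below uses a small hand-written sample in `iptables-save` format; the chain names are illustrative
and were not captured from a real Node. Note that `grep -v cali` drops every line containing the substring `cali`, so
an unrelated rule or comment that happens to contain it would be removed as well.

```bash
#!/usr/bin/env bash

# Same filter as in the command above, applied to stdin instead of the
# live ruleset.
strip_cali_rules() {
    grep -v cali
}

# Illustrative sample; the chain names are assumptions for this example.
sample='*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:cali-INPUT - [0:0]
:ANTREA-FORWARD - [0:0]
-A INPUT -j cali-INPUT
-A FORWARD -j ANTREA-FORWARD
COMMIT'

printf '%s\n' "$sample" | strip_cali_rules
```

Reviewing the filtered output this way before piping it into `iptables-restore` makes it easier to confirm that only
Calico chains and rules disappear.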
5 changes: 3 additions & 2 deletions go.mod
@@ -42,6 +42,7 @@ require (
 	github.com/onsi/ginkgo/v2 v2.13.0
 	github.com/onsi/gomega v1.29.0
 	github.com/pkg/sftp v1.13.6
+	github.com/projectcalico/api v0.0.0-20230602153125-fb7148692637
 	github.com/prometheus/client_golang v1.17.0
 	github.com/prometheus/common v0.45.0
 	github.com/sirupsen/logrus v1.9.3
@@ -74,7 +75,7 @@ require (
 	k8s.io/component-base v0.26.4
 	k8s.io/klog/v2 v2.100.1
 	k8s.io/kube-aggregator v0.26.4
-	k8s.io/kube-openapi v0.0.0-20221012153701-172d655c2280
+	k8s.io/kube-openapi v0.0.0-20230303024457-afdc3dddf62d
 	k8s.io/kubectl v0.26.4
 	k8s.io/kubelet v0.26.4
 	k8s.io/utils v0.0.0-20230209194617-a36077c30491
@@ -140,7 +141,7 @@ require (
 	github.com/google/cel-go v0.12.6 // indirect
 	github.com/google/gnostic v0.5.7-v3refs // indirect
 	github.com/google/go-cmp v0.6.0 // indirect
-	github.com/google/gofuzz v1.1.0 // indirect
+	github.com/google/gofuzz v1.2.0 // indirect
 	github.com/google/pprof v0.0.0-20210720184732-4bb14d4b1be1 // indirect
 	github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 // indirect
 	github.com/gregjones/httpcache v0.0.0-20180305231024-9cad4c3443a7 // indirect
9 changes: 6 additions & 3 deletions go.sum
@@ -605,8 +605,9 @@ github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
 github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
 github.com/google/go-containerregistry v0.5.1/go.mod h1:Ct15B4yir3PLOP5jsy0GNeYVaIZs/MK/Jz5any1wFW0=
 github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
-github.com/google/gofuzz v1.1.0 h1:Hsa8mG0dQ46ij8Sl2AYJDUv1oA9/d6Vk+3LG99Oe02g=
 github.com/google/gofuzz v1.1.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
+github.com/google/gofuzz v1.2.0 h1:xRy4A+RhZaiKjJ1bPfwQ8sedCA+YS2YcCHW6ec7JMi0=
+github.com/google/gofuzz v1.2.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
 github.com/google/martian v2.1.0+incompatible/go.mod h1:9I4somxYTbIHy5NJKHRl3wXiIaQGbYVAs8BPL6v8lEs=
 github.com/google/martian/v3 v3.0.0/go.mod h1:y5Zk1BBys9G+gd6Jrk0W3cC1+ELVxBWuIGO+w/tUAp0=
 github.com/google/martian/v3 v3.1.0/go.mod h1:y5Zk1BBys9G+gd6Jrk0W3cC1+ELVxBWuIGO+w/tUAp0=
@@ -960,6 +961,8 @@ github.com/pkg/sftp v1.13.6/go.mod h1:tz1ryNURKu77RL+GuCzmoJYxQczL3wLNNpPWagdg4Q
 github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
 github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
 github.com/pquerna/cachecontrol v0.0.0-20171018203845-0dec1b30a021/go.mod h1:prYjPmNq4d1NPVmpShWobRqXY3q7Vp+80DqgxxUrUIA=
+github.com/projectcalico/api v0.0.0-20230602153125-fb7148692637 h1:F48and+6vKJsRMl95Y/XKVik0Kwhos8YShTH9Fsdqlw=
+github.com/projectcalico/api v0.0.0-20230602153125-fb7148692637/go.mod h1:d3yVTVhVHDawgeKrru/ZZD8QLEtiKQciUaAwnua47Qg=
 github.com/prometheus/client_golang v0.0.0-20180209125602-c332b6f63c06/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw=
 github.com/prometheus/client_golang v0.9.1/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw=
 github.com/prometheus/client_golang v0.9.3/go.mod h1:/TN21ttK/J9q6uSwhBd54HahCDft0ttaMvbicHlPoso=
@@ -1841,8 +1844,8 @@ k8s.io/kube-openapi v0.0.0-20200121204235-bf4fb3bd569c/go.mod h1:GRQhZsXIAJ1xR0C
 k8s.io/kube-openapi v0.0.0-20200410145947-61e04a5be9a6/go.mod h1:GRQhZsXIAJ1xR0C9bd8UpWHZ5plfAS9fzPjJuQ6JL3E=
 k8s.io/kube-openapi v0.0.0-20200805222855-6aeccd4b50c6/go.mod h1:UuqjUnNftUyPE5H64/qeyjQoUZhGpeFDVdxjTeEVN2o=
 k8s.io/kube-openapi v0.0.0-20201113171705-d219536bb9fd/go.mod h1:WOJ3KddDSol4tAGcJo0Tvi+dK12EcqSLqcWsryKMpfM=
-k8s.io/kube-openapi v0.0.0-20221012153701-172d655c2280 h1:+70TFaan3hfJzs+7VK2o+OGxg8HsuBr/5f6tVAjDu6E=
-k8s.io/kube-openapi v0.0.0-20221012153701-172d655c2280/go.mod h1:+Axhij7bCpeqhklhUTe3xmOn6bWxolyZEeyaFpjGtl4=
+k8s.io/kube-openapi v0.0.0-20230303024457-afdc3dddf62d h1:VcFq5n7wCJB2FQMCIHfC+f+jNcGgNMar1uKd6rVlifU=
+k8s.io/kube-openapi v0.0.0-20230303024457-afdc3dddf62d/go.mod h1:y5VtZWM9sHHc2ZodIH/6SHzXj+TPU5USoA8lcIeKEKY=
 k8s.io/kubectl v0.26.4 h1:A0Oa0u/po4KxXnXsNCOwLojAe9cQR3TJNJabEIf7U1w=
 k8s.io/kubectl v0.26.4/go.mod h1:cWtp/+I4p+h5En3s2zO1zCry9v3/6h37EQ2tF3jNRnM=
 k8s.io/kubelet v0.26.4 h1:SEQPfjN4lu4uL9O8NdeN7Aum3liQ4kOnp/yC3jMRMUo=