-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CORE-8617 BPF Without Disruption #2921
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@StevenTigera Ok, first pass is done!
The majority of the feedback here falls into one of a few categories:
- Golang coding conventions
- Better factoring the code for re-use, readability, and ongoing maintenance.
- Some minor bugs
Let me know if you have any questions at all on this and I'm happy to help work through the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more comments, but they are mostly small. This is looking much better!
I haven't looked at the test code in detail yet, that will be my next pass.
Expect(c.Create(ctx, fc)).NotTo(HaveOccurred()) | ||
|
||
// Act. | ||
_, err := utils.PatchFelixConfiguration(ctx, r.client, func(fc *crdv1.FelixConfiguration) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I think this isn't quite what I was imagining for this test. It's not wrong per-se, but it's doing more than it needs to do.
This test should probably be trimmed down to this:
fc := v3.FelixConfiguration{}
ds := DaemonSet{}
shouldUpdate := r.setDefaultsOnFelixConfiguration(cr, ds, fc, reqLogger)
Expect(shouldUpdate).To(Equal(true))
Expect(fc.Spec.BPFEnabled).To(Equal(true))
Basically, the tests in this file should be as focused on testing the specific functions in bpf.go
.
UTs for utils.PatchFelixConfiguration
should exist, but I won't ask you to add them in this PR. Really we want a separate suite of tests for that function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once it's turned into that, it becomes much more clear that this test is actually just testing setDefaultsOnFelixConfiguration
, which should have its own Context() for a complete suite of tests (but it doesn't right now)
}) | ||
|
||
func createInstallation(c client.Client, ctx context.Context, dp operator.LinuxDataplaneOption) *operator.Installation { | ||
ca, err := tls.MakeCA("test") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this CA actually needed? This code doesn't have anything to do with the CertificateManagement feature.
Expect(fc.Spec.BPFEnabled).To(BeNil()) | ||
}) | ||
|
||
It("should query calico-node DS in BPF dataplane and if DS status rolling out then verify rollout not complete", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is also largely testing PatchFelixConfiguration
- if we just want to test setBPFUpdatesOnFelixConfiguration
here, we don't need to create the installation, we don't need to create the DS, we don't need to create the FelixConfig (in fact the fake client isn't needed at all).
Like I implied in the other comment, testing for PatchFelixConfiguration
belongs in the utils
package. The tests here should primarily be around the functions in this package.
Expect(fc.Annotations[render.BPFOperatorAnnotation]).To(Equal("false")) | ||
}) | ||
|
||
It("should query calico-node DS in Iptables dataplane and steer Felix Config when bpfEnabled false", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what "steer" means in this context as opposed to the other tests.
Expect(fc.Annotations[render.BPFOperatorAnnotation]).To(Equal("true")) | ||
}) | ||
|
||
It("should query FelixConfig annotation is nil and spec is nil then outcome is valid", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend splitting the different functions under test into separate contexts or describes. "outcome is valid" isn't very clear unless the higher level context explains which function is being test.
Context("updateBPFEnabledALlowed")
Context("setDefaultsOnFelixConfiguration") // This one probably belongs in core_controller_test.go
Context("setBPFUpdatesOnFelixConfiguration")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compare that to how it's laid out currently, where the Describe() and the Context() are essentially duplicate versions of "test BPF upgrade without disruption"
var _ = Describe("Testing BPF Upgrade without disruption during core-controller installation", func() {
...
Context("Reconcile tests BPF Upgrade without disruption", func() {
if (fc.Spec.BPFEnabled == nil || !(*fc.Spec.BPFEnabled)) && isRolloutComplete(ds) { | ||
err := setBPFEnabled(fc, bpfEnabledOnInstall) | ||
if err != nil { | ||
reqLogger.Error(err, "Unable to enable eBPF data plane") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I think we need to propagate this error back up to utils.PatchFelixConfiguration
. It means a slight tweak to the patchFn
definition
@@ -347,6 +350,33 @@ func installResourceCRD(c client.Client, mgr manager.Manager, ctx context.Contex | |||
err := c.Create(context.Background(), instance) | |||
Expect(err).NotTo(HaveOccurred()) | |||
|
|||
ns := &corev1.Namespace{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand this change. This function is meant to be a helper for installing the Installation
CR - why does it also need to create the DaemonSet now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to revert this - it's a blanket change for all of the existing tests that AFAICT isn't really appropriate.
If you want to run an FV for this case, we need to have a separate test case targeted at it.
Description
Currently, transitioning from iptables to ebpf mode with the operator is disruptive. This is because the operator enables BPF mode using a rolling update of the calico-node daemonset (to add the relevant environment variable). However, clusters with some nodes in iptbales mode and some in BPF mode have broken networking. Therefore, when enabling eBPF mode, all should nodes transition at the same time so that there is minimal iptables/eBPF mode crossover.
Note: [as outlined in epic]
Once roll out is finished, cluster quickly transitions to eBPF mode. Some small disruption may occur as nodes transition within a few seconds of each other.
For PR author
make gen-files
make gen-versions
For PR reviewers
A note for code reviewers - all pull requests must have the following:
kind/bug
if this is a bugfix.kind/enhancement
if this is a a new feature.enterprise
if this PR applies to Calico Enterprise only.