Centralized leaked ENI cleanup - refactor periodic cleanup & add node termination cleaner #505
base: eni-cleanup
Conversation
func (e *ClusterENICleaner) Start(ctx context.Context) error {
	e.Log.Info("starting eni clean up routine")

	// Start routine to listen for shut down signal, on receiving the signal it set shutdown to true
I don't think we need these comments that just describe the code.
Makes sense, this was just copied over from the existing eni_cleanup file.
VpcCniAvailableClusterENICnt = prometheus.NewGauge(
	prometheus.GaugeOpts{
		Name: "vpc_cni_created_available_eni_count",
		Help: "The number of available ENIs created by VPC-CNI that will tried to be deleted by the controller",
Can we rename this to vpc_cni_created_pending_deletion_eni_count? As written, it looks like just the number of ENIs created by VPC-CNI.
available here indicates the state of the network interface created by the corresponding component. "pending delete" is not accurate, as the ENI may not necessarily be deleted if it is re-attached to an instance.
VpcRcAvailableClusterENICnt = prometheus.NewGauge(
	prometheus.GaugeOpts{
		Name: "vpc_rc_created_available_eni_count",
		Help: "The number of available ENIs created by VPC-RC that will tried to be deleted by the controller",
same here
}()

// Perform ENI cleanup after fixed time intervals till shut down variable is set to true on receiving the shutdown
// signal
for !e.shutdown {
This can still result in a race condition: the loop check can happen first, and then shutdown gets set to true. It would be better to use the for-select pattern here, as sketched below.
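Roughly the shape being suggested, assuming a ticker-driven loop; cleanupInterval and DeleteLeakedResources are placeholder names, not necessarily what the PR uses:

func (e *ClusterENICleaner) Start(ctx context.Context) error {
	e.Log.Info("starting eni clean up routine")
	ticker := time.NewTicker(e.cleanupInterval) // cleanupInterval: assumed field
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			// context cancellation replaces the shared shutdown flag, so there is
			// no read/write race between this loop and the signal handler
			return nil
		case <-ticker.C:
			e.DeleteLeakedResources() // assumed name for the per-run cleanup
		}
	}
}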
})
if err != nil {
	if !strings.Contains(err.Error(), ec2Errors.NotFoundInterfaceID) { // ignore InvalidNetworkInterfaceID.NotFound error
		// append err and continue, we will retry deletion in the next period/reconcile
Hopefully we can move to comparing error types once we are on aws-sdk-go-v2.
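For reference, a sketch of what typed error matching could look like with aws-sdk-go-v2 and smithy-go (not part of this PR; the helper name is made up):

import (
	"errors"

	"github.com/aws/smithy-go"
)

// isENINotFound reports whether err is EC2's InvalidNetworkInterfaceID.NotFound
// error, matched by typed error code instead of substring search.
func isENINotFound(err error) bool {
	var apiErr smithy.APIError
	return errors.As(err, &apiErr) && apiErr.ErrorCode() == "InvalidNetworkInterfaceID.NotFound"
}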
if !strings.Contains(err.Error(), ec2Errors.NotFoundInterfaceID) { // ignore InvalidNetworkInterfaceID.NotFound error
	// append err and continue, we will retry deletion in the next period/reconcile
	leakedENICount += 1
	errors = append(errors, fmt.Errorf("failed to delete leaked network interface %v:%v", *nwInterface.NetworkInterfaceId, err))
This will count ENIs that are being retried too.
Can you expand on this? In each periodic cleanup run, this is the number of leaked ENIs that failed to be deleted.
If these ENIs are still in a leaked state, then yes, deletion will be retried on the next cleanup, and if deletion fails again, they will be counted as leaked ENIs, which is the intended behavior. Do you have any concerns about this?
OK, I see we set the metric instead of adding to it after every run. We should be good then; it will only count the leaked ENIs for a particular run.
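i.e., something along these lines at the end of each run (a sketch; the counter variable names are assumed):

// Set (not Add) the gauges so they reflect only the ENIs found leaked in this run.
VpcCniAvailableClusterENICnt.Set(float64(vpcCniLeakedCount))
VpcRcAvailableClusterENICnt.Set(float64(vpcRcLeakedCount))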
@@ -238,12 +238,14 @@ func (b *branchENIProvider) ProcessAsyncJob(job interface{}) (ctrl.Result, error

 // DeleteNode deletes all the cached branch ENIs associated with the trunk and removes the trunk from the cache.
 func (b *branchENIProvider) DeleteNode(nodeName string) (ctrl.Result, error) {
-	trunkENI, isPresent := b.getTrunkFromCache(nodeName)
+	_, isPresent := b.getTrunkFromCache(nodeName)
 	if !isPresent {
 		return ctrl.Result{}, fmt.Errorf("failed to find node %s", nodeName)
Will this error be requeued? If yes, I wonder if this is a recoverable error?
It will be re-queued since it returns an error, but jobs are retried a maximum of 5 times and then forgotten.

> I wonder if this is a recoverable error

That's a good point: if the trunk is removed from the cache, it will not be added back unless the node is initialized again. I will check whether this is a valid case. Btw, this is existing logic; the only change is that we omit calling DeleteAllBranchENIs in this workflow, since the finalizer routine will delete the ENIs on node termination.
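For context, the retry-then-forget behavior mentioned above follows the usual client-go workqueue pattern; a sketch only (the helper name and max-requeue handling are assumptions, the controller's worker package may differ):

import "k8s.io/client-go/util/workqueue"

// handleRequeue retries a failed job with rate limiting up to maxRequeue times,
// then forgets it so it is not retried again.
func handleRequeue(queue workqueue.RateLimitingInterface, job interface{}, maxRequeue int) {
	defer queue.Done(job)
	if queue.NumRequeues(job) < maxRequeue {
		queue.AddRateLimited(job)
		return
	}
	queue.Forget(job)
}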
@@ -822,3 +909,19 @@ func (e *ec2Wrapper) getRegionalStsEndpoint(partitionID, region string) (endpoin
 	}
 	return res, nil
 }
+
+func (e *ec2Wrapper) DisassociateTrunkInterface(input *ec2.DisassociateTrunkInterfaceInput) error {
Hmm, the naming is a bit confusing. It looks like it is about disassociating a branch interface from a trunk interface?
That's right, this is the API call to remove the association between branch and trunk network interfaces. Adding this per the EC2 team's recommendation to call DisassociateTrunkInterface before deleting the ENIs.
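A rough sketch of the intended ordering, assuming the association ID was captured when AssociateTrunkInterface attached the branch; the helper and parameter names are illustrative, not verbatim PR code:

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
	"github.com/aws/aws-sdk-go/service/ec2/ec2iface"
)

// deleteBranchENI disassociates the branch ENI from its trunk first, then deletes it.
// assocID is assumed to be the association ID returned by AssociateTrunkInterface.
func deleteBranchENI(client ec2iface.EC2API, assocID, branchENIID string) error {
	if _, err := client.DisassociateTrunkInterface(&ec2.DisassociateTrunkInterfaceInput{
		AssociationId: aws.String(assocID),
	}); err != nil {
		return fmt.Errorf("failed to disassociate branch from trunk: %w", err)
	}
	_, err := client.DeleteNetworkInterface(&ec2.DeleteNetworkInterfaceInput{
		NetworkInterfaceId: aws.String(branchENIID),
	})
	return err
}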
@@ -126,6 +124,8 @@ type trunkENI struct {
 	uidToBranchENIMap map[string][]*ENIDetails
 	// deleteQueue is the queue of ENIs that are being cooled down before being deleted
 	deleteQueue []*ENIDetails
+	// nodeName tag is the tag added to trunk and branch ENIs created on the node
+	nodeNameTag []*awsEC2.Tag
Can you remind me why we need node name tag for trunk ENI?
In the node termination cleaner (finalizer routine), we use the node name tag to reduce the scope of ENIs to delete per node.
Per offline discussion, this should be updated to use subnet IDs instead, and we can remove the nodeName tag.
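Roughly what scoping by subnet could look like (a sketch only; the filter-based approach and nodeSubnetIDs are assumptions, not code from this PR; awsEC2 is the aws-sdk-go v1 ec2 package alias used elsewhere in the file):

// Describe only the available ENIs in the node's subnet(s) instead of matching a
// per-node name tag.
input := &awsEC2.DescribeNetworkInterfacesInput{
	Filters: []*awsEC2.Filter{
		{
			Name:   aws.String("status"),
			Values: []*string{aws.String(awsEC2.NetworkInterfaceStatusAvailable)},
		},
		{
			Name:   aws.String("subnet-id"),
			Values: aws.StringSlice(nodeSubnetIDs),
		},
	},
}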
start := time.Now()
// Using the instance role
_, err := e.instanceServiceClient.DisassociateTrunkInterface(input)
ec2APICallLatencies.WithLabelValues("disassociate_branch_from_trunk").Observe(timeSinceMs(start))
We should be able to get SDK's metrics for free. Maybe a feature for v2?
	VPCID: vpcID,
}).SetupWithManager(ctx, mgr, healthzHandler); err != nil {
}
cleaner.ENICleaner = &eniCleaner.ENICleaner{
Is ClusterENICleaner for the cluster-wide 30-minute cleanup? I am trying to understand the pattern here.
That is right. ClusterENICleaner is used in the periodic cleanup routine. In the next PR, I will add the NodeTerminationCleaner, which is used in the finalizer routine.
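My mental model of the intended layout, as a schematic only (field names and the NodeTerminationCleaner shape are guesses, since that part lands in the follow-up PR):

// ENICleaner holds the shared logic for finding and deleting leaked ENIs.
type ENICleaner struct {
	ClusterName string
}

// ClusterENICleaner runs the shared logic periodically (every 30 minutes) across the cluster.
type ClusterENICleaner struct {
	*ENICleaner
}

// NodeTerminationCleaner (follow-up PR) reuses the same logic from the node
// finalizer, scoped to the ENIs of a single terminating node.
type NodeTerminationCleaner struct {
	*ENICleaner
	NodeName string
}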
Description of changes:
PR 2, adding support for centralized leaked ENI cleanup in the controller to delete leaked ENIs provisioned by the controller and VPC-CNI:
DisassociateTrunkInterface: call DisassociateTrunkInterface to remove the association between branch and trunk ENI before deleting the branch ENI, as per the EC2 team's recommendation.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.