Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release-v0.6.x] fix: properly garbage collecting orphaned network interfaces (#642) #668

Merged
merged 3 commits into from
Jan 31, 2025

Conversation

Bryce-Soghigian
Copy link
Collaborator

@Bryce-Soghigian Bryce-Soghigian commented Jan 30, 2025

Description

This pr cherry-picks in the network interface garbage collection controller. #642

How was this change tested?

  • scale to 500 nodes and see all nics are removed eventually

Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: #
  • No

Release Note


* feat: adding ListNics to instanceprovider interface alongside a refactor of arg related functions to their own file

* feat: adding garbage collection logic for network interfaces and refactoring gc functions slightly

* feat: working poc of ARG Queries and nic garbage collection, need to fix tests

* fix: tests failing due to ListVM

* feat: split nic and vm into their own gc controllers, added shared state between them to prevent conflicts in nic deletion calls

* feat: Add DeleteNic option to instance provider

* docs(code-comment): adding clarification to unremovableNics

* test: instanceprovider.ListNics barebones test and arg fake list nic impl

* fix: bootstrap.sh

* feat: adding in VM List into the networkinterface.garbagecollection controller to avoid attempts to delete nics that are managed by a vm

* refactor: unremovableNics out of the vm gc controller in favor for a cleaner state list

* fix:updating references to cache

* Update pkg/controllers/nodeclaim/garbagecollection/nic_garbagecollection.go

* test: adding composable network interface options to our test utils based on existing karp-core pattern

* test: adding test we don't include unmanaged nics

* test: adding network garbage collection suite and happy path

* test: adding tests for unremovable nics

* test: adding coverage that vm controller cleans up nics

* refactor: renaming controller

* fix: refactor name

* refactor: using import directly

* ci: checking error for controller

* fix: ci

* fix: addressing comments

* Update pkg/controllers/nodeclaim/garbagecollection/instance_garbagecollection.go

* refactor: removing name constant

* refactor: moving to test utils

* fix: removing GetZoneID

* style: improving the readability of the network interface garbage collection tests

* revert: removing lo.FromPtr checks for nodeclaim creation to avoid creating nodeclaims with empty required properties

* refactor: VirtualmachineProperties needs a default for time created

* fix: modifying the tests to be aware of time created to properly simulate vm gc

* refactor: added nodepoolName to test.VirtualMachine and test.Interface

* refactor: moving vm gc tests to use test.VirtualMachine

* refactor: renaming arg files to azureresourcegraph

* refactor: using deleteIfNicExists directly in cleanupAzureResources

* test: createNicFromQueryResponseData missing id, missing name and happy case

* refactor: using fake.Region for the default region for test.VirtualMachine and test.NetworkInterface

* fix: using input.InterfaceName rather than the outer scope interface name

* test: modifying fake to only initialize the query once
@Bryce-Soghigian
Copy link
Collaborator Author

TODO: Migrate from awslabs operatorpkg back to controller runtime to resolve go.mod versioning conflicts, just checking in my progress for today

@Bryce-Soghigian Bryce-Soghigian marked this pull request as ready for review January 31, 2025 07:52
@Bryce-Soghigian
Copy link
Collaborator Author

image

@Bryce-Soghigian Bryce-Soghigian changed the title [release v0.6.x] fix: properly garbage collecting orphaned network interfaces (#642) [release-v0.6.x] fix: properly garbage collecting orphaned network interfaces (#642) Jan 31, 2025
@tallaxes tallaxes added area/networking Issues or PRs related to networking area/error-handling Issues or PRs related to handling of errors labels Jan 31, 2025
@tallaxes tallaxes merged commit db330a7 into release-v0.6.x Jan 31, 2025
12 checks passed
@tallaxes tallaxes deleted the bsoghigian/nic-gc-cherry-pick branch January 31, 2025 21:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/error-handling Issues or PRs related to handling of errors area/networking Issues or PRs related to networking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants