Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit configuration interface #2815

Open
maxwolffe opened this issue Feb 28, 2023 · 7 comments
Open

Explicit configuration interface #2815

maxwolffe opened this issue Feb 28, 2023 · 7 comments

Comments

@maxwolffe
Copy link
Contributor

Moving some private discussion I've had with @alexeldeib into a public ticket to increase bus factor.

Is your feature request related to a problem?/Why is this needed
Our team uses a custom AKS image which has a few dependencies which are currently provided by AgentBaker's customnodedata:

We currently depend on the following files to be provided via nodecustomdata:
/etc/default/kubelet
/var/lib/kubelet/bootstrap-kubeconfig 
/etc/kubernetes/certs/ca.crt

Within those files we depend on:
/etc/default/kubelet:

  • node labels
  • kubelet command line parameters

/var/libe/kubelet/bootstrap-kubeconfg:

  • cluster-server:
name: localcluster
  cluster:
    certificate-authority: /etc/kubernetes/certs/ca.crt
    server: https://dev-azure-westus-xxx-cc782afe.hcp.westus.azmk8s
  • token
- name: kubelet-bootstrap
  user:
    token: "sbiizf.avcjfgfj5h3oni"
  • ca.crt - we depend on the whole file for the bootstrap kube-config.

We explicitly prevent the CustomScriptExtension from running by touching the /opt/azure/containers/provision.complete file which CSE checks prior to running. We don't want CSE to run because it does node level configuration which conflicts with our own.

Describe the solution you'd like in detail
Ideally, the interface in AgentBaker we come up with:

  • Allows us to test that these files are present prior to CSE running 
  • Allows us to confirm that CSE does not run if the following file is already set: /opt/azure/containers/provision.complete 

There are a few options which stand out to me:

  1. Write a test just confirming that those files are populated in nodedata
  2. Write a new file interface (bootstrap.cfg) which either:
    a. Has those files included in it
    b. Has the fields we need only included
  3. Some mechanism to fetch these dynamically from within the node, so we can completely remove the dependency on data provided by AgentBaker (though we’d shift the dependency to requiring that the fields are accessible somehow)

After discussions with @alexeldeib - there's a preference to not have the data provided via customdata.

Describe alternatives you've considered

Additional context
We've had a number of incidents due to us not having a clear contract around node configuration, so hoping to work with y'all to get one defined. Thanks in advance!

@juan-lee
Copy link
Collaborator

juan-lee commented Mar 7, 2023

@maxwolffe is preference to not have the data provided via customdata referring to your current file dependencies?

@yangl900
Copy link

yangl900 commented Mar 7, 2023

I think the preference is to have a bootstrap.yaml that placed by customdata, and contains:

  • ca.crt
  • bootstrap token
  • API server host name
  • Node labels

then the node can bootstrap itself via custom logic (and no CSE needed)

@maxwolffe
Copy link
Contributor Author

maxwolffe commented Mar 7, 2023

@maxwolffe is preference to not have the data provided via customdata referring to your current file dependencies?

Hey @juan-lee ! Apologies, the message is a little unclear. We don't have a strong preference for how the data is provided, just that it's provided in some way that does not depend on CSE running, is accessible to services on the node, and is testable via some Agentbaker E2E test or unit test.

@alexeldeib had suggested that providing the files via customnodedata was not preferred because it's more challenging to update if parameters need to change.

Some possibilities include:

  • bootstrap.yml copied onto the node via customnodedata or userdata and placed in a consistent location. (as Anders mentioned) with our required fields - this would be our preferred option I think, since it seems easy to unit test.
  • The entire files in which the data is currently present (/etc/default/kubelet, /var/libe/kubelet/bootstrap-kubeconfg, etc) placed in a consistent location with tests.
  • Some mechanism for us to fetch the data at node bootstrap time (via IMDS or a clear API).

Happy to hop on a teams call or google meet to discuss if you'd like more context.

@juan-lee
Copy link
Collaborator

juan-lee commented Mar 8, 2023

Thanks for the explanation @maxwolffe. I'll sync up with @alexeldeib.

@maxwolffe
Copy link
Contributor Author

Thanks for your help adding the above test and #3104 unit test (was just finally looking into how to do this myself when I saw UtheMan's fix). Appreciate your help to add this protection.

@alexeldeib
Copy link
Contributor

@maxwolffe I swear there was one more file I couldn't recall. I checked our messages but I only found those four. Do you know if I missed one?

also, expect some potential movement on these soon, which was part of the reason for the test ;)

@maxwolffe
Copy link
Contributor Author

maxwolffe commented May 1, 2023

I think we have one dumb dependency which we'll soon be able to remove - kubelet.service - opened a PR to add it to the interface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants