Improve proto generation #5312

Merged
merged 12 commits into from
Nov 21, 2024
Binary file removed .github/images/scriptlessBootstrapFlow.png
Binary file not shown.
5 changes: 3 additions & 2 deletions .github/workflows/buf.yaml
@@ -4,12 +4,13 @@ on:
    paths:
      - "aks-node-controller/proto/**"
      - "aks-node-controller/buf.yaml"
      - "aks-node-controller/buf.gen.yaml"
      - ".github/workflows/buf.yaml"
  pull_request:
    types: [opened, synchronize, reopened, labeled, unlabeled]
    paths:
      - "aks-node-controller/proto/**"
      - "aks-node-controller/buf.yaml"
      - "aks-node-controller/buf.gen.yaml"
      - ".github/workflows/buf.yaml"
permissions:
  contents: read
  pull-requests: write
11 changes: 0 additions & 11 deletions Makefile
@@ -89,17 +89,6 @@ validate-image-version:
generate-kubelet-flags:
	@./e2e/kubelet/generate-kubelet-flags.sh

.PHONY: lint-proto-files
lint-proto-files:
	@(cd aks-node-controller && ../hack/tools/bin/buf lint)
	@(cd aks-node-controller && ../hack/tools/bin/buf breaking --against '../.git#branch=dev') # TODO: change to master

.PHONY: compile-proto-files
compile-proto-files:
	@(cd aks-node-controller && ../hack/tools/bin/buf format -w)
	@(cd aks-node-controller && ../hack/tools/bin/buf generate)
	$(MAKE) lint-proto-files

.PHONY: generate-manifest
generate-manifest:
	./hack/tools/bin/cue export ./schemas/manifest.cue > ./parts/linux/cloud-init/artifacts/manifest.json
16 changes: 16 additions & 0 deletions aks-node-controller/Makefile
@@ -0,0 +1,16 @@
# Run buf in docker, mounting the full repo into the container
# Emulate running "buf" in the current directory
BUF = docker run --volume "$(CURDIR)/../:$(CURDIR)/../" --workdir $(CURDIR) bufbuild/buf:1.47.2

.PHONY: proto-generate
Collaborator

Do we need to manually run `make proto-generate` before pushing the commit?

Contributor Author

Yes, otherwise CI may complain about it. But I didn't test it.

proto-generate:
	@($(BUF) format -w)
	rm -rf pkg/gen/aksnodeconfig/v1
	docker build --platform $(shell uname -m) -t protoc-docker - < protoc.Dockerfile
	docker run --rm -v $(shell pwd):/$(shell pwd) --workdir=$(shell pwd) protoc-docker protoc --go_opt=module=github.com/Azure/agentbaker/aks-node-controller --go_out=./ --proto_path=proto $(shell find proto/aksnodeconfig/v1 -name '*.proto')
	$(MAKE) proto-lint

.PHONY: proto-lint
proto-lint:
	@($(BUF) lint)
	@($(BUF) breaking --against '../.git#branch=dev,subdir=aks-node-controller') # TODO: change to master
Collaborator

lint: Not sure if there is a good way to point to the default branch so that we don't need to switch between master and dev every year.

Contributor Author

I couldn't find anything in the buf docs, so I don't think it's supported. CI behaves differently, though: it checks against the target branch for PRs.

90 changes: 57 additions & 33 deletions aks-node-controller/README.md
@@ -1,55 +1,58 @@
# AKS Node Controller

This directory contains files related to AKS Node Controller go binary.

## Overview

AKS Node Controller is a go binary that is responsible for bootstrapping AKS nodes. The controller expects a predefined contract from the client of type [`AKSNodeConfig`](https://github.com/Azure/AgentBaker/tree/dev/pkg/proto/aksnodeconfig/v1) containing the bootstrap configuration. The controller has two primary functions: 1. Parse the bootstrap config and kickstart bootstrapping and 2. Monitor the completion status.
AKS Node Controller is a go binary that is responsible for bootstrapping AKS nodes. The controller expects a predefined contract from the client of type [`aksnodeconfigv1.Configuration`](pkg/gen/aksnodeconfig/v1).

AKS Node Controller relies on two Azure mechanisms to inject the necessary data at provisioning time for bootstrapping: [`Custom Script Extension (CSE)`](https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/custom-script-linux) and [`Custom Data`](https://learn.microsoft.com/en-us/azure/virtual-machines/custom-data}). The bootstrapper should use `GetNodeBootstrapping` which returns the corresponding `CustomData` and `CSE` based on the given `AKSNodeConfig`. For guidance on populating the config, please refer to this [doc](https://github.com/Azure/AgentBaker/tree/dev/pkg/proto/aksnodeconfig/v1).
AKS Node Controller relies on two Azure mechanisms for injecting the necessary bootstrap data during provisioning: [`Custom Script Extension (CSE)`](https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/custom-script-linux) and [`Custom Data`](https://learn.microsoft.com/en-us/azure/virtual-machines/custom-data). The bootstrapper should use `GetNodeBootstrapping`, which returns the corresponding `CustomData` and `CSE` based on the given `AKSNodeConfig`. For guidance on populating the config, please refer to this [doc](https://github.com/Azure/AgentBaker/tree/dev/pkg/proto/aksnodeconfig/v1).

## Usage

Here is an example on how to retrieve node bootstrapping params and pass in the returned `CSE` and `CustomData` to CRP API for creating a VMSS instance.
Here is an example of how to retrieve node bootstrapping parameters and use the returned `CSE` and `CustomData` for creating a Virtual Machine Scale Set (VMSS) instance via the CRP API.

```go
builder := aksnodeconfigv1.NewAKSNodeConfigBuilder()
builder.ApplyConfiguration(aksNodeConfig)
nodeBootstrapping, err = builder.GetNodeBootstrapping()
config := &aksnodeconfigv1.Configuration{
Version: "v0",
// fill in the rest of the fields
}
customData, err := nodeconfigutils.CustomData(config)
if err != nil {
return err
}

cse := nodeconfigutils.CSE

model := armcompute.VirtualMachineScaleSet{
Properties: &armcompute.VirtualMachineScaleSetProperties{
VirtualMachineProfile: &armcompute.VirtualMachineScaleSetVMProfile{
OSProfile: &armcompute.VirtualMachineScaleSetOSProfile{
CustomData: &nodeBootstrapping.CustomData,
...
}
},
VirtualMachineProfile: &armcompute.VirtualMachineScaleSetVMProfile{
Extensions: []*armcompute.VirtualMachineScaleSetExtension{
{
Name: to.Ptr("vmssCSE"),
Properties: &armcompute.VirtualMachineScaleSetExtensionProperties{
Publisher: to.Ptr("Microsoft.Azure.Extensions"),
Type: to.Ptr("CustomScript"),
TypeHandlerVersion: to.Ptr("2.0"),
AutoUpgradeMinorVersion: to.Ptr(true),
Settings: map[string]interface{}{},
ProtectedSettings: map[string]interface{}{
"commandToExecute": nodeBootstrapping.CSE,
CustomData: &customData,
},
ExtensionProfile: &armcompute.VirtualMachineScaleSetExtensionProfile{
Extensions: []*armcompute.VirtualMachineScaleSetExtension{
{
Name: to.Ptr("vmssCSE"),
Properties: &armcompute.VirtualMachineScaleSetExtensionProperties{
Publisher: to.Ptr("Microsoft.Azure.Extensions"),
Type: to.Ptr("CustomScript"),
TypeHandlerVersion: to.Ptr("2.0"),
AutoUpgradeMinorVersion: to.Ptr(true),
Settings: map[string]interface{}{},
ProtectedSettings: map[string]interface{}{
"commandToExecute": cse,
},
},
},
},
}
},
},
...
}
},
}
```

### Extracting Provision Status

The provision status can be extracted from the CSE response. CSE takes the stdout from the bootstrap scripts which contains information in the form [`datamodel.CSEStatus`](https://github.com/Azure/AgentBaker/blob/dev/pkg/agent/datamodel/types.go#L2189). You can find an example of how to parse the output [here](https://github.com/Azure/AgentBaker/blob/dev/e2e/scenario_helpers_test.go#L163).
The provision status can be extracted from the CSE response. CSE takes the stdout from the bootstrap scripts which contains information in the form [`datamodel.CSEStatus`](https://github.com/Azure/AgentBaker/blob/dev/pkg/agent/datamodel/types.go#L2189).

Here is an example response returned by CSE:
```
@@ -81,9 +84,9 @@ Here is an example response return by CSE:

Here is an in-depth explanation of the provisioning flow. Upon first startup, CustomData is made available to the VM, after which cloud-init processes the content, in this case writing the bootstrap config to disk. The binary is triggered by a systemd unit, [`aks-node-controller.service`](https://github.com/Azure/AgentBaker/blob/dev/parts/linux/cloud-init/artifacts/aks-node-controller.service), which runs automatically once cloud-init is complete. This ensures the bootstrapping config is present on the node before the Go binary runs to start the bootstrapping process.

The content of `CustomData` and `CSE` that is returned by `GetNodeBootstrapping` is as follows:
Clients need to provide CSE and Custom Data. The [nodeconfigutils](pkg/nodeconfigutils) module contains helpers for generating these values.

1. Custom Data: Contains base64 encoded bootstrap configuration of type [`AKSNodeConfig`](https://github.com/Azure/AgentBaker/tree/dev/pkg/proto/aksnodeconfig/v1) in json format which is placed on the node through cloud-init write directive.
1. Custom Data: Contains the base64-encoded bootstrap configuration of type [aksnodeconfigv1.Configuration](pkg/gen/aksnodeconfig/v1) in JSON format, which is placed on the node through a cloud-init `write_files` directive.

Format:
```yaml
@@ -96,19 +99,40 @@ write_files:
{{ encodedAKSNodeConfig }}`
```

2. CSE: Script used to poll bootstrap status and return exit status once complete.
2. CSE: Script used to poll bootstrap status and return exit status once complete.

CSE script: `/opt/azure/containers/aks-node-controller provision-wait`


#### Provisioning flow diagram:

![provisionFlowDiagram](../.github/images/scriptlessBootstrapFlow.png)
```mermaid
sequenceDiagram
Collaborator

Great, we can 'write' a diagram!

participant Client as Client
participant ARM as Azure Resource Manager (ARM)
participant VM as Virtual Machine (VM)

Client->>ARM: Request to create VM<br/>with CustomData & CSE
ARM->>VM: Deploy config.json<br/>(CustomData)
note over VM: cloud-init handles<br/>config.json deployment

note over VM: cloud-init completes processing
note over VM: Start aks-node-controller.service (systemd service)<br/> after cloud-init
VM->>VM: Run aks-node-controller<br/>(Go binary) in provision mode<br/>using config.json

ARM->>VM: Initiate aks-node-controller (Go binary)<br/>in provision-wait mode via CSE

loop Monitor provisioning status
VM->>VM: Check /opt/azure/containers/provision.complete
end

VM->>Client: Return CSE status with<br/>/var/log/azure/aks/provision.json content
```

Key components:

1. `aks-node-controller.service`: systemd unit that is triggered once cloud-init is complete (guaranteeing that config is present on disk) and then kickstarts bootstrapping.
2. `aks-node-controller` go binary with two modes:

- **provision**: parses the node config and triggers bootstrap process
- **provision-wait**: waits for `provision.complete` to be present and reads `provision.json` which contains the provision output of type `CSEStatus` and is returned by CSE through capturing stdout
- **provision-wait**: waits for `provision.complete` to be present and reads `provision.json` which contains the provision output of type `CSEStatus` and is returned by CSE through capturing stdout
11 changes: 0 additions & 11 deletions aks-node-controller/buf.gen.yaml

This file was deleted.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

26 changes: 9 additions & 17 deletions aks-node-controller/pkg/gen/aksnodeconfig/v1/auth_config.pb.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
