Test: Implemented ginkgo test against local kubeconfig #2015
Conversation
✅ Deploy Preview for karpenter-docs-prod canceled.
Force-pushed from 4fa3c65 to 19db495 (compare)
Force-pushed from 0210f59 to cbe1b62 (compare)
test/pkg/environment/expectations.go (outdated)
for _, namespace := range namespaces.Items {
    wg.Add(1)
    go func(object client.Object, namespace string) {
        env.Expect(env.Client.DeleteAllOf(env, object,
Maybe add &client.DeleteAllOfOptions{DeleteOptions: client.DeleteOptions{GracePeriodSeconds: ptr.Int64(0)}})
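For reference, a minimal sketch of what that would look like in the cleanup loop, assuming the controller-runtime client and knative `ptr` helper already imported in expectations.go and the usual Gomega `Succeed()` matcher; the `client.InNamespace` option is inferred from the surrounding loop, not shown in the diff:

```go
// Sketch only: a zero grace period force-deletes instead of waiting for normal
// termination. env, object, and namespace come from the surrounding goroutine.
env.Expect(env.Client.DeleteAllOf(env, object,
    client.InNamespace(namespace),
    &client.DeleteAllOfOptions{
        DeleteOptions: client.DeleteOptions{GracePeriodSeconds: ptr.Int64(0)},
    },
)).To(Succeed())
```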
I wanted this to be closer to natural usage of clusters and include normal deletion flows rather than force deletion.
Force-pushed from b468169 to 5a6c55a (compare)
Let's discuss how we want to separate tasks, clusterTasks, and pipelines in the filesystem. If someone is importing all the files from GitHub, the way you've laid it out makes it easy to find the definitions in the repo. As the number of configs increases, though, someone doing this manually has to figure out how to find all the related tasks and then apply them by hand. I think we should keep ClusterTasks in their own ClusterTasks folder and then any tasks in their respective suites. WDYT?
helm upgrade --install --namespace karpenter --create-namespace \
  karpenter karpenter/karpenter \
  --version ${KARPENTER_VERSION} \
helm upgrade --install --namespace karpenter --create-namespace karpenter --version v0.13.1 \
Let's stick to v0.9.0 here. That's the highest version of Karpenter tested in KIT right now, while they figure out this issue. Vending the Karpenter installation for KIT may be possible in the future; that's where they would establish the Karpenter version we could use.
This is a really bad precedent to set. We can't be behind on Karpenter in Karpenter testing. Can't we set a higher TTL?
> Vending the Karpenter installation for KIT may be possible in the future; that's where they would establish the Karpenter version we could use.

Is this something like kitctl?
Possibly. If this is an issue with pre-binding then I want to be able to figure this out. This was a fairly large change in assumptions for Karpenter so I don't think we'll be sticking here for long. We use a TTL of 180 right now.
- name: karpenter-version
  default: latest
- name: account-id
  default: "767520670908"
Let's remove this
kind: ClusterConfig
metadata:
  name: $(params.cluster-name)
  version: 1.22
eksctl v0.86.0 doesn't work with 1.22
  withOIDC: true
EOF
- name: kubeconfig
  image: docker.io/njtran/scratch:latest
You can use:
-  image: docker.io/njtran/scratch:latest
+  image: public.ecr.aws/d2q6r4b4/minimal:latest
TEMPOUT=$(mktemp)
curl -fsSL https://karpenter.sh/"$(params.karpenter-version)"/getting-started/getting-started-with-eksctl/cloudformation.yaml > $TEMPOUT \
  && aws cloudformation deploy \
  --stack-name "Karpenter-$(params.cluster-name)" \
This is going to collide with our Karpenter installation if it uses the same cluster name as the management cluster. Can we make it the following? It may be better to change up the host-cluster setup too, so that we don't risk IAM role name conflicts.
-  --stack-name "Karpenter-$(params.cluster-name)" \
+  --stack-name "KarpenterTesting-$(params.cluster-name)" \
test/tasks/go-suite.yaml (outdated)
- name: repository
- name: path
Can you add descriptions for these?
Force-pushed from a86af3d to 3118682 (compare)
tags:
  kit.aws/substrate: ${CLUSTER_NAME}
subnetSelector:
  karpenter.sh/discovery: ${CLUSTER_NAME}
securityGroupSelector:
  kubernetes.io/cluster/${CLUSTER_NAME}: owned
  karpenter.sh/discovery: ${CLUSTER_NAME}
Sorry for the churn. AWS Load Balancer discovers security groups like this; without this tag discovery, KIT guest clusters will not connect.
-  karpenter.sh/discovery: ${CLUSTER_NAME}
+  kubernetes.io/cluster/${CLUSTER_NAME}: owned
name: karpenter-integration-suite
spec:
  params:
  - name: commit
Can you add a description for this?
I think it's self-documenting as-is.
tasks:
- name: github-action
  taskRef:
    name: github-action
Can we self-reference the type of object it is to make it clear?
-    name: github-action
+    name: github-action-cluster-task
Not a big fan of adding types to object names; deployments/foo-deployment stutters.
pkg/test/pods.go (outdated)
@@ -93,7 +93,7 @@ func Pod(overrides ...PodOptions) *v1.Pod {
     Tolerations: options.Tolerations,
     InitContainers: []v1.Container{{
         Name:  strings.ToLower(sequentialRandomName()),
-        Image: options.Image,
+        Image: "alpine",
Why are we hardcoding the image here if we're allowing pod images to be defined in the first place? Looks like we also have a default above on L78?
Pause doesn't work as an init container. Needed something that could exit.
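For context, a rough sketch of the kind of init container this implies; the explicit command and the container name are illustrative assumptions, not the exact helper code (import assumed: `v1 "k8s.io/api/core/v1"`):

```go
// Hypothetical illustration: an init container must run to completion, so the
// test helper needs an image with an entrypoint that exits (pause never exits).
var initContainer = v1.Container{
	Name:    "init",
	Image:   "alpine", // any small image with a shell works
	Command: []string{"/bin/sh", "-c", "exit 0"},
}
```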
ProviderRef:          options.ProviderRef,
Taints:               options.Taints,
StartupTaints:        options.StartupTaints,
Labels:               options.Labels,
Limits:               &v1alpha5.Limits{Resources: options.Limits},
TTLSecondsAfterEmpty: ptr.Int64(10),
Any reason why we're picking 10 here? Can we parameterize empty and expired? I can imagine these are functions we'll need to test too.
Eventually, sure. I wanted to default to 10 until we need to override it.
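If and when we do need that override, a hedged sketch of how it might look; the `TTLSecondsAfterEmpty` field on the options struct is hypothetical, not something the test helpers expose today:

```go
// Hypothetical sketch: let callers override the empty TTL while keeping 10 as
// the default the tests rely on now.
func emptyTTL(options ProvisionerOptions) *int64 {
	if options.TTLSecondsAfterEmpty != nil {
		return options.TTLSecondsAfterEmpty
	}
	return ptr.Int64(10) // current hardcoded default
}
```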
 provisioner.SetDefaults(context.Background())
-_ = provisioner.Validate(context.Background())
+if err := provisioner.Validate(context.Background()); err != nil {
+    logging.FromContext(context.TODO()).Info("TODO: Fix the tests that cause this")
Can we report the error here too? I'm unsure if this is a logged TODO that should be used when running tests or a TODO for yourself.
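For instance, something along these lines would surface the error in the log without failing the run; a sketch only, assuming the knative `logging` package already imported here:

```go
// Sketch: include the validation error in the log line so the offending tests
// are easier to track down while the TODO stands.
if err := provisioner.Validate(context.Background()); err != nil {
	logging.FromContext(context.TODO()).Infof("TODO: Fix the tests that cause this: %s", err)
}
```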
A ton of existing tests break if I fail here, since the error was originally squashed incorrectly. I wanted to log and clean it up later, since I'm OOTO for a few days and wanted to unblock folks.
@@ -1,6 +1,5 @@
-export CLUSTER_NAME="${CLUSTER_NAME:-karpenter-test-cluster}"
+export CLUSTER_NAME="${CLUSTER_NAME:-karpenter-test-infrastructure}"
This would make the stack name eksctl-karpenter-test-management-cluster:
-export CLUSTER_NAME="${CLUSTER_NAME:-karpenter-test-infrastructure}"
+export CLUSTER_NAME="${CLUSTER_NAME:-karpenter-test-management}"
karpenter-test-management is less idiomatic than karpenter-test-infrastructure, to me.
I think we have some disagreements on what the CI Infra should be right now. I think the ginkgo work looks great and am happy to approve that if we scope this one down.
Nice! Just a nit or two
var (
	EnvironmentLabelName = "testing.karpenter.sh/environment-name"
	CleanableObjects     = []client.Object{
How do we ensure that Nodes are cleaned up? I think the TTL on the Provisioners is meant to do that, but if a test fails to delete nodes (which could happen), then we want our Cleanup to also include those.
Ah I forgot that Nodes are owned by provisioners now, so this should be good.
We definitely need to test scale-down, too. I think this library will need to be greatly expanded upon (e.g., waiting for things to clean up). This is just to get us started.
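As one possible shape for that expansion, a hedged sketch of a node-cleanup wait; the helper name, the `karpenter.sh/provisioner-name` label selector, the timeout, and the Gomega-argument form of `Eventually` (needs a recent Gomega) are assumptions layered on top of the Environment type in this PR:

```go
// Hypothetical sketch: after cleanup, wait for any nodes created by the test
// provisioners to drain and disappear before the next suite starts.
func (env *Environment) EventuallyExpectNoProvisionedNodes() {
	Eventually(func(g Gomega) {
		nodes := &v1.NodeList{}
		g.Expect(env.Client.List(env, nodes, client.HasLabels{"karpenter.sh/provisioner-name"})).To(Succeed())
		g.Expect(nodes.Items).To(BeEmpty())
	}, 10*time.Minute).Should(Succeed())
}
```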
var _ = Describe("Sanity Checks", func() {
	It("should provision nodes", func() {
		provider := test.AWSNodeTemplate(test.AWSNodeTemplateOptions{AWS: v1alpha1.AWS{
If the Karpenter being tested here doesn't have a defaultInstanceProfile, this should fail. I think it's fine to leave as is, but I think we should make sure we can test both modes eventually.
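For the non-default mode, a hedged sketch of what that template could look like; the profile name is a placeholder and wiring it into this particular test is an assumption:

```go
// Hypothetical: exercise the mode where Karpenter has no defaultInstanceProfile
// by naming the instance profile explicitly on the node template.
provider := test.AWSNodeTemplate(test.AWSNodeTemplateOptions{AWS: v1alpha1.AWS{
	InstanceProfile: ptr.String("KarpenterNodeInstanceProfile-example"), // placeholder name
	// ...same subnet/security-group selectors as the default-mode test...
}})
```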
Co-authored-by: Nick Tran <[email protected]>
Fixes #
Description
Enables e2e tests against a local cluster.
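Roughly, the suite resolves the kubeconfig the same way kubectl does and builds a controller-runtime client from it. A minimal sketch of that bootstrapping under those assumptions (package, suite name, and the ginkgo v2 import path are illustrative, not the exact files in this PR):

```go
// Hypothetical sketch of a ginkgo suite bootstrapped against the local kubeconfig.
package environment_test

import (
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/config"
)

var kubeClient client.Client

func TestE2E(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "E2E")
}

var _ = BeforeSuite(func() {
	// config.GetConfig resolves the local kubeconfig ($KUBECONFIG or ~/.kube/config),
	// i.e. whatever cluster the developer is currently pointed at.
	cfg, err := config.GetConfig()
	Expect(err).ToNot(HaveOccurred())
	kubeClient, err = client.New(cfg, client.Options{})
	Expect(err).ToNot(HaveOccurred())
})
```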
How was this change tested?
Does this change impact docs?
Release Note
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.