new: troubleshooting module #999

Open
wants to merge 55 commits into
base: main
55 commits
2ed2045
Adding ALB troubleshooting scenario and Troubleshooting Methodologies
arcegacardenas Jun 11, 2024
f72bc88
Enable lab timings in PR website build
niallthomson Jul 1, 2024
a554a6c
- adding test automation validation to each section
Jul 27, 2024
1cc9bbd
Fix kubectl completion setup to be idempotent
niallthomson Jul 18, 2024
c16897f
Update flux lab for new sample app repo structure
niallthomson Jul 21, 2024
e74a4c8
Yaml component: Slight refactor, added zoomPath
niallthomson Jul 23, 2024
25f97a0
feat: Added Console button component to display links to AWS console …
niallthomson Jul 23, 2024
1ad5f44
update: Migrate OSS metrics lab to OpenTelemetry operator (#1017)
niallthomson Jul 23, 2024
8d3f887
update: Update Container Insights lab based on enhanced observability…
niallthomson Jul 23, 2024
e48f4af
new: Lab for Kubernetes Event-Driven Autoscaler (KEDA) (#1011)
dms486 Jul 24, 2024
d031b55
Remove debug statement
niallthomson Jul 24, 2024
68010af
Improve reliability of tests for observability labs
niallthomson Jul 27, 2024
ecd3034
Add wait to ArgoCD test while investigating flakiness
niallthomson Jul 27, 2024
cead511
Correct how OTel operator installed
niallthomson Jul 27, 2024
6b7a823
Add lab timings for keda
niallthomson Jul 27, 2024
1441444
Revert change
niallthomson Jul 27, 2024
e772770
Fixing formatting
arcegacardenas Jul 27, 2024
84e763c
Adding introduction for each section
arcegacardenas Jul 27, 2024
55382cb
Added 1.30 upgrade notice
niallthomson Jul 27, 2024
c735d08
Add upgrade header
niallthomson Jul 27, 2024
a1a429b
Migrated own account set up to use CloudFormation quick launch links
niallthomson Jul 29, 2024
1df2f48
Remove unnecessary packages from installer
niallthomson Aug 1, 2024
0ddad96
chore(deps): update dependency argoproj/argo-cd to v2.11.7 (#1023)
renovate[bot] Aug 1, 2024
0b0ea82
chore(deps): update dependency hashicorp/terraform to v1.9.3 (#1024)
renovate[bot] Aug 1, 2024
642bc0a
chore(deps): update dependency eksctl-io/eksctl to v0.188.0 (#1041)
renovate[bot] Aug 5, 2024
6209cfe
chore(deps): update helm release nginx to v18.1.7 (#1039)
renovate[bot] Aug 5, 2024
d3a5a7f
Bump sass from 1.77.6 to 1.77.8 in /website (#1037)
dependabot[bot] Aug 5, 2024
a2d98fa
Bump glob from 10.4.2 to 11.0.0 in /website (#1035)
dependabot[bot] Aug 5, 2024
d1be304
chore(deps): update helm release keda to v2.15.0 (#1033)
renovate[bot] Aug 5, 2024
e6790a5
chore(deps): update dependency kubernetes/autoscaler to v1.30.2 (#1026)
renovate[bot] Aug 5, 2024
b592f1b
chore(deps): update dependency kubernetes/kubernetes to v1.30.3 (#1032)
renovate[bot] Aug 5, 2024
830e84f
chore(deps): update dependency helm/helm to v3.15.3 (#1025)
renovate[bot] Aug 5, 2024
6fa9528
chore(deps): update helm release opentelemetry-operator to v0.65.1 (#…
renovate[bot] Aug 5, 2024
2a9c479
Bump typescript from 5.5.2 to 5.5.4 in /test/util (#1029)
dependabot[bot] Aug 6, 2024
c499738
Bump @types/chai from 4.3.14 to 4.3.17 in /test/util (#1027)
dependabot[bot] Aug 6, 2024
f51fc56
6-month steering committee rotation
svennam92 Aug 7, 2024
2591ff9
Fix network policies logging
niallthomson Aug 9, 2024
3c6ae3b
Fixed steering file formatting
niallthomson Aug 9, 2024
c64031b
Revert "chore(deps): update helm release opentelemetry-operator to v0…
niallthomson Aug 9, 2024
c58eb3d
chore(deps): update helm release argo-cd to v7 (#1002)
renovate[bot] Aug 9, 2024
b6253a1
Fix test utility to inject proper test output to after hooks
niallthomson Aug 9, 2024
6c7b40e
chore: Add markdown linting checks (#1047)
niallthomson Aug 12, 2024
f5a911c
update: Fix inconsistent deployment name in Resource View section (#1…
arkagang Aug 12, 2024
868e682
fix: Corrected typo in ADOT manifest breakdowns (#1051)
niallthomson Aug 14, 2024
88a7eee
update: Update IRSA and Pod Identity to new sample application versio…
DovAmir Aug 14, 2024
b045d0c
Bump mocha from 10.5.2 to 10.7.0 in /test/util (#1031)
dependabot[bot] Aug 14, 2024
b8d7fa3
Bump react-tooltip from 5.27.0 to 5.28.0 in /website (#1043)
dependabot[bot] Aug 14, 2024
d49d8e5
Bump @fortawesome/fontawesome-svg-core from 6.5.2 to 6.6.0 in /websit…
dependabot[bot] Aug 14, 2024
44030dc
Bump yaml from 2.4.5 to 2.5.0 in /test/util (#1028)
dependabot[bot] Aug 14, 2024
28bbd9b
Fixing formatting
arcegacardenas Aug 16, 2024
ef7bf9e
Merge branch 'aws-samples:main' into troubleshooting-module
arcegacardenas Aug 21, 2024
85c3120
Formatting
arcegacardenas Aug 21, 2024
1d74201
Merge branch 'main' of github.com:arcegacardenas/eks-workshop-v2 into…
arcegacardenas Sep 12, 2024
46882b4
Removing provisioner destroy since it was not working properly. Movin…
arcegacardenas Sep 21, 2024
95c8833
Merge branch 'main' into troubleshooting-module
arcegacardenas Sep 21, 2024
19 changes: 19 additions & 0 deletions manifests/modules/troubleshooting/alb/.workshop/cleanup.sh
@@ -0,0 +1,19 @@
#!/bin/bash

logmessage "Restoring public subnet tags..."

# Restore the kubernetes.io/role/elb tag on the workshop's public subnets
restore_tags_on_subnets() {
  local subnets_vpc subnet_id

  # Public subnets created by the workshop, selected by their Name and created-by tags
  subnets_vpc=$(aws ec2 describe-subnets --filters "Name=tag:Name,Values=*Public*" "Name=tag:created-by,Values=eks-workshop-v2" --query 'Subnets[*].SubnetId' --output text)
  #logmessage "subnets_vpc: $subnets_vpc"

  # Re-create the tag on each subnet with the AWS CLI
  for subnet_id in $subnets_vpc; do
    #logmessage "public subnets: $subnet_id"
    aws ec2 create-tags --resources "$subnet_id" --tags Key=kubernetes.io/role/elb,Value='1' || logmessage "Failed to create tag on subnet $subnet_id"
  done
  return 0
}

restore_tags_on_subnets
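
Review note: the cleanup script and the Terraform `break_public_subnet` provisioner are mirror images. One calls `aws ec2 create-tags`, the other `aws ec2 delete-tags`, both with the `kubernetes.io/role/elb` tag. A minimal sketch of a shared helper follows, assuming a hypothetical `elb_tag_command` function and a `DRY_RUN` variable (neither is part of this PR). With `DRY_RUN=1` it prints the command instead of running it, so both directions can be compared without touching AWS. The subnet IDs are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): builds the AWS CLI call that adds
# or removes the kubernetes.io/role/elb tag on a list of subnets.
# With DRY_RUN=1 the command is printed instead of executed.
elb_tag_command() {
  local action=$1   # "create" restores the tag, "delete" removes it
  shift
  case "$action" in
    create|delete) ;;
    *) echo "usage: elb_tag_command create|delete SUBNET_ID..." >&2; return 2 ;;
  esac
  if [ "$#" -eq 0 ]; then
    echo "no subnet ids given" >&2
    return 2
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command that would run, with all subnet ids in one call
    echo "aws ec2 ${action}-tags --resources $* --tags Key=kubernetes.io/role/elb,Value=1"
  else
    aws ec2 "${action}-tags" --resources "$@" --tags "Key=kubernetes.io/role/elb,Value=1"
  fi
}

# Dry run with placeholder subnet ids
DRY_RUN=1 elb_tag_command create subnet-aaa subnet-bbb
```

Both `create-tags` and `delete-tags` accept several resource IDs in one call, which is why the sketch does not loop over the subnets.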
176 changes: 176 additions & 0 deletions manifests/modules/troubleshooting/alb/.workshop/terraform/main.tf
@@ -0,0 +1,176 @@
terraform {
  required_providers {
    #    kubectl = {
    #      source  = "gavinbunney/kubectl"
    #      version = ">= 1.14"
    #    }
  }
}

provider "aws" {
  region = "us-east-1"
  alias  = "virginia"
}

locals {
  tags = {
    module = "troubleshooting"
  }
}

data "aws_vpc" "selected" {
  tags = {
    created-by = "eks-workshop-v2"
    env        = var.addon_context.eks_cluster_id
  }
}

data "aws_subnets" "public" {
  tags = {
    created-by = "eks-workshop-v2"
    env        = var.addon_context.eks_cluster_id
  }

  filter {
    name   = "tag:Name"
    values = ["*Public*"]
  }
}

resource "time_sleep" "blueprints_addons_sleep" {
  depends_on = [
    module.eks_blueprints_addons
  ]

  create_duration  = "15s"
  destroy_duration = "15s"
}


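# Remove the kubernetes.io/role/elb tag from the public subnets (the fault injected for this scenario)
resource "null_resource" "break_public_subnet" {
  triggers = {
    public_subnets = join(" ", data.aws_subnets.public.ids)
    always_run     = timestamp()
  }
  # The command below already targets every subnet in a single call, so one
  # instance is enough. length() must be applied to the ids list; applied to the
  # data source object it counts attributes, not subnets.
  count = length(data.aws_subnets.public.ids) > 0 ? 1 : 0

  lifecycle {
    create_before_destroy = false
  }

  provisioner "local-exec" {
    when    = create
    command = "aws ec2 delete-tags --resources ${self.triggers.public_subnets} --tags Key=kubernetes.io/role/elb,Value='1'"
  }
}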

module "eks_blueprints_addons" {
  source  = "aws-ia/eks-blueprints-addons/aws"
  version = "1.16.2"

  enable_aws_load_balancer_controller = true
  aws_load_balancer_controller = {
    wait = true
  }

  cluster_name      = var.addon_context.eks_cluster_id
  cluster_endpoint  = var.addon_context.aws_eks_cluster_endpoint
  cluster_version   = var.eks_cluster_version
  oidc_provider_arn = var.addon_context.eks_oidc_provider_arn

  tags = merge(
    var.tags,
    local.tags
  )

  depends_on = [null_resource.break_public_subnet]
}

# create a new policy from json file
resource "aws_iam_policy" "issue" {
  name   = "issue"
  path   = "/"
  policy = file("${path.module}/template/other_issue.json")
}

# attach issue policy to role
resource "aws_iam_role_policy_attachment" "issue_policy_attachment" {
  role       = module.eks_blueprints_addons.aws_load_balancer_controller.iam_role_name
  policy_arn = aws_iam_policy.issue.arn
  depends_on = [module.eks_blueprints_addons, time_sleep.blueprints_addons_sleep]
}

# detach the policy created by the add-on module, after the issue policy is attached
resource "null_resource" "detach_existing_policy" {
  triggers = {
    role_name  = module.eks_blueprints_addons.aws_load_balancer_controller.iam_role_name,
    always_run = timestamp()
  }

  provisioner "local-exec" {
    command = "aws iam detach-role-policy --role-name ${self.triggers.role_name} --policy-arn ${module.eks_blueprints_addons.aws_load_balancer_controller.iam_policy_arn}"
    when    = create
  }

  depends_on = [aws_iam_role_policy_attachment.issue_policy_attachment]
}

# deploy the sample application manifests for the scenario
resource "null_resource" "kustomize_app" {
  triggers = {
    always_run = timestamp()
  }

  provisioner "local-exec" {
    command = "kubectl apply -k ~/environment/eks-workshop/modules/troubleshooting/alb/creating-alb"
    when    = create
  }

  depends_on = [aws_iam_role_policy_attachment.issue_policy_attachment]
}



# Example of how to get variables from the add-on outputs. DO NOT DELETE: the add-on and Helm documentation does not show exactly which output variables are returned
#resource "null_resource" "blue_print_output" {
# for_each = module.eks_blueprints_addons.aws_load_balancer_controller
# triggers = {
#
# timestamp = timestamp()
# }
#
# #count = length(module.eks_blueprints_addons.aws_load_balancer_controller)
# provisioner "local-exec" {
# command = "mkdir -p /eks-workshop/logs; echo \" key: ${each.key} Value:${each.value}\" >> /eks-workshop/logs/action-load-balancer-output.log"
# }
#
# depends_on = [module.eks_blueprints_addons,time_sleep.blueprints_addons_sleep]
#}

#option to run a bash script file
#resource "null_resource" "break2" {
# provisioner "local-exec" {
# command = "${path.module}/template/break.sh ${path.module} mod2"
# }
#
# triggers = {
# always_run = timestamp()
# }
# depends_on = [module.eks_blueprints_addons,time_sleep.blueprints_addons_sleep]
#}

#option to run a kubectl manifest
#resource "kubectl_manifest" "alb" {
# yaml_body = templatefile("${path.module}/template/ingress.yaml", {
#
# })
#
# depends_on = [null_resource.break_policy]
#}
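
Review note: the IAM part of `main.tf` relies on ordering. The `issue` policy is attached first, and only then is the policy created by the add-on module detached (`detach_existing_policy` depends on `issue_policy_attachment`). A small sketch of a check for the resulting state follows. The `alb_lab_policy_state` function is hypothetical and not part of this PR. It reads attached policy ARNs from standard input, for example the text output of `aws iam list-attached-role-policies --query 'AttachedPolicies[*].PolicyArn' --output text`, and the ARNs in the usage line are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical check (not part of this PR). Reads attached policy ARNs from
# stdin and reports the state of the controller role:
#   broken = issue policy attached, original policy detached
#   fixed  = original policy attached, issue policy detached
#   mixed  = anything else (both attached, or neither)
alb_lab_policy_state() {
  local issue_arn=$1
  local fix_arn=$2
  local has_issue=no
  local has_fix=no
  local arn
  # Split stdin on whitespace so tab- and newline-separated lists both work
  for arn in $(cat); do
    if [ "$arn" = "$issue_arn" ]; then has_issue=yes; fi
    if [ "$arn" = "$fix_arn" ]; then has_fix=yes; fi
  done
  if [ "$has_issue" = yes ] && [ "$has_fix" = no ]; then
    echo "broken"
  elif [ "$has_issue" = no ] && [ "$has_fix" = yes ]; then
    echo "fixed"
  else
    echo "mixed"
  fi
}

# Placeholder ARNs; prints "broken"
printf '%s\n' "arn:aws:iam::111122223333:policy/issue" |
  alb_lab_policy_state "arn:aws:iam::111122223333:policy/issue" "arn:aws:iam::111122223333:policy/original"
```

Keeping the function a pure filter over text means it can be exercised without AWS credentials.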


@@ -0,0 +1,13 @@
output "environment_variables" {
  description = "Environment variables to be added to the IDE shell"
  value = merge({
    VPC_ID                                    = data.aws_vpc.selected.id,
    LOAD_BALANCER_CONTROLLER_ROLE_NAME        = module.eks_blueprints_addons.aws_load_balancer_controller.iam_role_name,
    LOAD_BALANCER_CONTROLLER_POLICY_ARN_FIX   = module.eks_blueprints_addons.aws_load_balancer_controller.iam_policy_arn,
    LOAD_BALANCER_CONTROLLER_POLICY_ARN_ISSUE = aws_iam_policy.issue.arn,
    LOAD_BALANCER_CONTROLLER_ROLE_ARN         = module.eks_blueprints_addons.aws_load_balancer_controller.iam_role_arn
    }, {
    for index, id in data.aws_subnets.public.ids : "PUBLIC_SUBNET_${index + 1}" => id
    }
  )
}
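
Review note: the `for` expression in this output turns the list of public subnet IDs into numbered keys, `PUBLIC_SUBNET_1`, `PUBLIC_SUBNET_2`, and so on. The `index + 1` makes the numbering start at 1 rather than 0. As an illustration only, here is the same mapping written in shell, with a hypothetical `subnet_env_lines` function and placeholder IDs (neither comes from the PR).

```shell
#!/usr/bin/env bash
# Hypothetical illustration of the Terraform for-expression:
# one KEY=VALUE line per subnet id, numbered from 1.
subnet_env_lines() {
  local i=1
  local id
  for id in "$@"; do
    echo "PUBLIC_SUBNET_${i}=${id}"
    i=$((i + 1))
  done
}

subnet_env_lines subnet-aaa subnet-bbb
# prints:
# PUBLIC_SUBNET_1=subnet-aaa
# PUBLIC_SUBNET_2=subnet-bbb
```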
@@ -0,0 +1,155 @@
#!/usr/bin/env bash
#. .env

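set -e

mkdir -p /eks-workshop/logs
log_file=/eks-workshop/logs/action-$(date +%s).log

# Open file descriptor 7 on the log file; logmessage writes to it below.
# Assumption: the caller has not already opened fd 7. Without an open fd 7,
# every ">&7" fails with "Bad file descriptor" and set -e stops the script.
exec 7>>"$log_file"
exec 2>&1

# Write a message to both the log file (fd 7) and standard output
logmessage() {
  echo "$@" >&7
  echo "$@" >&1
}
export -f logmessage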

# Function to get the role name from a role ARN
# Sets the global variable role_name, which the main section reads later.
get_role_name_from_arn() {
  local role_arn=$1

  # Extract the role name (last path segment) from the ARN.
  # Plain echo feeds the pipe so the ARN is not written to the log a second time.
  role_name=$(echo "$role_arn" | awk -F'/' '{print $NF}')

  if [ -n "$role_name" ]; then
    logmessage "$role_name"
  else
    logmessage "Failed to retrieve role name from ARN: $role_arn"
    return 1
  fi
}

# Function to get the IAM role attached to a Kubernetes service account
get_service_account_role() {
  local namespace=$1
  local service_account=$2

  # Get the role ARN from the service account's eks.amazonaws.com/role-arn annotation
  role_arn=$(kubectl get serviceaccount "$service_account" -n "$namespace" -o jsonpath="{.metadata.annotations['eks\.amazonaws\.com\/role-arn']}")

  if [ -n "$role_arn" ]; then
    logmessage "Service Account: $service_account"
    logmessage "Namespace: $namespace"
    logmessage "Role ARN: $role_arn"
    get_role_name_from_arn "$role_arn"
    return 0
  else
    logmessage "Failed to retrieve role for service account '$service_account' in namespace '$namespace'"
    return 1
  fi
}

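# Function to get the first policy ARN attached to a role
# Despite the function name, the argument is a role name, which is what
# "aws iam list-attached-role-policies --role-name" expects.
# Sets the global variable policy_arn.
get_first_policy_arn_from_role_arn() {
  local role_name=$1

  # Get the first policy attached to the role
  policy_arn=$(aws iam list-attached-role-policies --role-name "$role_name" --query 'AttachedPolicies[0].PolicyArn' --output text)

  # With --output text an empty result is printed as "None"
  if [ -n "$policy_arn" ] && [ "$policy_arn" != "None" ]; then
    logmessage "First Policy ARN attached to role '$role_name':"
    logmessage "Policy: $policy_arn"
    return 0
  else
    logmessage "Failed to retrieve policy ARN for role '$role_name'"
    return 1
  fi
}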

# Function to update the policy with a new policy document
# new_statement is a file:// reference to the JSON document.
update_policy_with_new_statement() {
  local policy_arn=$1
  local new_statement=$2

  logmessage "PolicyARN: $policy_arn"
  logmessage "Statement: $new_statement"
  aws iam create-policy-version --policy-arn "$policy_arn" --policy-document "$new_statement" --set-as-default
}

# Function to remove an action from a policy's statements
remove_action_from_policy_statement() {
  local policy_arn=$1
  local action_to_remove=$2
  local policy_document new_policy_document

  # Get the current policy document (version v1 is hard-coded)
  policy_document=$(aws iam get-policy-version --policy-arn "$policy_arn" --query 'PolicyVersion.Document' --version-id v1 --output json)

  # Remove the action from every statement whose Action is a list.
  # The action is passed with --arg because jq has no single-quoted strings.
  if new_policy_document=$(echo "$policy_document" | jq -c --arg action "$action_to_remove" '.Statement |= map(if (.Action | type) == "array" then .Action = (.Action - [$action]) else . end)'); then
    logmessage "Policy Document"
    logmessage "$new_policy_document"
    # Applying the modified document is still disabled:
    #aws iam create-policy-version --policy-arn "$policy_arn" --policy-document "$new_policy_document" --set-as-default
    logmessage "Action removed from policy statement successfully."
    return 0
  else
    logmessage "Failed to remove action from policy statement."
    return 1
  fi
}

# Function to remove the kubernetes.io/role/elb tag from the subnets of a VPC
remove_tags_from_subnets() {
  local tag_spec="Key=kubernetes.io/role/elb,Value=1"

  logmessage "Retrieving subnet IDs for the VPC via the AWS CLI"
  logmessage "getting subnets from VPC: $vpc_id"

  # Note: this lists every subnet in the VPC, not only the public ones
  subnets_vpc=$(aws ec2 describe-subnets --filters "Name=vpc-id,Values=$vpc_id" --query 'Subnets[*].SubnetId' --output text)
  logmessage "subnets_vpc: $subnets_vpc"

  # Remove the tag from each subnet with the AWS CLI
  for subnet_id in $subnets_vpc; do
    logmessage "subnet: $subnet_id"
    # tag_spec already starts with "Key=", so it is passed as-is
    aws ec2 delete-tags --resources "$subnet_id" --tags "$tag_spec" || logmessage "Failed to remove tag from subnet $subnet_id"
  done
  return 0
}

# Read the script arguments
path_tofile=$1
mode=$2
vpc_id=$3
public_subnets=$4
namespace="kube-system"
service_account="aws-load-balancer-controller-sa"
#new_statement="file://$path_tofile/template/iam_policy_incorrect.json"
new_statement="file://$path_tofile/template/other_issue.json"

logmessage "path_sent: $path_tofile"

# mod1 removes the subnet tags; any other mode replaces the controller policy
logmessage "mode: $mode"
if [ "$mode" == "mod1" ]; then
  logmessage "Removing subnet tags"
  remove_tags_from_subnets
else
  logmessage "Removing permissions"
  get_service_account_role "$namespace" "$service_account"
  get_first_policy_arn_from_role_arn "$role_name"
  update_policy_with_new_statement "$policy_arn" "$new_statement"
fi
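
Review note: `logmessage` writes to file descriptor 7 as well as standard output, so that descriptor has to be open before the first call. A minimal sketch of the pattern in isolation follows, assuming the log goes to a temporary file instead of a path under `/eks-workshop/logs`. It also shows the role-name extraction done with shell parameter expansion instead of `awk`. The `role_name_from_arn` function and the sample ARN are illustrative and not taken from the PR.

```shell
#!/usr/bin/env bash
# Assumption for this sketch: log to a temporary file
log_file=$(mktemp)

# Open fd 7 in append mode before anything writes to it
exec 7>>"$log_file"

# Same shape as the script's logmessage: one copy to the log, one to stdout
logmessage() {
  echo "$@" >&7
  echo "$@"
}

# Hypothetical helper: print the last path segment of a role ARN.
# Returns 1 when the input contains no "/" or ends with one.
role_name_from_arn() {
  local role_arn=$1
  local name=${role_arn##*/}
  if [ -n "$name" ] && [ "$name" != "$role_arn" ]; then
    echo "$name"
  else
    return 1
  fi
}

logmessage "Role: $(role_name_from_arn "arn:aws:iam::111122223333:role/alb-controller")"
# prints: Role: alb-controller
```

Because the helper prints only the name and never calls `logmessage`, it can be used inside `$(...)` without adding extra lines to the log.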



