AWS CloudWatch Agent failing on OCP 4.6 cluster #75

Open
ShubhamKale31 opened this issue May 3, 2021 · 3 comments

Comments

@ShubhamKale31

I am trying to set up the AWS CloudWatch agent to collect OCP 4.6 cluster metrics, following https://docs.amazonaws.cn/en_us/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-metrics.html

The pods are failing with the following errors:

[ec2-user@ip-10-0-2-100 ~]$ oc logs cloudwatch-agent-57htk
2021/02/02 21:17:21 I! 2021/02/02 21:17:18 E! ec2metadata is not available
2021/02/02 21:17:18 I! attempt to access ECS task metadata to determine whether I'm running in ECS.
2021/02/02 21:17:19 W! retry [0/3], unable to get http response from http://169.254.170.2/v2/metadata, error: unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021/02/02 21:17:20 W! retry [1/3], unable to get http response from http://169.254.170.2/v2/metadata, error: unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021/02/02 21:17:21 W! retry [2/3], unable to get http response from http://169.254.170.2/v2/metadata, error: unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021/02/02 21:17:21 I! access ECS task metadata fail with response unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers), assuming I'm not running in ECS.
I! Detected the instance is OnPrem
2021/02/02 21:17:21 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json ...
/opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json does not exist or cannot read. Skipping it.
2021/02/02 21:17:21 Reading json config file path: /etc/cwagentconfig/..2021_02_02_21_14_55.157039078/cwagentconfig.json ...
2021/02/02 21:17:21 unable to scan config dir /etc/cwagentconfig with error: unable to parse json, error: invalid character '\n' in string literal
No json config files found, please provide config, exit now

2021/02/02 21:17:21 I! Return exit error: exit code=99
2021/02/02 21:17:21 I! there is no json configuration when running translator
@pingleig
Contributor

pingleig commented May 3, 2021

Is this same as #74 ?

@ShubhamKale31
Author

@pingleig I raised issue #74 by mistake from the wrong GitHub account. Could you please close that issue and paste the same message into this one?

@pingleig
Contributor

pingleig commented May 3, 2021

Copied from #74 (comment)

The error is because the agent is trying to reach the EC2 metadata service to get the cluster name. I am not sure whether it is enabled when using OpenShift (I suppose that's the OCP you are referring to). Sometimes the EC2 metadata endpoint is not available when the VM first starts, so the agent restarts once or twice before it can reach it.

If you have access to the VM directly, you can check the endpoint using the following command from https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html. It's a link-local address, so it should work for both the VM and the container if the CNI is using AWS VPC.

TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"` \
&& curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/

Another way is to set the env RUN_IN_AWS=True, from aws/amazon-cloudwatch-agent#122.
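A rough sketch of setting that variable with `oc set env`. The DaemonSet name `cloudwatch-agent` is only a guess from the pod name in the log, and the namespace `amazon-cloudwatch` is taken from the ConfigMap in this thread, so adjust both to your deployment. The function name and the `OC_BIN` override are hypothetical, added only so the command can be previewed first.

```shell
# Set RUN_IN_AWS=True on the agent DaemonSet.
# Assumptions: DaemonSet "cloudwatch-agent", namespace "amazon-cloudwatch".
# OC_BIN is a hypothetical override: OC_BIN=echo prints the command instead of running it.
set_run_in_aws() {
  ds="${1:-cloudwatch-agent}"
  ns="${2:-amazon-cloudwatch}"
  "${OC_BIN:-oc}" set env "daemonset/${ds}" RUN_IN_AWS=True -n "${ns}"
}
```

Running `OC_BIN=echo set_run_in_aws` shows the command that would be issued; plain `set_run_in_aws` applies it, and the DaemonSet should then roll out new pods with the variable set.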

Even if EC2 metadata is working, you are not using a cluster created by an EKS managed node group or eksctl, so you need to set the cluster name directly: cluster name detection is based on EC2 instance tags, which may not be present on OpenShift.

apiVersion: v1
kind: ConfigMap
metadata:
  name: cwagentconfig
  namespace: amazon-cloudwatch
data:
  # Configuration is in Json format. No matter what configure change you make,
  # please keep the Json blob valid.
  cwagentconfig.json: |
    {
      "agent": {
        "debug": true,
        "region": "us-west-1"
      },
      "logs": {
        "metrics_collected": {
          "kubernetes": {
            "cluster_name": "eks-containerd-bottlerocket",
            "metrics_collection_interval": 60
          }
        },
        "force_flush_interval": 5
      }
    }
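
The log above also reports `unable to parse json, error: invalid character '\n' in string literal`, which suggests the JSON blob in the mounted ConfigMap has a raw line break inside a quoted string. A minimal sketch for checking the blob before applying it, assuming `python3` is available on the machine; the function name `check_cwagent_json` is hypothetical.

```shell
# Succeeds only if the given file parses as JSON.
# Relies on Python's standard json.tool module, which exits non-zero
# and prints the parse error when the input is not valid JSON.
check_cwagent_json() {
  python3 -m json.tool "$1" > /dev/null
}
```

For example, save the blob to a local file and run `check_cwagent_json cwagentconfig.json && echo valid`. To check what is already in the cluster, something like the following should work, assuming the ConfigMap name and namespace shown above:

```shell
oc get configmap cwagentconfig -n amazon-cloudwatch \
  -o jsonpath='{.data.cwagentconfig\.json}' > /tmp/cwagentconfig.json
check_cwagent_json /tmp/cwagentconfig.json
```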
