Helm is the recommended method of deployment. A kubectl-based deployment can be done by generating individual templates using Helm. Refer to this for details.
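For example, a minimal sketch of a kubectl-based deployment (the chart reference, release name, and output file name below are illustrative):

# Render the chart templates locally into plain Kubernetes manifests.
helm template oci-onm <path-or-reference-to-oci-onm-chart> -f override_values.yaml > oci-onm-manifests.yaml
# Review the generated manifests, then apply them with kubectl.
kubectl apply -f oci-onm-manifests.yaml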
Note: This is supported only through the Helm chart-based deployment.
By default, a cluster role, cluster role binding, and service account are created for the Fluentd and Management Agent pods to get read-only access to various Kubernetes objects within the cluster for logs, objects, and metrics collection. However, if you want to use your own service account, you can do so by setting the "oci-onm-common.createServiceAccount" variable to false and providing your own service account in the "oci-onm-common.serviceAccount" variable. Ensure that the service account is in the same namespace as the one used for the whole deployment. The namespace for the whole deployment can be set using the "oci-onm-common.namespace" variable, whose default value is "oci-onm".
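For example, a minimal override_values.yaml sketch using a pre-created service account (the service account name below is illustrative):

oci-onm-common:
  # Use an existing service account instead of having the chart create one.
  createServiceAccount: false
  # Name of your pre-created service account (illustrative value).
  serviceAccount: my-oci-onm-sa
  # Namespace used for the whole deployment (must match the service account's namespace).
  namespace: oci-onm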
The service account must be bound to a cluster role defined in your cluster that allows access to the metadata of various objects. The following is a recommended minimal role definition as of chart version 3.0.0.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: oci-onm
rules:
  - apiGroups:
      - ""
    resources:
      - '*'
    verbs:
      - get
      - list
      - watch
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
  - apiGroups:
      - apps
      - batch
      - discovery.k8s.io
      - metrics.k8s.io
    resources:
      - '*'
    verbs:
      - get
      - list
      - watch
Once you have the cluster role defined, bind it to your service account using the following cluster role binding definition.
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: oci-onm
roleRef:
  kind: ClusterRole
  name: oci-onm
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: <ServiceAccountName>
    namespace: <Namespace>
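For example, assuming the above definitions are saved to local files (file names below are illustrative), you can create them and spot-check the resulting access with:

# Create the cluster role and the cluster role binding.
kubectl apply -f oci-onm-clusterrole.yaml
kubectl apply -f oci-onm-clusterrolebinding.yaml
# Optionally verify the access granted to the service account.
kubectl auth can-i list pods --as=system:serviceaccount:<Namespace>:<ServiceAccountName>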
Note: This is supported only through the Helm chart-based deployment.
By default, the Fluentd tail plugin that is used to collect various logs has its encoding set to ASCII-8BIT. To override the default encoding, use one of the following approaches.
Set a value for encoding under the fluentd:tailPlugin section of values.yaml, which applies to all the logs collected from the cluster.
..
..
oci-onm-logan:
  ..
  ..
  fluentd:
    ...
    ...
    tailPlugin:
      ...
      ...
      encoding: <ENCODING-VALUE>
The encoding can also be set for individual log types like kubernetesSystem, linuxSystem, and genericContainerLogs, which applies to all the logs under that specific log type.
..
..
oci-onm-logan:
  ..
  ..
  fluentd:
    ...
    ...
    kubernetesSystem:
      ...
      ...
      encoding: <ENCODING-VALUE>
..
..
oci-onm-logan:
  ..
  ..
  fluentd:
    ...
    ...
    genericContainerLogs:
      ...
      ...
      encoding: <ENCODING-VALUE>
The encoding can also be set at the individual log level, which takes precedence over all the other settings.
..
..
oci-onm-logan:
  ..
  ..
  fluentd:
    ...
    ...
    kubernetesSystem:
      ...
      ...
      logs:
        kube-proxy:
          encoding: <ENCODING-VALUE>
..
..
oci-onm-logan:
  ..
  ..
  fluentd:
    ...
    ...
    customLogs:
      custom-log1:
        ...
        ...
        encoding: <ENCODING-VALUE>
If a log record contains a timezone identifier, the Logging Analytics service will use that timezone. However, if there is no timezone information, the service defaults to UTC.
To override this default, use the timezone parameter in your values.yaml file. This parameter can be configured at different levels.
Note: If a log record already has a timezone identifier, this setting may not be applicable.
- Setting oci-onm-logan.fluentd.timezone to PST applies PST as the default timezone for all logs collected via the Fluentd agent.
- Setting oci-onm-logan.fluentd.genericContainerLogs.timezone to IST applies IST as the default timezone specifically for generic container logs.
..
..
oci-onm-logan:
  fluentd:
    timezone: <Set default timezone for all logs collected via fluentd agent>
    ...
    ...
    kubernetesSystem:
      timezone: <Set default timezone for all Kubernetes System logs>
      logs:
        kube-proxy:
          timezone: <Set default timezone for kube-proxy logs>
    ...
    ...
    linuxSystem:
      logs:
        cronlog:
          timezone: <Set default timezone for cron logs>
    ...
    ...
    eksControlPlane:
      logs:
        apiserver:
          timezone: <Set default timezone for EKS API server logs>
    ...
    ...
    genericContainerLogs:
      timezone: <Set default timezone for generic container logs>
    ...
    ...
    customLogs:
      custom-log-1:
        timezone: <Set default timezone for custom logs>
    ...
    ...
Note: This is supported only through the Helm chart-based deployment.
The default AuthZ configuration for connecting to OCI services from the monitoring pods running in the Kubernetes cluster is InstancePrincipal, and it is the recommended approach for OKE. If you are trying to monitor Kubernetes clusters other than OKE, you need to use config file based AuthZ instead.
First, you need an OCI local user (preferably a dedicated user created only for this use case so that you can restrict the policies accordingly) and an OCI user group. Then you need to generate an API signing key and create the required policies.
- Refer to OCI API Signing Key for instructions on how to generate an API signing key for a given user (a minimal OpenSSL sketch follows after this list).
- Refer to this for creating the required policies.
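As a minimal sketch (following the OCI API signing key documentation referenced above; file names are illustrative), the key pair and fingerprint can be generated with OpenSSL, after which the public key is uploaded for the user in the OCI Console:

# Generate a 2048-bit RSA private key and derive the public key from it.
openssl genrsa -out private.pem 2048
openssl rsa -pubout -in private.pem -out public.pem
# Compute the key fingerprint that goes into the OCI config file.
openssl rsa -pubout -outform DER -in private.pem | openssl md5 -c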
Modify your override_values.yaml to add the following.
...
...
oci-onm-logan:
  ...
  ...
  authtype: config
  ## -- OCI API Key Based authentication details. Required when authtype set to config
  oci:
    # -- Path to the OCI API config file
    path: /var/opt/.oci
    # -- Config file name
    file: config
    configFiles:
      config: |-
        # Replace each of the below fields with actual values.
        [DEFAULT]
        user=<user ocid>
        fingerprint=<fingerprint>
        key_file=/var/opt/.oci/private.pem
        tenancy=<tenancy ocid>
        region=<region>
      private.pem: |-
        # -----BEGIN RSA PRIVATE KEY-----
        # XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        # -----END RSA PRIVATE KEY-----
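Once override_values.yaml is updated, the change can be rolled out with a Helm upgrade; for example (release name and chart reference are illustrative):

helm upgrade --install <release-name> <path-or-reference-to-oci-onm-chart> -f override_values.yaml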
Note: This feature does not work with custom Fluentd configuration; if you set a custom Fluentd configuration, it is placed under worker 0.
To enable the multi-process workers feature of Fluentd, set the value of multiProcessWorkers under oci-onm-logan to the desired number of workers. By default it is set to 0, indicating that the feature is disabled. To assign different workers to different logs, you may set the worker field to the desired worker id against each of the individual logs or groups of logs as supported in the configuration. The default worker id is 0 for all the logs when the multiProcessWorkers feature is enabled. The following are a few examples.
Enable the multi-process workers feature with 2 workers, assign worker id 1 to all container logs (except custom logs), and keep the default worker id (0) for all the remaining logs.
..
..
oci-onm-logan:
  ..
  ..
  fluentd:
    ...
    ...
    multiProcessWorkers: 2
    ...
    ...
    genericContainerLogs:
      ...
      ...
      worker: 1
Enable the multi-process workers feature with 3 workers, assign worker id 1 to all container logs (except custom logs), worker id 2 to Kube Proxy, Linux Syslog, and Kubelet logs, and keep the default worker id (0) for all the remaining logs.
..
..
oci-onm-logan:
  ..
  ..
  fluentd:
    ...
    ...
    multiProcessWorkers: 3
    ...
    ...
    kubernetesSystem:
      ...
      ...
      logs:
        ...
        ...
        kube-proxy:
          ...
          ...
          worker: 2
        ...
        ...
    ...
    ...
    linuxSystem:
      ...
      ...
      logs:
        ...
        ...
        syslog:
          ...
          ...
          worker: 2
        ...
        ...
    ...
    ...
    genericContainerLogs:
      ...
      ...
      worker: 1
Log location: /var/log/oci-logging-analytics.log
Sample Error:
E, [2023-08-07T10:17:13.710854 #18] ERROR -- : oci upload exception : Error while uploading the payload. { 'message': 'execution expired', 'status': 0, 'opc-request-id': 'D733ED0C244340748973D8A035068955', 'response-body': '' }
- Check if your OCNE setup configuration has the restrict-service-externalip value set to true for the kubernetes module. If yes, update it to false to allow access to the Logging Analytics endpoint from containers. Refer this for more details.
- If the issue is still not resolved, check if your OCNE setup configuration has the selinux value set to enforcing in the globals section. If yes, you may need to start the Fluentd containers in privileged mode. To achieve the same, set privileged to true in override_values.yaml.
..
..
oci-onm-logan:
  ..
  ..
  privileged: true
Log location: Pod logs of DaemonSet oci-onm-logan
Set privileged to true in override_values.yaml to resolve this.
..
..
oci-onm-logan:
  ..
  ..
  privileged: true
Log location: Pod logs of DaemonSet oci-onm-logan
Sample Error:
2023-10-10 13:00:16 +0000 [warn]: #0 [in_tail_containerlogs] /var/log/containers/kube-flannel-ds-kl9bb_kube-flannel_kube-flannel-c2a954a05c57f4f68bc3ab348f071812be2405c76bd1631890638eac7c503506.log unreadable. It is excluded and would be examined next time.
The default path for docker data (in which the container logs are written) in a typical standalone cluster is /var/lib/docker/containers. You may need to validate this and update containerdataHostPath in override_values.yaml accordingly.
..
..
oci-onm-logan:
  ..
  ..
  volumes:
    ..
    containerdataHostPath: /var/lib/docker/containers
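To confirm the actual path on a node, one option (assuming the Docker CLI is available on that node) is to query the Docker root directory; container logs are then typically under its containers subdirectory:

docker info --format '{{ .DockerRootDir }}'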
AWS EKS control plane logs are available in CloudWatch. Once the control plane log collection is enabled, the logs are directly pulled from CloudWatch and ingested into OCI Logging Analytics for further analysis. Alternatively, the logs can be routed over to S3 and pulled from there.
To collect the logs from CloudWatch directly, modify your override_values.yaml to add the following EKS specific variables. Various other variables are available in the values.yaml file and can be updated as necessary.
..
..
oci-onm-logan:
  ..
  ..
  enableEKSControlPlaneLogs: true
  fluentd:
    ...
    ...
    eksControlPlane:
      region: <aws_region>
      awsStsRoleArn: <role_arn>
If you run into CloudWatch service quotas, you can alternatively route the logs to S3 and collect them from there. The control plane logs in S3 need to be in a specific format for the default log collection to work. Refer to EKS CP Logs Streaming to S3 for instructions on how to configure streaming of control plane logs to S3 and subsequently collect them in OCI Logging Analytics. Once the streaming of logs is set up, modify your override_values.yaml to add the following EKS specific variables. Various other variables are available in the values.yaml file and can be updated as necessary.
..
..
oci-onm-logan:
  ..
  ..
  enableEKSControlPlaneLogs: true
  fluentd:
    ...
    ...
    eksControlPlane:
      collectionType: "s3"
      region: <aws_region>
      awsStsRoleArn: <role_arn>
      s3Bucket: <s3_bucket>