Agent claiming

Agent claiming allows a Netdata Agent, running on a distributed node, to securely connect to Netdata Cloud. A Space's administrator creates a claiming token, which is used to add an Agent to their Space via the Agent-Cloud link (ACLK).

Are you just starting out with Netdata Cloud? See our get started with Cloud guide for a walkthrough of the process and simplified instructions.

Claiming nodes is a security feature in Netdata Cloud. Through the process of claiming, you demonstrate in a few ways that you have administrative access to that node and the configuration settings for its Agent. By logging into the node, you prove you have access, and by using the claiming script or the Netdata command line, you prove you have write access and administrative privileges.

Only the administrators of a Space in Netdata Cloud can view the claiming token and accompanying script generated by Netdata Cloud.

The claiming process ensures no third party can add your node, and then view your node's metrics, in a Cloud account, Space, or War Room that you did not authorize.

By claiming a node, you opt-in to sending data from your Agent to Netdata Cloud via the ACLK. This data is encrypted by TLS while it is in transit. We use the RSA keypair created during claiming to authenticate the identity of the Agent when it connects to the Cloud. While the data does flow through Netdata Cloud servers on its way from Agents to the browser, we do not store or log it.

You can claim a node during the Netdata Cloud onboarding process, or after you have created a Space, by clicking on Claim Nodes in the Spaces management area.

There are two important notes regarding claiming:

  • You can only claim any given node in a single Space. You can, however, add that claimed node to multiple War Rooms within that one Space.
  • You must repeat the claiming process on every node you want to add to Netdata Cloud.

How to claim a node

To claim a node, select which War Rooms you want to add this node to with the dropdown, then copy and paste the script given by Cloud into your node's terminal. Hit Enter.

sudo netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud

The script should return "Agent was successfully claimed." If the claiming script returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information. If you prefer not to use root privileges via sudo to run the claiming script, see the next section.

Repeat this process with every node you want to add to Cloud during onboarding. You can also add more nodes once you've finished onboarding.

Claim an Agent without root privileges

If you don't want to run the claiming script with root privileges, you can discover which user is running the Agent, switch to that user, and run the claiming script.

Use grep to search your netdata.conf file, which is typically located at /etc/netdata/netdata.conf, for the run as user setting. For example:

grep "run as user" /etc/netdata/netdata.conf 
    # run as user = netdata

The default user is netdata. Yours may be different, so pay attention to the output from grep. Switch to that user and run the claiming script.

netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud

Hit Enter. The script should return "Agent was successfully claimed." If the claiming script returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information.
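The user lookup described above can also be scripted. The following is a minimal sketch, not something shipped with Netdata: the helper name netdata_run_user is hypothetical, and it assumes the setting appears as a "run as user = NAME" line, either commented out (the default) or active, as in the grep output shown earlier.

```shell
# Hypothetical helper: print the user the Agent runs as, according to a
# netdata.conf file. Falls back to "netdata" (the default) when the file is
# unreadable or the setting is absent.
netdata_run_user() {
    conf="${1:-/etc/netdata/netdata.conf}"
    user=""
    if [ -r "$conf" ]; then
        # Accept "run as user = x" and the commented default "# run as user = x".
        user=$(sed -n 's/^[[:space:]]*#\{0,1\}[[:space:]]*run as user[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf" | head -n 1)
    fi
    printf '%s\n' "${user:-netdata}"
}

# Example (not executed here):
#   echo "Agent runs as: $(netdata_run_user)"
```

If an active line and a commented line both exist, this sketch simply returns whichever comes first, so check the file by hand when the result looks surprising.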

Claim an Agent running in Docker

To claim an instance of the Netdata Agent running inside of a Docker container, either set claiming environment variables in the container to have it automatically claimed on startup or restart, or use docker exec to manually claim an already running container.

For claiming to work, the contents of /var/lib/netdata must be preserved across container restarts using a persistent volume. See our recommended docker run and Docker Compose examples for details.

Using environment variables

The Netdata Docker container looks for the following environment variables on startup:

  • NETDATA_CLAIM_TOKEN
  • NETDATA_CLAIM_URL
  • NETDATA_CLAIM_ROOMS
  • NETDATA_CLAIM_PROXY

If the token and URL are specified in their corresponding variables and the container is not already claimed, it will use these values to attempt to claim the container, automatically adding the node to the specified War Rooms. If a proxy is specified, it will be used for the claiming process and for connecting to Netdata Cloud.

These variables can be specified using any mechanism supported by your container tooling for setting environment variables inside containers. For example, when creating a new Netdata container using docker run, the following modified version of the command can be used to set the variables:

docker run -d --name=netdata \
  -p 19999:19999 \
  -v netdatalib:/var/lib/netdata \
  -v netdatacache:/var/cache/netdata \
  -v /etc/passwd:/host/etc/passwd:ro \
  -v /etc/group:/host/etc/group:ro \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v /etc/os-release:/host/etc/os-release:ro \
  -e NETDATA_CLAIM_TOKEN=TOKEN \
  -e NETDATA_CLAIM_URL="https://app.netdata.cloud" \
  -e NETDATA_CLAIM_ROOMS=ROOM1,ROOM2 \
  --restart unless-stopped \
  --cap-add SYS_PTRACE \
  --security-opt apparmor=unconfined \
  netdata/netdata

The output that the claiming script prints when you use the other methods appears in the container logs instead.

Using the environment variables like this to handle claiming is the preferred method of claiming Docker containers as it works in the widest variety of situations and simplifies configuration management.
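Because the container only attempts to claim itself when both the token and the URL are set, it can help to check them in the shell before starting the container. The following is a minimal sketch under that assumption; the function name check_claim_env is hypothetical and is not part of Netdata or Docker.

```shell
# Hypothetical pre-flight check: report which of the two required claiming
# variables are unset or empty. NETDATA_CLAIM_ROOMS and NETDATA_CLAIM_PROXY
# are not checked because the container treats them as optional.
check_claim_env() {
    missing=""
    [ -n "${NETDATA_CLAIM_TOKEN:-}" ] || missing="$missing NETDATA_CLAIM_TOKEN"
    [ -n "${NETDATA_CLAIM_URL:-}" ] || missing="$missing NETDATA_CLAIM_URL"
    if [ -n "$missing" ]; then
        echo "missing:$missing" >&2
        return 1
    fi
    return 0
}

# Example (not executed here):
#   check_claim_env && echo "ready to start the container"
```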

Using docker exec

Claim a running Netdata Agent container by appending the script offered by Cloud to a docker exec ... command, replacing netdata with the name of your running container:

docker exec -it netdata netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud

The script should return "Agent was successfully claimed." If the claiming script returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information.

Claim a Kubernetes cluster's parent Netdata pod

Read our Kubernetes installation guide for details on claiming a parent Netdata pod.

Claim through a proxy

A Space's administrator can claim a node through a SOCKS5 or HTTP(S) proxy.

You should first configure the proxy in the [cloud] section of netdata.conf. The proxy settings you specify here will also be used to tunnel the ACLK. The default proxy setting is none.

[cloud]
    proxy = none

The proxy setting can take one of the following values:

  • none: Do not use a proxy, even if the system is configured otherwise.
  • env: Try to read proxy settings from the environment variables http_proxy/socks_proxy, if set.
  • socks5[h]://[user:pass@]host:port: The ACLK and claiming will use the specified SOCKS5 proxy.
  • http://[user:pass@]host:port: The ACLK and claiming will use the specified HTTP(S) proxy.

For example, a SOCKS5 proxy setting may look like the following:

[cloud]
    proxy = socks5h://203.0.113.0:1080       # With an IP address
    proxy = socks5h://proxy.example.com:1080 # With a hostname
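Before writing a value into netdata.conf, you may want to confirm that it has one of the shapes listed above. The following is a rough sketch of such a check; valid_proxy_setting is a hypothetical helper, and it assumes the part after the host is a numeric port. It validates only the shape of the string and does not contact the proxy.

```shell
# Hypothetical validator for the [cloud] proxy setting.
# Accepts: none, env, socks5://, socks5h:// and http:// URLs that end in
# host:port, with an optional user:pass@ prefix.
valid_proxy_setting() {
    case "$1" in
        none|env) return 0 ;;
        socks5://*|socks5h://*|http://*) ;;
        *) return 1 ;;
    esac
    rest="${1#*://}"     # strip the scheme
    rest="${rest##*@}"   # strip the optional user:pass@ part
    host="${rest%:*}"    # everything before the last colon
    port="${rest##*:}"   # everything after the last colon
    # A missing colon leaves host equal to rest, which is rejected here.
    [ -n "$host" ] && [ "$host" != "$rest" ] || return 1
    case "$port" in
        ''|*[!0-9]*) return 1 ;;
    esac
    return 0
}

# Example (not executed here):
#   valid_proxy_setting socks5h://203.0.113.0:1080 && echo "looks valid"
```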

You can now move on to claiming. When you claim with the netdata-claim.sh script, add the -proxy= parameter and append the same proxy setting you added to netdata.conf.

sudo netdata-claim.sh -token=MYTOKEN1234567 -rooms=room1,room2 -url=https://app.netdata.cloud -proxy=socks5h://203.0.113.0:1080

Hit Enter. The script should return "Agent was successfully claimed." If the claiming script returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information.

Troubleshooting

If you're having trouble claiming a node, this may be because the ACLK cannot connect to Cloud.

With the Netdata Agent running, visit http://NODE:19999/api/v1/info in your browser, replacing NODE with the IP address or hostname of your Agent. The returned JSON contains four keys that will be helpful to diagnose any issues you might be having with the ACLK or claiming process.

	"cloud-enabled"
	"cloud-available"
	"agent-claimed"
	"aclk-available"

Use these keys and the information below to troubleshoot the ACLK.
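If you prefer the terminal to a browser, the four keys can be pulled out of the response with standard tools. The following is a minimal sketch: claim_status and first_false_key are hypothetical helper names, the keys are assumed to hold plain JSON booleans, and fetching the document with curl is an assumption about what is installed on your system.

```shell
# Hypothetical helper: read the /api/v1/info JSON on stdin and print the four
# claiming-related keys, one "key=value" pair per line.
claim_status() {
    tr -d ' \t\r\n' |
        grep -oE '"(cloud-enabled|cloud-available|agent-claimed|aclk-available)":(true|false)' |
        sed 's/"//g; s/:/=/'
}

# Print the first key (in the order used by this guide) whose value is false,
# or nothing if none of them is false.
first_false_key() {
    pairs=$(claim_status)
    for key in cloud-enabled cloud-available agent-claimed aclk-available; do
        case "$pairs" in
            *"$key=false"*) printf '%s\n' "$key"; return 0 ;;
        esac
    done
    return 0
}

# Example (not executed here; NODE is a placeholder for your Agent's address):
#   curl -s http://NODE:19999/api/v1/info | first_false_key
```

The key printed by first_false_key tells you which of the troubleshooting sections to read first.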

bash: netdata-claim.sh: command not found

If you run the claiming script and see a command not found error, you either installed Netdata in a non-standard location or are using an unsupported package. If you installed Netdata in a non-standard path using the --install option, you need to update your $PATH or run netdata-claim.sh using the full path. For example, if you installed Netdata to /opt/netdata, use /opt/netdata/bin/netdata-claim.sh to run the claiming script.

If you are using an unsupported package, such as a third-party .deb/.rpm package provided by your distribution, please remove that package and reinstall using our recommended kickstart script.
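For the non-standard install path case, the lookup can be sketched as a small shell function. This is an illustration only: find_claim_script is a hypothetical name, and the only layout it assumes is the PREFIX/bin/netdata-claim.sh location from the /opt/netdata example above.

```shell
# Hypothetical helper: print the path of netdata-claim.sh, looking first on
# $PATH and then under each install prefix given as an argument.
find_claim_script() {
    if command -v netdata-claim.sh >/dev/null 2>&1; then
        command -v netdata-claim.sh
        return 0
    fi
    for prefix in "$@"; do
        if [ -x "$prefix/bin/netdata-claim.sh" ]; then
            printf '%s\n' "$prefix/bin/netdata-claim.sh"
            return 0
        fi
    done
    return 1
}

# Example (not executed here):
#   find_claim_script /opt/netdata
```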

Claiming on older distributions (Ubuntu 14.04, Debian 8, CentOS 6)

If you're running an older Linux distribution or one that has reached EOL, such as Ubuntu 14.04 LTS, Debian 8, or CentOS 6, your Agent may not be able to securely connect to Netdata Cloud due to an outdated version of OpenSSL. These old versions of OpenSSL cannot perform hostname validation, which verifies that the certificate presented by the server matches the hostname the Agent intended to reach.

We recommend you reinstall Netdata with a static build, which uses an up-to-date version of OpenSSL with hostname validation enabled.

If you choose to continue using the outdated version of OpenSSL, your node will still connect to Netdata Cloud, albeit with hostname verification disabled. Without verification, your Netdata Cloud connection could be vulnerable to man-in-the-middle attacks.

cloud-enabled is false

If cloud-enabled is false, you probably ran the installer with the --disable-cloud option.

Additionally, check that the enabled setting in /var/lib/netdata/cloud.d/cloud.conf is set to true:

[global]
    enabled = true

To fix this issue, reinstall Netdata using your preferred method and do not add the --disable-cloud option.
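To read the current value without opening the file, something like the following sketch may help. The function name cloud_enabled_setting is hypothetical, and the code assumes the setting sits in the [global] section as shown above.

```shell
# Hypothetical helper: print the value of "enabled" from the [global] section
# of a cloud.conf file, or nothing if the setting is absent.
cloud_enabled_setting() {
    conf="${1:-/var/lib/netdata/cloud.d/cloud.conf}"
    [ -r "$conf" ] || return 1
    awk '
        # Track whether the current section header is [global].
        /^[ \t]*\[/ { in_global = ($0 ~ /^[ \t]*\[global\]/); next }
        in_global && /^[ \t]*enabled[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop "enabled = "
            sub(/[ \t]+$/, "")         # drop trailing whitespace
            print
            exit
        }
    ' "$conf"
}

# Example (not executed here):
#   cloud_enabled_setting /var/lib/netdata/cloud.d/cloud.conf
```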

cloud-available is false

If cloud-available is false after you verified Cloud is enabled in the previous step, the most likely issue is that Cloud features failed to build during installation.

If Cloud features fail to build, the installer continues and finishes the process without Cloud functionality as opposed to failing the installation altogether. We do this to ensure the Agent will always finish installing.

If you can't see an explicit error in the installer's output, you can run the installer with the --require-cloud option. This option causes the installation to fail if Cloud functionality can't be built and enabled, and the installer's output should give you more error details.

You may see one of the following error messages during installation:

  • Failed to build libmosquitto. The install process will continue, but you will not be able to connect this node to Netdata Cloud.
  • Unable to fetch sources for libmosquitto. The install process will continue, but you will not be able to connect this node to Netdata Cloud.
  • Failed to build libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.
  • Unable to fetch sources for libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.
  • Could not find cmake, which is required to build libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.
  • Could not find cmake, which is required to build JSON-C. The install process will continue, but Netdata Cloud support will be disabled.
  • Failed to build JSON-C. Netdata Cloud support will be disabled.
  • Unable to fetch sources for JSON-C. Netdata Cloud support will be disabled.

One common cause of the installer failing to build Cloud features is a missing dependency on your system: cmake or OpenSSL (including its devel package).
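A quick way to see whether the build tools are present at all is to look for them on your $PATH. The sketch below does only that. The helper name missing_commands is hypothetical, and the check cannot tell whether development headers such as the OpenSSL devel package are installed, so confirm those with your distribution's package manager.

```shell
# Hypothetical helper: print each named command that is not found on $PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example (not executed here):
#   missing_commands cmake openssl
```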

You can also look for error messages in /var/log/netdata/error.log. Try one of the following two commands to search for ACLK-related errors.

less /var/log/netdata/error.log
grep -i ACLK /var/log/netdata/error.log

If the installer's output does not help you enable Cloud features, contact us by creating an issue on GitHub with details about your system and relevant output from error.log.

agent-claimed is false

You must claim your node.

aclk-available is false

If aclk-available is false and all other keys are true, your Agent is having trouble connecting to the Cloud through the ACLK. Please check your system's firewall.

If your Agent needs to use a proxy to access the internet, you must set up a proxy for claiming.

If you are certain firewall and proxy settings are not the issue, you should consult the Agent's error.log at /var/log/netdata/error.log and contact us by creating an issue on GitHub with details about your system and relevant output from error.log.

Remove and reclaim a node

To remove a node from your Space in Netdata Cloud, delete the cloud.d/ directory in your Netdata library directory.

cd /var/lib/netdata   # Replace with your Netdata library directory, if not /var/lib/netdata/
sudo rm -rf cloud.d/

This node no longer has access to the credentials it was claimed with and cannot connect to Netdata Cloud via the ACLK. You will still be able to see this node in your War Rooms in an unreachable state.

If you want to reclaim this node into a different Space, you need to create a new identity by adding -id=$(uuidgen) to the claiming script parameters. Make sure the uuidgen command is available first; on Debian and Ubuntu it is provided by the uuid-runtime package. For example, using the default claiming script:

sudo netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud -id=$(uuidgen)

The Agent must be restarted after this change.

Claiming reference

In the sections below, you can find reference material for the claiming script, claiming via the Agent's command line tool, and details about the files found in cloud.d.

The cloud.conf file

This section defines how and whether your Agent connects to Netdata Cloud using the ACLK.

| setting | default | info |
|---------|---------|------|
| cloud base url | https://app.netdata.cloud | The URL for the Netdata Cloud web application. You should not change this. If you want to disable Cloud, change the enabled setting. |
| enabled | yes | The runtime option to disable the Agent-Cloud link and prevent your Agent from connecting to Netdata Cloud. |

Claiming script

A Space's administrator can claim an Agent by directly calling the netdata-claim.sh script either with root privileges using sudo, or as the user running the Agent (typically netdata), and passing the following arguments:

-token=TOKEN
    where TOKEN is the Space's claiming token.
-rooms=ROOM1,ROOM2,...
    where ROOMX is the War Room this node should be added to. This list is optional.
-url=URL_BASE
    where URL_BASE is the Netdata Cloud endpoint base URL. By default, this is https://app.netdata.cloud.
-id=AGENT_ID
    where AGENT_ID is the unique identifier of the Agent. This is the Agent's MACHINE_GUID by default.
-hostname=HOSTNAME
    where HOSTNAME is the result of the hostname command by default.
-proxy=PROXY_URL
    where PROXY_URL is the endpoint of a SOCKS5 or HTTP(S) proxy.

For example, the following command claims an Agent and adds it to rooms room1 and room2:

netdata-claim.sh -token=MYTOKEN1234567 -rooms=room1,room2

You should then notify the netdata service of the result with netdatacli:

netdatacli reload-claiming-state

This reloads the Agent claiming state from disk.

Netdata Agent command line

If a Netdata Agent is running, the Space's administrator can claim a node using the netdata service binary with additional command line parameters:

-W "claim -token=TOKEN -rooms=ROOM1,ROOM2"

For example:

/usr/sbin/netdata -D -W "claim -token=MYTOKEN1234567 -rooms=room1,room2"

If need be, the user can override the Agent's defaults by providing additional arguments like those described here.

Claiming directory

Netdata stores the Agent's claiming-related state in the Netdata library directory under cloud.d. For a default installation, this directory exists at /var/lib/netdata/cloud.d. The directory and its files should be owned by the user that runs the Agent, which is typically the netdata user.

The cloud.d/token file should contain the claiming token, and the cloud.d/rooms file should contain the list of War Rooms you added that node to.

The user can also put the Cloud endpoint's full certificate chain in cloud.d/cloud_fullchain.pem so that the Agent can trust the endpoint if necessary.
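The ownership requirement described above can be checked with find. The following is a small sketch rather than an official tool: claim_dir_wrong_owner is a hypothetical name, and its defaults assume the standard /var/lib/netdata/cloud.d location and the netdata user.

```shell
# Hypothetical helper: list entries in the claiming directory that are not
# owned by the expected user. Prints nothing when ownership is correct.
claim_dir_wrong_owner() {
    dir="${1:-/var/lib/netdata/cloud.d}"
    owner="${2:-netdata}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    find "$dir" ! -user "$owner"
}

# Example (not executed here):
#   claim_dir_wrong_owner /var/lib/netdata/cloud.d netdata
```

Any path this prints is a candidate for a chown to the user that runs the Agent.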
