
Make it possible to configure Icinga as a satellite #235

Open
MatrixCrawler opened this issue Oct 24, 2017 · 8 comments

@MatrixCrawler

It would be handy if it were possible to configure the Icinga instance in Searchlight as a satellite of another Icinga master server. The Icinga master server would not necessarily run in the same Kubernetes cluster.

There are CLI parameters to initialize an Icinga node as a satellite:
https://www.icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#node-setup-with-satellitesclients
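
For illustration, here is a minimal sketch of such an invocation (host names, the zone name and the ticket are placeholders, and exact flag names vary between Icinga2 versions, e.g. older releases use --master_host instead of --parent_host):

    # on the master: generate a setup ticket for the new satellite
    icinga2 pki ticket --cn satellite1.example.com

    # on the satellite: register against the master using that ticket
    icinga2 node setup \
      --ticket "$TICKET" \
      --cn satellite1.example.com \
      --zone kubernetes.example.com \
      --endpoint master1.example.com \
      --parent_host master1.example.com \
      --trustedcert /var/lib/icinga2/certs/trusted-parent.crt \
      --accept-config \
      --accept-commands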

I think it would be necessary to pass the configuration parameters to the Searchlight Icinga instance in the deployment configuration or in a ConfigMap.

@tamalsaha
Contributor

Thanks @MatrixCrawler! I am not familiar with the benefits of a master/satellite setup. Can you tell me more about your use case and what you are trying to accomplish?

@punycode

Since this is a feature that I'm interested in too, I will try to elaborate on what it is and why it would be beneficial.

How it works (short version)

Icinga2 has a built-in concept of distributed monitoring. The short version is that Icinga2 can model the following hierarchy:

  • master nodes at the top level, providing topology knowledge (modelled as zones in Icinga2), state persistence (usually the IDO db in Icinga2), notifications (sending emails) & metrics (perfdata in Icinga2)
  • satellite nodes at the leaves or as intermediates, providing definition & execution of checks for a particular zone (this is also done by master nodes in non-zoned setups) and optionally also notifications & metrics
  • agent nodes at the leaves, only providing execution of checks (think of them as the equivalent of NRPE in Nagios)

Each of those setup types (they all use the same Icinga2 binary) defines a zone; zones are arranged hierarchically and report their results upstream. At any level, nodes assigned to the same zone act as a load-balancing/fail-over setup (for example, two master nodes execute checks in a distributed manner).
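
To make the zone/endpoint relationship concrete, here is a minimal sketch of a zones.conf for a one-master/one-satellite hierarchy (all names are placeholders):

    # minimal zones.conf on the satellite: one master zone, one child zone
    cat > /etc/icinga2/zones.conf <<'EOF'
    object Endpoint "master1.example.com" {
      host = "master1.example.com"  // the satellite actively connects to the master
    }

    object Zone "master" {
      endpoints = [ "master1.example.com" ]
    }

    object Endpoint "satellite1.example.com" {
    }

    object Zone "kubernetes.example.com" {
      endpoints = [ "satellite1.example.com" ]
      parent = "master"             // check results are reported upstream to this zone
    }
    EOF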

The usual setup is to have satellites in separate network segments and agents on every leaf node (at least if check_by_ssh or something similar from a master or satellite is not an option).

How do nodes communicate?

Icinga2 follows a "secure by design" approach. The standard setup involves an X.509 certificate authority at the master level that signs certificates for every node. The actual communication uses a JSON-RPC protocol via HTTPS, with the connection initiated by either side, meaning upstream & downstream nodes both try to establish a connection, thereby mitigating problems with NAT and firewalls in general. This is usually done on a well-known port (5665).
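
The listener behind this is Icinga2's ApiListener feature; here is a minimal sketch of enabling it (certificate locations are version-dependent and omitted here):

    # enable the cluster API on the well-known port 5665
    cat > /etc/icinga2/features-available/api.conf <<'EOF'
    object ApiListener "api" {
      bind_port = 5665        // default Icinga2 cluster port
      accept_config = true    // let the parent zone push configuration
      accept_commands = true  // let the parent zone execute commands here
    }
    EOF
    icinga2 feature enable api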

Why do we want this?

If there is already an Icinga2 master setup in place, having Searchlight act as a satellite allows integrating a Kubernetes cluster into an existing monitoring infrastructure at very low overhead. We can define & execute K8S checks inside the cluster, while utilizing the existing notifications (and dashboards, for that matter) on the outside. It wouldn't really be necessary to implement the agent node level in Kubernetes, since it is a cluster and we can monitor it as a whole (though it might be a nice-to-have for appliance-style installations like CoreOS).

I hope this gives some idea of how it works and why it would be nice to have. I will try to be available for further questions. Once I have read into the existing codebase, I may also provide ideas/code on how to achieve this.

@punycode

Initial idea of how to achieve a satellite setup:

  • Allow providing a signed TLS key pair for the satellite via a Secret
  • Spin up a Deployment with a suitable number of replicas for the cluster, using this Secret and an off-the-shelf satellite configuration for a zone named after the entry point (e.g. the public DNS name) of the cluster (kubernetes.example.com)
  • Expose a NodePort service on the default port (5665)

If there is an Ingress controller in place that supports SSL passthrough, we could also expose it via Ingress instead of NodePort.
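
As a sketch, assuming the satellite Deployment above is named searchlight-icinga (a placeholder), the NodePort exposure could look like:

    # expose the satellite's cluster port 5665 as a NodePort service
    kubectl expose deployment searchlight-icinga \
      --name=icinga-satellite \
      --type=NodePort \
      --port=5665 --target-port=5665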

@MatrixCrawler
Author

Thanks @punycode for elaborating on this.

@l13t

l13t commented Apr 6, 2018

Any updates/plans on this?

@tamalsaha
Contributor

tamalsaha commented May 3, 2018

The next release of Searchlight is feature complete. The big things coming in this release are:
(1) a webhook-based plugin for Icinga checks,
(2) using a workqueue (not user visible, but fixes various subtle retry issues),
(3) pause alert (instead of deleting the CRD yaml, you can pause it to temporarily deactivate the check),
(4) alert history is stored as a new Incident CRD.

We can discuss a potential design for satellite support at this time. From @punycode's comment above, the main change seems to be updating the Docker image for Searchlight to provide this extra info and not run Icinga Web in satellite mode, and documenting the process.

Is there a good document that shows the process step by step so that we can replicate it? I found https://www.icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/ . If you are interested in contributing to this feature, this would be a great time. :) Please sign up for our Slack https://slack.appscode.com/ . We can talk about it more there if you want to contribute. If you just have feedback, you can reply here too.

Thank you all for trying Searchlight and for your patience.

@l13t

l13t commented May 3, 2018

@tamalsaha do you have any schema of how you change the Icinga config, or where in the code I can find this?

@tamalsaha
Contributor

tamalsaha commented May 3, 2018

The Icinga Dockerfiles are here: https://github.com/appscode/searchlight/tree/master/hack/docker/icinga/alpine

Any configuration passed to the Icinga container goes via a Secret (--config-secret-name). The currently supported keys are listed here: https://github.com/appscode/searchlight/blob/master/pkg/icinga/configurator.go#L20

The Searchlight pod takes the data from the Secret, fills in anything missing that can be defaulted or auto-generated (e.g. certs), and then writes the Icinga config.ini file. The Icinga container waits until that file is available. https://github.com/appscode/searchlight/blob/master/hack/docker/icinga/alpine/runit.sh#L8

So, any extra parameter we need to pass should follow a similar pattern.
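
For illustration, an extra satellite parameter could ride in through that same Secret; the key names below (ICINGA_MASTER_HOST, ICINGA_ZONE) are hypothetical examples, not keys taken from configurator.go:

    # create the config Secret that searchlight reads via --config-secret-name
    kubectl create secret generic searchlight-config \
      --from-literal=ICINGA_MASTER_HOST=master1.example.com \
      --from-literal=ICINGA_ZONE=kubernetes.example.com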
