DNS LB Alias Setup

reguero edited this page Apr 4, 2022 · 9 revisions

How to add a DNS Load Balanced Alias

The following describes how a user can create a DNS Load Balanced Alias for their service.

  • DNS load balanced aliases can be created using the Ermis self-service GUI at https://aiermis.cern.ch/lbweb
  • The Ermis GUI uses CERN Single Sign On authentication.
  • The GUI uses hostgroup-based authorisation, so any user can perform the operations as long as they are registered as owner of the base hostgroup of the alias, see man ai-pwn.
  • The GUI feeds the alias information to the Ermis REST service, which is used by the Puppet type that generates the configuration of the LBD servers. The Puppet run interval on the LBD servers is currently 30 minutes, so once the alias is created in the GUI it will appear on the LBD servers with a maximum latency of 30 minutes.

Detailed Procedure:

1. CERN SSO Login.

2. Go to Add LB Alias. The display should look as follows:

addingfullalias.png

3. Fill in the desired name for your alias as shown above. If no domain is specified, the domain ".cern.ch" will be added. Note that the alias can also be on a subdomain, like myalias.mydomain.cern.ch.

4. Choose whether your alias will be external, i.e. visible in the CERN external DNS server, or not. Please note that being visible in the CERN external DNS server will not automatically open external access to the LB alias member nodes in the CERN firewall. If needed, please read the section External Access.

5. Provide a hostgroup. Only alias members belonging to the same base level hostgroup will be allowed. Only users that are owners of this hostgroup will be allowed to manage the alias.

6. If needed, add the parameters like the 'canonical name records' and the 'best hosts'.

7. Submit your request.

8. Configure the alias member nodes in Puppet following the section How to define DNS Load Balanced Alias Members.

9. Wait until Puppet runs in the nodes and in the LBD server (max 30 min for the LBD server).
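
As an illustration of the naming rule in step 3, here is a minimal Python sketch of how an alias name could be completed with the default domain. The function name normalize_alias_name and the exact matching rule (append ".cern.ch" unless the name already ends with it) are assumptions made for this example; the actual logic used by the Ermis GUI is not documented on this page.

```python
def normalize_alias_name(name: str, default_domain: str = "cern.ch") -> str:
    """Return the alias with the default domain appended when it is missing.

    Hypothetical helper: it only mirrors the rule described in step 3.
    """
    cleaned = name.strip().rstrip(".")
    if cleaned.endswith("." + default_domain):
        # Already qualified, possibly on a subdomain such as
        # myalias.mydomain.cern.ch, so it is left untouched.
        return cleaned
    return f"{cleaned}.{default_domain}"


if __name__ == "__main__":
    for candidate in ("myalias", "myalias.cern.ch", "myalias.mydomain.cern.ch"):
        print(candidate, "->", normalize_alias_name(candidate))
```

Under these assumptions, a bare name such as myalias becomes myalias.cern.ch, while names that already end in .cern.ch are returned unchanged.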

Other Operations:

As well as creation, the Ermis GUI also allows:

  • Modification with "Modify LB Alias". The display should look as follows: modifyalias.png

  • And deletion with "Delete LB Alias": deletealias.png

  • You can also display the log of your LB alias in monit-timber.cern.ch by selecting it in the "LB Alias Logs" section of the GUI: displaylogalias

    You will be redirected to a timber dashboard. The default dashboard contains graphs showing the members of the alias over time. Clicking on the Server logs link will display the logs related to this particular alias.

    examplealiaslog

A user can perform the modification and deletion operations as long as they are registered as owner of the base hostgroup of the alias, see man ai-pwn. Read operations are for the moment allowed to all authenticated users.

As the mechanics are the same as with creation, the maximal latency of the modifications is also 30 minutes.

How to define DNS Load Balanced Alias Members

In what follows, we describe how a user should configure hosts designated to be behind a specific DNS LB alias.

  • First of all, you have to define the load balanced alias as described in the above section How to add a DNS Load Balanced Alias.
  • Once this is done, alias member nodes may be configured for the LB alias through their Puppet configuration. This can be done either in the manifests or in the hiera data. The two options are as follows:

Defining the alias in puppet

class xxxx::loadbalancing {

  include '::lbclient'

  #Define the alias, and the checks that have to be executed
  lbclient::alias { 'xxxx.cern.ch':
    checks => ['nologin', 'roger', 'sshdaemon', 'tmpfull', 'xsessions', {'type' =>'collectd', 'data' => '[systemd-nscd/gauge-running]>0' }]
  }

  # Checks can be added later on as well:
  lbclient::alias::check { 'second collectd check':
    check => {
      'type' => 'collectd',
      'data' => '[systemd-sshd/gauge-running]>0',
    },
  }

  ...

}

Defining the alias in hiera

class xxxx::loadbalancing {

  include '::lbclient'
...
}
---
lbclient::aliases::definitions:
  xxxx.cern.ch:
    checks:
      - nologin
      - roger
      - sshdaemon
      - type: collectd
        data: '[systemd-nscd/gauge-running]>0'
    loads:
      - type: collectd
        data: '[load/load-relative:shortterm]*125 + 1'

!!! warning When defining the client, remember to use the FQDN for the node (cern.ch included).

An example can be found in the [configuration of LXPLUS in GitLab]({{ aiwwwgitdir }}/it-puppet-hostgroup-lxplus/blob/qa/code/manifests/nodes/login.pp).

The type lbclient::alias is defined in modules/lbclient/manifests/alias.pp. The list under checks is used for health monitoring of the load balanced alias. Each entry in checks can be either a string (for checks that do not require parameters) or a dictionary with the keys type and data. For consistency, checks without parameters can also be given as a dictionary with only type (in other words, checks => ['nologin'] and checks => [{'type' => 'nologin'}] are equivalent).
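
To make the equivalence between the string form and the dictionary form concrete, the following Python sketch converts a mixed checks list into dictionaries only. It is an illustration, not the lbclient implementation; the function name normalize_checks is an assumption introduced for this example.

```python
from typing import Dict, List, Union

# A check is either a bare name or a dictionary with "type" and optional "data".
Check = Union[str, Dict[str, str]]


def normalize_checks(checks: List[Check]) -> List[Dict[str, str]]:
    """Return every check in dictionary form (hypothetical helper)."""
    normalized = []
    for check in checks:
        if isinstance(check, str):
            # 'nologin' is treated as shorthand for {'type': 'nologin'}.
            normalized.append({"type": check})
        elif isinstance(check, dict) and "type" in check:
            normalized.append(dict(check))
        else:
            raise ValueError(f"unsupported check definition: {check!r}")
    return normalized


if __name__ == "__main__":
    mixed = [
        "nologin",
        {"type": "collectd", "data": "[systemd-nscd/gauge-running]>0"},
    ]
    print(normalize_checks(mixed))
```

With this normalisation, ['nologin'] and [{'type': 'nologin'}] produce the same result, which is the equivalence described above.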

Here follow the details:

  • If nologin, the existence of /etc/iss.nologin or /etc/nologin is checked, and the machine will be removed from the load balanced alias when either file exists.
  • If sshdaemon, the machine will be removed from the load balanced alias when sshd (the daemon on port 22) is not running.
  • If tmpfull, the machine will be removed from the load balanced alias when /tmp is full.
  • If ftpdaemon, the machine will be removed from the load balanced alias when ftpd (the daemon on port 21) is not running.
  • If gridftpdaemon, the machine will be removed from the load balanced alias when gridftpd (the daemon on port 2811) is not running.
  • If webdaemon, the machine will be removed from the load balanced alias when httpd (the daemon on port 80) is not running.
  • If xsessions, the lbclient will take into account how many X window managers (GNOME, KDE, FVWM) are running in the built-in load metric calculation.
  • If swaping, the lbclient will take into account whether the node is swapping (averaged over 2 seconds) in the built-in load metric calculation.
  • If afs, the machine will be removed from the load balanced alias when AFS is not running (checked by running stat on entries in /afs/cern.ch/user/).
  • If eos, the machine will be removed from the load balanced alias when one of the mounted EOS filesystems returns an error ('Transport endpoint is not connected' or 'Operation not supported') or a timeout to the 'eosxd get eos.mgmurl' command.
  • If roger, the machine will be removed from the load balanced alias when the Roger appstate is not 'production' (checked by reading /etc/roger/current.yaml, which is updated by the CERNMegabus service).
  • If {type => command, data => <scriptname> }, the lbclient will run the program <scriptname> and the machine will be removed from the load balanced alias when the return code is != 0. Please note that the lbclient risks timing out if the program <scriptname> takes more than 3 seconds to respond.
  • If {type => collectd, data => <collectdexpression>}, the machine will be removed from the load balanced alias when the <collectdexpression> is false.
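
The semantics of two of the checks above (nologin and command) can be sketched in Python as follows. This is not the lbclient code: the function names are hypothetical, and treating a timeout or a missing script as a failed check is an assumption, since this page only says that the lbclient risks timing out after 3 seconds.

```python
import os
import subprocess
from typing import Iterable, Sequence

NOLOGIN_FILES = ("/etc/iss.nologin", "/etc/nologin")
COMMAND_TIMEOUT_SECONDS = 3  # limit mentioned for the command check


def nologin_ok(paths: Iterable[str] = NOLOGIN_FILES) -> bool:
    """Pass only when none of the nologin marker files exists."""
    return not any(os.path.exists(path) for path in paths)


def command_ok(argv: Sequence[str], timeout: float = COMMAND_TIMEOUT_SECONDS) -> bool:
    """Pass only when the program exits with return code 0 within the timeout.

    Assumption: a timeout or a program that cannot be started counts as a failure.
    """
    try:
        result = subprocess.run(
            list(argv),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def node_is_healthy(results: Iterable[bool]) -> bool:
    """The node stays in the alias only if every check passed."""
    return all(results)


if __name__ == "__main__":
    # "/usr/local/bin/my-check" is a placeholder script name.
    checks = [nologin_ok(), command_ok(["/usr/local/bin/my-check"])]
    print("healthy" if node_is_healthy(checks) else "remove from alias")
```

A custom check script should therefore return 0 quickly when the service is usable and a non-zero code otherwise.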

Verify that your DNS Load Balanced Alias Member definition is visible in PuppetDB

The DNS Load Balanced Alias Member definition is only visible to the LBD server once the Lbd::Client resource describing it is stored in PuppetDB.

Please note that a Puppet run of the alias member node is needed to put the relevant Lbd::Client resource in PuppetDB. This has to be followed by a Puppet run of the LBD server node to read it. As Puppet runs every 30 minutes on the LBD server, this means a delay of up to one hour after the alias definition is pushed.

You can verify that the Lbd::Client resource with the information of your node is stored in PuppetDB by running a command like the following on aiadm:

[joe@aiadm ~]$ ai-pdb raw /pdb/query/v4/nodes/<nodename>.cern.ch/resources/Lbd::Client

For instance,

[joe@aiadm ~]$ ai-pdb raw /pdb/query/v4/nodes/dashb-ai-641.cern.ch/resources/Lbd::Client

would produce output like the following:


{
  "certname": "dashb-ai-641.cern.ch",
  "environment": "production",
  "exported": false,
  "file": "/mnt/puppetnfsdir/environments/production/hostgroups/hg_dashboard/manifests/modules/load/alias.pp",
  "line": 6,
  "parameters": {
    "clienthostgroup": "dashboard/web_server/rucio/production",
    "comment": "dashb-atlas-ddm.cern.ch lb alias",
    "lbalias": "dashb-atlas-ddm.cern.ch"
  },
  "resource": "9ad9fd6e95cc0630ca85a320cacae7d2956c0747",
  "tags": [
    "hg_dashboard::modules::load",
    "class",
    "modules",
    "load",
    "hg_dashboard",
    "hg_dashboard::modules::load::alias",
    "lbd",
    "alias",
    "lbd::client",
    "client",
    "dashb-atlas-ddm"
  ],
  "title": "dashb-atlas-ddm.cern.ch lb alias",
  "type": "Lbd::Client"
}
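
If you want to check such output programmatically, a minimal Python sketch could look like the following. It assumes that the JSON text returned by ai-pdb has been captured as a string; the function name find_lbd_clients is hypothetical, and the sketch accepts either a single resource object (as shown above) or a list of them, since the exact shape may differ.

```python
import json
from typing import Any, Dict, List


def find_lbd_clients(raw_output: str, alias: str) -> List[Dict[str, Any]]:
    """Return the Lbd::Client resources in raw_output that point at alias.

    Hypothetical helper: raw_output is the JSON text printed by the query above.
    """
    data = json.loads(raw_output)
    resources = data if isinstance(data, list) else [data]
    return [
        resource
        for resource in resources
        if resource.get("type") == "Lbd::Client"
        and resource.get("parameters", {}).get("lbalias") == alias
    ]


if __name__ == "__main__":
    sample = json.dumps(
        {
            "certname": "dashb-ai-641.cern.ch",
            "parameters": {"lbalias": "dashb-atlas-ddm.cern.ch"},
            "type": "Lbd::Client",
        }
    )
    matches = find_lbd_clients(sample, "dashb-atlas-ddm.cern.ch")
    print("found" if matches else "not yet in PuppetDB")
```

An empty result suggests that the member node has not yet completed a Puppet run with the alias definition, or that the alias name does not match the FQDN used in the definition.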

SLS Display of the LB Alias History

If you go to https://cern.service-now.com/service-portal/sls?showall=true and hover over the DNS Load Balancing entry you will see the list of aliases that are currently defined. If you click on yours, you will go to a dashboard in meter.cern.ch with the log for your LB alias. In the case of LB alias xxxx the dashboard is:

https://timber.cern.ch/public/_plugin/kibana/#/dashboard/temp/LBD::lbalias?query=xxxx.cern.ch

!!! note You can verify the recent history of your alias by checking in the dashboard above.

You can also select the dashboard in meter.cern.ch with the log for your LB alias in the LB Alias Logs section of the self-service GUI at https://aiermis.cern.ch/lbweb/.

Please note that the Puppet LBD servers use fully qualified names, so you will find only fully qualified names in the logs.
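
Because the logs use fully qualified names, the dashboard query needs the full alias name. The Python sketch below builds the dashboard URL from the pattern shown earlier in this section; the function name dashboard_url and the rule for appending ".cern.ch" are assumptions made for this example.

```python
DASHBOARD_BASE = (
    "https://timber.cern.ch/public/_plugin/kibana/#/dashboard/temp/LBD::lbalias"
)


def dashboard_url(alias: str, default_domain: str = "cern.ch") -> str:
    """Build the log dashboard URL for an alias, using its fully qualified name.

    Hypothetical helper based on the URL pattern quoted in this section.
    """
    name = alias.strip().rstrip(".")
    if not name.endswith("." + default_domain):
        # The logs only contain fully qualified names, so qualify short names.
        name = f"{name}.{default_domain}"
    return f"{DASHBOARD_BASE}?query={name}"


if __name__ == "__main__":
    print(dashboard_url("xxxx"))
```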
