
Shared data based xCAT MN HA mini design

Yuan Bai edited this page May 23, 2018 · 18 revisions

General

Two scripts, xcatha-setup and xcatha-failover, automate most of the functions needed in the HA management node (MN) use case.

xcatha-setup: xCAT is currently installed by xcat-automation, which uses Ansible. The xcatha-setup script fills the gap for setting up and configuring an HA MN. It can run standalone, and xcat-automation can integrate it easily, so that in the future xcat-automation can be used to set up and configure HA MN nodes.

xcatha-failover: activates or deactivates an HA MN.

User interface:

  1. xcatha-setup -p <shared-data directory path> -i <nic> -v <virtual ip> [-m <netmask>] [-n <hostname>] [-t <database type>]

  2. activate MN: xcatha-failover -a|--activate -p <shared-data directory path> -i <nic> -v <virtual ip> [-m <netmask>] [-t <database type>]

  3. deactivate MN: xcatha-failover -d|--deactivate -i <nic> -v <virtual ip>
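
The xcatha-failover usage lines above could be parsed along the following lines. This is a minimal sketch, not the actual script: the `dest` names (`path`, `nic`, `vip`, `netmask`, `dbtype`) and the helper names are assumptions made for illustration, and only the flags listed above are defined.

```python
import argparse


def build_failover_parser():
    """Build a parser that mirrors the xcatha-failover usage lines."""
    parser = argparse.ArgumentParser(prog="xcatha-failover")
    # Exactly one of -a|--activate or -d|--deactivate must be given.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-a", "--activate", action="store_true")
    mode.add_argument("-d", "--deactivate", action="store_true")
    # The dest names below are hypothetical, chosen for readability.
    parser.add_argument("-p", dest="path", help="shared-data directory path")
    parser.add_argument("-i", dest="nic", required=True, help="nic")
    parser.add_argument("-v", dest="vip", required=True, help="virtual ip")
    parser.add_argument("-m", dest="netmask", help="netmask")
    parser.add_argument("-t", dest="dbtype", help="database type")
    return parser


def parse_failover_args(argv):
    """Parse argv and enforce that activation needs the shared-data path."""
    parser = build_failover_parser()
    args = parser.parse_args(argv)
    if args.activate and not args.path:
        parser.error("-p is required with -a|--activate")
    return args
```

For example, `parse_failover_args(["-d", "-i", "eth0", "-v", "10.0.0.100"])` selects deactivation, where `eth0` and `10.0.0.100` are placeholder values.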

Workflow:

  1. xcatha-setup: set up and configure an HA MN:

    1. check that the virtual IP is not in use (ping); if it is, exit with an error message
    2. check whether the shared data directories already contain xCAT data; if they do and the DB type stored there differs from the target type given on the command line, print an error and exit
    3. add the virtual IP to the NIC
    4. set the hostname for the virtual IP
    5. check whether xCAT is installed:
      1. if xCAT is not installed, install it
      2. if xCAT is installed, skip this step
    6. if the current xCAT DB type (lsxcatd -d) differs from the target type, switch the DB to the target type
    7. check that the master and nameservers attributes in the site table and the tftpserver attribute in the networks table are set to the virtual IP; if not, correct them (this is a separate topic, see the doc "Changing the hostname/IP address")
    8. check whether the shared data directories already contain xCAT data:
      1. if they do not, and the shared data directory permissions are suitable, create the xCAT data structure under the shared data directory and copy the xCAT data into it, taking the /install directory as an example:
        mkdir /HA-data/install
        cp -r /install/. /HA-data/install/
        
      2. create symbolic links to the shared data directories
    9. check that the xCAT services work properly; if not, print an error and exit
    10. add the MN, under both its VIP hostname and its local IP hostname, to the policy table
    11. deactivate this MN with xcatha-failover
  2. xcatha-failover -a|--activate

    1. check that the virtual IP is not in use (ping); if it is, exit
    2. add the virtual IP to the NIC
    3. set the hostname to the one that resolves to the virtual IP
    4. check that the current DB type matches; if not, clean up the environment and exit
    5. create symbolic links to the shared data directories, for example:
      /install -> /HA-data/install
      /etc/xcat -> /HA-data/etc/xcat
      /root/.xcat -> /HA-data/root/.xcat
      /var/lib/pgsql -> /HA-data/var/lib/pgsql
      /tftpboot -> /HA-data/tftpboot
      
    6. start or reconfigure the following related services, and make sure none of them is configured to start automatically on reboot:
      1. database (mysql/postgresql/sqlite type)
      2. xcatd
      3. named service (makedns -n)
      4. DHCP service (makedhcp -n, makedhcp -a)
      5. Console Server
      6. ... ...
  3. xcatha-failover -d|--deactivate

    1. make sure the following related services are down, and that none of them is configured to start automatically on reboot:
      1. console service
      2. DHCP service
      3. named service
      4. xcatd
      5. database (mysql/postgresql/sqlite type)
    2. unmount/unlink the shared data directories on this MN
    3. change hostname if needed
    4. remove virtual IP
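
The service lists in the activate and deactivate workflows read as mirror images of each other, which suggests that services are stopped in the reverse of the order in which they are started. A minimal sketch of that idea follows. The labels in `START_ORDER` are illustrative names taken from the lists above, not real service unit names, and `apply_in_order` and its `action` callback are hypothetical.

```python
# Labels copied from the activate workflow; they are illustrative only.
START_ORDER = ["database", "xcatd", "named", "dhcp", "console server"]


def stop_order(start_order):
    """Return the services in the reverse of their start order."""
    return list(reversed(start_order))


def apply_in_order(services, action):
    """Call action(service) for each service and stop at the first failure.

    action is a hypothetical callback that returns True on success.
    Returns the name of the failing service, or None if all succeeded.
    """
    for service in services:
        if not action(service):
            return service
    return None
```

Returning the failing service name lets the caller decide whether to clean up the environment before exiting.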

Function modules (for reference only)

vip_check:

  1. check whether the VIP is in use (for example with ping); if it is, print an error and exit with status 1
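
A minimal sketch of this check, assuming a Linux `ping` that accepts `-c` (count) and `-W` (timeout in seconds). The `runner` parameter is a hypothetical hook so the logic can be exercised without sending packets, and the function returns the exit status rather than exiting itself.

```python
import subprocess


def vip_in_use(vip, runner=subprocess.call):
    """Return True if the address answers a single ping."""
    # -c 1: send one echo request; -W 2: wait at most 2 seconds for a reply.
    rc = runner(["ping", "-c", "1", "-W", "2", vip],
                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return rc == 0


def vip_check(vip, runner=subprocess.call):
    """Return 1 (and print an error) if the VIP is in use, otherwise 0."""
    if vip_in_use(vip, runner):
        print("Error: virtual IP %s is already in use" % vip)
        return 1
    return 0
```

A caller could pass the result to `sys.exit()` to get the "exit 1" behaviour described above. Note that a host that drops ICMP would not be detected by this approach.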

configure_vip:

  1. configure the virtual IP as a non-persistent alias IP address; there is no need to write ifcfg-* files
  2. add the alias IP address to /etc/resolv.conf as a nameserver
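
One way to sketch both steps is to build the `ip addr` commands and the new resolv.conf text without touching the system. This assumes the iproute2 `ip` tool is what adds the non-persistent address. The function names are hypothetical, and placing the new nameserver line first is an assumption, since the design only says the address is added.

```python
import ipaddress


def vip_commands(vip, netmask, nic):
    """Return (add, delete) iproute2 commands for a non-persistent address."""
    # ip_interface accepts "address/netmask" and exposes the prefix length.
    iface = ipaddress.ip_interface("%s/%s" % (vip, netmask))
    cidr = "%s/%d" % (iface.ip, iface.network.prefixlen)
    add = ["ip", "addr", "add", cidr, "dev", nic]
    delete = ["ip", "addr", "del", cidr, "dev", nic]
    return add, delete


def add_nameserver(resolv_text, vip):
    """Return resolv.conf text with 'nameserver <vip>' present exactly once."""
    line = "nameserver %s" % vip
    lines = resolv_text.splitlines()
    if line in (existing.strip() for existing in lines):
        return resolv_text  # already present, nothing to change
    # Assumption: the VIP is listed first so that it is tried first.
    return "\n".join([line] + lines) + "\n"
```

Because the address is added with `ip addr add` and no ifcfg file is written, it disappears on reboot, which matches the non-persistent requirement.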

change_hostname:

  1. change the hostname resolution order so that /etc/hosts is consulted before the name server
  2. add the given IP address and its hostname to /etc/hosts
  3. change the hostname to the one that resolves to the given IP address
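
Step 2 can be sketched as a pure function over the text of /etc/hosts. This is an illustration under assumptions: the function name is hypothetical, and replacing any existing entry for the same IP (rather than appending a second one) is a design choice not stated in the source. Steps 1 and 3 are not covered here.

```python
def ensure_hosts_entry(hosts_text, ip, hostname):
    """Return hosts-file text that maps ip to hostname exactly once."""
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments when looking at the address field.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] == ip:
            continue  # drop any stale mapping for this address
        kept.append(line)
    kept.append("%s %s" % (ip, hostname))
    return "\n".join(kept) + "\n"
```

Working on text rather than on the file itself keeps the function easy to test; the caller would read /etc/hosts, call the function, and write the result back.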

unconfigure_vip: remove the virtual IP, then call change_hostname to restore the original hostname

check_xcat_attribute: check that each attribute value is the virtual IP (master and nameservers in the site table, tftpserver in the networks table)

execute_command: start or stop a service; on success return [Passed], otherwise retry; if it still fails after 3 retries, return [Failed]
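
The retry behaviour can be sketched as follows. The description is read here as one initial attempt followed by up to three retries, which is an interpretation. The `runner` parameter is a hypothetical hook that defaults to `subprocess.call`, and the command is passed as an argument list.

```python
import subprocess


def execute_command(cmd, retries=3, runner=subprocess.call):
    """Run cmd, retrying on failure; return "[Passed]" or "[Failed]"."""
    # One initial attempt plus up to `retries` further attempts.
    for _ in range(1 + retries):
        if runner(cmd) == 0:
            return "[Passed]"
    return "[Failed]"
```

A call might look like `execute_command(["systemctl", "start", "xcatd"])`, assuming a systemd host; the actual script may start services differently.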

configure_shared_data:

  1. check whether the xCAT data is already in the shared data directory; if not:
    1. check the shared data directory permissions
    2. create the xCAT data structure in the shared data directory
    3. copy the xCAT data into the shared data directories
  2. create symbolic links to the shared data directories
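
A minimal sketch of these steps for a single directory, using only the standard library. It assumes the shared copy mirrors the local path (for example /install under /HA-data becomes /HA-data/install). Keeping the original directory under a `.xcatbak` suffix is a hypothetical safety measure that is not part of the design.

```python
import os
import shutil


def configure_shared_data(local_dir, shared_root):
    """Copy local_dir under shared_root if needed, then symlink to it."""
    target = os.path.join(shared_root, local_dir.lstrip(os.sep))
    if os.path.islink(local_dir):
        return target  # already linked, nothing to do
    if not os.path.exists(target):
        # Step 1.1: the shared directory must be writable.
        if not os.access(shared_root, os.W_OK):
            raise PermissionError("cannot write to %s" % shared_root)
        # Steps 1.2 and 1.3: copytree creates missing parents and copies data.
        shutil.copytree(local_dir, target, symlinks=True)
    if os.path.exists(local_dir):
        # Hypothetical backup name; rename fails if the backup already exists
        # and is not empty.
        os.rename(local_dir, local_dir + ".xcatbak")
    # Step 2: point the original path at the shared copy.
    os.symlink(target, local_dir)
    return target
```

Calling the function a second time on the same directory is a no-op, which is useful when activation is retried.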

unconfigure_shared_data: unlink shared data directories

clean_up_env: if a service fails, call unconfigure_vip and change_hostname to restore the original hostname

log_info(self, message)

runcmd(self, cmd)

configure_xcat_attribute(self, host, ip)

current_database_type(self, path)

get_physical_ip(self, nic)

check_database_type(self, dbtype, vip, nic)

check_xcat_exist_in_shared_data(self, path)

check_shared_data_db_type(self, tdbtype, path)

switch_database(self, dbtype, vip, physical_ip)

install_db_package(self, dbtype)

install_xcat(self, url)

find_line(self, filename, keyword)

change_hostname(self, host, ip)

unconfigure_vip(self, vip, nic)

check_service_status(self, service_name)

finditem(self, n, server)

change_xcat_policy_attribute(self, nic, vip)

copy_files(self, sourceDir, targetDir)

configure_shared_data(self, path, sharedfs)

clean_env(self, vip, nic, host)

deactivate_management_node(self, nic, vip, dbtype)

xcatha_setup_mn(self, args)

parser_arguments()
