HA
High Availability probe

The HA probe allows you to manage queues, probes and the NAS AutoOperator in a High Availability setup. The probe runs on the standby Hub. If it loses contact with the primary Hub, it initiates a failover after a defined interval. When the primary Hub comes back online, the probe reverses the failover.

Installation notes


The probe must be installed on the standby Hub.
The probe is not activated after distribution. It must be configured, then activated manually.
If your NAS does not have the subsystem id 1.2.3.8 defined, either add it to the subsystems list in the NAS or change the message configuration to use the string "HA" in place of the subsystem id.

Configuration notes


In the setup section these keys are the most relevant:
  • remote_hub - this is the primary Hub's full Nimsoft address in the form /Domain/Hub/Robot/hub
  • hb_seconds - this is the number of seconds between heartbeat messages. Minimum value is 5 seconds to avoid "denial of service" on the primary Hub.
  • wait_seconds - this is the number of seconds the probe should wait before initiating a failover. The failover is ended immediately when the primary Hub comes back online.
  • reset_nas_ao - this allows you to specify whether or not to (de)activate the NAS AutoOperator on the failover system. Specify 'yes' or 'no'. The default is 'yes'.
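
The interaction between hb_seconds and wait_seconds can be illustrated with a minimal Python sketch. It is not the probe's implementation: the function names are hypothetical, and it assumes that values below the 5-second minimum are raised to the minimum and that a failover starts once the time without contact reaches wait_seconds.

```python
MIN_HB_SECONDS = 5  # minimum heartbeat interval stated in the text above


def effective_hb_seconds(hb_seconds):
    """Raise too-small heartbeat intervals to the minimum (assumed behaviour)."""
    return max(MIN_HB_SECONDS, hb_seconds)


def should_be_failed_over(primary_alive, seconds_without_contact,
                          wait_seconds, currently_failed_over):
    """Decide whether the standby Hub should be in failover after one check."""
    if primary_alive:
        # The failover is ended immediately when the primary Hub is back.
        return False
    if currently_failed_over:
        # Stay in failover for as long as the primary Hub is unreachable.
        return True
    # Assumption: failover starts once the outage has lasted wait_seconds.
    return seconds_without_contact >= wait_seconds
```

For example, with wait_seconds set to 30, a primary Hub that has been silent for 10 seconds does not yet trigger a failover in this model, while one that has been silent for 30 seconds does.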

In the probes section, you can specify a list of probes that are to be activated on the local Hub when a failover occurs. When the remote_hub comes back online these probes are deactivated again. The keys are of the form probe_0, probe_1 and so on while the values are the names of probes to be started/stopped.

In the queue_up section, you should specify the queues which are to be started during a failover. The same queue definitions must be set on both the primary and secondary Hubs. The keys are of the form queue_0, queue_1 and so on while the values are the names of queues to be started/stopped.

In the queue_down section, you should specify the queues which are to be stopped during a failover. These queues are reactivated when the primary Hub comes back online. The keys are of the form queue_0, queue_1 and so on, while the values are the names of the queues.
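
The numbered-key convention used by the probes, queue_up and queue_down sections can be sketched in Python as follows. The helper names and the example probe and queue names are hypothetical; the sketch only illustrates how keys such as probe_0, probe_1 map to values and does not read or write an actual ha.cfg file.

```python
import re


def numbered_keys(prefix, names):
    """Build keys of the form <prefix>_0, <prefix>_1, ... for a list of names."""
    return {f"{prefix}_{i}": name for i, name in enumerate(names)}


def ordered_values(prefix, section):
    """Return the values of <prefix>_N keys, sorted by their numeric index."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)$")
    indexed = []
    for key, value in section.items():
        match = pattern.match(key)
        if match:  # ignore keys that do not follow the numbered pattern
            indexed.append((int(match.group(1)), value))
    return [value for _, value in sorted(indexed)]


# Hypothetical example values, for illustration only.
probes = numbered_keys("probe", ["example_probe_a", "example_probe_b"])
queue_up = numbered_keys("queue", ["example_queue"])
```

Sorting by the numeric index rather than by the key string keeps queue_10 after queue_2.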

In the messages section, you can change the alarm messages and their severities that are sent when a problem occurs. The severities are numeric values from 0 (clear) through 5 (critical).

Update notes


When updating to version 1.20 the old "queues" section is renamed to "queue_up".
To take advantage of the spooler address change, the configuration must be saved from the configuration tool after the probe update.

Revision history
Date Description State Version
01.07.2022

What's New:

  • Upgraded OpenSSL to 1.1.1o
  • Support for Visual Studio 2017.
  • Dropped support for platforms other than windows_x64 and linux_x64, in line with the hub.
SHA-256 Checksum: 748239c8abba7944a824b15f68fc02f38ff45a24545abfe1e4aa55d606e8cdeb
GA 20.40
31.07.2019

Fixed Defects:

  • Fixed an issue in which the UMP failover script sometimes did not start when ha 1.46 initiated the failover (Support Case: 01324021). To resolve this issue, a new configuration parameter “NAS_AO_first_if_failover” is now available. To use this parameter, add it to the ha.cfg file. It lets the ha probe decide the order in which steps are executed when the failover starts. Depending on its value, the parameter behaves as follows:
    • NAS_AO_first_if_failover = 0 (default value) (This behavior is the same as in ha 1.46 and earlier.)
      In this case, when the failover starts, the ha probe performs the actions in the following sequence:
      • Issues the "Initiating failover from remote Hub XXXXXX" alarm.
      • Activates the queues configured in [Queues to enable].
      • Activates nas Auto Operator.
      • Activates probes configured in [Probes to enable].
    • NAS_AO_first_if_failover = 1
      In this case, when the failover starts, the ha probe performs the actions in the following sequence:
      • Activates nas Auto Operator.
      • Issues "Initiating failover from remote Hub XXXXXX" alarm.
      • Activates queues configured in [Queues to enable].
      • Activates probes configured in [Probes to enable].

Note: In this scenario, the UMP failover script is started on arrival of the "Initiating failover from remote Hub XXXXXX" alarm because the ha probe activates nas Auto Operator in step 1.
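
The two sequences described above can be summarised in a short Python sketch. This is only a model of the documented ordering, not the probe's code: the function and constant names are hypothetical, and treating any value other than 1 as the default is an assumption.

```python
from typing import List

ALARM = 'Issue the "Initiating failover from remote Hub XXXXXX" alarm'
QUEUES = "Activate the queues configured in [Queues to enable]"
NAS_AO = "Activate nas Auto Operator"
PROBES = "Activate the probes configured in [Probes to enable]"


def failover_steps(nas_ao_first_if_failover: int = 0) -> List[str]:
    """Return the failover steps in the order described for the given value."""
    if nas_ao_first_if_failover == 1:
        # nas Auto Operator is active before the alarm is issued.
        return [NAS_AO, ALARM, QUEUES, PROBES]
    # Default (0): same order as in ha 1.46 and earlier.
    return [ALARM, QUEUES, NAS_AO, PROBES]


for number, step in enumerate(failover_steps(1), start=1):
    print(f"{number}. {step}")
```

With the value 1, nas Auto Operator is already active when the alarm arrives, which is why the UMP failover script can be started by that alarm.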

For more information, refer to https://docops.ca.com/ca-unified-infrastructure-management-probes/ga/en/alphabetical-probe-articles/ha-high-availability/ha-high-availability-release-notes

MD5 Checksum: c19144245dc0a285c9d3e4d2b5b7b4cc
SHA-1 Checksum: c71000fd18e1203db34b8397278484f5aece00ee
1.47
15.11.2018

What's New:

For detailed release notes, please refer to:

https://docops.ca.com/rest/ca/product/latest/topic?hid=ha_RN&space=UIMPGA&language=&format=rendered

Note: Support case(s) may not be viewable to all

MD5 Checksum: 762c74770a9d8d6b3b7aab8bc26d334d
SHA-1 Checksum: 09e1b13321ad53a82abd237bf7f19f042d158a43
1.46
05.06.2014
  • Version 1.45 now includes support for a tunnel environment between the primary hub and the secondary hub running the HA probe.
  • Version 1.45 now defaults to ‘queue_activate_method = queue_active’ to enable and disable queues. This method communicates with the hub for queue activation/deactivation rather than modifying the ‘postroute’ keys in the hub.cfg file directly. This method addresses queue activation/deactivation issues in newer hubs.
MD5 checksum: dc4cefc2d974e5c402c0ff3446f410d5
SHA-1 checksum: 52c7bae585ec6a02948736bf90acbf4419de6b24
1.45
19.04.2013
  • Added failback wait time specification.
1.44
18.09.2012
  • Fixed startup to enable probe to enable/disable queues and probes after a restart.
  • Fixed probe and queue activation issues.
  • Added logging and more identifiable information.
1.43
08.03.2011
  • Fixed startup sequence so it checks if a state change is required in the initial run.
1.41
31.12.2010
  • Added support for internationalization.
  • Added support for reading alarm tokens from cfg.
1.40
30.09.2010
  • Added NIS(TNT2) Changes.
1.30
30.09.2009
  • Probe now caches IP and port of remote address to avoid repeated lookups on the "static" data. Cache is refreshed every hour. Fixes problem, where the hub is busy and times out on the name lookup; in the worst case causing an incorrect failover.
  • Configuration tool maps hub address to the spooler address automatically, as this provides better alive status.
1.25
26.06.2008
  • Fixed bug where subsystem id was ignored for alarm messages.
  • Fixed heartbeat message timing issue.
  • Changed default subsystem id to 1.2.3.8
1.23
10.04.2008
  • Changed how queues are activated/deactivated to avoid potential problems with Hub restarting in the middle of the operation.
  • Added option to take a probe down when failing over with the new section "probes_down".
  • Fixed minor memory leak when restarting the probe.
  • Added configuration tool.
  • Changed name of section from "queues" to "queue_up".
  • Added section "queue_down" for queues which need to be deactivated when failover occurs. This is useful where the secondary hub has a post queue for e.g. QoS data to the primary hub. To avoid duplicate entries this has to be deactivated. It is reactivated after the primary hub comes back online.
  • Port to Linux, Solaris 8 (sparc) and AIX 5. No functional changes.
  • Changed control mechanism to active heartbeat checks. Queue is no longer required.
  • Initial Release
1.22

Requirements


Platform: Please refer to the Platform Support Matrix located in the Download section of http://support.nimsoft.com
Software: None
Hardware: None