cluster
Cluster Management of Nimsoft Probes

The Cluster probe enables fail-over support for a set of Nimsoft probes in clustered environments. Probe configuration that has been associated with a Resource Group follows that group to the Cluster Node on which the group is running. The probe also sends alarms if a Resource Group or Cluster Node changes state.

The data collection is implemented as plugins. How each plugin collects data from the cluster differs, but the information returned by a plugin is always the same.
The collected data must contain the name and state of each Node, the name and state of each Resource Group, and the Node on which each group is running. This information is used to decide where the different probe configurations should and should not run.
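
The information described above can be pictured as a small data model. The following Python sketch is illustrative only: the class names, field names, state strings, and the profiles_to_activate helper are assumptions made for this example and do not reflect the probe's actual plugin interface.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    state: str  # hypothetical state strings, e.g. "up" or "down"


@dataclass
class ResourceGroup:
    name: str
    state: str  # hypothetical, e.g. "online" or "offline"
    node: str   # name of the Node the group is currently running on


def profiles_to_activate(local_node, groups, profile_map):
    """Return the profile names that should be active on local_node.

    profile_map maps a Resource Group name to the probe profiles
    associated with it (a hypothetical structure for this sketch).
    """
    active = []
    for group in groups:
        # A profile follows its group: it runs only where the group is online.
        if group.node == local_node and group.state == "online":
            active.extend(profile_map.get(group.name, []))
    return sorted(active)


nodes = [Node("node1", "up"), Node("node2", "up")]
groups = [
    ResourceGroup("SQLGroup", "online", "node1"),
    ResourceGroup("FileGroup", "online", "node2"),
]
profile_map = {
    "SQLGroup": ["sql_disk_profile"],
    "FileGroup": ["file_share_profile"],
}

print(profiles_to_activate("node1", groups, profile_map))
```

If SQLGroup fails over to node2, calling the helper with the updated group list would return both profiles for node2 and none for node1.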

Supported Clusters

Microsoft Cluster Service (MSCS)
The plugin has been verified on Microsoft Cluster Service version 5. This plugin collects data using the Cluster Server API found in the Windows SDK.

Veritas Cluster Service (VCS)
This plugin collects data from the cluster using the command hastatus -summary.
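
As a rough illustration of how node and group states might be read from summary output, the Python sketch below parses a sample text. The sample layout (system lines prefixed with A, group lines prefixed with B) is an assumption based on typical VCS output, and the parse_hastatus_summary function is hypothetical; it is not the plugin's actual parser.

```python
# Illustrative sample only; real output may differ between VCS versions.
SAMPLE = """\
-- SYSTEM STATE
-- System               State                Frozen

A  node1                RUNNING              0
A  node2                RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  appgroup        node1                Y          N               ONLINE
B  appgroup        node2                Y          N               OFFLINE
"""


def parse_hastatus_summary(text):
    """Extract node states and per-node group states from summary text."""
    nodes = {}   # node name -> state
    groups = {}  # group name -> {node name: state}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "A" and len(fields) >= 3:
            # System line: prefix, system name, state, frozen flag
            nodes[fields[1]] = fields[2]
        elif fields[0] == "B" and len(fields) >= 6:
            # Group line: prefix, group, system, probed, autodisabled, state
            groups.setdefault(fields[1], {})[fields[2]] = fields[-1]
    return nodes, groups


nodes, groups = parse_hastatus_summary(SAMPLE)
print(nodes)
print(groups)
```

On a live cluster node, the text could be obtained by running hastatus -summary (for example with subprocess.run and capture_output=True) and passing the captured standard output to the same function.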

Note: The Nimsoft cluster probe supports all Microsoft Cluster configurations.

Important: The names of Nimsoft probe profiles that are to be "clustered" must be unique across the cluster.
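
A quick way to check for colliding profile names before clustering them is sketched below in Python. The input structure (a mapping from a "node/probe" label to its profile names) and the find_duplicate_profiles helper are assumptions for illustration; the sketch also assumes that each profile name should be defined in only one place.

```python
from collections import defaultdict


def find_duplicate_profiles(profiles_by_probe):
    """Return profile names that are defined in more than one place.

    profiles_by_probe maps a hypothetical "node/probe" label to the
    list of profile names defined there.
    """
    owners = defaultdict(list)
    for owner, names in profiles_by_probe.items():
        for name in names:
            owners[name].append(owner)
    # Keep only the names that occur more than once.
    return {name: sorted(where) for name, where in owners.items() if len(where) > 1}


example = {
    "node1/logmon": ["app_log", "system_log"],
    "node2/logmon": ["app_log", "audit_log"],
}

print(find_duplicate_profiles(example))
```

An empty result means no name collisions were found in the supplied lists.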

Supported probes:

  • Exchange_monitor (only supported on MSCS) (Active/Active, Active/Passive and N+I node cluster)
    (supported in cluster probe version 1.61 and newer)
  • CDM (disk profiles only) (Active/Passive and N+I node cluster) (supported in cluster probe version 2.20 and newer)
  • Processes (2-node Active/Passive. N+I node clusters as long as profile names are unique)
  • NTServices (2-node Active/Passive. N+I node clusters as long as profile names are unique)
  • Logmon (2-node Active/Passive. N+I node clusters as long as profile names are unique)
  • Dirscan (2-node Active/Passive. N+I node clusters as long as profile names are unique)
  • SqlServer
  • Oracle
  • Ntperf


Notes:

  • Version 2.2x: Changed behaviour when running with cdm probe version 4.0x.
    The cdm probe receives information about cluster disk resources from the cluster probe and creates monitoring profiles for these based on the 'fixed_default' settings. These profiles are automatically registered with the cluster probe to ensure continuous monitoring on cluster group failover. In the cdm probe, the cluster group is used as the Alarm and Quality of Service source instead of the cluster node.
    Note that on upgrade, old monitoring profiles in the cdm probe for the cluster disks are overridden with the new ones.
  • Cluster probe versions 2.51 and above do not support CDM probe versions below 4.20. All sub-mounted and cluster disks in the CDM probe need to be manually reconfigured when upgrading to 4.20 or above.


Installation notes

All the cluster nodes must run a Robot with an identical set of probes installed.

Revision history
Date Description State Version
02.03.2023

What's New:

  • Added support for Redhat Pacemaker v2.x cluster monitoring.
  • Fixed the following security vulnerabilities:
    • Upgraded probe with zLib v1.2.13
    • OpenSSL upgraded to v1.1.1q

Defects Fixed:

  • Fixed an issue where cluster and cdm probes block all TCP ports on a Veritas cluster node.
    Support Case: 33235964
SHA-256 Checksum: 0f2c1997bf5bb7040d7ec220b95a9ff8915d77468b34abc8778376c68d49aae7
GA 3.72
02.07.2021

What's New:

  • Added support for RHEL 8.x (x86_64).
  • Added support for Microsoft Visual C++ Redistributable for Visual Studio 2017 package version 1.01 (vs2017_vcredist_x86 1.01 and vs2017_vcredist_x64 1.01). This support ensures that the minimum version of the VS 2017 package is equal to or greater than 1.01.

Fixed Defects:

  • Unable to select MS Cluster Shared Volume in cluster probe. Support case number: 31924995, 31931481
  • Cluster probe causing sqlserver profiles to disappear. Support case number: 01358985
  • Cluster failover leaves passive node in unexpected state. Support case number: 32389063
  • CDM Probe Error -[9784] cdm: Unable to register section /cluster/XXXXXX/disk for group XXXXX with cluster probe (error). Support case number: 32327697
MD5 Checksum: 484e39149d3e50295c8f945d7e20efa3
SHA-1 Checksum: 31ae62d94682dc8a44eb7bfa8f5d3a51ccc6cdaa
3.71
02.03.2020

What's New:

  • Added an Alarm source drop-down list that offers the three possible alarm source values: "Node IP" (the default), "Cluster Name", and "Robot Name".
  • Added the 'auto_configure_profiles' key to the raw configuration of the cluster probe to support automatic profile creation.

Fixed Defect:

  • The Cluster probe reads wrong file system names in the Veritas cluster. Salesforce case number: 01132664
MD5 Checksum: 3215b2bd95fa7e0a56c7fc6329344749
SHA-1 Checksum: 98027678463a6a8c1f309ae7264fddf47ed45acb
3.70
20.07.2018

What's New:

  • Added support for Redhat Pacemaker Cluster monitoring.

Note: The Redhat Pacemaker Cluster behaves differently from other supported clusters. Resource groups whose state is Down are not displayed in the probe GUI, although alarms and QoS are still sent by the node to which the group belongs (last owner).

For detailed release notes, refer to:

https://docops.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN

Note: Support case(s) may not be viewable to all

MD5 Checksum: ef84c02c5d7cd785c18da615dc06b09c
SHA-1 Checksum: 7df68e856da63ea8bfd847ec31fd6dba1189f807
3.50
08.12.2017 Fixed Defects:
• The output of the clustat command truncated the cluster information to a column width of only 80 characters. Salesforce case number: 00639419
• Partition disks did not display as a part of the cluster. Salesforce case number: 00517081
• The cluster probe tried to communicate with port number 48000 even when the controller probe was using a different port. Salesforce case number: 00502664
• Communication error when the probe attempts to communicate with the cluster. Salesforce case number: 481762
• The probe crashes and results in communication error when deployed on 15 HP-UX machines. Salesforce case number: 00246784

For detailed release notes, refer to:
https://docops.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN

Note: Support case(s) may not be viewable to all

SHA-1: eb1402ae036c02661d62317ab6ab4dddadadaca0
MD5: 0b587c220bb3fa14c6441cc20cd25466
3.43
30.09.2016 What's New:
- Added support to suppress the Available Storage Group alarms for Microsoft Cluster Shared Volume and Microsoft Cluster Services. Support case number 442501
- Updated the name of the resource group for Microsoft Cluster Shared Volume. Support case number 245292
Fixed Defect:
- The probe incorrectly identified the drives that are managed by the storage foundation on Microsoft Cluster Service. Support case number 252181

For detailed release notes, refer to:
https://wiki.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN

Note: Salesforce case(s) may not be viewable to all
3.42
26.04.2016 Fixed Defect:
After a failover, the probe displayed clustered drives in the cdm probe, with disabled monitoring on the new passive node. Support case number 298709

For detailed release notes, refer to:
https://wiki.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN

Note: Salesforce case(s) may not be viewable to all
3.41
19.03.2016 What's New:

- Added support for Microsoft Cluster Shared Volume monitoring.

For detailed release notes, refer to:
https://wiki.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN
3.40
31.12.2015 Fixed Defects:
1. On RHEL platform, the cluster disks were displayed as local or network disks on the cdm Infrastructure Manager (IM). Support case number 00160058
Note: You must use cdm version 5.61 with cluster version 3.33 to view the cluster disks on the cdm Infrastructure Manager (IM).
2. When node service was stopped, cluster probe marked resources offline and kept sending _restart command to other probes. Support case number 70002275

For detailed release notes, refer to:
https://wiki.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN

Note: Salesforce case(s) may not be viewable to all
3.33
14.10.2015 Fixed Defects:
1. On Linux platform, the cluster disks were displayed as local disks on the cdm Infrastructure Manager (IM). Salesforce case 00135028
2. The probe displayed incorrect status of clustered disks on the IM. Salesforce cases 00169389, 00170460
3. Updated the Supported Probes section in Release Notes to describe the cluster configuration that is supported by sqlserver, oracle, and ntperf. Salesforce case 00169432

For detailed release notes, refer to:
https://wiki.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN
Note: Salesforce case(s) may not be viewable to all
3.32
09.06.2015 Updated support for factory templates.
Fixed an issue where the probe does not clear package halted\failed alarms when the failover is unsuccessful in HP-UX clusters and the package starts on the primary node.

For detailed release notes, refer to:
https://wiki.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN
3.31
22.05.2015 Added Logical Volume Manager (LVM) support for HP-UX platform.
Added Log Size field.
Provided an option to choose the protocol key as TCP or UDP for raw configuration. By default, the protocol key is TCP. (Salesforce Case: 00160858)

For detailed release notes, refer to:
https://wiki.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN
Note: Salesforce case(s) may not be viewable to all
3.30
31.03.2015 Added support for HP-UX service guard A.11.19 cluster.
Added support for factory templates.
Fixed a defect where cdm disks were not activated or deactivated on failover. (Salesforce case: 00133038)
Fixed a defect where, on a Redhat cluster, another node was coming for quorum disks. (Salesforce case: 00144966)
Fixed a defect where the probe gave the error message "The node has no valid cluster probe address and you can't add any resources until this is resolved". (Salesforce case: 00121291)

For detailed release notes, refer to:
https://wiki.ca.com/rest/ca/product/latest/topic?format=rendered&language=&space=UIMPGA&hid=cluster_RN
Note: Salesforce case(s) may not be viewable to all
3.20
30.09.2014 Added support for configuring the probe from the Admin Console (web-based) GUI.
For detailed release notes, refer to: http://docs.nimsoft.com/prodhelp/en_US/Probes/AdminConsole/cluster/ReleaseNotes/index.htm
3.13
03.09.2013 Fixed a memory leak.
3.12
19.06.2013 Fixed an issue related to handle leaks.
Fixed an issue where the cluster probe was not working (unable to define any profile).
Fixed an issue where the cluster probe did not clean up alarms after failover to the next node.
3.11
21.12.2012 Added Probe Defaults.
Added resource information for a resource group in the alarm message.
Added support for RHEL 5.0 64-bit.
3.10
27.08.2012 Updated cluster information in cdm for RHCS.
3.01
16.08.2011 Added support for RedHat cluster.
Fixed message synchronization for NTServices probe.
3.00
28.03.2011 Fixed group synchronization when not using node IPs in cluster.cfg
(applied fix from v2.66 into 2.7x release)
2.72
14.01.2011 Fixed the group synchronization when not using node IPs in cluster.cfg (applied fix from v2.66 into v2.7x release)
2.72
14.01.2011 Fixed the group synchronization when not using node IPs in cluster.cfg
2.66
12.01.2011 Applied fixes from v2.65 into 2.7x release.
2.71
11.01.2011 Fixed a potential program failure on SOLARIS (logging of NULL pointer terminates probe on SOLARIS).
2.65
23.12.2010 Added support for internationalization.
2.70
30.09.2010 Fixed a potential program failure on SOLARIS (no node IP in cfg causing failure).
2.64
13.09.2010 Added fix in the GUI to use named APIs instead of IP.
2.63
04.08.2010 Changed the haclus -list to haclus -value ClusterName
Fixed the cluster compatibility issue on IA64 platform.
Fixed the issue of wrong IP in NAT environment in the GUI and the probe.
Fixed the issue of double cluster group devices listing in CDM.
Fixed the issue of cluster drives being reported as local in CDM on non-owner nodes.
Added a validation while adding shared sections or subsections in GUI.
Removed white spaces from the cluster names at the time of discovery.
Version 2.62 withdrawn because of potential communication errors when configuring.
2.62
24.06.2010 Added support for AIX platform.
2.61
28.04.2010 Added support for extended NIS database information.
Added support for Resources Failed and Resources Not Probed in hastatus.
2.60
31.03.2010 Fixed the issue of Drive reported as Disk3Partition1 in case the device is down on the cluster.
NOTE: Please see the Notes section if the CDM probe is used with the cluster probe.
2.52
19.03.2010 Fixed the CDM mount point handling issue in the Microsoft cluster plugin dll.
2.51
19.03.2010 Added support for merging configuration when configuration is done in parallel across different cluster nodes.
Added support for configuring shared resources individually as well as in bulk.
2.50
19.03.2010 Added a callback get_cluster_group_all to get the complete status of cluster.
2.30
14.07.2009 Built with new libraries.
Added QoS messages for state changes on Node and Group state.
Added levels of alarm severity based on Node and Group state.
Added support for fetching of cluster resources.
Added support for identifying clustered disks.
Added option of removing Resource Groups no longer part of the cluster.
Changed callback get_nodename.
Fixed retrieval of evs_name (correct case) for Exchange 2007 in get_EVS.
Fixed issue regarding "illegal SID" upon cluster probe synchronization.
Fixed fetching of resource type in calls to get_EVS and get_cluster_resources.
Fixed callback get_EVS (input argument nodename is no longer case sensitive).
Added support for Windows on Itanium 2 systems (IA64).
Fixed synchronizing issue in NAT environments. (Add key use_local_ip = 1 in /setup section (use "raw configure"))
2.21
26.06.2008 Fixed problem with change of alarm subsystem IDs.
Fixed association of same profile to multiple Service Groups (This is not allowed).
2.04
17.04.2008 Fixed minor GUI issues.
Fixed GUI refresh. Fixed logging on Solaris.
Fixed program failure on Solaris.
Fixed handling of Service Groups in PARTIAL state for VCS.
Fixed probe security settings for LINUX and SOLARIS.
Fixed OS key for Solaris plugin (wizard failed).
Added the port library on Solaris (load plugin failed).
Added support for Veritas Cluster Server (VCS) on Windows, Linux and Solaris.
2.03
24.09.2007 Fixed synchronization issues.
Fixed memory leak (IP=NULL)
1.64
22.06.2007 Fixed saving of Resource Groups containing slash (/)
1.62
22.06.2007 Share "partially online" resource group setting with exchange_monitor probe.
Added support for identifying Exchange Virtual Servers.
Added support for NimBUS exchange_monitor in A/A, A/P and N+I node cluster configurations.
1.61
01.09.2006 Fixed several GUI issues:
  • Added multiple profile selection
  • Removed "add node", "delete node" and "add resource group" options.
  • Added support for edit/disable alarms.
  • Fixed several run-time error situations.
  • Changed source field on node alarms.
Added support for shared sections and probe profiles found in /cluster_setup section of other probes.
1.50
28.04.2006 Fixed issue with resource groups not having their states set correctly when alarms were turned off.
Fixed issues relating to synchronization between cluster probes, especially when adding new resources.
Fixed security issue when synchronizing probe configuration between nodes.
1.26
14.12.2005 Cosmetic GUI changes. Added Refresh to menu. Fixed text for clear alarms.
1.22

Requirements
Platform: Please refer to the Platform Support Matrix located in the Download section of http://support.nimsoft.com
Software: Cluster Software: MSCS (Windows only) or VCS (Windows, Linux, Solaris, AIX)
Hardware: None