Wednesday, March 21, 2012

VCS Cluster Communications Guide

VCS Cluster Communications

In HP's MC/ServiceGuard, cluster membership is defined in the primary cluster configuration file - simply an ASCII file that the administrator edits. Cluster communications take place over any network path between the cluster nodes.

Veritas Cluster Server's communications are more complex. But once you have everything set up, in theory, you should never have to worry about it again. However, troubleshooting VCS cluster communications without the requisite knowledge would be a very entertaining exercise - to watch...

This page seeks to document the VCS communications elements and how to go about updating/modifying them, as needed. The first section is an overview - how the pieces tie together.

Following that, we'll work our way up from the bottom - examining LLT, then GAB in order.

Overview

To begin the overview, refer to the picture:

[Figure: VCS communications overview]

VCS agents track the state of all resources and service groups in the cluster. _HAD_ polls the various agents on the node and, if there's a change, reports that to _GAB_. _GAB_ (Group Membership Services/Atomic Broadcast) has two jobs.

First, it tracks which systems are part of the cluster. Cluster membership is defined by systems sharing the same cluster ID and a pair of redundant Ethernet LLT cables.

GAB's second job is to transmit resource status changes to all nodes in the cluster.

The atomic broadcast portion of the name implies (correctly, as it turns out) that all systems in the cluster are notified of any changes. If a failure occurs during the update, the "status change" is rolled back, ensuring that, upon recovery, all nodes have the same status information. It's the same paradigm as a database commit, if that's familiar to you.

_LLT_ (Low Latency Transport) is responsible for transmitting the heartbeat signals which GAB uses to maintain cluster membership.

A cluster can have between 2 and 8 LLT cables. LLT links can be identified as low or high priority.

- High priority links:
  - Send a heartbeat every 0.5 seconds.
  - Carry cluster status information.
  - Should be configured over dedicated network links.

- Low priority links:
  - Send a heartbeat every second.
  - Do not carry cluster status information.
  - Can be configured on public networks.
  - Will be automatically promoted to high priority links if all the high priority links have failed.

LLT


LLT is the lowest protocol in the VCS communications chain so everything else relies on it. If LLT isn't happy, ain't nothing happening - so, let's make LLT happy.

_/etc/llttab_

_/etc/llttab_ is LLT's primary configuration file. At a minimum, it specifies the system ID, cluster ID, and the local links that LLT uses for heartbeat signals.

You can see what other options are available via the sample llttab file in _/opt/VRTSllt/llttab_ and in the llttab man page. Since the file defines host specific entries, it must be unique to the host. No rdist'ing this file...

# cat /etc/llttab
set-node       1
set-cluster    10
link qfe0 /dev/qfe:0 - ether - -
link hme0 /dev/hme:0 - ether - -

set-node

Ensure that there is only one _set-node_ directive in the file. The value for set-node can be either a number (0-31) or a system name. If you use the system name, the name must resolve to a unique number in the _/etc/llthosts_ file.

For instance:

# cat /etc/llttab
set-node       athena
set-cluster    10
link qfe0 /dev/qfe:0 - ether - -
link hme0 /dev/hme:0 - ether - -
# cat /etc/llthosts
0 athena
1 zeus
2 othello
3 ...

If you decide to use _/etc/llthosts_, the following rules apply:

- The file must be synced across all nodes in the cluster.
- The node numbers must be unique; otherwise, the cluster won't start.
- The system names used must match those in the _/etc/llttab_, _/etc/VRTSvcs/conf/config/main.cf_ and _/etc/VRTSvcs/conf/sysname_ (if used) files.
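The uniqueness rules above lend themselves to a quick check before starting LLT. What follows is only a minimal sketch, not part of VCS: `check_llthosts` is a hypothetical helper name, and it assumes every line of the file holds a node number followed by a system name, as in the example above.

```shell
# check_llthosts FILE
# Hypothetical helper (not a VCS command). Assumes each line of FILE is
# "<node-number> <system-name>", as in the /etc/llthosts example above.
# Prints one line per problem and returns non-zero if any were found.
check_llthosts() {
    awk '
        NF < 2 { next }                      # skip blank or incomplete lines
        $1 !~ /^[0-9]+$/ || $1 + 0 > 31 {    # set-node accepts 0-31
            printf "node number out of range 0-31: %s\n", $1; bad = 1
        }
        seen_id[$1]++   { printf "duplicate node number: %s\n", $1; bad = 1 }
        seen_name[$2]++ { printf "duplicate system name: %s\n", $2; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Run it as `check_llthosts /etc/llthosts` on each node. It only checks one copy of the file; whether the copies are in sync across nodes still has to be compared separately.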

set-cluster

The set-cluster directive must specify a unique number across any clusters that the LLT heartbeat can reach. This implies that you can have multiple clusters sharing the same LLT links. This makes it easier on the networking people in that you can have four 4-node clusters sharing the same two 16-port switches thereby reducing the overall hardware costs.

_/etc/VRTSvcs/conf/sysname_

The sysname file eliminates VCS's dependence on the UNIX uname command to identify the hostname. The problem is that some OSs will report a fully qualified domain name for a "uname -n" command. If that's the case, the system name won't match the name in the main.cf file and VCS will puke.

If you use the sysname file, ensure the system name matches the one used in the llttab and llthosts files - assuming you're using the system name there as well. At a minimum, it must match the name in the main.cf file.

LLT commands


OK; now that we have all the LLT configuration files set up, it's time to run LLT:

# lltconfig -c

Now that it's running, how do we verify that?

# lltconfig -a list
llt is running

The "lltstat -nvv" command displays some verbose information on the status of LLT. You can use this command to verify that all LLT links are operational.

The training manual suggests running the command periodically from cron and looking for DOWN. See the man page for additional options.
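The cron idea can be sketched roughly as follows. The exact layout of the lltstat output is not shown in this post, so the sketch only looks for the word DOWN anywhere in the output; `report_down_links` is a hypothetical helper name. If your version also reports unused node slots as DOWN, the match would need narrowing.

```shell
# report_down_links
# Hypothetical helper. Reads "lltstat -nvv" output on stdin, prints every
# line containing DOWN, and returns 1 if at least one such line was found.
report_down_links() {
    if grep 'DOWN'; then
        return 1
    fi
    return 0
}

# Example cron-style use (commented out so the block is safe to source):
# lltstat -nvv | report_down_links || echo "LLT link DOWN on $(uname -n)"
```

Reading from stdin keeps the helper independent of where the output comes from, so it can be tried against a saved copy of the output first.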


GAB


GAB is the next rung up the protocol ladder; however, it's easier to configure than LLT.

_/etc/gabtab_

The gabtab file is the primary (and only) means of configuring GAB. It specifies the command line used to start GAB. As such, in order to start GAB, you just run the /etc/gabtab file:

# sh /etc/gabtab


GAB Status


Use "gabconfig -a" to check the status of GAB.

# gabconfig -a
GAB Port Membership
===============================================
Port a gen b38f123 membership 01        ; 2       01
Port h gen f00b123 membership 01        ; 2       01

There's quite a bit of useful information in the list above. For instance, _Port a_ indicates that GAB is communicating (which automatically means that LLT is fully functional) and has a membership of nodes 0, 1, 12, 20, and 21.

The membership list uses ';' as a marker for the 10's positions, or 0 if the node in that position is actually in use.

_Port h_ indicates that had is started and has similar memberships to GAB.
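Reading the membership field by eye gets error-prone once the node numbers climb past 9. Going only by the description above (a digit in position N means node N is a member, while ';' and blanks are placeholders), a decoder might look like this. `decode_gab_membership` is a hypothetical helper name, and the sketch assumes the field's spacing has been preserved exactly.

```shell
# decode_gab_membership STRING
# Hypothetical helper. Treats STRING as the membership field printed by
# "gabconfig -a": the character at position N (counting from 0) is a digit
# when node N is a member; ";" and blanks are placeholders.
decode_gab_membership() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 1; i <= length($0); i++) {
            if (substr($0, i, 1) ~ /[0-9]/)
                out = out (out == "" ? "" : " ") (i - 1)
        }
        print out
    }'
}
```

For the field shown above, which can be rebuilt with `printf '01%8s; 2%7s01' '' ''`, the call `decode_gab_membership "$(printf '01%8s; 2%7s01' '' '')"` prints `0 1 12 20 21`.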

Manual Seeding


GAB will normally handle the seeding of the cluster nodes automatically based on the information in the _/etc/gabtab_. If, however, you have a node down for maintenance and need to restart your cluster, GAB will effectively hang waiting for the last system to come alive.

To circumvent this issue, you will need to manually seed the cluster. To do that, do the following:

- On one node and one node only, execute gabconfig -c -x
- On the remaining available nodes, execute gabconfig -c

----
Reference:
[http://olearycomputers.com/ll/vcs/vcs_comms.html]

Enable VCS SecondLevelMonitoring for Apache with Siteminder Protection

This is a record of how I set this up on VCS 5.0 MP3. The following steps have been tested on RHEL4 64-bit with VCS 5.0 MP3 and Siteminder 6QMR5 Hotfix15. Originally, VCS sends the following to Apache for second-level probing.

[root@webserver Apache]# grep HEAD Apache.pm
  print $sock "HEAD $sGetFile HTTP/1.0" . $space;

When we tested manually against Apache, the successful command was as follows:

[root@webserver siteminder]# telnet 10.11.12.13 80
Trying 10.11.12.13...
Connected to webserver.site.com (10.11.12.13).
Escape character is '^]'.
HEAD / HTTP/1.1
Host: webserver.site.com

HTTP/1.1 200 OK
Date: Mon, 17 Nov 2008 06:20:48 GMT
Server: Apache
Last-Modified: Thu, 06 Nov 2008 06:52:31 GMT
ETag: "4c4ab-25-45affbd9a39c0"
Accept-Ranges: bytes
Content-Length: 21
Vary: User-Agent
Content-Type: text/html; charset=ISO-8859-1

Connection closed by foreign host.
[root@webserver siteminder]#
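The manual telnet test above can be repeated without typing the request by hand. This is a minimal sketch: `build_head_request` is a hypothetical helper name, the `Connection: close` header is an addition that does not appear in the session above, and the commented example assumes a netcat-style `nc` client is installed.

```shell
# build_head_request PATH HOST
# Hypothetical helper. Prints an HTTP/1.1 HEAD request with a Host header,
# using CRLF line endings and a blank line to end the header block.
build_head_request() {
    printf 'HEAD %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$1" "$2"
}

# Example (assumes a netcat-style "nc" client is installed):
# build_head_request / webserver.site.com | nc 10.11.12.13 80
```

Keeping the request in one function makes it easy to compare the HTTP/1.0 form without a Host header against the HTTP/1.1 form with one.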

When HTTP/1.1 is used with an additional Host line, code 200 is returned. Therefore, we can try modifying Apache.pm as follows to work with Siteminder when SecondLevelMonitor is enabled. The change adds the Host identifier that Siteminder needs:

[root@webserver Apache]# grep HEAD Apache.pm
  print $sock "HEAD $sGetFile HTTP/1.1\nHost: $sHost" . $space;

A dummy file for checking that the web service is OK.

[root@webserver conf.d]# grep "sGetFile =" /opt/VRTSvcs/bin/Apache/Apache.pm
  $sGetFile =  '/ok.gif';

Without the above, the following happens when SecondLevelMonitor is enabled. In /var/VRTSvcs/log/Apache_A.log you see:

2009/01/13 11:56:28 VCS ERROR V-16-2-13066 Thread(4136012704) Agent is calling clean for resource(webserver) because the resource is not up even after online completed.
2009/01/13 11:56:29 VCS NOTICE V-16-55005-10455 Resource(webserver) - (webserver:clean) VCSagentFW:SetupLogging:[clean] Entered by resource instance [webserver] with clean reason [3][Online Ineffective]
2009/01/13 11:56:34 VCS ERROR V-16-2-13068 Thread(4136012704) Resource(webserver) - clean completed successfully.
2009/01/13 11:56:34 VCS ERROR V-16-2-13071 Thread(4136012704) Resource(webserver): reached OnlineRetryLimit(0).

and in /var/log/messages you see:

Jan 13 11:54:27 webserver Had[31344]: VCS ERROR V-16-1-20047 (webserver) Apache:webserver:monitor:  HTTP GET test failed for host [webserver.site.com] port [80]
Jan 13 11:55:28 webserver Had[31344]: VCS ERROR V-16-1-20047 (webserver) Apache:webserver:monitor:  HTTP GET test failed for host [webserver.site.com] port [80]
Jan 13 11:56:28 webserver Had[31344]: VCS ERROR V-16-1-20047 (webserver) Apache:webserver:monitor:  HTTP GET test failed for host [webserver.site.com] port [80]
Jan 13 11:56:28 webserver AgentFramework[31357]: VCS ERROR V-16-1-13066 Thread(4136012704) Agent is calling clean for resource(webserver) because the resource is not up even after online completed.
Jan 13 11:56:28 webserver Had[31344]: VCS ERROR V-16-1-13066 (webserver) Agent is calling clean for resource(webserver) because the resource is not up even after online completed.
Jan 13 11:56:34 webserver AgentFramework[31357]: VCS ERROR V-16-1-13068 Thread(4136012704) Resource(webserver) - clean completed successfully.

Eventually, the Apache service will be FAULTED. The reason is that Siteminder blocks the simple query sent by Apache.pm. In /var/log/messages, you will see something similar to the following:

Jan 13 11:56:34 webserver AgentFramework[31357]: VCS ERROR V-16-1-13071 Thread(4136012704) Resource(webserver): reached OnlineRetryLimit(0).
Jan 13 11:56:35 webserver Had[31344]: VCS ERROR V-16-1-10303 Resource webserver (Owner: unknown, Group: webserver_grp) is FAULTED (timed out) on sys webserver

Do note that this is not supported by Symantec, but the suggestion came from Symantec after I logged a case with them for VCS+Apache not working with Siteminder. You may need to back up this Apache.pm file in case it gets overwritten during patching or when you need to get support from Symantec.

That's all folks.

MPT major number error

You may hit the following error when you reboot the server, just as the server is almost ready. It is repeated many times.

"WARNING: add_spec: No major number for mpt"
You may want to try the solutions below. I have tried out solution 2 successfully.
Document Audience: SPECTRUM
Document ID: 74181
Title: Solaris(TM): "WARNING: add_spec: No major number for mpt"
Update Date: Wed Sep 29 00:00:00 MDT 2004
Products: Solaris 8 Operating System, Solaris 9 Operating System
Technical Areas: Patch

--------------------------------------------------------------------------------

Keyword(s):add_spec, mpt, major number

Problem Statement:

After installing a patch to update /etc/driver_classes with
an "mpt" entry (for example, 108528-xx with xx>21), the system may
generate the following WARNING messages at boot time:

SunOS Release 5.8 Version Generic_108528-29 64-bit
Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
WARNING: add_spec: No major number for mpt
WARNING: add_spec: No major number for mpt
WARNING: add_spec: No major number for mpt
WARNING: add_spec: No major number for mpt
[...]
WARNING: add_spec: No major number for mpt
WARNING: add_spec: No major number for mpt
WARNING: add_spec: No major number for mpt
configuring IPv4 interfaces: hme0.
configuring IPv6 interfaces: hme0.

The following message appears in the /var/sadm/patch//log file
when the entry has not been successfully added to the /etc/name_to_major
file:
SUNWcsr: failed to add mpt to /etc/name_to_major:
(mpt) already in use as a driver or alias.

Explanation:
============
The problem is due to a reference in /etc/driver_classes to the
mpt driver. The mpt driver isn't added to the system because the
/etc/driver_aliases file already had an entry with the mpt driver. Because
of this, the add_drv failed when trying to install the driver.

This is an example of the inconsistencies described in Bug ID 4939994,
"Inconsistency between name_to_major and driver_aliases."

Resolution:

To solve this problem, a script has been created to identify the
inconsistencies between /etc/driver_aliases and /etc/name_to_major.
This script can also add the missing entries if they are listed in a
"reference" name_to_major file.

Here is an example of the way to use the attached script:

    cksum InconsitencyFixTool.tar.gz
    1091730651 4410 InconsitencyFixTool.tar.gz

    gzip -dc InconsitencyFixTool.tar.gz | tar -xvf -
    x InconsitencyFixTool, 0 bytes, 0 tape blocks
    x InconsitencyFixTool/README, 1364 bytes, 3 tape blocks
    x InconsitencyFixTool/name_to_major.InconsitencyFix, 4928 bytes, 10 tape blocks
    x InconsitencyFixTool/name_to_major.i386.5.8, 841 bytes, 2 tape blocks
    x InconsitencyFixTool/name_to_major.i386.5.9, 889 bytes, 2 tape blocks
    x InconsitencyFixTool/name_to_major.sparc.5.8, 1831 bytes, 4 tape blocks
    x InconsitencyFixTool/name_to_major.sparc.5.9, 1815 bytes, 4 tape blocks

    cd InconsitencyFixTool
    ./name_to_major.InconsitencyFix
    Saving original files into .
    Inconsistency found between //etc/driver_aliases and //etc/name_to_major on
    the following driver(s):
    mpt

    Add mpt to //etc/name_to_major ? y y
    Adding the following devices to //etc/name_to_major :
    mpt 215

     

Reboot the system.

Temporary Workaround:

If the above script cannot be used, there are two other ways to fix the
problem:

1. Remove the "mpt" lines from /etc/driver_classes and /etc/driver_aliases.

They should look like the following:
driver_aliases:mpt "pci1000,30"
driver_classes:mpt scsi

Next, install (or remove and reinstall) a patch updating all
those files [/etc/driver_aliases /etc/driver_classes
/etc/name_to_major]. For example, install the mpt patch 115275-01 (or
above) or a new kernel Update patch.

OR

2. Add the missing "mpt" entry in the /etc/name_to_major file
to correct the problem with the patch installation.

You must manually append the "mpt" entry at the end
of the /etc/name_to_major file as follows:

mpt XXX

Where XXX is the maximum+1 of the numbers already given to
the other drivers in this file. Separate "mpt" and the given number with
a space character.

For example:

    tail /etc/name_to_major
    fasttrap 223
    dmfe 224
    todds1307 225
    pool 226
    zcons 227
    ipf 228
    pfil 229
    ctsmc 230
    bl 231
    mpt 232

Reboot the system.
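For option 2, the value of XXX can be worked out from the file instead of by eye. This is a minimal sketch that only prints the number and changes nothing: `next_major` is a hypothetical helper name, and it assumes every line of the file is a driver name followed by its major number, as in the tail output above.

```shell
# next_major FILE
# Hypothetical helper. Prints one more than the highest number found in the
# second column of a name_to_major-style FILE ("<driver> <number>" per line).
next_major() {
    awk '$2 ~ /^[0-9]+$/ && $2 + 0 > max { max = $2 + 0 }
         END { print max + 1 }' "$1"
}
```

Something like `echo "mpt $(next_major /etc/name_to_major)"` then shows the line to append. Review it and back up the file before editing, since the highest entry is not necessarily the last line in the file.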

Additional Information:

History:
========
Two patches have been identified that can cause the problem:

108974-31(or greater): SunOS 5.8: dada, uata, dad, sd, ssd and scsi
drivers patch

OR

109885-14 SunOS 5.8: glm patch

Only on top of a kernel patch 108528-xx where xx <= 21.

Note: If you have a more recent version than -21 before installing
the above patches, the problem will not occur.

This problem can also exist on Solaris 9 - if S9 was installed with
an upgrade from a S8 system where the inconsistency existed.

Impact:
=======
The mpt driver would not be installed correctly and cannot be used.

Tuesday, March 20, 2012

Unable to access a Terminal Server

In the event you receive the following errors while RDP'ing into a server running Windows Terminal Service,

"The terminal server has exceeded the maximum number of allowed connections"
Try the following to force a login. At the Command Prompt, type:
mstsc /v:host_ip:host_port /admin
Warning: Do note that you will be logging off the logged in or disconnected user. Any unsaved work by this logged in or disconnected user will be lost.

CmdLog write failed in VxVM

Ever encountered CmdLog write failure warning from VRTSvcs or VRTSvxvm?

For example,

root@nodeA # grep CmdLog /var/VRTSvcs/log/engine_A.log  | tail -5
VxVM vxdisk WARNING V-5-1-9668 CmdLog: write failed - No space left on device
VxVM vxdctl WARNING V-5-1-9668 CmdLog: write failed - No space left on device
VxVM vxdg WARNING V-5-1-9668 CmdLog: write failed - No space left on device
VxVM vxdisk WARNING V-5-1-9668 CmdLog: write failed - No space left on device
VxVM vxdisk WARNING V-5-1-9668 CmdLog: write failed - No space left on device

This happens when the partition where cmdlog resides runs out of disk space. In my case, the warning message came from node A even though it was the /var partition on node B that was full. Since the VCS logs should be identical on all nodes in a cluster, the log could not be written to /var on node B but could be on node A, hence the notification from node A.

The resolution is to clear up some space in that partition and you are good to go. Do remember to find out why this partition was filled up.
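To catch this before the warning appears, the partition's usage can be checked from a script on every node. This is a minimal sketch: `pct_used` is a hypothetical helper name, and it assumes the usual `df -k` column layout, where the capacity percentage is the next-to-last field and the mount point contains no spaces.

```shell
# pct_used
# Hypothetical helper. Reads "df -k <path>" output on stdin and prints the
# capacity percentage of the last file system listed, without the % sign.
pct_used() {
    awk 'NF >= 2 { v = $(NF - 1) } END { sub(/%/, "", v); print v }'
}

# Example (commented out so the block is safe to source):
# [ "$(df -k /var/adm/vx | pct_used)" -ge 90 ] && echo "cmdlog partition nearly full"
```

The 90 percent threshold in the example is arbitrary; pick whatever leaves enough headroom for your log growth.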

A little more about CmdLog.

 

Typically, cmdlog is located at /var/adm/vx

root@nodeA # ls -l /var/adm/vx/*log
-rw-------   1 root     root      336683 Nov 18 10:19 /var/adm/vx/cmdlog
-rw-r--r--   1 root     root       62423 Nov 17 17:25 /var/adm/vx/dmpevents.log
-rw-------   1 root     root      428837 Nov 18 10:19 /var/adm/vx/translog
-rw-r--r--   1 root     other          0 Aug  5  2009 /var/adm/vx/veacmdlog

It is an ASCII text file that records the commands you issue to VCS / VxVM.
 
root@nodeA # file /var/adm/vx/cmdlog
/var/adm/vx/cmdlog:     ascii text
 
root@nodeA # tail -3 /var/adm/vx/cmdlog
 /usr/sbin/vxdisk -qag dg_myapp list
# 21304, 8603, Fri Nov 18 10:27:19 2011
 /usr/sbin/vxdisk list c4t50060E1006A11B41d0s2

Hope this helps.