Wednesday, March 21, 2012

Enable VCS SecondLevelMonitor for Apache with Siteminder Protection

This is a record of how I set this up for VCS 5.0 MP3. The following steps have been tested on RHEL4 64-bit with VCS 5.0 MP3 and Siteminder 6QMR5 Hotfix15. Out of the box, VCS sends the following request to Apache for second-level probing:

[root@webserver Apache]# grep HEAD Apache.pm
  print $sock "HEAD $sGetFile HTTP/1.0" . $space;

When we tested manually against Apache, the request that succeeded was as follows:

[root@webserver siteminder]# telnet 10.11.12.13 80
Trying 10.11.12.13...
Connected to webserver.site.com (10.11.12.13).
Escape character is '^]'.
HEAD / HTTP/1.1
Host: webserver.site.com

HTTP/1.1 200 OK
Date: Mon, 17 Nov 2008 06:20:48 GMT
Server: Apache
Last-Modified: Thu, 06 Nov 2008 06:52:31 GMT
ETag: "4c4ab-25-45affbd9a39c0"
Accept-Ranges: bytes
Content-Length: 21
Vary: User-Agent
Content-Type: text/html; charset=ISO-8859-1

Connection closed by foreign host.
[root@webserver siteminder]#

With HTTP/1.1 and an additional Host line, code 200 is returned. Therefore, we can try modifying Apache.pm as follows so that SecondLevelMonitor works with Siteminder. The change is the switch from HTTP/1.0 to HTTP/1.1 plus the added Host header, which gives Siteminder the host identifier it needs.

[root@webserver Apache]# grep HEAD Apache.pm
  print $sock "HEAD $sGetFile HTTP/1.1\nHost: $sHost" . $space;

$sGetFile is set to a dummy file that is used to check that the web service is OK:

[root@webserver conf.d]# grep "sGetFile =" /opt/VRTSvcs/bin/Apache/Apache.pm
  $sGetFile =  '/ok.gif';
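
Before touching the agent, the modified probe can be reproduced outside VCS to confirm that Siteminder lets the request through. The following is only a rough bash sketch and not the agent's own code: the function names build_head_request and probe_status are made up for illustration, it assumes a bash built with /dev/tcp support, and it ends each line with CRLF as HTTP specifies, since the value of $space in Apache.pm is not shown here.

```shell
#!/bin/bash
# Hypothetical helper names; this is an illustration, not part of VCS.

# Print a HEAD request that uses HTTP/1.1 and carries a Host header.
# $1 = path to request (e.g. /ok.gif), $2 = value for the Host header
build_head_request() {
  printf 'HEAD %s HTTP/1.1\r\nHost: %s\r\n\r\n' "$1" "$2"
}

# Send the request and print the numeric status code from the reply.
# $1 = host, $2 = port, $3 = path
probe_status() {
  local host=$1 port=$2 path=$3 proto code rest
  # bash-specific: open a TCP connection on file descriptor 3
  exec 3<>"/dev/tcp/$host/$port" || return 1
  build_head_request "$path" "$host" >&3
  # The status line looks like "HTTP/1.1 200 OK"; keep the second field
  read -r proto code rest <&3
  exec 3>&-
  printf '%s\n' "$code"
}
```

For example, running probe_status webserver.site.com 80 /ok.gif on the node should print the same status code that the manual telnet test returned.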

Without the above change, the following happens when SecondLevelMonitor is enabled. This is what you see in /var/VRTSvcs/log/Apache_A.log:

2009/01/13 11:56:28 VCS ERROR V-16-2-13066 Thread(4136012704) Agent is calling clean for resource(webserver) because the resource is not up even after online completed.
2009/01/13 11:56:29 VCS NOTICE V-16-55005-10455 Resource(webserver) - (webserver:clean) VCSagentFW:SetupLogging:[clean] Entered by resource instance [webserver] with clean reason [3][Online Ineffective]
2009/01/13 11:56:34 VCS ERROR V-16-2-13068 Thread(4136012704) Resource(webserver) - clean completed successfully.
2009/01/13 11:56:34 VCS ERROR V-16-2-13071 Thread(4136012704) Resource(webserver): reached OnlineRetryLimit(0).
And this is what you see in /var/log/messages:
Jan 13 11:54:27 webserver Had[31344]: VCS ERROR V-16-1-20047 (webserver) Apache:webserver:monitor:  HTTP GET test failed for host [webserver.site.com] port [80]
Jan 13 11:55:28 webserver Had[31344]: VCS ERROR V-16-1-20047 (webserver) Apache:webserver:monitor:  HTTP GET test failed for host [webserver.site.com] port [80]
Jan 13 11:56:28 webserver Had[31344]: VCS ERROR V-16-1-20047 (webserver) Apache:webserver:monitor:  HTTP GET test failed for host [webserver.site.com] port [80]
Jan 13 11:56:28 webserver AgentFramework[31357]: VCS ERROR V-16-1-13066 Thread(4136012704) Agent is calling clean for resource(webserver) because the resource is not up even after online completed.
Jan 13 11:56:28 webserver Had[31344]: VCS ERROR V-16-1-13066 (webserver) Agent is calling clean for resource(webserver) because the resource is not up even after online completed.
Jan 13 11:56:34 webserver AgentFramework[31357]: VCS ERROR V-16-1-13068 Thread(4136012704) Resource(webserver) - clean completed successfully.

Eventually, the Apache resource will be FAULTED. The reason is that Siteminder blocks the simple query sent by Apache.pm. Inside /var/log/messages, you will see something similar to this:

Jan 13 11:56:34 webserver AgentFramework[31357]: VCS ERROR V-16-1-13071 Thread(4136012704) Resource(webserver): reached OnlineRetryLimit(0).
Jan 13 11:56:35 webserver Had[31344]: VCS ERROR V-16-1-10303 Resource webserver (Owner: unknown, Group: webserver_grp) is FAULTED (timed out) on sys webserver
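
To check quickly whether a node is hitting this symptom, the failed probes can be counted in the system log. This is a small sketch that assumes the message ID V-16-1-20047 and the "HTTP GET test failed" text seen in the log lines above are what identify a failed probe; the function name count_probe_failures is made up for illustration.

```shell
# Hypothetical helper: count failed second-level probes in a log file.
# $1 = log file to scan (e.g. /var/log/messages)
count_probe_failures() {
  # grep -c prints the number of matching lines (0 when there are none)
  grep -c 'V-16-1-20047.*HTTP GET test failed' "$1"
}
```

A count that keeps growing while Apache itself answers the manual telnet test is a hint that the probe, not the web server, is the problem.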

Do note that this change is not supported by Symantec, even though the suggestion came from Symantec after I logged a case with them about VCS + Apache not working with Siteminder. You may need to back up this Apache.pm file in case it gets overwritten during patching or when you need to get support from Symantec.
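
The backup, and a check after patching that the change is still in place, could be scripted roughly as below. This is a sketch only: the function names backup_agent_file and has_host_patch and the dated .bak suffix are made up for illustration, and the check simply looks for the modified probe string shown earlier.

```shell
# Hypothetical helpers; the file path is passed in rather than hard-coded.

# Copy the file next to itself with a date suffix, keeping mode and timestamps.
# $1 = file to back up; prints the name of the backup on success
backup_agent_file() {
  local src=$1 dest
  dest="$src.$(date +%Y%m%d).bak"
  cp -p "$src" "$dest" && printf '%s\n' "$dest"
}

# Succeed only if the probe line still carries the HTTP/1.1 + Host change.
# $1 = file to inspect
has_host_patch() {
  # -F treats the pattern as a fixed string, so $ and \n are matched literally
  grep -qF 'HEAD $sGetFile HTTP/1.1\nHost: $sHost' "$1"
}
```

For example, backup_agent_file /opt/VRTSvcs/bin/Apache/Apache.pm before patching, then has_host_patch /opt/VRTSvcs/bin/Apache/Apache.pm afterwards to see whether the modification needs to be reapplied.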

That's all, folks.
