Whenever WebSphere Application Server has a thread that has been running for a long time (more than 600 seconds by default), it reports the thread with a WSVR0605W message. WebLogic behaves in a similar way.
WSVR0605W: Thread
According to the source below, WebSphere Application Server can be made to generate a javacore whenever a potentially hung thread is reported. On Solaris, we call this a thread dump.
Source : [http://www-01.ibm.com/support/docview.wss?uid=swg21448581]
This javacore file can be helpful in troubleshooting server hangs and performance issues.
If the jobs running in your system usually take a long time, you may want to raise the monitoring threshold above 600 seconds (custom property "com.ibm.websphere.threadmonitor.threshold", in seconds); otherwise you might get false reports.
The page states that the custom property "com.ibm.websphere.threadmonitor.dump.java" must be set to true.
h4. Steps to enable auto thread dump.
1. Log in to the administrative console and click Servers > Application Servers > server_name.
2. Under Server Infrastructure, click Administration > Custom Properties.
3. Click New.
4. Add the following property:
   Name: com.ibm.websphere.threadmonitor.dump.java
   Value: true
5. Click Apply.
6. Click OK and save the configuration changes.
7. Restart the application server for the changes to take effect.
done.
If you want to trigger a thread dump manually, send signal 3 (SIGQUIT) to the server's Java process: kill -3 followed by the process ID.
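For scripting the manual trigger, here is a minimal Python sketch that does the same thing as kill -3. The function name `request_thread_dump` is my own, and the sketch assumes a Unix-like host and that you already know the PID of the application server's JVM.

```python
import os
import signal


def request_thread_dump(pid: int) -> None:
    """Send SIGQUIT (signal 3) to a process, the same as `kill -3 <pid>`.

    A JVM normally reacts by writing a thread dump (a javacore on IBM JVMs)
    and keeps running. A process without a SIGQUIT handler is terminated,
    so only point this at a Java process.
    """
    os.kill(pid, signal.SIGQUIT)
```

Call it as `request_thread_dump(pid)` with the PID of the server process. You need to be the owner of that process or root, exactly as with the kill command.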
Monday, July 16, 2012
How to Configure WebSphere Application Server hung thread detector to automatically produce javacores or thread dump
Wednesday, July 11, 2012
Restarting Application Server by Node Agent
I learnt that in WebSphere Application Server 7, by default, the node agent does not take any action when an application server fails.
To get the node agent to monitor and automatically restart a failed application server instance, we must set up the monitoring policy for that application server.
Go to the deployment manager console, and do the following:
1 .
2. Check the “Automatic Restart” box
3. In the “Node Restart State”, set the state to “STOPPED”
Whenever an application server fails or is killed, the node agent will now restart it automatically.
If the state is set to "RUNNING", not only will the node agent restart a failed or killed application server, it WILL ALSO auto start the application server upon a node agent restart.
Tuesday, July 10, 2012
Resolving ADMR0104E for Application Server
This write-up records the resolution of the ADMR0104E error that WebSphere Application Server encountered during startup. The application server was eventually unable to start.
From "SystemOut.log", we see that the system is unable to read a properties file.
[6/27/12 12:08:35:103 SGT] 00000000 FileDocument E ADMR0104E: The system is unable to read document cells/Cell01/nodes/Node01/node-metadata.properties: java.io.IOException: Permission denied
at java.io.File.checkAndCreate(File.java:1715)
at java.io.File.createTempFile(File.java:1803)
at com.ibm.ws.management.repository.FileDocument.createTempFile(FileDocument.java:564)
at com.ibm.ws.management.repository.FileDocument.read(FileDocument.java:500)
at com.ibm.ws.management.repository.FileRepository.extractInternal(FileRepository.java:1134)
Some research and checks revealed that the permissions on the temp directory under the application server profile had been changed. The application server could then no longer write to the temp directory for the node in the directory below.
# ls -ltr
total 0
drwxr-xr-x 3 root system 256 Jun 27 11:54 download
The cause was that the application server had been started as root, which is why the temp directory above is owned by root.
You should check the ffdc directory as well.
# ls -l /
drwxr-xr-x 2 appusr appgrp 49152 Jun 27 13:50 ffdc
From what I found online, the directory owner and the user running the process should be in the same group, and the permissions should be at least 774. To be safe, change the ownership/group as required under /
Once the ownership is reverted to "appusr", we should see the result below.
# chown -R appusr:appgrp download
# ls -ltr
total 0
drwxr-xr-x 3 appusr appgrp 256 Jun 27 11:54 download
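To catch this kind of ownership drift before the next restart, a small script can walk a directory tree and list everything that is not owned by the expected user. This is only a sketch: the function name `find_foreign_owned` is hypothetical, and which directory to scan is up to you.

```python
import os
from typing import List


def find_foreign_owned(root: str, expected_uid: int) -> List[str]:
    """Return root and every path below it whose owner UID differs from expected_uid."""
    mismatches = []
    # lstat is used so that symlinks are checked themselves, not their targets.
    if os.lstat(root).st_uid != expected_uid:
        mismatches.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != expected_uid:
                mismatches.append(path)
    return mismatches
```

The UID for a user name such as "appusr" can be looked up with `pwd.getpwnam("appusr").pw_uid`. An empty result means nothing under the directory is owned by someone else, for example root.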
The application server is able to start up now.
[6/27/12 12:21:56:692 SGT] 00000000 AdminTool A ADMU3000I: Server appsrv open for e-business; process id is 4128910
To compare the user that runs the application server process against the file system permissions, do the following:
1. Open the admin console
2. Open Servers –> Application Servers –>
3. Open Java and Process Management –> Process Execution
4. Look for the Run As User and Run As Group values of the executing user
That's all.
Monday, July 09, 2012
Recover WebSphere password
I googled and found this interesting way to recover a WebSphere 7.1 password.
To encode a password, we have:
/
The output is
decoded password == "secret", encoded password == "{xor}LDo8LTor"
Hence, you can use the same method to decode the encoded password.
/
The output is
encoded password == "{xor}LDo8LTor", decoded password == "secret"
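The {xor} scheme is simple enough to reproduce by hand: each byte of the password is XORed with the underscore character (0x5F) and the result is Base64-encoded. The sketch below is not the IBM tool, just a Python illustration of that scheme. The function names are my own.

```python
import base64

XOR_PREFIX = "{xor}"
XOR_KEY = 0x5F  # ASCII underscore, the fixed key of the {xor} scheme


def decode_xor(encoded: str) -> str:
    """Turn a '{xor}...' value back into the clear-text password."""
    if not encoded.startswith(XOR_PREFIX):
        raise ValueError("value does not start with {xor}")
    raw = base64.b64decode(encoded[len(XOR_PREFIX):])
    return bytes(b ^ XOR_KEY for b in raw).decode("utf-8")


def encode_xor(plain: str) -> str:
    """Produce the '{xor}...' form of a clear-text password."""
    masked = bytes(b ^ XOR_KEY for b in plain.encode("utf-8"))
    return XOR_PREFIX + base64.b64encode(masked).decode("ascii")


print(decode_xor("{xor}LDo8LTor"))  # prints: secret
print(encode_xor("secret"))         # prints: {xor}LDo8LTor
```

Since the key is fixed and public, anyone who can read the properties file can recover the password this way.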
If you are interested, you can also update the password for the deployment manager and nodes without knowing the current password. Check out
Friday, July 06, 2012
Encrypting the ID and Password for WebSphere Application Server
By default, you need to supply the ID and password when starting up or shutting down the deployment manager, node or application server. Examples of the commands are below.
Deployment Manager/
Node/
Application Server/
The steps to encrypt the password and ID are as follows.
Insert the ID and password in clear text into the SOAP properties file at /
# grep SOAP.login soap.client.props | grep -v "#"
com.ibm.SOAP.loginUserid=wasadm
com.ibm.SOAP.loginPassword=wasadm
com.ibm.SOAP.loginSource=prompt
We use the IBM-provided script to encode the password.
/
Taking a look at the same properties file again, the password is now encoded.
# grep SOAP.login soap.client.props | grep -v "#"
com.ibm.SOAP.loginUserid=wasadm
com.ibm.SOAP.loginPassword={xor}Es4zPjwS
com.ibm.SOAP.loginSource=prompt
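If you manage several profiles, it may be handy to check from a script whether the password in a properties file is still in clear text. The sketch below assumes simple key=value lines like the ones shown above and does not handle every feature of the Java properties format (line continuations, ":" separators). Both function names are hypothetical.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def password_is_encoded(props: Dict[str, str]) -> bool:
    """True when com.ibm.SOAP.loginPassword carries the {xor} prefix."""
    return props.get("com.ibm.SOAP.loginPassword", "").startswith("{xor}")


clear_text = "com.ibm.SOAP.loginUserid=wasadm\ncom.ibm.SOAP.loginPassword=wasadm\n"
print(password_is_encoded(parse_properties(clear_text)))  # prints: False
```

To check a real file, read its contents and pass them to `parse_properties`.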
Now we can start up and shut down WebSphere without supplying the password.
su wasadm -c "/
su wasadm -c "/
su wasadm -c "/
su wasadm -c "/
su wasadm -c "/
su wasadm -c "/
end.
======================
Some trivia.
Why does IBM prefer XOR over a stronger algorithm, the way WebLogic uses 3DES? XOR is good enough only to prevent casual snooping.
Someone demonstrated that with an online decoder:
http://www.poweredbywebsphere.com/decoder.html
Thursday, July 05, 2012
Change WebSphere Ports without Reinstalling
Scenario: you have WebSphere Application Server 7.1 installed as ND. If several cells on the same host use the default ports and you want to access them concurrently, you may want to change the ports on one of the cells.
1. Go to the master config repository for the server ports (Dmgr profiles directory)
2. Backup the current serverindex.xml
3. Edit each of the ports in this file. (Dmgr will use the new ports)
4. Repeat this process for all nodes in the master repository (Node profiles directories)
5. For all cells,
6. Backup virtualhosts.xml
7. Edit all the ports. (Nodes will use these ports to connect with the Dmgr.)
8. Start the dmgr (startManager.sh)
9. For each node, execute a syncNode so that the nodes get their new port assignments from the master repository
/
Use the new SOAP port set in step #3.
10. Start up each node
11. Start up each application server.
Confirm which new ports you want to use before you start.
To make the new ports easier to remember, instead of the usual 80, try prepending digits, for example 9080, 19080, etc.
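Before editing, it can help to list the ports currently defined in serverindex.xml and preview the new values. The sketch below uses a cut-down, hypothetical sample: the real file also carries XMI namespaces and ids, and the element and attribute names used here (serverEntries, specialEndpoints, endPoint) are what I expect to see in it, so check them against your own copy. The function names and the offset of 10000 are illustrative only.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Hypothetical, simplified stand-in for a real serverindex.xml.
SAMPLE = """<ServerIndex hostName="host1">
  <serverEntries serverName="dmgr" serverType="DEPLOYMENT_MANAGER">
    <specialEndpoints endPointName="SOAP_CONNECTOR_ADDRESS">
      <endPoint host="host1" port="8879"/>
    </specialEndpoints>
    <specialEndpoints endPointName="WC_adminhost">
      <endPoint host="*" port="9060"/>
    </specialEndpoints>
  </serverEntries>
</ServerIndex>"""

PortMap = Dict[Tuple[str, str], int]


def list_ports(xml_text: str) -> PortMap:
    """Map (server name, endpoint name) to the configured port."""
    ports = {}
    root = ET.fromstring(xml_text)
    for server in root.iter("serverEntries"):
        for named in server.iter("specialEndpoints"):
            endpoint = named.find("endPoint")
            if endpoint is not None:
                key = (server.get("serverName"), named.get("endPointName"))
                ports[key] = int(endpoint.get("port"))
    return ports


def shifted(ports: PortMap, offset: int) -> PortMap:
    """Preview the ports with an offset added, refusing values above 65535."""
    result = {}
    for key, port in ports.items():
        if port + offset > 65535:
            raise ValueError("port out of range for %s/%s" % key)
        result[key] = port + offset
    return result


for (server, name), port in sorted(shifted(list_ports(SAMPLE), 10000).items()):
    print(server, name, port)
```

The script only reads and previews. The actual edit of serverindex.xml is still done by hand as in the steps above, after taking the backup.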
Done.
Monday, July 02, 2012
How to resolve LVM error in PowerHA
In the event you run into the following error:
cl_mklv: Operation is not allowed because vg is a RAID concurrent volume group.
This may be caused by the volume group being varied on, on the other node. If it should not be varied on, on the other node, run:
# varyoffvg vg
And then retry the LVM command.
If it continues to be a problem, stop PowerHA 7.1 on both nodes, export the volume group and re-import it on both nodes, and then restart the cluster.