Phantom Websphere: September 2012

What should you do if someone accidentally removes system-critical files in rootvg?

# rm -rf ~

In this case, everything under root's home directory is removed. On AIX, root's home directory is /, so you will see /admin, /dev, /bin, etc. being deleted. If you are quick to notice the mistake and halt the rm command, part of the system may still be intact.

IMPORTANT: Keep your existing SSH session alive at all costs. Otherwise you will have to work on a terminal via the HMC or similar, which is going to be painful.

h2. So, it's "oh shit", right?

Hopefully, /lib is not removed yet, else you are in bigger shit.

ssh, rsync, and scp will all stop working. Let's do a little self-repair before recovering the rest of the files.

Is your tar command gone? What commands do you have left?
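One way to answer that question is to rely on shell builtins only, since the shell in your surviving session is already loaded in memory. The sketch below assumes a POSIX-style shell where `command -v` is a builtin (in ksh, `whence` does the same job); the helper name `check_tools` and the list of tools are only illustrative.

```shell
# check_tools NAME...: report which commands can still be found.
# Uses only shell builtins (for, if, command -v, echo), so it does not
# depend on anything that may have been deleted from disk.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: present"
        else
            echo "$tool: MISSING"
        fi
    done
}

# Illustrative list of tools that matter for the recovery steps.
check_tools tar perl mkdir mknod restore gzip
```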

h2. Recover mkdir

See http://coding-journal.com/restoring-your-unix-system-after-rm-rf/ for background on this one. Try the following.

# echo "mkdir 'bin', 0777;" | perl

This assumes that you have lost your mkdir command but still have perl. The command creates bin relative to the current directory, so run it from / to create /bin. The full 0777 permission is only for the emergency; tighten it later.

h2. Recovering /dev

If /dev is lost, you may need to recreate some of the more critical device nodes so that ssh and scp can bring in your backup (mksysb). The steps below were used on AIX 7.1 SP4.

# cd /dev

# mknod random c 36 0
# mknod urandom c 36 1

# mknod null c 2 2
# chmod 644 random urandom

# chmod 666 null

Go ahead and try ssh or rsync. If it does not work, you may need to restart sshd.

# stopsrc -s sshd

# startsrc -s sshd

If you have another server with a similar build, you can also try to recreate the disk device nodes, but this is not critical if you have a backup that you can extract later. The names should follow a standard convention, since IBM names the basic rootvg logical volumes the same way on AIX 7.1. Check the major and minor numbers against the other server before reusing the values here.

# mknod hd1 b 10 8
# mknod hd2 b 10 5
# mknod hd3 b 10 7
# mknod hd4 b 10 4
# mknod hd5 b 10 1
# mknod hd6 b 10 2
# mknod hd8 b 10 3
# mknod hd9var b 10 6
# mknod hd10opt b 10 9
# mknod hd11admin b 10 10
# chmod 660 hd1 hd10opt hd11admin hd2 hd3 hd4 hd5 hd6 hd8 hd9var

The matching character (raw) devices carry an r prefix on AIX:

# mknod rhd1 c 10 8
# mknod rhd2 c 10 5
# mknod rhd3 c 10 7
# mknod rhd4 c 10 4
# mknod rhd5 c 10 1
# mknod rhd6 c 10 2
# mknod rhd8 c 10 3
# mknod rhd9var c 10 6
# mknod rhd10opt c 10 9
# mknod rhd11admin c 10 10
# chmod 660 rhd1 rhd10opt rhd11admin rhd2 rhd3 rhd4 rhd5 rhd6 rhd8 rhd9var
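Rather than typing the major and minor numbers by hand, you could generate the mknod commands from an `ls -l` listing taken on the healthy reference server. The sketch below is only an illustration: `gen_mknod` is a hypothetical helper, and it assumes the listing prints the device numbers as two fields such as `10,  8` in the position where a regular file would show its size.

```shell
# gen_mknod: read `ls -l` output for device nodes on stdin and print the
# equivalent mknod commands. Lines that are not block or character
# devices (for example the "total" line) are ignored.
# Assumed columns: mode links owner group major, minor month day time name
gen_mknod() {
    awk '$1 ~ /^[bc]/ {
        type = substr($1, 1, 1)          # b = block, c = character
        major = $5
        sub(/,/, "", major)              # drop the comma after the major number
        print "mknod " $NF " " type " " major " " $6
    }'
}
```

On the reference server, something like `cd /dev && ls -l hd* rhd* | gen_mknod` would print commands that can be reviewed and then replayed in /dev on the damaged host.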

h2. Let's bring back the files

After you bring in your backup, you can then commence restoration. I will use a mksysb file as the example.

Say we need to recover /dev, /admin, bosinst.data, etc. First double-check that the archive is readable and that the original files are inside it.

# restore -alvTf mksysb_mysever_date > /tmp/mksysb.server.txt
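Once the listing has been written to a file, a quick check can confirm that the paths you care about are really in the archive. This is a minimal sketch; `in_listing` is a hypothetical helper, and it assumes grep is still available and that the listing shows paths in the same `./name` form used by the restore commands.

```shell
# in_listing LISTING PATH...: report whether each PATH appears in LISTING.
# Matching is a fixed-string substring match, so "./dev" would also match
# a longer path such as "./devices"; treat the result as a rough check.
in_listing() {
    listing=$1
    shift
    for p in "$@"; do
        if grep -F "$p" "$listing" >/dev/null; then
            echo "$p: found"
        else
            echo "$p: NOT in archive listing"
        fi
    done
}
```

For example: `in_listing /tmp/mksysb.server.txt ./bosinst.data ./dev ./admin`.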

Then proceed to restore.

# restore -xvqf mksysb_mysever_date ./bosinst.data

# mv bosinst.data /

# restore -xvqf mksysb_mysever_date ./dev

# cd ./dev

# mv * /dev/

# cd ..

# restore -xvqf mksysb_mysever_date ./admin

# cd ./admin

# mv * /admin/

# cd ..

# restore -xvqf mksysb_mysever_date ./.ssh

# mv .ssh /

# restore -xvqf mksysb_mysever_date ./.profile

# mv .profile /

And so on and so forth.
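When there are many paths to bring back, the extraction step can be put in a loop. The sketch below only covers the `restore -xvqf` part, not the follow-up mv into place. `restore_paths` is a hypothetical helper, and the `RESTORE` variable is my own addition so that the loop can be dry-run with `echo` before touching anything.

```shell
# restore_paths ARCHIVE PATH...: extract each PATH from ARCHIVE into the
# current directory, using the same restore flags as the commands above.
# Set RESTORE=echo to print the commands instead of running them.
restore_paths() {
    archive=$1
    shift
    for p in "$@"; do
        ${RESTORE:-restore} -xvqf "$archive" "$p" || echo "failed: $p" >&2
    done
}
```

A dry run would look like `RESTORE=echo restore_paths mksysb_mysever_date ./bosinst.data ./dev ./admin`. Run it from a scratch directory and move the extracted files into place afterwards.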

If you have another server with a similar build, you may want to go the extra step and verify whether anything else is still missing.
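One possible way to do that comparison is to produce a file list on each server (for example with `find /dev /admin /etc -print`) and compare the two lists. This is a sketch under those assumptions; `missing_on_local` is a hypothetical helper and the temporary file names are arbitrary.

```shell
# missing_on_local REF_LIST LOCAL_LIST: print the paths that appear in
# REF_LIST (taken on the reference server) but not in LOCAL_LIST (taken
# on the recovered server). Both files hold one path per line.
missing_on_local() {
    ref_sorted=/tmp/ref.sorted.$$
    local_sorted=/tmp/local.sorted.$$
    sort "$1" > "$ref_sorted"
    sort "$2" > "$local_sorted"
    comm -23 "$ref_sorted" "$local_sorted"   # lines unique to the reference list
    rm -f "$ref_sorted" "$local_sorted"
}
```

Some differences are expected between any two hosts, so read the output as a list of candidates to check rather than a list of things to restore.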

In addition, reboot at the earliest opportunity to ensure all is working well. Nothing is confirmed until it is tested and proven working.

I found that the audit logs grow too much on my new servers.

myserver:/:>audit query | head -2
auditing on
bin processing off

Audit records events such as su, passwd, file changes, cron, mail, tcpip, lvm, etc. Since the audit files are kept on a separate partition in my case, the risk of them filling up other filesystems is low.

myserver:/:>df -k | grep audit
/dev/fslv00 262144 227972 84% 8 1% /audit

myserver:/:>ls -l /audit/
total 67608
-rw------- 1 root system 0 Sep 14 16:43 auditb
-rw-rw---- 1 root system 10453248 Sep 14 16:43 bin1
-rw-rw---- 1 root system 11456 May 14 10:25 bin2
drwxr-xr-x 2 root system 256 Jul 10 14:43 lost+found
-rw-r----- 1 root system 34589752 May 14 10:24 trail

Although binsize in /etc/security/audit/config is set to 10240 (bytes), the bin1 and bin2 files did not stay within that 10 KB limit; bin1 has grown to 10453248 bytes.

Also, there is a cron job that 'rotates' the trail log file, but it does not compress the rotated file, so disk space is still being hogged.

myserver:/:>crontab -l | grep audit
0 * * * * /etc/security/aixpert/bin/cronaudit

So, let me suggest a workaround.

For the cron script, we add a line to gzip the rotated log file after the old file has been moved.

mv /audit/trail /audit/trailOneLevelBack
gzip /audit/trailOneLevelBack

For the bin1 and bin2 files, stop audit, rotate the files, and start audit again.

# audit shutdown
# cp -p /audit/bin1 /audit/bin1.
# cp -p /audit/bin2 /audit/bin2.
# gzip /audit/bin1.
# gzip /audit/bin2.
# cp /dev/null /audit/bin1
# cp /dev/null /audit/bin2
# audit start

Be careful not to change the inode of the files, which is why they are emptied with cp /dev/null rather than deleted and recreated. Otherwise, from what I have read online, audit might get 'confused' and stop writing audit logs into the bin files. You might then need to reboot the host for audit to recover.
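The copy, compress, and truncate steps can be wrapped in one function so that the same sequence is applied to each bin file. This is a minimal sketch, not a tested AIX procedure: `rotate_in_place` is a hypothetical helper, the suffix argument is my own addition, and `gzip -f` is used so that an older compressed copy with the same name is overwritten.

```shell
# rotate_in_place FILE SUFFIX: keep a compressed copy of FILE as
# FILE.SUFFIX.gz, then empty FILE by copying /dev/null over it.
# Copying over an existing file rewrites its contents in place, so the
# inode number of FILE does not change.
rotate_in_place() {
    cp -p "$1" "$1.$2" &&
    gzip -f "$1.$2" &&
    cp /dev/null "$1"
}

# Example use, with auditing stopped around the rotation:
#   audit shutdown
#   rotate_in_place /audit/bin1 old
#   rotate_in_place /audit/bin2 old
#   audit start
```

Comparing `ls -i /audit/bin1` before and after the rotation is a simple way to confirm that the inode number stayed the same.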

Friday, September 28, 2012

How to recover files in AIX from a rm -rf / command

Thursday, September 27, 2012

Rotating AIX audit log
