Friday, September 28, 2012

How to recover files in AIX after an rm -rf / command

What should you do if someone accidentally removes system-critical files in rootvg?

# rm -rf ~

In this case, everything in root's home directory is removed. On AIX root's home directory is /, so you will see /admin, /dev, /bin, etc. being deleted. If you are quick to notice the mistake and halt the rm command, part of the system will still be intact and can be repaired.

IMPORTANT: Keep your existing SSH session alive at all costs. Otherwise, working on a terminal via HDMC or similar is going to be painful.

h2. So, it's "oh shit", right?

Hopefully /lib has not been removed yet, or else you are in even bigger shit.

ssh, rsync and scp will all stop working. Let's do a little self-repair before recovering the rest of the files.

Is your tar command gone? Which commands do you have left?
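
One way to answer that question without relying on ls or which (both may be gone) is to ask the shell itself. The sketch below is only an illustration: the function name check_tools and the list of tools are my own choices, and it assumes your shell provides the command builtin (in ksh, whence does a similar job).

```shell
# check_tools: report which of the named commands can still be found.
# It uses only shell builtins (command, echo), so it does not depend
# on anything under /usr/bin still being there.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: present"
        else
            echo "$tool: MISSING"
        fi
    done
}

# Example list only; adjust to the tools you care about.
check_tools tar perl mkdir mknod cp mv restore scp
```

If ls itself is missing, echo /usr/bin/* lists a directory using nothing but the shell.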

h2. Recover mkdir

Read from http://coding-journal.com/restoring-your-unix-system-after-rm-rf/ about this one. Try the following.

# echo "mkdir 'bin', 0777;" | perl

This assumes that you have lost your mkdir command but still have perl. The path is relative, so run it from / to create /bin. The full 0777 permission is only for the emergency; tighten it later.
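
If you need to do this more than once, a small wrapper saves typing. This is a minimal sketch, not something taken from the linked article: perl_mkdir is a hypothetical helper name, it takes an absolute path so the result does not depend on the current directory, and it defaults to mode 0755 (still subject to your umask).

```shell
# perl_mkdir DIR [MODE]: create DIR with perl's built-in mkdir.
# MODE is an octal string such as 0755; perl's oct() converts it.
perl_mkdir() {
    perl -e 'mkdir($ARGV[0], oct($ARGV[1])) or die "mkdir $ARGV[0]: $!\n"' \
        "$1" "${2:-0755}"
}

# Usage (not run here):
#   perl_mkdir /bin 0755
```

The function returns non-zero and prints perl's error message if the directory cannot be created, for example when it already exists.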

h2. Recovering /dev

If /dev is lost, you may need to recreate some of the more critical device files so that scp and ssh can bring in your backups (mksysb). The steps below were used on AIX 7.1 SP4.

# cd /dev
# mknod random c 36 0
# mknod urandom c 36 1
# mknod null c 2 2
# chmod 644 random urandom
# chmod 666 null
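
Before moving on, it may be worth confirming that the nodes really are character devices. The check below is a sketch using only the shell's [ builtin; check_chardevs is a name I made up for illustration.

```shell
# check_chardevs: verify that each path exists and is a character device.
# Returns non-zero if any of them is missing or of the wrong type.
check_chardevs() {
    rc=0
    for dev in "$@"; do
        if [ -c "$dev" ]; then
            echo "$dev: ok"
        else
            echo "$dev: not a character device"
            rc=1
        fi
    done
    return $rc
}

check_chardevs /dev/null /dev/random /dev/urandom || echo "some device files still need attention"
```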

Go ahead and try ssh or rsync. If it does not work, you may need to restart sshd.

# stopsrc -s sshd
# startsrc -s sshd


If you have another server with a similar build, you can also try to recreate the disk device files, but this is not critical if you have a backup that you can extract later. The names should follow a standard convention, since IBM names all the basic logical volumes the same way on AIX 7.1. Check the major and minor numbers against the other server (ls -l /dev) before reusing the ones below.


# mknod hd1 b 10 8
# mknod hd2 b 10 5
# mknod hd3 b 10 7
# mknod hd4 b 10 4
# mknod hd5 b 10 1
# mknod hd6 b 10 2
# mknod hd8 b 10 3
# mknod hd9var b 10 6
# mknod hd10opt b 10 9
# mknod hd11admin b 10 10
# chmod 660 hd1 hd10opt hd11admin hd2 hd3 hd4 hd5 hd6 hd8 hd9var

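The character (raw) devices use the same numbers but carry an r prefix, since the plain names are already taken by the block devices:

# mknod rhd1 c 10 8
# mknod rhd2 c 10 5
# mknod rhd3 c 10 7
# mknod rhd4 c 10 4
# mknod rhd5 c 10 1
# mknod rhd6 c 10 2
# mknod rhd8 c 10 3
# mknod rhd9var c 10 6
# mknod rhd10opt c 10 9
# mknod rhd11admin c 10 10
# chmod 660 rhd1 rhd10opt rhd11admin rhd2 rhd3 rhd4 rhd5 rhd6 rhd8 rhd9var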
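
Rather than trusting the numbers above, you can generate the mknod commands from a healthy reference server. The sketch below is an assumption-laden illustration: gen_mknod is a hypothetical helper, and it assumes ls -l prints device lines with the major and minor numbers in the fifth and sixth columns (for example "10,  8"), which is worth checking on your own system first.

```shell
# gen_mknod: read "ls -l" output on stdin and print one mknod command
# for every block (b) or character (c) device line it finds.
gen_mknod() {
    awk '
        /^[bc]/ {
            type = substr($1, 1, 1)
            major = $5
            if (sub(/,$/, "", major)) {
                minor = $6                 # "10," and "8" in separate fields
            } else {
                split($5, mm, ",")         # "10,100" fused into one field
                major = mm[1]; minor = mm[2]
            }
            n = split($NF, parts, "/")     # drop any leading directory
            print "mknod " parts[n] " " type " " major " " minor
        }'
}

# Usage on the reference server (not run here):
#   ls -l /dev | grep hd | gen_mknod
```

Review the generated commands by eye before running them in /dev on the damaged server.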


h2. Let's bring back the files

After you bring in your backup, you can then commence restoration. I will show an example using a mksysb file.

Say we need to recover /dev, /admin, bosinst.data, etc. First, double check that the backup file is usable and that the original files are inside the archive.

# restore -alvTf mksysb_mysever_date > /tmp/mksysb.server.txt
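
With the listing saved to a file, a quick search tells you whether the items you want are in the archive. This is a rough sketch: in_listing is a made-up helper, and it does a plain substring match because I am not assuming anything about the exact column layout of the listing.

```shell
# in_listing LISTING PATH...: report whether each PATH appears
# anywhere in the saved archive listing (fixed-string match).
in_listing() {
    listing=$1
    shift
    for path in "$@"; do
        if grep -F -e "$path" "$listing" >/dev/null 2>&1; then
            echo "$path: found"
        else
            echo "$path: NOT in listing"
        fi
    done
}

# Usage (not run here):
#   in_listing /tmp/mksysb.server.txt ./bosinst.data ./dev ./admin ./.ssh ./.profile
```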

Then proceed to restore.

# restore -xvqf mksysb_mysever_date ./bosinst.data
# mv bosinst.data /

# restore -xvqf mksysb_mysever_date ./dev
# cd ./dev
# mv * /dev/

# restore -xvqf mksysb_mysever_date ./admin
# cd ./admin
# mv * /admin/

# restore -xvqf mksysb_mysever_date ./.ssh
# mv .ssh /

# restore -xvqf mksysb_mysever_date ./.profile
# mv .profile /

And so on and so forth.

If you have another server with a similar make or build, you may want to go the extra step and verify whether anything else is still missing.

In addition, go for a reboot at the earliest opportunity to ensure all is working well. Nothing is confirmed until it is tested and proven working.
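
One way to do that comparison is to dump a file list on each server and diff them. The following is only a sketch under my own assumptions: missing_paths is a hypothetical helper, the lists are plain one-path-per-line files, and /tmp is writable.

```shell
# missing_paths REF_LIST LOCAL_LIST: print the paths that appear in
# REF_LIST (the healthy server) but not in LOCAL_LIST (the repaired one).
missing_paths() {
    ref_sorted=/tmp/ref_sorted.$$
    loc_sorted=/tmp/loc_sorted.$$
    sort "$1" > "$ref_sorted"
    sort "$2" > "$loc_sorted"
    comm -23 "$ref_sorted" "$loc_sorted"   # lines unique to the reference list
    rm -f "$ref_sorted" "$loc_sorted"
}

# Usage (not run here), after creating a list on each server with e.g.
#   find / -xdev -print > /tmp/files.list
# and copying the reference list over:
#   missing_paths /tmp/reference.list /tmp/files.list
```

Expect some noise from files that legitimately differ between hosts, such as logs and host keys.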


Thursday, September 27, 2012

Rotating AIX audit log

I found that the audit logs grow too large on my new servers.

myserver:/:>audit query | head -2
auditing on
bin processing off


Auditing records events such as 'su', 'passwd', file changes, cron, mail, tcpip, lvm, etc. Since the audit files are kept on a separate partition in my case, the risk of a disk-full condition spreading to other filesystems is not that great.

myserver:/:>df -k | grep audit
/dev/fslv00        262144    227972   84%        8     1% /audit


myserver:/:>ls -l /audit/
total 67608
-rw-------    1 root     system            0 Sep 14 16:43 auditb
-rw-rw----    1 root     system        10453248 Sep 14 16:43 bin1
-rw-rw----    1 root     system        11456 May 14 10:25 bin2
drwxr-xr-x    2 root     system          256 Jul 10 14:43 lost+found
-rw-r-----    1 root     system     34589752 May 14 10:24 trail


Although binsize in /etc/security/audit/config is set to 10240 (bytes), the bin1 and bin2 files did not stay within that 10 KB limit; bin1 has grown to about 10 MB.

Also, there is a cron job that 'rotates' the trail log file, but it does not compress the rotated file, so disk space is still being hogged.
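
To check this without eyeballing ls output, you can read binsize from the config and compare it with the actual file size. This is a sketch, assuming the config holds a line of the form "binsize = 10240"; get_binsize and bin_over_limit are names I invented for the example.

```shell
# get_binsize [CONFIG]: print the binsize value from the audit config.
get_binsize() {
    awk -F= '/^[ \t]*binsize[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2 }' \
        "${1:-/etc/security/audit/config}"
}

# bin_over_limit FILE LIMIT: succeed if FILE is larger than LIMIT bytes.
bin_over_limit() {
    size=$(( $(wc -c < "$1") ))   # arithmetic strips any padding from wc
    [ "$size" -gt "$2" ]
}

# Usage (not run here):
#   limit=$(get_binsize)
#   bin_over_limit /audit/bin1 "$limit" && echo "bin1 exceeds binsize"
```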

myserver:/:>crontab -l | grep audit
0 * * * * /etc/security/aixpert/bin/cronaudit


So, let me suggest a workaround.

For the cron script, we add a line to gzip the rotated log file after the old file has been moved. The -f flag lets gzip overwrite the compressed file left by the previous run.

mv /audit/trail /audit/trailOneLevelBack
gzip -f /audit/trailOneLevelBack



For the bin1 and bin2 files, stop auditing, rotate the files and start auditing again.

# audit shutdown
# cp -p /audit/bin1 /audit/bin1.
# cp -p /audit/bin2 /audit/bin2.

# gzip /audit/bin1.
# gzip /audit/bin2.

# cp /dev/null /audit/bin1
# cp /dev/null /audit/bin2

# audit start


Be careful not to change the inode of the files, which is why the files are emptied with cp /dev/null instead of being moved away. Otherwise, I read from Mr Google that audit might get 'confused' and stop writing audit logs into the bin files. You might then need to reboot the host for audit to recover.
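
The copy-then-truncate idea can be wrapped in a small function that also lets you confirm the inode did not change. This is a sketch only: rotate_in_place and inode_of are hypothetical helpers, the timestamp suffix is my own choice, and gzip is assumed to be installed.

```shell
# inode_of FILE: print the inode number of FILE.
inode_of() {
    ls -i "$1" | awk '{ print $1 }'
}

# rotate_in_place FILE: keep a compressed, timestamped copy of FILE,
# then empty FILE without replacing it, so its inode stays the same.
rotate_in_place() {
    file=$1
    stamp=$(date +%Y%m%d%H%M%S)
    cp -p "$file" "$file.$stamp" || return 1
    gzip "$file.$stamp" || return 1
    : > "$file"                    # truncate in place
}

# Usage (not run here), between "audit shutdown" and "audit start":
#   before=$(inode_of /audit/bin1)
#   rotate_in_place /audit/bin1
#   [ "$(inode_of /audit/bin1)" = "$before" ] && echo "inode unchanged"
```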