Phantom Websphere: Network

Monday, September 12, 2011

Basic networking TCP test using telnet

When telnetting to an IP address on a given port, telnet can respond in several ways. Knowing the difference between these responses can quickly point you in the right direction when a telnet to a host on a particular port is unsuccessful.

There is a distinct difference between getting a ‘refused’ response and a ‘timeout’ response.

You will get a connection refused message for one of the following reasons:

The application you are trying to test hasn’t been started/installed on the remote server.
There is a firewall rejecting the connection attempt by terminating the connection setup.

Example output from a Linux box:

$ telnet server2 7063
Trying 172.1.1.1...
telnet: connect to address 172.1.1.1: Connection refused
telnet: Unable to connect to remote host: Connection refused

A similar Connection refused message from a Solaris box:

$ telnet server3 7055
Trying 172.2.1.1...
telnet: Unable to connect to remote host: Connection refused

The Connect failed message is the equivalent from a Windows box:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\Documents and Settings\vickwan>telnet 172.2.1.1 7062
Connecting To 172.2.1.1...Could not open connection to the host, on port 7062: Connect failed

The telnet command aborts the attempted connection after waiting a predetermined time for a response. This is called a timeout response.

In some cases, telnet won’t abort but will just wait indefinitely, which is also known as hanging. Either symptom can be caused by one of the following:

The remote server can’t be reached on the destination network. It may not exist, or it could be turned off.
There could be a routing issue: either the request or the response never reaches its destination.
A firewall could be silently dropping the connection attempt, causing it to time out instead of being quickly refused.

Here is an example of the output:

$ telnet server3 7055
Trying 172.2.1.1...
telnet: connect to address 172.2.1.1: Connection timed out
telnet: Unable to connect to remote host: Connection timed out
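The refused/timeout distinction above can be sketched as a small shell function. This is only a minimal illustration: the function name classify_telnet_line is made up for this example, and it matches the Linux and Solaris wording shown in this post, not the Windows "Connect failed" text.

```shell
# Classify one line of telnet output as refused, timeout or unknown.
# The patterns are substrings of the Linux/Solaris messages quoted above;
# other platforms word these errors differently.
classify_telnet_line() {
    case "$1" in
        *"Connection refused"*)   echo "refused" ;;
        *"Connection timed out"*) echo "timeout" ;;
        *)                        echo "unknown" ;;
    esac
}

classify_telnet_line "telnet: Unable to connect to remote host: Connection refused"
classify_telnet_line "telnet: connect to address 172.2.1.1: Connection timed out"
```

A "refused" result points at the application or a rejecting firewall, while "timeout" points at routing, a dead host or a firewall that drops packets.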

The script, command file and input file.

Reference Adapted from : http://blog.ru.co.za/2009/09/29/telnet/

This little script was written to cut down the time needed to test whether an ACL allows a connection from server A to server B on a given port. It also attempts to suggest remedial actions.

Script tested on AIX 6.1, AIX 7.1, RHEL 4.6 and Solaris 9.

#!/bin/ksh
# Written By   : Victor Kwan
# Written On   : 25 Oct 2009
# EMAIL   : victorkk [AT] gmail [DOT] com
# Description  : Utility to test TCP ACL via telnet
# Updated On   : 27 Oct 2009 : Attempt to interpret telnet response.
#              : 09 Mar 2011 : Test if telnet command is executable
#              : 07 Apr 2011 : Support AIX, Improve code to be not chatty and terminate telnet session properly.

FILE=${1}
OUTPUTFILE="$0.output"
LOGFILE="$0.log"
TELNETCMD="$0.telnetcmd"
TELNET=`which telnet`
CAT=`which cat`
ECHO=`which echo`
OS="`uname -s`"

#UNIX Normal "Connection to 10.106.50.10 closed."
#UNIX No route "No route to host"
#UNIX Conn refused "telnet: Unable to connect to remote host: Connection refused"
#UNIX timed out "telnet: Unable to connect to remote host: Connection timed out"

RESPONSE_NORMAL="gn host."
RESPONSE_NO_ROUTE="to host"
RESPONSE_CONN_REFUSED="refused"
RESPONSE_TIMED_OUT="med out"

#AIX Normal "Connection closed."
#AIX No route "No route to host"
#AIX Conn refused "telnet: connect: A remote host refused an attempted connect operation."
#AIX timed out "telnet: connect: A remote host did not respond within the timeout period."

AIXRESPONSE_NORMAL="Connection closed."
AIXRESPONSE_NO_ROUTE="No route to host"
AIXRESPONSE_CONN_REFUSED="connect operation."
AIXRESPONSE_TIMED_OUT="he timeout period."

THISRESPONSE_NORMAL="$RESPONSE_NORMAL"
THISRESPONSE_NO_ROUTE="$RESPONSE_NO_ROUTE"
THISRESPONSE_CONN_REFUSED="$RESPONSE_CONN_REFUSED"
THISRESPONSE_TIMED_OUT="$RESPONSE_TIMED_OUT"

COLOR_BLUE="\033[0;34m"
COLOR_GREEN="\033[32m"
COLOR_RED="\033[31m"
COLOR_BRIGHTRED="\033[1;31m"
COLOR_WHITE="\033[0m"
COLOR_BRIGHTWHITE="\033[1;37m"

# Quote $TELNET so the test also fails cleanly when `which` found nothing.
if [ ! -x "$TELNET" ]
then
        echo "${COLOR_BRIGHTRED}Telnet command is not executable!!${COLOR_WHITE}"
        echo "${COLOR_WHITE}Script will now terminate.${COLOR_WHITE}"
        exit 1
fi

echo "Commence telnet test based on [$FILE] file."
echo

cat ${FILE} | grep -v "#" | while read LINE
do {
        IP=`echo $LINE | awk -F: '{print $1}'`
        PORT=`echo $LINE | awk -F: '{print $2}'`

        ($CAT $TELNETCMD) | $TELNET $IP $PORT >> $OUTPUTFILE 2>&1

        # The last line of the telnet output carries the outcome.
        RESPONSE=`tail -1 $OUTPUTFILE | tr -d "\r" | tr -d "\n"`
        if [ "$OS" = "AIX" ]
        then
        {
                THISRESPONSE_NORMAL="$AIXRESPONSE_NORMAL"
                THISRESPONSE_NO_ROUTE="$AIXRESPONSE_NO_ROUTE"
                THISRESPONSE_CONN_REFUSED="$AIXRESPONSE_CONN_REFUSED"
                THISRESPONSE_TIMED_OUT="$AIXRESPONSE_TIMED_OUT"
        }
        fi

        # Look for the expected fragment anywhere in the line, so the check
        # does not depend on the fragment's length or on the platform.
        case "$RESPONSE" in
        *"$THISRESPONSE_NORMAL"*)
                ;;
        *)
                echo "Telnet ${COLOR_BRIGHTRED}FAILED${COLOR_WHITE} for ${COLOR_BRIGHTWHITE}$IP:$PORT${COLOR_WHITE}."
                echo "${COLOR_BRIGHTRED}Error Message${COLOR_WHITE} : [$RESPONSE]!"

                case "$RESPONSE" in
                *"$THISRESPONSE_NO_ROUTE"*)
                        echo "${COLOR_GREEN}Suggestion${COLOR_WHITE}: Check routing at both source and destination"
                        ;;
                *"$THISRESPONSE_CONN_REFUSED"*|*"$THISRESPONSE_TIMED_OUT"*)
                        echo "${COLOR_GREEN}Suggestion${COLOR_WHITE}: Destination may not be listening, routable or firewall is blocking the connection."
                        ;;
                esac
                ;;
        esac
     	echo "Done for $IP:$PORT."
        echo " "
}
done
echo "Telnet test ends."

For the input file (e.g. IP_PORT), it is okay to have commented lines, as the script will ignore them.

~$more IP_PORT
#WLS
server2:7003
server2:7004
server2:7022
server2:7023
...
...
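The input format above (one host:port pair per line, with # comments) can be parsed without awk. A minimal sketch, assuming a POSIX shell with ${var%%pattern} expansion; the function name list_targets is made up for this example, and unlike the grep -v "#" in the script it only skips lines that start with #.

```shell
# Print "host port" for each host:port entry read from standard input.
# Comment lines, blank lines and lines without a colon are skipped.
list_targets() {
    while IFS= read -r line
    do
        case "$line" in
            '#'*) continue ;;   # comment line
            *:*)  ;;            # looks like host:port, keep it
            *)    continue ;;   # blank or malformed line
        esac
        echo "${line%%:*} ${line#*:}"
    done
}

printf '#WLS\nserver2:7003\nserver2:7004\n' | list_targets
```

On a real run the input would come from the file instead, for example list_targets < IP_PORT.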

The command file must contain the following 2 telnet control commands: the escape character (^]) followed by quit.

~$ more testACL.telnetcmd
^]
quit

Final outcome: the output may look similar to the following. No failure message is printed for a successful telnet.

> ./testACL_telnet.ksh IP_PORT
Commence telnet test based on [IP_PORT] file.
Telnet FAILED for server4:7053.
Error Message : [telnet: Unable to connect to remote host: Connection refused]!
Suggestion: Destination may not be listening on this IP and Port, routable or firewall is blocking the connection.
...
...
...
telnet test ends.

How to Remove Unwanted route to 169.254.0.0 in RHEL Linux

Every time the system boots, you may see the following route to 169.254.0.0.

# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.10.2.0      *               255.255.255.0   U     0      0        0 bond0
10.10.2.0      *               255.255.255.0   U     0      0        0 eth3
169.254.0.0     *               255.255.0.0     U     0      0        0 eth3
default         10.10.2.254    0.0.0.0         UG    0      0        0 bond0

This is the zeroconf route (169.254.0.0). You can remove it manually by deleting the 169.254.0.0 / 255.255.0.0 route with the route command, but it will come back at the next boot.
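As a rough sketch of the manual removal, the helper below picks the interface out of the routing table, and the commented loop shows the net-tools route del call that would use it. The function name zeroconf_route_ifaces is made up for this example, and it assumes the interface is the last column, as in the table above.

```shell
# Print the interface of every zeroconf (169.254.0.0) route found in
# `route` output supplied on standard input.
zeroconf_route_ifaces() {
    awk '$1 == "169.254.0.0" { print $NF }'
}

# Typical use on a live system (the deletion needs root):
#   for ifc in $(route -n | zeroconf_route_ifaces); do
#       route del -net 169.254.0.0 netmask 255.255.0.0 dev "$ifc"
#   done
```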

Permanent Solution:

To disable the zeroconf route during system boot, edit the /etc/sysconfig/network file and add the following NOZEROCONF value to the end of the file:

NETWORKING=YES
HOSTNAME=localhost.localdomain
NOZEROCONF=yes
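To confirm the setting is really in place before the next reboot, a quick check can help. A minimal sketch: the function name nozeroconf_enabled is made up, and it only looks for the exact NOZEROCONF=yes line used in this post.

```shell
# Succeed (exit status 0) if the given file contains a NOZEROCONF=yes line.
nozeroconf_enabled() {
    grep -q '^NOZEROCONF=yes[[:space:]]*$' "$1"
}

# Example:
#   nozeroconf_enabled /etc/sysconfig/network && echo "zeroconf route disabled at boot"
```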

Layman Explanation:

Zeroconf, or Zero Configuration Networking, is a set of techniques that automatically create a usable IP network without configuration or special servers.

This allows inexpert users to connect computers, networked printers, and other network devices and expect a functioning network to be established automatically. Without Zeroconf, a user must either set up special services, like DHCP and DNS, or set up each computer's network settings manually, which may be challenging for non-technical or novice users.

Additional Information: wiki more about zeroconf

Sunday, September 11, 2011

How NPIV can save on fibre cabling for SAN

What is NPIV

NPIV, which stands for N-Port ID Virtualisation, is a Fibre Channel facility that allows multiple N-Port IDs to share a single physical N-Port. This allows multiple Fibre Channel initiators to occupy a single physical port, easing hardware requirements in the SAN design.

It is noted that NPIV is an extension to a standard already defined in the fibre channel protocols that allow one to get past single initiator/single target design limitations.

In order to take advantage of this, both the HBA card (from the host and SAN array) and the switch must support NPIV to generate and publish an additional WWPN in a virtual fashion.

Why is it good

Traditionally, we provide at least 2 fibre links for each host in the production environment, with each link on its own controller and connected to its own SAN switch. With 4 LPARs in the p7 server requiring SAN connections, we potentially need at least 8 fibre links and 8 SAN port allocations. Additional links and SAN ports are needed to connect to the SAN arrays.

With the NPIV protocol, we can use just 2 fibre links between the SAN switch and the p7 server. This is a saving of 75%!
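The 75% figure is simply (8 - 2) / 8. A tiny sketch of the arithmetic in shell; the function name link_savings_percent is made up for this example, and it uses integer math only.

```shell
# Percentage of links saved when going from $1 links down to $2 links.
link_savings_percent() {
    echo $(( ( $1 - $2 ) * 100 / $1 ))
}

link_savings_percent 8 2    # 4 LPARs x 2 links each, reduced to 2 shared links: prints 75
```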

Why it may be bad

If many hosts share the same fibre link, a link failure will be catastrophic. This can be mitigated by having 2 HBA controllers with 2 links each, with the connections distributed across 2 different SAN switches.

My thoughts on how 802.1Q can save on network cabling, and the potential problems

IEEE 802.1Q, commonly known as VLAN tagging, is a networking standard for sharing a physical Ethernet link among multiple independent logical networks.

The protocol works at the MAC layer, alongside Spanning Tree Protocol (802.1D). Hosts on different VLANs communicate with each other through a network switch or router at the network layer.

VLAN tagging

If I have 2 different environments whose servers are members of different network segments, then for the different hosts in the p750 server to share the same physical Ethernet link, the switch or router needs to understand and carry the network traffic for both 10.10.10.* and 10.10.20.* to the p750 machine.

Within the p750 machine, the NIC is capable of reading the VLAN ID and passing inbound traffic to the designated hosts or LPARs. For outbound traffic, the NIC tags the traffic with the VLAN ID and sends it out to the gateway.
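For the curious, the tag that gets added to each frame is only 4 bytes: the TPID 0x8100 followed by a 16-bit field holding the priority (3 bits), the DEI bit and the 12-bit VLAN ID. A minimal sketch that builds it; the function name dot1q_tag and the VLAN IDs 10 and 20 are assumptions for illustration, since only the subnets are given here.

```shell
# Build the 4-byte 802.1Q tag as 8 hex digits: TPID 0x8100, then the 16-bit
# TCI packing priority (3 bits), DEI (1 bit) and VLAN ID (12 bits).
dot1q_tag() {
    pcp=$1
    dei=$2
    vid=$3
    tci=$(( (pcp << 13) | (dei << 12) | (vid & 0xFFF) ))
    printf '8100%04x\n' "$tci"
}

dot1q_tag 0 0 10    # hypothetical VLAN 10 for 10.10.10.*: prints 8100000a
dot1q_tag 0 0 20    # hypothetical VLAN 20 for 10.10.20.*: prints 81000014
```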

This is similar to ISL, the trunking protocol that is proprietary to Cisco.

Why is it good

With 2 VIO servers, the different LPARs in the same IBM p750 machine would traditionally need more than 20 UTP cables. With 802.1Q, we need only 2 cables for all the LPARs and 2 cables for the HMC. This is a saving of more than 80%!

Why it may be bad

If many hosts share the same UTP link, a link failure will be catastrophic. This can be mitigated by having 2 NIC controllers with 2 links each, with the connections distributed across 2 different switches.

I guess you can't have your cake and eat it!

Thursday, August 14, 2008

Link based IPMP vs probe based IPMP in solaris 10

From the SunSolve doc http://sunsolve.sun.com/search/document.do?assetkey=1-9-86869-1 , there are 3 modes of IPMP that we can configure:

Probe based IPMP - active standby setup
Probe based IPMP - active active setup
Link based IPMP - active standby setup

The only difference between the probe based IPMP active-active setup and the active-standby setup is the word "deprecated" in the second configuration file. When you add the "deprecated" flag, network traffic does NOT go out through the physical IP. When you snoop on the interface, the traffic goes out on your virtual IPs.

Link-based IPMP

For link-based failure detection, only the link between the local interface and its link partner is checked, at the hardware layer. Neither the IP layer nor any further network path is monitored.

No test addresses are required for link-based failure detection, so the advantage here is that you save on the number of IPs. But if you are on your own private network, do you really have so few IPs that you would run out of them? Most likely the real benefit is the ease of IP management.

Probe-based IPMP

Probe-based failure detection is performed on each interface in the IPMP group that has a test address. Using this test address, ICMP probe messages go out over this interface to one or more target systems on the same IP link.

The in.mpathd daemon determines which target systems to probe dynamically. The whole network path up to the gateway (router) is monitored on IP layer. With all interfaces in the IPMP group connected via redundant network paths (switches etc.), you get full redundancy.

On the other hand the default router can be a single point of failure, resulting in 'All Interfaces in group have failed'.

Conclusion

In other words, probe based IPMP monitors the path up to the gateway, while link based IPMP monitors only up to the next physical link. Nothing more, nothing less.

Link based IPMP can't 'see' what is beyond this physical link.

I still prefer probe based IPMP, as it makes it easier when troubleshooting to determine whether I have a connection all the way to the destination. Using link based IPMP means that I would have to get the network guys to check for me if the connection is down.

Note: netstat -k seems to have been dropped in Solaris 10.

Link based IPMP on Solaris 10

Setting up link based IPMP in Solaris 10 is much easier than probe based IPMP.

Let's see what NICs I have in my server.

root ~>#dladm show-dev
bge0 link: up speed: 1000 Mbps duplex: full
bge1 link: up speed: 1000 Mbps duplex: full
bge2 link: unknown speed: 0 Mbps duplex: unknown
bge3 link: up speed: 100 Mbps duplex: full

So I have 3 NICs connected. Let's use bge0 and bge1 for our link based IPMP. Just use the following configuration.

root ~># more /etc/hostname.bge*
::::::::::::::
/etc/hostname.bge0
::::::::::::::
myserver netmask + broadcast + group production up
::::::::::::::
/etc/hostname.bge1
::::::::::::::
group production up

Remember to put the IPs in /etc/hosts, the netmask in /etc/netmasks and the default gateway in /etc/defaultrouter.

Verify that IPMP daemon is running.

root ~>#pgrep -lf mpathd
165 /usr/lib/inet/in.mpathd -a

Another indication that you are using link based IPMP instead of probe based IPMP is the following message appearing in your console or /var/adm/messages.

Aug 14 10:41:44 in.mpathd[155]: No test address configured on interface bge1; disabling probe-based failure detection on it
Aug 14 10:41:44 in.mpathd[155]: No test address configured on interface bge0; disabling probe-based failure detection on it
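If you just want the list of interfaces that fell back to link based detection, the message above is easy to pick out of the log. A minimal sketch: the function name link_based_ifaces is made up, and it relies only on the in.mpathd wording quoted above.

```shell
# Read syslog lines on standard input and print each interface for which
# in.mpathd reported that probe-based failure detection is disabled.
link_based_ifaces() {
    sed -n 's/.*No test address configured on interface \([^;]*\);.*/\1/p' | sort -u
}

# Example:
#   link_based_ifaces < /var/adm/messages
```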

Now, we are ready to do some fail over test.

# if_mpadm -d bge0

root@png2gw2:~>#ifconfig -a
bge0: flags=89000842 mtu 0 index 2
inet 0.0.0.0 netmask 0
groupname production
ether 0:14:4f:91:d:5c
bge1: flags=1000843 mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname production
ether 0:14:4f:91:d:5d
bge1:1: flags=1000843 mtu 1500 index 3
inet 10.55.9.192 netmask ffffff00 broadcast 10.55.9.255

Notice that the IP on bge0 has been transferred to bge1:1. You may also notice the following appearing in your console or /var/adm/messages.

Aug 14 10:05:10 myserver in.mpathd[165]: [ID 832587 daemon.error] Successfully failed over from NIC bge0 to NIC bge1

So we are almost done. Let's recover and restore the IP.

root ~>#if_mpadm -r bge0

In /var/adm/messages:

Aug 14 10:05:10 myserver in.mpathd[165]: [ID 832587 daemon.error] Successfully failed over from NIC bge0 to NIC bge1
Aug 14 10:07:26 myserver in.mpathd[165]: [ID 620804 daemon.error] Successfully failed back to NIC bge0

We have restored the NIC.

root ~>#ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1000843 mtu 1500 index 2
inet 10.55.9.192 netmask ffffff00 broadcast 10.55.9.255
groupname production
ether 0:14:4f:91:d:5c
bge1: flags=1000843 mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname production
ether 0:14:4f:91:d:5d

While doing this test, I also noticed that after the failover, bge1 is still 0.0.0.0 and bge0's IP was plumbed onto bge1:1.

From some research and experiments, I found out that bge1 can actually be configured with its own IP, so that it can still provide service, thus allowing 1 more NIC to work.

When the failover happens, both IPs will serve business as usual. Here's the test.

bge0 and bge1 both have IPs plumbed on them. Let's telnet to port 22 on 10.55.9.192 (bge0).

myclient -> myserver TCP D=22 S=57645 Syn Seq=2664010663 Len=0 Win=49640 Options=

Also telnet to port 22 on 10.55.9.198 (bge1).

myclient -> 10.55.9.198 TCP D=22 S=57656 Syn Seq=2669824191 Len=0 Win=49640 Options=
10.55.9.198 -> myclient TCP D=57656 S=22 Syn Ack=2669824192 Seq=478613400 Len=0 Win=49640 Options=

myclient -> 10.55.9.198 TCP D=22 S=57656 Ack=478613401 Seq=2669824192 Len=0 Win=49640
10.55.9.198 -> myclient TCP D=57656 S=22 Push Ack=2669824192 Seq=478613401 Len=20 Win=49640

myclient -> 10.55.9.198 TCP D=22 S=57656 Ack=478613421 Seq=2669824192 Len=0 Win=49640

Because I did not log in, /var/adm/messages 'complains':

Aug 14 10:16:27 myserver sshd[24576]: [ID 800047 auth.info] Did not receive identification string from 10.10.140.36
Aug 14 10:16:40 myserver sshd[24579]: [ID 800047 auth.info] Did not receive identification string from 10.10.140.36

Let's fail bge0 now and monitor /var/adm/messages and the snoop output. Actually, all traffic is now going through bge1:1.

Aug 14 10:19:38 myserver in.mpathd[165]: [ID 832587 daemon.error] Successfully failed over from NIC bge0 to NIC bge1

myclient -> myserver TCP D=22 S=57682 Syn Seq=2724219601 Len=0 Win=49640 Options=
myserver -> myclient TCP D=57682 S=22 Syn Ack=2724219602 Seq=2859863037 Len=0 Win=49640 Options=

Traffic on 10.55.9.198 (bge1) is unaffected.

myclient -> 10.55.9.198 TCP D=22 S=57685 Syn Seq=2740048358 Len=0 Win=49640 Options=
10.55.9.198 -> myclient TCP D=57685 S=22 Syn Ack=2740048359 Seq=1234692052 Len=0 Win=49640 Options=

myclient -> 10.55.9.198 TCP D=22 S=57685 Ack=1234692053 Seq=2740048359 Len=0 Win=49640
10.55.9.198 -> myserver TCP D=57685 S=22 Push Ack=2740048359 Seq=1234692053 Len=20 Win=49640
myclient -> 10.55.9.198 TCP D=22 S=57685 Ack=1234692073 Seq=2740048359 Len=0 Win=49640

We restore the NIC.

Aug 14 10:35:08 myserver in.mpathd[165]: [ID 620804 daemon.error] Successfully failed back to NIC bge0

Some of my references are:
http://sunsolve.sun.com/search/document.do?assetkey=1-61-211105-1
http://raulsg.wikispaces.com/ipmp-link-based
http://os.miamano.eu/node/25

Saturday, May 24, 2008

Change IP Address without rebooting in Solaris 10

Wonderful Solaris 10, huh? Normally I would have to reboot the server whenever the host IP is changed. Now, adding or editing the IP address on a Solaris 10 server does not need a reboot.

Before Solaris 10, you need to edit the following files

/etc/hosts
/etc/hostname.[device]
/etc/defaultrouter
/etc/defaultdomain
/etc/nodename

and add or modify the entries for the IP address and the hostname.

Be sure to check your /etc/netmasks too if you have network changes.

Example:
192.168.1.1 myserver

In Solaris 10, we need to edit BOTH

/etc/hosts (symlink to /etc/inet/hosts file) AND
/etc/inet/ipnodes

adding an entry for the IP address and hostname, plus the files mentioned above.

Once done, do either of the following

1) svcadm restart network/physical <--- restart network service
2) reboot the server

According to the man page, the ipnodes file is read before the system checks the hosts file. I hit this problem when I changed the hostname/IP of a Solaris 10 server and got network errors on my ALOM console.

Although the /etc/inet/ipnodes file is primarily for IPv6, the IPv4 address doesn’t become active in Solaris 10 without an entry in this file, at least on the release that I'm using. This seems to be solved in Solaris 10 U4 (08/07 build).

If you need to add addresses, you must add IPv4 addresses to both the hosts and ipnodes files. You add only IPv6 addresses to the ipnodes file.
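Since forgetting one of the two files is the usual mistake, a quick check before restarting the network service can save some pain. A minimal sketch: the function name in_hosts_and_ipnodes is made up, and it uses awk -v, so on Solaris it assumes nawk or /usr/xpg4/bin/awk is what runs as awk.

```shell
# Succeed if the IP address $1 appears as the first field of an entry in
# both files $2 and $3 (normally /etc/inet/hosts and /etc/inet/ipnodes).
in_hosts_and_ipnodes() {
    ip=$1
    for f in "$2" "$3"
    do
        awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$f" || return 1
    done
}

# Example:
#   in_hosts_and_ipnodes 192.168.1.1 /etc/inet/hosts /etc/inet/ipnodes || echo "entry missing"
```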

IPMP Configuration (Probe based)

Let's set up IPMP using 2 physical interfaces with 2 virtual IPs.

Physical interfaces ce0, ce3
Logical interfaces ce0:1, ce0:1

Let's put the IPs into /etc/hosts.

172.20.1.10 mynic-ce0
172.20.1.11 mynic-ce3

172.20.1.12 application1-service
172.20.1.13 application2-service

Verify local-mac-address? is set to true.
# eeprom local-mac-address?
local-mac-address?=true

Important! Setting local-mac-address? to true will not take effect until the next reboot.

If this host will not be forwarding packets set the following.
# touch /etc/notrouter

Manually plumb the interfaces (ignore if you plumb them already)

# ifconfig ce0 plumb mynic-ce0 netmask + broadcast + group production deprecated -failover up
# ifconfig ce3 plumb mynic-ce3 netmask + broadcast + group production deprecated -failover up
# ifconfig ce0 addif application1-service netmask + broadcast + failover up
# ifconfig ce3 addif application2-service netmask + broadcast + failover up

Check using /usr/sbin/ifconfig -a.

Let's make the configuration persistent across reboots.

--- /etc/hostname.ce0 ---
mynic-ce0 netmask + broadcast + group production deprecated -failover up \
addif application1-service netmask + broadcast + failover up \
addif application2-service netmask + broadcast + failover up
--- EOF ---

--- /etc/hostname.ce3 ---
mynic-ce3 netmask + broadcast + group production deprecated -failover up
--- EOF ---

Now let's test the failover by pulling the ce0 cable. Watch /var/adm/messages for the errors.

You can now get the applications and users to use the IPs application1-service and application2-service.

Important! Network traffic on application1-service can be incoming and outgoing, BUT network traffic on application2-service can only be incoming. I have tried playing around with the deprecated flag but only managed to get network traffic going out from application1-service. If you have any advice on this, do let me know!! :)

Enable/Disable IP Forwarding in Solaris 10 without reboot

Solaris 10 has an IP forwarding feature.

This is the process of forwarding (routing) packets between network interfaces on one system. In other words, when a packet for a host on a different network arrives on one network interface, it is forwarded to the appropriate network interface.

We can enable or disable using the following commands:

1) routeadm
2) ifconfig

I read on other websites that in Solaris 9, the ndd command is used.

The advantage in Solaris 10 is that the change is dynamic and real-time, and it persists across reboots, unlike a change made with the ndd command.

Example: Enable/Disable IP Forwarding Globally

# routeadm -e ipv[4|6]-forwarding
# routeadm -d ipv[4|6]-forwarding

Use either 4 or 6 for the [4|6] option.
The switch “-e” enables IP forwarding.
The switch “-d” disables IP forwarding.

Once done, use any one of the steps below to let the new setting take effect.

1) reboot
2) routeadm -u
3) svcadm enable svc:/network/ipv[4|6]-forwarding

The -u option, as dug out of the man page:
Apply the currently configured options to the running system. These options might include enabling or disabling IP forwarding and launching or killing routing daemons, if any are specified. It does not alter the state of the system for those settings that have been set to default. This option is meant to be used by administrators who do not want to reboot to apply their changes. In addition, this option upgrades non-SMF configurations from the invocations of daemon stop commands, which might include a set of arguments, to a simple enabling of the appropriate service.

To revert, do the following:

# routeadm -r ipv[4|6]-forwarding
# routeadm -u
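To make the [4|6] shorthand concrete, here is a small sketch that spells out the exact routeadm line for each action used in this post. The function name forwarding_cmd is made up, and it only prints the command, so nothing is changed on the system.

```shell
# Print the routeadm command for an action (enable, disable or revert)
# and an IP version (4 or 6). Follow it with `routeadm -u` to apply.
forwarding_cmd() {
    case "$1" in
        enable)  flag=-e ;;
        disable) flag=-d ;;
        revert)  flag=-r ;;
        *) echo "usage: forwarding_cmd enable|disable|revert 4|6" >&2; return 1 ;;
    esac
    case "$2" in
        4|6) ;;
        *) echo "IP version must be 4 or 6" >&2; return 1 ;;
    esac
    echo "routeadm $flag ipv$2-forwarding"
}

forwarding_cmd enable 4
forwarding_cmd disable 6
```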

Example: Enable/Disable IP Forwarding on a particular interface

If we want to work on the ce0 interface only, use the ifconfig command.

In IPv4
# ifconfig ce0 router <--- enable
# ifconfig ce0 -router <--- disable

In IPv6
# ifconfig ce0 inet6 router <--- enable
# ifconfig ce0 inet6 -router <--- disable

More References:
http://gibbs.acu.edu/2007/02/24/using-solaris-10-as-a-firewallrouter/

Tuesday, April 29, 2008

Taking IPMP offline for maintenance

Found this gem for use in Solaris 10 on a Sun blog site.

Solaris IPMP (IP multipathing) allows servers to keep operating in the event that a network interface or switch fails. Periodically you may need to take IPMP managed interfaces offline but still need to keep the IP addresses attached to those interfaces up and operational.

Solaris now comes with the if_mpadm utility, which provides a simple and straightforward way to take IPMP managed interfaces online and offline.

Prior to using the if_mpadm utility, it is useful to check the status of the interface you want to take online or offline. This can be done by running the ifconfig utility, and checking the status of the interface you are interested in taking online or offline (in this case ni0):


$ ifconfig -a

lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ni0: flags=201000843 mtu 1500 index 2
        inet 192.168.1.5 netmask ffffff00 broadcast 192.168.1.255
        groupname ipmp0
        ether 0:45:e8:33:3c:97
ni1: flags=201000843 mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp0
        ether 0:72:b6:d3:ee:35
lo0: flags=2002000849 mtu 8252 index 1
        inet6 ::1/128

To take the interface ni0 offline for maintenance, the if_mpadm utility can be run with the “-d” option (take interface offline), and the name of the interface to take offline:


$ if_mpadm -d ni0

Once if_mpadm does its job, the interface will be in the OFFLINE state, and the IP addresses attached to that interface will have migrated to another device in the IPMP group:


$ ifconfig -a

lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ni0: flags=289000842 mtu 0 index 2
        inet 0.0.0.0 netmask 0
        groupname ipmp0
        ether 0:45:e8:33:3c:97
ni1: flags=201000843 mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp0
        ether 0:72:b6:d3:ee:35
ni1:1: flags=201000843 mtu 1500 index 3
        inet 192.168.1.5 netmask ffffff00 broadcast 192.168.1.255
lo0: flags=2002000849 mtu 8252 index 1
        inet6 ::1/128

After you finish your maintenance, you can use the if_mpadm “-r” option (bring interface online) to bring the interface online:


$ if_mpadm -r ni0

Once if_mpadm completes, you can use the ifconfig utility to verify the interface is back up, and the IP addresses have migrated back to the original adaptor (you can disable automatic failback by setting FAILBACK to no in /etc/default/mpathd):


$ ifconfig -a

lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ni0: flags=201000843 mtu 1500 index 2
        inet 192.168.1.5 netmask ffffff00 broadcast 192.168.1.255
        groupname ipmp0
        ether 0:45:e8:33:3c:97
ni1: flags=201000843 mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp0
        ether 0:72:b6:d3:ee:35
lo0: flags=2002000849 mtu 8252 index 1
        inet6 ::1/128
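When checking these listings by eye gets tedious, the question "which interface holds the address right now?" can be answered with a few lines of awk. A minimal sketch: the function name iface_for_ip is made up, and it uses awk -v, so on Solaris it assumes nawk or /usr/xpg4/bin/awk is what runs as awk.

```shell
# Read `ifconfig -a` output on standard input and print the name of each
# interface (physical or logical) that carries the IPv4 address given as $1.
iface_for_ip() {
    awk -v ip="$1" '
        $2 ~ /^flags=/ { name = $1; sub(/:$/, "", name); next }  # header line, e.g. "ni1:1: flags=..."
        $1 == "inet" && $2 == ip { print name }
    '
}

# Example:
#   ifconfig -a | iface_for_ip 192.168.1.5
```

Before if_mpadm -d this should report ni0 for 192.168.1.5, and ni1:1 after the address has migrated.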

Check out http://blogs.sun.com/meem/date/20070425

Solaris 10 Fibre Channel Management

On Solaris 10, Fibre Channel management is easy because the storage foundation kit is now integrated into the base OS as the "fcinfo" utility.

The "fcinfo" utility shows fibre channel connectivity information.
It is especially useful because it ships with the base operating system and reports HBA and connectivity information, including for HBAs from Emulex, JNI and QLogic.

Warning!! Run it only when the server's fibre links are in a steady state, either online or offline, and never run it continually while the fibre link is being disrupted. Otherwise you may not be able to bring the fibre link back up until you reboot the server.


# uname -a
SunOS 5.10 

# fcinfo -V
fcinfo: Version 1.0
For more information, please see fcinfo(1M) 

# fcinfo hba-port 

HBA Port WWN: 210000e08b8f29bf
OS Device Name: /devices/pci@84,4000/fibre-channel@3:devctl
Manufacturer: QLogic Corporation
Model: QLA2340
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 200000e08b8f29bf 

HBA Port WWN: 10000000c9581765
OS Device Name: /dev/cfg/c3
Manufacturer: Emulex
Model: LP9802
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 20000000c9581765 

...
...

The “-l” option lists the link error statistics for the port.


# fcinfo hba-port -l 

HBA Port WWN: 210000e08b8f29bf
OS Device Name: /devices/pci@84,4000/fibre-channel@3:devctl
Manufacturer: QLogic Corporation
Model: QLA2340
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 200000e08b8f29bf
Error: SendRLS failed for 210000e08b8f29bf 

HBA Port WWN: 10000000c9581765
OS Device Name: /dev/cfg/c3
Manufacturer: Emulex
Model: LP9802
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 20000000c9581765
Link Error Statistics:
Link Failure Count: 1
Loss of Sync Count: 6
Loss of Signal Count: 0
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 120
Invalid CRC Count: 0 

HBA Port WWN: 10000000c9582596
OS Device Name: /dev/cfg/c4
Manufacturer: Emulex
Model: LP9802
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 20000000c9582596
Link Error Statistics:
Link Failure Count: 1
Loss of Sync Count: 6
Loss of Signal Count: 0
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 8
Invalid CRC Count: 0
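With several HBA ports the listing gets long, so it can help to print only the counters that are not zero. A minimal sketch: the function name nonzero_link_errors is made up, and it assumes the "Name: value" layout shown in the output above.

```shell
# Read `fcinfo hba-port -l` output on standard input and print
# "<HBA port WWN> <counter name>=<value>" for every non-zero error counter.
nonzero_link_errors() {
    awk -F': *' '
        $1 ~ /HBA Port WWN$/ { wwn = $2; sub(/[ \t]+$/, "", wwn) }
        $1 ~ /Count$/ && $2 + 0 > 0 {
            name = $1
            value = $2
            sub(/^[ \t]+/, "", name)     # drop any indentation
            sub(/[ \t]+$/, "", value)    # drop trailing blanks
            print wwn, name "=" value
        }
    '
}

# Example:
#   fcinfo hba-port -l | nonzero_link_errors
```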

To check the remote port


# fcinfo remote-port -slp 2100001b320616d2 
Remote Port WWN: 50060e8005626100 
        Active FC4 Types: SCSI 
        SCSI Target: yes 
        Node WWN: 50060e8005626100 
        Link Error Statistics: 
                Link Failure Count: 0 
                Loss of Sync Count: 0 
                Loss of Signal Count: 0 
                Primitive Seq Protocol Error Count: 0 
                Invalid Tx Word Count: 0 
                Invalid CRC Count: 0 
        LUN: 0 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c2t50060E8005626100d0s2 
        LUN: 1 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c2t50060E8005626100d1s2 
        LUN: 2 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c2t50060E8005626100d2s2 
        LUN: 3 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c2t50060E8005626100d3s2 
        LUN: 4 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c2t50060E8005626100d4s2 
        LUN: 5 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c2t50060E8005626100d5s2 
        LUN: 6 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c2t50060E8005626100d6s2 
        LUN: 7 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c2t50060E8005626100d7s2

To check the details of a particular fibre especially during setup, use luxadm to match the fibre and its WWN.


root # luxadm -e dump_map /devices/pci@7,700000/SUNW,qlc@0/fp@0,0:devctl 
Pos  Port_ID Hard_Addr Port WWN         Node WWN         Type 
0    80f00   0         50060e8005626110 50060e8005626110 0x0  (Disk device) 
1    82400   0         2100001b320610cd 2000001b320610cd 0x1f (Unknown Type,Host Bus Adapter) 

root # fcinfo remote-port -slp 2100001b320610cd 
Remote Port WWN: 50060e8005626110 
        Active FC4 Types: SCSI 
        SCSI Target: yes 
        Node WWN: 50060e8005626110 
        Link Error Statistics: 
                Link Failure Count: 0 
                Loss of Sync Count: 0 
                Loss of Signal Count: 0 
                Primitive Seq Protocol Error Count: 0 
                Invalid Tx Word Count: 0 
                Invalid CRC Count: 0 
        LUN: 0 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c3t50060E8005626110d0s2 
        LUN: 1 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c3t50060E8005626110d1s2 
        LUN: 2 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c3t50060E8005626110d2s2 
        LUN: 3 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c3t50060E8005626110d3s2 
        LUN: 4 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c3t50060E8005626110d4s2 
        LUN: 5 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c3t50060E8005626110d5s2 
        LUN: 6 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c3t50060E8005626110d6s2 
        LUN: 7 
          Vendor: HITACHI 
          Product: OPEN-V      -SUN 
          OS Device Name: /dev/rdsk/c3t50060E8005626110d7s2
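
The matching step can also be scripted. The sketch below assumes the luxadm -e dump_map column layout shown above and picks out the port WWN of the Host Bus Adapter entry, which is the value given to fcinfo remote-port -slp in the example. The function names are hypothetical.

```python
def parse_dump_map(text):
    """Return one dict per data row of 'luxadm -e dump_map' output."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 6)
        # Data rows start with a numeric position; the header row does not.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        rows.append({
            "pos": int(fields[0]),
            "port_id": fields[1],
            "hard_addr": fields[2],
            "port_wwn": fields[3],
            "node_wwn": fields[4],
            "type": fields[5],
            "description": fields[6].strip(),
        })
    return rows

def hba_port_wwn(rows):
    """Return the port WWN of the first Host Bus Adapter entry, or None."""
    for row in rows:
        if "Host Bus Adapter" in row["description"]:
            return row["port_wwn"]
    return None

# Sample copied from the output above.
sample = """\
Pos  Port_ID Hard_Addr Port WWN         Node WWN         Type
0    80f00   0         50060e8005626110 50060e8005626110 0x0  (Disk device)
1    82400   0         2100001b320610cd 2000001b320610cd 0x1f (Unknown Type,Host Bus Adapter)
"""

print(hba_port_wwn(parse_dump_map(sample)))
```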

Using Telnet to test SMTP

Telnet can open a connection to the SMTP server so that you can send a test mail by hand. Below is an example of the commands used.


telnet mail.domain 25
Trying ???.???.???.???...
Connected to mail.domain.
Escape character is '^]'.
220 mail.domain ESMTP Sendmail ?version-number?; ?date+time+gmtoffset?

Be polite and greet the SMTP server with HELO. The mail server will generally take your word for the host name you give (see RFC 821 and RFC 1123).


HELO local.domain.name
250 mail.domain Hello local.domain.name [loc.al.i.p], pleased to meet you

Let's send an email.


MAIL FROM: mail@domain.com
250 2.1.0 mail@domain.com... Sender ok
RCPT TO: mail@otherdomain.com
250 2.1.0 mail@otherdomain.com... Recipient ok

If either command is rejected, see the list of problems further down.

To start composing the message, issue the DATA command.

If you want a subject for your email, type Subject: followed by the subject text, then press Enter twice. The blank line separates the headers from the body, as RFC 822 requires.

You may now type the body of your message.

To tell the mail server that you have completed the message, enter a single "." on a line of its own.


.
250 2.0.0 ???????? Message accepted for delivery

You can close the connection by issuing the QUIT command.


quit
221 2.0.0 mail.domain.ext closing connection
Connection closed by foreign host.
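
For repeated tests it can help to generate the lines to type rather than remembering them. The sketch below builds the client side of the session above in order. The function name smtp_dialogue is made up, and two details differ from the transcript on purpose: addresses are wrapped in angle brackets, which is the form the SMTP specification uses, and body lines that start with "." get an extra "." so the server does not mistake them for the end of the message.

```python
def smtp_dialogue(helo_name, sender, recipient, subject, body):
    """Return, in order, the lines a client sends to deliver one message."""
    lines = [
        f"HELO {helo_name}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
        f"Subject: {subject}",
        "",  # blank line: end of headers, start of body
    ]
    for text in body.splitlines():
        # Dot-stuffing: protect body lines that begin with a dot.
        lines.append("." + text if text.startswith(".") else text)
    lines.append(".")  # a lone dot ends the message
    lines.append("QUIT")
    return lines

for line in smtp_dialogue(
    "local.domain.name",
    "mail@domain.com",
    "mail@otherdomain.com",
    "telnet test",
    "This is a test message.",
):
    print(line)
```

Each line is sent followed by Enter. Wait for the reply after HELO, MAIL FROM, RCPT TO, DATA and the final "." before going on to the next step.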

Here is a list of problems that I have encountered before.

The domain that you are sending from must exist.


501 nouser@nosuchplace.here... Sender domain must exist

A recipient has been specified before a sender.


503 Need MAIL before RCPT

The mail server has refused to relay mail for you. This may be for any number of reasons, but typical ones include:
You are not using this provider for your internet connection.
You are not using an email address provided by the owner of the server.
An ACL in the mail configuration blocks you.


550 mail@domain.ext... Relaying Denied
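
The first digit of the reply code already says a lot: 2 is success, 3 means the server is waiting for more input, 4 is a temporary failure and 5 is a permanent one. A small sketch that splits a reply line into its parts, assuming single-line replies like the ones shown in this post (parse_reply and the category labels are my own naming):

```python
import re

CATEGORIES = {
    "2": "success",
    "3": "more input expected",
    "4": "temporary failure",
    "5": "permanent failure",
}

REPLY = re.compile(r"^(\d{3})[ -](?:(\d\.\d{1,3}\.\d{1,3}) )?(.*)$")

def parse_reply(line):
    """Split an SMTP reply line into (code, enhanced status, text, category)."""
    match = REPLY.match(line.strip())
    if not match:
        raise ValueError(f"not an SMTP reply: {line!r}")
    code, enhanced, text = match.groups()
    return int(code), enhanced, text, CATEGORIES.get(code[0], "unknown")

for reply in (
    "250 2.1.0 mail@domain.com... Sender ok",
    "503 Need MAIL before RCPT",
    "550 mail@domain.ext... Relaying Denied",
):
    print(parse_reply(reply))
```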

Solaris NIC speed and duplex settings

In Solaris 10, dladm is a very useful command for checking the state of the NICs.


# dladm   show-dev 
bge0            link: up        speed: 1000  Mbps       duplex: full 
bge1            link: unknown   speed: 0     Mbps       duplex: unknown 
e1000g0         link: up        speed: 1000  Mbps       duplex: full 
e1000g1         link: unknown   speed: 0     Mbps       duplex: half 
e1000g2         link: up        speed: 1000  Mbps       duplex: full 
e1000g3         link: up        speed: 100   Mbps       duplex: full 
bge2            link: up        speed: 1000  Mbps       duplex: full 
bge3            link: up        speed: 1000  Mbps       duplex: full 
e1000g4         link: up        speed: 1000  Mbps       duplex: full 
e1000g5         link: unknown   speed: 0     Mbps       duplex: half 

# dladm show-link 
bge0            type: non-vlan  mtu: 1500       device: bge0 
bge1            type: non-vlan  mtu: 1500       device: bge1 
e1000g0         type: non-vlan  mtu: 1500       device: e1000g0 
e1000g1         type: non-vlan  mtu: 1500       device: e1000g1 
e1000g2         type: non-vlan  mtu: 1500       device: e1000g2 
e1000g3         type: non-vlan  mtu: 1500       device: e1000g3 
bge2            type: non-vlan  mtu: 1500       device: bge2 
bge3            type: non-vlan  mtu: 1500       device: bge3 
e1000g4         type: non-vlan  mtu: 1500       device: e1000g4 
e1000g5         type: non-vlan  mtu: 1500       device: e1000g5
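
On a box with this many interfaces it is easy to miss the one that negotiated the wrong speed. The sketch below scans saved dladm show-dev output for links that are up but below the expected speed or not at full duplex. It assumes the Solaris 10 line layout shown above, which other releases may not share, and the function names are hypothetical.

```python
import re

LINE = re.compile(
    r"^(\S+)\s+link:\s+(\S+)\s+speed:\s+(\d+)\s+Mbps\s+duplex:\s+(\S+)"
)

def parse_show_dev(text):
    """Return one dict per device line of 'dladm show-dev' output."""
    devices = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            name, link, speed, duplex = match.groups()
            devices.append(
                {"name": name, "link": link, "speed": int(speed), "duplex": duplex}
            )
    return devices

def slow_or_half(devices, want_speed=1000):
    """Names of links that are up but slower than want_speed or not full duplex."""
    return [
        d["name"]
        for d in devices
        if d["link"] == "up" and (d["speed"] < want_speed or d["duplex"] != "full")
    ]

# Shortened sample copied from the output above.
sample = """\
bge0            link: up        speed: 1000  Mbps       duplex: full
bge1            link: unknown   speed: 0     Mbps       duplex: unknown
e1000g3         link: up        speed: 100   Mbps       duplex: full
"""

print(slow_or_half(parse_show_dev(sample)))
```

Links reported as unknown are skipped here, since they are usually just unplumbed or not cabled.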

Here are a few good places to check out:

http://www.brandonhutchinson.com/Solaris_NIC_speed_and_duplex_settings.html
http://forum.java.sun.com/thread.jspa?threadID=5084843

