Monday, February 8, 2010

HeartBeat Linux problem

This relates to the Heartbeat package for High Availability clusters (SLES 11).
The apache resource script was failing, and because of that the whole cluster wasn't working properly. I searched a lot but couldn't find the reason.

node242:/etc/ha.d/resource.d # ./apache status
2009/05/08_02:41:04 ERROR: command failed: sh -c wget -O- -q -L --bind-address=127.0.0.1 http://localhost:80/server-status | tr '\012' ' ' | grep -Ei "[[:space:]]*" >/dev/null
2009/05/08_02:41:04 ERROR: Generic error
ERROR: Generic error
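
Before touching the script, it can help to run the wget stage of the failing command by hand to see whether wget fails outright or simply returns nothing. The following is a minimal sketch and not part of the Heartbeat package: probe_status is a hypothetical helper name, and the wget options and URL are copied from the error line above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Heartbeat): fetch the status URL with the
# same wget options as the failing monitor command and report what happened.
probe_status() {
    url=$1
    body=$(wget -O- -q -L --bind-address=127.0.0.1 "$url")
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "wget failed with exit code $rc"
        return 1
    fi
    if [ -z "$body" ]; then
        echo "wget succeeded but returned an empty page"
        return 1
    fi
    echo "status page fetched"
}

# Example (run on the cluster node):
#   probe_status http://localhost:80/server-status
```

A non-zero exit code from wget points at the request itself (server down, URL not mapped, bind address unusable), while an empty page points at the server-status handler.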

Then I enabled shell tracing by adding set -x to the script, which showed me the location of the actual file where the command was failing. It's in:

/usr/lib/ocf/resource.d/heartbeat
There, in the apache script, I saw the following code, which prepares the wget command parameters.

#
# It's difficult to figure out whether the server supports
# the status operation.
# (we start our server with -DSTATUS - just in case :-))
#
# Typically (but not necessarily) the status URL is /server-status
#
# For us to think status will work, we have to have the following things:
#
# - $WGET has to exist and be executable
# - The server-status handler has to be mapped to some URL somewhere
#
# We assume that:
#
# - the "main" web server at $PORT will also support it if we can find it
# somewhere in the file
# - it will be supported at the same URL as the one we find in the file
#
# If this doesn't work for you, then set the statusurl attribute.
#
if
  [ "X$STATUSURL" = "X" ]
then
  if
    have_binary $WGET
  then
    StatusURL=`FindLocationForHandler $1 server-status | tail -1`
    if
      [ "x$Listen" != "x" ]
    then
      echo $Listen | grep ':' >/dev/null || # Listen can be only port spec
        Listen="localhost:$Listen"
      STATUSURL="http://${Listen}$StatusURL"
      case $WGET in
        *wget*) WGETOPTS="$WGETOPTS --bind-address=127.0.0.1";;
      esac
    else
      STATUSURL="${LOCALHOST}:${PORT}$StatusURL"
    fi
  fi
fi
test "$PidFile"
}
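
The set -x tracing mentioned earlier can also be done without editing the script, by running it under sh -x. A minimal sketch, assuming the script is a plain sh script; trace_script is a hypothetical helper name, not something shipped with Heartbeat.

```shell
#!/bin/sh
# Hypothetical helper: run a script under "sh -x" and keep only the trace
# lines, which the shell prefixes with "+" by default (the PS4 prompt).
trace_script() {
    sh -x "$@" 2>&1 | grep '^+'
}

# Example (run on the cluster node):
#   trace_script /etc/ha.d/resource.d/apache status | grep wget
```

Note that sh -x traces only the script it is given. A child script started as a separate process is not traced unless it is run the same way.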


From the comments I figured out that the server status check wasn't required in my case, so the simplest fix was to comment it out. The problem seems to be that the wget command itself isn't being executed successfully by the shell.

monitor_apache() {
  if
    ! have_binary $WGET
  then
    ocf_log err "Monitoring not supported by $OCF_RESOURCE_INSTANCE"
    ocf_log info "Please make sure that wget is available"
    return $OCF_ERR_CONFIGURED

  elif [ -z "$STATUSURL" ]; then
    ocf_log err "Monitoring not supported by $CONFIGFILE"
    ocf_log info "Please set the statusurl parameter"
    return $OCF_ERR_CONFIGURED
  fi

  if
    silent_status
  then
    #ocf_run sh -c "$WGET $WGETOPTS $STATUSURL | tr '\012' ' ' | grep -Ei \"$TESTREGEX\" >/dev/null"
    : # no-op placeholder: a then-branch holding only a comment is a shell syntax error
  else
    ocf_log info "$CMD not running"
    return $OCF_NOT_RUNNING
  fi
}


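So I commented out the line:
#ocf_run sh -c "$WGET $WGETOPTS $STATUSURL | tr '\012' ' ' | grep -Ei \"$TESTREGEX\" >/dev/null"

and my problem was fixed. Keep in mind that with this line disabled, the monitor relies on silent_status alone and no longer fetches the status page.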


node242:/etc/ha.d/resource.d # ./apache status
Script name is : /usr/lib/ocf/resource.d//heartbeat/apache
2009/05/08_02:46:29 INFO: Running OK
INFO: Running OK

Installing a Module in Perl through source

I am very new to Perl and have little idea how to troubleshoot it, that is, how to resolve errors and the like. I can write programs with some help from Google. Two days ago I wanted to generate a malformed UDP packet, one with an invalid UDP length field. This kind of packet was notorious for causing DoS attacks on older Unix systems (I don't know what the current status is). It was fun, and along the way I found a useful tip for a Perl beginner like me. It applies when your code requires a Perl module that is not available in your current Perl installation. In such cases you see errors like:

Can't locate Socket6.pm in @INC (@INC contains: /usr/lib/perl5/5.10.0/s390x-linux-thread-multi /usr/lib/perl5/5.10.0 /usr/lib/perl5/site_perl/5.10.0/s390x-linux-thread-multi /usr/lib/perl5/site_perl/5.10.0 /usr/lib/perl5/vendor_perl/5.10.0/s390x-linux-thread-multi /usr/lib/perl5/vendor_perl/5.10.0 /usr/lib/perl5/vendor_perl .) at /etc/ha.d/resource.d/ldirectord line 721.
BEGIN failed--compilation aborted at /etc/ha.d/resource.d/ldirectord line 721.

This means that my Linux installation doesn't have the Perl module Socket6 (the file Socket6.pm). Googling the error string may or may not turn up a quick solution. A better way is to go to the CPAN search site

http://search.cpan.org/

and search for Socket6.pm

This will give you the package that contains Socket6.pm. There are two ways of installing it: through CPAN, or from source. I preferred the second method, as my Linux machine had some internet connectivity issues.
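
When the error names a nested file such as Net/DNS.pm, the name to search for on CPAN is the module name Net::DNS. A minimal sketch of pulling that name out of the error text; missing_module is a hypothetical helper name, and it assumes the error starts with the "Can't locate ... in @INC" wording shown above.

```shell
#!/bin/sh
# Hypothetical helper: read a perl error on stdin and print the name of the
# missing module, converting a file path such as Net/DNS.pm to Net::DNS.
missing_module() {
    sed -n "s/^Can't locate \([^ ]*\)\.pm in @INC.*/\1/p" | sed 's|/|::|g'
}

# Example:
#   perl /etc/ha.d/resource.d/ldirectord 2>&1 | missing_module
```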

So download the tar.gz package from the search.cpan results, extract it, and install it using these commands:

tar -xvzf package.tar.gz
cd package
perl Makefile.PL
make
make test
make install
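
After make install, a quick way to confirm that perl can now find the module is to load it from the command line. A minimal sketch; module_installed is a hypothetical helper name, and Socket6 in the example is the module from the error above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if perl can load the named module.
# "perl -MName -e 1" loads the module and then runs a trivial program.
module_installed() {
    perl -M"$1" -e 1 2>/dev/null
}

# Example:
#   module_installed Socket6 && echo "Socket6 is available"
```

If the check still fails, the module was probably installed for a different perl than the one the script runs under.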

tcpdump

This is for reference. It's not a guide, just a list of usage commands that I picked up from various sources. Yes, I admit I am one of those lamers who would rather google than read the man page. :/ Most are taken from this page:

http://openmaniak.com/tcpdump.php

1. tcpdump
2. tcpdump -v //verbose
3. tcpdump -D //lists devices
4. tcpdump -n //avoid DNS lookup
5. tcpdump -q //quiet output (less protocol information)
6. tcpdump udp //capture UDP packets only :: useful
7. tcpdump -w capture.cap //save the capture to a file named capture.cap :: useful
8. tcpdump -r capture.cap //read dump from capture.cap
9. tcpdump host abc.com //packets coming from or going to abc.com :: useful
10. tcpdump src xx.xx.xx.aa and dst xx.xx.xx.bb
11. tcpdump -A //prints each packet's content in ASCII :: useful
12. tcpdump -i eth1 //capture on interface eth1
13. tcpdump -i eth1 -v -A 'udp and (dst 192.168.69.238 or dst 192.168.69.242)'
14. tcpdump -n -S -s 15000 -vv -X 'host 192.168.0.159 and udp and port 1717'
    -S print absolute TCP sequence numbers (not relative)
    -n no address resolution
    -s size of capture for each packet (15000 should be enough to hold the data returned by the query; you will have to play with this depending on the type of query you issue)
    -X print hex and ASCII version of each packet
    'host 192.168.0.159 and udp and port 1717' is the filter expression

For an exhaustive list, see the man page:

http://linux.die.net/man/8/tcpdump
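
One thing to watch with filters that mix "and" and "or": in pcap filter syntax the two have equal precedence and are evaluated left to right, so "udp and dst A or dst B" means "(udp and dst A) or dst B". A minimal sketch of building the filter with explicit parentheses; dst_filter is a hypothetical helper name, and the addresses in the example are the ones from the list above.

```shell
#!/bin/sh
# Hypothetical helper: build "proto and (dst A or dst B ...)" so that the
# "or" between destinations cannot escape the protocol test.
dst_filter() {
    proto=$1
    shift
    expr=""
    for host in "$@"; do
        # Prepend " or " only when expr already holds an earlier host.
        expr="${expr:+$expr or }dst $host"
    done
    printf '%s and (%s)\n' "$proto" "$expr"
}

# Example (as root):
#   tcpdump -i eth1 -v -A "$(dst_filter udp 192.168.69.238 192.168.69.242)"
```

Quoting the whole expression also keeps the shell from interpreting the parentheses.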

Exceeding Windows Remote Desktop Limit

When making a Remote Desktop connection, the maximum number of allowed connections is 2. When this limit is reached, you see an error message saying that the maximum number of connections has been exceeded.

When you close the Remote Desktop window using the 'x' button in the top right corner, you only DISCONNECT from the Windows session; you are not logged off. Windows keeps your session alive in memory, in the 'DISCONNECTED' state, so that it can hand the same session back to you when you log in again. So when the number of sessions is 2, even though they are disconnected, Windows can still show you this message. You can use a third, reserved connection to log in to Windows remotely.

Type this command in your command prompt:

start mstsc -v:xx.xx.xx.xx /f -console

This opens the third connection, which you can use to kill the other disconnected sessions through Task Manager. xx.xx.xx.xx is the IP address of the Windows machine. Newer Remote Desktop clients (version 6.1 and later) use /admin in place of the console switch.

Failed to find VM - aborting Red Hat

If you are using Red Hat 5.* Linux and you get a message like this during installation:

Failed to find VM - aborting


You need to disable SELinux.
Go to the /etc/selinux directory and open the file config, which would look like:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - SELinux is fully disabled.
SELINUX=enforcing
# SELINUXTYPE= type of policy in use. Possible values are:
# targeted - Only targeted network daemons are protected.
# strict - Full SELinux protection.
SELINUXTYPE=targeted

Change the line SELINUX=enforcing to
SELINUX=disabled
and reboot, since the change only takes effect at the next boot.
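
The same edit can be scripted. A minimal sketch, assuming the config file has the layout shown above; disable_selinux_in is a hypothetical helper name, and it keeps a backup copy of the file before rewriting it.

```shell
#!/bin/sh
# Hypothetical helper: set SELINUX=disabled in the given config file,
# keeping a backup copy next to it as <file>.bak.
disable_selinux_in() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    # Only a line that starts with "SELINUX=" is rewritten, so the
    # SELINUXTYPE= line and the comment lines are left alone.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf.bak" > "$conf"
}

# Example (as root):
#   disable_selinux_in /etc/selinux/config
```

The getenforce command prints the current SELinux mode, which is handy for checking the result after the reboot.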