Monday, February 8, 2010

HeartBeat Linux problem

This relates to the Heartbeat package for High Availability clusters (SLES 11).
The apache resource script was failing, and because of that the whole cluster was not working properly. I searched a lot but could not find the reason.

node242:/etc/ha.d/resource.d # ./apache status
2009/05/08_02:41:04 ERROR: command failed: sh -c wget -O- -q -L --bind-address=127.0.0.1 http://localhost:80/server-status | tr '\012' ' ' | grep -Ei "[[:space:]]*" >/dev/null
2009/05/08_02:41:04 ERROR: Generic error
ERROR: Generic error

Then I enabled shell tracing by adding set -x to the script, which showed me the location of the file where the command was actually failing. It is in:

/usr/lib/ocf/resource.d/heartbeat
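
For reference, a minimal sketch of this kind of tracing. The check_status function and the URL in it are made-up stand-ins for illustration, not code from the real agent. With set -x the shell prints each command to stderr after expansion, which is how the failing file and command show up:

```shell
#!/bin/sh
# Hypothetical stand-in for a resource script; not the real apache agent.
check_status() {
    set -x    # from here on, print each command (after expansion) to stderr
    url="http://localhost:80/server-status"
    printf '%s\n' "probing $url" >/dev/null
    set +x    # stop tracing
}

# Collect the trace in a temporary file so it can be read back afterwards.
trace_file=$(mktemp)
check_status 2>"$trace_file"
trace=$(cat "$trace_file")
rm -f "$trace_file"

# Each traced command appears on its own line, prefixed with "+".
printf '%s\n' "$trace"
```

A script can also be traced without editing it by running it as sh -x followed by the script path, assuming it is a plain shell script.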
In the apache script in that directory, I found the following code, which builds the parameters for the wget command.

#
# It's difficult to figure out whether the server supports
# the status operation.
# (we start our server with -DSTATUS - just in case :-))
#
# Typically (but not necessarily) the status URL is /server-status
#
# For us to think status will work, we have to have the following things:
#
# - $WGET has to exist and be executable
# - The server-status handler has to be mapped to some URL somewhere
#
# We assume that:
#
# - the "main" web server at $PORT will also support it if we can find it
# somewhere in the file
# - it will be supported at the same URL as the one we find in the file
#
# If this doesn't work for you, then set the statusurl attribute.
#
if
  [ "X$STATUSURL" = "X" ]
then
  if
    have_binary $WGET
  then
    StatusURL=`FindLocationForHandler $1 server-status | tail -1`
    if
      [ "x$Listen" != "x" ]
    then
      echo $Listen | grep ':' >/dev/null || # Listen can be only port spec
        Listen="localhost:$Listen"
      STATUSURL="http://${Listen}$StatusURL"
      case $WGET in
        *wget*) WGETOPTS="$WGETOPTS --bind-address=127.0.0.1";;
      esac
    else
      STATUSURL="${LOCALHOST}:${PORT}$StatusURL"
    fi
  fi
fi
test "$PidFile"
}
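
The URL assembly in the excerpt above can be sketched as a small standalone function. This is only an illustration under assumptions: build_status_url is a hypothetical helper name, and the inputs are passed as arguments instead of being read from the Apache configuration as the real agent does.

```shell
#!/bin/sh
# Sketch of the status URL assembly; build_status_url is a hypothetical
# helper, not a function in the real agent.
build_status_url() {
    listen=$1        # value of the Listen directive, e.g. "80" or "10.0.0.5:8080"
    status_path=$2   # location mapped to the server-status handler
    case $listen in
        *:*) ;;                              # already host:port, keep as is
        *)   listen="localhost:$listen" ;;   # bare port: assume localhost
    esac
    printf 'http://%s%s\n' "$listen" "$status_path"
}

build_status_url 80 /server-status
build_status_url 10.0.0.5:8080 /server-status
```

With a bare port of 80 this produces http://localhost:80/server-status, the same URL that appears in the failing wget command in the error log.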


From the comments I concluded that the server-status check was not required in my setup, so the simplest fix was to comment it out. The problem appeared to be that the wget command itself was not being executed by the shell. Here is monitor_apache from the same script, with the check already commented out:

monitor_apache() {
  if
    ! have_binary $WGET
  then
    ocf_log err "Monitoring not supported by $OCF_RESOURCE_INSTANCE"
    ocf_log info "Please make sure that wget is available"
    return $OCF_ERR_CONFIGURED

  elif [ -z "$STATUSURL" ]; then
    ocf_log err "Monitoring not supported by $CONFIGFILE"
    ocf_log info "Please set the statusurl parameter"
    return $OCF_ERR_CONFIGURED
  fi

  if
    silent_status
  then
    # Status page check disabled. The ":" no-op keeps this branch
    # non-empty, since an empty "then" branch is a shell syntax error.
    :
    #ocf_run sh -c "$WGET $WGETOPTS $STATUSURL | tr '\012' ' ' | grep -Ei \"$TESTREGEX\" >/dev/null"
  else
    ocf_log info "$CMD not running"
    return $OCF_NOT_RUNNING
  fi
}
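
The disabled line is a pipeline, and a pipeline's exit status is that of its last command, here grep. As a rough sketch, the last two stages can be replayed on canned input without a web server. Note the assumptions: probe_matches is a made-up helper name, and the strings fed to it only stand in for wget output. It shows that the check also fails whenever wget prints nothing, for example if the status page is not being served:

```shell
#!/bin/sh
# Replays the tr and grep stages of the check on canned input, so neither
# wget nor a web server is needed. probe_matches is a hypothetical helper.
probe_matches() {
    # stdin: whatever wget would have printed
    tr '\012' ' ' | grep -Ei "[[:space:]]*" >/dev/null
}

# Any non-empty page gives grep a line to match, so the check passes.
if printf 'Apache Server Status\n' | probe_matches; then
    echo "non-empty page: check passes"
fi

# With no input at all grep has no line to match and exits non-zero.
if ! printf '' | probe_matches; then
    echo "empty output: check fails"
fi
```

Running the wget part of the logged command by hand and looking at what it prints is a quick way to tell these two cases apart.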


So I commented out this line:
#ocf_run sh -c "$WGET $WGETOPTS $STATUSURL | tr '\012' ' ' | grep -Ei \"$TESTREGEX\" >/dev/null"

and my problem was fixed.


node242:/etc/ha.d/resource.d # ./apache status
Script name is : /usr/lib/ocf/resource.d//heartbeat/apache
2009/05/08_02:46:29 INFO: Running OK
INFO: Running OK
