Re: [SLUG] automated sequential boot of multiple machines

From: Ian C. Blenke (icblenke@nks.net)
Date: Wed Nov 16 2005 - 18:07:02 EST


Mike Branda wrote:
> Hey all,
>
> I figure I'll put this out there. I'm starting to think of a way that I
> can bring machines up here in sequence when the power is out for more
> than the designated run time of the backups. Obviously some machines
> have services that need to be up before others. I've just been reduced
> back to a one man operation and if things go amuck I'd like to be able
> to run a script from a remote machine. I was starting to think of
> something that uses wake on lan or something combined with a check on
> that service before the next machine was given a "ring" to start up.
> Anybody do this in Linux yet?? Is there a project out there or a piece
> of hardware that does such a thing?? Anybody ever monkey with wake on
> lan??
>
Wake on LAN (WOL) works by sending a layer 2 ethernet frame (the "magic
packet") that names the MAC address of an ethernet card with the WOL
feature, which causes the card to power on the machine. If you use a PCI
ethernet card, there is usually a "WOL header" with a cable that runs
from the card to your motherboard to trigger the power-on of the system.
If you have a built-in ethernet card with WOL support, your motherboard
probably supports WOL natively.
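
To make the packet format concrete, here is a rough sketch of how the
magic packet payload is laid out: 6 bytes of 0xFF followed by the target
MAC address repeated 16 times. The function name magic_packet_hex is made
up for this example, and it only prints the payload as hex rather than
sending anything. In practice you would let a tool such as wakeonlan or
etherwake build and send the frame for you.

```shell
#!/bin/sh
# Sketch only: print the Wake-on-LAN magic packet payload as a hex string.
# magic_packet_hex is a hypothetical helper name, not part of any tool.
magic_packet_hex() {
    # Strip ':' or '-' separators from the MAC and lowercase the hex digits.
    mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
    case "$mac" in
        *[!0-9a-f]*) echo "invalid MAC: $1" >&2; return 1 ;;
    esac
    [ "${#mac}" -eq 12 ] || { echo "invalid MAC: $1" >&2; return 1; }

    # Payload = 6 bytes of 0xFF, then the MAC address repeated 16 times.
    packet="ffffffffffff"
    i=0
    while [ "$i" -lt 16 ]; do
        packet="$packet$mac"
        i=$((i + 1))
    done
    printf '%s\n' "$packet"
}
```

Calling magic_packet_hex 00:11:22:33:44:55 prints a 204 character hex
string (102 bytes) that starts with ffffffffffff001122334455.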

The typical use for this is in an enterprise environment where you want
to turn on user workstations in the middle of the night for a critical
after-hours job (typically backups and/or automated software updates).

The fun bit here is that you will need to disable "auto-power-on" and
enable WOL in your systems' BIOSes. This way, when the power comes back
on, they will not power back on automagically, but rather only when you
hit the power button or send a WOL packet to the ethernet card in the
system to power it on.
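
If you do go the WOL route, the remote script could look roughly like
the sketch below: wake one machine, poll its service port, and only then
move on to the next. Everything named here is an assumption for
illustration: the boot_in_order function, the WAKE_CMD, CHECK_CMD,
MAX_TRIES and RETRY_DELAY variables, and the "MAC HOST PORT" input
format. It also assumes a wakeonlan command and a netcat that supports
-z are installed on the machine running it.

```shell
#!/bin/sh
# Sketch only: wake machines one at a time, in the order given on stdin.
# Each input line is "MAC HOST PORT". All names below are hypothetical.
boot_in_order() {
    while read -r mac host port; do
        [ -n "$mac" ] || continue            # skip blank lines
        echo "waking $host ($mac)"
        # Assumes a wakeonlan-style tool that takes the MAC as its argument.
        ${WAKE_CMD:-wakeonlan} "$mac" </dev/null >/dev/null || return 1

        tries=1
        # Poll the service port; the next machine is not woken until it answers.
        until ${CHECK_CMD:-nc -z} "$host" "$port" </dev/null >/dev/null 2>&1; do
            if [ "$tries" -ge "${MAX_TRIES:-60}" ]; then
                echo "gave up on $host:$port"
                return 1
            fi
            tries=$((tries + 1))
            sleep "${RETRY_DELAY:-5}"
        done
        echo "$host:$port is up"
    done
}
```

You would feed it a file with one machine per line, database server
first, for example: boot_in_order < boot-order.txt (the file name is
also just an example). Stopping at the first machine that never comes up
is a deliberate choice here, since anything listed after it presumably
depends on it.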

Quite honestly, if I were you, I'd much rather write some service
polling logic into your init scripts that checks to make sure a remote
server is responding and that it is running a service that a given
system requires before allowing the system to continue booting.
Something as simple as a ping followed by a netcat (nc) / socat network
socket connect attempt to see if things are "ready" for the system to
continue. Write it as an infinite loop:

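          # Wait until the remote server answers ping.
          while ! ping -c 5 remoteserver ; do
             echo "Remote server is not pinging yet! Retrying"
             sleep 5
          done
          # Wait until the Postgres port accepts connections. -z makes nc
          # test the connection and exit instead of waiting on stdin.
          while ! nc -z remoteserver 5432 ; do
             echo "Remote server isn't running Postgres yet. Retrying"
             sleep 5
          done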

Put that in the "start" handling section of the /etc/init.d/ scripts for
any services that require a remote server.
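
If you would rather not have a box hang forever at boot when the remote
server never comes back, the same idea can be wrapped in a small helper
that gives up after a set number of tries. This is only a sketch: the
name wait_until and its argument order are made up for this example.

```shell
#!/bin/sh
# Sketch only: wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds. Returns 0 on success, or 1 once
# MAX_TRIES attempts have failed. wait_until is a hypothetical name.
wait_until() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while ! "$@" >/dev/null 2>&1; do
        if [ "$try" -ge "$max_tries" ]; then
            return 1
        fi
        try=$((try + 1))
        sleep "$delay"
    done
    return 0
}
```

In the "start" section that might be used as
wait_until 60 5 nc -z remoteserver 5432 || exit 1
which tries for roughly five minutes before refusing to start the
service. The -z flag is assumed to be supported by your nc; some netcat
variants do not have it.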

You will probably want to make sure that sshd is running before those
services, to make sure you can get into a system remotely.

If you want to get really fancy, you could always:
- Use something like heartbeat to handle failover and shared resources
to create an HA failover cluster.
or
- Build a Xen hosting cluster (with RedHat clustering: cman/dlm or gulm
for locking), and write a script that sparks off Xen domains in a
particular order across the cluster. This is something I'm playing with
in my spare time here.

Hope this helps.

 - Ian C. Blenke <ian@blenke.com> http://ian.blenke.com/

-----------------------------------------------------------------------
This list is provided as an unmoderated internet service by Networked
Knowledge Systems (NKS). Views and opinions expressed in messages
posted are those of the author and do not necessarily reflect the
official policy or position of NKS or any of its employees.



This archive was generated by hypermail 2.1.3 : Fri Aug 01 2014 - 20:11:48 EDT