Re: [SLUG] automated sequential boot of multiple machines

From: Mike Branda (mike@wackyworld.tv)
Date: Thu Nov 17 2005 - 14:48:53 EST


On Wed, 2005-11-16 at 18:07 -0500, Ian C. Blenke wrote:
> Mike Branda wrote:
> > Hey all,
> >
> > I figure I'll put this out there. I'm starting to think of a way that I
> > can bring machines up here in sequence when the power is out for more
> > than the designated run time of the backups. Obviously some machines
> > have services that need to be up before others. I've just been reduced
> > back to a one man operation and if things go amuck I'd like to be able
> > to run a script from a remote machine. I was starting to think of
> > something that uses wake on lan or something combined with a check on
> > that service before the next machine was given a "ring" to start up.
> > Anybody do this in Linux yet?? Is there a project out there or a piece
> > of hardware that does such a thing?? Anybody ever monkey with wake on
> > lan??
> >

> The fun bit here is that you will need to disable "auto-power-on" and
> enable WOL in your systems' BIOSes. This way, when the power comes back
> on, they will not power back on automagically, but rather only when you
> hit the power button or send a WOL packet to the ethernet card in the
> system to power it on.
>

Yeah, auto-power-on is what they do now. The mobos are Intel dual-Xeon
SE7501HG2s, and as soon as the power is restored after an extended
outage they power right up on their own.
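
For the WOL route from my original question, here is a rough sketch of
what I had in mind: wake one box, poll its service port, and only then
"ring" the next one. It assumes the wakeonlan utility is installed
(etherwake would be another option) and a netcat that supports -z. The
MAC addresses, hostnames, ports, and the function names are all made up
for illustration.

```shell
#!/bin/sh
# Sketch: wake machines one at a time, waiting for each one's service port
# before waking the next. MACs, hostnames and ports used here are hypothetical.

WAKE_CMD=${WAKE_CMD:-wakeonlan}   # assumed WOL sender; swap in another tool if needed
PROBE_DELAY=${PROBE_DELAY:-10}    # seconds between readiness checks

# Succeeds once host:port accepts a TCP connection (-z: connect, send nothing).
port_open() {
    nc -z "$1" "$2" >/dev/null 2>&1
}

# Send a magic packet to one machine, then poll until its service answers.
wake_and_wait() {
    mac=$1 host=$2 port=$3
    "$WAKE_CMD" "$mac" || return 1
    until port_open "$host" "$port"; do
        echo "$host:$port not ready yet, retrying"
        sleep "$PROBE_DELAY"
    done
    echo "$host:$port is up"
}

# Reads "MAC host port" triples from stdin, dependencies first.
boot_in_order() {
    while read -r mac host port; do
        # </dev/null keeps child commands from eating the remaining list
        wake_and_wait "$mac" "$host" "$port" </dev/null || return 1
    done
}

# Example (hypothetical addresses):
# boot_in_order <<EOF
# 00:11:22:33:44:55 dns1 53
# 00:11:22:33:44:66 db1 5432
# EOF
```

This only works if auto-power-on is disabled and WOL is enabled in the
BIOS, as Ian describes, and it still needs somebody to kick it off from
a machine that is already up.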

> Quite honestly, if I were you, I'd much rather write some service
> polling logic into your init scripts that check to make sure a remote
> server is responding and that it is running a service that a given
> system requires before allowing the system to continue booting.
> Something as simple as a ping followed by a netcat (nc) / socat network
> socket connect attempt to see if things are "ready" for the system to
> continue. Write it as an infinite loop:
>
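> # Wait until the remote host answers ping.
> while ! ping -c 5 remoteserver ; do
>     echo "Remote server is not pinging yet! Retrying"
>     sleep 5
> done
> # Wait until the Postgres port accepts a connection.
> # -z: only test the connect, so nc does not sit waiting on stdin.
> while ! nc -z remoteserver 5432 ; do
>     echo "Remote server isn't running Postgres yet. Retrying"
>     sleep 5
> done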
>
> Put that in the "start" handling section of the /etc/init.d/ scripts
> for things that require a remote server.
>

The more I think about it, the more I like this approach, since the WOL
script solution requires user intervention to get the thing going. Also,
as Eben commented, POST, local disk mounting, and other startup items
are independent and do not require a wait period. This way that stuff
could be done and out of the way up to the point where the wait is
required.
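
Roughly how I picture the wait sitting in the "start" section of an
init script. This is only a sketch: the host, port, and daemon path are
placeholders, the function names are mine, and the retry limit is my own
addition (Ian's loop waits forever) so that a box does not hang at boot
if the remote server never comes back.

```shell
#!/bin/sh
# Sketch of an /etc/init.d-style script that blocks in "start" until a
# remote dependency is reachable. Host, port and daemon path are hypothetical.

REMOTE_HOST=${REMOTE_HOST:-remoteserver}
REMOTE_PORT=${REMOTE_PORT:-5432}
MAX_TRIES=${MAX_TRIES:-60}      # give up eventually instead of hanging forever
RETRY_DELAY=${RETRY_DELAY:-5}   # seconds between probes

# One readiness probe: TCP connect to the service port, no data sent.
probe() {
    nc -z "$REMOTE_HOST" "$REMOTE_PORT" >/dev/null 2>&1
}

# Returns 0 once the probe succeeds, 1 after MAX_TRIES failed probes.
wait_for_remote() {
    tries=0
    until probe; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$MAX_TRIES" ]; then
            echo "Giving up on $REMOTE_HOST:$REMOTE_PORT after $tries tries"
            return 1
        fi
        echo "$REMOTE_HOST:$REMOTE_PORT not ready, retry $tries"
        sleep "$RETRY_DELAY"
    done
}

case "${1:-}" in
    start)
        wait_for_remote || exit 1
        # /usr/local/sbin/mydaemon   # hypothetical: start the dependent service here
        ;;
esac
```

Everything before this script in the boot order (POST, fsck, local
mounts) has already run by the time it blocks, so only the services that
really depend on the remote box end up waiting.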

> - Build a Xen hosting cluster (with RedHat clustering: cman/dlm or gulm
> for locking), and write a script that sparks off Xen domains in a
> particular order across the cluster. This is something I'm playing with
> in my spare time here.
>
> Hope this helps.
>
> - Ian C. Blenke <ian@blenke.com> http://ian.blenke.com/

Are you talking about Xen spread across multiple boxes, or a single box
hosting several virtual Xen domains?

Right now each service (ntp, dns, nis) runs on at least 2 separate
hardware hosts (masters and slaves), so the exposure to a single
hardware failure is reduced.
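
Since each service lives on more than one box, the wait loop could
probably accept whichever host answers first instead of insisting on the
master. A small sketch of that idea; the hostnames and the first_ready
helper are hypothetical, and it assumes a netcat with -z.

```shell
#!/bin/sh
# Sketch: treat a service as ready when any one of its redundant hosts answers.
# Hostnames and the port are hypothetical.

# One probe: TCP connect only, no data sent.
port_open() {
    nc -z "$1" "$2" >/dev/null 2>&1
}

# Usage: first_ready PORT HOST...
# Prints the first host that accepts a connection; returns 1 if none do.
first_ready() {
    port=$1
    shift
    for host in "$@"; do
        if port_open "$host" "$port"; then
            echo "$host"
            return 0
        fi
    done
    return 1
}

# Example: wait until either DNS box is up.
# until first_ready 53 dns-master dns-slave; do sleep 5; done
```

A TCP connect test like this fits dns but not ntp, which is UDP only, so
that one would need a different kind of check.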

Thanks for the ideas!

Mike Branda Jr.



