Re: [SLUG] Configuration Management and Linux Distributions in Large Scale Farms (was "hardware dependencies?")

From: steve szmidt (steve@szmidt.org)
Date: Wed Feb 01 2006 - 18:12:12 EST


On Wednesday 01 February 2006 12:16, Ian C. Blenke wrote:
> steve szmidt wrote:
>
> Unfortunately, very few machines are exactly the same. Every batch of
> motherboards we order from a given manufacturer's lot (with the same
> model number) seems to be basically identical. Each time we put in an
> order, however, we get a different batch of motherboards with slightly
> different chipsets and whatnot (cost savings for the manufacturer). It
> can be maddening if you're not flexible enough to handle that kind of
> rapid change in your hardware supply.

Hmm, yes, that can be a problem. I use a distributor who is very proactive
with their h/w: any changes and they notify me before shipping. You pay a bit
more, but you get the best customer service. If I try to buy anything
incompatible they point it out, and they'll also suggest a better product if
they know of one.

> We have one unified kernel source tree that I maintain. It has
> patches and fixes known to be required for our various hardware
> platforms in the farm. I've built a kernel building harness I call
> "kerncob" to build the dozens of kernels with the slightly differing
> .config files required for each. In the end, they all behave identically
> from a userspace perspective - the same Debian packages for kernel
> modules (like ALSA, Openswan, etc.), the only difference being that the
> correctly optimized/configured kernel is installed on each box
> automagically via creative use of /proc/cpuinfo, lspci, etc.

Sounds interesting.
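For anyone following along at home: "kerncob" itself isn't public, so the
flavor names and function below are purely illustrative, but the idea of
picking a kernel flavor from /proc/cpuinfo can be sketched in a few lines of
shell:

```shell
# Illustrative only -- the flavors and matching rules here are made up
# to show the idea of selecting a kernel package from CPU identification.
pick_flavor() {
    # $1 is the "model name" line from /proc/cpuinfo
    case "$1" in
        *Opteron*|*Athlon*64*) echo k8 ;;
        *Pentium*4*|*Xeon*)    echo p4 ;;
        *)                     echo generic ;;
    esac
}

# On a live box you would feed it the real value:
#   pick_flavor "$(grep -m1 '^model name' /proc/cpuinfo)"
pick_flavor "model name : AMD Opteron(tm) Processor 250"
```

A wrapper like this would then map the flavor to the matching kernel package
name and install it at deploy time.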

> > I agree that there are a lot of differences across distros - the way
> > they integrate various packages. But I'd argue that the kernel is
> > probably the least of your problems. How a package has been integrated
> > is, in my view, the problem we see: the version that distro used, and
> > how they configured it to work for them. There's a massive number of
> > variables. (Maybe that's what you meant?)
>
> Yes. Massive amount of variables. And each variable is dependent on that
> package maintainer's view of How Things Should Be. At NKS, we have our

True enough. For example, I always change mc to display owner, group, and
permissions in the panel below; by default it shows the same information that
is already on every line above. That's a very simple and non-destructive
example, but that sort of variation is visible everywhere.

> own idea about that, and try to make metapackages that are centrally
> maintainable yet apply at the edge in the same reproducible way every time.

> I stress the concept of "little switches" and "big switches" here. A
> "little switch" is a configuration option that an OpenSource package
> sees inside its configuration file. A "big switch" is a configuration
> switch that we set for our metapackage to model the machine's
> configuration state. The metapackages on every box turn the "big
> switches" into "little switches" for the packages that they envelop. By
> developing in this way, we can add "big switches" as necessary to enable
> features for our customers, yet hide the complexity of the "little
> switches" inside each metapackage. Each metapackage begins with a
> "template" configuration, and proceeds to set the many little switches
> as appropriate for each big switch setting. This is enforced every time
> from the central configuration repository, and will override any local
> configuration changes (debugging and troubleshooting should happen at
> the edge, _not_ configuration).

Mmm, I love that kind of automation. Good control over many machines.
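If I understand the big-switch/little-switch idea, it could be sketched like
this - the roles ("mailhub", "edge") and per-package options below are
entirely invented for illustration; a real metapackage would write these
values into each package's config file from its template:

```shell
# Hypothetical sketch: one "big switch" (a machine role) expands into
# the "little switches" that the enveloped packages actually read.
expand_big_switch() {
    case "$1" in
        mailhub)
            echo 'postfix relayhost='
            echo 'postfix mynetworks=10.0.0.0/8'
            echo 'spamassassin ENABLED=1'
            ;;
        edge)
            echo 'postfix relayhost=mailhub.internal'
            echo 'spamassassin ENABLED=0'
            ;;
        *)
            echo "unknown role: $1" >&2
            return 1
            ;;
    esac
}

expand_big_switch mailhub
```

The point being that the admin only ever sets the one role, and the many
package-level settings fall out of it deterministically.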

> Any machine in our farm can die at any time, and we can redeploy a base
> imaged machine purposed as that machine within minutes. Every system's
> configuration state is maintained in our central configuration
> management structure. We have historical dirvish backups of everything
> should we need to restore data (if it could not be restored from the
> dying/dead machine).

Neat!
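For list readers who haven't met dirvish: it keeps rotating, hard-linked
rsync snapshots, configured per "vault" with a small default.conf. Roughly
like this - the hostname and excludes are made up:

```
# dirvish vault config (dirvish/default.conf inside the vault);
# client and exclude paths here are illustrative.
client: web01.example.com
tree: /
xdev: 1
index: gzip
image-default: %Y%m%d-%H%M
exclude:
	/proc/
	/sys/
	/tmp/
```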

> So, yes, we maintain our own distribution. It's not Debian per se, but a
> meta-package distribution _consisting_ of Debian packages. Sure, it
> looks like a Debian machine for all intents and purposes, but our
> Configuration Management structure is a scaffolding integrated into the
> fabric of every system. You would not apt-get from an external
> repository, for example; every metapackage has its own mini-apt
> repository of the packages that comprise it.
>
> I'll never go back to managing systems individually. I'll also never go
> back to blindly trusting any public repository. If you want to be in
> complete control of the state of every machine in your farm, you _must_
> use some form of configuration management. Ours is homegrown, but very
> similar to ISConf in a number of ways. I'm also actively looking at
> Puppet as a potential migration path.
>
> In my opinion, you _must_ use some form of Configuration Management for
> more than a few machines.

I've obviously never needed to build enough servers in one location to need
that level of fine-grained control. But it sounds like you've got a pretty
nice and efficient setup.

-

Steve Szmidt

"For evil to triumph, all that is needed is for good men to do nothing."
                                                - Edmund Burke
-----------------------------------------------------------------------
This list is provided as an unmoderated internet service by Networked
Knowledge Systems (NKS). Views and opinions expressed in messages
posted are those of the author and do not necessarily reflect the
official policy or position of NKS or any of its employees.



This archive was generated by hypermail 2.1.3 : Fri Aug 01 2014 - 18:05:27 EDT