Re: [SLUG] bash script fork children

From: Eben King (eben01@verizon.net)
Date: Wed Jan 03 2007 - 17:35:25 EST


On Wed, 3 Jan 2007, Mike Branda wrote:

> So I've been playing with ImageMagick to batch process psd files into
> flattened jpgs. I have a bash script that runs a for loop with
> "convert" on a directory. What I have noticed is that in xosview on my
> Dual Intel Hyper Threaded box ( shows up as 4 cpu's cause of the HT ),
> only one ( or one half? ) CPU is doing all the work. Makes sense
> because of the whole user space kernel space thing and I don't really
> need both CPU's working on pieces of the same image anyway.
>
> What I would like though is a way to fork multiple children in the loop
> to start additional instances of convert until 4 are active. Then wait
> until one exits and then start on the next image. 4 active converts
> always until all the images are processed.

The easy (but inelegant) way is to start them all in the background (i.e.
with "&") and have the script's last command be "wait". With a few hundred
images that starts a few hundred processes at once, which can eat your
memory and swap and leave the CPUs thrashing. And if some steps depend on
others, make sure they still run in order; e.g.

{
   stage_1
   stage_2
   stage_3
} &

runs the three stages in order inside one background job, which is not the
same as

stage_1 &
stage_2 &
stage_3 &

which starts all three at once.
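
The start-everything-then-wait approach might look roughly like the sketch
below. The function name, the CONVERT variable, and the "-flatten" call are
my own assumptions for illustration, not Mike's actual script.

```bash
#!/bin/bash
# Sketch only: one background job per file, then wait for all of them.
# CONVERT is a hypothetical knob so the command can be swapped out;
# it defaults to ImageMagick's "convert".
CONVERT=${CONVERT:-convert}

convert_all() {
    local dir="$1" f
    for f in "$dir"/*.psd; do
        [ -e "$f" ] || continue    # glob matched nothing, skip the literal
        # Assumed invocation: flatten the layers and write a .jpg next to it.
        "$CONVERT" "$f" -flatten "${f%.psd}.jpg" &
    done
    wait    # returns once every background job has exited
}
```

Calling "convert_all /some/dir" then launches as many converts as there are
.psd files, which is exactly the drawback described above.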

Slicker would be a utility that holds off starting a new process until one
of the running ones exits. I don't know if something like this exists. Or
you could do as Dylan (I think) suggested and use "make".
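
One candidate for such a utility is xargs: the GNU and BSD versions take a
"-P" option that caps how many commands run at once (it is an extension, not
POSIX). A minimal sketch, assuming the same "-flatten" invocation as a
stand-in and a hypothetical CONVERT variable:

```bash
#!/bin/bash
# Sketch only: xargs keeps at most $jobs converters running at a time.
# -P and -0 (xargs) and -maxdepth/-print0 (find) are GNU/BSD extensions.
CONVERT=${CONVERT:-convert}    # hypothetical knob; must be an executable here

convert_dir_xargs() {
    local dir="$1" jobs="${2:-4}"
    find "$dir" -maxdepth 1 -name '*.psd' -print0 |
        xargs -0 -n 1 -P "$jobs" sh -c '
            [ $# -gt 0 ] || exit 0    # some xargs run once on empty input
            "$0" "$1" -flatten "${1%.psd}.jpg"
        ' "$CONVERT"
}
```

Here "$0" inside the sh -c script is the converter and "$1" is the single
filename xargs hands over, so "convert_dir_xargs /some/dir 4" starts a new
convert whenever one of the four exits.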

You could have a function take a filename (or pair thereof) and add it to
the end of a queue. Then a daemon could make sure there are always N (where
N=4) "convert" processes running, fed with the filenames in that queue.
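
A much-simplified version of that idea, with standard input playing the
queue and a loop playing the daemon, could be sketched as follows. It leans
on "wait -n", which needs bash 4.3 or newer; run_queue, MAX_JOBS, and
CONVERT are names I made up for the sketch.

```bash
#!/bin/bash
# Sketch only: read filenames (one per line) from stdin and keep at most
# MAX_JOBS converters running. Requires bash >= 4.3 for "wait -n".
CONVERT=${CONVERT:-convert}    # hypothetical knob, defaults to ImageMagick
MAX_JOBS=${MAX_JOBS:-4}

run_queue() {
    local f running=0
    while IFS= read -r f; do
        if [ "$running" -ge "$MAX_JOBS" ]; then
            wait -n                    # block until any one job exits
            running=$((running - 1))
        fi
        # Assumed invocation; background jobs get /dev/null as stdin in a
        # script, so they do not swallow the rest of the queue.
        "$CONVERT" "$f" -flatten "${f%.psd}.jpg" &
        running=$((running + 1))
    done
    wait                               # drain the last few jobs
}
```

Something like "printf '%s\n' /some/dir/*.psd | run_queue" would feed it. A
real queue that accepts new files while running would need a FIFO or a spool
directory instead of a pipe.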

> Right now I just create 4 different dirs and run 4 instances of the
> shell script ( which calls convert ) on them. This makes the batching
> of a few hundred images go much faster.

As you know, this can be inefficient: if one directory finishes early, its
CPU sits idle while the others are still working.

> I see
>
> mbranda:~> apropos fork
>
> fork (2) - create a child process
> vfork (2) - create a child process and block parent
> fork (3p) - create a new process
> vfork (3p) - create a new process; share virtual memory
>
>
> Any ideas or suggestions? Anybody actually use the fork command in a
> script before?

Those aren't commands, they're C functions (section 2 of the manual is
system calls, 3p is the POSIX programmer's manual). But I don't think
"(v)fork" gives you anything that "&" doesn't. Well, "fork" returns the
child's PID, but that's easy to get with "$!".
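
For instance, a minimal sketch of "&" plus "$!" standing in for fork() and
its return value (the "sleep 1" is just a placeholder job):

```bash
#!/bin/bash
# "&" forks a child to run the command; $! is the PID of that child.
sleep 1 &
pid=$!
echo "started child $pid"

wait "$pid"      # wait for that specific child, like waitpid() in C
status=$?        # exit status of the child
echo "child $pid exited with status $status"
```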

-- 
-eben      QebWenE01R@vTerYizUonI.nOetP      http://royalty.no-ip.org:81
ARIES:  The look on your face will be priceless when you find that 40lb
watermelon in your colon.  Trade toothbrushes with an albino dwarf, then
give a hickey to Meryl Streep.  -- Weird Al, _Your Horoscope for Today_


