Re: [SLUG] bash woes - my turn

From: Eben King (eben1@tampabay.rr.com)
Date: Wed Aug 03 2005 - 17:13:36 EDT


On Wed, 3 Aug 2005, Mike Branda wrote:

> Well our web host uses analog to give stats but it doesn't do unique
> visitors. I found a project called "visitors" that does and also
> outputs visio style traffic flowcharts. However, they only give me real
> apache log files for 5 days. After that all you get is what analog
> outputs for up to 2 years I think. So I've started cat-ing the
> access.log.0 (the 1 day old logrotate file) to the end of a cumulative
> file so I can use visitors.

Isn't that cumulative file going to grow without bound?

> I'm also gzipping the last month's file so
> I can conserve as much space as possible. It worked great the last 2
> months but now I have the current month's unzipped file, and 2 gzipped
> files and my weak attempt at a bash script coughed. It worked great
> from a cron job with 1 gz and 1 unzipped. So the issue is that I need
> to somehow say for file in access.*.gz (where * is monthly date). Try
> not to laugh at this too hard as it's only my second attempt at a bash
> script. :^) I built in an else and elif to handle if one of the files
> is missing somehow. So the deal is that $file1 is now multiple files.
> also keep in mind that evolution is wrapping the long lines.

> #!/bin/bash
>
> home="/home/wws1"
> logdir="$home/cumulativelog"
> file1="access.log.*.gz"

I hope you aren't relying on some program's behavior of "when presented with
multiple files e.g. 'access.jan.log.gz access.feb.log.gz access.mar.log.gz'
only deal with the first one" (and the fact that $file1 isn't in ""),
because that'll break as soon as somebody touch(1)es a previous month's
file. I like Bill Glidden's idea of using date(1).
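Something along these lines would make the "multiple files" case explicit instead of hiding it in a string. It's only a sketch: the directory and the monthly file names are made up for the demonstration, and I'm assuming bash (arrays and nullglob aren't in plain sh).

```shell
#!/bin/bash
# Sketch: expand the monthly glob into an array instead of keeping it in a string.
# The directory and file names below are invented for the demonstration.
logdir=$(mktemp -d)
touch "$logdir/access.log.2005-06.gz" "$logdir/access.log.2005-07.gz"

shopt -s nullglob                     # an unmatched glob expands to nothing
gzfiles=( "$logdir"/access.log.*.gz ) # one array element per matching file
shopt -u nullglob

echo "found ${#gzfiles[@]} compressed logs"
for f in "${gzfiles[@]}"; do
    echo "would gunzip: ${f##*/}"     # ${f##*/} strips the directory part
done
rm -r "$logdir"
```

With the files in an array, "${#gzfiles[@]}" tells you whether there are zero, one or many, and "${gzfiles[@]}" hands all of them to gunzip -c in one go.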

> file2="access.log"
> outputfilea="$home/report/index.html"
> outputfileb="$home/cumulativelog/graph.dot"
> outputfilec="$home/cumulativelog/graphfix.dot"
> outputgraph="$home/report/graph.png"
> commanda="$home/bin/visitors -A -m 30 -o html -f $outputfilea -"
> commandb="$home/bin/visitors -T -V -"
> commandc="$home/bin/dot"

The shell already sets $HOME (uppercase) for you, so there's no need for your
own $home, unless your environment is weird.

> if [ -e $logdir/$file1 ] && [ -e $logdir/$file2 ]

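You could maybe speed that up, dunno:

if [[ ( -e $logdir/$file1 ) && ( -e $logdir/$file2 ) ]]

With [], if $foo contains a space or other "bad" character, you have to
write "$foo". Not so with [[]]. One catch: [[]] doesn't expand globs
either, so the "*" in $file1 is taken literally there, and this only works
once $file1 names a single real file.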
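To see the difference between the two test forms for yourself, here is a small sketch. The file names are made up, and it assumes bash and a temp directory whose path has no spaces in it.

```shell
#!/bin/bash
# Sketch: how [ ] and [[ ]] treat an unquoted variable. File names are invented.
dir=$(mktemp -d)
touch "$dir/my file" "$dir/a.gz" "$dir/b.gz"

name="$dir/my file"
pattern="$dir/*.gz"

# [[ ]] does no word splitting, so the space is harmless without quotes.
if [[ -e $name ]]; then spaced=found; else spaced=missing; fi

# [[ ]] does no pathname expansion either: this looks for a file literally
# named "*.gz", which does not exist.
if [[ -e $pattern ]]; then globbed=found; else globbed=missing; fi

# [ ] expands the glob into two words, gets confused by the extra argument
# and returns a non-zero (error) status.
[ -e $pattern ] 2>/dev/null
status=$?

echo "spaced=$spaced globbed=$globbed status=$status"
rm -r "$dir"
```

The last case is the same thing that happens to the original script once a second .gz file shows up.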

> then
> echo "starting visitors with both files"
> (gunzip -c $logdir/$file1 && cat $logdir/$file2) | $commanda
> (gunzip -c $logdir/$file1 && cat $logdir/$file2) | $commandb > $outputfileb

No need to decompress it twice (unless decompression is faster than
writing to disk):

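# $file1, $commanda and $commandb stay unquoted on purpose: the glob has to
# expand and the command strings have to split into command + arguments.
( gunzip -c "$logdir"/$file1 && cat "$logdir/$file2" ) > "$tempfile"
$commanda < "$tempfile" &
$commandb < "$tempfile" > "$outputfileb" &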
sleep 1 # or some small value, just so that $commanda and $commandb open
        # $tempfile. Most programs don't mind reading from a deleted file,
        # and your script might as well do other stuff while that churns.
rm "$tempfile"

And at the end of the script (or before you need $commanda's output):

wait
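Here's the same idea as a self-contained sketch you can run anywhere. wc and sort stand in for the two visitors commands, the log data is made up, and I've moved the rm to after the wait, which is a variation on the above that avoids having to guess a sleep time.

```shell
#!/bin/bash
# Sketch: decompress once, feed two background readers, then wait.
# wc and sort are stand-ins for the two visitors commands; the data is invented.
workdir=$(mktemp -d)
printf 'old line\n' | gzip -c > "$workdir/access.log.1.gz"
printf 'new line\n' > "$workdir/access.log"

tempfile=$(mktemp)
# The subshell groups both outputs so that one redirection catches them.
( gunzip -c "$workdir"/access.log.*.gz && cat "$workdir/access.log" ) > "$tempfile"

wc -l < "$tempfile" > "$workdir/count.txt" &
sort  < "$tempfile" > "$workdir/sorted.txt" &

wait                      # returns once both background readers have finished
rm "$tempfile"            # safe now: nobody is reading it any more

lines=$(tr -d ' ' < "$workdir/count.txt")
first=$(head -n 1 "$workdir/sorted.txt")
echo "lines=$lines first=$first"
rm -r "$workdir"
```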

> cat $outputfileb | sed 's/\[splines=true/\[splines=true fontpath="\/home\/wws1\/fonts\/"/' > $outputfilec

UUOC (useless use of cat), and you can use any delimiter in a sed "s" command:

sed 's@\[splines=true@\[splines=true fontpath="/home/wws1/fonts/"@' < "$outputfileb" > "$outputfilec"

Keep the quotes on the fontpath, though: the file ends up being read by dot,
and the DOT language wants double quotes around a value that contains slashes.
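A quick way to convince yourself the two spellings do the same thing is the sketch below. The input line is a made-up stand-in for what visitors writes, not real output.

```shell
#!/bin/bash
# Sketch: the same substitution with "/" and with "@" as the delimiter.
# The input line is an invented stand-in for the generated .dot file.
line='graph [splines=true overlap=false]'

# With "/" as the delimiter, every slash in the path needs a backslash.
with_slash=$(printf '%s\n' "$line" |
    sed 's/\[splines=true/\[splines=true fontpath="\/home\/wws1\/fonts\/"/')

# With "@" as the delimiter, the path can be written as it is.
with_at=$(printf '%s\n' "$line" |
    sed 's@\[splines=true@\[splines=true fontpath="/home/wws1/fonts/"@')

echo "$with_at"
```

Any character that doesn't occur in the pattern or the replacement will do as the delimiter; "@", "|" and "," are the usual picks.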

> $commandc $outputfilec -Tpng > $outputgraph

If you don't want to keep $outputfileb and $outputfilec, you could do
something like

$commandb < "$tempfile" | sed ... | "$commandc" - -Tpng > "$outputgraph"

> elif [ -e $logdir/$file2 ]
> then
> echo "starting visitors with $file2"
> cat $logdir/$file2 | $commanda
> cat $logdir/$file2 | $commandb > $outputfileb
> cat $outputfileb | sed 's/\[splines=true/\[splines=true fontpath="\/home\/wws1\/fonts\/"/' > $outputfilec

UUOC, and the sed thing:

$commanda < "$logdir/$file2"
$commandb < "$logdir/$file2" > "$outputfileb"
sed 's@\[splines=true@\[splines=true fontpath="/home/wws1/fonts/"@' < "$outputfileb" > "$outputfilec"

> $commandc $outputfilec -Tpng > $outputgraph

pipe thing:

$commandb < "$logdir/$file2" | sed ... | "$commandc" - -Tpng > "$outputgraph"

> else
> echo "starting visitors with $file1"
> zcat $logdir/$file1 | $commanda
> zcat $logdir/$file1 | $commandb > $outputfileb

$tempfile:

gunzip -c "$logdir"/$file1 > "$tempfile"
$commanda < "$tempfile" &
$commandb < "$tempfile" > "$outputfileb" &
sleep 1 # or whatever
rm "$tempfile"

> cat $outputfileb | sed 's/\[splines=true/\[splines=true fontpath="\/home\/wws1\/fonts\/"/' > $outputfilec

sed thing, and UUOC:

sed 's@\[splines=true@\[splines=true fontpath="/home/wws1/fonts/"@' < "$outputfileb" > "$outputfilec"

> $commandc $outputfilec -Tpng > $outputgraph

pipe thing:

$commandb < "$tempfile" | sed ... | "$commandc" - -Tpng > "$outputgraph"

> fi

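If you use $tempfile, don't forget to define it (mktemp(1) is the safe way).
$commandc may not do the "- = stdin" thing; dot reads stdin when it's given
no input file, so if it chokes on the "-", just leave it out.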
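For defining $tempfile, and for keeping a command plus its arguments in one variable without the quoting headaches, something like this might do. It's a sketch assuming bash: tr is only a stand-in for the visitors command, and the trap is my addition rather than anything the original script had.

```shell
#!/bin/bash
# Sketch: define a temp file that cleans itself up, and keep a command with
# its arguments in an array. tr is a stand-in for the visitors command.
tempfile=$(mktemp) || exit 1
trap 'rm -f "$tempfile"' EXIT        # remove the temp file however we exit

commanda=( tr '[:lower:]' '[:upper:]' )   # one array element per word
printf 'hello log\n' > "$tempfile"

# "${commanda[@]}" expands to the command and each argument as separate
# words, so it can be quoted safely, unlike a command kept in a plain string.
result=$( "${commanda[@]}" < "$tempfile" )
echo "$result"
```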

-- 
-eben    ebQenW1@EtaRmpTabYayU.rIr.OcoPm    home.tampabay.rr.com/hactar
TAURUS:  You will never find true happiness - what you gonna
do, cry about it?  The stars predict tomorrow you'll wake up,
do a bunch of stuff and then go back to sleep.  -- Weird Al
