Re: [SLUG] bogofilter

From: Derek Glidden (dglidden@illusionary.com)
Date: Tue Sep 17 2002 - 12:12:50 EDT


On Mon, 2002-09-16 at 22:17, Paul M Foster wrote:
> There was a write-up on the front page of the weekly version of LWN this
> week (http://www.lwn.net) that compared the new bogofilter with
> spamassassin.
[snip]

I've been using bogofilter for a few weeks now (since 0.4) and
completely replaced SpamAssassin with it about a week ago. I
totally agree that it is A Good Thing In The Fight Against SPAM.

It does take some training, and it's still not 100% for me, but it's
getting very, very close, while SpamAssassin's rules were starting to age
and become less effective. (I think in no small part due to the fact that
the SpamAssassin guys have gone commercial lately and I haven't seen any
updates to SpamAssassin for a while now.)

If you want to start playing with bogofilter, I'd highly recommend using
it in conjunction with something else like SpamAssassin to catch SPAM
until bogofilter has been trained to the point where you can start
relying on it primarily.

An interesting side effect of using bogofilter is that it keeps track of
how many emails it has categorized as SPAM and non-SPAM. Since I started
using it, I've had 1341 non-SPAM and 287 SPAM. That's a pretty large
ratio, even more so when you consider that a HUGE number of those
"non-SPAM" emails come from closed/moderated lists.
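
To put numbers on that ratio, here is a quick sketch of the arithmetic.
The two counts are the ones quoted above; the function name is just
something I made up for illustration, not anything bogofilter provides.

```python
def spam_share(ham_count, spam_count):
    """Return the fraction of all classified mail that was SPAM."""
    return spam_count / (ham_count + spam_count)


# Counts reported above: 1341 non-SPAM, 287 SPAM.
ham, spam = 1341, 287
print(f"{spam_share(ham, spam):.1%} of classified mail was SPAM")
print(f"roughly {ham / spam:.1f} non-SPAM messages per SPAM message")
```

So a bit more than one message in six was SPAM, and that share would be
higher if the list traffic were left out of the non-SPAM count.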

FWIW: the original article that outlined the use of Naive Bayes
filters for SPAM filtering was on Paul "LISP Wizard" Graham's website
at http://www.paulgraham.com/ which has a number of other very
interesting articles he's written about LISP and the computer industry
in general. Highly recommended reading for hackers.
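
For anyone who hasn't seen the idea before, here is a minimal sketch of
a textbook Naive Bayes text classifier. To be clear, this is NOT
bogofilter's code and not the exact formula from Graham's article (he
combines per-word probabilities somewhat differently). The class name,
the whitespace tokenizer and the add-one smoothing are all my own
assumptions, chosen just to show the general technique.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split. Real filters tokenize
    # headers and bodies far more carefully.
    return text.lower().split()


class NaiveBayesFilter:
    """Toy two-class Naive Bayes classifier ("spam" vs "ham")."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # label must be "spam" or "ham"
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Class prior, with add-one smoothing.
            score = math.log(
                (self.message_counts[label] + 1) / (total_messages + 2)
            )
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Per-word likelihood, add-one smoothed so that unseen
                # words never produce a zero probability.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Turn the two log scores into a probability without underflow.
        top = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - top)
        ham = math.exp(log_scores["ham"] - top)
        return spam / (spam + ham)


if __name__ == "__main__":
    f = NaiveBayesFilter()
    f.train("buy cheap pills now", "spam")
    f.train("cheap pills free offer", "spam")
    f.train("meeting notes attached for the linux users group", "ham")
    f.train("patch for the kernel build attached", "ham")
    print(f.spam_probability("cheap pills"))
    print(f.spam_probability("kernel patch attached"))
```

The "training" mentioned earlier is just the train() step: the more
mail of each kind the filter has counted, the better the word
statistics get.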

There is also another piece of software out there that does Naive Bayes
filtering on email and has been around a lot longer than bogofilter,
called ifile. (http://www.ai.mit.edu/~jrennie/ifile/) (Not entirely
irrationally, a lot of die-hard ifile users are a little peeved at ESR
getting a lot of attention for bogofilter. They say they've got the
more mature application, and ask why ESR can't just contribute to ifile
instead of writing his own and getting all the attention. The politics
never end...) ifile, however, isn't specifically designed to do SPAM
filtering, so it takes some extra effort to use it as such.

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#!/usr/bin/perl -w
$_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map
{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;
$t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)
[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join
"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*
8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}
print+x"C*",@a}';s/x/pack+/g;eval 

usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \
    | extract_mpeg2 | mpeg2dec -

http://www.cs.cmu.edu/~dst/DeCSS/Gallery/
http://www.eff.org/
http://www.anti-dmca.org/


