Re: [SLUG] Training bogofilter problem

From: Ian C. Blenke (icblenke@nks.net)
Date: Sat Sep 09 2006 - 12:52:14 EDT


Bob Stia wrote:
> The spammers use all kinds of words and phrases. Is bogofilter confused by the
> amount of info in its spam directory and applying it to everything? Do I
> have to resort to listing all of my several hundred email addresses in the
> filters to be passed as ham?
>
> Would appreciate your thoughts, experiences, advice, whatever, in your use of
> bogofilter.
>

I gave up on bogofilter a while ago. It became monstrously huge, and
horribly tainted by "anti-Bayesian" spam mumbo jumbo.

My domain has been on the net since '94 and receives tens of thousands
of emails a day. I'm on every spamming list possible.

Today I use a custom MailScanner + SpamAssassin + SARE rules + custom
rules + all of the possible SpamAssassin plugins (pyzor, razor2, dcc,
etc.), and an auto-training Bayesian database that auto-expires
anything older than a week.
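
To give a rough idea of what "auto-expires anything older than a week"
means, here is a toy Python sketch. It is only an illustration of the
idea, not how SpamAssassin's Bayes store is actually implemented; the
class name, field layout, and one-week constant are all made up for
the example.

```python
import time

WEEK_SECONDS = 7 * 24 * 60 * 60  # assumed expiry window: one week


class ExpiringTokenStore:
    """Toy token database: per-token spam/ham counts plus a last-seen time."""

    def __init__(self, max_age=WEEK_SECONDS):
        self.max_age = max_age
        self.tokens = {}  # token -> [spam_count, ham_count, last_seen]

    def learn(self, words, is_spam, now=None):
        """Record one message's tokens as spam or ham and refresh their age."""
        now = time.time() if now is None else now
        for word in set(words):  # count each token once per message
            entry = self.tokens.setdefault(word, [0, 0, now])
            entry[0 if is_spam else 1] += 1
            entry[2] = now

    def expire(self, now=None):
        """Drop tokens not seen within max_age; return how many were dropped."""
        now = time.time() if now is None else now
        cutoff = now - self.max_age
        stale = [tok for tok, entry in self.tokens.items() if entry[2] < cutoff]
        for tok in stale:
            del self.tokens[tok]
        return len(stale)
```

Calling expire() on a schedule keeps the database small, and tokens
that spammers stuffed in weeks ago simply age out instead of skewing
the weightings forever.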

Even that doesn't catch _everything_. For the bulk of the remainder, I
use Thunderbird's spam filtering. A few things sneak through, but it is
not bad at all.

Now and again, I get the urge to use CRM114 or dspam in there as well.
Eventually that will happen.

Bogofilter? Sure, you can use that as one tier of anti-spam, I suppose.
But don't rely on Bogofilter alone, and learn to retrain it
periodically to prune tainted weightings.

- Ian



-----------------------------------------------------------------------
This list is provided as an unmoderated internet service by Networked
Knowledge Systems (NKS). Views and opinions expressed in messages
posted are those of the author and do not necessarily reflect the
official policy or position of NKS or any of its employees.