Re: [SLUG] spamassassin vs pine

From: Mike Branda (mike@wackyworld.tv)
Date: Wed Dec 14 2005 - 13:49:34 EST


On Wed, 2005-12-14 at 12:20 -0500, steve szmidt wrote:
> On Wednesday 14 December 2005 12:01, Eben King wrote:
> > I put Spamassassin on my machine to deal with the spam issue (which is not
> > severe, by "abuse@" standards, but still), and it doesn't seem to be
> > getting better. Correction, the two-week average of "% spams caught" has
> > gone from 38% two weeks ago to 44% now. There have been no false
> > positives, for which I'm grateful, but I'd put up with a few false
> > positives for a better kill rate. I got fed up with it, so I made a
> > spreadsheet to see _how bad_ it was. It's at
> > http://24.94.123.65:81/spam.xls . I've fed every missed spam to it by "|
> > sa-learn --spam" from pine. What am I doing wrong? Shouldn't it be
> > improving?
>
> You also need to tell it what good email (ham) looks like. It should be a
> 50-50 split when you train it.
>

Also from "info Mail::SpamAssassin::Conf" :

bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)

        To be accurate, the Bayes system does not activate until a
        certain number of ham (non-spam) and spam have been learned. The
        default is 200 of each ham and spam, but you can tune these up
        or down with these two settings.
        

Also, as a side note, make sure "use_bayes 1" (no quotes) is set in your
local.cf file. Make sure you run sa-learn with --ham on a mailbox full of
good, spam-free e-mail. I set required_hits down from its default of 5 to
3 in the user_prefs file. That will tighten the belt a little.

# How many hits before a mail is considered spam.
required_hits 3
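
For the ham training mentioned above, something along these lines might
work as a rough sketch. The two mailbox paths are only my guesses at where
pine keeps its folders, so treat HAM_MBOX and SPAM_MBOX as hypothetical and
point them at your own files. The --ham, --spam and --mbox options are
regular sa-learn options; the train function is just a helper made up for
this example.

```shell
#!/bin/sh
# Sketch: train SpamAssassin's Bayes database from two mbox files.
# HAM_MBOX and SPAM_MBOX are assumed paths, not anything from the thread.
HAM_MBOX="${HAM_MBOX:-$HOME/mail/saved-messages}"
SPAM_MBOX="${SPAM_MBOX:-$HOME/mail/spam}"

train() {
    # $1 = --ham or --spam, $2 = mbox file to learn from
    if [ ! -f "$2" ]; then
        echo "skip: $2 not found"
        return 0
    fi
    if ! command -v sa-learn >/dev/null 2>&1; then
        echo "skip: sa-learn not installed"
        return 0
    fi
    # --mbox tells sa-learn the file holds many messages, not just one
    sa-learn "$1" --mbox "$2"
}

train --ham "$HAM_MBOX"
train --spam "$SPAM_MBOX"
```

Running it again later on the same folders should be harmless, since
sa-learn skips messages it has already learned.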

Also, there's "spamassassin -D --lint", which might reveal other, deeper
problems.

Mine at our office has about a 98% catch rate. However, the above
command shows:

debug: bayes corpus size: nspam = 1857, nham = 1267

So that may be why it's so accurate! ;^)
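
If you want to compare your own numbers against the 200/200 defaults
quoted above, here is a small sketch. It only parses a line shaped like
the one I pasted; check_corpus, MIN_HAM and MIN_SPAM are names invented
for the example, and the exact prefix of the debug line may differ between
SpamAssassin versions.

```shell
#!/bin/sh
# Sketch: read a "bayes corpus size" debug line and compare it with the
# bayes_min_ham_num / bayes_min_spam_num defaults (200 each).
MIN_HAM=200
MIN_SPAM=200

check_corpus() {
    # $1 = a line such as "debug: bayes corpus size: nspam = 1857, nham = 1267"
    nspam=$(printf '%s\n' "$1" | sed -n 's/.*nspam = \([0-9][0-9]*\).*/\1/p')
    nham=$(printf '%s\n' "$1" | sed -n 's/.*nham = \([0-9][0-9]*\).*/\1/p')
    if [ -z "$nspam" ] || [ -z "$nham" ]; then
        echo "no corpus size found"
        return 1
    fi
    # Bayes is only used once both counts reach their minimums
    if [ "$nspam" -ge "$MIN_SPAM" ] && [ "$nham" -ge "$MIN_HAM" ]; then
        echo "bayes active: nspam=$nspam nham=$nham"
    else
        echo "bayes inactive: nspam=$nspam nham=$nham"
    fi
}

check_corpus "debug: bayes corpus size: nspam = 1857, nham = 1267"
```

To feed it live output, something like
"spamassassin -D --lint 2>&1 | grep 'corpus size'" should pull out the
line, since the debug output goes to stderr.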

HTH!

Mike Branda Jr.



