Re: [SLUG] spamassassin vs pine

From: Eben King (eben1@tampabay.rr.com)
Date: Sun Jan 15 2006 - 01:41:23 EST


On Sat, 14 Jan 2006, Mike Branda wrote:

> On Sat, 2006-01-14 at 20:55 -0500, Eben King wrote:
>
> > > and in user_prefs (home or etc):
> > >
> > > # How many hits before a mail is considered spam. Default is 5.
> > > required_hits 3
> > >
> > > which is deprecated in newer versions of SA but still works. The
> > > replacement parameter is required_score. I bumped that down which
> > > helped too. 5 was too high.
> >
> > Did you have to add that parameter, or just change it?
>
> In SuSE it was there (local.cf) but commented out by default. In the
> perldoc it describes this setting too. Definitely defaults to 5 so if
> you think you need to bump down to a score thresh of 0 that's a long way
> to go. I would imagine there's no decimal option for the thresh and it
> would catch _all_ your mail.

There is one:

,--
| required_score n.nn (default: 5)
| Set the score required before a mail is considered spam.
'--

> This would be the parameter to change though. I set mine to 3. It
> catches the majority of it without overdoing it for me.

For some reason your numbers are much higher than mine. The documentation
seems to feel your situation is more normal. Could be the source and the
documentation differ, and the packager (in your case) tweaked the conf files
to make them match the docs. I installed from source.

Anyhow, I added

required_score 0

to /etc/mail/spamassassin/local.cf . I'll see how it works out. Even if
I'm wrong, no messages get sent to the bit-bucket.

> > Correct, there are two mailboxes, "almost-certainly-spam" and
> > "probably-spam". Only two messages have been pink enough to land in the
> > former.
>
> Ditto except I decided to drop both into one _spam_ folder. If I have
> to look to confirm whether it's legit or not, I'd rather not have to go
> through 2 separate mailboxes.

If it comes to pass that there are false positives, I'll change the
arrangement. I'm reasonably certain that somewhere in this adjusting
process, that'll happen.

> And as you can see, not a lot gets tagged as almost-certainly. But...as
> it gets smarter, this could change as the bayes bumps up the score bar.

Will it adjust the scoring automatically, or is that up to me?

> > It looks as if a score threshold of 0 would catch between 87% and 96% of
> > spam, with between 0 and 0.4% false positives, depending on where edge
> > cases go.
>
> 0? Really? Gee whiz, I usually get stuff like:

Yup. Only 2 out of 481 non-spams that I have get scores of 0 or higher
(actually both are 0). The rest are negative.
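
A quick way to check figures like these is to count how many scores land at
or above a candidate threshold. The Python below is only a sketch under
stated assumptions: a message counts as flagged when its score is greater
than or equal to the threshold; the ham list mirrors the 481 non-spams above
(two at 0, the rest given a placeholder of -1.0, since all that is known is
that they are negative); and the spam scores are a made-up sample, not a
real corpus. The function name rates() is just an illustrative choice.

```python
def rates(ham_scores, spam_scores, threshold):
    """Return (catch_rate, false_positive_rate) for one threshold.

    Assumes a message is flagged when score >= threshold.
    """
    caught = sum(1 for s in spam_scores if s >= threshold)
    false_pos = sum(1 for s in ham_scores if s >= threshold)
    return caught / len(spam_scores), false_pos / len(ham_scores)


# 481 non-spams: two score exactly 0, the rest are negative.
# -1.0 is a placeholder value, not a measured score.
ham = [0.0, 0.0] + [-1.0] * 479

# Hypothetical spam scores, for illustration only.
spam = [13.5, 7.3, 2.5, -0.4]

for threshold in (0, 3, 5):
    catch, fp = rates(ham, spam, threshold)
    print(f"threshold {threshold}: catch {catch:.1%}, false positives {fp:.2%}")
```

Feeding it the real scores pulled out of the X-Spam-Status headers would
show where the edge cases (the messages sitting exactly at 0) end up.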

> X-Spam-Flag: YES
> X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
> mail.wackyworld.lan
> X-Spam-Level: *************
> X-Spam-Status: Yes, score=13.5 required=3.0 tests=BAYES_99,
> DATE_IN_FUTURE_12_24,HELO_DYNAMIC_COMCAST,RCVD_IN_BL_SPAMCOP_NET,
> RCVD_IN_NJABL_DUL,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4
>
> Content analysis details: (13.5 points, 3.0 required)

How'd you do that?

(/etc/mail/spamassassin/user_prefs)

> It would be in the same spamassassin dir.

What you see there is all I have.

> I don't think it really matters though as SA reads all of them and the
> same options seem to be valid in all three files according to the docs.

Fair enough.

> I set mine up sitewide so I do everything out of /etc and forgo the user
> dir files.

I'm the main user on this box, but others log in from time to time. Nobody
but me gets mail here on a regular basis, though.

> All seems to be right. I wonder what's up with the low scores?

Dunno. Good mail has stuff like this:

X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
    pc.tampabay.rr.com
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham
        version=3.0.4

Spam (that it caught) has stuff like this:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
    pc.tampabay.rr.com
X-Spam-Level: *******
X-Spam-Status: Yes, score=7.3 required=5.0 tests=BAYES_60,DRUGS_ANXIETY,
        DRUGS_ANXIETY_EREC,DRUGS_DIET,DRUGS_ERECTILE,DRUG_ED_GENERIC,
        FORGED_YAHOO_RCVD,HG_HORMONE,HTML_FONT_BIG,HTML_MESSAGE,
        HTML_SHOUTING5,HTML_TAG_EXIST_TBODY,MIME_QP_LONG_LINE,
        SUBJECT_DRUG_GAP_C autolearn=no version=3.0.4

Spam (that it missed) has stuff like this:

X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
    pc.tampabay.rr.com
X-Spam-Level: **
X-Spam-Status: No, score=2.5 required=5.0 tests=BAYES_50,HTML_90_100,
        HTML_FONT_BIG,HTML_MESSAGE,LONGWORDS,MIME_QP_LONG_LINE autolearn=no
        version=3.0.4
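
To compare headers like the three above in bulk, the X-Spam-Status value
can be picked apart with a regular expression. This is a minimal sketch,
assuming the header always has the layout shown in these samples (Yes/No,
score=, required=, tests=, autolearn=) and may be folded across lines.
parse_status() is a hypothetical helper name, not part of SpamAssassin.

```python
import re

# Layout taken from the sample headers; other SA versions may differ.
STATUS_RE = re.compile(
    r"^(Yes|No),\s*score=(-?\d+(?:\.\d+)?) required=(-?\d+(?:\.\d+)?) "
    r"tests=(\S*) autolearn=(\S+)"
)


def parse_status(value):
    """Parse an X-Spam-Status header value; return a dict, or None if it
    does not match the expected layout."""
    # Undo header folding: drop whitespace after commas, collapse the rest.
    unfolded = re.sub(r",\s+", ",", value.strip())
    unfolded = re.sub(r"\s+", " ", unfolded)
    match = STATUS_RE.match(unfolded)
    if match is None:
        return None
    flag, score, required, tests, autolearn = match.groups()
    return {
        "is_spam": flag == "Yes",
        "score": float(score),
        "required": float(required),
        "tests": [t for t in tests.split(",") if t],
        "autolearn": autolearn,
    }


# The missed-spam header from this message, folded as it was received.
missed = (
    "No, score=2.5 required=5.0 tests=BAYES_50,HTML_90_100,\n"
    "        HTML_FONT_BIG,HTML_MESSAGE,LONGWORDS,MIME_QP_LONG_LINE autolearn=no\n"
    "        version=3.0.4"
)
print(parse_status(missed))
```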

I need to do some statistical analysis to find out which tests show up
mostly in spam and rarely in non-spam.

-- 
-eben    ebQenW1@EtaRmpTabYayU.rIr.OcoPm    home.tampabay.rr.com/hactar

"Only two things are infinite, the universe and human stupidity, and I'm not sure about the former." - Albert Einstein

