Re: [SLUG] spamassassin vs pine

From: Eben King (eben1@tampabay.rr.com)
Date: Sun Jan 15 2006 - 15:52:28 EST


On Sun, 15 Jan 2006, Mike Branda wrote:

> On Sun, 2006-01-15 at 01:41 -0500, Eben King wrote:
>
> > Anyhow, I added
> >
> > required_score 0
>
> > Yup. Only 2 out of 481 non-spams that I have get scores of 0 or higher
> > (actually both are 0). The rest are negative.
>
> Probably due to the ham training. From your previous post, your SA's
> basically been on an all ham diet. Almost 10 to 1.
>
> debug: bayes corpus size: nspam = 787, nham = 7777

Well, that's the ratio of messages I get, I guess. I do run everything I
see through sa-learn.

> > (/etc/mail/spamassassin/user_prefs)
>
> You know what.....I wonder if your setup has the stuff for the
> blacklists......notice there are the BL/RBL entries from mine that are
> high point items.
>
> > Dunno. Good mail has stuff like this:
> >
> > X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
> > pc.tampabay.rr.com
> > X-Spam-Level:
> > X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham
> > version=3.0.4
>
> Good. Again, I believe the negative scores are via the ham learning
> process.
>
> > Spam (that it caught) has stuff like this:
> >
> > X-Spam-Flag: YES
> > X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
> > pc.tampabay.rr.com
> > X-Spam-Level: *******
> > X-Spam-Status: Yes, score=7.3 required=5.0 tests=BAYES_60,DRUGS_ANXIETY,
> > DRUGS_ANXIETY_EREC,DRUGS_DIET,DRUGS_ERECTILE,DRUG_ED_GENERIC,
> > FORGED_YAHOO_RCVD,HG_HORMONE,HTML_FONT_BIG,HTML_MESSAGE,
> > HTML_SHOUTING5,HTML_TAG_EXIST_TBODY,MIME_QP_LONG_LINE,
> > SUBJECT_DRUG_GAP_C autolearn=no version=3.0.4
>
> Also good although again it doesn't look like it's using any blacklist
> rules.
>
> > Spam (that it missed) has stuff like this:
> >
> > X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
> > pc.tampabay.rr.com
> > X-Spam-Level: **
> > X-Spam-Status: No, score=2.5 required=5.0 tests=BAYES_50,HTML_90_100,
> > HTML_FONT_BIG,HTML_MESSAGE,LONGWORDS,MIME_QP_LONG_LINE autolearn=no
> > version=3.0.4
>
> Now this one has a score of 2.5 not 0. is this not the more common?

No individual net score is very common, but non-positive scores are
universal for ham and unusual for spam (~16%, with 2/3 of those being 0).
It's increasingly common for it to miss spam (some 75% of spam is
not identified as such).
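
For anyone who wants to check numbers like these on their own mail, here is a
rough sketch of pulling the score and test names out of an X-Spam-Status
value. It assumes the 3.0.x header layout shown in the quoted examples above.
The names parse_spam_status and share_non_positive are made up for
illustration and are not part of SpamAssassin.

```python
import re

# Matches the start of an X-Spam-Status value as it appears in the examples
# above, e.g. "No, score=2.5 required=5.0 tests=BAYES_50,HTML_MESSAGE ...".
STATUS_RE = re.compile(
    r"^(?P<verdict>Yes|No),\s*score=(?P<score>-?\d+(?:\.\d+)?)\s+"
    r"required=(?P<required>-?\d+(?:\.\d+)?)\s+tests=(?P<tests>\S*)"
)


def parse_spam_status(value):
    """Parse an X-Spam-Status header value; return None if it does not match."""
    # Join folded continuation lines back into a single line.
    unfolded = re.sub(r"\s*\n\s+", " ", value.strip())
    # Folding happens after commas in the test list, so drop that whitespace.
    unfolded = re.sub(r",\s+", ",", unfolded)
    match = STATUS_RE.match(unfolded)
    if match is None:
        return None
    # Assumption: a message that hit no tests is reported as "tests=none".
    tests = [t for t in match.group("tests").split(",") if t and t != "none"]
    return {
        "flagged": match.group("verdict") == "Yes",
        "score": float(match.group("score")),
        "required": float(match.group("required")),
        "tests": tests,
    }


def share_non_positive(scores):
    """Fraction of scores that are zero or negative (0.0 for an empty input)."""
    scores = list(scores)
    if not scores:
        return 0.0
    return sum(1 for s in scores if s <= 0) / len(scores)
```

Run parse_spam_status over the headers of a folder of known spam, collect the
"score" values, and share_non_positive gives the kind of percentage quoted
above.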

> > Need to do some statistical analysis to find out which tests show up in spam
> > but not in non-spam, mostly.
>
> I'm going to do some looking into the BL/RBL thing and find out if the
> SuSE default setup comes with that stuff. Your analysis will hopefully
> guide you with any custom scoring if we don't come up with anything
> else...

I came up with some scores, with a little scripting and spreadsheet magic.
There are 175 of them, and an "include scores_file" in
/etc/mail/spamassassin/local.cf makes them active. Let me know if you
want them. They might not make your situation any better.

I plan to upgrade (probably to Ubuntu, since it's new and hot and I have
it) this weekend. I'm running RedHat 8.5 (sorta) now, and some stuff is
ancient and/or broken. I hope not too much fixing is required.

-- 
I firmly believed we should not march into Baghdad ...To occupy Iraq would
instantly shatter our coalition, turning the whole Arab world against us and
make a broken tyrant, into a latter-day Arab hero assigning young soldiers
to a fruitless hunt for a securely entrenched dictator [.] - George Bush Sr.



