SpamCorpora

Setting up “corpora” of test messages is a way to verify that new rules catch spam without accidentally marking good e-mail (“ham”) as spam. The SpamAssassin source tree appears to include tools for this, but they are typically not included in distribution packages, or may ship in a separate -dev or -devel package.

Some hints from Kevin McGrail:

cd [sabuilddirectory]/masses
mkdir testrules
cp /etc/mail/spamassassin/new.cf testrules
./mass-check -c testrules ham:dir:/home/kmcgrail/HAM spam:dir:/home/kmcgrail/SPAM
[this takes a LONG time]
./hit-frequencies -x -p > freqs
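
Because the mass-check run is slow, it may be worth confirming that both corpus directories exist and contain messages before starting it. The following is only a sketch under assumptions: count_messages and check_corpus are hypothetical helper names, not part of SpamAssassin, and the sketch assumes one message per file in each directory, as the dir: targets above suggest.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin): count the regular files
# under a corpus directory. Assumes one message per file.
count_messages() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Hypothetical helper: report the corpus size, or fail if the directory
# is missing or holds no messages.
check_corpus() {
    label=$1
    dir=$2
    if [ ! -d "$dir" ]; then
        echo "$label corpus missing: $dir" >&2
        return 1
    fi
    n=$(count_messages "$dir")
    if [ "$n" -eq 0 ]; then
        echo "$label corpus is empty: $dir" >&2
        return 1
    fi
    echo "$label: $n messages in $dir"
}

# Example use, with the paths from the hints above:
#   check_corpus ham /home/kmcgrail/HAM &&
#   check_corpus spam /home/kmcgrail/SPAM &&
#   ./mass-check -c testrules ham:dir:/home/kmcgrail/HAM spam:dir:/home/kmcgrail/SPAM
```

Chaining the checks with && means the long mass-check step only starts when both corpora look usable.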

A quick web search led me to some very helpful pages in the SpamAssassin Wiki:

 
spamcorpora.txt · Last modified: 2006/10/27 12:53 by 207.96.37.60
 