Training dspam from Thunderbird junk messages

I recently installed and configured dspam on my mail server. It works nicely but needs occasional training, so I wanted to integrate it with Thunderbird and let users train dspam straight from their mail client.

Based on this code, I knocked together a few lines of bash script which scan the junk mail directories on the server and train dspam with whatever they contain. This means that an end user can click the “Junk” button in Thunderbird (or Mail.app, etc.) and dspam will be trained for them automagically. Users could also move messages into the folder manually, or with a filter or an extension.

The best bit is that it is completely transparent to the end user: nobody has to forward messages with headers intact to a special training address.

If you find it useful or make changes, please let me know.

The code

#!/bin/bash

########
# TrainDspam.sh
#
#   Author: David Cannings <david @edeca.net>
#     Date: 21/02/2010
# Based on: http://tinyurl.com/yhky5w9
#
# This script scans mail directories for the Thunderbird "Junk" folder
# (or any other folder with the same name) and trains dspam with the
# messages contained within it.
#
# It can be used for periodic (e.g. daily) training of spam messages which
# a user has flagged as junk in their mail client.
#
# It will only train for accounts which appear to be using dspam.
########

########
# Configuration

# Path to mail directory, which should contain folders per domain and
# user e.g. /home/mail/<domain>/<user1>/
MAIL_PATH="/home/mail"

# Path to the directory containing 'dspam'
DSPAM_BIN_DIR="/usr/bin"

# Path to the directory containing the dspam user data files
DSPAM_DATA_DIR="/var/spool/dspam"

# If you want this script to delete messages from the Junk folder
# after training, set this to 1.
DELETE_MAIL=0

# DON'T EDIT BELOW THIS LINE
########

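find "$MAIL_PATH" -type d -name '.Junk' -print | while IFS= read -r FOLDER; do
  # The field offsets assume the Junk folder sits one directory below
  # the user directory, e.g. <MAIL_PATH>/<domain>/<user>/<maildir>/.Junk
  DOMAIN=$(echo "$FOLDER" | awk -F/ '{print $(NF-3)}')
  USER=$(echo "$FOLDER" | awk -F/ '{print $(NF-2)}')

  # We only want to train for accounts that are dspam users,
  # so check the data directory
  if [ -d "${DSPAM_DATA_DIR}/data/${DOMAIN}/${USER}" ]; then
    TRAINED_MESSAGES=0
    for MESSAGE in "$FOLDER"/cur/*; do
      # Skip the unexpanded glob when cur/ is empty or missing
      [ -f "$MESSAGE" ] || continue
      TRAINED_MESSAGES=$((TRAINED_MESSAGES + 1))
      "$DSPAM_BIN_DIR/dspam" --user "${USER}@${DOMAIN}" --class=spam --source=error < "$MESSAGE"
      # Delete the message that was just trained (the original removed $NAME, which is never set)
      [ "$DELETE_MAIL" -gt 0 ] && rm -f "$MESSAGE"
    done
    echo "- Trained $TRAINED_MESSAGES messages for ${USER}@${DOMAIN}"
  fi
done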
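
Before letting the script loose (especially with DELETE_MAIL=1), it may be worth checking which folders it would actually pick up. The following is a minimal dry-run sketch and not part of the original script. The function name list_junk_folders is made up for illustration, and it assumes the Junk folder sits one directory below each user directory (<MAIL_PATH>/<domain>/<user>/<maildir>/.Junk), which is what the awk field offsets in the script imply. It never calls dspam and never deletes anything.

```shell
#!/bin/bash
# Dry run: report which accounts and how many messages would be trained.
# Hypothetical helper, not part of TrainDspam.sh.
# Assumed layout: <mail_path>/<domain>/<user>/<maildir>/.Junk

list_junk_folders() {
  local mail_path="$1" folder rest user domain count
  find "$mail_path" -type d -name '.Junk' -print | while IFS= read -r folder; do
    rest="${folder%/*/.Junk}"   # strip the trailing "/<maildir>/.Junk"
    user="${rest##*/}"          # last remaining path component
    rest="${rest%/*}"           # drop the user component
    domain="${rest##*/}"        # the component before the user
    # Count regular files in cur/; a missing cur/ simply counts as zero
    count=$(find "$folder/cur" -maxdepth 1 -type f 2>/dev/null | wc -l)
    echo "${user}@${domain}: $((count)) message(s) in $folder"
  done
}
```

Calling list_junk_folders /home/mail prints one line per Junk folder found. If the user and domain in that output look wrong, the mail layout probably differs from the one assumed here, and the awk offsets in the main script would need adjusting as well.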
David Cannings