What:

 This document describes how to use bogofilter to filter mail that passes through a qmail mail server.

Theory:

 The idea is to setup a bogofilter on the mail server and have it filter all incoming mail.

 There are several advantages to doing so:

 1. Mail users on non-unix platforms will benefit from bogofilter spam filtering
 2. bogofilter learns better since it has access to a larger corpus.

 There is also a mechanism for users to register new spam/nonspam messages, as well as correcting misclassifications.

Assumptions:

- Most of the steps described here require root privileges.
- qmail is installed in /var/qmail. If you installed qmail from rpm, it is probably installed in /usr/qmail.
- You have installed Bruce Guenter's qmail-qfilter program which is available from:

	 http://untroubled.org/qmail-qfilter

- bogofilter is installed in /usr/bin/bogofilter on the mail server.

Installation:

- Build the initial spam and nonspam databases by feeding your corpus of mail.
   Assuming that the files are in mbox format in /home/bogofilter, you say:
  
   [/home/bogofilter]# bogofilter -d . -s < spam.mbx
   [/home/bogofilter]# bogofilter -d . -n < nonspam.mbx

Filtering:

- Setup a script to invoke bogofilter, say /home/bogofilter/bogofilter-run,
   and put:

	#!/bin/sh
	exec /usr/bin/qmail-qfilter /usr/bin/bogofilter -d /home/bogofilter -p -u -e

   Make sure the script is executable!
 
   Given a good initial corpus, it is better to have bogofilter update it's lists based
   on the message classification, since it is quite likely to get it right.
   Misclassifications will be corrected later.

- Setup an /etc/tcpcontrol/qmail.rules entry to trigger the  filtering.
  Ideally, you want to filter incoming mail, but not outgoing.

  in /etc/tcpcontrol/qmail.rules or its equivalent, enter the default rule:

	:allow,QMAILQUEUE="/home/bogofilter/bogofilter-run"

- Now, every incoming message will have the header line

	X-Bogosity: ...

  added to the headers.
  A bogofilter classified spam messages will have the entry:

	X-Bogosity: Yes ...

  Note that the actual header name is configurable at compile time and may have been changed.

- Educate your users on how to filter their spam based on the value of the X-Bogosity header.
  Spam messages should be diverted to a spam mailbox, rather than deleted.


Registration and Correction:

Setup an email alias, for users to register new spam or nonspam messages
as well as to correct misclassification.

- copy  contrib/dot-qmail-bogofilter-default to /var/qmail/alias/.qmail-bogofilter-default

  If you are not using the alias 'bogofilter', you'll need to rename the file to your
  alias and edit the contents

  If you are not using /home/bogofilter, you'll  need to edit the contents

- copy contrib/bogofilter-qfe script to /home/bogofilter.

  Change the domain variable in the script to your domain. Let's assume example.com
  The script will have to be edited if:
  1. You have multiple domains
  2. Your users use Mail User Agents I don't know about.

  Please send me your improvements!

  If you wish to save copies of the incoming messages, run the command

	# /var/qmail/bin/maildirmake /home/bogofilter/Maildir

  and to /var/qmail/alias/.qmail-bogofilter-default, append the line:
  
	/home/bogofilter/Maildir/

  The trailing slash is important. Thanks to John Crawford for reminding me.

- Make sure the script is executable
	
	# chmod +x /home/bogofilter/bogofilter-qfe

- Change the ownership of /home/bogofilter to the qmail alias user, usually called 'alias'

       # chown -R alias:alias /home/bogofilter

- Educate your users to bounce their messages to:

  To register:
	 spam, send to bogofilter-register-spam@example.com
  	 nonspam, send to bogofilter-register-nonspam@example.com

  To correct misclassified:

	 spam, send to bogofilter-spam@example.com
  	 nonspam, send to bogofilter-nonspam@example.com

  It is important that users bounce or redirect the messages rather than forward them
  since forwarding loses important header information.
 
- Done!

TODO:
 
 A message that is delivered to a group alias will be filtered once, but may be corrected multiple
 times. Ideally, the script should only accept the first correction and ignore the rest.
 
Author:
 Gyepi Sam <gyepi@praxis-sw.com>

