			       =================
				BOGOFILTER NEWS
			       =================

$Id: NEWS,v 1.55 2003/02/03 13:50:16 relson Exp $

0.10.2   2003-02-01

	* Stable release of 0.10.1.5

0.10.1.5 2003-02-01
0.10.1.4 2003-01-30
0.10.1.3 2003-01-29
0.10.1.2 2003-01-27
0.10.1.1 2003-01-25
0.10.1	 2003-01-22

	* A variety of robustness and portability changes, code and
	  file cleanups and documentation updates.
	* Multiple fixes for mime and html processing.
	* Additional support and fixes for the various spam scoring
	  algorithms.
	** See file CHANGES-0.10 for details of the above items.

0.10.0  2003-01-19

	* Added mime processing capability, with decoding of base64,
	  quoted-printable and uuencoded sections.  Ignores attachments when
	  computing spamicity.
	* Added wordlist maintenance capability to bogoutil.  Can discard
	  tokens based on count, age, or length.  Can replace non-ASCII chars
	  with question marks.
	* Added dates to wordlist tokens.  Option "datestamp_tokens=true|false"
	  can be used to enable/disable them.
	* Moved most documentation files to doc directory.
	* Added sample procmail file, contrib/procmailrc.example
	* Spamicity score now computable from multiple word list pairs, i.e. all
	  spam and ham word lists in directories named on command line or in
	  config file (via "wordlist=" or "bogofilter_dir=" lines).
	* Lexer is now case insensitive
	* Increase MAXTOKENLEN from 20 to 30, allowing more and longer tokens
	  to be processed.
	* New options for setting the default charset and replacing non-ASCII
	  characters.  New character set handling routines provide
	  charset-specific token parsing.
	* New error handling routine will output error messages to stderr and,
	  if '-l' (logging) is enabled, to syslog.
	* New message formatting capability allows formats to be put in config
	  file for X-Bogosity line and logging messages.  Message content can
	  include status, spamicity, version, etc.
	* Long-standing locking bugs that caused corruption in the database
	  have been resolved.
	* Work around ash-0.2 and bash-1 bugs, needed for make check.
	* Cater for malloc/calloc implementations that return NULL when 0 bytes
	  of memory are requested (e.g. some AIX versions); bogofilter would
	  previously falsely report an "out of memory" condition on them.
	  (also available as patch for 0.9.1.2)
	* Reorder gcc __attribute__ lines for gcc-2.7 compatibility.
	  (also available as patch for 0.9.1.2)

0.9.1.2	2002-12-05

	* A defect in the collect_words routines (in 0.9.1) caused
	  incorrect generation of "must get only one message to
	  calculate spamicity!" messages.  This has been fixed.
	* A defect in the contrib/bogopass script caused the unbase64-edited
	  version of the mail to be printed rather than the original with just
	  the header added. This has been fixed.
	* Documentation has been revised and updated.
	* Robinson-Fisher method now produces a tristate status, i.e.
	  spam/ham/unsure, if ham_cutoff is non-zero.  ham_cutoff defaults to
	  0.1 and can be set via config file.
	* Script contrib/bogopass has revised error and environment checking.
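
	Illustration of the tristate entry above.  This is a minimal sketch
	in Python, not bogofilter's own code.  The function name, the
	spam_cutoff value of 0.95 and the exact comparison operators are
	assumptions made for the example; only the ham_cutoff default of 0.1
	and the "tristate if ham_cutoff is non-zero" rule come from the entry.

```python
def classify(spamicity, spam_cutoff=0.95, ham_cutoff=0.1):
    """Map a spamicity score in [0, 1] to a status string.

    spam_cutoff=0.95 is a hypothetical value chosen for this sketch;
    ham_cutoff=0.1 is the default named in the entry above.
    """
    if spamicity >= spam_cutoff:
        return "spam"
    if ham_cutoff == 0.0:
        # A zero ham_cutoff disables the "unsure" state: two-state result.
        return "ham"
    if spamicity <= ham_cutoff:
        return "ham"
    return "unsure"


if __name__ == "__main__":
    for score in (0.99, 0.50, 0.05):
        print(score, classify(score))
```

	With the defaults, scores between the two cutoffs are reported as
	"unsure"; passing ham_cutoff=0.0 falls back to a plain spam/ham
	decision.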

0.9.1	2002-11-30

	* New script contrib/bogopass allows processing of base64
	  attachments.  This is a temporary solution until base64
	  code can be built into bogofilter.
	* New file README.Robinson describes the tunable parameters
	  for the Robinson algorithm and what to do for best performance.
	* Changed the default behavior to use the Robinson algorithm.
	* Corrected incorrect sort order when printing statistics.
	* Added support for Fisher's method of combining probabilities,
	  as optimized for this purpose by Rob W. W. Hooft, to the
	  "Robinson" algorithm.
	* The new file METHODS describes the Graham, Robinson, and Fisher
	  methods that bogofilter supports for computing spamicity.
	* New file README.dcdflib gives some info on the dcdflib free
	  library of routines for cumulative distribution functions.
	* A new '-f' option tells bogofilter to use Fisher's method.
	* A new '-c' option in bogofilter allows specification of the
	  configuration file to read.
	* A new '-C' option tells bogofilter not to read any config file.
	* The syslog facility in '-l' mode has changed from "daemon" to "mail",
	  so your logs may now be in /var/log/maillog or /var/log/mail rather
	  than /var/log/messages. Check your /etc/syslog.conf.
	* The testing framework now works on Solaris.
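
	Illustration of Fisher's method of combining probabilities, mentioned
	in the list above.  This is a textbook sketch in Python, not the
	bogofilter implementation (which relies on dcdflib).  The function
	names are made up for the example, and the closed-form chi-squared
	tail used here is only valid for an even number of degrees of
	freedom, which is all Fisher's method needs.

```python
import math


def chi2_q(x, dof):
    """Upper-tail probability P(X > x) for chi-squared with even dof."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    m = x / 2.0
    term = math.exp(-m)          # i = 0 term of the series
    total = term
    for i in range(1, dof // 2):
        term *= m / i            # builds m**i / i! incrementally
        total += term
    return min(total, 1.0)       # guard against rounding above 1


def fisher_combine(probs):
    """Combine independent probabilities, each in (0, 1].

    Under the null hypothesis, -2 * sum(ln p) follows a chi-squared
    distribution with 2 * len(probs) degrees of freedom.
    """
    stat = -2.0 * sum(math.log(p) for p in probs)
    return chi2_q(stat, 2 * len(probs))


if __name__ == "__main__":
    print(fisher_combine([0.5]))
    print(fisher_combine([0.01, 0.01]))
```

	A single probability is returned unchanged, while several small
	probabilities reinforce each other and give a smaller combined value.
	For very large statistics math.exp() underflows to 0.0, so a
	production version would need to work in log space.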

	Internal Changes:
	* Fixed several portability problems uncovered by the new
	  regression tests.
	* Added three more regression tests designed to confirm that
	  bogofilter's results are matching saved reference results.
	* Implemented an object oriented API for using computational methods.
	* Split the main module into a registration module and three algorithm
	  modules - for the fisher, graham and robinson methods.
	* Registering big mbox files is much faster now, at the expense of some
	  memory.

0.8.0	2002-11-10

	* The lexer code now detects read errors and exits with code 2
	  if it finds one.
	* Fixed passthrough mode in bogofilter: it no longer strips the
	  spam-header from a mail body.
	* Fixed portability to some systems, notably Solaris and HP-UX;
	  added README. for some systems to describe build issues.
	* Fixed "rpl_malloc" link failures.
	* Fixed bogofilter 0.7.6 passthrough regression on some
	  systems: The X-Bogofilter header would be added to the body
	  and a bogus blank line would be added.
	* Bogofilter now supports a configuration file named
	  /etc/bogofilter.cf and/or ~/.bogofilter.cf.
	* Bogofilter's use of '-v' for printing spamicity statistics
	  has been organized with increasing levels of detail as
	  additional '-v's are added.
	* When using the Robinson algorithm, bogofilter can print a
	  simple histogram showing word probability distribution.
	* Bogoutil supports a new '-w' switch for displaying tokens
	  from the word list databases.
	* Bogolexer added to distribution.  Provides easy access to
	  parsing a file to examine the tokens.
	* Bogolexer has a new '-p' (passthrough) switch for printing tokens and
	  bogoutil has a new '-p' (probability) switch for printing the
	  probabilities of one or more tokens.  They can be connected
	  via pipe to display the probabilities of all words in a message.
	* DB 4.1 support has been fixed.
	* Documentation updates.

0.7.6	2002-10-27

	* Added README.hp-ux for those using HP-UX.
	* Added support for additional architectures - ia64, arm,
	  powerpc, and s390.
	* Bogofilter -p mode now preserves CR and NUL characters.
	* Bogofilter -p mode now detects if the computer runs out of
	  memory.
	* Bogofilter supports a new "-l" switch to write run-time log
	  information to syslog.
	* Bogofilter supports a new algorithm to calculate the
	  "spamicity", the "Robinson" algorithm. It is enabled with the
	  new "-r" switch. The old behavior is called the "Graham"
	  algorithm and can be enforced with the new "-g" switch.  The
	  default behavior is to use the "Graham" algorithm.
	* Bogofilter now has an "-R FILE" option (that implies -r) to print
	  an R data frame to FILE.
	* Bogofilter and bogoutil now have a "-x CLASSES" option to turn
	  on debugging.
	* Bogoupgrade.pl has been renamed to bogoupgrade.
	* There is now a man page for bogoupgrade.
	* BASE64 treatment has been fixed. It ignored whole lines if
	  they consisted of a single token. Now a token is only
	  considered base64 and ignored if it's >= 32 characters or ends
	  in one or two padding "=" signs.
	* MIME boundary lines are now emitted as tokens. Some of them
	  are typical of certain spam software, so they might turn out
	  to be useful.
	* All control characters are now considered token delimiters.
	* Bogofilter now aborts if it cannot figure out where to look for its
	  database directory.
	* The software no longer crashes on machines that do not allow
	  for unaligned memory access (m68k; many RISC, e. g. SPARC).
	* DB 4.1 is now supported.
	* Documentation updates.
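
	Illustration of the revised BASE64 rule described in the list above.
	This is a sketch in Python, not the lexer's actual code.  The
	function name and the requirement that a candidate consist only of
	base64 alphabet characters are assumptions added for the example;
	the entry itself only states the length (>= 32) and trailing "="
	conditions.

```python
import re

# Assumption: only tokens made of base64 alphabet characters, optionally
# followed by one or two "=" padding signs, are candidates at all.
_B64_SHAPE = re.compile(r"^[A-Za-z0-9+/]+={0,2}$")


def looks_like_base64(token, min_len=32):
    """Return True if the token would be treated as base64 and ignored."""
    if not _B64_SHAPE.match(token):
        return False
    # Rule from the entry: long enough, or ends in one or two "=" signs.
    return len(token) >= min_len or token.endswith("=")


if __name__ == "__main__":
    for tok in ("hello", "aGVsbG8=", "A" * 32):
        print(tok, looks_like_base64(tok))
```

	Short ordinary words are kept as tokens even when they are alone on
	a line, which is the behavior the fix was meant to restore.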

0.7.5	Sun Oct 20 17:34:35 PDT 2002

	* The header in bogofilter -p mode now defaults to X-Bogosity, but 
	  can be changed by using "./configure --enable-spam-header=name" at
	  compile time.
	* The option names -h/-H are back to -n/-N like they were in version 
	  0.6, and -h now means "help".
	* A utility has been added to help upgrade wordlists from older 
	  versions of bogofilter to the current format. See the UPGRADE file 
	  for more information.
	* Support has been added for the environment variable BOGOFILTER_DIR
	  to control where bogofilter looks for its wordlists.
	* Bogofilter no longer depends on the Judy package, which is no longer
	  required to compile or run it.  We now use a high performance
	  hashing algorithm for message evaluation.
	* Added support for the -e flag, which causes bogofilter to exit with a
	  value of 0 regardless of the spamicity of the message. This is useful
	  when using -p mode.
	* Added support for the -u flag. This allows message evaluation and
	  training to happen in the same invocation of bogofilter.
	* Extended TOKEN patterns to improve support for European languages.
	* Improved wordlist locking to prevent data corruption.
	* Added procmail recipes for example usage in the man page.

0.7.4	Tue Sep 17 02:29:48 EDT 2002

	* Added infrastructure to support multiple wordlists
	* Fixed classification bug
	* Fixed errors in documentation
	* Improved portability of locking code
	* Fixed 'last line occasionally emitted twice' bug
	* Cleaned up underflow checking for word counts in bogofilter.c
	* Code readability improvements
	* Split main() function in bogofilter.c into smaller pieces
	* Message processing performance improvements

0.7.3	Thu Sep 12 13:28:37 PDT 2002

	Adrian Otto:
	* Added portable file locking support for files and databases
	David Relson:
	* Bug fix for negative counts in word registration
	* Bug fix for SEGV in $HOME path code
	* Bug fix for trailing slash in -d option

0.7.2: Wed Sep 11 15:28:00 PDT 2002

	Adrian Otto:
	* Introduced GNU configure for portability code

0.7.1: Tue Sep 10 00:59:00 PDT 2002

	Adrian Otto: 
	* Skip existing X-Spam-Header
	* Performance improvement for -p mode
	Paul Tomblin:
	* Bug fix in getopt argument
		
0.7: Sat Sep  7 14:18:33 EDT 2002

	Eric S. Raymond:
	* Check your scripts!  Option names have changed.
	* Name changes: goodlist -> hamlist, badlist -> spamlist.
	  This is a step towards supporting more categories.
	* Autodaemon is gone.  Instead, the new implementation uses
	  DBM.  Optimization with mmap will be in a future release.
	* Speed-tuning of the bogofilter function.
	* We're back to not ignoring HTML comments.

0.6: Fri Aug 30 00:25:49 EDT 2002

	* Fixed a fluky bug in the socket-transmission logic
	* Fixed an edge case where a single message with a From line
	  was getting counted twice.
	* Unknown-word probability bumped from 0.2 to 0.4, tracking a
	  change by Paul Graham.
	* Documented -d option.

0.5: Thu Aug 29 13:38:12 EDT 2002

	* Passthrough option can be used to add an X-Spam-Status header.
	* There is now a per-message word frequency cap, so spammers can't do
	  an equivalent of Google fodder.
	* HTML comments are now ignored.
	* HTML 4.0 keywords and attributes are now ignored.
	* Improved extrema calculation.
	* Mutt patch withdrawn -- have a better version of mutt macros instead.
	* -S and -N options from matt@lickey.com (Matt Armstrong).
	* Client-server partitioning with a persistent server, drastically
	  reducing startup cost after the first run.
	* Minor bug fix by Eric Seppanen.

0.4: Sat Aug 24 09:07:45 EDT 2002

	* Regenerated bogofilter mutt patch.
	* Wordlist files are now automatically created in -s and -n modes.
	* Reversed the exit values, following a suggestion by Michael Elkins
	  about how to make bogofilter fail gracefully.
	* -Wall cleanup and uninitialized-variable fix from Eric Seppanen.
	* fcntl(2) file locking to head off a race condition in write_list.
	* Added the long-sought procmail recipe.

0.3: Fri 23 Aug 03:30:49 EDT 2002

	* Specfile/Makefile improvements from Graham Reed.
	* Case blindness fix from Eric Seppanen.
	* Deallocation fix from Mike Mayfield.
	* Wordlist file format changed.

0.2: Tue Aug 20 06:49:42 EDT 2002

	* Added mutt-1.4 interface patch
	* Note:  Location of the base directory has changed.

0.1: Mon 19 Aug 2002 03:07:31

	* Initial release.
 vim:tw=79 com=bf\:* ts=8 sts=8 sw=8 ai:

 LocalWords:  BOGOFILTER relson Exp procmail contrib Spamicity config spamicity
 LocalWords:  bogoutil datestamp MAXTOKENLEN charset stderr syslog gcc malloc
 LocalWords:  calloc Bogosity
