Friday, February 07, 2003

For those who are interested, I've updated the Visual Studio macro for generating NUnit test shells to support NUnit 2.0. The macro is licensed under the same license as NUnit 2.0. It's probably pretty straightforward to convert this into a Visual Studio addin as well.
12:39:30 PM

I've just installed Mark Hammond's Bayesian Spam Filter plugin for Outlook, part of the Python spambayes project. It was a bit difficult to get installed; I had to run addin.py about 4 times before it would take. The addin gave the following output, but it seems to work OK:

D:\Python22\lib\site-packages\win32com\universal.py:15: UserWarning: win32com.universal argument passing support is incomplete - only types covered in win32com.servers.test_pycomtest are supported
  warnings.warn(msg)
Registered: SpamBayes.OutlookAddin

Once I'd got it installed, I trained the filter using the 6 spams I received this morning against the real contents of my Inbox. Mark's addin can generate a report on an individual message to show you how it ranked that message. It's really fascinating to read these reports. I've been using Mozilla mail with the Bayesian plugin at home on two separate accounts, but it seems to be slow to train, and doesn't seem to offer a way to peek into the process.
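
To give a rough idea of what a per-message report like that boils down to, here is a minimal Python sketch of per-token spam scoring. This is not SpamBayes' actual code, and I'd expect the real thing to combine the evidence more carefully; the function names, the four-message training set and the neutral 0.5 score for unseen tokens are all made up for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9$'!-]+", text.lower())

def train(messages):
    """Count in how many messages each token appears (hypothetical helper)."""
    counts = Counter()
    for msg in messages:
        counts.update(set(tokenize(msg)))
    return counts

def token_spam_prob(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate how spammy a token is from per-class message frequencies.

    Unseen tokens get a neutral 0.5 (an assumption of this sketch), and the
    result is clamped so no single token is ever absolutely decisive.
    """
    s = spam_counts[token] / n_spam
    h = ham_counts[token] / n_ham
    if s + h == 0:
        return 0.5
    return min(0.99, max(0.01, s / (s + h)))

def report(message, spam_counts, ham_counts, n_spam, n_ham):
    """Return (token, probability) pairs, most decisive tokens first."""
    probs = {t: token_spam_prob(t, spam_counts, ham_counts, n_spam, n_ham)
             for t in set(tokenize(message))}
    return sorted(probs.items(), key=lambda kv: (-abs(kv[1] - 0.5), kv[0]))

# Invented training data, purely for illustration.
spam = ["Buy cheap pills now", "cheap mortgage offer now"]
ham = ["NUnit macro updated for Visual Studio", "lunch now or later?"]
spam_counts, ham_counts = train(spam), train(ham)

for token, p in report("cheap NUnit offer now", spam_counts, ham_counts,
                       len(spam), len(ham)):
    print(f"{token:10s} {p:.2f}")
```

Tokens seen only in spam or only in good mail land at the extremes, while a word like "now" that shows up in both sits closer to the middle, which is the kind of thing that makes the reports interesting to read.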

I'm no great fan of Outlook, but that's what my employer uses, so I have high hopes for this tool. I've been thinking about trying to come up with a way to generalize the filtering, so that I could categorize good emails into separate folders, or categorize my incoming RSS feeds from NewsGator in interesting ways, rather than the default, which is to organize by feed. For example, I might categorize posts into Java, .NET, XML, etc., so I could read them together. Since many of my feeds give me only message excerpts, this might be difficult to do accurately, but I'm just thinking out loud at this point. For now, I'll be ecstatic if I can get effective spam filtering on my work email.
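
Still thinking out loud, the generalization from spam/not-spam to several categories might look something like the sketch below: a tiny multinomial naive Bayes classifier with add-one smoothing. It has nothing to do with how SpamBayes or NewsGator are actually built; the `CategoryClassifier` class, its methods and the training excerpts are hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

class CategoryClassifier:
    """Tiny multinomial naive Bayes classifier (hypothetical sketch)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> token counts
        self.doc_counts = Counter()              # category -> posts seen
        self.vocab = set()                       # every token seen in training

    @staticmethod
    def tokenize(text):
        # Keep '.', '#' and '+' so tokens like ".net" and "c#" survive.
        return re.findall(r"[a-z0-9.#+]+", text.lower())

    def train(self, category, text):
        tokens = self.tokenize(text)
        self.word_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocab.update(tokens)

    def scores(self, text):
        """Return an unnormalized log-probability score per category."""
        total_docs = sum(self.doc_counts.values())
        tokens = self.tokenize(text)
        result = {}
        for cat, n_docs in self.doc_counts.items():
            counts = self.word_counts[cat]
            # Add-one smoothing keeps unseen words from zeroing the score.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            result[cat] = score
        return result

    def classify(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)

# Invented excerpts standing in for posts already filed by hand.
clf = CategoryClassifier()
clf.train("Java", "JVM garbage collection tuning for servlet containers")
clf.train("Java", "Ant build file for a servlet project")
clf.train(".NET", "C# attributes and NUnit test fixtures in Visual Studio")
clf.train(".NET", "NAnt build file for a C# assembly")
clf.train("XML", "XSLT stylesheet to transform an RSS feed")
clf.train("XML", "Validating XML documents against a schema")

print(clf.classify("NUnit fixtures for a C# project"))
print(clf.classify("transform an XML feed with XSLT"))
```

With only a sentence or two per post to go on, the scores for the different categories will often be close together, which is exactly the accuracy worry with excerpt-only feeds.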

Update: I also installed this on a co-worker's machine. She gets around 50 spams a day. In the 2 hours before she left for the day, the filter caught 14 spams with 0 false positives. It seems to be a little conservative at first, as most of the spams ended up being flagged as "probable", but it's really interesting to watch how the spam words' rankings change as you move the suspects into the definitely spam category. Monday's going to be fun: she usually has several hundred spams waiting after the weekend.

12:09:12 PM

Contact
jabber: weakliem
YM: gweakliem
MSN: gweakliem@pcisys.net
Subscribe to "Gordon Weakliem's Weblog" in Radio UserLand.