| Author | Message | ||||
|---|---|---|---|---|---|
|
xiffy Nucleus Guru Joined: 27 Mar 2002 Posts: 1218 Location: Deventer |
All those interested in yet another spam-fighting tool say: Aye! I would like to announce NP_SpamBayes. This plugin introduces Bayesian filtering to your weblog, hooking into the major events that fire when comments or trackbacks are posted. The download link is now available, and you should read more about this baby in the wiki: NP_SpamBayes.
I started writing this plugin because Blacklist wasn't bulletproof anymore. I know there are other plugins, but I refuse to add captcha or JavaScript-powered plugins. The current spam messages are not easily stopped by adding keywords to a list. After 1 day of extensive testing and with a good corpus of ham and spam messages, I did not have to delete a single spam message today. SpamBayes missed 3 spams, but they were caught by good old Blacklist. So if you are interested, please read the wiki page and then consider whether this is the anti-spam plugin for you.
Warning: you should have a spam-free blog when you start training the plugin.
Expected time of arrival for the first zip files: Wednesday 6 Sept.
Update: It's done. Go get your package: spambayes version 1.1.0. And remember: READ the wiki page! An untrained filter won't do you any good!
Version 1.0.1 sees the light. No urgent need to upgrade if you've got version 1.0 installed. 1 small bug fixed and 1 convenience added in the log screen (totals per category in the title).
Version 1.0.2 has been born. Lots of nice features added to the log facility so you can investigate spam and false positives efficiently.
Version 1.0.3 has been born. This version solves a small bug with logging: all older versions have logging enabled whether you say yes or no to the logging option. Also added the option to train all as yet untrained comments. This way you can keep your ham filter fresh.
Version 1.0.4 has been born. Version 1.0.3 disabled all logging in PHP version 4. This has been fixed by this release, nothing else added. So if version 1.0.3 works, just leave it where it is (if it ain't broke, don't fix it ...). If you run PHP version 4, you should upgrade (just uploading the new release will suffice, no uninstall / install needed for upgrading).
Version 1.0.5 has been born. "Update probabilities" is now obsolete. The numbers are now calculated automatically after each training action. No other features are added (just uploading the new release will suffice, no uninstall / install needed for upgrading).
Version 1.1.0 (beta) has been born. Logging overhaul: paging, number of items, explain option and promote to weblog. It's all there now.
_________________ __deus ex machina__ http://xiffy.nl/weblog/ Japan photo's: http://2006.cooljapan.nl/main.php?g2_itemId=20 Last edited by xiffy on Wed Jan 10, 2007 12:16 am; edited 13 times in total |
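For readers new to the idea, here is a rough sketch of what Bayesian filtering with ham and spam training might look like. It is written in Python purely for illustration and is not the plugin's actual code (NP_SpamBayes is a PHP plugin). The class name `TinyBayes`, the tokenizer, and the add-one smoothing are all assumptions made for this example.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy two-category naive Bayes filter (illustrative, not NP_SpamBayes)."""

    CATEGORIES = ("ham", "spam")

    def __init__(self):
        # Per-category word counts and number of trained messages.
        self.words = {c: Counter() for c in self.CATEGORIES}
        self.docs = {c: 0 for c in self.CATEGORIES}

    @staticmethod
    def tokenize(text):
        # Hypothetical tokenizer: lowercase runs of letters, digits, apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, category, text):
        self.words[category].update(self.tokenize(text))
        self.docs[category] += 1

    def score(self, category, text):
        # log P(category) + sum of log P(word | category), add-one smoothing.
        vocab = len(set(self.words["ham"]) | set(self.words["spam"]))
        total = sum(self.words[category].values())
        logp = math.log(self.docs[category] / sum(self.docs.values()))
        for word in self.tokenize(text):
            logp += math.log((self.words[category][word] + 1) / (total + vocab))
        return logp

    def classify(self, text):
        # An untrained category cannot be scored, so refuse to guess.
        if any(not self.words[c] for c in self.CATEGORIES):
            raise ValueError("train at least one ham and one spam sample first")
        return max(self.CATEGORIES, key=lambda c: self.score(c, text))
```

Usage would be along the lines of `f = TinyBayes(); f.train("ham", ...); f.train("spam", ...); f.classify(new_comment)`. Note that the sketch refuses to classify until both categories have seen at least one word, which mirrors the advice above that an untrained filter won't do you any good.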
||||
|
|
|||||
|
roel Nucleus Guru Joined: 16 Apr 2002 Posts: 4575 Location: Rotterdam, The Netherlands |
This sounds good! You of course need a weblog with quite some comments to get reliable results, so that may not be helpful to new bloggers. However, I set up a Nucleus 3.3 beta site on http://roelg.nl with Rakaz' anti-spam plugins and just left it there, and I haven't seen any spam there yet. Together with NP_CommentCensor and the text-based captcha plugin we are getting some good defenses against comment spam. (Btw, will this work for trackbacks too? And do you plan to plug it into the spamcheck API that 3.3 will provide?) Thanks for all the hard work, Xiffy! _________________ Is your question not solved yet?
|
||||
|
|
|||||
|
xiffy Nucleus Guru Joined: 27 Mar 2002 Posts: 1218 Location: Deventer |
If NP_SpamCheck has the same interface Rakaz and I first developed for Blacklist, NP_Referrer and TrackBack, then the answer is yes, it works together with NP_SpamCheck. (I discovered this yesterday when I cleaned my referrer spam and SpamBayes started to delete referrers before Blacklist did.) And like I wrote in the wiki, it all depends on training. So the more comments the better it is. What is best about Spam Bayes is that eventually it becomes a filter for your site. No central repository. On my Dutch site English comments are rare and 99.9% are spam, so I can train with more English words than someone with an English blog... It has been operational on my site for 2 days; it caught over 200 spam comments and I had only one come through. Luckily Blacklist caught that one. And with one click I could train SpamBayes to never let that kind of comment get through. Thursday ... (must sleep) _________________ __deus ex machina__ http://xiffy.nl/weblog/ Japan photo's: http://2006.cooljapan.nl/main.php?g2_itemId=20 |
||||
|
|
|||||
|
xiffy Nucleus Guru Joined: 27 Mar 2002 Posts: 1218 Location: Deventer |
Okay, I've been reading the extensive discussion started by Rakaz concerning the SpamCheck in version 3.3. At the moment this plugin is for Nucleus 3.23 and lower. When 3.3 goes public, 1 code change would suffice to let the new spam API control the plugin. All that needs to be done is the removal of the preAddComment event and the validateForm event. They are needed because the current Nucleus version hasn't got the SpamCheck event enabled in the core. So yes, when 3.3 gets out, this plugin will have a 3.3 version as well. Concerning Trackback: I've re-enabled trackback on my site and Spam Bayes started to filter those immediately as well (if you have the latest Trackback by Rakaz, or a self-modded Trackback like mine, which calls for "SpamCheck" when a trackback is posted). So I think we are ready to bring spam fighting to a new level with all the anti-spam plugins available. _________________ __deus ex machina__ http://xiffy.nl/weblog/ Japan photo's: http://2006.cooljapan.nl/main.php?g2_itemId=20 |
||||
|
|
|||||
|
Leng Nucleus Guru Joined: 19 Sep 2004 Posts: 2830 Location: Australia |
Just installed the plugin! I've been getting lots of trackback spam recently, so here's to hoping it will cut down on that. On a side note, when I use the "Spam Test" option, I get the following error message:
Line 72 merely checks to see if the admin area is turned on? Even turning on the quickmenu option still gives this error. Edit: Hrmm...trying to send a message to myself through the member contact form now gives this error when logged in:
_________________
deborahlau.com | To-Do List Questions? See the FAQ, read the docs, or browse our plugins!! |
||||
|
|
|||||
|
xiffy Nucleus Guru Joined: 27 Mar 2002 Posts: 1218 Location: Deventer |
You did train Spam Bayes with some samples? You should see a word count greater than zero and a probability greater than zero for both the ham and spam categories ... Yep, line 72 in spambayes/spambayes.php says it all: it's a very small probability which is divided by the number of words trained by the filter. One side note for your consideration: on my main blog I have a word count of Ham: 85960, Spam: 16100, and this filter is very effective (2 missed spams in a week, 6000 spams caught). On another blog I have Ham: 696, Spam: 509, and to my amazement this one is even more effective. So you don't need a lot of data to get Spam Bayes running. This filter missed 0 spams and caught 333 (less traffic). The docs (see wiki) explain a lot about training the filter ... _________________ __deus ex machina__ http://xiffy.nl/weblog/ Japan photo's: http://2006.cooljapan.nl/main.php?g2_itemId=20 |
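The explanation above reads like a division by zero: a tiny probability divided by a category's trained word count, which is 0 when that category has never been trained. A minimal sketch of that failure mode and a guard, in Python for illustration only. The function names and the `floor` value are hypothetical and not taken from spambayes.php.

```python
def untrained_categories(word_counts):
    """Return the names of categories whose trained word count is zero."""
    return [name for name, count in word_counts.items() if count <= 0]


def unseen_word_probability(category, word_counts, floor=1e-6):
    """Small probability for a word never seen in `category`.

    `floor` is an assumed constant; dividing it by a word count of 0
    is what would blow up on an untrained category.
    """
    count = word_counts.get(category, 0)
    if count <= 0:
        raise ValueError(f"category {category!r} has no training data yet")
    return floor / count
```

With the word counts quoted above, `untrained_categories({"ham": 696, "spam": 509})` is empty, while a blog trained only on ham, for example `{"ham": 696, "spam": 0}`, would report `["spam"]` before any scoring is attempted.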
||||
|
|
|||||
|
cyblot Nucleus Guru Joined: 16 Sep 2003 Posts: 399 Location: Netherlands |
Which is definitely the best way to approach spam, since we don't have to rely on anyone else maintaining a central file or service. As long as NP_SpamBayes itself keeps being updated to work with the latest Nucleus version, of course. This sounds really good, I'm going to test it. One question while I do: does this mean comments won't show up until I have told Spam Bayes they are ham, or are they added to the site first, until I determine they are spam? I didn't see that info in your description, but maybe I just overlooked it. _________________ Blots of Info http://www.golb.org |
||||
|
|
|||||
|
xiffy Nucleus Guru Joined: 27 Mar 2002 Posts: 1218 Location: Deventer |
Ah, I did not mention it because for me it was obvious (and that teaches me that not everything obvious to me is obvious to the rest of the world). Anything that is considered 'ham' will show up on your weblog as a legit comment / trackback. However, comments will also be logged in the Spam Bayes log if you have logging turned on, so you can quickly train the filter to consider that particular comment as spam. (I did not add 'ham' logging to the SpamCheck event because the number of logged events could be overwhelming if you use Spam Bayes for referrer blocking as well.) _________________ __deus ex machina__ http://xiffy.nl/weblog/ Japan photo's: http://2006.cooljapan.nl/main.php?g2_itemId=20 |
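The behaviour described above (ham is published right away, and verdicts are logged so they can be retrained later) could be sketched roughly as follows. This is a Python illustration, not the plugin's code. `handle_submission`, `is_spam`, and the `log_ham` switch are hypothetical names, with `log_ham=False` standing in for the SpamCheck case where ham is not logged.

```python
def handle_submission(text, is_spam, log, logging_enabled=True, log_ham=True):
    """Decide whether to publish `text`; returns True when it should appear.

    `is_spam` is any callable returning True for spam (e.g. a trained filter).
    `log` is a list collecting (verdict, text) pairs for later retraining.
    """
    verdict = "spam" if is_spam(text) else "ham"
    # Spam is always logged when logging is on; ham only if log_ham is set.
    if logging_enabled and (verdict == "spam" or log_ham):
        log.append((verdict, text))
    return verdict == "ham"
```

The point of logging ham as well is that a spam message wrongly published as ham still shows up in the log, where it can be reclassified and used for training.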
||||
|
|
|||||
|
Leng Nucleus Guru Joined: 19 Sep 2004 Posts: 2830 Location: Australia |
Yup, I trained it with all the comments currently, but since there were no spam comments, there is a probability of 0 for spam. Stuck in a couple of spam examples and now the error has disappeared. Yay! I'm now going to enable comments on my site without requiring registration to see how good SpamBayes is. For science! _________________
deborahlau.com | To-Do List Questions? See the FAQ, read the docs, or browse our plugins!! Last edited by Leng on Sat Sep 09, 2006 1:08 pm; edited 1 time in total |
||||
|
|
|||||
|
xiffy Nucleus Guru Joined: 27 Mar 2002 Posts: 1218 Location: Deventer |
Yes, just copy-paste some spam trackbacks that you would like to stop into the training text area. You don't need a lot, but at least 1. After that, add every spam comment / trackback that gets through to the filter, and after some time the spam will go away. If you enable logging, training will be easier (the log will have links to train ham / spam). _________________ __deus ex machina__ http://xiffy.nl/weblog/ Japan photo's: http://2006.cooljapan.nl/main.php?g2_itemId=20 |
||||
|
|
|||||
|
All times are GMT + 1 Hour
|||||