January 7th, 2008

The Scourge Of Blog Comment Spam


Let me begin with an open appeal to Akismet, provider of comment spam protection to Publishing 2.0 and many other blogs run on WordPress: Howard Owens is NOT spam.

Every time Howard Owens leaves a comment on Publishing 2.0, it gets caught in the WordPress Akismet spam filter. Howard tells me this happens to him on most other blogs. Why? Not because Howard is spammy. He leaves great comments, which is why I’m always happy to fish them out of the unspeakable bucket of filth (more on this in a moment) caught by Akismet.

It’s likely because Howard’s blog was hacked by spammers. Not once, but twice. So when Howard enters his blog URL in the comment form, it triggers the spam filter.

Why would a spammer want to hack Howard’s blog — or any blog?

Ah, that gets to the reason why the Akismet comment spam filter comes standard on every WordPress install.

If you’ve ever had to sift through your email spam folder looking for a real message, you probably think you know how bad spam can be. But you haven’t seen spam until you’ve seen blog comment spam.

“Unspeakable” is the best adjective I can use to describe it. Having to sift through the spam in Akismet makes me think of a line from an old Weird Al Yankovic tune — “I’d rather clean all the bathrooms in Grand Central Station with my tongue” (keep in mind, this was back in the early 80s, before GCS was cleaned up).

Most of the spam in my Akismet filter is not safe for work, and I won’t reproduce it here, but here’s a rather mild example to explain why blogs get spammed:

[Image: comments-spam-example.jpg]

Do you remember the days before Google, when you would search for something on AltaVista or Excite and find pages that were filled with the keywords you searched for? Google dealt a mighty blow to this kind of keyword search spam by figuring out a way to rank sites that didn’t depend on keyword density.
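To make "keyword density" concrete, here is a minimal sketch in Python. It assumes density simply means the share of words on a page that match a given keyword; the function name is made up for illustration, and this is not a description of how AltaVista, Excite, or any other engine actually scored pages.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Return the fraction of words in `text` that equal `keyword`.

    Assumption: a "word" is any run of letters, digits, or apostrophes,
    compared case-insensitively.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# A page stuffed with one keyword scores far higher than normal prose.
stuffed = "cheap dvd cheap dvd cheap"
normal = "I bought a cheap dvd player last week and it works fine"
print(keyword_density(stuffed, "cheap"))
print(keyword_density(normal, "cheap"))
```

A ranking scheme that rewards a high score on a measure like this is easy to game: repeat the keyword often enough and the page looks "relevant."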

But it did not destroy the practice. Rather than use it on their own sites, spammers discovered they could actually do it on other people’s sites.

How? By leaving the spam in a comment.

Looking at the example above, you probably wonder what good that would do the spammer — what reader of Publishing 2.0 would ever click on those links?

But the spam isn’t there for you — it’s there for search engines, which tend to trust content on Publishing 2.0 — including content in the comments.

So if I allowed the comment above on one of my posts, it might cause that post to rank for one of those keywords. The person searching for “sex DVD” or whatever would find my post, search for the text on the page, and click on one of the links — which would be relevant, because that’s what they were searching for.

At least that’s the theory. The comment above is a pretty brutish example. But some are more difficult to catch.

Here’s one that got past Akismet and that I accidentally let through in my rush to moderate a pile of comments.

[Image: comment-spam-example-_2.jpg]

Click on the image above and you’ll discover a unique form of spam on the web: a “blog” that exists for only one purpose, to deliver ads. The blog does nothing but link to other blogs.

With no content of its own, how would anyone discover it? One way is by generating comments to the blogs it links to in the form of trackbacks.

This is one of the reasons why most blogging software automatically puts a rel="nofollow" attribute on links in comments, so that comment spam links don’t influence search rankings.
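As a rough illustration of what that looks like in practice, here is a minimal Python sketch that rewrites the HTML of a comment so every link carries rel="nofollow". It uses only the standard library, and the class and function names are invented for this example. It is not how WordPress or any other blogging platform actually implements the feature, and it silently drops HTML comments and doctype declarations, which a real sanitizer would handle deliberately.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit comment HTML, forcing rel="nofollow" on every <a> tag."""

    def __init__(self):
        # Keep entities such as &amp; untouched in text so they round-trip.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _render(self, tag, attrs, closing=">"):
        if tag == "a":
            # Drop any rel value the commenter supplied, then add our own.
            attrs = [(k, v) for k, v in attrs if k != "rel"]
            attrs.append(("rel", "nofollow"))
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
        )
        return f"<{tag}{rendered}{closing}"

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def add_nofollow(comment_html: str) -> str:
    """Return `comment_html` with rel="nofollow" set on every link."""
    parser = NofollowRewriter()
    parser.feed(comment_html)
    parser.close()
    return "".join(parser.out)


print(add_nofollow('Great post! <a href="http://example.com">my site</a>'))
```

The link still works for a human who clicks it. The attribute only tells search engines not to count the link as an endorsement, which removes the ranking payoff the spammer was after.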

That this spam comment was able to slip past Akismet, while Howard Owens’s comments got caught, is an example of a larger trend on the web: how spam threatens to squeeze out real content. For example, on Publishing 2.0, there are 8,822 real comments. Akismet has caught 362,719 spam comments.
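To put those two counts in proportion, here is a quick back-of-the-envelope calculation in Python. The only inputs are the figures quoted above; the variable names are just for illustration.

```python
real_comments = 8_822      # real comments on Publishing 2.0
spam_comments = 362_719    # spam comments caught by Akismet

total = real_comments + spam_comments
spam_share = spam_comments / total             # fraction of all comments that are spam
spam_per_real = spam_comments / real_comments  # spam comments per real comment

print(f"Total comments submitted: {total:,}")
print(f"Share that is spam: {spam_share:.1%}")
print(f"Spam comments per real comment: {spam_per_real:.1f}")
```

That works out to roughly 97.6 percent spam, or about 41 spam comments for every real one.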

But that’s a post for another day.

  • Yes, I agree blog spam should be stopped and new filters should be programmed. We have to stay on top of things; it's the only way efficient websites will go through. Cheers.

  • Davidof

    I've recently installed Akismet for EE but so far it has not caught a single spam, except for the test ones I put through myself with obvious words like Viagra in them. The thing it falls down on is all the new vanity spam stuff where someone replies

    "hey, great post yes I agree with you about this" then leaves a URL. I noticed these get through on Akismet's website.

    Now on my website 99% of these come from Eastern Bloc countries, Colombia or Asia. So for a start I would like to be able to dump all comments from these countries for manual review.

    Akismet only seems to be as good as its database, and even after I marked a whole load of stuff as spam the same users were still being approved. Very frustrating.

  • Please don’t go the Captcha route!!!!

  • Heather

    I hate spam as much as you guys - it sucks. Ike, I hate Captchas too...they are super-annoying. I think most would agree. But, I would disagree on Bad Behavior, I tried it for a while and it's unreliable...I got locked out of my own blog. I've read that it blocks out some search engine spiders because it's not very well written. I just recently got a plugin that is amazing at stopping spambots - WP-SpamFree.

  • Hi Erik...

    There is an ExpressionEngine module:

    http://loweblog.com/archive/20...

    It has been running for several months, and I don't see any negative feedback in the comments.

    Good luck.
