Let me begin with an open appeal to Akismet, provider of comment spam protection to Publishing 2.0 and many other blogs run on WordPress: Howard Owens is NOT spam.

Every time Howard Owens leaves a comment on Publishing 2.0, it gets caught in the WordPress Akismet spam filter. Howard tells me this happens to him on most other blogs. Why? Not because Howard is spammy. He leaves great comments, which is why I’m always happy to fish them out of the unspeakable bucket of filth (more on this in a moment) caught by Akismet.

It’s likely because Howard’s blog was hacked by spammers. Not once, but twice. So when Howard enters his blog URL in the comment form, it triggers the spam filter.

Why would a spammer want to hack Howard’s blog — or any blog?

Ah, that gets to the reason why the Akismet comment spam filter comes standard on every WordPress install.

If you’ve ever had to sift through your email spam folder looking for a real message, you probably think you know how bad spam can be. But you haven’t seen spam until you’ve seen blog comment spam.

“Unspeakable” is the best adjective I can use to describe it. Having to sift through the spam in Akismet makes me think of a line from an old Weird Al Yankovic tune — “I’d rather clean all the bathrooms in Grand Central Station with my tongue” (keep in mind, this was back in the early 80s, before GCS was cleaned up).

Most of the spam in my Akismet filter is not safe for work, and I won’t reproduce it here, but here’s a rather mild example to explain why blogs get spammed:

comments-spam-example.jpg

Do you remember the days before Google, when you would search for something on AltaVista or Excite and find pages that were filled with the keywords you searched for? Google dealt a mighty blow to this kind of keyword search spam by figuring out a way to rank sites that didn’t depend on keyword density.

But it did not destroy the practice. Rather than use it on their own sites, spammers discovered they could actually do it on other people’s sites.

How? By leaving the spam in a comment.

Looking at the example above, you probably wonder what good that would do the spammer — what reader of Publishing 2.0 would ever click on those links?

But the spam isn’t there for you — it’s there for search engines, which tend to trust content on Publishing 2.0 — including content in the comments.

So if I allowed the comment above on one of my posts, it might cause that post to rank for one of those keywords. The person searching for “sex DVD” or whatever would find my post, search for the text on the page, and click on one of the links — which would be relevant, because that’s what they were searching for.

At least that’s the theory. The comment above is a pretty brutish example. But some are more difficult to catch.

Here’s one that got past Akismet and that I accidentally let through in my rush to moderate a pile of comments.

comment-spam-example-_2.jpg

Click on the image above and you’ll discover a unique form of spam on the web: a “blog” that exists for only one purpose, to deliver ads. The blog does nothing but link to other blogs.

With no content of its own, how would anyone discover it? One way is by generating comments to the blogs it links to in the form of trackbacks.

This is one of the reasons why most blogging software automatically puts a rel="nofollow" attribute on links in comments, so that comment spam links don’t influence search rankings.
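To make the nofollow idea concrete, here is a minimal sketch of how a blog platform might force the attribute onto every link in a submitted comment. This is not WordPress's actual code; the `NofollowRewriter` class and `add_nofollow` function are hypothetical names invented for illustration, and the sketch only handles tags, text, and entities (HTML comments and doctype declarations are simply dropped).

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Hypothetical helper: re-emit comment HTML with rel="nofollow" on every <a>."""

    def __init__(self):
        # Keep entities such as &amp; untouched so they can be re-emitted as-is.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _open_tag(self, tag, attrs, closer=">"):
        if tag == "a":
            # Discard whatever rel the commenter supplied, then add our own.
            attrs = [(name, value) for name, value in attrs if name != "rel"]
            attrs.append(("rel", "nofollow"))
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )
        return f"<{tag}{rendered}{closer}"

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._open_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self._open_tag(tag, attrs, closer=" />"))

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def add_nofollow(comment_html):
    """Return comment_html with rel="nofollow" set on every link."""
    rewriter = NofollowRewriter()
    rewriter.feed(comment_html)
    rewriter.close()
    return "".join(rewriter.parts)


print(add_nofollow('Nice post! <a href="http://example.com/">cheap stuff</a>'))
```

A search engine that honors the attribute gives the linked site no ranking credit for that link, which removes much of the incentive for leaving it.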

That this spam comment was able to slip past Akismet, while Howard Owens’s comments got caught, is an example of a larger trend on the web: how spam threatens to squeeze out real content. For example, on Publishing 2.0, there are 8,822 real comments. Akismet has caught 362,719 spam comments.
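To put those two numbers in proportion, here is a quick back-of-the-envelope calculation. The variable names are just for illustration, and the figures are the ones quoted above.

```python
real_comments = 8_822
spam_comments = 362_719

total_comments = real_comments + spam_comments

# Share of all submitted comments that were spam.
spam_share = spam_comments / total_comments

# How many spam comments arrived for each real one.
spam_per_real = spam_comments / real_comments

print(f"{spam_share:.1%} of all comments were spam")
print(f"roughly {spam_per_real:.0f} spam comments for every real one")
```

That works out to about 97.6% spam, or roughly 41 spam comments for every real one.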

But that’s a post for another day.