Monday, June 05, 2006

Blogger Has Been Nightmarish

Blogger, without which I would not have been able to start blogging so easily, has behaved nightmarishly today. It is, unfortunately, not a unique occurrence in recent weeks. The status page lists the various outages that Blogger has experienced recently, a total of five since May 26. Because of Blogger's technical problems, today has not been an enjoyable day. However, I did have one thought relating to a serious problem on Blogger that may actually be useful.

It is in the interest of Blogger to fix problems quickly, because bloggers here are an asset to Google. As nitecruzr at the Blogger Help Group put it:

"Blogger is free, as is our time. WE provide content, that gets web surfers to Blogger, that Blogger / Google sells ads on. We are the unpaid staff at Blogger. We are not an expense, we are an asset."

nitecruzr links to a site called The Real Blogger Status, which offers helpful advice not likely to be found in the regular help pages, but which is also on Blogger and not always available when Blogger goes down. Nonetheless, Chuck at The Real Blogger Status speculates about the negative effects of spam blogs, effects which may have contributed to today's problems:

The thing is, the effects of spam blogs, aka splogs, are hurting us in several ways.
  • Some of us have legitimate blogs, that are being incorrectly identified as splogs. In some cases, the blogs are deleted or disabled. We have to waste time getting Blogger to restore or re enable them. This wastes our time, and Blogger Support's time. Real, urgent, problems aren't being dealt with, because Blogger Support is busy dealing with splogs.
  • Blogger infrastructure is being overloaded, by the spammers creating and linking thousands of splogs. We are seeing hardware failures, caused by their volume. I have had to change my Blogger server, from miscellaneous hangups, 3 times in the last week. This is an increase of maybe 600% for me.
  • The search engines, like Google, are being overloaded by the splogs. The splog volume makes search engines work harder to index, and to retrieve, lists including the splogs.
  • The search engine hit lists are being overloaded by splogs producing hits. Our blogs can't be seen in the search hits lists, because of all of the splogs in there.
  • Maybe even Corrupted Blogs, caused by over redundant or badly coded hijacking processes.

As Blogger is free, it will always be susceptible to spammers, just as email has always been. Today as I waited and waited for Blogger to return I pondered how one might program a way to recognize a blog as spam. The trouble was, each time I came up with a way to do so I realized it would mark a blog I know is legitimate as spam.

The only useful tool I could come up with was to grade a selection of entries the way standardized test essays are graded. The one link between all the splogs I've seen -- admittedly a small sample -- is the utter stupidity of the text I encounter. The essay grading programs used by ETS look for complexity of language and internal coherence.

What makes these programs perhaps inapplicable is that they are built from an enormous sample of essays on a single topic. Essentially, the program compares an essay with thousands of other essays on the same topic, and assigns a grade according to programmed rules of grammar and similarities in language and word choice to the other essays. While such an application could not "grade" for content, perhaps it could be used to judge whether the text on a page was coherent and complex enough to have been written by a human being.
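To make the idea concrete, here is a very rough sketch in Python of what a "complex and coherent enough" check might look like. It is not how the ETS graders work, and it is certainly not anything Blogger uses. The function names, the short word list, and the thresholds are all my own assumptions, picked only for illustration. It measures two crude signals: how varied the vocabulary is, and how much of the text consists of ordinary connecting words, which keyword-stuffed splog text tends to lack.

```python
import re

# A tiny, hand-picked list of common English function words.
# This list is an assumption for illustration; a real tool would need far more.
FUNCTION_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "to", "in", "on", "for",
    "with", "is", "are", "was", "were", "it", "that", "this", "as", "i",
    "not", "be", "have", "has", "at", "by", "from",
}


def text_features(text):
    """Return crude complexity signals for a block of text."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return {"diversity": 0.0, "function_ratio": 0.0, "avg_sentence_len": 0.0}
    return {
        # Share of distinct words: repeated keywords push this down.
        "diversity": len(set(words)) / len(words),
        # Share of connecting words: keyword lists have almost none.
        "function_ratio": sum(w in FUNCTION_WORDS for w in words) / len(words),
        # Average number of words per sentence, reported but not scored here.
        "avg_sentence_len": len(words) / len(sentences),
    }


def looks_human(text, min_diversity=0.4, min_function_ratio=0.2):
    """Guess whether text reads as human-written. Thresholds are arbitrary."""
    features = text_features(text)
    return (features["diversity"] >= min_diversity
            and features["function_ratio"] >= min_function_ratio)


if __name__ == "__main__":
    human = ("Today as I waited for Blogger to return, I pondered how one "
             "might recognize a blog as spam. It was not an easy problem.")
    spam = ("cheap pills cheap pills buy cheap pills online "
            "buy pills cheap pills online")
    print(looks_human(human), looks_human(spam))
```

A check this simple would run into exactly the trouble I described above: a legitimate blog made up of short lists, song lyrics, or photo captions could easily fall below the thresholds, so at best it could flag blogs for a human to look at.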

I would think it might be worth a try, as anything to defeat the scourge of spamming benefits us all. Of course, once such a program was put into place spammers would find a way to work around it. Spam, like cancer, requires constant vigilance.
