"The more I find out, the less I know."

Monday - December 22, 2003 at 03:37 AM

A Blog-Based Early Warning News-Ranking System


Blogdex is a cool tool for seeing what news and commentary is catching fire at the moment. It basically counts the number of new links to a given web page which appear over a certain period of time. The faster the links appear, the higher the rank. One article on this site even appeared on the front page of Blogdex after it got slashdotted.

What would be really cool is to see what news is about to catch fire. In other words, a web page which predicts what Blogdex will show in, say, 12-24 hours.
This can be done, if you believe that some blogs are consistently ahead of the curve in posting links (i.e. what they post is more likely to become popular), whereas others are consistently at or behind the curve.

To construct a Future Blogdex, two stages are involved. The first stage is to assign every blog a predictive score. This would be a number based on how frequently links from a particular blog appear on the front page of Blogdex within 24 hours. There are probably a half-dozen different ways to do this, but here's one:

1) When a link first appears on a blog, note the current Blogdex score (the number of new links to that site in the past day).
2) 12 hours later, note the Blogdex score for that same link. The change over 12 hours is the "predictive score" for that blog for that link, and can be positive or negative.
3) The blog's overall predictive score is the average of its predictive scores for all links (or perhaps all links in the past 7 days).
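
Here's a rough sketch of those three steps in Python. It assumes you have somehow collected, for each blog and each link it posted, the Blogdex score at posting time and again 12 hours later. The tuple layout, the blog names, and the numbers are all made up for illustration; Blogdex doesn't expose anything like this as far as I know.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (blog, link, score_when_posted, score_12h_later).
# The data format and the values are invented for this sketch.
observations = [
    ("blog_a", "http://example.com/1", 2, 30),
    ("blog_a", "http://example.com/2", 5, 15),
    ("blog_b", "http://example.com/1", 28, 30),
    ("blog_b", "http://example.com/3", 10, 4),
]

def predictive_scores(observations):
    """Average the 12-hour change in Blogdex score over each blog's links."""
    changes = defaultdict(list)
    for blog, _link, score_when_posted, score_later in observations:
        # Step 2: the per-link predictive score is the change over 12 hours.
        changes[blog].append(score_later - score_when_posted)
    # Step 3: the blog's overall score is the average of its per-link scores.
    return {blog: mean(deltas) for blog, deltas in changes.items()}

scores = predictive_scores(observations)
print(scores)
```

With this toy data, blog_a linked early to pages that then took off, so it gets a high positive score, while blog_b linked late or to pages that were fading, so it ends up slightly negative.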

The second stage is to create a version of the Blogdex rank which counts not the total new links to pages, but the total new links weighted by the predictive score of each blog making the link. Blogs which tend to be leaders will pull their links up in the rankings, and those which are laggards will pull their links down.
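
The second stage might look something like this. Again, it's only a sketch: the list of (blog, url) pairs and the scores dictionary are invented, and giving unknown blogs a weight of zero is my own assumption, not something the scheme above dictates.

```python
from collections import defaultdict

# Hypothetical predictive scores from the first stage.
blog_scores = {"blog_a": 19, "blog_b": -2}

# Hypothetical new links seen in the current period: (blog, url).
new_links = [
    ("blog_a", "http://example.com/1"),
    ("blog_b", "http://example.com/2"),
    ("blog_b", "http://example.com/1"),
    ("blog_c", "http://example.com/2"),  # blog with no score yet
]

def future_rank(new_links, blog_scores, default_score=0.0):
    """Rank pages by the sum of the predictive scores of the blogs linking to them."""
    totals = defaultdict(float)
    for blog, url in new_links:
        # Each link counts for its blog's predictive score instead of a flat 1.
        # Assumption: blogs without a score contribute default_score.
        totals[url] += blog_scores.get(blog, default_score)
    # Highest weighted total first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

ranked = future_rank(new_links, blog_scores)
print(ranked)
```

Both pages here have two new links, so plain Blogdex would tie them. The weighted version puts the first page on top because a leader blog linked to it.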

So what would this really be useful for? Well, nothing, but I bet the guys at the MIT Media Lab already have the data they need to calculate these scores. And it would give blogs with high predictive scores something to brag about.
