Down to Earth

Science and engineering of natural systems

Sunday, September 24, 2006

Blogospheric Dynamics

Prompted by the Nature article on science blogs I probed deeper into the ranking of blogs, in order to better understand how the blogosphere works. I wanted to understand (i) how Technorati ranks relate to incoming links (I already knew the relationship was monotonic, and people have noticed it to be a power law), and (ii) how this relationship changes over time.

Yes, I know. Technorati Ranks aren’t ideal measures of the success of a blog. Ranking is based exclusively on the number of incoming links from other blogs in the last six months (one counted per blog). It approximates the respect other bloggers have for a particular blog (though you could game the system somewhat by submitting to lots of different carnivals or by creating puppet blogs).

But bloggers don't blog to get a pat on the back in the form of incoming links. Our goal is to communicate. What we really want is our messages to get to our target audience, and in most cases for the messages to be heeded. Better metrics could be total traffic, unique visitors, visit duration, visits from email accounts (tracking the referring URL), depth of discussion, or citations by commenters in other blogs. These data are all very difficult to obtain, and still they remain approximations. For the time being then, in order to make inter-blog comparisons, we’re stuck with Technorati.

I recorded the Technorati Rank and the number of incoming links for 63 blogs on two days: July 26 and September 22, 2006. 47 of the blogs were featured in Nature’s review of science blogs (3 of them are excluded because of discontinuous blogging); 2 were blogs that Nature missed (Bad Astronomy and Living the Scientific Life); 9 were blogs with higher ranks than any science blogs; 1 was Down to Earth; and the remaining 4 were blogs with very low ranks. The figure below shows the data: black dots are the July data and red dots are September's. 5 blogs are identified with a larger circle, allowing us to see how their rank and incoming links change between the two dates.


Figure 1. Technorati Rank for 63 blogs recorded on July 26 and September 22, 2006.

Two sample dates will hardly give any statistical significance to any long-term trend we may see, but let’s look anyway.

For each sample date, the data can be divided into three distinct groups. The majority of blogs plotted fall along a straight line in log-log space. In linear-linear space, this would be a power law, TR = a × IL^b (where TR is the Technorati Rank, IL is the number of incoming links, and a and b are constants), so we say these data exhibit a power law relationship. The data to the right break away from the regular form, and so would the left-most data points if I plotted enough of them. There is some group dynamic that causes points to align themselves into a power law, but the left and right groups break this law because they’re too close to the edges.
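To make the "straight line in log-log space" idea concrete, here is a minimal sketch of how one could estimate a and b by ordinary least squares on the logarithms of the data. The function name fit_power_law and the sample values are my own illustrations; the values are made up to lie exactly on a power law and are not the data in Figure 1. The exponent b comes out negative because the rank number falls as incoming links rise.

```python
import math

def fit_power_law(links, ranks):
    """Fit rank = a * links**b by least squares on log10 values.

    Returns the tuple (a, b). Assumes all inputs are positive.
    """
    xs = [math.log10(v) for v in links]
    ys = [math.log10(v) for v in ranks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of the line in log-log space
    a = 10 ** (mean_y - b * mean_x)    # intercept converted back to linear space
    return a, b

# Hypothetical, made-up values that follow rank = 1e6 * links**-1.5 exactly.
example_links = [10, 100, 1000, 10000]
example_ranks = [1e6 * il ** -1.5 for il in example_links]

a, b = fit_power_law(example_links, example_ranks)
print(f"a = {a:.3g}, b = {b:.3g}")
```

With real data you would want to fit only the middle group of blogs, since the points near either edge depart from the line.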

Power laws are ubiquitous in natural and social systems. We see them in earthquake occurrences and stock market fluctuations. Per Bak argued that many (all?) were a reflection of the process of self-organised criticality – where a complex system evolves to a state where even small disturbances can impact the entire system.

Edge effects arise when the elements of a system are not insulated from the outside. The closer they are to the edge, the greater the relative influence the outside will play, breaking the power law. It’s difficult to see the edge effects on the left, but they’re there. You would see lots of blogs with no incoming links. You also see edge effects on the right. These stand-out-in-the-crowd blogs are operating in a very sparse blogging neighbourhood, where there is little competition and where slight redistricting can have profound effects on their ranking.

What seems to be happening between July and September is a lot of jumping around for the higher ranks (on the right), and a general lowering of the slope of the power law relationship. This is consistent with the rise in blogs and the ensuing blog rolls. Even to retain the same rank, a blog must increase the number of incoming links. This will tend to flatten the relationship. What likely accompanies this, though the data don’t show it very well, is that the break point moves further to the right. This would be associated with more blogs becoming competitive, reducing the number of stand-out-in-the-crowd blogs like RealClimate and Pharyngula. If the break point is indeed moving to the right, then the blogging community is still in an adolescent stage.
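The point that a blog must gain links just to hold its rank can be illustrated by inverting the power law: if TR = a × IL^b, then the links needed for a given rank are IL = (TR / a)^(1/b). The sketch below does that for two sets of fit parameters. All of the numbers are hypothetical and chosen only to show a flatter slope on the second date; they are not fitted to my data.

```python
def links_needed(rank, a, b):
    """Invert rank = a * links**b to get the links required for a given rank."""
    return (rank / a) ** (1.0 / b)

# Hypothetical fit parameters (not estimated from the real data).
JULY_FIT = {"a": 1e6, "b": -1.5}       # steeper slope
SEPTEMBER_FIT = {"a": 1e6, "b": -1.2}  # flatter slope

target_rank = 1000
july_links = links_needed(target_rank, **JULY_FIT)
september_links = links_needed(target_rank, **SEPTEMBER_FIT)

print(f"Links needed for rank {target_rank} in July: {july_links:.0f}")
print(f"Links needed for rank {target_rank} in September: {september_links:.0f}")
```

Under these assumed parameters the same rank costs more incoming links on the later date, which is the pattern described above.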

Even if you normalise the number of incoming links by the maximum rank, you see the same pattern, implying that there is some reorganisation going on within the blogging network, which I imagine is manifested by individual blogs linking to fewer and fewer blogs as they become comfortable with their blogroll.

This analysis paints a richer picture of the blogosphere, but there is so much we don’t know about its dynamics. Why are some blogs better than others? How is "better" defined? And so on. Here’s a call to you social scientists out there: STUDY US.


8 Comments:

At 2:55 PM , Blogger coturnix said...

Hi, I linked to this with some comments of my own.

 
At 4:54 PM , Anonymous Daniel Collins said...

To be complete, Coturnix's post is here.

 
At 7:59 AM , Blogger Pedro Beltrão said...

Nice work, it would be nice to see this with more time points. Also not just to see the number of incoming links over time but their turnover. I would think that highly connected hubs also tend to lose links faster but that will probably be hard to track.

 
At 1:50 PM , Anonymous Daniel Collins said...

Given this dry run, hopefully I'll get round to doing this in two months' time.

One design flaw (of many?): I should have sought out more blogs to better define the break point.

As for turnover, that's a whole new can of worms. It would require some smart programming - out of my hands.

 
At 1:37 PM , Anonymous Arunn said...

One immediate thought I have is the reasonably fair assumption that the newcomer blogs usually start their blogroll with most popular blogs. This translates to the left most edge influencing the right most edge and keep its fluctuation less apparent. More data is required to see if this is a valid assumption...

 
At 3:20 PM , Anonymous kevin v said...

Not sure how much it would influence the data (probably wouldn't), but there are popular science blogs not on the Nature list b/c they weren't getting technorati rankings at the time. (I hadn't "claimed" my blog before that nature article came out, but now I'm in the technorati rankings somewhere about halfway down that list.) What would be interesting is to compare the page hits of all the scienceblogs to each's technorati ranking, because just by eye they don't match well.

 
At 9:05 AM , Anonymous Bart said...

Compare:

http://www.cs.unm.edu/~dlchao/flake/whee/

Seen on:

http://planet.taint.org/

 
At 9:15 AM , Anonymous Daniel Collins said...

Hmmm, interesting.

 
