9.7 Billion Web Pages and Nothing’s There

As of January 2006, Google has indexed 9.7 billion web pages. When I search on a string that is even somewhat popular, I often get back hundreds of thousands or millions of results. In addition, I find it very difficult to obtain the most recent results unless I’m very, very careful about how I enter my search string. Why is it so hard to find really useful data?

Try answering these questions and tell me how easy it is for you to put your fingers on the data (without paying $2,500 or more for a report from some analyst firm):

  • How many total web sites are there?
  • Worldwide, what’s the installed base of mobile phones? How many are web-enabled?
  • What are the various flavors of wireless, data-centric technologies (Wi-Fi, WiMAX, CDMA, GSM, EV-DO, etc.)? How fast are they? What are their growth rates?
  • What is the guesstimate for the growth in, say, data? When you think about the demand-side creation of media by consumers, is there any way to quantify this increase?
  • How many unique visitors does Wikipedia get per day?
  • Of the 29.8 million blogs tracked by Technorati, how many are run by spammers?

I could go on and on, but you get the drift. For simple searches on Google, Yahoo, Icerocket and others, it’s fairly trivial to get good results back. But when you’re searching for more complex, meaty results, it’s stunningly difficult and time-consuming to get answers.

One would assume that the Federal Communications Commission, the Department of Commerce, the World Wide Web Consortium or one of the many other governmental or non-profit organizations would provide this data (especially the US governmental agencies to whom I’m paying taxes!) but alas, they don’t.