The End of Theory: The Data Deluge Makes the Scientific Method Obsolete

The End of Theory: The Data Deluge Makes the Scientific Method Obsolete - while this article itself is fairly thin, it makes an important point. The quantity of data (about anything and everything) is increasing insanely, and it will be (and already is) perhaps the most important thing in science and in business in the coming decade or so.

Most individuals today generate more data in a day than most countries did 200 years ago. I’m just making that comparison up, but think about the data you generate daily: the photos you take, the logs you leave all over the world by browsing the web, making phone calls, purchasing items, etc.

All of this data can only be analysed with computers, and it will tell us (and already is telling us) all sorts of things we did not know before, with greater accuracy.

Yahoo Embraces The Semantic Web - Expect The Internet To Organize Itself In A Hurry

Yahoo Embraces The Semantic Web - Expect The Internet To Organize Itself In A Hurry - wow. Watching things grow sloooowly for a long time, and then it finally seems like things are picking up… very exciting.

Update: link is The Yahoo! Search Open Ecosystem

Google Code - Updates: New GData API: Google Base

Google Code - Updates: New GData API: Google Base - two of my biggest hopes - Google opening up Google Base, and wider adoption of APIs based around OpenSearch - all at once. This could be big.

Google Data APIs Protocol

Google Data APIs Protocol - interesting move from Google. I (and others) have thought for a while that combining OpenSearch’s read capabilities with the Atom Publishing Protocol’s write capabilities would create a very powerful API, and that’s roughly what Google is doing here.

It’s great to see some OpenSearch support (they’re using startIndex, totalResults and itemsPerPage), but I’d like to see them use more of it. Some of what they’re doing is contrary to how OpenSearch works (that’s not a problem per se): they’re using predefined query names such as q and max-results (and a folder for categories) rather than allowing people to use whichever names they want and then specify them in an OpenSearch Description file.
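To make the difference concrete, here is a rough sketch of how a client could fill in an OpenSearch-style URL template, where the server picks its own query names and the client only needs to know the standard placeholders. The template URL and the names query, limit and from are made up for illustration; a real template would come from the Url element of a description file, and this is not how Google's API works.

```python
import re
from urllib.parse import quote

# Hypothetical template: the server chooses "query", "limit" and "from";
# only the placeholders in braces are OpenSearch names. A trailing "?" marks
# a placeholder as optional.
TEMPLATE = "http://example.com/search?query={searchTerms}&limit={count?}&from={startIndex?}"


def expand_template(template, values):
    """Substitute {name} and {name?} placeholders in a URL template."""

    def substitute(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in values:
            # Percent-encode the value so it is safe inside a query string.
            return quote(str(values[name]), safe="")
        if optional:
            # Optional placeholders with no value become empty strings.
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([A-Za-z][A-Za-z0-9:]*)(\??)\}", substitute, template)


url = expand_template(TEMPLATE, {"searchTerms": "semantic web", "count": 10})
print(url)
```

The client never hard-codes q or max-results; it only knows searchTerms, count and startIndex, and the template maps those onto whatever names the server prefers.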

In that same vein, it would be nice to see them make use of autodiscovery, as Atom, RSS, OpenSearch, and others do. Upon first inspection I would say these autodiscovered documents could be OpenSearch Descriptions, but I may be wrong about that.
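For what it's worth, autodiscovery on the client side could look something like the sketch below: scan a page's head for link elements with rel="search" and the OpenSearch description media type. The sample page and its URLs are invented, and I'm assuming the rel/type convention used for OpenSearch description documents; nothing here is taken from Google's documentation.

```python
from html.parser import HTMLParser

# Media type assumed for OpenSearch description documents.
OPENSEARCH_TYPE = "application/opensearchdescription+xml"


class SearchLinkFinder(HTMLParser):
    """Collect (title, href) pairs for autodiscoverable search descriptions."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before testing.
        rels = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower()
        if "search" in rels and media_type == OPENSEARCH_TYPE:
            self.links.append((attributes.get("title"), attributes.get("href")))


# Made-up page with one feed link and one search description link.
SAMPLE_PAGE = """
<html><head>
<link rel="alternate" type="application/atom+xml" href="/feed.atom">
<link rel="search" type="application/opensearchdescription+xml"
      title="Example Search" href="http://example.com/opensearch.xml">
</head><body></body></html>
"""

finder = SearchLinkFinder()
finder.feed(SAMPLE_PAGE)
print(finder.links)
```

The Atom feed link is ignored and only the search description is picked up, which is the same trick feed readers use to find a feed from a page URL.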

One interesting thing to note is that they mention how startIndex is 1-based (which is true), and then display an example with a value of “0”. Sounds like DeWitt is right: it does need to handle 0-based numbers too; even Google is making that mistake.
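One way to cope with both conventions, sketched under the assumption that the server states which index its first result has (the index_offset parameter below is my own name for that, not something from Google's docs):

```python
def to_zero_based_offset(start_index, index_offset=1):
    """Convert a startIndex into a zero-based offset into the result list.

    index_offset is the index the server assigns to its first result:
    1 for the usual 1-based convention, 0 for a 0-based server.
    """
    if start_index < index_offset:
        raise ValueError(
            f"startIndex {start_index} is below the first valid index {index_offset}"
        )
    return start_index - index_offset


# 1-based server: the first result is startIndex=1.
print(to_zero_based_offset(1))
# 0-based server: the first result is startIndex=0.
print(to_zero_based_offset(0, index_offset=0))
```

With the default 1-based setting, a request for startIndex=0 is rejected instead of being silently treated as the first result, which is exactly the kind of mismatch the example in Google's docs would run into.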

DeWitt brings up some other good points as well.

Via Niall.

Update: Joe Gregorio weighs in

Update 2: Marc Canter (one of my favourite bloggers) finds this linkworthy ;-) although I’m always amazed at the spellings my name gets.