Sunday, October 7, 2007

Set Your Dark Data Free!

A small article in the Wall Street Journal last week reminded me of an issue that has bothered me since my university days: Dark Data.

The term "Dark Data" refers to the results of failed experiments, usually discarded by scientists and researchers on their way to proving a new theory or inventing a revolutionary new product.

During my thesis-writing days, a supposedly funny quote made the rounds: "If the facts don't support your theory, hide the facts." Even then, I didn't find this notion funny. First, it reeks of deception. If an experiment exists that negates your theory, bring it out into the open and intelligently discuss the ramifications and how you intend to overcome them. (A similar issue can be taken with car makers, who may know that something is wrong with their cars but will not disclose it until a certain number of accidents has been reached; I refer you to the movie "Fight Club".)

But the second, more important issue I take with Dark Data is that what you consider a failed experiment, a fellow scientist, maybe in a different discipline, might consider a treasure trove.

Consider the following: let's say you're a scientist on the path to discovering a new cancer cure. In one of your experiments, your drug fails to kill the cancerous cells, but hits the cells responsible for, let's say, the common cold. You will consider this a failure, but another scientist on the other side of the world, toiling in vain to find a cure for the common cold, could use this data.

Add to that the fact that many experiments are extremely expensive to run, which puts them out of reach for some researchers. So if you've already "wasted your money" on a failed experiment, why not share the results with the community?

In the past, one of the excuses for not keeping this data around was the cost of storage. But since everything nowadays is computerized and the price of storage is constantly dropping, this excuse no longer holds.

So, here's my humble call to the researchers of the world: bring your Dark Data into the light. Publish your failed experiments and let other people play with the results. The world will become a better place for it - and you will still get your credit.

More on the subject can be found in this Wired article.
