Thursday, 16 June 2011

The brilliant Genome Analysis Crowdsourcing repository

In the days following the deadly German E. coli outbreak, various 'rapid response' sequencing, assembly and annotation efforts washed across my radar (mainly via Twitter). In isolation, each of these efforts represents little more than a shop-front for its creator's (albeit impressive) capabilities. There was always the nagging feeling that a coordinated effort would have been more credible, and ultimately more useful.

Having perused a couple of the available data sets to see which file formats were being distributed, I was hoping to find a blog post that summarised them all. That's when I found the E.coli O104:H4 Genome Analysis Crowdsourcing repository at GitHub. This goes way beyond being a simple blog. It represents a living repository linking all of the data generation efforts to date. As if that in itself were not enough, there is also a day-by-day listing of analysis reports (mainly blog posts).

I now contend that "Genome Analysis Crowdsourcing", by pooling various independent data and analyses, makes these as credible and useful as any coordinated project could possibly have been, if not more so. The quantity and variety of data in the public domain, all generated within 2 weeks and linked from a central location, is staggering!

