Monday, 28 May 2012

Am I really a bioinformatician?

A bioinformatician is someone who creates computational tools that address biological questions.

A computational biologist is someone who uses computational tools to address biological questions.

There are four classes of computational tools: algorithms, workflows, databases and interfaces. I have created each in the past. I was a bioinformatician! These days most of my time is spent deploying and maintaining the computational tools of others, so am I still a bioinformatician? I'd like to hope so, and therefore propose a new classification, the operational bioinformatician, to describe myself and those like me, as distinguished from the research bioinformaticians who get all the fun creative jobs. Oh well, let's hope any future bifx union treats Perl scripting as an operational rather than research activity.

Friday, 6 January 2012

The various semantics of Open-*

Several 'Open' movements have gained traction since 'Open Source' was coined in the late nineties. Those important to me, as an applied informatics scientist, include Open Access, Open Data, Open Science and Open Innovation.

The meaning of 'Open' can, however, be somewhat different in each case. In the case of Open Source, it basically means that the end product of the (software) development process is available for anyone to see (if not always distribute) free of charge. Having the end product 'open' also holds true for Open Access and Open Data. One difference is that reuse of Open Data is generally associated with fewer restrictions than Open Access and Open Source, which are typically limited somewhat by copyright and software licenses.

At about the same time as the rise of Open Source, it was recognised that openness of the process of software development itself, in addition to that of the end product, could be a useful approach to the creation of free software. This was most famously described in Eric Raymond's The Cathedral and the Bazaar. This paradigm has been seized upon by the Open Science movement, which contends that the very process of science should be conducted in the open (e.g. blogs, live lab notebooks). It logically follows that the product of such endeavours should be Open (Access, Data, Source), but this is a secondary attribute.

Open Innovation, simply looking outside your organisation for innovative ideas, seems like the square peg in this discussion. Surely it's simple outsourcing of research and development activities? Where is the 'open' in that? However, if one accepts that the approach is epitomised by 'Open Innovation' challenge companies such as InnoCentive, then the openness is in the clear statement of requirement/specification associated with each challenge. If one accepts 'open specification' as a key attribute of Open Innovation, then it's easy to see the fit with translational research: unmet R&D requirements for domestic industries provide a justified focus for domestic science funding agencies.

It seems that we have 'Open' movements for all stages of the development lifecycle, from inception (Open Innovation), to implementation (Open Science), to end product (Open Source, Access, Data). The commonality is that the outputs of each stage are freely available for everyone to see. Openness at one stage does not, however, mandate openness at the next.