The integrative ambitions of systems biology and neuroinformatics, to construct working models of the machinery of living cells and brains, will flounder unless researchers have access to the huge amounts of diverse experimental data being collected. However, the vast majority of bioscience research data that is gathered is never made available to other researchers, partly for want of adequate software for annotating experimental data, and partly for social reasons (researchers are rarely rewarded for publishing the actual data sets, only for journal articles summarizing findings). We have developed a novel software solution aimed at making it simpler for researchers to annotate and publish their research data. The first part of this solution is a desktop application, Catalyzer, which lets researchers structure their data at source. It complements the ad hoc solutions already in use in labs (including cryptic file names, Word, Excel, and paper lab books) while being simpler and more flexible than relational databases, which are too complex for most bioscience researchers to set up. The catalogs produced by Catalyzer are stored in XML with a user-defined schema, which will simplify future data mining efforts across large numbers of distributed data sets. The approach can be summarized as ‘structure at source, integrate as required’, with the initial focus on enabling researchers to structure their own research data; only then will other researchers be able to integrate across data sets.
|Number of pages||15|
|Journal||Concurrency and Computation: Practice and Experience|
|Publication status||Published - Feb 2007|