Edinburgh Research Explorer

Challenges in Integrating Biological Data Sources

Research output: Contribution to journal (Article)

Open Access permissions: Open

Original language: English
Pages (from-to): 557-572
Number of pages: 16
Journal: Journal of Computational Biology
Volume: 2
Issue number: 4
Publication status: Published - 1995

Abstract

Scientific data of importance to biologists reside in a number of different data sources, such as GenBank, GSDB, SWISS-PROT, EMBL, and OMIM, among many others. Some of these data sources are conventional databases implemented using database management systems (DBMSs) and others are structured files maintained in a number of different formats (e.g., ASN.1 and ACE). In addition, software packages such as sequence analysis packages (e.g., BLAST and FASTA) produce data and can therefore be viewed as data sources. To counter the increasing dispersion and heterogeneity of data, different approaches to integrating these data sources are appearing throughout the bioinformatics community. This paper surveys the technical challenges to integration, classifies the approaches, and critiques the available tools and methodologies.

ID: 10624772