Edinburgh Research Explorer

Challenges in Integrating Biological Data Sources

Research output: Contribution to journal (Article)

Original language: English
Pages (from-to): 557-572
Number of pages: 16
Journal: Journal of Computational Biology
Issue number: 4
Publication status: Published - 1995


Scientific data of importance to biologists reside in a number of different data sources, such as GenBank, GSDB, SWISS-PROT, EMBL, and OMIM, among many others. Some of these data sources are conventional databases implemented using database management systems (DBMSs) and others are structured files maintained in a number of different formats (e.g., ASN.1 and ACE). In addition, software packages such as sequence analysis packages (e.g., BLAST and FASTA) produce data and can therefore be viewed as data sources. To counter the increasing dispersion and heterogeneity of data, different approaches to integrating these data sources are appearing throughout the bioinformatics community. This paper surveys the technical challenges to integration, classifies the approaches, and critiques the available tools and methodologies.

ID: 10624772