Abstract / Description of output
We present an approach to searching genetic
DNA sequences using an adaptation of the suffix
tree data structure deployed on the general
purpose persistent Java platform, PJama.
Our implementation technique is novel, in
that it allows us to build suffix trees on disk
for arbitrarily large sequences, for instance for
the longest human chromosome consisting of
263 million letters. We propose to use such
indexes as an alternative to the current practice
of serial scanning. We describe our tree
creation algorithm, analyse the performance
of our index, and discuss the interplay of the
data structure with object store architectures.
Early measurements are presented.
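
To illustrate the contrast the abstract draws between an index and serial scanning, here is a minimal in-memory sketch. It is not the paper's algorithm: it builds an uncompressed suffix trie (quadratic space, only suitable for short strings) rather than a disk-resident suffix tree, and the names `Node`, `build_suffix_trie`, `index_search`, and `serial_scan` are hypothetical, chosen for this example only.

```python
class Node:
    """One trie node: outgoing edges plus the start offsets of suffixes passing through."""

    def __init__(self):
        self.children = {}
        self.starts = []


def build_suffix_trie(text):
    """Insert every suffix of `text`; assumption: text is short enough to fit in memory."""
    root = Node()
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            if ch not in node.children:
                node.children[ch] = Node()
            node = node.children[ch]
            node.starts.append(start)
    return root


def index_search(root, pattern):
    """Walk the trie along a non-empty `pattern`; cost depends on len(pattern), not len(text)."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []
    return list(node.starts)


def serial_scan(text, pattern):
    """Baseline: test every offset of `text` in turn."""
    last = len(text) - len(pattern)
    return [i for i in range(last + 1) if text.startswith(pattern, i)]


sequence = "GATTACATTAG"
trie = build_suffix_trie(sequence)
hits_from_index = index_search(trie, "TTA")
hits_from_scan = serial_scan(sequence, "TTA")
```

Both functions return the same offsets. The difference is that the index is built once and each query then follows only as many edges as the pattern has letters, whereas the scan revisits the whole sequence for every query.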
Original language | English |
---|---|
Title of host publication | Proceedings of the 27th International Conference on Very Large Data Bases |
Place of Publication | San Francisco, CA, USA |
Publisher | Morgan Kaufmann Publishers Inc. |
Pages | 139-148 |
Number of pages | 10 |
ISBN (Print) | 1-55860-804-4 |
Publication status | Published - 2001 |