TEclass--a tool for automated classification of unknown eukaryotic transposable elements

Gyorgy Abrusan, Norbert Grundmann, Luc DeMester, Wojciech Makalowski

Research output: Contribution to journal › Article › peer-review

Abstract

MOTIVATION: The large number of sequenced genomes has required the development of software that reconstructs the consensus sequences of transposons and other repetitive elements. However, the available tools usually focus on the accurate identification of raw repeats and provide no information about the taxonomic position of the reconstructed consensus sequences. TEclass is a tool that classifies unknown transposable elements into four main functional categories, which reflect their mode of transposition: DNA transposons, long terminal repeats (LTRs), long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs). TEclass uses a support vector machine (SVM), a machine learning method, to classify sequences based on their oligomer frequencies. It achieves 90-97% accuracy in the classification of novel DNA and LTR repeats, and 75% for LINEs and SINEs.
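
To illustrate what classification "based on oligomer frequencies" involves, the following is a minimal sketch of the feature-extraction step only: turning a DNA sequence into a fixed-length vector of k-mer frequencies. It is not the TEclass implementation. The function name `oligomer_frequencies`, the choice of k, and the normalization to relative frequencies are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def oligomer_frequencies(sequence, k=4):
    """Return relative frequencies of all 4**k DNA k-mers, in lexicographic order.

    Hypothetical helper for illustration; k and the normalization are assumptions.
    """
    seq = sequence.upper()
    # Enumerate every possible k-mer over A, C, G, T so the vector has a fixed layout.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count overlapping windows of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing other characters (for example N) are left out of the total.
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


vector = oligomer_frequencies("ACGTACGTAC", k=2)
print(len(vector))
```

A vector like this, computed for each consensus sequence, could then be passed to any SVM implementation together with the known class labels of a training set.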

AVAILABILITY: http://www.compgen.uni-muenster.de/teclass; a stand-alone program is available upon request.

Original language: English
Pages (from-to): 1329-30
Number of pages: 2
Journal: Bioinformatics
Volume: 25
Issue number: 10
Publication status: Published - 15 May 2009

Keywords

  • Computational Biology
  • DNA Transposable Elements
  • Eukaryotic Cells
  • Repetitive Sequences, Nucleic Acid
  • Short Interspersed Nucleotide Elements
  • Software
  • Terminal Repeat Sequences
