Edinburgh Research Explorer

Parallel classification and feature selection in microarray data using SPRINT

Research output: Contribution to journal (Article)

Open Access permissions: Open

Documents

  • Final published version, 208 KB, PDF-document

    Rights statement: Available under Open Access. Copyright © 1999–2013 John Wiley & Sons, Inc. All Rights Reserved.

http://onlinelibrary.wiley.com/doi/10.1002/cpe.2928/full
Original language: English
Pages (from-to): 854-865
Number of pages: 12
Journal: Concurrency and Computation: Practice and Experience
Volume: 26
Issue number: 4
Early online date: 13 Sep 2012
DOI: 10.1002/cpe.2928
Publication status: Published - Mar 2014

Abstract

The statistical language R is favoured by many biostatisticians for processing microarray data. In recent times, the quantity of data that can be obtained in experiments has risen significantly, making previously fast analyses time-consuming or even impossible with the existing software infrastructure. High performance computing (HPC) systems offer a solution to these problems, but at the expense of increased complexity for the end user. The Simple Parallel R Interface (SPRINT) is a library for R that aims to reduce the complexity of using HPC systems by providing biostatisticians with drop-in parallelised replacements of existing R functions. In this paper, we describe parallel implementations of two popular techniques: exploratory clustering analyses using the random forest classifier and feature selection through identification of differentially expressed genes using the rank product method.
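
For readers unfamiliar with the rank product method named in the abstract, the core statistic can be sketched as follows. This is a minimal serial illustration in Python, not the parallel R implementation the paper describes: the function name `rank_product`, the toy data, and the tie-breaking rule (ties fall back to gene order) are all assumptions made for the example. Each gene is ranked by fold change within every replicate, and its score is the geometric mean of those ranks. In practice, significance is usually assessed by repeating the calculation on permuted data, which is presumably the costly step that benefits from parallelisation.

```python
import math


def rank_product(fold_changes):
    """Return the rank product of each gene.

    fold_changes: list of replicates, each a list of per-gene
    (log) fold changes. Rank 1 is the largest fold change, so a
    small rank product suggests consistent up-regulation.
    """
    n_reps = len(fold_changes)
    n_genes = len(fold_changes[0])
    log_rank_sums = [0.0] * n_genes
    for rep in fold_changes:
        # Gene indices ordered by decreasing fold change.
        # sorted() is stable, so ties keep their original gene order.
        order = sorted(range(n_genes), key=lambda g: rep[g], reverse=True)
        for rank, g in enumerate(order, start=1):
            # Sum logs rather than multiplying ranks to avoid overflow.
            log_rank_sums[g] += math.log(rank)
    # Geometric mean of the ranks across replicates.
    return [math.exp(s / n_reps) for s in log_rank_sums]


# Hypothetical toy data: 3 replicates x 4 genes.
toy = [
    [2.0, 0.1, -1.0, 0.5],
    [1.8, 0.3, -0.7, 0.2],
    [2.2, -0.2, -1.2, 0.4],
]
rp = rank_product(toy)
best_gene = min(range(len(rp)), key=lambda g: rp[g])
print(rp, best_gene)
```

In the toy data, the first gene has the largest fold change in every replicate, so it receives the smallest possible rank product.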

ID: 4823225