A parallel random forest classifier for R

Lawrence Mitchell, Terence Sloan, Muriel Mewissen, Peter Ghazal, Thorsten Forster, Michal Piotrowski, Arthur S. Trew

Research output: Contribution to conference, Paper, peer-review

Abstract

The statistical language R is favoured by many biostatisticians for processing microarray data. In recent times, the quantity of data that can be obtained in experiments has risen significantly, making previously fast analyses time consuming, or even not possible at all with the existing software infrastructure. High Performance Computing (HPC) systems offer a solution to these problems, but at the expense of increased complexity for the end user. The Simple Parallel R Interface (SPRINT) is a library for R that aims to reduce the complexity of using HPC systems by providing biostatisticians with drop-in parallelized replacements of existing R functions. In this paper we describe the implementation of a parallel version of the Random Forest classifier in the SPRINT library.
Original language: English
Pages: 1-6
Number of pages: 7
Publication status: Published - 1 Jan 2011
Event: second international workshop on Emerging computational methods for the life sciences - San Jose, United States
Duration: 8 Jun 2011 → …

Conference

Conference: second international workshop on Emerging computational methods for the life sciences
Country/Territory: United States
City: San Jose
Period: 8/06/11 → …

  • SPRINTing further with HECToR

    Sloan, T. (Principal Investigator), Mewissen, M. (Co-investigator) & Mitchell, L. (Researcher)

    1/10/10 → 30/04/11

    Project: Awarded Facility Time

  • HPC Enabling of R Application Software

    Sloan, T. (Principal Investigator) & Ghazal, P. (Co-investigator)

    UK-based charities

    1/04/09 → 31/03/11

    Project: Research
