Edinburgh Research Explorer

SPRINTing with HECToR

Project: Awarded Facility Time

  • Sloan, Terence (Principal Investigator)
  • Mewissen, Muriel (Co-investigator)
  • Petrou, Savvas (Researcher)
Status: Finished
Effective start/end date: 1/10/09 to 31/03/10
Period: 1/10/09 to 31/03/10

Description

This Distributed Computational Science and Engineering (dCSE) project optimised SPRINT for use on HECToR, the UK's national supercomputing service. SPRINT is an add-on package for the R language and environment for statistical computing and graphics. SPRINT (Simple Parallel R INTerface) offers both a library of parallel functions and an interface for adding parallel functions to R.

Key findings

- An installation guide describing how to compile SPRINT on HECToR was produced.
- The performance of the parallel correlation function (pcor) now scales up to 512 processes. Originally, all results were gathered on and written by the master process. The results are now distributed among all processes and written to the output file with MPI-I/O, taking advantage of the underlying high-performance Lustre filesystem.
- The permutation testing function (mt.maxT) was parallelised to give pmaxT. The parallelism is introduced by dividing the permutation count equally among the available processes. Each process gathers a few of the observations, and at the end all partial observations are reduced on the master process. The p-values are then computed from this reduced information.
- Based on the benchmarks performed on the HECToR XT4 system, both functions now scale close to optimally for process counts up to 512. Statisticians can use the parallel versions of these functions to process large data sets and obtain results within reasonable run times.
- The work performed under this dCSE project was presented at HPDC 2010 and useR! 2010.
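
The way pmaxT divides the permutation count among processes and then reduces the partial results can be illustrated with a minimal sketch. This is not the SPRINT implementation: it is plain Python rather than R with MPI, the "processes" are simulated by a loop, the test statistic is a simple difference in group means rather than the maxT procedure, and the function names (split_counts, partial_count, permutation_p_value) as well as the (hits + 1) / (permutations + 1) p-value convention are assumptions made for illustration.

```python
import random
from statistics import mean


def split_counts(total, n_procs):
    """Divide `total` permutations as evenly as possible among n_procs."""
    base, extra = divmod(total, n_procs)
    return [base + (1 if rank < extra else 0) for rank in range(n_procs)]


def stat(values, labels):
    """Absolute difference in group means (a simple stand-in statistic)."""
    g0 = [v for v, l in zip(values, labels) if l == 0]
    g1 = [v for v, l in zip(values, labels) if l == 1]
    return abs(mean(g1) - mean(g0))


def partial_count(values, labels, observed, n_perms, seed):
    """Work of one simulated process: count permuted statistics >= observed."""
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(shuffled)
        if stat(values, shuffled) >= observed:
            hits += 1
    return hits


def permutation_p_value(values, labels, total_perms, n_procs, seed=0):
    observed = stat(values, labels)
    counts = split_counts(total_perms, n_procs)
    # Each simulated process uses its own seed so the random streams differ.
    partials = [
        partial_count(values, labels, observed, n, seed + rank)
        for rank, n in enumerate(counts)
    ]
    # Reduction step: the partial counts are summed on the "master".
    total_hits = sum(partials)
    # Assumed convention: count the observed labelling as one permutation.
    return (total_hits + 1) / (total_perms + 1)


values = [5.1, 4.9, 5.3, 5.0, 7.2, 7.0, 6.9, 7.4]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
p = permutation_p_value(values, labels, total_perms=1000, n_procs=4)
print(f"estimated p-value: {p:.4f}")
```

Because each simulated process only returns a single count, the reduction is a sum; in a real MPI setting the same pattern would typically map onto a reduce operation across ranks.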