These files contain genome-wide integrated haplotype scores (iHS) for each of the 26 populations in the phase 3 release of the 1000 Genomes Project. The iHS values were calculated with the hapbin program, which can be downloaded from https://github.com/evotools/hapbin. The 1000 Genomes phased haplotypes were obtained from mathgen.stats.ox.ac.uk/impute, and hapbin was run with default parameters. The iHS values are provided in two formats: BED and bedGraph. For each SNP, the score shown is that of the allele with the positive iHS. In the BED format files, that allele (0 or 1, as annotated in the original IMPUTE data) is indicated in the fourth column, following the “:”. The bedGraph formatted data can be viewed along the genome in the UCSC genome browser by specifying the URL of the corresponding file at http://genome-euro.ucsc.edu/cgi-bin/hgCustom?clade=mammal&org=Human&db=hg37.
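As an illustration of the BED layout described above, the following minimal Python sketch pulls the allele out of the fourth column. It assumes whitespace-separated columns and a fourth column of the form "<label>:<allele>"; the label text, the example SNP name and the function names are hypothetical and are not taken from the files themselves.

```python
def parse_bed_name(name_field):
    """Split a BED name field of the assumed form '<label>:<allele>'."""
    # Split on the last ':' so that any ':' inside the label is preserved.
    label, sep, allele = name_field.rpartition(":")
    if not sep or allele not in ("0", "1"):
        raise ValueError(f"unexpected name field: {name_field!r}")
    return label, int(allele)


def parse_bed_line(line):
    """Parse one BED record into a dict; columns after the fourth are kept as-is."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least four columns")
    label, allele = parse_bed_name(fields[3])
    return {
        "chrom": fields[0],
        "start": int(fields[1]),
        "end": int(fields[2]),
        "label": label,
        "allele": allele,      # 0 or 1, as annotated in the IMPUTE data
        "extra": fields[4:],   # any further columns, left uninterpreted
    }


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    print(parse_bed_line("chr1\t100\t101\tsnp_example:1\t2.5"))
```

Columns beyond the fourth are deliberately left uninterpreted, since their meaning is given in the accompanying documentation rather than here.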
The following files are available for download here:
(a) Documentation relating to methods ("fileDescriptions.txt"/"fileDescriptions.pdf");
(b) 572 files in bedGraph format (.bg.gz), covering different chromosomes and different populations from the 1000 Genomes Project. These may be decompressed using gunzip (on Linux/UNIX) or, for example, 7-Zip on Windows.
(c) One tar (tape archive) file containing 572 files in BED format, similarly covering different chromosomes and populations. These may be extracted using the tar program (on Linux/UNIX) or, for example, 7-Zip on Windows.
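The files can also be read without unpacking them first. The sketch below uses only the Python standard library (gzip and tarfile) and assumes that each bedGraph data line holds the standard four columns: chromosome, start, end and value. The function names are illustrative, and any path passed to them is a placeholder for a downloaded file.

```python
import gzip
import tarfile


def read_bedgraph(path):
    """Yield (chrom, start, end, value) tuples from a gzipped bedGraph file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and optional track/browser/comment lines.
            if not line.strip() or line.startswith(("track", "browser", "#")):
                continue
            chrom, start, end, value = line.split()[:4]
            yield chrom, int(start), int(end), float(value)


def list_tar_members(path):
    """Return the names of the regular files stored in a tar archive."""
    with tarfile.open(path) as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]
```

Reading through a generator keeps memory use low, which matters when scanning many per-chromosome files in sequence.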
Prendergast, James; Maclean, Colin A.; Chue Hong, Neil. (2015). hapbin: An efficient program for performing haplotype based scans for positive selection in large genomic datasets, [Dataset]. University of Edinburgh. Roslin Institute. http://dx.doi.org/10.7488/ds/214.