Transcriptional control plays a key role in development and disease, with trans-acting factors (TFs) regulating gene expression through interaction with DNA. ChIP sequencing is widely used to obtain genome-wide binding profiles of TFs in a cell type of interest. Falling sequencing costs and technological improvements have led to a vast amount of ChIP sequencing data accumulating in the public domain. The ENCODE consortium alone provides 690 publicly available ChIP sequencing data sets across 91 human cell types. We performed a multifaceted bioinformatics analysis of these data to unravel diverse properties of TFs in their cellular context. Specifically, we characterised the genomic location and the sequence motif preference of the factors. We demonstrated that distal binding of factors is more cell-type specific than promoter-proximal binding. We identified combinations of factors acting in concert at distinct genomic loci. Finally, we highlighted how these data can be used to associate novel regulators with disease by integrating them with disease-associated gene loci obtained from genome-wide association studies (GWAS).
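The distinction between promoter-proximal and distal binding can be illustrated with a minimal sketch. It is not the pipeline used in this work: it assumes each peak is reduced to a single summit coordinate, that transcription start sites (TSSs) are available as sorted positions per chromosome, and that a peak within 1 kb of a TSS counts as proximal. The cutoff and the names `PROXIMAL_WINDOW_BP`, `nearest_tss_distance`, and `classify_peaks` are illustrative assumptions.

```python
from bisect import bisect_left

# Assumed cutoff: a summit within 1 kb of a TSS is called promoter-proximal.
# This value is illustrative; it is not stated in the abstract.
PROXIMAL_WINDOW_BP = 1000


def nearest_tss_distance(summit, sorted_tss):
    """Return the distance in bp from a peak summit to the closest TSS.

    sorted_tss must be a non-empty list of TSS coordinates in ascending order.
    """
    if not sorted_tss:
        raise ValueError("no TSS positions supplied")
    i = bisect_left(sorted_tss, summit)
    # Only the TSS just before and just after the summit can be the nearest.
    neighbours = sorted_tss[max(i - 1, 0):i + 1]
    return min(abs(summit - tss) for tss in neighbours)


def classify_peaks(peaks, tss_by_chrom, window=PROXIMAL_WINDOW_BP):
    """Label each (chromosome, summit) pair as 'proximal' or 'distal'.

    Peaks on chromosomes without any annotated TSS are labelled 'distal'.
    """
    labels = []
    for chrom, summit in peaks:
        tss = tss_by_chrom.get(chrom, [])
        if tss and nearest_tss_distance(summit, tss) <= window:
            labels.append("proximal")
        else:
            labels.append("distal")
    return labels


# Toy coordinates, invented purely for illustration.
example_tss = {"chr1": [1000, 50000]}
example_peaks = [("chr1", 1500), ("chr1", 20000), ("chr2", 10)]
example_labels = classify_peaks(example_peaks, example_tss)
```

Applying such a labelling to the same factor in several cell types would allow the overlap of proximal and distal peak sets to be compared separately, which is one way to examine cell-type specificity.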
|Name||Bioinformatics and Biomedical Engineering|
- ChIP sequencing
- sequence motif