Edinburgh Research Explorer

Bayesian reassessment of the epigenetic architecture of complex traits

Research output: Contribution to journalArticle

  • Daniel Trejo Baños
  • Marion Patxot Beltran
  • Lucas Anchieri
  • Tom Battram
  • Colette Christiansen
  • Ricardo Costeira
  • Stewart W Morris
  • Qian Zhang
  • Allan F. Mcrae
  • Naomi R. Wray
  • Peter M Visscher
  • Gibran Hemani
  • Jordana Tzenova Bell
  • Matthew R. Robinson

Related Edinburgh Organisations

Open Access permissions



  • Download as Adobe PDF

    Rights statement: Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. © The Author(s) 2020

    Final published version, 1.41 MB, PDF document

    Licence: Creative Commons: Attribution (CC-BY)

Original languageEnglish
JournalNature Communications
Publication statusPublished - 8 Jun 2020


Epigenetic DNA modification is partly under genetic control, and occurs in response to a wide range of environmental exposures. Linking epigenetic marks to clinical outcomes may provide greater insight into underlying molecular processes of disease, assist in the identification of therapeutic targets, and improve risk prediction. Here, we present a statistical approach, based on Bayesian inference, that estimates associations between disease risk and all measured epigenetic probes jointly, automatically controlling for both data structure (including cell-count effects, relatedness, and experimental batch effects) and correlations among probes. We benchmark our approach in simulation study, finding improved estimation of probe associations across a wide range of scenarios over existing approaches. Our method estimates the total proportion of disease risk captured by epigenetic probe variation, and when we applied it to measures of body mass index (BMI) and cigarette consumption behaviour in 5,101 individuals, we find that 66.7% (95% CI 60.0-72.8) of the variation in BMI and 67.7% (95% CI 58.4-76.9) of the variation in cigarette consumption can be captured by methylation array data from whole blood, independent of the variation explained by single nucleotide polymorphism markers. We find novel associations, with smoking behaviour associated with a methylation probe at the MNDA gene with >95% posterior inclusion probability, which is a myeloid cell nuclear differentiation antigen gene previously implicated as a biomarker for inflammation and non-Hodgkin lymphoma risk. We conduct unique genome-wide enrichment analyses, identifying blood cholesterol, lipid transport and sterol metabolism pathways for BMI, and response to xenobiotic stimulus and negative regulation of RNA polymerase II promoter transcription for smoking, all with >95% posterior inclusion probability of having methylation probes with associations >1.5 times larger than the average. Finally, we improve phenotypic prediction in two independent cohorts by 28.7% and 10.2% for BMI and smoking respectively over a LASSO model. These results imply that probe measures may capture large amounts of variance because they are likely a consequence of the phenotype rather than a cause. As a result, ‘omics’ data may enable accurate characterization of disease progression and identification of individuals who are on a path to disease. Our approach facilitates better understanding of the underlying epigenetic architecture of complex common disease and is applicable to any kind of genomics data.

Download statistics

No data available

ID: 146473007