Abstract
Escherichia coli is a species of bacteria that can be present in a wide variety of mammalian hosts and potentially soil environments. E. coli has an open genome and can show considerable diversity in gene content between isolates. It is a reasonable assumption that gene content reflects evolution of strains in particular host environments and therefore can be used to predict the host most likely to be the source of an isolate. An extrapolation of this argument is that strains may also have gene content that favors success in multiple hosts and so the possibility of successful transmission from one host to another, for example, from cattle to human, can also be predicted based on gene content. In this methods chapter, we consider the issue of Shiga toxin (Stx)-producing E. coli (STEC) strains that are present in ruminants as the main host reservoir and for which we know that a subset causes life-threatening infections in humans. We show how the genome sequences of E. coli isolated from both cattle and humans can be used to build a classifier to predict human and cattle host association and how this can be applied to score key STEC serotypes known to be associated with human infection. With the example dataset used, serogroups O157, O26, and O111 show the highest, and O103 and O145 the lowest, predictions for human association. The long-term ambition is to combine such machine learning predictions with phylogeny to predict the zoonotic threat of an isolate based on its whole genome sequence (WGS).
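As a rough illustration of the idea of predicting host association from gene content, the sketch below trains a small Bernoulli naive Bayes classifier on binary gene presence/absence profiles and scores one isolate. It is a minimal sketch under stated assumptions, not the method used in the chapter: the abstract does not name the classifier, naive Bayes is swapped in here only because it fits in a few lines of standard-library Python, and the gene columns, isolates, and labels are invented toy data.

```python
import math

def train_bernoulli_nb(profiles, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary gene presence/absence profiles.

    Illustrative stand-in only; the chapter's actual classifier is not stated
    in the abstract.
    """
    classes = sorted(set(labels))
    n_genes = len(profiles[0])
    model = {}
    for c in classes:
        rows = [p for p, y in zip(profiles, labels) if y == c]
        # Laplace-smoothed probability that each gene is present in host class c
        gene_prob = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_genes)
        ]
        log_prior = math.log(len(rows) / len(profiles))
        model[c] = (log_prior, gene_prob)
    return model

def predict_proba(model, profile):
    """Return the posterior probability of each host class for one isolate."""
    log_scores = {}
    for c, (log_prior, gene_prob) in model.items():
        s = log_prior
        for present, p in zip(profile, gene_prob):
            s += math.log(p if present else 1.0 - p)
        log_scores[c] = s
    # Normalise in log space to avoid underflow with many genes
    m = max(log_scores.values())
    exp_scores = {c: math.exp(s - m) for c, s in log_scores.items()}
    total = sum(exp_scores.values())
    return {c: v / total for c, v in exp_scores.items()}

# Toy, made-up data: rows are isolates, columns are hypothetical accessory genes
profiles = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
]
labels = ["human", "human", "human", "cattle", "cattle", "cattle"]

model = train_bernoulli_nb(profiles, labels)
scores = predict_proba(model, [1, 1, 0, 0])
print(scores)
```

In a real analysis the profile columns would come from a pan-genome presence/absence matrix derived from WGS data, and the per-isolate human-association probability could then be summarised by serogroup.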
Original language | English |
---|---|
Title of host publication | Shiga Toxin-Producing E. coli |
Editors | John M. Walker, Stephanie Schuller, Martina Bielaszewska |
Publisher | Springer-Verlag |
Pages | 99-117 |
Number of pages | 19 |
Volume | 2291 |
ISBN (Electronic) | 978-1-0716-1339-9 |
DOIs | |
Publication status | E-pub ahead of print - 12 Mar 2021 |
Publication series
Name | Methods in molecular biology (Clifton, N.J.) |
---|---|
Publisher | Humana Press |
ISSN (Print) | 1064-3745 |
Keywords
- Machine learning
- Host attribution
- Zoonotic threat
- STEC
- Whole genome sequence (WGS)
Fingerprint
Dive into the research topics of 'Predicting Host Association for Shiga Toxin-Producing E. coli Serogroups by Machine Learning'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Machine-learning to predict and understand the zoonotic threat of E.Coli 0157 isolates
1/10/17 → 31/01/21
Project: Research