Computational predictors of genetic variant effect have advanced rapidly in recent years. These programs provide clinical and research laboratories with a rapid and scalable method to assess the likely impacts of novel variants. However, it can be difficult to know to what extent we can trust their results. To benchmark their performance, predictors are often tested against large datasets of known pathogenic and benign variants. These benchmarking data may overlap with the data used to train some supervised predictors, which leads to data re-use or circularity, resulting in inflated performance estimates for those predictors. Furthermore, new predictors are usually found by their authors to be superior to all previous predictors, which suggests some degree of computational bias in their benchmarking. Large-scale functional assays known as deep mutational scans provide one possible solution to this problem, providing independent datasets of variant effect measurements. In this Review, we discuss some of the key advances in predictor methodology, current benchmarking strategies and how data derived from deep mutational scans can be used to overcome the issue of data circularity. We also discuss the ability of such functional assays to directly predict clinical impacts of mutations and how this might affect the future need for variant effect predictors.
|Journal||Disease Models and Mechanisms|
|Publication status||Published - 1 Jun 2022|
- Computational Biology/methods