Visual cleaning of genotype data

J. Kennedy, M. Graham, T. Paterson, A. Law

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract / Description of output

While some data cleaning tasks can be performed automatically, many more require expert human guidance to steer the cleaning process, especially when erroneous or unclean data is a product of relationships between entities. An example is pedigree genotype data: inheritance hierarchies in which the correctness of an individual's genotype data is judged by comparison with the genotypes of their relations, as individuals should inherit DNA from their assumed ancestors. Cleaning this data must therefore consider the relationships between individuals; sometimes this means more data must be cleaned than first assumed, while in other situations errors across many individuals can be remedied by cleaning the data of a single shared relation. Such judgements require a domain expert to hypothesise the effect that changing particular data has on the wider data set. A visualisation tool that supports what-if interactions can assist a user in correctly cleaning such data. We achieve this by closely coupling an existing pedigree visualisation technique, VIPER, with a genotype cleaning algorithm, and then developing the extensions to the visualisation needed to allow interactive data cleaning. A comparative user evaluation with biologists shows the advantages of this visualisation design over an existing cleaning tool, and we discuss the challenges in designing visual cleaning tools in which errors may be transitive.
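As an illustration of the kind of relational check and what-if reasoning the abstract describes, the following is a minimal sketch only. It is not the cleaning algorithm or the VIPER tool from the chapter. It assumes a single marker, unordered two-allele genotypes, and a pedigree given as a mapping from each individual to its (sire, dam) pair; the function names, individual identifiers, and genotypes are all hypothetical.

```python
def is_consistent(child, sire, dam):
    """True if the child could have received one allele from each parent."""
    a, b = child
    return (a in sire and b in dam) or (b in sire and a in dam)


def find_errors(pedigree, genotypes):
    """List individuals whose genotype conflicts with both known parents.

    Individuals with a missing genotype, or with a parent whose genotype
    is missing, are skipped because they cannot be checked.
    """
    errors = []
    for individual, (sire, dam) in pedigree.items():
        if all(k in genotypes for k in (individual, sire, dam)):
            if not is_consistent(genotypes[individual],
                                 genotypes[sire], genotypes[dam]):
                errors.append(individual)
    return errors


def what_if_masked(pedigree, genotypes, individual):
    """What-if: errors that would remain if one genotype were set to unknown."""
    trial = {k: v for k, v in genotypes.items() if k != individual}
    return find_errors(pedigree, trial)


# Hypothetical data: three offspring share the same sire and dam.
pedigree = {"c1": ("sire", "dam"), "c2": ("sire", "dam"), "c3": ("sire", "dam")}
genotypes = {
    "sire": ("A", "A"),  # assumed to be the mistyped record
    "dam": ("A", "A"),
    "c1": ("A", "B"),
    "c2": ("A", "B"),
    "c3": ("A", "A"),
}

print(find_errors(pedigree, genotypes))
print(what_if_masked(pedigree, genotypes, "c1"))
print(what_if_masked(pedigree, genotypes, "sire"))
```

In this toy data two offspring are flagged. Masking one offspring leaves the other flagged, whereas masking the shared sire leaves no detectable inconsistency, which mirrors the abstract's point that errors across many individuals can sometimes be remedied by cleaning a single shared relation. Masking removes the evidence rather than proving the sire is wrong, so the final judgement still rests with a domain expert.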
Original language: English
Title of host publication: BioVis 2013 - IEEE Symposium on Biological Data Visualization 2013, Proceedings
Editors: Jos Roerdink, Jessie Kennedy
Pages: 105-112
Number of pages: 8
Publication status: Published - 1 Jan 2013
Event: BioVis 2013 - IEEE Symposium on Biological Data Visualization 2013 - Atlanta, United States
Duration: 13 Oct 2013 to 14 Oct 2013

Conference

Conference: BioVis 2013 - IEEE Symposium on Biological Data Visualization 2013
Country/Territory: United States
City: Atlanta
Period: 13/10/13 to 14/10/13
