A decision-tree-based alarming system for the validation of national genetic evaluations

S. Diplaris*, A. L. Symeonidis, P. A. Mitkas, G. Banos, Z. Abas

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


The aim of this work was to explore possibilities to build an alarming system based on the results of the application of data mining (DM) techniques in genetic evaluations of dairy cattle, in order to assess and assure data quality. The technique combined data mining based on classification and decision-tree algorithms with Gaussian binned fitting functions and hypothesis tests. Data were quarterly national genetic evaluations, computed between February 1999 and February 2003 in nine countries. Each evaluation run included 73,000-90,000 bull records complete with their genetic values and evaluation information. Milk production traits were considered. Data mining algorithms were applied separately for each country and evaluation run to search for associations across several dimensions, including bull origin, type of proof, age of bull, and number of daughters. Then, data in each node were fitted to the Gaussian function and the quality of the fit was measured, thus providing a measure of the quality of data. In order to evaluate and ultimately predict decision-tree models, the implemented architecture can compare the node probabilities between two models and decide on their similarity, using hypothesis tests for the standard deviation of their distribution. The key utility of this technique lies in its capacity to identify the exact node where anomalies occur, and to fire a focused alarm pointing to erroneous data. © 2006 Elsevier B.V. All rights reserved.
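The per-node fitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the fit-quality measure, so a reduced chi-square over histogram bins is assumed here, and the function names, the bin count, and the alarm threshold are all hypothetical. The hypothesis tests on standard deviations used to compare two models are not covered.

```python
import math
from statistics import NormalDist, fmean, stdev


def binned_gaussian_fit(values, n_bins=10):
    """Fit a Gaussian to the values in one tree node and score the fit.

    Returns (mean, std, reduced_chi_square). The reduced chi-square as a
    quality measure is an assumption made for this sketch.
    """
    mu, sigma = fmean(values), stdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; nothing to fit")
    dist = NormalDist(mu, sigma)

    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins

    # Observed counts per equal-width bin (the maximum goes in the last bin).
    observed = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        observed[idx] += 1

    # Compare observed counts with the counts the fitted Gaussian predicts.
    chi2, used = 0.0, 0
    for i, obs in enumerate(observed):
        left, right = lo + i * width, lo + (i + 1) * width
        expected = len(values) * (dist.cdf(right) - dist.cdf(left))
        if expected >= 5:  # rule of thumb: skip sparsely populated bins
            chi2 += (obs - expected) ** 2 / expected
            used += 1

    dof = max(used - 3, 1)  # bins used, minus two fitted parameters, minus one
    return mu, sigma, chi2 / dof


def node_alarm(values, threshold=5.0):
    """Flag a node whose data deviate strongly from the fitted Gaussian.

    The threshold is a hypothetical tuning parameter, not a published value.
    """
    return binned_gaussian_fit(values)[2] > threshold
```

In this sketch a node holding roughly Gaussian values yields a reduced chi-square near 1, while a node whose values split into two separated groups yields a much larger score and would raise the alarm for that node alone.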

Original language: English
Pages (from-to): 21-35
Number of pages: 15
Journal: Computers and electronics in agriculture
Issue number: 1-2
Publication status: Published - Jun 2006


Keywords

  • data mining
  • alarming technique
  • quality control
  • genetic evaluations
  • dairy cattle evaluations
