Edinburgh Research Explorer

Extracting Statistically Significant Behaviour from Fish Tracking Data With and Without Large Dataset Cleaning

Research output: Contribution to journal › Article

Open Access permissions

Open

Documents

  • Accepted author manuscript, 824 KB, PDF document

    Licence: Creative Commons: Attribution (CC-BY)

http://digital-library.theiet.org/content/journals/iet-cvi
Original language: English
Number of pages: 26
Journal: IET Computer Vision
Publication status: Published - 1 Nov 2017

Abstract

Extracting a statistically significant result from video data of natural phenomena can be difficult for two reasons: i) there can be considerable natural variation in the observed behaviour, and ii) computer vision algorithms applied to natural phenomena may not perform correctly on a significant number of samples. This paper presents one approach to cleaning a large, noisy visual tracking dataset so that statistically sound results can be extracted from the image data. In particular, the paper presents an analysis of a dataset of 3.6 million underwater trajectories of a species of fish, each labelled with the water temperature at the time of acquisition. Although there are many false detections and incorrect trajectory assignments, a combination of data binning and robust estimation methods yields reliable evidence for an increase in fish speed as water temperature increases. We also present a data cleaning method that uses an effective deep-learning-based clustering algorithm to remove outliers arising from false detections and incorrect trajectory assignments. The results on the cleaned data likewise show a rise in fish speed as temperature goes up. Several statistical tests applied to both the cleaned and the uncleaned data confirm that both results are statistically significant and show an increasing (non-random) trend. However, the cleaning-based approach also produces a cleaner dataset suitable for other analyses.
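
As an illustration of the binning-plus-robust-estimation idea, the sketch below groups speed samples into temperature bins, takes the median speed of each bin, and applies a Mann-Kendall trend test to the sequence of medians. The abstract does not state which bin width, robust estimator, or statistical tests the paper uses, so the 1-degree bins, the median, the Mann-Kendall test, and the function names `binned_medians` and `mann_kendall` are all assumptions made for this example, and the data in the demo are synthetic.

```python
import math
import random
from collections import defaultdict
from statistics import median


def binned_medians(temps, speeds, bin_width=1.0):
    """Group speeds by temperature bin; return (bin_centre, median_speed) pairs.

    The median is used as a simple robust estimator (an assumption here):
    a minority of wildly wrong speeds in a bin barely moves it.
    """
    bins = defaultdict(list)
    for t, s in zip(temps, speeds):
        bins[math.floor(t / bin_width)].append(s)
    return [((k + 0.5) * bin_width, median(v)) for k, v in sorted(bins.items())]


def mann_kendall(values):
    """Return (S, z, two-sided p) for the Mann-Kendall trend test.

    Uses the normal approximation without a tie correction, so it is only
    a rough guide for short or heavily tied sequences.
    """
    n = len(values)
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return s, z, p


if __name__ == "__main__":
    # Synthetic data only: speed rises slowly with temperature, and 20% of
    # samples are replaced by implausible speeds to mimic false detections.
    rng = random.Random(0)
    temps, speeds = [], []
    for _ in range(5000):
        t = rng.uniform(20.0, 30.0)
        v = 1.0 + 0.05 * t + rng.gauss(0.0, 0.2)
        if rng.random() < 0.2:
            v = rng.uniform(0.0, 50.0)
        temps.append(t)
        speeds.append(v)

    medians = [m for _, m in binned_medians(temps, speeds)]
    s, z, p = mann_kendall(medians)
    print(f"S={s}, z={z:.2f}, p={p:.4g}")
```

A positive S with a small p-value suggests an increasing trend in the per-bin medians despite the injected outliers. The paper's alternative approach, removing outliers by clustering before analysis, is not reproduced here.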

ID: 46491538