Edinburgh Research Explorer

Analysis of Video Feature Learning in Two-Stream CNNs on the Example of Zebrafish Swim Bout Classification

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Accepted author manuscript, 1.43 MB, PDF document
Licence: Creative Commons: Attribution (CC-BY)

Original language: English
Title of host publication: Proceedings of the International Conference on Learning Representations, 2020
Place of publication: Addis Ababa, Ethiopia
Number of pages: 18
Publication status: Published - 30 Apr 2020
Event: Eighth International Conference on Learning Representations - Millennium Hall, Virtual conference formerly Addis Ababa, Ethiopia
Duration: 26 Apr 2020 to 30 Apr 2020


Conference: Eighth International Conference on Learning Representations
Abbreviated title: ICLR 2020
City: Virtual conference formerly Addis Ababa


Semmelhack et al. (2014) achieved high accuracy in classifying zebrafish swim bouts using a Support Vector Machine (SVM). Convolutional Neural Networks (CNNs) have outperformed SVMs in various image recognition tasks, but these powerful networks remain a black box. Greater transparency helps build trust in their classifications and makes the learned features interpretable to experts. Using a recently developed technique called Deep Taylor Decomposition, we generated heatmaps that highlight the input regions most relevant to each prediction. We find that our CNN makes predictions by analyzing the steadiness of the tail's trunk, which differs markedly from the manually extracted features used by Semmelhack et al. (2014). We further found that the network attended to experimental artifacts; removing these artifacts ensured the validity of its predictions. After this correction, our best CNN beats the SVM by 6.12%, achieving a classification accuracy of 96.32%. Our work thus demonstrates the utility of AI explainability for CNNs.
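
To make the heatmap idea concrete, the following is a minimal sketch of the z+ propagation rule commonly associated with Deep Taylor Decomposition, applied to a toy two-layer dense ReLU network. It is not the authors' implementation: the network, its weights, the input, and the helper name zplus_backward are all hypothetical and chosen only for illustration, and the paper's two-stream CNN is not reproduced here. The toy input is non-negative, so the z+ rule is used for both layers. The sketch starts from the winning class score and redistributes it backwards layer by layer, so each input value receives a non-negative relevance share.

```python
import numpy as np

def zplus_backward(activations, weights, relevance_out, eps=1e-9):
    """Redistribute relevance from a layer's outputs to its non-negative inputs (z+ rule)."""
    w_pos = np.maximum(weights, 0.0)      # only positive weights carry relevance
    z = activations @ w_pos + eps         # positive contribution reaching each output neuron
    s = relevance_out / z                 # relevance per unit of positive contribution
    return activations * (s @ w_pos.T)    # share of relevance assigned to each input

# Hypothetical toy network: 4 inputs -> 3 hidden units -> 2 class scores.
x = np.array([1.0, 0.5, 0.0, 2.0])
w1 = np.array([[ 1.0, -1.0,  0.5],
               [ 0.5,  1.0, -0.5],
               [-1.0,  0.5,  1.0],
               [ 0.2,  0.3, -0.1]])
w2 = np.array([[ 1.0, -0.5],
               [-1.0,  1.0],
               [ 0.5,  0.5]])

# Forward pass with ReLU activations.
h = np.maximum(x @ w1, 0.0)
out = np.maximum(h @ w2, 0.0)

# Start from the score of the predicted class only.
r_out = np.zeros_like(out)
k = int(np.argmax(out))
r_out[k] = out[k]

# Backward relevance pass, one layer at a time.
r_hidden = zplus_backward(h, w2, r_out)
r_input = zplus_backward(x, w1, r_hidden)

print("relevance per input:", r_input)
print("total relevance in / out:", r_input.sum(), r_out.sum())
```

In this sketch the total relevance is (approximately) conserved from the class score down to the inputs, and an input whose value is zero receives zero relevance. For video or image inputs, arranging the per-input relevances in the shape of the input gives the kind of heatmap described above.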

    Research areas

  • cs.CV, cs.LG, eess.IV




ID: 131000066