Enhancing Music Information Retrieval by Incorporating Image-Based Local Features

Leszek Kaliciak, Ben Horsburgh, Dawei Song, Nirmalie Wiratunga, Jeff Pan

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution


This paper presents a novel approach to music genre classification. Having represented music tracks as two-dimensional images, we apply the "bag of visual words" method from visual IR to classify the songs into 19 genres. By switching to the visual domain, we can abstract from musical concepts such as melody, timbre and rhythm. We obtained a classification accuracy of 46% (against a theoretical baseline of about 5% for random classification), which is comparable with existing state-of-the-art approaches. Moreover, the novel features characterize different properties of the signal than standard methods do. Combining the two should therefore further improve the performance of existing techniques.
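The bag-of-visual-words idea summarised in the abstract can be sketched roughly as follows. This is only an illustrative sketch under several assumptions that the abstract does not confirm: the two-dimensional image is taken to be a log-magnitude spectrogram, the local features are raw flattened patches, and the visual vocabulary is built with a plain Lloyd's k-means. All function names (`spectrogram_image`, `local_patches`, `kmeans`, `assign`, `bovw_histogram`) and all parameter values are hypothetical, and the two synthetic tones stand in for real tracks.

```python
import numpy as np

def spectrogram_image(signal, frame=256, hop=128):
    """Turn a 1-D signal into a 2-D log-magnitude image (frequency x time).
    Assumption: the abstract does not say how the images are produced."""
    window = np.hanning(frame)
    starts = range(0, len(signal) - frame + 1, hop)
    frames = np.stack([signal[s:s + frame] * window for s in starts])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T

def local_patches(image, size=8, step=8):
    """Cut the image into size x size patches and flatten each into a vector.
    Assumption: raw patches stand in for the paper's local descriptors."""
    rows, cols = image.shape
    return np.array([image[r:r + size, c:c + size].ravel()
                     for r in range(0, rows - size + 1, step)
                     for c in range(0, cols - size + 1, step)])

def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean) for every point."""
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (the visual vocabulary)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def bovw_histogram(image, centroids):
    """Normalised histogram of visual-word counts for one track image."""
    words = assign(local_patches(image), centroids)
    counts = np.bincount(words, minlength=len(centroids)).astype(float)
    return counts / counts.sum()

# Two synthetic "tracks" (noisy tones) in place of real audio.
rng = np.random.default_rng(1)
t = np.arange(8000) / 8000.0
tracks = [np.sin(2 * np.pi * f * t) + 0.1 * rng.standard_normal(t.size)
          for f in (220.0, 880.0)]
images = [spectrogram_image(x) for x in tracks]

# Build one shared vocabulary from all patches, then describe each track.
vocab = kmeans(np.vstack([local_patches(im) for im in images]), k=10)
hists = [bovw_histogram(im, vocab) for im in images]
```

Each track ends up as a fixed-length histogram over the visual vocabulary, which could then be passed to any standard classifier to predict one of the genres. The abstract does not state which classifier the authors used, so none is shown here.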
Original language: English
Title of host publication: Information Retrieval Technology
Editors: Yuexian Hou, Jian-Yun Nie, Le Sun, Bo Wang, Peng Zhang
Place of Publication: Berlin, Heidelberg
Publisher: Springer Berlin Heidelberg
Number of pages: 12
ISBN (Electronic): 978-3-642-35341-3
ISBN (Print): 978-3-642-35340-6
Publication status: Published - 5 Dec 2012
Event: The Eighth Asia Information Retrieval Societies Conference 2012 - Tianjin, China
Duration: 17 Dec 2012 to 19 Dec 2012
Conference number: 8

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer, Berlin, Heidelberg
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: The Eighth Asia Information Retrieval Societies Conference 2012
Abbreviated title: AIRS 2012


  • Local features
  • Co-occurrence matrix
  • Colour moments
  • K-means algorithm
  • Fourier transform

