Edinburgh Research Explorer

Hierarchical recurrent neural network for story segmentation using fusion of lexical and acoustic features

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 9
ISBN (Electronic): 978-1-5090-4788-8
ISBN (Print): 978-1-5090-4789-5
Publication status: Published - 25 Jan 2018
Event: 2017 IEEE Automatic Speech Recognition and Understanding Workshop - Okinawa, Japan
Duration: 16 Dec 2017 to 20 Dec 2017


Conference: 2017 IEEE Automatic Speech Recognition and Understanding Workshop
Abbreviated title: ASRU 2017


A broadcast news stream consists of a number of stories, and automatically finding the boundaries between them is an important task in news analysis. We capture the topic structure using a hierarchical model that combines a Recurrent Neural Network (RNN) sentence modeling layer with a bidirectional Long Short-Term Memory (LSTM) topic modeling layer, fusing acoustic and lexical features. Both feature types are accumulated with RNNs, trained jointly within the model, and fused at the sentence level. We conduct experiments on the Topic Detection and Tracking (TDT4) task, comparing combinations of the two modalities trained with a limited amount of parallel data. We further use a sufficient amount of additional text data to refine the model. Experimental results indicate that the hierarchical RNN topic model benefits from the fusion scheme, especially with additional text training data, achieving a higher F1-measure than conventional state-of-the-art methods.
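
The abstract gives only the overall shape of the model, so the following is a minimal illustrative sketch of the forward pass, not the authors' implementation. It assumes that fusion is done by concatenating the final states of a lexical RNN and an acoustic RNN, and that each sentence gets a sigmoid boundary score. All dimensions, parameter names, and the random untrained weights are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: word vector, acoustic frame, sentence state, topic state.
D_WORD, D_FRAME, D_SENT, D_TOPIC = 8, 5, 6, 4


def init(shape):
    # Small random weights stand in for parameters a real model would learn.
    return rng.normal(0.0, 0.1, size=shape)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def rnn_last_state(xs, W_in, W_rec, b):
    """Run a plain tanh RNN over xs (T, d_in); return the final hidden state."""
    h = np.zeros(W_rec.shape[0])
    for x in xs:
        h = np.tanh(W_in @ x + W_rec @ h + b)
    return h


def lstm_states(xs, W, U, b):
    """Run an LSTM over xs (T, d_in); return all hidden states (T, d_h)."""
    d_h = U.shape[1]
    h, c = np.zeros(d_h), np.zeros(d_h)
    out = []
    for x in xs:
        # One affine map produces the input, forget, output and candidate parts.
        i, f, o, g = np.split(W @ x + U @ h + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out.append(h)
    return np.stack(out)


def lstm_params(d_in, d_h):
    return init((4 * d_h, d_in)), init((4 * d_h, d_h)), np.zeros(4 * d_h)


params = {
    "lex": (init((D_SENT, D_WORD)), init((D_SENT, D_SENT)), np.zeros(D_SENT)),
    "ac": (init((D_SENT, D_FRAME)), init((D_SENT, D_SENT)), np.zeros(D_SENT)),
    "fwd": lstm_params(2 * D_SENT, D_TOPIC),
    "bwd": lstm_params(2 * D_SENT, D_TOPIC),
    "out": (init(2 * D_TOPIC), 0.0),
}


def boundary_probs(sentences):
    """sentences: list of (word_vectors, acoustic_frames), one pair per sentence."""
    # Sentence level: accumulate each modality with its own RNN, then fuse
    # by concatenation (an assumption; the abstract does not say how).
    fused = np.stack([
        np.concatenate([rnn_last_state(words, *params["lex"]),
                        rnn_last_state(frames, *params["ac"])])
        for words, frames in sentences
    ])
    # Topic level: bidirectional LSTM over the sequence of fused sentence vectors.
    fwd = lstm_states(fused, *params["fwd"])
    bwd = lstm_states(fused[::-1], *params["bwd"])[::-1]
    topic = np.concatenate([fwd, bwd], axis=1)
    w, b = params["out"]
    return sigmoid(topic @ w + b)  # one boundary score per sentence


# Three toy sentences with random "word vectors" and "acoustic frames".
stream = [(rng.normal(size=(n_words, D_WORD)), rng.normal(size=(n_frames, D_FRAME)))
          for n_words, n_frames in [(4, 30), (7, 52), (3, 21)]]
probs = boundary_probs(stream)
print(probs.shape)  # (3,)
```

In this sketch each sentence contributes one fused vector, so the topic layer sees the stream at sentence granularity. With trained weights, a threshold on the scores would mark candidate story boundaries.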




ID: 44898995