Abstract
The open-source SUMMA Platform is a highly scalable distributed architecture for monitoring a large number of media broadcasts in parallel, with a lag behind actual broadcast time of at most a few minutes.
It assembles numerous state-of-the-art NLP technologies into a fully automated media ingestion pipeline that can record live broadcasts, detect and transcribe spoken content, translate from several languages (original text or transcribed speech) into English[1], recognize Named Entities, detect topics, cluster and summarize documents across language barriers, and extract and store factual claims in these news items.
This paper describes the intended use cases and discusses the system design decisions that allowed us to integrate state-of-the-art NLP modules into an effective workflow with comparatively little effort.
1: The choice of English as the lingua franca within the Platform is due to the working language of our use case partners; the highly modular design of the Platform allows the easy integration of custom translation engines, if required.
Original language | English |
---|---|
Title of host publication | Proceedings of the ACL 2018 Workshop for Natural Language Processing Open Source Software |
Place of Publication | Melbourne, Australia |
Publisher | Association for Computational Linguistics |
Pages | 47-51 |
Number of pages | 5 |
Publication status | Published - Jul 2018 |
Event | ACL 2018 Workshop for Natural Language Processing Open Source Software, Melbourne, Australia. Duration: 20 Jul 2018 → 20 Jul 2018. https://nlposs.github.io/ |
Workshop
Workshop | ACL 2018 Workshop for Natural Language Processing Open Source Software |
---|---|
Abbreviated title | NLP-OSS 2018 |
Country/Territory | Australia |
City | Melbourne |
Period | 20/07/18 → 20/07/18 |
Internet address |
Fingerprint
Research topics of 'SUMMA: Integrating Multiple NLP Technologies into an Open-source Platform for Multilingual Media Monitoring'.
Projects
- 1 Finished
- SUMMA - Scalable Understanding of Multilingual Media
Renals, S., Birch-Mayne, A. & Cohen, S.
1/02/16 → 31/01/19
Project: Research