frances: Cloud-based historical text mining with deep learning and parallel processing

Lilin Yu, Ash Charlton, Wilfrid Askins, Melissa Terras, Rosa Filgueira

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

frances is an advanced cloud-based text mining digital platform that leverages information extraction, knowledge graphs, natural language processing (NLP), deep learning, and parallel processing techniques. It has been specifically designed to unlock the full potential of historical digital textual collections, such as those from the National Library of Scotland, offering cloud-based capabilities and extended support for complex NLP analyses and data visualizations. frances enables real-time recurrent operational text mining and provides robust capabilities for temporal analysis, accompanied by automatic visualizations for easy result inspection. In this paper, we present the motivation behind the development of frances, emphasizing its innovative design and novel implementation aspects. We also outline future development directions. Additionally, we evaluate the platform through two comprehensive case studies in history and publishing history. Feedback from participants in these studies demonstrates that frances accelerates their work and facilitates rapid testing and dissemination of ideas.
Original language: English
Title of host publication: Proceedings 2023 IEEE 19th International Conference on e-Science
Editors: George Angelos Papadopoulos, Rosa Filgueira, Rafael Ferreira Da Silva
Publisher: IEEE
Pages: 1-10
Number of pages: 10
ISBN (Electronic): 9798350322231
ISBN (Print): 9798350322231
Publication status: Published - 25 Sept 2023
Event: 19th IEEE International Conference on e-Science - Limassol, Cyprus
Duration: 9 Oct 2023 to 14 Oct 2023
Conference number: 19
https://www.escience-conference.org/about/

Conference

Conference: 19th IEEE International Conference on e-Science
Abbreviated title: eScience 2023
Country/Territory: Cyprus
City: Limassol
Period: 9/10/23 to 14/10/23

Keywords / Materials (for Non-textual outputs)

  • digitised historical collections
  • information extraction
  • Apache Spark
  • parallel processing
  • text mining
  • cloud-based platform
  • knowledge graphs
  • natural language processing
