Towards a grapho-phonologically parsed corpus of medieval Scots: Database design and technical solutions

Joanna Kopaczyk, Benjamin Molineaux Ress, Vasileios Karaiskos, Rhona Alcorn, Betty Los, Warren Maguire

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

This paper presents a newly constructed corpus of sound-to-spelling mappings in medieval Scots, which stems from the work of the From Inglis to Scots (FITS) project. We have developed a systematic approach to the relationships between individual spellings and proposed sound values, and recorded these mutual links in a relational database. In this paper, we introduce the theoretical underpinnings of sound-to-spelling and spelling-to-sound mappings, and show how a Scots root morpheme undergoes grapho-phonological parsing, the analytical procedure that is employed to break down spelling sequences into sound units. We explain the data collection and annotation for the FITS Corpus (Alcorn et al., forthcoming), drawing attention to the extensive meta-data which accompany each analysed unit of spelling and sound. The database records grammatical and lexical information about the root, the positional arrangement of segments within the root, labels for the nuclei, vowels and consonants, the morphological context, and extra-linguistic detail of the text a given root was taken from (date, place and text type). With this wealth of information, the FITS Corpus is capable of answering complex queries about the sound and spelling systems of medieval Scots. We also suggest how our methodology can be transferred to other non-standardised spelling systems.
Original language: English
Pages (from-to): 255–269
Journal: Corpora
Volume: 13
Issue number: 2
Early online date: 6 Aug 2018
Publication status: Published - Aug 2018

Keywords / Materials (for Non-textual outputs)

  • corpus building
  • grapheme
  • grapho-phonological parsing
  • historical corpus phonology
  • Scots
  • spelling
