A Corpus of Narrative Etymologies from primitive Old English to early Middle English

  • Laing, Margaret (Principal Investigator)
  • Williamson, Keith (Co-investigator)
  • Alcorn, Rhona (Researcher)
  • Lass, Roger (Other)

Project Details


In Middle English (1150-1500) there are over 60 different spellings for 'she' and over 500 for 'through'; how has such variation come about? One of the most striking aspects of the written language of the Middle English period is the sheer number of spelling variants for what are single words with fixed spellings in Present Day Standard written English. It is widely accepted that much of this spelling diversity is systematic, yet no history of English so far attempted has offered a set of etymologies to account for the diachronic and regional diversity. Traditional etymologies, as exemplified in the Oxford English Dictionary, are designed primarily to elucidate the semantic history of a word, and so tend to explain form changes only in very broad terms. Although the handbooks and grammars of early English contain descriptions of the major sound changes that account for some of the spelling diversity, such descriptions tend to focus on developments towards Standard English and so fail to chronicle the particular processes which account for individual word-forms.
CoNE radically reshapes the early narrative history of English. It explicates in unprecedented detail the evolution of individual words and affixes from their shapes in the input language to the Anglo-Saxon settlement into the variant forms attested in Early Middle English (1150-1325). Underpinning CoNE is the 'Corpus of Changes' (CC), which documents and explains the linguistic changes referred to in our individual word-histories. CoNE complements the content of existing historical English dictionaries, both in the systematic presentation of all changes and in explaining all attested form types as a set of branching narratives. Indeed, CoNE makes explicit many of the assumptions about form histories in the historical dictionaries, and tests and revises those assumptions on the basis of new and detailed analysis. CC itself is a valuable resource for updating and correcting the existing grammars and philological handbooks to reflect any newly noticed and newly interpreted linguistic changes that it documents.
The 'targets' for our narrative etymologies are all forms of Germanic lexical words and grammatical items (e.g. inflections, derivational affixes, articles and pronouns) recorded in the corpus of tagged texts compiled for A Linguistic Atlas of Early Middle English (LAEME). CoNE takes as the input form for each narrative the phonetic shape an item may be presumed to have had in the dialect-complex that served as input to Old English. It then formulates its story, over a time-depth of 700-800 years, by reference to the changes documented in CC, down to the collection of shapes the item shows in the LAEME corpus.

Layman's description

The purpose of CoNE is to explain a specific data set: the forms attested in the corpus of early Middle English texts collected for A Linguistic Atlas of Early Middle English (LAEME). This corpus is the largest available collection of texts written or copied in English during the period 1150-1325. The scribal languages of the texts in the LAEME corpus of tagged texts (LAEME CTT) are the product, in part, of developments from Old English. It is these developments that CoNE aims initially to explicate in detail, so as to account for each variant spelling type attested in the LAEME CTT.

Key findings

The variant forms of a word or morpheme in LAEME CTT are gathered under a ‘tag’ which taxonomizes the form lexically and/or grammatically with respect to each of its occurrences in one or more texts. This tag, as the label for a set of forms, is related to a form deemed to be original for Old English (the presumed form at the time of the original settlement of speakers of a West Germanic tongue in England). Each CoNE etymology in effect tells the story of how that original form changed (often taking divers paths) to give the various forms found subsumed under the LAEME tag. The LAEME CTT provides a specific terminus ad quem for the linguistic narratives.
The two fundamental parts of CoNE are the set of narrative etymologies (the Corpus of Narrative Etymologies itself) and the set of linguistic changes (the Corpus of Changes, the CC). Taking an individual tag from a menu, a user can access the etymological narrative for the forms subsumed under that tag in the LAEME CTT. An etymological narrative falls into two parts. The first is a narrative which deals with the item’s evolution from its Proto-Old English origin to its attested Old English forms. The second is the Middle English narrative, which takes these Old English forms as its presumed input and accounts for the orthographic types found in LAEME. The entries in CoNE can also be searched. The Corpus of Changes can be consulted in its own right, by browsing or searching for terms.
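The relationship between tags, two-part narratives and the Corpus of Changes described above can be pictured with a minimal Python sketch. This is an illustration only, not the project's actual data model or software: the class names, field names, change codes and form strings are all hypothetical placeholders, and no real LAEME tag or CoNE change is reproduced.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One entry in a (hypothetical) Corpus of Changes."""
    code: str
    description: str


@dataclass
class Narrative:
    """One half of an etymology: an input form, the ordered codes of the
    changes it invokes, and the forms it arrives at."""
    input_form: str
    steps: list[str]
    output_forms: list[str]


@dataclass
class TagEntry:
    """A tag with its two-part narrative etymology."""
    tag: str
    old_english: Narrative      # Proto-Old English origin -> Old English forms
    middle_english: Narrative   # Old English form -> Middle English spellings


def resolve_steps(narrative: Narrative, changes: dict[str, Change]) -> list[Change]:
    """Look up each change code cited by a narrative in the change table."""
    return [changes[code] for code in narrative.steps]


def is_chained(entry: TagEntry) -> bool:
    """Check that the Middle English part starts from an Old English output."""
    return entry.middle_english.input_form in entry.old_english.output_forms


def unexplained_forms(entry: TagEntry, attested: list[str]) -> list[str]:
    """Attested spellings under the tag that the narrative does not yet reach."""
    return sorted(set(attested) - set(entry.middle_english.output_forms))


# Placeholder data: every value below is invented for illustration.
changes = {
    "C1": Change("C1", "placeholder change one"),
    "C2": Change("C2", "placeholder change two"),
}
entry = TagEntry(
    tag="example-tag",
    old_english=Narrative("proto-form", ["C1"], ["oe-form"]),
    middle_english=Narrative("oe-form", ["C2"], ["me-form-a", "me-form-b"]),
)
```

Under these assumptions, `unexplained_forms(entry, attested)` would list any attested spelling under a tag that the Middle English narrative has not yet accounted for, which mirrors the stated aim of explaining every variant spelling type in the LAEME CTT.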
Effective start/end date: 1/09/10 to 31/12/13


  • AHRC: £854,213.00