Empirical evaluation: Towards an automated index of lexical variety

Vander Viana, Natália Giordani Silveira, Sonia Zyngier

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract / Description of output

This chapter proposes an objective approach to the formal analysis of literary prose in English in order to investigate the relation between lexical density and judgments of canonicity. Based on the concepts of literariness proposed by the Russian Formalists and lexical variety, a mathematical index is designed, relating three variables which take the materiality of text into consideration: (a) relative frequency of lexical bundles, (b) lexical bundle type/token ratio, and (c) word type/token ratio. The index is described and illustrated with 46 canonical and non-canonical literary works. Statistical analysis shows no significant relation between lexical richness and decisions of what has been classified as canonical, indicating that these judgments may be influenced by factors other than the text itself.
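The abstract names three variables but does not give the formula that combines them, so the following is only a minimal sketch of how the three inputs might be computed for a single text. Everything specific here is an assumption for illustration and not the authors' method: the function name `lexical_variety_variables`, whitespace tokenisation, treating a lexical bundle as any 4-word sequence that occurs at least twice, and taking relative frequency as bundle occurrences divided by the total number of n-grams in the text. No attempt is made to reproduce the index itself.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-word sequences in the token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lexical_variety_variables(text, n=4, min_freq=2):
    """Compute the three variables named in the abstract for one text.

    Assumptions (hypothetical, not from the chapter):
    - tokens are lower-cased, whitespace-separated strings;
    - a lexical bundle is an n-gram occurring at least `min_freq` times;
    - relative frequency = bundle occurrences / total n-grams.
    """
    tokens = text.lower().split()

    # (c) word type/token ratio
    word_ttr = len(set(tokens)) / len(tokens) if tokens else 0.0

    # Keep only the recurring n-grams as "bundles"
    counts = Counter(ngrams(tokens, n))
    bundles = {gram: c for gram, c in counts.items() if c >= min_freq}
    bundle_tokens = sum(bundles.values())  # occurrences of all bundles
    bundle_types = len(bundles)            # distinct bundles
    total_ngrams = max(len(tokens) - n + 1, 0)

    return {
        # (a) relative frequency of lexical bundles
        "bundle_relative_frequency":
            bundle_tokens / total_ngrams if total_ngrams else 0.0,
        # (b) lexical bundle type/token ratio
        "bundle_ttr": bundle_types / bundle_tokens if bundle_tokens else 0.0,
        # (c) word type/token ratio
        "word_ttr": word_ttr,
    }


if __name__ == "__main__":
    sample = "a b c d a b c d"
    print(lexical_variety_variables(sample, n=2, min_freq=2))
```

The bundle length and frequency threshold are left as parameters because the abstract does not state which values the chapter uses.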
Original language: English
Title of host publication: Directions in empirical literary studies
Subtitle of host publication: In honor of Willie van Peer
Editors: Sonia Zyngier, Marisa Bortolussi, Anna Chesnokova, Jan Auracher
Place of Publication: Amsterdam
Number of pages: 12
Publication status: Published - 2008

Publication series

Name: Linguistic Approaches to Literature
Publisher: John Benjamins
ISSN (Print): 1569-3112

Keywords / Materials (for Non-textual outputs)

  • Lexical variety
  • Corpus linguistics
  • Literary discourse
  • Lexical bundles
  • Empirical study
  • Canonicity

