TY - CONF
T1 - The emergence of rules and exceptions in a population of interacting agents
AU - Cuskley, Christine
AU - Loreto, Vittorio
PY - 2016/3/15
Y1 - 2016/3/15
N2 - Recent studies in language evolution have identified important roles for frequency (Cuskley et al., 2014), phonology (Bybee, 2001), and speaker population (Lupyan & Dale, 2010) in the dynamics of linguistic regularity. We present a model which integrates frequency, phonology, and speaker demographics to investigate how and why regularity and irregularity persist together given the general bias to eliminate unpredictable variation (i.e., irregularity), especially in experimental contexts (e.g., Hudson Kam & Newport, 2005; Smith & Wonnacott, 2010, among others). Kirby (2001) points out that while many models aim to represent how regular structure emerges in language, very few models explain how irregularity emerges. Using the iterated learning framework, Kirby (2001) showed that a skewed frequency of meanings and a general pressure for least effort in production can lead to the emergence of both stable regulars and irregulars in a vocabulary.
The current work aims to extend this finding by investigating the role of non-native speakers and phonological similarity in regularity dynamics. A recent study showed that non-native speakers irregularize novel forms more than native speakers. For example, non-native speakers are more likely than native speakers to apply ‘rules’ inferred from existing irregulars with a high token frequency (i.e., to provide the past tense of spling as splung, by analogy with spring; Cuskley et al., 2015). A potential mechanism underlying this result is that native and non-native speakers extend rules in different ways, depending on how rules are represented in their input. In other words, since native speakers have more experience with the ‘long-tail’ of regular verb types (Cuskley et al., 2014), they are more likely to extend the ‘regular’ rule. On the other hand, non-natives’ input is skewed towards irregular types with high token frequency, and thus they are more likely to extend quasiregularity when inflecting novel forms, especially when novel forms exhibit phonological similarity with existing irregulars (Cuskley et al., 2015).
We model the dynamics of regularity in a language evolving among a population of agents engaging in repeated communicative interactions (modelled after the Naming Game, hereafter NG; Loreto & Steels, 2007). The model broadly consists of repeated interactions between a speaker (S) and a hearer (H). Unlike in the NG, agents do not evolve labels for meanings, but inflections for forms: instead of naming meanings, the task of the S within the communicative interaction is to inflect an existing form, and the success of the interaction is evaluated depending on whether the H shares the same inflection for the same form (see also Colaiori et al., 2015).
Agents begin with no inflections, but have an inventory of shared meanings labelled by strings randomly generated from a set of 10 characters. Meanings are chosen for each interaction based on a skewed, predetermined frequency distribution. In early interactions, speaker agents choose a random two-character string as an inflection; thus, at the outset, success is low, but agents nonetheless store inflections with weighted success (number of interactions/number of successes). Once agents acquire some inflections in their vocabulary as a result of interaction, they choose inflections for uninflected meanings in their vocabulary based on different “native” and “non-native” strategies. Both agent types have a first preference for extending inflections based on phonological similarity above a certain threshold: in other words, if the label for meaning A has a highly weighted inflection and is an edit distance ≤ 0.5 away from the label for meaning B, they will generalise the inflection for meaning A to meaning B. Where this strategy fails, natives extend inflections based on type frequency (i.e., apply the inflection used across most items in the vocabulary), while non-natives extend inflections based on token frequency (i.e., apply the inflection from the most frequent item in the vocabulary).
Populations arrive at stable inflectional paradigms which include both regular and irregular forms. By altering the proportion of type-preference and token-preference agents in different iterations of the model, we are able to examine how these different strategies affect the structure of language over long timescales, and how changing the proportions of token-extension and type-extension agents changes languages over time. Results from this framework support recent theories that the relative proportion of native and non-native speakers in a population has the potential to affect the structure of language.
M3 - Abstract
ER -