Order Matters! Influences of Linear Order on Linguistic Category Learning

Dorothée B Hoppe¹, Jacolien van Rij², Petra Hendriks¹, Michael Ramscar³

Affiliations

¹ Center for Language and Cognition, University of Groningen.
² Department of Artificial Intelligence, University of Groningen.
³ Department of General and Computational Linguistics, University of Tübingen.

PMID: 33124103
PMCID: PMC7685149
DOI: 10.1111/cogs.12910

Cogn Sci. 2020 Nov;44(11):e12910.

Abstract

Linguistic category learning has been shown to be highly sensitive to linear order, and depending on the task, differentially sensitive to the information provided by preceding category markers (premarkers, e.g., gendered articles) or succeeding category markers (postmarkers, e.g., gendered suffixes). Given that numerous systems for marking grammatical categories exist in natural languages, it follows that a better understanding of these findings can shed light on the factors underlying this diversity. In two discriminative learning simulations and an artificial language learning experiment, we identify two factors that modulate linear order effects in linguistic category learning: category structure and the level of abstraction in a category hierarchy. Regarding category structure, we find that postmarking brings an advantage for learning category diagnostic stimulus dimensions, an effect not present when categories are non-confusable. Regarding levels of abstraction, we find that premarking of super-ordinate categories (e.g., noun class) facilitates learning of subordinate categories (e.g., nouns). We present detailed simulations using a plausible candidate mechanism for the observed effects, along with a comprehensive analysis of linear order effects within an expectation-based account of learning. Our findings indicate that linguistic category learning is differentially guided by pre- and postmarking, and that the influence of each is modulated by the specific characteristics of a given category system.

Keywords: Artificial language learning experiment; Behavioral experiment; Computational simulation; Discriminative learning; Error-driven learning; Linguistic categories.

Figures

**Fig. 1**
Illustration of the difference between learning in (a) a premarking situation and (b) a postmarking situation. In this example, based on the materials used in the simulations and behavioral experiment (see Table 2), a learner either needs to associate noun class markers (e.g., *ima*) with a noun and its form features (e.g., stress or phones) and semantic features (e.g., *animal*) or the other way around. In the divergent premarking situation (a), there is little cue competition (dashed black box). In the postmarking situation (b), the relation between cues and outcomes is convergent, which leads to many cues competing with each other (dashed black box). Moreover, the pattern of association (black dashed lines) and dissociation (red dashed lines) is not mirrored between (a) and (b), which shows the asymmetry of the discriminative learning mechanism. Note that capitals mark syllable stress.

**Fig. 2**
Probabilities of correct categorization (a) on the distinct dimension and (b) on the partly overlapping dimension after premarking training and after postmarking training to asymptote (1,600 trials) in Simulation 1. Blue bars show the probability of correctly choosing a feature set given a premarker and the constant cue. Orange bars show the probability of correctly choosing a postmarker given a feature set and the constant cue. Baseline performance, which assumes a completely naive model making a random choice, is marked by the horizontal line. The dashed lines show probabilities of correct choice after the same amount of training trials as in the behavioral experiment (412 trials). See Table 3 for all possible feature combinations.

**Fig. 3**
Learned weights of Noun class 1 in Simulation 1 (a) between premarkers (i.e., marker 1) and item features (i.e., {D1form, O1form, O2form, D1meaning}) and (b) item features and postmarkers (i.e., also marker 1). Orange lines show the weight between a distinct feature (i.e., D1form or D1meaning) and a marker, blue lines the weight between a low‐frequency (LF) overlapping feature (i.e., O2form; LF because occurring in two noun classes) and a marker, and violet lines the weight between a high‐frequency (HF) overlapping feature (i.e., O1form; HF because occurring in three noun classes) and a marker. Solid lines mark the correct features and dotted lines the features of the wrong Noun class 2. The vertical dashed lines show 412 training trials, as administered in the behavioral experiment.

**Fig. 4**
Illustration of the difference between learning to discriminate subordinate categories, here artificial nouns, with (b) postmarking or (c) premarking. (a) shows example nouns from two noun classes, with their associated premarkers and postmarkers (see Table 2). In postmarking (b), discrimination is performed *across* noun classes, which can lead to dissociation (red dashed line in black dashed box) of features that are relevant for the noun discrimination but overlap between classes, for example, the first sound of a noun #l. Noun class premarkers (c) can reduce uncertainty about following items such that discrimination is performed *within* a noun class.

**Fig. 5**
Median probability of choosing the target object in the noun learning simulation (Simulation 2) after weights of frequent noun features to objects have reached asymptote. Error bars show the interquartile ranges (i.e., 25%−75% of data). Dashed lines show median probability of choosing the target after the same amount of training trials as in the behavioral experiment (412 trials).

**Fig. 6**
Sample training trial of the behavioral experiment. The image on the left depicts the sentence context matching the carrier phrase (he is dreaming of …), and the image on the right shows the context image with the noun meaning (apples) included.

**Fig. 7**
Sample test trials for the Noun, Form Category, and Meaning Category Test in the *premarking variant* (i.e., premarker varying with noun class and unspecific postmarker *agi*). Syllable stress is marked by capitals. The green boxes signal the correct answer options.

**Fig. 8**
Model estimates (excluding random effects, CI ± 1 SE, inverse logit transformed; using R package itsadug) of accuracy in the Noun Test, Form Category Test, and Meaning Category Test for *correct answer options preceding wrong answer options* in the forced‐choice task (see results for wrong answer options preceding correct answer options in Fig. C1). Dots represent the actual data, namely mean accuracies by participant.

**Fig. C1**
Model estimates (excluding random effects, CI ± 1 SE, inverse logit transformed; using R package itsadug) of accuracy in the Noun Test, Form Category Test, and Meaning Category Test models for wrong answer options preceding correct answer options in the forced choice task. Dots represent the actual data, namely mean accuracies by participant. The only significant difference between premarking and postmarking was an advantage of premarking for noun learning.
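
The asymmetry described in the caption of Fig. 1 (little cue competition under premarking, convergent cue competition and dissociation under postmarking) can be illustrated with a small error-driven learning sketch. This is only a toy example under stated assumptions: it uses a Rescorla-Wagner-style delta rule as a stand-in for the discriminative learning mechanism, and the feature names (`d1`, `d2`, `shared`), marker names (`m1`, `m2`), learning rate, and number of epochs are hypothetical choices, not the materials or parameters of the reported simulations. It also omits the constant cue mentioned in the caption of Fig. 2.

```python
from collections import defaultdict

def rescorla_wagner(trials, outcomes, rate=0.1, lam=1.0, epochs=500):
    """Learn cue -> outcome weights with a Rescorla-Wagner (delta) rule.

    trials:   list of (cues, present_outcomes) pairs, each a tuple of names
    outcomes: all outcomes that could occur on any trial
    All parameter values are illustrative assumptions.
    """
    w = defaultdict(float)  # (cue, outcome) -> association weight
    for _ in range(epochs):
        for cues, present in trials:
            for o in outcomes:
                target = lam if o in present else 0.0
                prediction = sum(w[(c, o)] for c in cues)
                error = target - prediction
                # Every cue present on the trial shares the same error signal,
                # which is what produces cue competition.
                for c in cues:
                    w[(c, o)] += rate * error
    return w

# Hypothetical toy language: each noun class has one distinct feature
# (d1 or d2) and both classes share one overlapping feature ("shared").
features = ("d1", "d2", "shared")
markers = ("m1", "m2")

# Premarking: the marker is the only cue, the item features are outcomes.
pre_trials = [(("m1",), ("d1", "shared")), (("m2",), ("d2", "shared"))]
# Postmarking: the item features are cues, the marker is the outcome.
post_trials = [(("d1", "shared"), ("m1",)), (("d2", "shared"), ("m2",))]

pre = rescorla_wagner(pre_trials, features)
post = rescorla_wagner(post_trials, markers)

print("premarking  m1->d1     :", round(pre[("m1", "d1")], 3))
print("premarking  m1->shared :", round(pre[("m1", "shared")], 3))
print("premarking  m1->d2     :", round(pre[("m1", "d2")], 3))
print("postmarking d1->m1     :", round(post[("d1", "m1")], 3))
print("postmarking shared->m1 :", round(post[("shared", "m1")], 3))
print("postmarking d2->m1     :", round(post[("d2", "m1")], 3))
```

In this toy setting, the premarker comes to predict the distinct and the shared feature of its class equally strongly and never develops a negative weight to the other class's feature. Under postmarking, the features compete as cues: the distinct feature ends up with a larger weight to the marker than the shared feature, and the other class's distinct feature acquires a negative weight, which corresponds to the dissociation pattern described for Fig. 1(b).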
