J R Soc Interface. 2018 Feb;15(139):20170738.
doi: 10.1098/rsif.2017.0738.

How humans transmit language: horizontal transmission matches word frequencies among peers on Twitter

John Bryden et al. J R Soc Interface. 2018 Feb.

Abstract

Language transmission, the passing on of language features such as words between people, is the process of inheritance that underlies linguistic evolution. To understand how language transmission works, we need a mechanistic understanding based on empirical evidence of lasting change of language usage. Here, we analysed 200 million online conversations to investigate transmission between individuals. We find that the frequency of word usage is inherited over conversations, rather than only the binary presence or absence of a word in a person's lexicon. We propose a mechanism for transmission whereby for each word someone encounters there is a chance they will use it more often. Using this mechanism, we measure that, for one word in around every hundred a person encounters, they will use that word more frequently. As more commonly used words are encountered more often, this means that it is the frequencies of words which are copied. Beyond this, our measurements indicate that this per-encounter mechanism is neutral and applies without any further distinction as to whether a word encountered in a conversation is commonly used or not. An important consequence of this is that frequencies of many words can be used in concert to observe and measure language transmission, and our results confirm this. These results indicate that our mechanism for transmission can be used to study language patterns and evolution within populations.

Keywords: Moran process; evolution of language; horizontal transmission; language transmission; linguistic evolution; word heritability.

Conflict of interest statement

We declare we have no competing interests.

Figures

Figure 1.
An osmosis-like process for horizontal language transmission used in our model. The two halves of the diagram show the internal language representations of two individuals as bags of words. The figure shows how an individual in our framework copies and stores a word from their conversation partner; an instance of word A is incorporated, replacing an instance of word C. The number of instances of a particular word defines how likely someone is to use the word in a given situation. In our model of this process, each bag contains s words; user i sends words to user j at a rate r_ij, and the recipient replaces a randomly chosen word in their bag with a received word with incorporation rate α. Since the likelihood of a word being replaced depends on its frequency in the bag, word frequencies change similarly to osmosis in that over time the frequencies of words in both halves will tend to equilibrate.
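The bag-of-words process described in this caption can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`transmit`, `simulate`, `frequency_gap`) are hypothetical, both users send exactly one word per step (a symmetric stand-in for the rates r_ij), and the incorporation rate α is treated as a per-encounter probability.

```python
import random
from collections import Counter


def transmit(sender_bag, recipient_bag, alpha, rng):
    """One encounter: the sender utters a word drawn uniformly from their bag
    (so common words are sent more often); with probability alpha the
    recipient overwrites a uniformly chosen slot in their bag with it."""
    word = rng.choice(sender_bag)
    if rng.random() < alpha:
        recipient_bag[rng.randrange(len(recipient_bag))] = word
    return word


def simulate(bag_i, bag_j, alpha, n_messages, seed=0):
    """Exchange n_messages words in each direction (assumed symmetric rates).
    Returns copies of the two bags after the exchange; bag sizes never change."""
    rng = random.Random(seed)
    bag_i, bag_j = list(bag_i), list(bag_j)
    for _ in range(n_messages):
        transmit(bag_i, bag_j, alpha, rng)
        transmit(bag_j, bag_i, alpha, rng)
    return bag_i, bag_j


def frequency_gap(bag_i, bag_j):
    """Sum of absolute differences in relative word frequency between bags."""
    count_i, count_j = Counter(bag_i), Counter(bag_j)
    words = set(count_i) | set(count_j)
    return sum(
        abs(count_i[w] / len(bag_i) - count_j[w] / len(bag_j)) for w in words
    )


if __name__ == "__main__":
    # Hypothetical starting lexicons: user i only uses "A", user j only "C".
    start_i, start_j = ["A"] * 50, ["C"] * 50
    end_i, end_j = simulate(start_i, start_j, alpha=1.0, n_messages=2000)
    print("gap before:", frequency_gap(start_i, start_j))
    print("gap after: ", frequency_gap(end_i, end_j))
```

Because a replaced slot is chosen uniformly, a word is overwritten in proportion to its frequency in the bag, which is what drives the two frequency profiles toward each other in this sketch.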
Figure 2.
Word heritability between conversing partners is greater than that for non-conversing partners. For each test word, we plot regressions (see Methods) for data from conversing partners (blue solid lines) and non-conversing partners (green solid lines). The regression lines were superimposed by translucently plotting lines for each regression, interleaving between the two datasets. We found relatively high levels of word heritability in non-conversing partners due to word usage changing at population levels. A Mann–Whitney U-test indicated that the slopes for conversing partners tend to be steeper than those for non-conversing partners (p_MW < 9.5 × 10^−10). The two dashed lines (same colours) are slopes regressed over data collected for all of the words; the difference between these values was W = 0.0340, which is a measurement of word heritability due to Twitter conversations. We tested that W > 0 using a bootstrap (p_B < 0.001, see Methods).
Figure 3.
The rate at which words are incorporated is independent of usage frequency. Each circle is a word's incorporation rate (circles have translucency of 30%). Linear regression finds no correlation between a word's usage count (in our whole sample) and the incorporation rate (two-tailed Pearson correlation coefficient: r² = 0.00040, p = 0.54). The mean value of the word incorporation rate α is 0.0043, which we found to be significantly greater than zero (p = 0.0083, bootstrapping with 10 000 resamples of 100 values, and calculating the proportion of resamples with mean greater than zero). The high variance for very low frequencies is due to sampling effects. (Online version in colour.)
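The bootstrap described in this caption (10 000 resamples of 100 values, counting how many resample means exceed zero) can be sketched as follows. This is an illustrative sketch only: the function name `bootstrap_fraction_positive` and the input values are hypothetical, and how the authors turn the resulting proportion into the reported p-value is not stated here, so the code returns the raw proportion.

```python
import random


def bootstrap_fraction_positive(values, n_resamples=10_000, resample_size=100, seed=0):
    """Draw n_resamples resamples (with replacement) of resample_size values
    and return the fraction whose mean is strictly greater than zero."""
    rng = random.Random(seed)
    positive = 0
    for _ in range(n_resamples):
        sample = rng.choices(values, k=resample_size)  # sampling with replacement
        if sum(sample) / resample_size > 0:
            positive += 1
    return positive / n_resamples


if __name__ == "__main__":
    # Hypothetical per-word incorporation rates, NOT data from the paper:
    # noisy values centred slightly above zero.
    rng = random.Random(1)
    rates = [rng.gauss(0.004, 0.02) for _ in range(500)]
    fraction = bootstrap_fraction_positive(rates)
    print("fraction of resample means above zero:", fraction)
```

A fraction close to 1 indicates that the mean stays positive across most resamples; one common convention (an assumption here) is to read one minus this fraction as a one-sided p-value.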
Figure 4.
The more messages were sent between two users, the more their language converged. (a) Plot of the means of bins of conversation pairs (binned along the x-axis, showing the x and y means of each bin) and fitted models (black line is the transmission model, green horizontal line is the null model; see Methods). The fitted line of our model crosses zero at approximately 310 messages sent. (b) Illustration of the large variance in the data (unbordered translucent circles which are superimposed). The convergence of 500 conversation pairs (sampled with replacement) is plotted per bin on the x-axis (bordered blue circles). Control values are also shown (bordered green circles).
