The dynamics of correlated novelties

F. Tria et al.
Sci Rep. 2014 Jul 31;4:5890. doi: 10.1038/srep05890

Abstract

Novelties are a familiar part of daily life. They are also fundamental to the evolution of biological systems, human society, and technology. By opening new possibilities, one novelty can pave the way for others in a process that Kauffman has called "expanding the adjacent possible". The dynamics of correlated novelties, however, have yet to be quantified empirically or modeled mathematically. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological, or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya's urn, predicts statistical laws for the rate at which novelties happen (Heaps' law) and for the probability distribution on the space explored (Zipf's law), as well as signatures of the process by which one novelty sets the stage for another. We test these predictions on four data sets of human activity: the edit events of Wikipedia pages, the emergence of tags in annotation systems, the sequence of words in texts, and listening to new songs in online music catalogues. By quantifying the dynamics of correlated novelties, our results provide a starting point for a deeper understanding of the adjacent possible and its role in biological, cultural, and technological evolution.
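
The abstract's central mechanism (an urn that grows whenever a novelty is drawn) can be illustrated with a minimal simulation. The sketch below is an assumed reading of the model as described here and in the Figure 5 caption, not the authors' code; `rho` is the reinforcement parameter and `nu` controls how many new elements a novelty triggers.

```python
import random

def urn_with_triggering(steps=100_000, rho=10, nu=6, seed=0):
    """Minimal sketch (assumed reading) of an urn model with triggering.

    At every step a ball is drawn uniformly at random from the urn and
    appended to the sequence; the drawn element is reinforced with `rho`
    extra copies.  If the element had never been drawn before (a novelty),
    `nu + 1` brand-new elements are also added to the urn, enlarging the
    "adjacent possible".  The urn starts with N0 = nu + 1 distinct balls,
    as in the figure captions below.
    """
    rng = random.Random(seed)
    n0 = nu + 1
    urn = list(range(n0))            # each integer is a distinct element
    next_id = n0                     # id of the next brand-new element
    seen, sequence, distinct = set(), [], []

    for _ in range(steps):
        ball = rng.choice(urn)
        sequence.append(ball)
        urn.extend([ball] * rho)     # reinforcement step
        if ball not in seen:         # adjacent-possible step
            seen.add(ball)
            urn.extend(range(next_id, next_id + nu + 1))
            next_id += nu + 1
        distinct.append(len(seen))
    return sequence, distinct

if __name__ == "__main__":
    _, distinct = urn_with_triggering()
    # For nu < rho the number of distinct elements is expected to grow
    # sublinearly, D(t) ~ t**(nu/rho)  (Heaps' law).
    for t in (100, 1_000, 10_000, 100_000):
        print(t, distinct[t - 1])
```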


Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Heaps' law (a–e) and Zipf's law (f–j) in real datasets (a–d, f–i) and in the urn model with triggering (e, j).
Gutenberg (a, f), Last.fm (b, g), Wikipedia (c, h) and del.icio.us (d, i) datasets, and the urn model with triggering (e, j). Straight lines in the Heaps' law plots show functions of the form f(x) = ax^β, with exponent β = 0.45 (Gutenberg), β = 0.68 (Last.fm lyrics), β = 0.56 (Last.fm artist), β = 0.77 (Wikipedia) and β = 0.78 (del.icio.us), and with β equal to the ratio ν/ρ in the urn model with triggering, showing that the Heaps' exponents predicted analytically for the model are confirmed by the simulations. Straight lines in the Zipf's law plots show functions of the form f(x) = ax^(−α), where the exponent α equals β^(−1) for the values of β listed above. Note that the frequency-rank plots of real data deviate from a pure power law, and that the correspondence between the exponents β and α holds only asymptotically (see the discussion above and the Supplementary Information for finite-size effects).
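
To check the two laws on an arbitrary event sequence, one can estimate the Heaps' exponent β from a log–log fit of D(t), the number of distinct elements after t events, versus t, and the Zipf exponent α from a log–log fit of frequency versus rank. The snippet below is a rough sketch of such a fit, not the fitting procedure used for the figure.

```python
from collections import Counter
import math

def heaps_curve(sequence):
    """D(t): number of distinct elements after the first t events."""
    seen, curve = set(), []
    for x in sequence:
        seen.add(x)
        curve.append(len(seen))
    return curve

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x): a crude power-law exponent."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

def heaps_exponent(sequence):
    """Estimate beta in D(t) ~ a * t**beta."""
    curve = heaps_curve(sequence)
    return loglog_slope(range(1, len(curve) + 1), curve)

def zipf_exponent(sequence):
    """Estimate alpha in f(R) ~ a * R**(-alpha), with f the frequency of the
    element of rank R."""
    freqs = sorted(Counter(sequence).values(), reverse=True)
    return -loglog_slope(range(1, len(freqs) + 1), freqs)
```

Applied to a sequence generated by the urn sketch after the abstract, heaps_exponent should come out close to ν/ρ and zipf_exponent close to its inverse, up to the finite-size effects mentioned in the caption.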
Figure 2. Normalized entropy in real data and in the urn model with semantic triggering.
(a)–(d) Normalized entropy of the sequence associated with a specific label A vs. the number of events, k, with that label. For each k the entropy is averaged over the labels with the same number of occurrences. Results are shown for Wikipedia (a), the Delicious dataset (b), the Last.fm dataset (c) and the urn model with semantic triggering (d). For the Wikipedia and Last.fm datasets we used the respective sequences S_unique described in the Methods section, while for the Delicious dataset we used the full sequence of aggregated data. The plot for the model is an average over 10 realizations of the process, with parameters ρ = 8, ν = 10, η = 0.3 and N0 = ν + 1. The length of the considered sequences is N = 10^7; the corresponding Heaps' exponent follows from the relation between the model parameters and the Heaps' and Zipf's exponents given in the Supplementary Information. In all cases, results for the actual data are compared with the two null models described in the Methods section.
Figure 3. Distribution of triggering intervals in real data and in the urn model with semantic triggering.
Distribution of triggering intervals (see the Methods section for the definition) for the same data as in the entropy measurements of figure 2: Wikipedia (a), the Delicious dataset (b; here the local reshuffling was too time-consuming to compute, since there are as many labels as words), the Last.fm dataset (c) and the urn model with semantic triggering (d). The banner at the bottom (e) shows the sequence of a particular label A of the Last.fm dataset. The color code is red for the actual sequence and blue for its local reshuffle (see the Methods section).
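
The exact definition of the triggering intervals lives in the Methods section; assuming they reduce to the gaps between successive occurrences of the same label, the interval distribution and the local-reshuffle null model mentioned in this caption could be computed along the following lines (a sketch under that assumption, not the authors' procedure).

```python
import random
from collections import Counter

def occurrence_intervals(sequence, label):
    """Gaps between successive occurrences of `label` (assumed reading of the
    triggering intervals; the paper's Methods give the exact definition)."""
    positions = [i for i, x in enumerate(sequence) if x == label]
    return [b - a for a, b in zip(positions, positions[1:])]

def interval_distribution(sequence, label):
    """Histogram of the interval lengths for one label."""
    return Counter(occurrence_intervals(sequence, label))

def local_reshuffle(sequence, label, seed=0):
    """Null model: shuffle only the part of the sequence that follows the
    first occurrence of `label`, leaving the earlier part untouched."""
    rng = random.Random(seed)
    first = sequence.index(label)
    head, tail = list(sequence[:first + 1]), list(sequence[first + 1:])
    rng.shuffle(tail)
    return head + tail
```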
Figure 4. Normalized entropy and distribution of triggering intervals in single books of the Gutenberg dataset.
(a), (b), (c) Normalized entropy of words (as described in the main text) vs. the number of word occurrences, k. For each k the entropy is averaged over the words with the same number of occurrences. Results are shown for the texts David Copperfield (a), Moby Dick (b) and the Iliad (in the original Greek) (c). (d), (e), (f) Distribution of triggering intervals for the same data. In all cases, results for the actual data are compared with the two null models described in the Methods section.
Figure 5. Models.
Simple urn model with triggering (a, b) and urn model with semantic triggering (c, d). (a) Generic reinforcement step of the evolution. An element (the gray ball) that had previously been drawn from the urn U is drawn again. In this case the element is added to the sequence S (depicted at the center of the figure) and, at the same time, ρ additional gray balls are put into U. (b) Generic adjacent-possible step of the evolution. Here, upon drawing a new ball (red) from U, ν + 1 brand-new balls are added to U, along with the ρ red balls of the reinforcement step that takes place at each time step. (c, d) Urn model with semantic triggering. Same as above, except that each ball now carries a label defining its semantic context. The label is conserved during a reinforcement event (e.g., the label A of the gray balls in panel c), while a brand-new label, C, is assigned to the ν + 1 balls added in an adjacent-possible event (panel d).
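
The semantic-triggering variant can be sketched by tagging every ball with a label, as this caption describes. One detail not specified here is how the parameter η (introduced in the Figure 2 caption) biases the draw; the sketch below assumes it down-weights balls whose label differs from that of the previously drawn ball, which is a guess to be checked against the Methods.

```python
import random

def urn_semantic_triggering(steps=2_000, rho=10, nu=6, eta=0.3, seed=0):
    """Hedged sketch of the urn model with semantic triggering.

    Every ball is an (element, label) pair.  Reinforced copies keep the label
    of the drawn ball; the nu + 1 balls added on a novelty all share one
    brand-new label.  ASSUMPTION: the draw gives weight 1 to balls whose
    label matches the label of the previously drawn ball and weight `eta`
    to every other ball -- the exact weighting rule is in the Methods.
    The naive weighted draw is O(|urn|) per step, so keep `steps` small.
    """
    rng = random.Random(seed)
    n0 = nu + 1
    urn = [(i, 0) for i in range(n0)]   # the initial balls share label 0
    next_elem, next_label = n0, 1
    seen, sequence = set(), []
    last_label = 0

    for _ in range(steps):
        weights = [1.0 if lbl == last_label else eta for _, lbl in urn]
        elem, lbl = rng.choices(urn, weights=weights, k=1)[0]
        sequence.append((elem, lbl))
        urn.extend([(elem, lbl)] * rho)                 # reinforcement, label kept
        if elem not in seen:                            # novelty: new semantic context
            seen.add(elem)
            urn.extend((next_elem + j, next_label) for j in range(nu + 1))
            next_elem += nu + 1
            next_label += 1
        last_label = lbl
    return sequence
```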
Figure 6. Entropy and intervals example.
Let us indicate with the same letter the occurrences of, e.g., lyrics of the same artist in the sequence. Suppose that A has just appeared in the sequence shown, which ends with G; A appears 4 times, i.e., k = 4. We divide the subsequence starting at the first occurrence of A into 4 parts and count the occurrences f_i of A in each of them (bottom numbers). The normalized entropy of A is then −(1/log k) Σ_i (f_i/k) log(f_i/k). As the value of S(k) we average the entropies of all elements occurring k times in the sequence. The numbers at the top show the lengths of the intervals used in the interval-distribution evaluation. The local reshuffling shuffles only the 15 elements occurring after the first occurrence of A, and computes the normalized entropy and the interval distribution on this reduced sequence.
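
The entropy measure of this example can be written compactly as follows; the base of the logarithm used for normalization and the convention that the subsequence starts at the first occurrence of A are assumptions consistent with the caption.

```python
import math
from collections import Counter

def normalized_entropy(sequence, label):
    """Entropy of the positions of `label`, as in the Figure 6 example.

    The subsequence starting at the first occurrence of `label` is split
    into k equal parts (k = total number of occurrences); f_i counts the
    occurrences in part i, and the entropy -sum (f_i/k) log(f_i/k) is
    normalized by log k (assumed normalization)."""
    k = sequence.count(label)
    if k < 2:
        return 0.0
    sub = sequence[sequence.index(label):]
    part = len(sub) / k
    counts = [0] * k
    for pos, x in enumerate(sub):
        if x == label:
            counts[min(int(pos / part), k - 1)] += 1
    probs = [f / k for f in counts if f > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(k)

def average_entropy_by_k(sequence):
    """S(k): normalized entropy averaged over all elements occurring k times."""
    by_k = {}
    for label, k in Counter(sequence).items():
        by_k.setdefault(k, []).append(normalized_entropy(sequence, label))
    return {k: sum(v) / len(v) for k, v in sorted(by_k.items())}
```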

References

    1. Lazer, D. et al. Computational social science. Science 323, 721–723 (2009).
    2. Kauffman, S. A. Investigations: The Nature of Autonomous Agents and the Worlds They Mutually Create. SFI Working Papers (Santa Fe Institute, 1996).
    3. Johnson, S. Where Good Ideas Come From: The Natural History of Innovation (Riverhead Hardcover, 2010).
    4. Wagner, A. & Rosen, W. Spaces of the possible: universal Darwinism and the wall between technological and biological innovation. J. R. Soc. Interface 11, 20131190 (2014).
    5. Kauffman, S. A. Investigations (Oxford University Press, New York/Oxford, 2000).
