The dynamics of correlated novelties

F. Tria et al.
Sci Rep. 2014 Jul 31;4:5890. doi: 10.1038/srep05890

Abstract

Novelties are a familiar part of daily life. They are also fundamental to the evolution of biological systems, human society, and technology. By opening new possibilities, one novelty can pave the way for others in a process that Kauffman has called "expanding the adjacent possible". The dynamics of correlated novelties, however, have yet to be quantified empirically or modeled mathematically. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological, or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya's urn, predicts statistical laws for the rate at which novelties happen (Heaps' law) and for the probability distribution on the space explored (Zipf's law), as well as signatures of the process by which one novelty sets the stage for another. We test these predictions on four data sets of human activity: the edit events of Wikipedia pages, the emergence of tags in annotation systems, the sequence of words in texts, and listening to new songs in online music catalogues. By quantifying the dynamics of correlated novelties, our results provide a starting point for a deeper understanding of the adjacent possible and its role in biological, cultural, and technological evolution.
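
The abstract's central mechanism (an urn that grows whenever a novelty is drawn) can be illustrated with a minimal simulation. The sketch below is an assumed reading of the model as described here and in the Figure 5 caption, not the authors' code; `rho` is the reinforcement parameter and `nu` controls how many new elements a novelty triggers.

```python
import random

def urn_with_triggering(steps=100_000, rho=10, nu=6, seed=0):
    """Minimal sketch (assumed reading) of an urn model with triggering.

    At every step a ball is drawn uniformly at random from the urn and
    appended to the sequence; the drawn element is reinforced with `rho`
    extra copies.  If the element had never been drawn before (a novelty),
    `nu + 1` brand-new elements are also added to the urn, enlarging the
    "adjacent possible".  The urn starts with N0 = nu + 1 distinct balls,
    as in the figure captions below.
    """
    rng = random.Random(seed)
    n0 = nu + 1
    urn = list(range(n0))            # each integer is a distinct element
    next_id = n0                     # id of the next brand-new element
    seen, sequence, distinct = set(), [], []

    for _ in range(steps):
        ball = rng.choice(urn)
        sequence.append(ball)
        urn.extend([ball] * rho)     # reinforcement step
        if ball not in seen:         # adjacent-possible step
            seen.add(ball)
            urn.extend(range(next_id, next_id + nu + 1))
            next_id += nu + 1
        distinct.append(len(seen))
    return sequence, distinct

if __name__ == "__main__":
    _, distinct = urn_with_triggering()
    # For nu < rho the number of distinct elements is expected to grow
    # sublinearly, D(t) ~ t**(nu/rho)  (Heaps' law).
    for t in (100, 1_000, 10_000, 100_000):
        print(t, distinct[t - 1])
```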


Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Heaps' law (a–e) and Zipf's law (f–j) in real datasets (a–d, f–i) and in the urn model with triggering (e, j).
Gutenberg (a, f), Last.fm (b, g), Wikipedia (c, h) and del.icio.us (d, i) datasets, and the urn model with triggering (e, j). Straight lines in the Heaps' law plots show functions of the form f(x) = ax^β, with exponent β = 0.45 (Gutenberg), β = 0.68 (Last.fm lyrics), β = 0.56 (Last.fm artist), β = 0.77 (Wikipedia) and β = 0.78 (del.icio.us), and with β equal to the ratio ν/ρ in the urn model with triggering, showing that the Heaps' exponents predicted analytically for the model are confirmed by the simulations. Straight lines in the Zipf's law plots show functions of the form f(x) = ax^(−α), where the exponent α equals β^(−1) for the values of β listed above. Note that the frequency-rank plots of real data deviate from a pure power law, and that the correspondence between the exponents β and α holds only asymptotically (see the discussion above and the Supplementary Information for finite-size effects).
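
To check the two laws on an arbitrary event sequence, one can estimate the Heaps' exponent β from a log–log fit of D(t), the number of distinct elements after t events, versus t, and the Zipf exponent α from a log–log fit of frequency versus rank. The snippet below is a rough sketch of such a fit, not the fitting procedure used for the figure.

```python
from collections import Counter
import math

def heaps_curve(sequence):
    """D(t): number of distinct elements after the first t events."""
    seen, curve = set(), []
    for x in sequence:
        seen.add(x)
        curve.append(len(seen))
    return curve

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x): a crude power-law exponent."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

def heaps_exponent(sequence):
    """Estimate beta in D(t) ~ a * t**beta."""
    curve = heaps_curve(sequence)
    return loglog_slope(range(1, len(curve) + 1), curve)

def zipf_exponent(sequence):
    """Estimate alpha in f(R) ~ a * R**(-alpha), with f the frequency of the
    element of rank R."""
    freqs = sorted(Counter(sequence).values(), reverse=True)
    return -loglog_slope(range(1, len(freqs) + 1), freqs)
```

Applied to a sequence generated by the urn sketch after the abstract, heaps_exponent should come out close to ν/ρ and zipf_exponent close to its inverse, up to the finite-size effects mentioned in the caption.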
Figure 2. Normalized entropy in real data and in the urn model with semantic triggering.
(a)–(d) Normalized entropy of the sequence associated with a specific label A vs. the number of events, k, with that label. For each k the entropy is averaged over the labels with the same number of occurrences. Results are shown for Wikipedia (a), the Delicious dataset (b), the Last.fm dataset (c) and the urn model with semantic triggering (d). For the Wikipedia and Last.fm datasets we used the respective sequences S_unique described in the Methods section, while for the Delicious dataset we used the full sequence of aggregated data. The plot for the model is an average over 10 realizations of the process, with parameters ρ = 8, ν = 10, η = 0.3 and N0 = ν + 1. The length of the considered sequences is N = 10^7; the corresponding Heaps' exponent follows from the relation between the model parameters and the Heaps' and Zipf's exponents given in the Supplementary Information. In all cases, results for the actual data are compared with the two null models described in the Methods section.
Figure 3. Distribution of triggering intervals in real data and in the urn model with semantic triggering.
Distribution of triggering intervals (see the Methods section for the definition) for the same data as in the entropy measurements of figure 2: Wikipedia (a), the Delicious dataset (b; here the local reshuffling was too time-consuming to compute, since there are as many labels as words), the Last.fm dataset (c) and the urn model with semantic triggering (d). The banner at the bottom (e) shows the sequence of a particular label A of the Last.fm dataset. The color code is red for the actual sequence and blue for its local reshuffle (see the Methods section).
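
The exact definition of the triggering intervals lives in the Methods section; assuming they reduce to the gaps between successive occurrences of the same label, the interval distribution and the local-reshuffle null model mentioned in this caption could be computed along the following lines (a sketch under that assumption, not the authors' procedure).

```python
import random
from collections import Counter

def occurrence_intervals(sequence, label):
    """Gaps between successive occurrences of `label` (assumed reading of the
    triggering intervals; the paper's Methods give the exact definition)."""
    positions = [i for i, x in enumerate(sequence) if x == label]
    return [b - a for a, b in zip(positions, positions[1:])]

def interval_distribution(sequence, label):
    """Histogram of the interval lengths for one label."""
    return Counter(occurrence_intervals(sequence, label))

def local_reshuffle(sequence, label, seed=0):
    """Null model: shuffle only the part of the sequence that follows the
    first occurrence of `label`, leaving the earlier part untouched."""
    rng = random.Random(seed)
    first = sequence.index(label)
    head, tail = list(sequence[:first + 1]), list(sequence[first + 1:])
    rng.shuffle(tail)
    return head + tail
```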
Figure 4. Normalized entropy and distribution of triggering intervals in single books of the Gutenberg dataset.
(a), (b), (c) Normalized entropy of words (as described in the main text) vs. the number of word occurrences, k. For each k the entropy is averaged over the words with the same number of occurrences. Results are shown for the texts David Copperfield (a), Moby Dick (b) and the Iliad (in the original Greek) (c). (d), (e), (f) Distribution of triggering intervals for the same data. In all cases, results for the actual data are compared with the two null models described in the Methods section.
Figure 5. Models.
Simple urn model with triggering (a, b) and urn model with semantic triggering (c, d). (a) Generic reinforcement step of the evolution. An element (the gray ball) that had previously been drawn from the urn U is drawn again. In this case the element is added to the sequence S (depicted at the center of the figure) and, at the same time, ρ additional gray balls are put into U. (b) Generic adjacent-possible step of the evolution. Here, upon drawing a new ball (red) from U, ν + 1 brand-new balls are added to U, along with the ρ red balls of the reinforcement step that takes place at each time step. (c, d) Urn model with semantic triggering. Same as above, except that each ball now carries a label defining its semantic context. The label is conserved during a reinforcement event (e.g., the label A of the gray balls in panel c), while a brand-new label, C, is assigned to the ν + 1 balls added in an adjacent-possible event (panel d).
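
The semantic-triggering variant can be sketched by tagging every ball with a label, as this caption describes. One detail not specified here is how the parameter η (introduced in the Figure 2 caption) biases the draw; the sketch below assumes it down-weights balls whose label differs from that of the previously drawn ball, which is a guess to be checked against the Methods.

```python
import random

def urn_semantic_triggering(steps=2_000, rho=10, nu=6, eta=0.3, seed=0):
    """Hedged sketch of the urn model with semantic triggering.

    Every ball is an (element, label) pair.  Reinforced copies keep the label
    of the drawn ball; the nu + 1 balls added on a novelty all share one
    brand-new label.  ASSUMPTION: the draw gives weight 1 to balls whose
    label matches the label of the previously drawn ball and weight `eta`
    to every other ball -- the exact weighting rule is in the Methods.
    The naive weighted draw is O(|urn|) per step, so keep `steps` small.
    """
    rng = random.Random(seed)
    n0 = nu + 1
    urn = [(i, 0) for i in range(n0)]   # the initial balls share label 0
    next_elem, next_label = n0, 1
    seen, sequence = set(), []
    last_label = 0

    for _ in range(steps):
        weights = [1.0 if lbl == last_label else eta for _, lbl in urn]
        elem, lbl = rng.choices(urn, weights=weights, k=1)[0]
        sequence.append((elem, lbl))
        urn.extend([(elem, lbl)] * rho)                 # reinforcement, label kept
        if elem not in seen:                            # novelty: new semantic context
            seen.add(elem)
            urn.extend((next_elem + j, next_label) for j in range(nu + 1))
            next_elem += nu + 1
            next_label += 1
        last_label = lbl
    return sequence
```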
Figure 6. Entropy and intervals example.
Let us indicate with the same letter the occurrences of, e.g., lyrics of the same artist in the sequence. Suppose that A has just appeared in the sequence shown, which ends with G; A appears 4 times, i.e., k = 4. We divide the subsequence starting at the first occurrence of A into 4 parts and count the occurrences f_i of A in each of them (bottom numbers). The normalized entropy of A is then −(1/log k) Σ_i (f_i/k) log(f_i/k). As the value of S(k) we average the entropies of all elements occurring k times in the sequence. The numbers at the top show the lengths of the intervals used in the interval-distribution evaluation. The local reshuffling shuffles only the 15 elements occurring after the first occurrence of A, and computes the normalized entropy and the interval distribution on this reduced sequence.
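
The entropy measure of this example can be written compactly as follows; the base of the logarithm used for normalization and the convention that the subsequence starts at the first occurrence of A are assumptions consistent with the caption.

```python
import math
from collections import Counter

def normalized_entropy(sequence, label):
    """Entropy of the positions of `label`, as in the Figure 6 example.

    The subsequence starting at the first occurrence of `label` is split
    into k equal parts (k = total number of occurrences); f_i counts the
    occurrences in part i, and the entropy -sum (f_i/k) log(f_i/k) is
    normalized by log k (assumed normalization)."""
    k = sequence.count(label)
    if k < 2:
        return 0.0
    sub = sequence[sequence.index(label):]
    part = len(sub) / k
    counts = [0] * k
    for pos, x in enumerate(sub):
        if x == label:
            counts[min(int(pos / part), k - 1)] += 1
    probs = [f / k for f in counts if f > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(k)

def average_entropy_by_k(sequence):
    """S(k): normalized entropy averaged over all elements occurring k times."""
    by_k = {}
    for label, k in Counter(sequence).items():
        by_k.setdefault(k, []).append(normalized_entropy(sequence, label))
    return {k: sum(v) / len(v) for k, v in sorted(by_k.items())}
```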

References

    1. Lazer, D. et al. Computational social science. Science 323, 721–723 (2009).
    2. Kauffman, S. A. Investigations: The Nature of Autonomous Agents and the Worlds They Mutually Create. SFI Working Papers (Santa Fe Institute, 1996).
    3. Johnson, S. Where Good Ideas Come From: The Natural History of Innovation (Riverhead Hardcover, 2010).
    4. Wagner, A. & Rosen, W. Spaces of the possible: universal Darwinism and the wall between technological and biological innovation. J. R. Soc. Interface 11, 20131190 (2014).
    5. Kauffman, S. A. Investigations (Oxford University Press, New York/Oxford, 2000).
