Knowledge gaps in the early growth of semantic feature networks

Ann E Sizemore¹, Elisabeth A Karuza², Chad Giusti³, Danielle S Bassett⁴ ⁵ ⁶ ⁷

Affiliations

¹ Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA.
² Department of Psychology, College of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA.
³ Department of Mathematical Sciences, University of Delaware, Newark, DE, USA.
⁴ Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA. dsb@seas.upenn.edu.
⁵ Department of Physics and Astronomy, College of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA. dsb@seas.upenn.edu.
⁶ Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA. dsb@seas.upenn.edu.
⁷ Department of Electrical and Systems Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA. dsb@seas.upenn.edu.

PMID: 30333998
PMCID: PMC6186390
DOI: 10.1038/s41562-018-0422-4

Ann E Sizemore et al. Nat Hum Behav. 2018 Sep;2(9):682-692. doi: 10.1038/s41562-018-0422-4. Epub 2018 Sep 7.

Abstract

Understanding language learning, and more general knowledge acquisition, requires characterization of inherently qualitative structures. Recent work has applied network science to this task by creating semantic feature networks, in which words correspond to nodes and connections to shared features, then characterizing the structure of strongly inter-related groups of words. However, the importance of sparse portions of the semantic network - knowledge gaps - remains unexplored. Using applied topology we query the prevalence of knowledge gaps, which we propose manifest as cavities within the growing semantic feature network of toddlers. We detect topological cavities of multiple dimensions and find that despite word order variation, global organization remains similar. We also show that nodal network measures correlate with filling cavities better than basic lexical properties. Finally, we discuss the importance of semantic feature network topology in language learning and speculate that the progression through knowledge gaps may be a robust feature of knowledge acquisition.

Conflict of interest statement

Competing Interests: The authors declare no competing interests.

Figures

**Figure 1. Knowledge gaps manifesting as topological cavities within the growing semantic feature network.**
*(a)* (Left) Words are ordered by the first month at which ≥ 50% of reported children produce each word. As an example, the word ‘spoon’ is first produced by ≥ 50% of children at 19 months, so it is placed at the corresponding location within the growing complex (purple node, towards the bottom). The word ‘moose’ is similarly placed at the 28 month mark (sienna node, towards the top). (Right) Semantic features connect nouns (corresponding to nodes), forming the semantic feature network. (Center) Combining the binary feature network and word production times creates a growing semantic feature network with nodes entering based on the first month at which ≥ 50% of children can produce the word. *(b)* A ‘knowledge gap’ could be seen as a topological void within the semantic feature network. The connection pattern between ‘balloon’, ‘bear’, ‘cheese’, and ‘banana’ leaves a gap within the graph (top), but the addition of the node corresponding to ‘bus’ and its connections fills in the cavity (bottom).
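
The construction in panel (a) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the feature labels `f1` to `f4` and the production months below are hypothetical placeholders, the function name `growing_network` is invented here, and breaking ties within a month alphabetically is an assumption. Two words are joined when they share at least one feature, and words enter the network in order of production month.

```python
def growing_network(features, month):
    """Order words by production month and list the edges each new word brings.

    features: dict word -> set of feature labels
    month:    dict word -> first month at which >= 50% of children produce it
    Returns (order, steps), where steps[i] = (word, edges to earlier words).
    """
    # Assumption: ties within a month are broken alphabetically.
    order = sorted(features, key=lambda w: (month[w], w))
    steps, present = [], []
    for w in order:
        # An edge exists when two words share at least one feature.
        new_edges = [(u, w) for u in present if features[u] & features[w]]
        steps.append((w, new_edges))
        present.append(w)
    return order, steps


# Hypothetical toy input (placeholder features and months, not real data),
# chosen so that four words form a cycle and the fifth connects to all of them.
features = {
    "balloon": {"f1", "f4"},
    "bear": {"f1", "f2"},
    "cheese": {"f2", "f3"},
    "banana": {"f3", "f4"},
    "bus": {"f1", "f3"},
}
month = {"balloon": 16, "bear": 17, "cheese": 18, "banana": 19, "bus": 20}

order, steps = growing_network(features, month)
for word, edges in steps:
    print(word, edges)
```

In this toy input the first four words close a four-node cycle with no diagonals, and the last word attaches to every node on that cycle, mirroring the situation drawn in panel (b).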

**Figure 2. Persistent homology distinguishes random from structured generative models of node-filtered order complexes.**
Representative adjacency matrix (top), associated barcode plot (middle), and average Betti curves (bottom) for the *(a)* constant probability, *(b)* proportional probability, *(c)* modular growth, and *(d)* edge affinity models. Shaded regions in Betti curve plots indicate ±2 standard deviations.

**Figure 3. Topological cavities form and die within the semantic feature network with a pattern that is resistant to random node reordering.**
*(a)* Barcode and Betti curves for the growing semantic feature network. The word added when the cavity is born (killed) is written on the left (right) of the corresponding bar. (Inset) Graph of persistent cycles with words as nodes in alphabetical order. An edge for each persistent cavity in *(a)* exists from the birth to the death node. Edges are weighted by the persistent cycle lifetime and colored according to the dimension. *(b)* The degree of each node throughout the growth process. Color indicates the number of nodes added. Representative adjacency matrix (top), associated barcode (middle), and average Betti curves (bottom) for the *(c)* randomized nodes, *(d)* decreasing degree, *(e)* distance from v₀, and *(f)* randomized edges models. Shaded areas of Betti curves indicate ±2 standard deviations.
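
Three of the node reorderings named in panels (c) to (e) can be sketched as follows. The caption does not define them, so the readings used here are assumptions: "randomized nodes" is taken as a uniformly random permutation of the node order, "decreasing degree" as sorting by degree in the full network, and "distance from v₀" as sorting by shortest-path distance from a chosen start node. The function names and the toy graph are hypothetical.

```python
import random
from collections import deque


def randomized_order(adj, seed=0):
    """Assumed reading of 'randomized nodes': a random permutation of the nodes."""
    nodes = sorted(adj)
    random.Random(seed).shuffle(nodes)
    return nodes


def decreasing_degree_order(adj):
    """Assumed reading of 'decreasing degree': highest-degree nodes enter first."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))


def distance_order(adj, v0):
    """Assumed reading of 'distance from v0': sort by BFS distance from v0."""
    dist = {v0: 0}
    queue = deque([v0])
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    # Nodes unreachable from v0 are placed last.
    return sorted(adj, key=lambda v: (dist.get(v, float("inf")), v))


# Hypothetical toy graph as an adjacency dict (not the semantic feature network).
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}

print(randomized_order(adj, seed=1))
print(decreasing_degree_order(adj))
print(distance_order(adj, "e"))
```

Each function returns a node order that could replace the empirical word order before the filtration is rebuilt; the underlying edges are left unchanged.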

**Figure 4. Global semantic feature network architecture is consistent across maternal education levels despite local variations.**
(Left) Betti curves with barcodes overlaid and (right) persistent cycle networks for the *(a) secondary*, *(b) college*, and *(c) graduate* growing semantic feature networks. Red arrow in persistent cycle networks indicates a persistent cavity born and killed by the same word pair in each of the three education levels.

**Figure 5. Number of persistent cycles killed correlates with topological properties instead of lexical features.**
Scatter plots of the number of persistent cycles killed by each node against *(a)* corresponding word length (Spearman correlation, df = 118; *all*: r = −0.0661, p = 0.4734; *secondary*: r = 0.0998, p = 0.2781; *college*: r = −0.0881, p = 0.3386; *graduate*: r = 0.0023, p = 0.9799), *(b)* node degree (Spearman correlation, df = 118; *all*: r = 0.3027, p < 0.001; *secondary*: r = 0.2849, p = 0.0016; *college*: r = 0.3423, p < 0.001; *graduate*: r = 0.3730, p < 0.001), *(c)* betweenness centrality (Spearman correlation, df = 118; *all*: r = 0.2972, p < 0.001; *secondary*: r = 0.2489, p = 0.0061; *college*: r = 0.2995, p < 0.001; *graduate*: r = 0.3492, p < 0.001), and *(d)* clustering coefficient (Spearman correlation, df = 118; *all*: r = −0.2897, p = 0.0013; *secondary*: r = −0.2766, p = 0.0022; *college*: r = 0.3037, p < 0.001; *graduate*: r = −0.3626, p < 0.001). Lines of best fit overlaid. *(e)* Example node (outlined in white) that kills multiple cavities while retaining a low clustering coefficient. Triangles formed by the cavity-killing node are highlighted, and tessellated cycles are outlined in red.
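
For readers unfamiliar with the statistic, a Spearman correlation is a Pearson correlation computed on ranks, with tied values given their average rank. The sketch below shows that computation in plain Python; the function names and the small input lists are hypothetical and are not the study's data. If the usual convention df = n − 2 applies, df = 118 would correspond to 120 nodes.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values: a nodal measure and cycles killed per node.
node_degree = [1, 2, 2, 3, 5, 4]
cycles_killed = [0, 0, 1, 1, 3, 1]
print(spearman(node_degree, cycles_killed))
```

In practice a statistics library would also supply the p-value; this sketch only covers the coefficient itself.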

**Figure 6. Persistent homology detects longevity of topological cavities within node-filtered order complexes.**
*(a)* Example graph G (top) and its clique complex X(G) (bottom) created by filling in cliques, or all-to-all connected subgraphs of G. *(b)* Examples in dimensions 1–3 of cavities enclosed by cycles (closed paths of cliques) (top) and how an added node can tessellate a cycle thus filling in the cavity (bottom). *(c)* The clique complex from *(a)* with an ordering on the nodes (left), and the associated ordered adjacency matrix (right). *(d)* Steps 9–13 in the filtration created by taking the node-filtered order complex of the clique complex X(G) in *(c)* and the shown ordering. At each step a new node is added along with its connections to nodes already present in the complex. *(e)* Barcode (top) and Betti curves (bottom) for the example node-filtered order complex. The barcode shows the lifespan of a persistent cavity as a bar extending from [*birth, death*) node, and the Betti curves count the number of k-dimensional cavities as a function of nodes added. Lavender lines through *(c)*, *(d)*, and *(e)* connect the adjacency matrix row i to the clique complex at step i and to the persistent homology outputs.
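
The Betti curves in panel (e) count cavities after each node is added. A minimal sketch of that count for dimensions 0 and 1 is given below, assuming homology with coefficients in GF(2) and a clique complex truncated at triangles. It is not the authors' pipeline, it does not compute the birth and death pairing shown in the barcode, and the function names and five-node example (a four-node cycle followed by a node joined to all four) are hypothetical.

```python
from itertools import combinations


def rank_gf2(columns):
    """Rank over GF(2) of a matrix whose columns are given as int bitmasks."""
    pivots = {}  # highest set bit -> reduced column with that leading bit
    rank = 0
    for col in columns:
        while col:
            h = col.bit_length() - 1
            if h not in pivots:
                pivots[h] = col
                rank += 1
                break
            col ^= pivots[h]  # eliminate the leading bit and keep reducing
    return rank


def betti_0_1(nodes, edges):
    """(b0, b1) of the clique complex of the graph, using simplices up to triangles."""
    vidx = {v: i for i, v in enumerate(nodes)}
    eidx = {frozenset(e): i for i, e in enumerate(edges)}
    # Boundary of an edge: its two endpoints.
    d1 = [(1 << vidx[u]) | (1 << vidx[v]) for u, v in edges]
    # Triangles are 3-cliques; the boundary of a triangle is its three edges.
    triangles = [t for t in combinations(nodes, 3)
                 if all(frozenset(p) in eidx for p in combinations(t, 2))]
    d2 = [sum(1 << eidx[frozenset(p)] for p in combinations(t, 2))
          for t in triangles]
    r1, r2 = rank_gf2(d1), rank_gf2(d2)
    return len(nodes) - r1, len(edges) - r1 - r2


def betti_curves(order, edges):
    """Betti numbers after each node in `order` enters with its edges to earlier nodes."""
    curves = []
    for i in range(1, len(order) + 1):
        present = order[:i]
        pset = set(present)
        sub = [e for e in edges if e[0] in pset and e[1] in pset]
        curves.append(betti_0_1(present, sub))
    return curves


# Hypothetical example: a 4-cycle a-b-c-d, then node e joined to all four.
order = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"),
         ("a", "e"), ("b", "e"), ("c", "e"), ("d", "e")]
curves = betti_curves(order, edges)
print(curves)  # [(1, 0), (1, 0), (1, 0), (1, 1), (1, 0)]
```

Here the one-dimensional cavity appears when `d` closes the cycle and disappears when `e` tessellates it with four triangles, which in barcode terms would be a single bar from node 4 to node 5.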
