Proc Natl Acad Sci U S A. 2008 Feb 19;105(7):2745-50.
doi: 10.1073/pnas.0708424105. Epub 2008 Feb 11.

Bayesian learning of visual chunks by human observers

Gergo Orbán et al. Proc Natl Acad Sci U S A. 2008.

Abstract

Efficient and versatile processing of any hierarchically structured information requires a learning mechanism that combines lower-level features into higher-level chunks. We investigated this chunking mechanism in humans with a visual pattern-learning paradigm. We developed an ideal learner based on Bayesian model comparison that extracts and stores only those chunks of information that are minimally sufficient to encode a set of visual scenes. Our ideal Bayesian chunk learner not only reproduced the results of a large set of previous empirical findings in the domain of human pattern learning but also made a key prediction that we confirmed experimentally. In accordance with Bayesian learning but contrary to associative learning, human performance was well above chance when pair-wise statistics in the exemplars contained no relevant information. Thus, humans extract chunks from complex visual patterns by generating accurate yet economical representations and not by encoding the full correlational structure of the input.
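The idea of Bayesian model comparison favoring economical representations can be illustrated with a toy calculation. The sketch below is not the authors' ideal learner; it is a deliberately simplified stand-in that compares two hypothetical models for how two shapes, A and B, appear across scenes: an "independent" model with one appearance probability per shape, and a "chunk" model in which the two shapes always appear together with a single shared probability. Both models use uniform Beta(1, 1) priors, which is an assumption made here for convenience. The function names and the count layout `(n11, n10, n01, n00)` are illustrative only.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_evidence_independent(n11, n10, n01, n00):
    """Log marginal likelihood when A and B appear independently.

    Each shape has its own appearance probability with a uniform prior,
    so the evidence is a product of two Beta integrals.
    """
    n = n11 + n10 + n01 + n00
    k_a = n11 + n10  # scenes containing A
    k_b = n11 + n01  # scenes containing B
    return log_beta(k_a + 1, n - k_a + 1) + log_beta(k_b + 1, n - k_b + 1)


def log_evidence_chunk(n11, n10, n01, n00):
    """Log marginal likelihood when A and B form one chunk.

    The chunk appears with a single probability (uniform prior); any scene
    showing one shape without the other has probability zero under this model.
    """
    if n10 > 0 or n01 > 0:
        return float("-inf")
    return log_beta(n11 + 1, n00 + 1)


def log_bayes_factor(counts):
    """Log evidence ratio, chunk model over independent model."""
    return log_evidence_chunk(*counts) - log_evidence_independent(*counts)


# Counts are (both present, only A, only B, neither) over 20 hypothetical scenes.
together = (10, 0, 0, 10)  # A and B always co-occur
apart = (5, 5, 5, 5)       # A and B vary independently
```

With the `together` counts the log Bayes factor is positive: the chunk model explains the same data with one parameter instead of two, so its evidence is higher. With the `apart` counts the chunk model is ruled out entirely. This automatic penalty for unnecessary parameters is the general property of Bayesian model comparison that the abstract refers to as extracting only minimally sufficient chunks.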

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Experimental design. Schematic of scene generation in the experiments. Shapes from the inventory (Left) were organized into combos (pairs in this example). The spatial arrangement of the shapes within each combo was fixed across all scenes. For each familiarization scene, combos were pseudorandomly selected from this inventory and placed in adjacent positions within a rectangular grid (Center). Scenes were presented once every 2 s during familiarization. The test phase consisted of two-alternative forced-choice (2AFC) trials in which each of two scenes (Right) depicted a subset of shapes from the familiarization scenes. One subset was a true combo from the inventory (or a part thereof, called an embedded combo), and the other subset consisted of shapes from two different combos (a mixture combo).
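The scene-generation scheme in this caption can be sketched in a few lines. The following is a simplified illustration, not the experimental code: it ignores spatial layout and the grid entirely and treats each scene as a set of shape identifiers. The inventory size (six pairs), the number of combos per scene (three), the number of scenes (144), and all function names are assumptions chosen for the example.

```python
import random
from collections import Counter
from itertools import combinations


def make_inventory(n_pairs=6):
    """Group shape IDs 0..2n-1 into fixed pairs: (0, 1), (2, 3), ..."""
    return [(2 * i, 2 * i + 1) for i in range(n_pairs)]


def make_scene(inventory, combos_per_scene, rng):
    """Pick distinct combos at random and return the set of shapes shown."""
    chosen = rng.sample(inventory, combos_per_scene)
    return frozenset(shape for combo in chosen for shape in combo)


def cooccurrence(scenes):
    """Count, for every unordered shape pair, the scenes containing both."""
    counts = Counter()
    for scene in scenes:
        for a, b in combinations(sorted(scene), 2):
            counts[(a, b)] += 1
    return counts


rng = random.Random(0)  # fixed seed so the sketch is reproducible
inventory = make_inventory()
scenes = [make_scene(inventory, 3, rng) for _ in range(144)]

pair_counts = cooccurrence(scenes)
shape_counts = Counter(shape for scene in scenes for shape in scene)

true_pair = inventory[0]                            # shapes from one combo
mixture_pair = (inventory[0][0], inventory[1][0])   # shapes from two combos
```

In this sketch the two shapes of a true pair co-occur in every scene that contains either of them, so `pair_counts[true_pair]` equals `shape_counts[true_pair[0]]`, whereas a mixture pair co-occurs only when both parent combos happen to be selected together. That difference in co-occurrence is what a 2AFC test of true versus mixture combos probes.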
Fig. 2.
Summary of experimental manipulations [inventories (Top) and test types (Middle)], and discrimination performance (Bottom) of human participants (gray bars with dark shading indicating standard error of the mean, SEM). The predictions of the associative learner (AL, pink squares) and the Bayesian chunk learner (BCL, red stars) are shown for a series of experiments from refs and using increasingly complex inventories. (Colors were not included in the actual shapes seen by participants.) For a stringent comparison, the parameters of the AL were adjusted independently for each experiment to obtain best fits, whereas the BCL used a single parameter optimization across all experiments. (A) Inventory containing six equal-frequency pairs. Human performance was above chance on the basic test of true pairs vs. mixture pairs. (B) Inventory containing six pairs of varying frequency. Human performance was above chance on the test of true rare pairs vs. frequency-balanced mixture pairs. (C) Inventory containing four equal-frequency triplets. Human performance was above chance on the basic test of true triplets vs. mixture triplets and at chance on the test of embedded pairs vs. mixture pairs. (D) Inventory containing two quadruples and two pairs, all with equal frequency. Human performance was above chance on the basic tests of true quadruples or pairs vs. mixture quadruples or pairs, and on the test of embedded triplets vs. mixture triplets, but it was at chance on the test of embedded pairs vs. mixture pairs. Both models captured the overall pattern of human performance in all these experiments.
Fig. 3.
Correlation-balanced experiment contrasting the BCL and the AL. (A) The inventory of shapes and their rules of combination. The inventory consisted of two groups of four shapes (shaded boxes) and two pairs. Shapes in the first group of four were always shown as one of four triplets sharing the same four shapes; shapes in the other group of four were shown either as single shapes or sometimes as a quadruple. The numbers below each subset of combos indicate their ratio of presentation across the entire familiarization phase. Shapes in the two groups of four had the same occurrence frequencies (1/2) and within-group correlations (1/3). The familiarization scenes were composed and presented the same way as in all prior experiments. (B) The three tests used to assess human performance and the actual performance (gray bars with error bars indicating SEM) along with predictions of the AL (pink bars) and the BCL (red bars). One test contrasted true triplets from the first group of four with mixture triplets: human performance was above chance (P < 0.017) and was predicted by both models (Left). The second test contrasted triplets constructed from the shapes of the second group of four (“false” triplets) with mixture triplets: human performance was not above chance (P > 0.24), and only the BCL predicted this result (Center). The final test contrasted true with false triplets: human performance was above chance (P < 0.0001), and again only the BCL predicted this result (Right). Significance was assessed by two-tailed Student's t tests.
