An information integration theory of consciousness

Giulio Tononi

BMC Neurosci. 2004 Nov 2;5:42. doi: 10.1186/1471-2202-5-42.

Abstract

Background: Consciousness poses two main problems. The first is understanding the conditions that determine to what extent a system has conscious experience. For instance, why is our consciousness generated by certain parts of our brain, such as the thalamocortical system, and not by other parts, such as the cerebellum? And why are we conscious during wakefulness and much less so during dreamless sleep? The second problem is understanding the conditions that determine what kind of consciousness a system has. For example, why do specific parts of the brain contribute specific qualities to our conscious experience, such as vision and audition?

Presentation of the hypothesis: This paper presents a theory about what consciousness is and how it can be measured. According to the theory, consciousness corresponds to the capacity of a system to integrate information. This claim is motivated by two key phenomenological properties of consciousness: differentiation - the availability of a very large number of conscious experiences; and integration - the unity of each such experience. The theory states that the quantity of consciousness available to a system can be measured as the Phi value of a complex of elements. Phi is the amount of causally effective information that can be integrated across the informational weakest link of a subset of elements. A complex is a subset of elements with Phi>0 that is not part of a subset of higher Phi. The theory also claims that the quality of consciousness is determined by the informational relationships among the elements of a complex, which are specified by the values of effective information among them. Finally, each particular conscious experience is specified by the value, at any given time, of the variables mediating informational interactions among the elements of a complex.
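In compact notation (a restatement of the definitions just given, not additional theory; Hmax denotes maximum entropy, and the normalization used to select the minimum information bipartition follows the construction detailed in the figure legends below):

```latex
% Effective information across a bipartition {A, B} of a subset S,
% the minimum information bipartition (MIB), and Phi.
\[
  EI(A \leftrightarrow B) = EI(A \to B) + EI(B \to A)
\]
\[
  MIB(S) = \arg\min_{\{A,B\}}
           \frac{EI(A \leftrightarrow B)}{\min\{H^{\max}(A),\, H^{\max}(B)\}},
  \qquad
  \Phi(S) = EI\big(A^{MIB} \leftrightarrow B^{MIB}\big)
\]
% A complex is a subset S with Phi(S) > 0 that is not contained
% in any subset of higher Phi.
```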

Testing the hypothesis: The information integration theory accounts, in a principled manner, for several neurobiological observations concerning consciousness. As shown here, these include the association of consciousness with certain neural systems rather than with others; the fact that neural processes underlying consciousness can influence or be influenced by neural processes that remain unconscious; the reduction of consciousness during dreamless sleep and generalized seizures; and the time requirements on neural interactions that support consciousness.

Implications of the hypothesis: The theory entails that consciousness is a fundamental quantity, that it is graded, that it is present in infants and animals, and that it should be possible to build conscious artifacts.


Figures

Figure 1
Effective information, minimum information bipartition, and complexes.

a. Effective information. Shown is a single subset S of 4 elements ({1,2,3,4}, blue circle), forming part of a larger system X (black ellipse). This subset is bisected into A and B by a bipartition ({1,3}/{2,4}, indicated by the dotted grey line). Arrows indicate causally effective connections linking A to B and B to A across the bipartition (other connections may link both A and B to the rest of the system X). To measure EI(A→B), maximum entropy Hmax is injected into the outgoing connections from A (corresponding to independent noise sources), and the entropy of the states of B that is due to the input from A is measured. Note that A can affect B directly through connections linking the two subsets, as well as indirectly via X. Applying maximum entropy to B allows one to measure EI(B→A). The effective information for this bipartition is EI(A⇄B) = EI(A→B) + EI(B→A).

b. Minimum information bipartition. For subset S = {1,2,3,4}, the horizontal bipartition {1,3}/{2,4} yields a positive value of EI. However, the bipartition {1,2}/{3,4} yields EI = 0 and is a minimum information bipartition (MIB) for this subset. The other bipartitions of subset S = {1,2,3,4} are {1,4}/{2,3}, {1}/{2,3,4}, {2}/{1,3,4}, {3}/{1,2,4}, and {4}/{1,2,3}, all with EI > 0.

c. Analysis of complexes. By considering all subsets of system X one can identify its complexes and rank them by their respective values of Φ, the value of EI for their minimum information bipartition. Assuming that the other elements in X are disconnected, it is easy to see that Φ > 0 for subsets {3,4} and {1,2}, but Φ = 0 for subsets {1,3}, {1,4}, {2,3}, {2,4}, {1,2,3}, {1,2,4}, {1,3,4}, {2,3,4}, and {1,2,3,4}. Subsets {3,4} and {1,2} are not part of a larger subset having higher Φ, and therefore they constitute complexes. This is indicated schematically by encircling them with a grey oval (darker grey indicates higher Φ).

Methodological note. In order to identify complexes and their Φ(S) for systems with many different connection patterns, each system X was implemented as a stationary multidimensional Gaussian process, so that values for effective information could be obtained analytically (details in [8]). Briefly, numerous model systems X were implemented, each composed of n neural elements with connections CON_ij specified by a connection matrix CON(X) (no self-connections). In order to compare different architectures, CON(X) was normalized so that the absolute value of the sum of the afferent synaptic weights per element corresponded to a constant value w < 1 (here w = 0.5). If the system's dynamics corresponds to a multivariate Gaussian random process, its covariance matrix COV(X) can be derived analytically. As in previous work, we consider the vector X of random variables that represents the activity of the elements of X, subject to independent Gaussian noise R of magnitude c. When the elements settle under stationary conditions, X = X * CON(X) + cR. By defining Q = (1 − CON(X))^−1 and averaging over the states produced by successive values of R, we obtain the covariance matrix COV(X) = <X^t * X> = <Q^t * R^t * R * Q> = Q^t * Q, where the superscript t refers to the transpose. Under Gaussian assumptions, all deviations from independence among the two complementary parts A and B of a subset S of X are expressed by the covariances among the respective elements. Given these covariances, values for the individual entropies H(A) and H(B), as well as for the joint entropy of the subset H(S) = H(AB), can be obtained as, for example, H(A) = (1/2) ln[(2πe)^n |COV(A)|], where |·| denotes the determinant. The mutual information between A and B is then given by MI(A;B) = H(A) + H(B) − H(AB); note that MI(A;B) is symmetric and positive. To obtain the effective information between A and B within model systems, independent noise sources in A are enforced by setting to zero the strength of the connections within A and of those afferent to A. The covariance matrix for A is then equal to the identity matrix (given independent Gaussian noise), so any statistical dependence between A and B must be due to the causal effects of A on B, mediated by the efferent connections of A. Moreover, all possible outputs from A that could affect B are evaluated. Under these conditions, EI(A→B) = MI(A^Hmax; B). The independent Gaussian noise R applied to A is multiplied by c_p, the perturbation coefficient, while the independent Gaussian noise applied to the rest of the system is multiplied by c_i, the intrinsic noise coefficient. Here c_p = 1 and c_i = 0.00001, in order to emphasize the role of the connectivity and minimize that of noise. To identify complexes and obtain their capacity for information integration, one considers every subset S of X composed of k elements, with k = 2,..., n. For each subset S, all bipartitions are considered and EI(A⇄B) is calculated for each of them. The minimum information bipartition MIB(S) is the bipartition for which the normalized effective information reaches a minimum, and Φ(S) is the corresponding (unnormalized) value of EI. The complexes of X are those subsets S with Φ > 0 that are not included within a subset having higher Φ; they are ranked by their Φ(S) value, and the complex with the maximum value of Φ(S) is the main complex. MATLAB functions used for calculating effective information and complexes are at .
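To make the procedure above concrete, here is a minimal NumPy sketch of the same steps, under the stated Gaussian assumptions. All names (normalize, ei_one_way, phi, complexes) are illustrative stand-ins for the authors' MATLAB functions, and the normalization used to pick the minimum information bipartition is simplified to min(|A|, |B|) in place of min(Hmax(A), Hmax(B)):

```python
# Illustrative sketch of the Gaussian EI/Phi procedure described above;
# not the authors' code. Convention: CON[i, j] is the weight from i to j.
import numpy as np
from itertools import combinations

def normalize(CON, w=0.5):
    """Scale afferent weights so their absolute values sum to w < 1."""
    C = np.array(CON, dtype=float)
    np.fill_diagonal(C, 0.0)                      # no self-connections
    total = np.abs(C).sum(axis=0)                 # afferents arrive column-wise
    total[total == 0] = 1.0
    return C / total * w

def covariance(CON, c):
    """Stationary covariance of X = X*CON + c*R (R: unit-variance noise)."""
    Q = np.linalg.inv(np.eye(CON.shape[0]) - CON) # Q = (1 - CON)^-1
    return Q.T @ np.diag(np.asarray(c) ** 2) @ Q  # COV = Q^t diag(c^2) Q

def entropy(COV, idx):
    """Gaussian entropy H = (1/2) ln[(2*pi*e)^n |COV|] of a subset."""
    sub = COV[np.ix_(list(idx), list(idx))]
    _, logdet = np.linalg.slogdet(sub)
    return 0.5 * (len(idx) * np.log(2 * np.pi * np.e) + logdet)

def ei_one_way(CON, A, B, cp=1.0, ci=1e-5):
    """EI(A->B): cut connections within/afferent to A, drive A with
    max-entropy noise (c_p), and measure the resulting MI(A;B)."""
    A, B = list(A), list(B)
    C = CON.copy()
    C[:, A] = 0.0                                 # A now receives only noise
    c = np.full(CON.shape[0], ci)                 # intrinsic noise elsewhere
    c[A] = cp
    COV = covariance(C, c)
    return entropy(COV, A) + entropy(COV, B) - entropy(COV, A + B)

def ei(CON, A, B):
    """EI(A <-> B) = EI(A->B) + EI(B->A)."""
    return ei_one_way(CON, A, B) + ei_one_way(CON, B, A)

def bipartitions(S):
    """Each bipartition {A, B} of S exactly once (S[0] fixed in A)."""
    first, rest = S[0], list(S[1:])
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            yield (first,) + extra, tuple(e for e in rest if e not in extra)

def phi(CON, S):
    """Phi(S): EI across the minimum information bipartition of S."""
    best, best_norm = 0.0, np.inf
    for A, B in bipartitions(tuple(S)):
        v = ei(CON, A, B)
        norm = v / min(len(A), len(B))            # simplified normalization
        if norm < best_norm:
            best_norm, best = norm, v
    return best

def complexes(CON, tol=1e-6):
    """Subsets with Phi > 0 not contained in a subset of higher Phi,
    ranked by Phi. Exhaustive over all subsets: exponential in n."""
    n = CON.shape[0]
    subsets = [S for k in range(2, n + 1) for S in combinations(range(n), k)]
    phis = {S: phi(CON, S) for S in subsets}
    return sorted([(S, p) for S, p in phis.items()
                   if p > tol and not any(set(S) < set(T) and phis[T] > p + tol
                                          for T in subsets)],
                  key=lambda sp: -sp[1])
```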
Figure 2
Effective information matrix and activity states for two complexes having the same value of Φ.

a. Causal interactions diagram and analysis of complexes. Shown are two systems, one with a "divergent" architecture (left) and one with a "chain" architecture (right). The analysis of complexes shows that both contain a complex of four elements having a Φ value of 10.

b. Effective information matrix. Shown is the effective information matrix for the two complexes above. For each complex, all bipartitions are indicated by listing one part (subset A) on the upper row and the complementary part (subset B) on the lower row. In between are the values of effective information from A to B and from B to A for each bipartition, color-coded as black (zero), red (intermediate value), and yellow (high value). Note that the effective information matrix is different for the two complexes, even though Φ is the same. The effective information matrix defines the set of informational relationships, or "qualia space", for each complex. Note that the effective information matrix refers exclusively to the informational relationships within the main complex (relationships with elements outside the main complex, represented here by empty circles, do not contribute to qualia space).

c. State diagram. Shown are five representative states for the two complexes. Each is represented by the activity state of the four elements of each complex arranged in a column (blue: active elements; black: inactive ones). The five states can be thought of, for instance, as evolving in time due to the intrinsic dynamics of the system or to inputs from the environment. Although the states are identical for the two complexes, their meaning is different because of the difference in the effective information matrix. The last four columns represent four special states, those corresponding to the activation of one element at a time. Such states, if achievable, would correspond most closely to the specific "quale" contributed by that particular element in that particular complex.
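The bipartition-by-bipartition tabulation in panel b can be reproduced, in the terms of the hypothetical Figure 1 sketch, by evaluating both directions of effective information for every bipartition of a complex; the color coding is just a rendering of these numbers:

```python
# Effective information matrix of a complex S: EI(A->B) and EI(B->A)
# for every bipartition, reusing bipartitions() and ei_one_way()
# from the Figure 1 sketch (illustrative names).
def ei_matrix(CON, S):
    return [(A, B, ei_one_way(CON, A, B), ei_one_way(CON, B, A))
            for A, B in bipartitions(tuple(S))]
```

Two complexes can share the same Φ yet differ in this matrix, which is exactly what the legend means by their having different "qualia spaces".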
Figure 3
Information integration for a thalamocortical-like architecture.

a. Optimization of information integration for a system that is both functionally specialized and functionally integrated. Shown is the causal interaction diagram for a network whose connection matrix was obtained by optimization for Φ (Φ = 74 bits). Note the heterogeneous arrangement of the incoming and outgoing connections: each element is connected to a different subset of elements, with different weights. Further analysis indicates that this network jointly maximizes functional specialization and functional integration among its 8 elements, thereby resembling the anatomical organization of the thalamocortical system [8].

b. Reduction of information integration through loss of specialization. The same amount of connectivity, distributed homogeneously to eliminate functional specialization, yields a complex with a much lower value of Φ (Φ = 20 bits).

c. Reduction of information integration through loss of integration. The same amount of connectivity, distributed in such a way as to form four independent modules and thereby eliminate functional integration, yields four separate complexes with much lower values of Φ (Φ = 20 bits).
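The legend does not specify how the connection matrix in panel a was optimized. As a purely illustrative stand-in, a naive stochastic hill-climb over the connection matrix, reusing the hypothetical normalize() and phi() from the Figure 1 sketch, conveys the idea:

```python
# Naive stochastic hill-climbing toward a high-Phi connection matrix.
# Illustrative only; the optimization actually used is not specified
# in this legend.
import numpy as np

def optimize_phi(n=8, steps=500, w=0.5, seed=0):
    rng = np.random.default_rng(seed)
    CON = normalize(rng.random((n, n)), w)
    S = tuple(range(n))
    best = phi(CON, S)
    for _ in range(steps):
        trial = CON + 0.1 * rng.standard_normal((n, n))   # perturb weights
        trial = normalize(np.clip(trial, 0.0, None), w)   # keep total weight w
        p = phi(trial, S)
        if p > best:                                      # accept improvements
            CON, best = trial, p
    return CON, best
```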
Figure 4
Information integration and complexes for other neural-like architectures.

a. Schematic of a cerebellum-like organization. Shown are three modules of eight elements each, with many feedforward and lateral connections within each module but minimal connections among them. The analysis of complexes reveals three separate complexes with low values of Φ (Φ = 20 bits). There is also a large complex encompassing all the elements, but its Φ value is extremely low (Φ = 5 bits).

b. Schematic of the organization of a reticular activating system. Shown is a single subcortical "reticular" element providing common input to the eight elements of a thalamocortical-like main complex (both specialized and integrated, Φ = 61 bits). Despite the diffuse projections from the reticular element onto the main complex, the complex comprising all 9 elements has a much lower value of Φ (Φ = 10 bits).

c. Schematic of the organization of afferent pathways. Shown are three short chains that stand for afferent pathways. Each chain connects to a port-in of a thalamocortical-like main complex (both specialized and integrated) having a high value of Φ (61 bits). Note that the afferent pathways and the elements of the main complex together constitute a large complex, but its Φ value is low (Φ = 10 bits). Thus, elements in afferent pathways can affect the main complex without belonging to it.

d. Schematic of the organization of efferent pathways. Shown are three short chains that stand for efferent pathways. Each chain receives a connection from a port-out of the thalamocortical-like main complex. In this case too, the efferent pathways and the elements of the main complex together constitute a large complex, but its Φ value is low (Φ = 10 bits).

e. Schematic of the organization of cortico-subcortico-cortical loops. Shown are three short chains that stand for cortico-subcortico-cortical loops, which are connected to the main complex at both ports-in and ports-out. Again, the subcortical loops and the elements of the main complex together constitute a large complex, but its Φ value is low (Φ = 10 bits). Thus, elements in loops connected to the main complex can affect it without belonging to it. Note, however, that the addition of these three loops slightly increased the Φ value of the main complex (from Φ = 61 to Φ = 63 bits) by providing additional pathways for interactions among its elements.
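For concreteness, the cerebellum-like architecture of panel a is easy to express as a connection matrix in the terms of the hypothetical Figure 1 sketch; the module count and sizes below follow the figure:

```python
# Cerebellum-like connectivity: dense feedforward/lateral links within
# each module, none between modules (panel a, schematically).
import numpy as np

def modular_con(modules=3, size=8, w=0.5, seed=1):
    rng = np.random.default_rng(seed)
    n = modules * size
    CON = np.zeros((n, n))
    for m in range(modules):
        block = slice(m * size, (m + 1) * size)
        CON[block, block] = rng.random((size, size))  # within-module only
    return normalize(CON, w)

# phi(modular_con(), range(8)) scores one module; note that exhaustively
# enumerating complexes (as in the Figure 1 sketch) is exponential in n.
```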
Figure 5
Information integration and complexes after anatomical and functional disconnections.

a. Schematic of a split-brain-like anatomical disconnection. Top: shown is a large main complex obtained by connecting two thalamocortical-like subsets through "callosum-like" reciprocal connections. There is also a single element that projects to all other elements, representing "subcortical" common input. Note that the Φ value for the main complex (16 elements) is high (Φ = 72 bits). There is also a larger complex including the "subcortical" element, but its Φ value is low (Φ = 10 bits). Bottom: if the "callosum-like" connections are cut, one obtains two 8-element complexes, corresponding to the two "hemispheres", whose Φ value is reduced but still high (Φ = 61 bits). The two "hemispheres" still share some information due to common input from the "subcortical" element, with which they form a large complex of low Φ.

b. Schematic of a functional disconnection. Top: shown is a large main complex obtained by linking with reciprocal connections a "supramodal" module of four elements (cornerstone) with a "visual" module (to its right) and an "auditory" module (below). Note that there are no direct connections between the "visual" and "auditory" modules. The 12 elements together form a main complex with Φ = 61 bits. Bottom: if the "auditory" module is functionally disconnected from the "supramodal" one by inactivating its four elements (indicated in blue), the main complex shrinks to include just the "supramodal" and "visual" modules. In this case, the Φ value is only minimally reduced (Φ = 57 bits).
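In the same vocabulary, the two kinds of disconnection in this figure differ only in which entries of the connection matrix are zeroed before re-running the complex analysis (a hedged sketch using the hypothetical Figure 1 helpers):

```python
# Anatomical vs. functional disconnection as operations on CON, to be
# re-analyzed with phi()/complexes() from the Figure 1 sketch.
import numpy as np

def cut_connections(CON, left, right):
    """'Callosum-like' cut: remove the reciprocal links between two halves."""
    C = CON.copy()
    C[np.ix_(list(left), list(right))] = 0.0
    C[np.ix_(list(right), list(left))] = 0.0
    return C

def inactivate(CON, dead):
    """Functional disconnection: silence elements by removing all their
    incoming and outgoing connections."""
    C = CON.copy()
    C[list(dead), :] = 0.0
    C[:, list(dead)] = 0.0
    return C
```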

References

    1. Tononi G, Edelman GM. Consciousness and complexity. Science. 1998;282:1846–1851. doi: 10.1126/science.282.5395.1846.
    2. Tononi G. Information measures for conscious experience. Arch Ital Biol. 2001;139:367–371.
    3. Tononi G. Consciousness and the brain: theoretical aspects. In: Adelman G, Smith B, editors. Encyclopedia of Neuroscience. 3rd ed. Elsevier; 2004.
    4. Shannon CE, Weaver W. The Mathematical Theory of Communication. Urbana: University of Illinois Press; 1963.
    5. Sperry R. Consciousness, personal identity and the divided brain. Neuropsychologia. 1984;22:661–673. doi: 10.1016/0028-3932(84)90093-9.