Front Neuroinform. 2016 Nov 22;10:49. doi: 10.3389/fninf.2016.00049. eCollection 2016.

Pyrcca: Regularized Kernel Canonical Correlation Analysis in Python and Its Applications to Neuroimaging


Natalia Y Bilenko et al. Front Neuroinform.

Abstract

In this article we introduce Pyrcca, an open-source Python package for performing canonical correlation analysis (CCA). CCA is a multivariate analysis method for identifying relationships between sets of variables. Pyrcca supports CCA with or without regularization, and with or without linear, polynomial, or Gaussian kernelization. We first use an abstract example to describe Pyrcca functionality. We then demonstrate how Pyrcca can be used to analyze neuroimaging data. Specifically, we use Pyrcca to implement cross-subject comparison in a natural movie functional magnetic resonance imaging (fMRI) experiment by finding a data-driven set of functional response patterns that are similar across individuals. We validate this cross-subject comparison method in Pyrcca by predicting responses to novel natural movies across subjects. Finally, we show how Pyrcca can reveal retinotopic organization in brain responses to natural movies without the need for an explicit model.
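As a concrete illustration of the method the abstract describes, the sketch below estimates the first canonical pair for two datasets using regularized linear CCA in NumPy. It is a minimal, illustrative implementation under standard assumptions (ridge terms on the covariance matrices, classical eigenproblem formulation); the helper name `rcca_pair` is hypothetical and this is not the Pyrcca implementation itself.

```python
import numpy as np

def rcca_pair(X, Y, reg=1e-3):
    """First canonical pair for two datasets via regularized linear CCA.

    Solves the classical CCA eigenproblem
        Cxx^-1 Cxy Cyy^-1 Cyx a = rho^2 a,
    with ridge terms reg * I added to the auto-covariances to keep
    their inverses stable. Illustrative sketch, not the Pyrcca API.
    """
    n = X.shape[0]
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # M = Cxx^-1 Cxy Cyy^-1 Cyx; its leading eigenvector gives the
    # canonical weights a for X, and b follows from a.
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
    evals, evecs = np.linalg.eig(M)
    a = np.real(evecs[:, np.argmax(np.real(evals))])
    b = np.linalg.solve(Cyy, Cxy.T @ a)
    # canonical correlation between the two projected variates
    u, v = X @ a, Y @ b
    rho = abs(np.corrcoef(u, v)[0, 1])
    return a, b, rho
```

The regularization coefficient `reg` trades off fidelity to the sample covariances against numerical stability, which matters in neuroimaging settings where the number of features (voxels) can far exceed the number of samples (time points).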

Keywords: Python; canonical correlation analysis; covariance analysis; cross-subject alignment; fMRI; partial least squares regression.


Figures

Figure 1
Pyrcca workflow. [1] The Pyrcca Python module is imported using the command import rcca. [2] A CCA object is initialized in one of two ways. If specific hyperparameters (the regularization coefficient and the number of canonical components) are used, the rcca.CCA class object is initialized. If the hyperparameters are chosen empirically using cross-validation, then the rcca.CCACrossValidate class object is initialized. [3] The CCA mapping is estimated using the rcca.train() method with training datasets dataset1, dataset2, etc. [4] Once the CCA mapping is estimated, its accuracy can be tested using the method rcca.validate() with held-out datasets vdataset1, vdataset2, etc. [5] The variance explained by each estimated canonical component for each feature in the held-out data is computed using the method rcca.compute_ev() with held-out datasets vdataset1, vdataset2, etc.
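The five workflow steps in the caption can be mirrored end-to-end by a small stand-in class. `TinyCCA` below is an illustrative NumPy sketch for two datasets and a single canonical component, with `train`, `validate`, and `compute_ev` methods following steps [3]-[5] of the caption; it is not the actual `rcca.CCA` class, and its internals (ridge-regularized eigenproblem, per-feature regression for prediction) are our own simplifying assumptions.

```python
import numpy as np

class TinyCCA:
    """Minimal stand-in mirroring the Pyrcca workflow for two datasets
    and one canonical component. Illustrative only, not rcca.CCA."""

    def __init__(self, reg=1e-3):
        self.reg = reg  # regularization coefficient, cf. step [2]

    def train(self, data):
        # [3] Estimate the CCA mapping from the training datasets.
        X, Y = (d - d.mean(axis=0) for d in data)
        n = X.shape[0]
        Cxx = X.T @ X / n + self.reg * np.eye(X.shape[1])
        Cyy = Y.T @ Y / n + self.reg * np.eye(Y.shape[1])
        Cxy = X.T @ Y / n
        M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
        evals, evecs = np.linalg.eig(M)
        self.a = np.real(evecs[:, np.argmax(np.real(evals))])
        # per-feature regression of dataset 2 on the canonical variate
        # u = X a, used to predict held-out responses in validate()
        u = X @ self.a
        self.coef = (u @ Y) / (u @ u)
        self.xmean = data[0].mean(axis=0)
        self.ymean = data[1].mean(axis=0)
        return self

    def _predict(self, Xv):
        u = (Xv - self.xmean) @ self.a
        return np.outer(u, self.coef) + self.ymean

    def validate(self, vdata):
        # [4] Test accuracy on held-out data: correlation between the
        # true and predicted responses, per feature of dataset 2.
        Xv, Yv = vdata
        pred = self._predict(Xv)
        return np.array([np.corrcoef(pred[:, j], Yv[:, j])[0, 1]
                         for j in range(Yv.shape[1])])

    def compute_ev(self, vdata):
        # [5] Fraction of held-out variance explained, per feature.
        Xv, Yv = vdata
        resid = Yv - self._predict(Xv)
        return 1.0 - resid.var(axis=0) / Yv.var(axis=0)
```

A typical call sequence then mirrors the caption: `model = TinyCCA().train([dataset1, dataset2])`, followed by `model.validate([vdataset1, vdataset2])` and `model.compute_ev([vdataset1, vdataset2])` on held-out data.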
Figure 2
Accuracy of cross-subject prediction with Pyrcca. (A) Cross-subject prediction performance for subject 1 plotted on a flattened cortical map. The cortical map was created by digitally inflating the cortical surface of each hemisphere of the brain of subject 1, and then making relaxation cuts to create a flat map. Only the occipital lobe is shown here. Known regions of interest were identified in a separate retinotopic mapping experiment and are outlined in white. Each location in the cortical map represents a single voxel. The color of each voxel corresponds to the correlation between the held-out BOLD responses and responses predicted from the corresponding BOLD responses of subjects 2 and 3, and the estimated canonical components. Correlations for voxels in which prediction accuracy fell below the significance threshold (p < 0.05, corrected for multiple comparisons) are set to 0. The subject's responses are well predicted for voxels throughout the visual cortex. (B) Cross-subject prediction performance for all subjects plotted as overlaid histograms. The correlations for subject 1 are plotted in red, the correlations for subject 2 are plotted in blue, and the correlations for subject 3 are plotted in green. The black vertical line indicates the threshold of statistical significance (p < 0.05, corrected for multiple comparisons). Cross-subject prediction accuracy is consistent across subjects. This figure demonstrates that Pyrcca can be used to accurately predict BOLD responses to novel visual stimuli based on cross-subject similarity.
Figure 3
Cortical map of voxelwise canonical weights. The canonical weights for all three canonical components estimated by Pyrcca are shown on a flattened cortical map for subject 1. Each of the canonical components is assigned to one color channel. The first canonical component is represented by the red channel, the second canonical component is represented by green, and the third canonical component is represented by blue. Thus, the color of each voxel reflects its canonical weights for all three canonical components, as shown in the three-dimensional RGB colormap at the center of the figure. Canonical weights have been rescaled to span the range from zero to one, and the absolute value of the weights has been taken. This map shows how the BOLD responses of each voxel are described by the three canonical components. The recovered map reveals retinotopic organization of the visual cortex.
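The weight-to-color mapping described in this caption (absolute value, rescale each component to [0, 1], one component per RGB channel) can be sketched in a few lines of NumPy; `weights_to_rgb` is a hypothetical helper for illustration, not part of Pyrcca.

```python
import numpy as np

def weights_to_rgb(weights):
    """Map voxelwise canonical weights (n_voxels x 3) to RGB colors:
    take absolute values, then rescale each component's weights to span
    [0, 1], so components 1/2/3 drive the red/green/blue channels.
    Hypothetical helper, not part of the Pyrcca package."""
    w = np.abs(np.asarray(weights, dtype=float))
    lo = w.min(axis=0)
    span = w.max(axis=0) - lo
    span[span == 0] = 1.0  # guard against a constant column
    return (w - lo) / span  # one RGB triple per voxel
```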
Figure 4
Cortical maps of canonical weights and variance explained by each canonical component. Each panel shows both the canonical weights for one of the estimated canonical components and the variance of the held-out BOLD responses that was explained by that canonical component. Each voxel is colored according to a two-dimensional colormap shown in the center of each panel. The hue represents the canonical weight of each voxel. Blue indicates negative canonical weights, white indicates zero weights, and red indicates positive canonical weights. The canonical weights have been rescaled to span the range from −1 to 1. The brightness reflects the variance of each voxel's held-out BOLD responses that is explained by that canonical component. The variance ranges from 0 to 75%. (A) The first component best explains responses of the voxels that represent the visual periphery. (B) The second component best explains a contrast between responses of voxels located in V1 and those located in MT+ and intraparietal sulcus. (C) The third component explains a contrast between voxels that represent the visual fovea and those located in MT+ and intraparietal sulcus.
