J Cell Biol. 2021 May 3;220(5):e202010003.
doi: 10.1083/jcb.202010003.

Parameter-free molecular super-structures quantification in single-molecule localization microscopy

Mattia Marenda et al. J Cell Biol. 2021.

Abstract

Understanding biological function requires the identification and characterization of complex patterns of molecules. Single-molecule localization microscopy (SMLM) can quantitatively measure molecular components and interactions at resolutions far beyond the diffraction limit, but this information is only useful if these patterns can be quantified and interpreted. We provide a new approach for the analysis of SMLM data that develops the concept of structures and super-structures formed by interconnected elements, such as smaller protein clusters. Using a formal framework and a parameter-free algorithm, (super-)structures formed from smaller components are found to be abundant in classes of nuclear proteins, such as heterogeneous nuclear ribonucleoprotein particles (hnRNPs), but are absent from ceramides located in the plasma membrane. We suggest that mesoscopic structures formed by interconnected protein clusters are common within the nucleus and have an important role in the organization and function of the genome. Our algorithm, SuperStructure, can be used to analyze and explore complex SMLM data and extract functionally relevant information.

Figures

Figure 1.
Working principle of SuperStructure analysis. Left: SMLM data are taken as input for the analysis. Center left: Cluster analysis is run using the DBSCAN algorithm with Nmin=0 and ε progressively increasing in an adequate range for the system. SuperStructure curves describing the number of detected clusters Nc as a function of ε are generated. Center right: SuperStructure curves are plotted and inspected to identify super-cluster regimes representing the onset of connected structures. Right: Intra- and super-cluster regimes are fitted with our models (see Materials and methods) to quantify the emitter density inside clusters ρem and the connectivity among clusters (via the decay length λi for super-cluster regime i).
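The ε sweep described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that DBSCAN with Nmin=0 treats every localization as a core point, so that clusters reduce to connected components of the graph linking points closer than ε. The function names `count_clusters` and `superstructure_curve` are hypothetical, and the brute-force pair loop is only suitable for small point sets.

```python
import math

def count_clusters(points, eps):
    """Count connected components when points within distance eps are linked.

    Assumption: this stands in for DBSCAN with Nmin=0, where every point
    is a core point and no point is discarded as noise.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    n_clusters = len(points)
    for i in range(len(points)):
        xi, yi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            if math.hypot(xi - xj, yi - yj) <= eps:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj      # merge the two clusters
                    n_clusters -= 1
    return n_clusters

def superstructure_curve(points, eps_values):
    """Number of clusters Nc(eps), normalized by the number of localizations."""
    n = len(points)
    return [count_clusters(points, eps) / n for eps in eps_values]
```

Calling `superstructure_curve(points, eps_values)` on a list of (x, y) coordinates with an increasing list of ε values gives a monotonically non-increasing curve that starts at 1 (every localization is its own cluster) and tends to 1/N (everything merged).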
Figure S1.
The Poissonian functional form in the intra-cluster regime. (A) To test the Poissonian functional form (Eq. 1) of the intra-cluster regime of SuperStructure curves, we simulated localizations inside clusters as Nem points uniformly distributed within a circle of radius Rcl. The resulting average density is ρem. The number of points included in any circular subregion of radius ε is, on average, $n_\varepsilon = \pi \rho_{em} \varepsilon^2$, and is itself Poisson distributed. (B) To check the theoretical prediction of Eq. 1, we created simulated datasets for various ρem and Nem. The theoretical predictions (dotted lines) with m=2 are in good agreement with the SuperStructure curves, indicating that Eq. 1 correctly captures the behavior of uniformly distributed points forming one idealized cluster. However, note that for m=2, there is already an overcounting of clusters at large values of ε because DBSCAN merges indirectly related emitters into a single big cluster. This suggests that the summation should not be extended to higher values of m. From Eq. 1, the end of the intra-cluster regime can be approximated by the width of the Poisson function, i.e., $\varepsilon^* \approx 3\kappa_0$ (at 99% confidence level), where $\kappa_0 = 1/\sqrt{\pi \rho_{em}}$ is the decay length identified by Eq. 1. This is confirmed by observing that the predicted ε* for the curves are ε*(ρem = 2,000 μm−2) ≈ 38 nm, ε*(ρem = 10,000 μm−2) ≈ 18 nm, and ε*(ρem = 100,000 μm−2) ≈ 5.3 nm, which correspond to $N_{cl}/N_{em} \approx 10^{-3}$ (when most of the points have been merged into a single cluster).
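As a quick numerical check of the quantities in this caption, the sketch below evaluates the mean neighbor count and the end of the intra-cluster regime. It assumes the decay length takes the form κ0 = 1/√(πρem), which is the form that reproduces the quoted ε* values to within about 1 nm. The function names and the nm/μm unit handling are illustrative choices, not part of the original work.

```python
import math

def mean_neighbors(rho_em_per_um2, eps_nm):
    """Average number of points in a disc of radius eps: pi * rho * eps^2."""
    eps_um = eps_nm / 1000.0               # convert nm to um
    return math.pi * rho_em_per_um2 * eps_um ** 2

def intra_cluster_end_nm(rho_em_per_um2, width_factor=3.0):
    """Approximate end of the intra-cluster regime, eps* = 3 * kappa0, in nm.

    Assumption: kappa0 = 1 / sqrt(pi * rho_em), with rho_em in um^-2.
    """
    kappa0_um = 1.0 / math.sqrt(math.pi * rho_em_per_um2)
    return width_factor * kappa0_um * 1000.0   # convert um to nm
```

For ρem = 2,000, 10,000 and 100,000 μm−2 this gives roughly 38, 17 and 5.4 nm. By construction, the mean neighbor count at ε = ε* equals 9 whatever the density.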
Figure 2.
Evaluating SuperStructure on simulated datasets. (A) Sketch representing the artificial dataset consisting of interconnected clusters of localizations on a 2D plane. Clusters are characterized by an internal density of localizations ρem and radius Rcl and are randomly distributed on the plane at an average cluster density ρcl. Clusters can be connected by a sparse point distribution with probability pr, and connections have a density of points ρconn (controlled by the prconn parameter). (B) Average SuperStructure curves (zoomed in the inset) for simulated datasets with different connectivity pr. Other parameters are kept fixed: average cluster radius Rcl ≈ 40 nm, emitter density within clusters ρem = 16,000 μm−2, cluster density ρcl = 8.2 μm−2, and prconn = 0.5 (which fixes the density of emitters within connections ρconn). The curves show the number of detected clusters normalized by the total number of localizations. Curves are the average of 20 independent simulated datasets. Shaded regions represent the standard deviation from the average. Three regimes can be distinguished: (1) the intra-cluster (red), (2) the first super-cluster (yellow), and (3) the second super-cluster (blue). The decay in the intra-cluster regime corresponds to a Poisson avoidance function with density parameter ρem = 16,000 μm−2 (Eq. 1, dotted line in the inset). The first super-cluster regime can be fitted by a single exponential (Eq. 4, dashed line in the inset), which returns an effective decay length λ. The second super-cluster regime can be fitted with another exponential if pr ≠ 0 (Eq. 4, dashed line in the main figure). In the case of pr = 0, there is only one super-cluster regime, and it follows a Poisson function with density parameter ρcl = 8.2 μm−2 (Eq. 3, dotted line in the main figure). (C) Snapshots of detected clusters for an artificial dataset with connectivity pr = 0.004, obtained by progressively increasing the radius: ε = 4, 24, 44, 84 nm. (D) Decay length λ versus cluster density ρcl scales as $\lambda \sim \rho_{cl}^{-0.5}$ for any value of connectivity pr. (E) Decay length λ versus connectivity pr scales as $\lambda \sim p_r^{-0.3}$ for different values of ρcl. In D and E, 20 independent datasets were fitted with Eq. 4, and the resulting λ values were averaged. Vertical bars represent the standard deviation from the average.
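Extracting a decay length from a super-cluster regime can be sketched with a log-linear least-squares fit. This is an illustration under a stated assumption: Eq. 4 is not reproduced on this page, so the single exponential is taken here to be A·exp(−ε/λ). The function name `fit_decay_length` is hypothetical.

```python
import math

def fit_decay_length(eps_values, curve_values):
    """Fit curve ~ A * exp(-eps / lam) by a straight line through (eps, ln curve).

    Assumption: a plain single exponential; the exact form of Eq. 4 is not
    given here. Returns (lam, A). All curve values must be positive.
    """
    logs = [math.log(v) for v in curve_values]
    n = len(eps_values)
    mean_x = sum(eps_values) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in eps_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(eps_values, logs))
    slope = sxy / sxx                      # equals -1 / lam
    intercept = mean_y - slope * mean_x    # equals ln A
    return -1.0 / slope, math.exp(intercept)
```

In practice the fit would be restricted to the ε window of the regime of interest and repeated on each independent dataset before averaging the resulting λ values, as the caption describes for panels D and E.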
Figure S2.
Average SuperStructure curves for simulated datasets in different conditions. SuperStructure analysis was run on 20 independent datasets (each in the same condition), and the resulting curves were then averaged. Shaded regions represent the standard deviation from the average. Parameters are set to their standard values if not otherwise specified (see Materials and methods). Palettes in the inset configurations represent cluster analysis at ε = 80 nm. (A) Locally connected clusters with different grades of connectivity and doubling the cluster density (from left to right): ρcl = 8.2 μm−2 (left) and ρcl = 16.3 μm−2 (right), connection density prconn = 0.5, no noise, and different values of connectivity pr. The higher cluster density makes SuperStructure curves more markedly distinct as a function of pr compared with the same curves for a lower density. (B) Locally connected clusters with low connectivity and increasing cluster density: connectivity pr = 0.002, connection density prconn = 0.5, no noise, and different cluster densities ρcl. The first super-cluster regime maintains the single exponential decay, but the decay length λ decreases with the cluster density. In the main text, we showed that this dependence goes as $\lambda \sim \rho_{cl}^{-1/2}$. Also, the exponential decay length λ2 of the second super-cluster regime decreases with the density of clusters, and this regime evolves from a Poisson-like (low ρcl) to an exponential decay (high ρcl). This behavior seems to be a pure effect of the cluster density, as all other parameters remain unchanged. Black curves are attempted Poisson fits, $e^{-\pi \rho \varepsilon^2}$, to the second super-cluster regime. (C) Locally connected clusters with different grades of connectivity and sparse noise addition: cluster density ρcl = 8.2 μm−2, connection density prconn = 0.5, noise density ρn = 0 μm−2 (left)/ρn = 64 μm−2 (right), and different values of connectivity pr. With high noise (eight times the cluster density), the second super-cluster regime becomes Poissonian; the first super-cluster regime maintains its typical exponential decay, but the decay length is altered. Dotted lines represent fits with Eq. 3 for ε ∈ [70:300] nm. (D) Unconnected clusters of points with increasing density of noise (other parameters are the same as in C). Eq. 3 describes the decay of the curves in the intercluster regime well, with density parameter ρcl in the absence of noise and ρcl + ρn in its presence. (E) Average decay length of the first super-cluster regime for the connected systems represented in C as a function of noise density ρn. The fit to calculate the decay length λ was made for ε ∈ [20:60] nm for 20 independent datasets. Values of λ were then averaged. Bars represent the standard deviation from the average. Decay lengths for systems with different connectivities pr are distinguishable as long as the noise density is below the connection density (~500 μm−2). However, low noise density also alters the estimation of the decay length. The alteration is less severe for highly connected clusters. (F) Fully connected meshes of clusters with increasing density of the mesh: cluster density ρcl = 8.2 μm−2, connectivity p = 0.025, no noise, and different values of connection density prconn. The super-cluster regime is unique, the decay is exponential, and the decay length λ decreases with the density of the mesh. The fit of λ was performed for ε ∈ [20:60] nm. The inset shows the dependence of λ on prconn in a fully connected mesh, which is $\lambda \sim \mathit{prconn}^{-0.74}$.
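The Poisson-like decay used for unconnected clusters can be evaluated with a few lines. This sketch uses only the functional form e^(−πρε²) mentioned in the caption; any prefactor or normalization that Eq. 3 may contain is not shown on this page and is assumed to be 1. The function name is hypothetical.

```python
import math

def poisson_decay(eps_nm, rho_per_um2):
    """Evaluate exp(-pi * rho * eps^2) with eps in nm and rho in um^-2.

    Assumption: unit prefactor; the full form of Eq. 3 is not reproduced here.
    """
    eps_um = eps_nm / 1000.0               # convert nm to um
    return math.exp(-math.pi * rho_per_um2 * eps_um ** 2)

# Densities quoted in the caption (um^-2): clusters alone, and clusters plus noise.
RHO_CL = 8.2
RHO_N = 64.0

no_noise = [poisson_decay(eps, RHO_CL) for eps in (70, 150, 300)]
with_noise = [poisson_decay(eps, RHO_CL + RHO_N) for eps in (70, 150, 300)]
```

Replacing ρcl by ρcl + ρn, as described for panel D, makes the curve fall off at smaller ε, because sparse noise points bridge the gaps between clusters.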
Figure 3.
Application of SuperStructure algorithm to SAF-A, hnRNP-C, and SC35 SMLM data. (A) Reconstructed dSTORM images by using the shifted histograms method with a pixel size of 10.6 nm. Insets, 4 μm² in size, show reconstructed dSTORM images and spatial positions of the data. Palettes represent the cluster ID computed by running SuperStructure with Nmin = 0 and ε at the start of the first super-cluster regime. (B) Identified clusters for increasing values of ε in the regimes where clusters merge. (C) Normalized average SuperStructure curves in the range [0:150] nm. The number of detected clusters has been normalized with the total number of localizations in the system. The average is calculated over six independent datasets (nuclei). Solid curves indicate that SuperStructure analysis was run on the entire nucleus, and the resulting curves for the six independent datasets were averaged (all-nucleus curves). Dashed curves indicate that SuperStructure analysis was run in five local ROIs for each of the six nuclei, and then the curves of each region (for each nucleus) were averaged (local curves). In hnRNP-C, these local regions were chosen within the nuclear mesh (to exclude nucleoli), and in SC35, they were chosen within speckles. Vertical dashed lines highlight different SuperStructure regimes: intra-cluster, first super-cluster, and second super-cluster regime. For SAF-A and hnRNP-C, the exponential regime of clusters merging (first super-cluster regime) is highlighted with a solid straight line. In the case of SC35, two regimes are highlighted: the merging of clusters within speckles (first super-cluster regime) and the merging of speckles with isolated clusters (second super-cluster regime). (D) Normalized all-nucleus average SuperStructure curves in the range [0:200] nm for the three proteins. The average is computed over six nuclei. Shaded regions represent the standard deviation from the average. Poisson fits (Eq. 1) for the intra-cluster regime at small ε are shown in the inset. (E) Intra-cluster density of emitters ρem as the parameter of the Poisson fit for six independent nuclei (Eq. 1). (F) Normalized decay length λ* for the super-cluster regimes highlighted in C for six independent nuclei. SuperStructure curves were fitted with Eq. 4 to extract the decay length λ, and then the normalization $\lambda^* = \lambda/\rho_{cl}^{-1/2}$ was performed (where ρcl is the detected cluster density at the beginning of each regime of interest). Details are explained in Materials and methods and Fig. S3. P values were calculated using a Student’s t test: ns, P > 0.05; **, P < 0.01; ***, P < 0.001.
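The normalization in panel F can be sketched as below. The sketch reads the normalization as λ* = λ/ρcl^(−1/2), that is λ·√ρcl, which makes λ* dimensionless and removes the ρcl^(−1/2) dependence reported for simulated data. The function name and the numbers in the usage note are hypothetical and are not taken from the paper's data.

```python
import math

def normalized_decay_length(lam_nm, rho_cl_per_um2):
    """Return lam* = lam * sqrt(rho_cl), a dimensionless decay length.

    Assumption: lam is given in nm and rho_cl in um^-2, so lam is converted
    to um before multiplying.
    """
    lam_um = lam_nm / 1000.0
    return lam_um * math.sqrt(rho_cl_per_um2)
```

For example, a hypothetical λ of 30 nm at ρcl = 25 μm−2 gives λ* = 0.15. If λ follows k/√ρcl exactly, λ* equals k for every density, which is the behavior used to compare proteins with different cluster densities.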
Figure S3.
Details on λ normalization and proof that connections are not technical artifacts in nuclear protein data. (A) dSTORM reconstructed images of SAF-A, hnRNP-C, and SC35 in a single cell, with the local circular regions used for cluster density estimation highlighted. In the case of SC35, two different region types are used, one inside speckles for the first exponential regime and one outside speckles for the second exponential regime. In the case of hnRNP-C and SC35, local circular regions were also used to compute SuperStructure local curves and the decay length λ in the first super-cluster regime, as explained in Materials and methods. (B) Average SuperStructure curves for SAF-A, hnRNP-C, and SC35 are shown and explained in the main text. Solid lines are the result of all-nucleus analysis, while dashed lines are the result of a local analysis (in local circular regions). Exponential regimes of interest are highlighted, as well as the values of ε at which the cluster analysis is made for cluster density estimation purposes (purple dashed vertical line). (C) Check that connections are not the result of technical artifacts due to bad blinking quality in both SAF-A and hnRNP-C data by monitoring λ (left) and λ* (right) for different cluster densities ρcl. Bad blinking quality of fluorophores would lead to localization inaccuracy of emitters at the borders of protein clusters, and this in turn could lead to pseudo-connections between clusters. However, these pseudo-connections would be proportional to the cluster density; a higher cluster density would result in stronger pseudo-connections, which would be reflected in a decrease of λ* with the cluster density. λ, ρcl, and λ* were calculated for the six independent nuclei, as explained in Materials and methods, and are shown in Table S1. Every nucleus can be considered as a system where the blinking conditions are the same, but cluster densities may vary due to statistical fluctuations. While λ (left) decreases with ρcl, as expected, λ* (right) is constant for different densities, ruling out the hypothesis that connections are artifacts due to bad blinking quality.
Figure 4.
Application of SuperStructure algorithm to ceramide data. Analysis was performed on published data (Burgert et al., 2017). (A) dSTORM reconstruction of ceramides dataset using the shifted histogram method. The left panel represents signal from cells treated with bSMase; the right panel is a control without treatment. (B) SuperStructure curves of the two conditions for the entire dataset. Curves show the number of detected clusters normalized by the total number of localizations. The red region highlights the intra-cluster regime, while the blue region highlights the Poissonian unconnected super-cluster regime. The shaded purple region highlights the horizontal shift between the two curves. Dashed lines represent Poisson fits at low and high ε. (C–E) Average density of total localizations (C), intra-cluster density extracted as parameter from Poisson fit (Eq. 1; D), and overall density in the super-cluster regime extracted as parameter from Poisson fit (Eq. 3; E) for +bSMase and −bSMase treatment datasets. Calculations and fits were performed on data and SuperStructure curves from 16 independent circular regions of radius r=1.5 μm within the original dataset. P values were calculated using a Student’s t test: **, P < 0.01; ***, P < 0.001.
Figure S4.
Absence of local connectivity and confirmation of original paper results in ceramide data. (A and B) The absence of local connectivity was confirmed by analyzing cluster density (A) and sparse localization density (B) in the crossover range. We monitored the density of ceramide clusters and that of free emitters at ε1 = 20 nm and ε2 = 36 nm. To calculate cluster density, DBSCAN was run at Nmin = 0 and at the given value of ε, and we kept only clusters with at least 10 particles. The remaining particles were considered as free localizations. Clusters and free localizations were detected at Nmin = 0 for 16 independent circular regions. The number of clusters remains constant in the considered ε regime, while the free localization density significantly decreases (more severely for −bSMase cells). As a consequence, we can state that there is no significant merging of ceramide clusters, only embedding of nearby free localizations in already-formed clusters. (C and D) Confirmation of the original paper’s results by calculating the ceramide cluster size both as gyration radius (C) and number of emitters (D). Protein clusters were detected at Nmin = 0 at ε+ = 20 nm and ε− = 24 nm. In accordance with the analysis in the paper, we looked at the size of clusters with a radius >30 nm. Note that +bSMase ceramide clusters consist of (on average) 180 emitters in a circle of radius 42 nm. The resulting density is 32,500 μm−2. This result is approximately in line with our prediction obtained with the Poisson intra-cluster fit, considering that the standard deviation of both cluster radius and number of emitters is high. Similarly, −bSMase clusters have on average 78 emitters in an average cluster radius of 40 nm. The resulting density is 15,500 μm−2.
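The two densities quoted at the end of this caption follow from dividing the mean number of emitters by the area of a disc with the mean cluster radius. The short check below reproduces that arithmetic; the function name is a hypothetical helper.

```python
import math

def disc_density_per_um2(n_emitters, radius_nm):
    """Emitters per um^2 for n_emitters spread over a disc of the given radius."""
    radius_um = radius_nm / 1000.0         # convert nm to um
    return n_emitters / (math.pi * radius_um ** 2)

plus_bsmase = disc_density_per_um2(180, 42)    # +bSMase clusters
minus_bsmase = disc_density_per_um2(78, 40)    # -bSMase clusters
```

Rounded to the nearest 500 μm−2, the two values are 32,500 and 15,500 μm−2, matching the caption.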
Figure S5.
Size and shape estimation of local super-structures emerging in SC35 dSTORM data (i.e., nuclear speckles) by using both SuperStructure and SR-Tesseler. Analysis was performed on a single cell as proof of concept. (A) Super-structure detection using SR-Tesseler software, a segmentation framework based on Voronoï tessellation (constructed from the localization coordinates). Adjusting the density factor allows the detection of structures at different density levels, such as clusters (violet) or speckles (yellow). Blue dots represent non-segmented localizations. The software was downloaded from https://github.com/flevet/SR-Tesseler/releases/tag/v1.0 and run on a Windows operating system. (B) SuperStructure curve of the same data. Analysis of decay regimes allows the identification of ε = 40 nm as a suitable value for super-structure identification. (C) Identified clusters at ε = 40 nm with SuperStructure. Speckle detections are visually compatible with those of SR-Tesseler. (D and E) Radius and circularity of super-structures using both SR-Tesseler and SuperStructure. Both radius and circularity are very similar, showing the power of SuperStructure in computing shape and size properties. In the analysis, we considered the 20 largest identified structures (i.e., speckles). For SuperStructure, the 2D symmetric gyration tensor $R^2$ was computed and diagonalized for identified super-structures. The gyration tensor components $R^2_{xy}$ are defined as $R^2_{xy} = \frac{1}{2N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} (x_i - x_j)(y_i - y_j)$, where N is the total number of localizations in a super-structure, and $x_i$ and $y_i$ are the x and y positions of localization i. The diagonalization is necessary to obtain the squares of the major and minor semi-axes of the speckles, namely $\gamma_1$ and $\gamma_2$. We then calculated the speckle radius $R_g = \sqrt{\gamma_1 + \gamma_2}$ and the circularity $c = |\gamma_1 - \gamma_2| / (\gamma_1 + \gamma_2)$. For SR-Tesseler, radius and circularity parameters were obtained as output after Voronoï tessellation. P values were calculated using a Student’s t test: ns, P > 0.05; *, P < 0.05.
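The gyration-tensor computation described in this caption can be sketched as follows. The sketch assumes that the radius is Rg = √(γ1 + γ2) and the circularity is c = |γ1 − γ2|/(γ1 + γ2), with γ1 and γ2 the eigenvalues of the 2D tensor. These forms are a reading of the caption's formulas, and the function names are hypothetical.

```python
import math

def gyration_tensor(points):
    """Return (Rxx, Rxy, Ryy) from the pairwise double sum over localizations.

    Each component is (1 / (2 N^2)) * sum_i sum_j (a_i - a_j) * (b_i - b_j).
    """
    n = len(points)
    rxx = rxy = ryy = 0.0
    for xi, yi in points:
        for xj, yj in points:
            dx, dy = xi - xj, yi - yj
            rxx += dx * dx
            rxy += dx * dy
            ryy += dy * dy
    norm = 2.0 * n * n
    return rxx / norm, rxy / norm, ryy / norm

def radius_and_circularity(points):
    """Diagonalize the symmetric 2x2 tensor and derive size and shape.

    Assumes at least two distinct points, so that gamma1 + gamma2 > 0.
    """
    rxx, rxy, ryy = gyration_tensor(points)
    half_trace = 0.5 * (rxx + ryy)
    root = math.sqrt((0.5 * (rxx - ryy)) ** 2 + rxy ** 2)
    gamma1, gamma2 = half_trace + root, half_trace - root   # eigenvalues
    radius = math.sqrt(gamma1 + gamma2)
    circularity = abs(gamma1 - gamma2) / (gamma1 + gamma2)
    return radius, circularity
```

With this definition, an isotropic point set gives c close to 0 and a point set stretched along a line gives c close to 1. The double sum costs O(N²); the same tensor can be obtained in O(N) from the covariance of the coordinates about their mean.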
