Proc Natl Acad Sci U S A. 2025 May 20;122(20):e2409426122.
doi: 10.1073/pnas.2409426122. Epub 2025 May 12.

Identifying intermolecular interactions in single-molecule localization microscopy

Xingchi Yan et al. Proc Natl Acad Sci U S A.

Abstract

Intermolecular interactions underlie all cellular functions, yet visualizing these interactions at the single-molecule level remains challenging. Single-molecule localization microscopy (SMLM) offers a potential solution. Given a nanoscale map of two putative interaction partners, it should be possible to assign molecules either to the class of coupled pairs or to the class of noncoupled bystanders. Here, we developed a probabilistic algorithm that allows accurate determination of both the absolute number and the proportion of molecules that form coupled pairs. The algorithm calculates interaction probabilities for all possible pairs of localized molecules, selects the most likely interaction set, and corrects for any spurious colocalizations. Benchmarking this approach across a set of simulated molecular localization maps with varying densities (up to ∼55 molecules $\mu\mathrm{m}^{-2}$) and localization precisions (1 to 50 nm) showed typical errors in the identification of correct pairs of only a few percent. At molecular densities of ∼5 to 10 molecules $\mu\mathrm{m}^{-2}$ and localization precisions of 20 to 30 nm, which are typical parameters for SMLM imaging, the recall was ∼90%. The algorithm was effective at differentiating between noninteracting and coupled molecules both in simulations and experiments. Finally, it correctly inferred the number of coupled pairs over time in a simulated reaction-diffusion system, enabling determination of the underlying rate constants. The proposed approach promises to enable direct visualization and quantification of intermolecular interactions using SMLM.

Keywords: biomolecular interactions; inverse problem; probabilistic model; single-molecule.

Conflict of interest statement

Competing interests statement: The authors declare no competing interest.

Figures

Fig. 1.
Defining and estimating the proximity probability $P_{prox}$. (A) Illustration of three conformational states of a complex between two fluorescently labeled membrane proteins, A and B, with a variable distance $d_{true}$ between the dyes, where the random variable $d_{true}$ depends on the conformational state of the proteins and the flexibility of the dye linkers. Created using Biorender. EC and IC indicate extracellular and intracellular regions, respectively. (B) The observed distance $d_{obs}$ is a random variable. Left: The magenta rings represent the Gaussian distribution centered at the true position $x_{A,true}$ of the fluorophore A with localization precision $\sigma_A$. Similarly, the green rings represent the Gaussian distribution at the location of the fluorophore B. The black line denotes the true distance $d_{true}$. Right: The observed positions $x_{A,obs}$ and $x_{B,obs}$ are random variables drawn from these distributions. Magenta and green halos illustrate signals obtained from the two fluorophores. The black line denotes the observed distance $d_{obs}$. (C) Illustration of how the proximity probability $P_{prox}$ is estimated by sampling points from a Gaussian distribution and counting the fraction of points that land in an annulus. The Gaussian $\mathcal{N}(x_{A,obs} - x_{B,obs},\ \sigma_A^2 + \sigma_B^2)$, represented by the yellow rings, depends on the observed distance (black line) and localization precisions. The annulus, represented by the gray region, depends on a structural model of the complex AB and its fluorophores, where the lower bound $d_{lower}$ and upper bound $d_{upper}$ are constraints on $d_{obs}$ consistent with colocalization. (D) $P_{prox}$ can be estimated for arbitrary $d_{obs}$, $\sigma_A$, $\sigma_B$, $d_{lower}$, and $d_{upper}$. For $d_{lower} = 0$, $P_{prox}$ approaches 1 as $d_{obs}$, $\sigma_A$, and $\sigma_B \to 0$.
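The sampling procedure described in panel (C) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `estimate_p_prox`, the sample count, and the seed are hypothetical, and the sketch assumes a two-dimensional isotropic Gaussian with the observed separation vector placed along the x axis.

```python
import math
import random


def estimate_p_prox(d_obs, sigma_a, sigma_b, d_lower, d_upper,
                    n_samples=20000, seed=0):
    """Monte Carlo estimate of the proximity probability.

    Samples 2D points from a Gaussian centered on the observed separation
    vector (assumed to lie along the x axis) with variance
    sigma_a**2 + sigma_b**2 per axis, and returns the fraction whose
    distance from the origin falls inside the annulus [d_lower, d_upper].
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma_a ** 2 + sigma_b ** 2)
    hits = 0
    for _ in range(n_samples):
        dx = rng.gauss(d_obs, sigma)  # component along the observed separation
        dy = rng.gauss(0.0, sigma)    # component perpendicular to it
        if d_lower <= math.hypot(dx, dy) <= d_upper:
            hits += 1
    return hits / n_samples


# Example call with illustrative values in nm (not taken from the paper).
p = estimate_p_prox(d_obs=15.0, sigma_a=20.0, sigma_b=20.0,
                    d_lower=0.0, d_upper=30.0)
```

With `d_lower = 0`, the estimate tends toward 1 as the observed distance and both localization precisions shrink, in line with panel (D).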
Fig. 2.
Graph Matching Optimization (GMO) selects the most probable configuration of molecular pairing. (A) A simulated SMLM image of proteins A (magenta) and B (green). Size of marker correlates with localization precision. (B) A connected component of the bipartite graph constructed for the dataset in (A). A node represents a localization of either A or B. An edge (dashed lines) connects $A_i$ and $B_j$ if their proximity probability $p_{ij}$ is positive and their normalized observed distance is less than a data-driven threshold. (C) GMO selects a maximal weight matching (indicated by blue lines), a subgraph that maximizes the sum of $p_{ij}$, and where each node is selected at most once. The matching represents a possible configuration of molecular pairing. (D) Matchings selected in four example scenarios. (I): $(A_1,B_1)$ is chosen over $(A_1,B_2)$ if $p_{11} > p_{12}$; (II): $(A_1,B_1)$ and $(A_2,B_2)$ are pairs if $p_{11} + p_{22} > p_{12} + p_{21}$; (III): $(A_2,B_1)$ is a pair if $p_{21} > p_{11} + p_{22}$; and (IV): $(A_1,B_1)$ and $(A_2,B_2)$ are pairs if $p_{11} + p_{22} > p_{21}$.
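The selection rule in panels (C) and (D) can be illustrated on a small connected component. The sketch below uses exhaustive search rather than whatever solver the authors used, so it is only practical for a handful of nodes; the function name `max_weight_matching` and the example probabilities are hypothetical.

```python
def max_weight_matching(edges):
    """Brute-force maximum-weight matching on a small bipartite graph.

    edges maps (a_node, b_node) -> proximity probability p_ij > 0.
    Returns (total_weight, pairs), with each node used at most once.
    """
    a_nodes = sorted({a for a, _ in edges})

    def search(i, used_b):
        if i == len(a_nodes):
            return 0.0, []
        a = a_nodes[i]
        # Option 1: leave this A localization unpaired.
        best = search(i + 1, used_b)
        # Option 2: pair it with any still-free B neighbor.
        for (a2, b), p in edges.items():
            if a2 == a and b not in used_b:
                weight, pairs = search(i + 1, used_b | {b})
                if p + weight > best[0]:
                    best = (p + weight, [(a, b)] + pairs)
        return best

    return search(0, frozenset())


# Scenario (III) of panel (D): p21 > p11 + p22, so only (A2, B1) is paired.
scenario_iii = {("A1", "B1"): 0.3, ("A2", "B1"): 0.9, ("A2", "B2"): 0.4}
weight_iii, pairs_iii = max_weight_matching(scenario_iii)

# Scenario (II): p11 + p22 > p12 + p21, so (A1, B1) and (A2, B2) are paired.
scenario_ii = {("A1", "B1"): 0.8, ("A1", "B2"): 0.5,
               ("A2", "B1"): 0.4, ("A2", "B2"): 0.7}
weight_ii, pairs_ii = max_weight_matching(scenario_ii)
```

For realistic dataset sizes, a polynomial-time matching routine from a graph library would replace the exhaustive search; the objective stays the same.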
Fig. 3.
Iterative Monte Carlo Estimation of Molecular Couplings and Background Pairings (iMEC) estimates the number of background pairs among a configuration of molecular pairing. (A) Process overview of iMEC: Starting with the most probable configuration with $N_{pairs}$ pairings from GMO, distribute the localizations in an experimentally imaged region uniformly at random to estimate the number of background pairs $N_{bg}^{i}$. For the next iteration, repeat the process with $2 \times N_{bg}^{i}$ fewer localizations, which is equivalent to deleting $N_{bg}^{i}$ pairs from the dataset. The number of couplings $N_{coupled}$ is estimated successively by $N_{coupled}^{i} = N_{pairs} - N_{bg}^{i}$. (B) Two example progressions shown on a plot of $N_{coupled}^{i}$ versus $N_{bg}^{i}$. Of the 2,000 localizations from A and 2,000 localizations from B, the number of true molecular couplings was $N_{coupled} = 1{,}545$ (blue) and 900 (pink). The lines of slope −1 denote the conserved sum $N_{pairs} = N_{coupled}^{i} + N_{bg}^{i}$. Arrows show iterative progression to convergence. After 15 iterations, the estimates were $N_{coupled}^{15} = 1{,}545$ (blue) and 903 (pink). (C) $N_{bg}^{i}$ and $N_{coupled}^{i}$ for the examples shown in (B), averaged over 10 Monte Carlo trials.
Fig. 4.
Algorithm performance on simulated datasets. Recall rate (GMO), precision (GMO), and error rate (GMO+iMEC) as a function of (A) localization precision and (B) molecular density. At fixed density, precise localizations lead to better performance. At fixed localization precision, recall and precision are better at lower densities, while error remains mostly unchanged. Error bars indicate ±1 SD across 20 trials.
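The caption does not define its metrics, so the following sketch assumes the standard definitions: recall is the fraction of true pairs that were identified, and precision is the fraction of identified pairs that are true. The function name and example data are hypothetical, and the GMO+iMEC error rate is not covered because its definition is not given here.

```python
def recall_and_precision(true_pairs, found_pairs):
    """Compare identified (A, B) pairs against ground-truth pairs.

    Assumes standard definitions:
      recall    = correctly identified pairs / all true pairs
      precision = correctly identified pairs / all identified pairs
    Returns 0.0 for a metric whose denominator would be zero.
    """
    true_set = set(true_pairs)
    found_set = set(found_pairs)
    correct = len(true_set & found_set)
    recall = correct / len(true_set) if true_set else 0.0
    precision = correct / len(found_set) if found_set else 0.0
    return recall, precision


# Toy example: 4 true pairs, 3 identified pairs, 2 of them correct.
truth = [(0, 0), (1, 1), (2, 2), (3, 3)]
found = [(0, 0), (1, 1), (2, 5)]
recall, precision = recall_and_precision(truth, found)
```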
Fig. 5.
Validation of the algorithm on simulated and experimental SMLM data. (A) Schematics of Halo-TM-SNAP (positive control) and Halo-β2AR:SNAP-CaaX (negative control)—two test systems of membrane proteins. Halo-TM-SNAP was expected to show significantly more colocalization events than Halo-β2AR:SNAP-CaaX. (B) Confocal images of Halo-TM-SNAP (Left) and Halo-β2AR:SNAP-CaaX (Right) expressed in HEK 293FT cells demonstrating localization of proteins to the plasma membrane. Confocal imaging does not provide information about individual molecular localization, and confocal data were not used in colocalization analysis. (C) Sample regions of experimental SMLM images for Halo-TM-SNAP (Left) and Halo-β2AR:SNAP-CaaX (Right). Paired molecules are highlighted with yellow rectangles. Localizations are shown with PA-JF549-Halo in green and PA-JF646-SNAP in magenta. (D) Percentage of identified colocalizations for Halo-TM-SNAP and Halo-β2AR:SNAP-CaaX. Each dot represents data from a cell; bars represent averages over all cells. Error bars indicate ±1 SD. Dashed horizontal lines denote the means in each scenario. (E) Schematic of imaging planes in confocal imaging and SMLM. As expected, proteins were observed at the plasma membrane. Created using Biorender.
Fig. 6.
Validation of the algorithm on simulated nonequilibrium binding dynamics. (A) Schematic of a hypothetical biological system: Following ligand stimulation at t = 0, membrane protein A (blue) becomes active and can bind to membrane protein B (red). EC and IC indicate extracellular and intracellular regions, respectively. Created using Biorender. (B) Schematic of a hypothetical experimental protocol: Ensembles of cells are stimulated with ligand at t = 0 and are fixed at times $t_1$, $t_2$, and $t_3$ for SMLM imaging. (C) Snapshots from a simulation modeling the binding of proteins A and B to form the complex AB. The AB complexes shown indicate true binding. Snapshots were taken at t = 0, 0.25 s, and 3 s after ligand addition. (D) Top: Density of AB identified by the algorithm at various time points (each time point consists of 10 simulated datasets). The variations in the shaded regions originate from stochastic simulations, while the variations in the error bars are attributed to the inference algorithm. Bottom: Error rate at various time points. Error bars indicate ±1 SD.
