Nat Commun. 2024 Feb 26;15(1):1763.

doi: 10.1038/s41467-024-46106-0.

SEMORE: SEgmentation and MORphological fingErprinting by machine learning automates super-resolution data analysis

Steen W B Bender¹ ² ³, Marcus W Dreisler¹ ² ³, Min Zhang¹ ² ³, Jacob Kæstel-Hansen⁴ ⁵ ⁶, Nikos S Hatzakis⁷ ⁸ ⁹ ¹⁰

Affiliations

¹ Department of Chemistry, University of Copenhagen, Copenhagen, Denmark.
² Center for 4D cellular dynamics, University of Copenhagen, Copenhagen, Denmark.
³ Novo Nordisk Center for Optimised Oligo Escape and Control of Disease, University of Copenhagen, Copenhagen, Denmark.
⁴ Department of Chemistry, University of Copenhagen, Copenhagen, Denmark. jkh@chem.ku.dk.
⁵ Center for 4D cellular dynamics, University of Copenhagen, Copenhagen, Denmark. jkh@chem.ku.dk.
⁶ Novo Nordisk Center for Optimised Oligo Escape and Control of Disease, University of Copenhagen, Copenhagen, Denmark. jkh@chem.ku.dk.
⁷ Department of Chemistry, University of Copenhagen, Copenhagen, Denmark. hatzakis@chem.ku.dk.
⁸ Center for 4D cellular dynamics, University of Copenhagen, Copenhagen, Denmark. hatzakis@chem.ku.dk.
⁹ Novo Nordisk Center for Optimised Oligo Escape and Control of Disease, University of Copenhagen, Copenhagen, Denmark. hatzakis@chem.ku.dk.
¹⁰ Novo Nordisk Center for Protein Research, University of Copenhagen, Copenhagen, Denmark. hatzakis@chem.ku.dk.

PMID: 38409214
PMCID: PMC10897458
DOI: 10.1038/s41467-024-46106-0

Steen W B Bender et al. Nat Commun. 2024.

Abstract

The morphology of protein assemblies impacts their behaviour and contributes to beneficial and aberrant cellular responses. While single-molecule localization microscopy provides the required spatial resolution to investigate these assemblies, the lack of universal robust analytical tools to extract and quantify underlying structures limits this powerful technique. Here we present SEMORE, a semi-automatic machine learning framework for universal, system- and input-dependent, analysis of super-resolution data. SEMORE implements a multi-layered density-based clustering module to dissect biological assemblies and a morphology fingerprinting module for quantification by multiple geometric and kinetics-based descriptors. We demonstrate SEMORE on simulations and diverse raw super-resolution data: time-resolved insulin aggregates, and published data of dSTORM imaging of nuclear pore complexes, fibroblast growth receptor 1, sptPALM of Syntaxin 1a and dynamic live-cell PALM of ryanodine receptors. SEMORE extracts and quantifies all protein assemblies, their temporal morphology evolution and provides quantitative insights, e.g. classification of heterogeneous insulin aggregation pathways and NPC geometry in minutes. SEMORE is a general analysis platform for super-resolution data, and being a time-aware framework can also support the rise of 4D super-resolution data.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Fig. 1. Schematic illustration of SEMORE, an automated pipeline to agnostically cluster and classify temporally and morphologically distinct protein aggregates.**
a The input to SEMORE is a set of x, y coordinates from SMLM (PALM or STORM) or x, y, t coordinates from time-resolved SMLM (TR-SMLM, REPLOM input) of individual localization/aggregation events, here shown for temporally resolved insulin aggregation imaged using the REPLOM approach on a TIRF microscope. b The first step of SEMORE clusters data by a density-based clustering method in three dimensions: the spatial coordinates, x and y, and time, t. Colours indicate clusters and the scale bar shows 10 μm. c The second step is a temporal refinement of the initial clusters to identify and dissect underlying sub-clusters, using time-directional clustering that iterates through the frames. d The final output of the temporal refinement is a set of individual spatially resolved structures that are now separable even if grown close to other aggregates. e Each identified cluster is fed to a morphology-fingerprinting module that computes four groups of descriptive features, including circularity of the morphology, graph network within the aggregate, general symmetry, and geometric interior. Combined, these feature groups form the individual self-assembly fingerprint of 40+ features in total. f The calculated morphology fingerprints are stored for each extracted protein assembly, allowing for complete quantification and insights into the distribution of heterogeneous morphology or growth pathways. Source data are provided as a Source Data file.
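
The density-based clustering over x, y and t described in panel b can be illustrated with a generic DBSCAN-style routine. This is only a minimal sketch under stated assumptions, not SEMORE's implementation: the caption does not name the algorithm used in this step, and the function name `dbscan_xyt` and the parameters `eps`, `min_pts` and `time_scale` are hypothetical choices made for the example.

```python
from math import dist

def dbscan_xyt(points, eps, min_pts, time_scale=1.0):
    """Label (x, y, t) points with a cluster id, or -1 for noise."""
    # Assumption: time is rescaled so that a single radius eps can be
    # applied to all three axes.
    scaled = [(x, y, t * time_scale) for x, y, t in points]
    n = len(scaled)
    # Brute-force neighbourhoods; every point counts as its own neighbour.
    neighbours = [[j for j in range(n) if dist(scaled[i], scaled[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point: keep growing
    return labels

# Two small synthetic aggregates and one isolated localization (made-up data).
points = [(0, 0, 0), (0.1, 0, 1), (0, 0.1, 2), (0.1, 0.1, 3),
          (5, 5, 0), (5.1, 5, 1), (5, 5.1, 2), (5.1, 5.1, 3),
          (10, 10, 50)]
labels = dbscan_xyt(points, eps=0.5, min_pts=3, time_scale=0.1)
print(labels)
```

The brute-force neighbour search is quadratic in the number of localizations, which is acceptable for a sketch; real datasets would normally use a spatial index.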

**Fig. 2. Performance evaluation of the SEMORE clustering module on classification of three diverse types of morphologies inspired by biological systems.**
a Three classes of time-resolved aggregations were simulated to capture a broad range of biological systems (see Methods): isotropic, where aggregates grow radially; random, where aggregates grow in response to steric hindrance; and branching fibrils, where aggregates grow linearly followed by branching. The three inserts depict the general pipeline for cluster identification. From left to right: aggregates with diverse final morphologies are produced frame by frame, with the number and locations of particles randomly drawn based on previous localizations, and with start and end times randomly drawn. Uniform noise is added in all three dimensions (x, y, time). The model accurately predicts diverse aggregates, showcased by different colours. Black points correspond to data points predicted with the wrong label, i.e., either noise predicted as an aggregate point or multiple predicted aggregates for the same ground truth label (FP), while brown points correspond to aggregate locations predicted as noise (FN). b Quantification of operational performance by a confusion matrix. Predictions are shown from 50 experiments for each aggregation type, each containing 10 individual aggregations for isotropic and random, and 25 for fibril growth. Errors are standard deviations calculated across accuracies for each individual aggregate. Common classification metrics for the evaluation are shown in the table on the right side of the corresponding confusion matrix. Source data are provided as a Source Data file.
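
The FP and FN counts defined in panel a feed directly into the standard classification metrics. The sketch below uses the textbook definitions of sensitivity, precision and F1; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper's source data.

```python
def classification_metrics(tp, fp, fn):
    """Return (sensitivity, precision, f1) from raw prediction counts."""
    # Sensitivity (recall): fraction of true aggregate points that were found.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # Precision: fraction of predicted aggregate points that were correct.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # F1: harmonic mean of precision and sensitivity.
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return sensitivity, precision, f1

# Illustrative counts only (not from the paper).
sens, prec, f1 = classification_metrics(tp=90, fp=10, fn=10)
print(f"sensitivity={sens:.3f} precision={prec:.3f} F1={f1:.3f}")
```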

**Fig. 3. Performance evaluation of the morphology fingerprinting module on three diverse assembly morphologies.**
a The three diverse morphological structures of Fig. 2 are subjected to the morphology fingerprinting module. Each colour represents a cluster, except brown-red, which represents noise detections. b The derived features are reduced in dimensionality by a 3-component UMAP to visualize the separation of the identified clusters in the latent space and the grouping of the diverse morphologies. The dimensionality-reduced features are clustered using DBSCAN to identify groups of fingerprints. The four identified cluster groups are displayed, corresponding to three different simulated aggregational structures, as well as a cluster containing only pure noise (spherical zoom, points coloured by frame value). Further analysis of the group corresponding to fibrils by an additional 3-component UMAP and a new DBSCAN (square dashed line zoom on top) identified two local clusters mainly containing branched and non-branched fibrils, respectively (see Supplementary Fig. 13) (spherical zoom, points coloured by frame value). c The count of each simulation type is found through a simple investigation of clusters 1 to 4, where cluster 2 only contains data from the fibril simulation and is deemed noise by visual inspection. d Confusion matrix of classification accuracy for each cluster after the removal of noise. Cluster 1 predicts fibril (sensitivity 99.92%, F1 99.96%), cluster 3 random (sensitivity 99.21%, F1 99.31%), and cluster 4 isotropic growth (sensitivity 99.58%, F1 99.38%), resulting in an average F1 score of 99.55 ± 0.21%, clearly showing the descriptive information of morphology captured within the fingerprinting. Errors are standard deviations calculated across all aggregates. Source data are provided as a Source Data file.
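
To make the idea of a geometric fingerprint feature concrete, the sketch below computes one common circularity measure, 4πA/P², on the convex hull of a 2D point cloud. It is an illustration under assumptions: the captions list circularity among the feature groups but do not give SEMORE's exact definition, and the function names `convex_hull` and `hull_circularity` are hypothetical.

```python
from math import dist, pi

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order.

    Points must be (x, y) tuples so duplicates can be removed with set().
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_circularity(points):
    """4*pi*area / perimeter**2 of the convex hull.

    Equals 1 for a perfect circle and approaches 0 for elongated shapes;
    degenerate inputs (fewer than three hull vertices) return 0.0.
    """
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    closed = hull + hull[:1]
    edges = list(zip(closed, closed[1:]))
    # Shoelace formula for the polygon area.
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2
    perimeter = sum(dist(a, b) for a, b in edges)
    return 4 * pi * area / perimeter ** 2

# A unit square with one interior point: circularity is pi / 4.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
print(hull_circularity(square))
```

A compact, isotropically grown aggregate would score close to 1 on this measure, while a fibril-like structure would score much lower, which is the kind of contrast a fingerprint feature needs to capture.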

**Fig. 4. The SEMORE pipeline generalizes across widely diverse experimental systems: time-resolved insulin aggregation and nuclear pore complex (NPC) assembly.**
a Top: Final frame of accumulated super-resolution localizations from temporally resolved insulin aggregation. Bottom: Identification of each aggregate, depicted as a distinct colour, and calculation of its corresponding fingerprint by SEMORE. The scale bar shows 10 μm. b The collective fingerprints are processed through a 2-component UMAP and clustered using DBSCAN, resulting in two clusters: cluster 1 (red) contains low-density elongated anisotropic growth patterns, and cluster 2 (gray) contains isotropically grown high-density spherical-like structures. c Nine representative aggregates for each of the anisotropic and isotropic clusters, with points coloured by frame value. d Top: Accumulated super-resolution localizations of NPC assemblies from ref. . Bottom: Identification of each assembly, depicted as a distinct colour, and calculation of its corresponding fingerprint by SEMORE. The scale bar shows 1 μm. e Fingerprints of NPC processed through a 2-component UMAP and clustered using DBSCAN into 3 clusters: cluster 1 (green) corresponds to individual NPC assemblies, cluster 2 (black) to overlapping NPC assemblies and cluster 3 (gray) to noise. f Overlay of the clustered NPC, colour-coded by classification. The scale bar shows 1 μm. g Extracted radius of NPC, consistent with earlier reports. Source data are provided as a Source Data file.

Cited by

AI analysis of super-resolution microscopy: Biological discovery in the absence of ground truth.
Nabi IR, Cardoen B, Khater IM, Gao G, Wong TH, Hamarneh G. J Cell Biol. 2024 Aug 5;223(8):e202311073. doi: 10.1083/jcb.202311073. Epub 2024 Jun 12. PMID: 38865088. Free PMC article. Review.
ECLiPSE: a versatile classification technique for structural and morphological analysis of 2D and 3D single-molecule localization microscopy data.
Hugelier S, Tang Q, Kim HH, Gyparaki MT, Bond C, Santiago-Ruiz AN, Porta S, Lakadamyali M. Nat Methods. 2024 Oct;21(10):1909-1915. doi: 10.1038/s41592-024-02414-3. Epub 2024 Sep 10. PMID: 39256629. Free PMC article.
From Biophysics to Biomedical Physics.
Almahayni K, Bachir Salvador J, Moonnukandathil Jospeh D, Yürekli N, Möllmert S, Möckl L. ACS Bio Med Chem Au. 2024 Dec 19;5(3):320-333. doi: 10.1021/acsbiomedchemau.4c00096. eCollection 2025 Jun 18. PMID: 40556778. Free PMC article. Review.
Guardians of memory: The urgency of early dementia screening in an aging society.
Hu X, Ma YN, Karako K, Song P, Tang W, Xia Y. Intractable Rare Dis Res. 2024 Aug 31;13(3):133-137. doi: 10.5582/irdr.2024.01026. PMID: 39220280. Free PMC article. Review.
Advancing Multicolor Super-Resolution Volume Imaging: Illuminating Complex Cellular Dynamics.
Rabiee N, Lan X. JACS Au. 2025 Jun 9;5(6):2388-2419. doi: 10.1021/jacsau.5c00314. eCollection 2025 Jun 23. PMID: 40575297. Free PMC article. Review.
