BMC Bioinformatics. 2014 Aug 26;15(1):287.

doi: 10.1186/1471-2105-15-287.

Content-based histopathology image retrieval using CometCloud

Xin Qi¹, Daihou Wang, Ivan Rodero, Javier Diaz-Montes, Rebekah H Gensure, Fuyong Xing, Hua Zhong, Lauri Goodell, Manish Parashar, David J Foran, Lin Yang

PMID: 25155691
PMCID: PMC4161917
DOI: 10.1186/1471-2105-15-287

Xin Qi et al. BMC Bioinformatics. 2014.

Affiliation

¹ Department of Pathology and Laboratory Medicine, Rutgers Robert Wood Johnson Medical School, 675 Hoes Lane, Piscataway, NJ, USA. qixi@rutgers.edu.

Abstract

Background: Advances in digital imaging technology have brought extraordinary levels of accuracy, supporting improved reliability in several aspects of image analysis, such as content-based image retrieval, image segmentation, and classification. They have also dramatically increased the volume of data and the rate at which it is generated. Together, these facts make querying and sharing non-trivial and render centralized solutions infeasible. Moreover, these data are often distributed and must be shared across multiple institutions, which requires decentralized solutions. In this context, a new generation of data- and information-driven applications must be developed to take advantage of the national advanced cyber-infrastructure (ACI), which enables investigators to interact seamlessly and securely with data distributed across geographically disparate resources. This paper presents the development and evaluation of a novel content-based image retrieval (CBIR) framework. The methods were tested extensively using both peripheral blood smears and renal glomeruli specimens. The datasets and performance were evaluated by two pathologists to determine concordance.

Results: The CBIR algorithms that were developed can reliably retrieve the candidate image patches whose intensity and morphological characteristics are most similar to a given query image. The methods described in this paper reliably discriminate among subtle staining differences and spatial pattern distributions. Integrating a newly developed dual-similarity relevance feedback module into the CBIR framework improved the CBIR results substantially. By aggregating the computational power of high performance computing (HPC) and cloud resources, we demonstrated that the method can be executed in minutes on the Cloud, compared to weeks using standard computers.

Conclusions: In this paper, we present a set of newly developed CBIR algorithms and validate them using two different pathology applications, which are regularly evaluated in the practice of pathology. Comparative experimental results demonstrate excellent performance throughout the course of a set of systematic studies. Additionally, we present and evaluate a framework to enable the execution of these algorithms across distributed resources. We show how parallel searching of content-wise similar images in the dataset significantly reduces the overall computational time to ensure the practical utility of the proposed CBIR algorithms.
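The parallel search mentioned in the Conclusions can be illustrated with a minimal sketch. It assumes the dataset is already split into shards of precomputed feature vectors and uses Python's standard `ThreadPoolExecutor` as a local stand-in for distributed workers. It does not use or imitate the CometCloud API, and the names `search_shard`, `parallel_search`, and the L1 distance are hypothetical choices made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def l1_distance(a, b):
    """Sum of absolute differences between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def search_shard(query, shard, k):
    """Return the k closest (distance, patch_id) pairs within one shard."""
    scored = ((l1_distance(query, feat), pid) for pid, feat in shard)
    return heapq.nsmallest(k, scored)


def parallel_search(query, shards, k=3, workers=4):
    """Search every shard concurrently, then merge the partial top-k lists."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda s: search_shard(query, s, k), shards)
        # Merge inside the 'with' block so all shard results are consumed.
        return heapq.nsmallest(k, (hit for part in partials for hit in part))


# Toy data (assumed): two shards, each holding (patch_id, feature_vector) pairs.
shards = [
    [("a", [0.0, 1.0]), ("b", [0.5, 0.5])],
    [("c", [0.1, 0.9]), ("d", [1.0, 0.0])],
]
top = parallel_search([0.0, 1.0], shards, k=2)
top_ids = [pid for _, pid in top]
```

Because each shard is scored independently and only small top-k lists are merged, the same pattern carries over when the workers are separate machines rather than threads.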

Figures

**Figure 1**
**Workflow of the proposed CBIR algorithm.**

**Figure 2**
**An illustration of the hierarchical searching framework: (a) region of interest, (b) coarse searching step, and (c) fine searching step.**

**Figure 3**
**An illustration of HAH calculation. (a)** Color histogram of the central ring. **(b)** Color histogram of the fourth ring from the center. An example of two patches with **(c)** different HAH, but **(d)** similar color histogram of the entire image.

**Figure 4**
An illustration of results of the three-stage CBIR searching using one neutrophil as a query image from peripheral blood smears acquired at 20× objective, in which green box labeled regions represent the candidate patches.

**Figure 5**
**CBIR results using different classes of leukocytes as query images, including basophil, eosinophil, lymphocyte, monocyte, and neutrophil, respectively.** Here green box labeled regions represent the candidate patches that are similar to the query image patch. Each box has a number to indicate the ranking order of every candidate patch in the dataset. The original sizes of the images were adjusted to fit in the figure.

**Figure 6**
**An example of top 10% CBIR results for a necrotic glomerulus query image.** Red box labeled regions indicate the query image. Blue box labeled regions represent the healthy glomeruli for comparison. Green box labeled regions denote the top 10% ranked retrieved patches, which include multiple scaled regions at 1/2, 1, 2, 3, and 4 times of the original size of the query image. The original sizes of the images were adjusted to fit in the figure.

**Figure 7**
**Local feature comparison using HAH, Gabor wavelet and COOC texture features. (a)** Precision-recall curves of CBIR results using HAH, Gabor wavelet, and COOC texture features on peripheral blood smears. **(b)** Average of feature calculation times per image patch using HAH, Gabor wavelet, and COOC texture features on peripheral blood smears. **(c)** Precision-recall curves of CBIR results using HAH, Gabor wavelet, and COOC texture features on renal glomeruli images. **(d)** Average of feature calculation times per image patch using HAH, Gabor wavelet, and COOC texture features on renal glomeruli images.

**Figure 8**
**Top ranked patches before and after relevance feedback of three classes of leukocytes ((a) neutrophil, (b) monocyte, and (c) lymphocyte).** Patches with red rectangles represent the incorrect results (negative examples), and blue rectangles denote the correct results (positive examples), which were re-assigned to higher rankings through the relevance feedback process. The original sizes of the images were adjusted to fit in the figure.

**Figure 9**
**Top ranked patches before and after relevance feedback of the renal glomeruli dataset.** Patches with red rectangles represent the incorrect results (negative examples), and blue rectangles represent the correct results (positive examples), which were re-assigned to higher rankings through the relevance feedback process.

**Figure 10**
**The ROC curves of the dual-similarity relevance feedback using the peripheral blood smear image dataset (neutrophil, monocyte, and lymphocyte), and the renal glomeruli dataset.**

**Figure 11**
The recall curves after relevance feedback (RF) and their calculation times using the peripheral blood smear image dataset ((a) neutrophil, (b) monocyte, and (c) lymphocyte) and (d) the renal glomeruli dataset. The numbers of training samples were 20, 50 and 90.

**Figure 12**
The execution time of hierarchical searching process using (a) peripheral blood smear dataset, and (b) renal glomeruli dataset, with different combinations of number of the HAH rings and the percentage of overlapping.

**Figure 13**
The execution time of sequential and federated infrastructure using peripheral blood smear dataset and renal glomeruli dataset with different combinations of number of the rings and the percentage of overlapping. Here the Y-axis is in a logarithmic scale.

**Figure 14**
**The number of completed tasks over time when testing the CBIR algorithm using the renal glomeruli dataset.** The area under “FutureGrid” represents the contribution from the cloud resources.

**Figure 15**
**The average task execution time per platform using the peripheral blood smear and renal glomeruli datasets with different combinations of number of the rings and the percentage of overlapping.** Here “FutureGrid” is abbreviated as “fg”.

**Figure 16**
**The execution time per task using (a) the peripheral blood smear dataset and (b) the renal glomeruli dataset.**
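The idea behind Figure 3, where two patches share a similar whole-image histogram yet differ ring by ring, can be sketched as follows. This is a simplified illustration and not the authors' implementation: it uses grayscale intensities instead of color, and the ring boundaries, bin count, and function names are assumptions.

```python
import math


def annular_histograms(patch, n_rings=4, n_bins=8, max_value=256):
    """Return one normalized intensity histogram per concentric ring.

    patch is a 2D list of integer intensities in [0, max_value).
    """
    h, w = len(patch), len(patch[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Largest centre-to-pixel distance, so every pixel falls in some ring.
    r_max = math.hypot(cy, cx) or 1.0
    hists = [[0.0] * n_bins for _ in range(n_rings)]
    for y in range(h):
        for x in range(w):
            r = math.hypot(y - cy, x - cx)
            ring = min(int(r / r_max * n_rings), n_rings - 1)
            b = min(patch[y][x] * n_bins // max_value, n_bins - 1)
            hists[ring][b] += 1.0
    for hist in hists:
        total = sum(hist)
        if total:
            for i in range(n_bins):
                hist[i] /= total
    return hists


def l1_distance(a, b):
    """Sum of absolute differences between two flattened histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def flatten(hists):
    return [v for hist in hists for v in hist]


# Two toy 4x4 patches with the same pixel counts but different layouts:
# bright pixels in the centre (patch_a) versus in the corners (patch_b).
patch_a = [[50] * 4, [50, 200, 200, 50], [50, 200, 200, 50], [50] * 4]
patch_b = [[200, 50, 50, 200], [50] * 4, [50] * 4, [200, 50, 50, 200]]

# A single ring is just the histogram of the entire patch.
global_a = annular_histograms(patch_a, n_rings=1)[0]
global_b = annular_histograms(patch_b, n_rings=1)[0]

ring_distance = l1_distance(
    flatten(annular_histograms(patch_a, n_rings=2)),
    flatten(annular_histograms(patch_b, n_rings=2)),
)
```

Here `global_a` and `global_b` are identical, while `ring_distance` is non-zero, which is the situation panels (c) and (d) of Figure 3 describe.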
