BMC Bioinformatics. 2014 Aug 26;15(1):287.
doi: 10.1186/1471-2105-15-287.

Content-based histopathology image retrieval using CometCloud

Xin Qi et al. BMC Bioinformatics. 2014.

Abstract

Background: Advances in digital imaging technology are delivering extraordinary levels of accuracy, which in turn improve the reliability of several aspects of image analysis, such as content-based image retrieval, image segmentation, and classification. These advances have also dramatically increased the volume and rate at which data are generated. Together, these facts make querying and sharing non-trivial and render centralized solutions infeasible. Moreover, the data are often distributed and must be shared across multiple institutions, which requires decentralized solutions. In this context, a new generation of data- and information-driven applications must be developed to take advantage of the national advanced cyber-infrastructure (ACI), which enables investigators to seamlessly and securely interact with information and data distributed across geographically disparate resources. This paper presents the development and evaluation of a novel content-based image retrieval (CBIR) framework. The methods were tested extensively using both peripheral blood smears and renal glomeruli specimens. The datasets and performance were evaluated by two pathologists to determine the concordance.

Results: The CBIR algorithms that were developed can reliably retrieve the candidate image patches whose intensity and morphological characteristics are most similar to a given query image. The methods described in this paper reliably discriminate among subtle staining differences and spatial pattern distributions. Integrating a newly developed dual-similarity relevance feedback module into the CBIR framework improved the CBIR results substantially. By aggregating the computational power of high performance computing (HPC) and cloud resources, we demonstrated that the method can be executed in minutes on the cloud, compared with weeks on standard computers.

Conclusions: In this paper, we present a set of newly developed CBIR algorithms and validate them using two different pathology applications, which are regularly evaluated in the practice of pathology. Comparative experimental results demonstrate excellent performance throughout the course of a set of systematic studies. Additionally, we present and evaluate a framework to enable the execution of these algorithms across distributed resources. We show how parallel searching of content-wise similar images in the dataset significantly reduces the overall computational time to ensure the practical utility of the proposed CBIR algorithms.
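
The parallel search described above can be illustrated with a small partition-and-merge sketch. This is not the authors' CometCloud implementation: the function names (`l2_distance`, `search_chunk`, `parallel_search`), the Euclidean distance, the toy two-dimensional feature vectors, and the use of a local thread pool are all assumptions made for illustration. The idea shown is only that the dataset is split into chunks, each chunk is ranked against the query independently, and the partial rankings are merged into one top-k list.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two feature vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_chunk(query, chunk, k):
    """Return the k nearest (distance, patch_id) pairs within one chunk."""
    scored = ((l2_distance(query, feat), pid) for pid, feat in chunk)
    return heapq.nsmallest(k, scored)


def parallel_search(query, database, k=3, workers=4):
    """Split the database into chunks, rank each chunk, merge the top k."""
    items = list(database.items())
    size = max(1, math.ceil(len(items) / workers))
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda c: search_chunk(query, c, k), chunks)
        # Merging the per-chunk winners gives the same result as a full scan.
        return heapq.nsmallest(k, (hit for hits in partial for hit in hits))


# Hypothetical patch features, keyed by patch id.
database = {
    "p0": [0.0, 0.0],
    "p1": [1.0, 0.0],
    "p2": [5.0, 5.0],
    "p3": [0.1, 0.1],
    "p4": [9.0, 9.0],
}
results = parallel_search([0.0, 0.0], database, k=2, workers=2)
ranked_ids = [pid for _, pid in results]
```

Because each chunk is ranked independently, the same pattern extends to workers on separate machines; threads are used here only to keep the sketch self-contained, and in CPython they would not by themselves speed up CPU-bound feature comparison.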

Figures

Figure 1
Workflow of the proposed CBIR algorithm.
Figure 2
An illustration of the hierarchical searching framework: (a) region of interest, (b) coarse searching step, and (c) fine searching step.
Figure 3
An illustration of HAH calculation. (a) Color histogram of the central ring. (b) Color histogram of the fourth ring from the center. An example of two patches with (c) different HAH, but (d) similar color histogram of the entire image.
Figure 4
An illustration of results of the three-stage CBIR searching using one neutrophil as a query image from peripheral blood smears acquired at 20× objective, in which green box labeled regions represent the candidate patches.
Figure 5
CBIR results using different classes of leukocytes as query images, including basophil, eosinophil, lymphocyte, monocyte, and neutrophil, respectively. Here green box labeled regions represent the candidate patches that are similar to the query image patch. Each box has a number to indicate the ranking order of every candidate patch in the dataset. The original sizes of the images were adjusted to fit in the figure.
Figure 6
An example of the top 10% CBIR results for a necrotic glomerulus query image. The red boxes mark the query image. The blue boxes mark healthy glomeruli for comparison. The green boxes mark the top 10% ranked retrieved patches, which include regions scaled to 1/2, 1, 2, 3, and 4 times the original size of the query image. The original sizes of the images were adjusted to fit in the figure.
Figure 7
Local feature comparison using HAH, Gabor wavelet and COOC texture features. (a) Precision-recall curves of CBIR results using HAH, Gabor wavelet, and COOC texture features on peripheral blood smears. (b) Average of feature calculation times per image patch using HAH, Gabor wavelet, and COOC texture features on peripheral blood smears. (c) Precision-recall curves of CBIR results using HAH, Gabor wavelet, and COOC texture features on renal glomeruli images. (d) Average of feature calculation times per image patch using HAH, Gabor wavelet, and COOC texture features on renal glomeruli images.
Figure 8
Top ranked patches before and after relevance feedback of three classes of leukocytes ((a) neutrophil, (b) monocyte, and (c) lymphocyte). Patches with red rectangles represent the incorrect results (negative examples), and blue rectangles denote the correct results (positive examples), which were re-assigned to higher rankings through the relevance feedback process. The original sizes of the images were adjusted to fit in the figure.
Figure 9
Top ranked patches before and after relevance feedback of the renal glomeruli dataset. Patches with red rectangles represent the incorrect results (negative examples), and blue rectangles represent the correct results (positive examples), which were re-assigned to higher rankings through the relevance feedback process.
Figure 10
The ROC curves of the dual-similarity relevance feedback using the peripheral blood smear image dataset (neutrophil, monocyte, and lymphocyte), and the renal glomeruli dataset.
Figure 11
The recall curves after relevance feedback (RF) and their calculation times using the peripheral blood smear image dataset ((a) neutrophil, (b) monocyte, and (c) lymphocyte) and (d) the renal glomeruli dataset. The numbers of training samples were 20, 50, and 90.
Figure 12
The execution time of the hierarchical searching process using (a) the peripheral blood smear dataset and (b) the renal glomeruli dataset, for different combinations of the number of HAH rings and the percentage of overlap.
Figure 13
The execution time on the sequential and federated infrastructures using the peripheral blood smear dataset and the renal glomeruli dataset, for different combinations of the number of rings and the percentage of overlap. The Y-axis is on a logarithmic scale.
Figure 14
The number of completed tasks over time when testing the CBIR algorithm using the renal glomeruli dataset. The area under “FutureGrid” represents the contribution from the cloud resources.
Figure 15
The average task execution time per platform using the peripheral blood smear and renal glomeruli datasets, for different combinations of the number of rings and the percentage of overlap. Here “FutureGrid” is abbreviated as “fg”.
Figure 16
The execution time per task using (a) the peripheral blood smear dataset and (b) the renal glomeruli dataset.
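
The point made in the Figure 3 caption, that two patches can share a whole-image histogram yet differ ring by ring, can be reproduced with a toy example. The sketch below is an assumption-laden simplification, not the paper's HAH feature: it uses grayscale integer pixels instead of color, square concentric rings (Chebyshev distance) instead of circular ones, and hypothetical helper names (`ring_index`, `ring_histograms`, `global_histogram`).

```python
from collections import Counter


def ring_index(r, c, size, n_rings):
    """Map pixel (r, c) of a size x size patch to a ring; 0 is the center."""
    center = (size - 1) / 2
    # Chebyshev distance gives square "rings" (a simplifying assumption).
    dist = max(abs(r - center), abs(c - center))
    max_dist = center if center > 0 else 1
    return min(n_rings - 1, int(dist / max_dist * n_rings))


def ring_histograms(patch, n_rings=2):
    """Return one intensity histogram per concentric ring of a square patch."""
    size = len(patch)
    hists = [Counter() for _ in range(n_rings)]
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            hists[ring_index(r, c, size, n_rings)][value] += 1
    return hists


def global_histogram(patch):
    """Return the intensity histogram of the whole patch."""
    return Counter(v for row in patch for v in row)


# Two toy 4 x 4 patches, each with four dark (0) and twelve bright (1) pixels.
patch_a = [  # dark pixels concentrated in the center
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
patch_b = [  # dark pixels along the top edge
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

same_global = global_histogram(patch_a) == global_histogram(patch_b)
same_rings = ring_histograms(patch_a) == ring_histograms(patch_b)
```

Here `same_global` is true while `same_rings` is false: the ring-wise description separates the two patches even though the whole-image histogram cannot.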
