Clinical Trial
Sci Rep. 2020 Feb 24;10(1):3313.
doi: 10.1038/s41598-020-59847-x.

Machine Learning Analysis for Quantitative Discrimination of Dried Blood Droplets

Lama Hamadeh et al. Sci Rep. 2020.

Abstract

One of the most interesting everyday natural phenomena is the formation of different patterns after liquid droplets evaporate on a solid surface. The analysis of dried patterns from blood droplets has recently gained a lot of attention, experimentally and theoretically, due to its potential application in diagnostic medicine and forensic science. This paper presents evidence that images of dried blood droplets carry a signature revealing the exhaustion level of the person, and discloses an entirely novel approach to studying human dried blood droplet patterns. We took blood samples from 30 healthy young male volunteers before and after exhaustive exercise, which is well known to cause large changes to blood chemistry. We objectively and quantitatively analysed 1800 images of dried blood droplets, developing sophisticated image processing analysis routines and optimising a multivariate statistical machine learning algorithm. We looked for statistically relevant correlations between the patterns in the dried blood droplets and exercise-induced changes in blood chemistry, and we also analysed the various measured physiological parameters. Our machine learning algorithm optimises a statistical model that combines Principal Component Analysis (PCA) as an unsupervised learning method with Linear Discriminant Analysis (LDA) as a supervised learning method. When applied to the logarithmic power spectrum of the images, it provides up to 95% prediction accuracy in discriminating the physiological conditions, i.e., before or after physical exercise. This correlation is strongest when all ten images taken per volunteer per condition are averaged, rather than treated individually. Having demonstrated proof of principle, we suggest that this method can be applied to identify diseases.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Statistical analysis of the measured blood chemistry properties. Correlation matrices are calculated for the 14 properties measured at (a) baseline, (b) peak exercise, (c) after 2 minutes exercise, (d) after 4 minutes exercise and (e) after 6 minutes exercise. The first three scores of principal component analysis, an unsupervised dimensionality reduction and clustering method, are shown in separate 2D plots (f,h,i) for the chemical properties [H+], PCO2, [Hb], ΔBV, PO2, [K+], [Na+], [Ca2+], [Cl−], [Glucose], [Lactate] and [HCO3−]; pH and [SID] are discarded because they are related to the other quantities. The best discrimination between all five blood conditions is revealed along the first principal component score, shown in (f). The behaviour of the centroids of the condition clusters is shown in (g), which strongly suggests that the blood condition, from the chemistry point of view, is returning towards baseline after 6 minutes of exercise.
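The correlation matrices in panels (a) to (e) can be illustrated with a minimal sketch. It assumes the Pearson correlation coefficient, which the caption does not name explicitly; the property names and values below are made up for illustration and are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """columns maps a property name to its measurements (one value per volunteer)."""
    names = list(columns)
    matrix = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    return names, matrix

# Hypothetical values for three properties at one time point (NOT study data).
demo = {
    "lactate": [1.0, 2.0, 3.0, 4.0],
    "HCO3": [8.0, 6.0, 4.0, 2.0],  # constructed to be anti-correlated with lactate
    "K": [4.0, 4.5, 4.2, 4.8],
}
names, corr = correlation_matrix(demo)
```

One such matrix would be computed per condition (baseline, peak, and each later time point), giving the five panels described in the caption.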
Figure 2
Outcome of the optimised machine learning algorithm when discriminating the logarithmic power spectra of two blood conditions: (a) baseline against peak, (b) baseline against after 2 mins, (c) baseline against after 4 mins and (d) baseline against after 6 mins. The lowest error rate, e = 5%, is obtained when discriminating baseline from after 6 mins; it is shown in the scatter plot (d,3), which illustrates the relationship between the first two LDA scores, i.e., LDs1 and LDs2. Here MCi, where i = 1, 2, 3, refers to the misclassified baseline points. Panels (d,1) and (d,2) represent the first and second LDA functions, respectively, where each element of each row (frequency) corresponds to the average pixel intensity at a given radius. The spatial frequency of the linear discriminant functions is measured in units of cycles per revolution.
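To make the idea of an error rate from a linear discriminant concrete, here is a minimal sketch of a two-class Fisher discriminant on 2D feature vectors. This is a simplification and not the authors' implementation: it uses a single projection, whereas the figure reports two LDA scores, and the points below are hypothetical.

```python
def mean_vec(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, m):
    """2x2 scatter matrix of one class about its mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_lda(class0, class1):
    """Return (w, threshold); a point x is assigned to class 1 if w.x > threshold."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Fisher direction: inverse within-class scatter times the mean difference.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Threshold halfway between the projected class means.
    threshold = 0.5 * ((w[0] * m0[0] + w[1] * m0[1]) + (w[0] * m1[0] + w[1] * m1[1]))
    return w, threshold

def error_rate(w, threshold, class0, class1):
    """Fraction of points that fall on the wrong side of the threshold."""
    score = lambda p: w[0] * p[0] + w[1] * p[1]
    wrong = sum(score(p) > threshold for p in class0)
    wrong += sum(score(p) <= threshold for p in class1)
    return wrong / (len(class0) + len(class1))

# Hypothetical 2D features for two conditions (NOT study data).
baseline = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
after = [(4.0, 4.5), (4.5, 4.0), (5.0, 5.2), (4.2, 4.8)]
w, threshold = fisher_lda(baseline, after)
e = error_rate(w, threshold, baseline, after)
```

An error rate such as the e = 5% quoted in the caption would correspond to the value returned by `error_rate`, evaluated on whichever images the authors used for testing.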
Figure 3
The effect of averaging separate images from the same volunteer and the same physiological state on the final discrimination outcome. In (a), individual images with no averaging are used for the discrimination process; the accuracy of this approach is 75.7%. The accuracy increases noticeably as the number of averaged images increases in (b–f). As shown previously in Fig. 2(d,3), when all images (usually ten to twelve) taken per volunteer per condition are averaged, the accuracy increases considerably, to 95%. An optimised training exercise was undertaken for each separate case.
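The averaging step can be sketched as follows, assuming each image has already been reduced to a feature vector (for example a log power spectrum) and that the vectors in the input list all come from the same volunteer and condition. The function name and the choice to drop a trailing partial group are illustrative assumptions.

```python
def average_in_groups(vectors, k):
    """Average consecutive groups of k feature vectors element-wise.

    A trailing group with fewer than k vectors is dropped.
    """
    averaged = []
    for start in range(0, len(vectors) - k + 1, k):
        group = vectors[start:start + k]
        averaged.append([sum(column) / k for column in zip(*group)])
    return averaged

# Hypothetical feature vectors from one volunteer in one condition.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
pairs = average_in_groups(features, 2)
all_four = average_in_groups(features, 4)
```

For noise that is independent between images, averaging k images reduces the noise standard deviation by roughly a factor of the square root of k, which is one plausible reason the accuracy improves as more images are averaged.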
Figure 4
The trajectory of the centroid of the blood condition clusters, with images averaged over each person. The individual volunteers’ images, projected onto our DF space, are also shown to disclose the inter-subject variability.
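A centroid trajectory of this kind can be computed by averaging the projected points of each condition. The sketch below assumes 2D scores and uses hypothetical condition labels and coordinates.

```python
def centroids_by_condition(points):
    """points: list of (condition, (x, y)) pairs.

    Returns {condition: (cx, cy)}, keyed in the order conditions first appear.
    """
    sums = {}
    for condition, (x, y) in points:
        sx, sy, n = sums.get(condition, (0.0, 0.0, 0))
        sums[condition] = (sx + x, sy + y, n + 1)
    return {c: (sx / n, sy / n) for c, (sx, sy, n) in sums.items()}

# Hypothetical projected scores (NOT study data).
scores = [
    ("baseline", (0.0, 0.0)),
    ("baseline", (2.0, 2.0)),
    ("peak", (4.0, 6.0)),
    ("peak", (6.0, 8.0)),
]
trajectory = centroids_by_condition(scores)
```

Joining the centroids in time order (baseline, peak, then the later time points) gives the trajectory, while the spread of individual points around each centroid reflects the inter-subject variability.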
Figure 5
Work methodology. The study begins by building the database of dried blood images taken from 30 healthy volunteers before and after a cycling exercise. These images are then pre-processed individually using several image analysis techniques: crack filling, Gaussian filtering, edge detection, polar coordinate transformation and power spectrum calculation. The database of logarithmic power spectra of the images is then passed to our machine learning algorithm. This algorithm starts by reducing the dimensionality of the image database using principal component analysis. Using the resulting feature space, i.e., the subspace spanned by the principal components with the highest variance, we move on to the classification step, where we use linear discriminant analysis to find the line that best separates the conditions of the images. The performance of this algorithm is further enhanced by applying an optimising iterative search to select the ‘ideal’ few participants who would best train the algorithm and result in the lowest overall error rate possible.
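The dimensionality-reduction step can be illustrated with a minimal sketch that extracts the first principal component by power iteration on the covariance matrix. This is an assumption made for a dependency-free example; the authors' implementation is not described here, and in practice a library routine based on the singular value decomposition would normally be used. The data are hypothetical.

```python
import math

def first_principal_component(rows, iters=200):
    """Return (direction, scores) for the first principal component.

    rows: list of equal-length feature vectors, one per image.
    Uses power iteration, which assumes the start vector is not
    orthogonal to the leading eigenvector of the covariance matrix.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Scores are the projections of the centred data onto the direction.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centred]
    return v, scores

# Hypothetical data lying exactly on the line y = 2x.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction, pc_scores = first_principal_component(data)
```

The scores along the leading components would then form the feature space passed to the linear discriminant analysis step.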
Figure 6
Image pre-processing methods where (a) shows the raw blood image, (b) represents the red channel of the image, (c) illustrates the 2D interpolated cracks, (d) shows a 2D Gaussian filtered image, (e) shows the Laplacian of the image function, (f) is the polar coordinate version of the image, (g) is the 1D power spectrum of the image previously shown and (h) is the logarithmic power spectrum.
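Steps (g) and (h) can be sketched for a single ring of the polar image, i.e., pixel intensities sampled at equal angular steps around the droplet at one radius. The sketch assumes the transform is taken along the angular direction and that a base-10 logarithm is used; neither detail is stated in the caption. The small constant `eps` is added only to avoid taking the logarithm of zero.

```python
import cmath
import math

def log_power_spectrum(signal, eps=1e-12):
    """Return log10(|X_k|^2 + eps) for k = 0 .. N//2 using a naive O(N^2) DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        xk = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                 for t, s in enumerate(signal))
        spectrum.append(math.log10(abs(xk) ** 2 + eps))
    return spectrum

# Hypothetical ring: 16 angular samples with 3 intensity cycles per revolution.
ring = [math.cos(2 * math.pi * 3 * t / 16) for t in range(16)]
spec = log_power_spectrum(ring)
peak_frequency = spec.index(max(spec))
```

With the transform taken around the droplet, the index k counts cycles per revolution, so the synthetic ring above peaks at k = 3.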

Publication types