Front Med (Lausanne). 2016 Jun 13:3:22.
doi: 10.3389/fmed.2016.00022. eCollection 2016.

Separating Putative Pathogens from Background Contamination with Principal Orthogonal Decomposition: Evidence for Leptospira in the Ugandan Neonatal Septisome

Steven J Schiff et al. Front Med (Lausanne).

Abstract

Neonatal sepsis (NS) is responsible for over 1 million yearly deaths worldwide. In the developing world, NS is often treated without an identified microbial pathogen. Amplicon sequencing of the bacterial 16S rRNA gene can be used to identify organisms that are difficult to detect by routine microbiological methods. However, contaminating bacteria are ubiquitous in both hospital settings and research reagents and must be accounted for to make effective use of these data. In this study, we sequenced the bacterial 16S rRNA gene obtained from blood and cerebrospinal fluid (CSF) of 80 neonates presenting with NS to the Mbarara Regional Hospital in Uganda. Assuming that patterns of background contamination would be independent of pathogenic microorganism DNA, we applied a novel quantitative approach using principal orthogonal decomposition to separate background contamination from potential pathogens in sequencing data. We designed our quantitative approach contrasting blood, CSF, and control specimens and employed a variety of statistical random matrix bootstrap hypotheses to estimate statistical significance. These analyses demonstrate that Leptospira appears present in some infants presenting within 48 h of birth, indicative of infection in utero, and up to 28 days of age, suggesting environmental exposure. This organism cannot be cultured in routine bacteriological settings and is enzootic in the cattle that often live in close proximity to the rural peoples of western Uganda. Our findings demonstrate that statistical approaches to remove background organisms common in 16S sequence data can reveal putative pathogens in small volume biological samples from newborns. This computational analysis thus reveals an important medical finding that has the potential to alter therapy and prevention efforts in a critically ill population.

Keywords: 16S rRNA; Leptospira; bacteria; neonatal sepsis; principal orthogonal decomposition; singular value decomposition.

Figures

Figure 1
Characterization of the dataset and modes. (A) Graphical representation of read counts for 131 genus identifications in 95 samples, with columns sorted from left to right in descending order of total reads for each taxon. The color map is scaled to amplify the lowest 1% of read counts, and the darkest red at the top of the color bar represents all counts from 320 to 32,000 in order to aid visualization of the dataset. This image is a visualization of the data in Table S1 in Supplementary Material, and the taxa for each column, from left to right, are given in the table in the same columnar order. (B) Fisher's canonical linear discrimination demonstrates the optimal linear combinations of the read counts (Z1 and Z2) that maximally separate samples from blood, CSF, and controls. Two of the three control samples overlap in the plot. Group means are shown as large symbols. (C) First 10 eigenmodes from principal orthogonal decomposition and the total energy [cumulative energy fraction, E] accounted for by summing modes progressively from left to right. Only the first 10 columns are plotted in each mode. The sum of all modes, each weighted by its eigenvalue, would equal the original dataset [a full discussion of this geometry can be found in Chapter 7.3 of Schiff (16)]. (D) The weighting of each mode (log of eigenvalue amplitudes) is shown, as well as the tolerance for insignificance (dashed line) below which eigenvalues are not resolvable. There are 95 eigenvalues, one for each patient sample and control. (E) Composition of the first three modes in terms of their representative genera sorted in descending order as blue, green, and red.
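
The mode geometry described in panels (C) and (D) can be illustrated with a singular value decomposition, which is one standard way to compute a principal orthogonal decomposition. The following is a minimal sketch and not the authors' code: the toy count matrix, the function name pod_modes, and the use of squared singular values as the "energy" of each mode are assumptions made for illustration only.

```python
import numpy as np


def pod_modes(counts):
    """Split a samples-by-taxa matrix into rank-one modes using the SVD."""
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    # Assumption: the energy of a mode is its squared singular value.
    energy = s ** 2
    cumulative_energy = np.cumsum(energy) / energy.sum()
    # Mode k is the weighted outer product s[k] * U[:, k] * Vt[k].
    modes = [s[k] * np.outer(U[:, k], Vt[k]) for k in range(len(s))]
    return modes, s, cumulative_energy


# Hypothetical toy data: 6 samples by 4 taxa. The first taxon mimics a
# background contaminant seen in every sample; the last taxon is boosted
# in two samples only.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=[50, 5, 2, 1], size=(6, 4)).astype(float)
counts[:2, 3] += 40

modes, s, E = pod_modes(counts)

# Summing every weighted mode recovers the original matrix.
reconstructed = sum(modes)
print("max reconstruction error:", np.abs(reconstructed - counts).max())
print("cumulative energy fraction E:", np.round(E, 4))
```

The number of modes equals the smaller matrix dimension, which matches the caption's count of 95 eigenvalues for a 95-sample by 131-genus dataset.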
Figure 2
Hypothesis testing for modes using random matrices. (A) Random matrix bootstrap ensemble distribution for all samples showing the mean (black solid line) and ±1 SD (blue dotted lines) for 1000 randomizations of all matrix eigenvalue amplitudes, and the original dataset eigenvalues (red asterisks). (B) Graphical representation of a randomization of Figure 1A using the same color map scale. (C) All samples with mode 1 removed, with the comparable mode composition in (D). (E) shows the eigenvalue distribution for blood samples only, with mode composition in (F). (G) shows eigenvalues for blood with mode 1 removed, with the mode composition in (H). (I) illustrates the probabilities of obtaining the first mode eigenvalues for all eigenvalues, and the bootstrap histograms that underlie the probabilities of the first three modal eigenvalues from (G,H), illustrating the significance of the dominant Leptospira mode from (H) (similar results from randomizing only by bacterial type are not shown). Note that after removing the mode dominated by Ralstonia in the blood samples, the Leptospira dominant mode has an eigenvalue far larger than any eigenvalue generated from the randomized dataset. In contrast, the next two modes generate relatively small eigenvalues compared with the bootstrapped values. These results demonstrate that, with the contaminating mode removed, it is highly statistically unlikely that random contamination was responsible for the pattern of Leptospira reads observed.
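
The random matrix test outlined in this caption can be sketched as follows. This is an illustration under stated assumptions, not the published procedure: it assumes that each randomization shuffles every entry of the matrix, that the leading singular value stands in for the leading eigenvalue amplitude, and that the shuffle is applied to the matrix left after the first mode is subtracted. The function names and toy data are hypothetical.

```python
import numpy as np


def remove_first_mode(counts):
    """Subtract the dominant rank-one mode (the presumed contaminant)."""
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    return counts - s[0] * np.outer(U[:, 0], Vt[0])


def bootstrap_leading_value(matrix, n_rand=1000, seed=0):
    """Compare the leading singular value with a shuffled-matrix ensemble."""
    rng = np.random.default_rng(seed)
    observed = np.linalg.svd(matrix, compute_uv=False)[0]
    null = np.empty(n_rand)
    for i in range(n_rand):
        # Assumption: a randomization permutes all matrix entries.
        shuffled = rng.permutation(matrix.ravel()).reshape(matrix.shape)
        null[i] = np.linalg.svd(shuffled, compute_uv=False)[0]
    # Empirical probability of a shuffled value at least as large as observed.
    p = (1 + np.sum(null >= observed)) / (1 + n_rand)
    return observed, null, p


# Hypothetical toy data: 20 samples by 4 taxa, with a ubiquitous first
# taxon and a fourth taxon concentrated in three samples.
rng = np.random.default_rng(1)
counts = rng.poisson(lam=[60, 4, 2, 1], size=(20, 4)).astype(float)
counts[:3, 3] += 40

residual = remove_first_mode(counts)
observed, null, p = bootstrap_leading_value(residual, n_rand=200)
print("observed leading value:", observed)
print("ensemble mean and SD:", null.mean(), null.std())
print("empirical p:", p)
```

Adding 1 to both the numerator and denominator keeps the empirical probability away from exactly zero, so a mode that exceeds every shuffled value is reported as p = 1/(n_rand + 1) rather than 0.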

References

    1. UN. Levels & Trends in Child Mortality. Inter-Agency Group for Child Mortality Estimation; (2012). p. 1–32.
    2. John CC, Carabin H, Montano SM, Bangirana P, Zunt JR, Peterson PK. Global research priorities for infections that affect the nervous system. Nature (2015) 527(7578):S178–86. doi:10.1038/nature16033
    3. Williams EJ, Thorson S, Maskey M, Mahat S, Hamaluba M, Dongol S, et al. Hospital-based surveillance of invasive pneumococcal disease among young children in urban Nepal. Clin Infect Dis (2009) 48(Suppl 2):S114–22. doi:10.1086/596488
    4. Darmstadt GL, Saha SK, Choi Y, El Arifeen S, Ahmed NU, Bari S, et al. Population-based incidence and etiology of community-acquired neonatal bacteremia in Mirzapur, Bangladesh: an observational study. J Infect Dis (2009) 200:906–15. doi:10.1086/605473
    5. Mugalu J, Nakakeeto MK, Kiguli S, Kaddu-Mulindwa DH. Aetiology, risk factors and immediate outcome of bacteriologically confirmed neonatal septicaemia in Mulago Hospital, Uganda. Afr Health Sci (2006) 6:120–6. doi:10.5555/afhs.2006.6.2.120
