Sci Rep. 2017 Sep 20;7(1):11921.
doi: 10.1038/s41598-017-11940-4.

Identification and correction of spatial bias are essential for obtaining quality data in high-throughput screening technologies

Bogdan Mazoure et al. Sci Rep. 2017.

Abstract

Spatial bias continues to be a major challenge in high-throughput screening technologies. Its successful detection and elimination are critical for identifying the most promising drug candidates. Here, we examine experimental small molecule assays from the popular ChemBank database and show that screening data are widely affected by both assay-specific and plate-specific spatial biases. Importantly, the bias affecting screening data can fit an additive or multiplicative model. We show that the use of appropriate statistical methods is essential for improving the quality of experimental screening data. The presented methodology can be recommended for the analysis of current and next-generation screening data.
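
The distinction between additive and multiplicative bias can be illustrated with a small sketch. It is not the PMP algorithm evaluated in the paper; it uses Tukey's median polish (the sweep underlying the B-score) as a stand-in, and it assumes that a multiplicative effect becomes additive after a log transform. The plate size, the effect values, and the names `median_polish`, `row_eff`, and `col_eff` are hypothetical.

```python
import math
from statistics import median

def median_polish(plate, n_iter=10):
    """Return residuals after repeatedly sweeping out row and column medians."""
    resid = [[float(v) for v in row] for row in plate]
    for _ in range(n_iter):
        for row in resid:                       # sweep out row medians
            m = median(row)
            for j in range(len(row)):
                row[j] -= m
        for j in range(len(resid[0])):          # sweep out column medians
            m = median(row[j] for row in resid)
            for row in resid:
                row[j] -= m
    return resid

# Hypothetical row and column effects on a tiny 3 x 4 plate (no noise, no hits).
row_eff = [0.5, -1.0, 2.0]
col_eff = [0.0, 1.5, -0.5, 3.0]
base = 10.0

# Additive model: the bias is added to the underlying signal.
additive = [[base + r + c for c in col_eff] for r in row_eff]

# Multiplicative model: the bias scales the underlying signal.
multiplicative = [[base * math.exp(r) * math.exp(c) for c in col_eff]
                  for r in row_eff]

# Additive bias is removed by polishing the raw values.
res_add = median_polish(additive)

# Multiplicative bias is removed by polishing the log-transformed values.
res_mul = median_polish([[math.log(v) for v in row] for row in multiplicative])

# Polishing the raw multiplicative plate leaves structured residuals behind.
res_wrong = median_polish(multiplicative)
```

In this noise-free toy case, `res_add` and `res_mul` are zero up to floating-point error, while `res_wrong` is not, which is one way to see why the choice of bias model matters before correction.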

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1
Average true positive rate (Panels (a) and (c)) and average total number of false positive and false negative hits per assay (Panels (b) and (d)) obtained by No Correction, Well Correction, B-score, and PMP followed by robust Z-scores for the significance levels α = 0.01 and 0.05. Panels (a) and (b) show the results obtained for datasets with a fixed bias magnitude (SD = 1.8). Panels (c) and (d) show the results for datasets with a fixed hit percentage of 1%.
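
The robust Z-score step named in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: scores are centered on the median and scaled by 1.4826 × MAD, hits are taken to be low values (as in an inhibition assay), and the cutoff is the lower-tail standard normal quantile at α. The function names and the measurements are hypothetical.

```python
from statistics import NormalDist, median

def robust_z_scores(values):
    """Center on the median and scale by 1.4826 * MAD (assumes MAD > 0)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = 1.4826 * mad   # makes MAD comparable to a standard deviation
    return [(v - med) / scale for v in values]

def select_hits(values, alpha=0.01):
    """Return indices whose robust Z-score falls below the lower-tail cutoff."""
    cutoff = NormalDist().inv_cdf(alpha)   # one-sided cutoff, an assumption
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if z < cutoff]

# Hypothetical measurements from one plate; index 8 is a strong inhibitor.
measurements = [100, 102, 98, 101, 99, 103, 97, 100, 60, 101]
hits = select_hits(measurements, alpha=0.01)
```

Because the median and MAD are barely affected by the single low value, only that well is flagged here; a mean and standard deviation would be pulled toward the outlier.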
Figure 2
Assay-specific bias detected in 12 experimental assays from the ChemBank database. Here, 4 high-throughput screening assays, 4 high-content screening assays and 4 small-molecule microarrays were examined. The ChemBank IDs of these assays are indicated between parentheses: ABeta42 aggregation inhibitors (1103.0016), Bacterial viability profiling (1064.0002), E. coli filamentation (1038.0004), M. tuberculosis sulfur assimilation (130.0018), Autophagy cell count (1050.0009), Autophagy EGFP (1050.0111), Toxoplasma invasion imaging screening (141.0027), C. elegans assay for anti-infective reagents (1109.0003), HPV-E7 SMM (1049.0001), Male germ cell targets SMM (1154.0015), Male germ cell targets SMM (1154.0009) and NeuroSMM screen on torsin A (1069.0001); for more details see Supplementary Table 1.
Figure 3
Plate-specific bias detected across data of 3 screening technologies and 8 screening categories available in ChemBank - per plate representation; 175 assays were analyzed in total (see Supplementary Table 2 for the complete list of the assays considered); all control wells were ignored.
Figure 4
Plate-specific bias detected across data of 3 screening technologies and 8 screening categories available in ChemBank - per assay representation; 175 assays were analyzed in total (see Supplementary Table 2 for the complete list of the assays considered); all control wells were ignored. Assays in which more plates contained additive bias than multiplicative bias are reported in the first column of each screening category. Assays in which more plates contained multiplicative bias than additive bias are reported in the second column of each screening category. Assays in which the two counts were equal, as well as assays without any biased plate, are reported as “Undetermined”. Darker portions of bars show the proportion of assays that have a dominant trend, either additive or multiplicative. Lighter portions of bars show the proportion of assays in which the indicated model of bias was present more frequently, but without a clear-cut dominance.
Figure 5
Hit maps showing the presence of spatial bias in the McMaster Test assay screened during the McMaster Data Mining and Docking Competition: (a) hit distribution surface for raw data, (b) hit distribution surface for data corrected both plate-wise and assay-wise, (c) Plate 428 raw measurements, and (d) Plate 428 corrected measurements. Control columns 1 and 12 are not shown here. Higher hit counts (panels a and b) and intensity levels (panels c and d) are in red; lower hit counts (panels a and b) and intensity levels (panels c and d) are in blue. The hit selection threshold of µ-2σ was used to compute the hit distribution surfaces. The Mann-Whitney U test carried out to detect plate-specific spatial bias suggested that 377 McMaster plates were affected by systematic error and 873 plates were clean. Error detection was done at the significance level α = 0.01. Plate-specific spatial bias was corrected using the additive PMP algorithm, as suggested by our method. The exact values of the raw and corrected measurements of Plate 428 and the raw and the corrected hit distribution surfaces are reported in Supplementary Tables 3–6.
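
A hit distribution surface of the kind described in this caption can be sketched as a per-well count of hits across plates. The sketch assumes that µ and σ are computed separately for each plate and that a hit is a value below µ-2σ; the caption does not state either detail. The function names, the 3 x 4 plate size, and the measurement values are hypothetical.

```python
from statistics import mean, pstdev

def hit_surface(plates, k=2.0):
    """Count, for each well position, how many plates have a hit there."""
    n_rows, n_cols = len(plates[0]), len(plates[0][0])
    surface = [[0] * n_cols for _ in range(n_rows)]
    for plate in plates:
        wells = [v for row in plate for v in row]
        # Per-plate threshold mu - k * sigma (an assumption of this sketch).
        threshold = mean(wells) - k * pstdev(wells)
        for i in range(n_rows):
            for j in range(n_cols):
                if plate[i][j] < threshold:
                    surface[i][j] += 1
    return surface

def make_plate(low_well, n_rows=3, n_cols=4):
    """Build a toy plate with one low (hit-like) value at the given position."""
    plate = [[100.0] * n_cols for _ in range(n_rows)]
    i, j = low_well
    plate[i][j] = 50.0
    return plate

# Two plates share a hit at well (0, 0); a third has its hit at (2, 3).
plates = [make_plate((0, 0)), make_plate((0, 0)), make_plate((2, 3))]
surface = hit_surface(plates)
```

In an unbiased assay the counts should be spread roughly evenly over well positions; counts that concentrate in particular rows or columns point to spatial bias.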
Figure 6
Q-Q plots for McMaster’s Plate 428 from the McMaster Data Mining and Docking Competition HTS Test assay before (a) and after (b) the correction of additive spatial bias. All control wells (columns 1 and 12 of each plate) were excluded from the analysis. Low values (i.e., false positives) appearing in row H prevent a clear-cut identification of the hit located in well (E,3) (panel a of the figure; see also Supplementary Table 5 for the exact values of the raw HTS measurements). After the plate-specific correction by additive PMP, the hit appearing in well (E,3) becomes much better separated from the rest of the measurements of Plate 428 (panel b of the figure; see also Supplementary Table 6 for the exact values of the corrected HTS measurements). The same trend is maintained for the outlier (i.e., a high measurement value in the McMaster inhibition assay) located in well (E,11).
