Comput Struct Biotechnol J. 2021 Apr 24;19:2518-2525.
doi: 10.1016/j.csbj.2021.04.044. eCollection 2021.

protGear: A protein microarray data pre-processing suite

Kennedy Mwai et al. Comput Struct Biotechnol J.

Abstract

Protein microarrays are versatile tools for high throughput study of the human proteome, but systematic and non-systematic sources of bias constrain optimal interpretation and the ultimate utility of the data. Published guidelines to limit technical variability whilst maintaining important biological variation favour DNA-based microarrays that often differ fundamentally in their experimental design. Rigorous tools to guide background correction, the quantification of within-sample variation, normalisation, and batch correction specifically for protein microarrays are limited, require extensive investigation and are not centrally accessible. Here, we develop a generic one-stop-shop pre-processing suite for protein microarrays that is compatible with data from the major protein microarray scanners. Our graphical and tabular interfaces facilitate a detailed inspection of data and are coupled with supporting guidelines that enable users to select the most appropriate algorithms to systematically address bias arising in customized experiments. The localization and distribution of background signal intensities determine the optimal correction strategy. A novel function overcomes the limitations in the interpretation of the coefficient of variation when signal intensities are at the lower end of the detection threshold. We demonstrate essential considerations in the experimental design and their impact on a range of algorithms for normalization and minimization of batch effects. Our user-friendly interactive web-based platform eliminates the need for prowess in programming. The open-source R interface includes illustrative examples, generates an auditable record, enables reproducibility, and can incorporate additional custom scripts through its online repository. This versatility will enhance its broad uptake in the infectious disease and vaccine development community.

Keywords: Background correction; Batch correction; Normalisation; Protein microarray; Reproducibility.
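
The abstract notes that the coefficient of variation (CV) is hard to interpret when signal intensities sit near the lower end of the detection threshold. The sketch below is not protGear's function; it only illustrates the underlying problem with the standard definition CV = SD / mean × 100. The replicate values and the helper name `cv_percent` are made up for illustration, and the 20% cut-off is the one mentioned in the Fig. 4 caption.

```python
import statistics


def cv_percent(replicates):
    """Coefficient of variation in percent: sample SD / mean * 100."""
    mean = statistics.mean(replicates)
    if mean == 0:
        return float("nan")  # CV is undefined when the mean is zero
    return statistics.stdev(replicates) / mean * 100


# Hypothetical technical replicates with the same absolute spread
# (+/- 50 intensity units) at two very different signal levels.
high_signal = [10000, 10050, 9950]
low_signal = [100, 150, 50]

CV_CUTOFF = 20  # percent, as in the Fig. 4 caption

for name, reps in [("high", high_signal), ("low", low_signal)]:
    cv = cv_percent(reps)
    status = "pass" if cv <= CV_CUTOFF else "flag"
    print(name, round(cv, 1), status)
# prints:
# high 0.5 pass
# low 50.0 flag
```

The same absolute noise gives a CV of 0.5% at high intensity and 50% at low intensity, so a fixed CV cut-off alone would flag low-signal spots even when the replicates agree as well as the high-signal ones.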

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Fig. 1
protGear data processing scheme. Dotted lines indicate optional steps. Tag subtraction is applied for antigens containing purification tags. Batch correction is relevant when multiple samples from the same sample set are processed in more than one experiment.
Fig. 2
Background correction: artefacts add noise to the signal intensity. A) A microarray slide with 21 mini arrays and a barcode. Each mini array has a specific number of features, each represented by a spot. B) Artefacts (spots surrounded by yellow boxes), from Kamuyu 2018. C) The total foreground intensity associated with a feature spot typically includes the local background. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 3
Example of background diagnostic plots produced by protGear. (A) Background MFI vs foreground MFI, a plot that helps in selecting the appropriate background correction method. (B) Boxplot of the blocks/mini arrays, grouped by technical repeat, used to check whether there is a block artefact in the background MFI values.
Fig. 4
Visualization of the CV. A) Correlation of the technical replicates. B) Proportion of CVs by CV cut-off. C) Proportion of CVs after “cv_based_filtering”. D) A static image of an interactive table for inspecting the CV cut-off values. The table shows the specific slide ID (.id), the serum sample identifier (sampleID), the count of CVs ≤ 20% (CV<=20), the % of CVs ≤ 20%, the count of CVs > 20% (CV > 20), the % of CVs > 20%, and the out-of-range CVs in the 1st to 8th columns, respectively.
Fig. 5
Standard deviation vs mean plots (meanSdPlot) of A) non-normalised data, B) log2 normalisation, C) cyclic loess normalisation, D) robust linear model normalisation and E) VSN normalisation.
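
The captions for Fig. 2C and Fig. 5B refer to two simple steps: the foreground intensity of a spot includes its local background, and intensities can be log2 transformed before further normalisation. A minimal sketch of those two steps follows, assuming plain local background subtraction. It does not reproduce protGear's implementation; the function names, the spot values and the `floor` clamp (used here only so that log2 stays defined for spots where background exceeds foreground) are illustrative assumptions.

```python
import math


def subtract_local_background(foreground, background, floor=1.0):
    """Subtract the local background from the foreground intensity.

    Values at or below zero are clamped to `floor` (an assumption made
    for this sketch) so that a later log2 transform is defined.
    """
    return max(foreground - background, floor)


def log2_transform(values):
    """Return the log2 of each corrected intensity."""
    return [math.log2(v) for v in values]


# Hypothetical (foreground, local background) pairs for three spots.
spots = [(1200.0, 176.0), (300.0, 44.0), (90.0, 120.0)]

corrected = [subtract_local_background(fg, bg) for fg, bg in spots]
logged = log2_transform(corrected)

print(corrected)  # [1024.0, 256.0, 1.0]
print(logged)     # [10.0, 8.0, 0.0]
```

The third spot shows why the choice of correction strategy matters: naive subtraction would give a negative intensity, so some rule for such spots has to be chosen before any log-based normalisation.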

References

    1. Sundaresh S., Doolan D.L., Hirst S., Mu Y., Unal B., Davies D.H. Identification of humoral immune responses in protein microarrays using DNA microarray data analysis techniques. Bioinformatics. 2006;22(14):1760–1766.
    2. Doolan D.L., Mu Y., Unal B., Sundaresh S., Hirst S., Valdez C. Profiling humoral immune responses to P. falciparum infection with protein microarrays. Proteomics. 2008;8(22):4680–4694.
    3. Kamuyu G., Tuju J., Kimathi R., Mwai K., Mburu J., Kibinge N. KILchip v1.0: a novel Plasmodium falciparum merozoite protein microarray to facilitate malaria vaccine candidate prioritization. Front Immunol. 2018;9:2866.
    4. De Assis R.R., Jain A., Nakajima R., Jasinskas A., Felgner J., Obiero J.M. Analysis of SARS-CoV-2 antibodies in COVID-19 convalescent blood using a coronavirus antigen microarray. Nat Commun. 2021;12(1):1–9.
    5. Duarte J.G., Blackburn J.M. Advances in the development of human protein microarrays. Expert Rev Proteomics. 2017;14(7):627–641.