Viruses. 2020 Nov 27;12(12):1360.
doi: 10.3390/v12121360.

Application of Next Generation Sequencing (NGS) in Phage Displayed Peptide Selection to Support the Identification of Arsenic-Binding Motifs

Robert Braun et al. Viruses.

Abstract

Next generation sequencing (NGS) in combination with phage surface display (PSD) is a powerful tool in the newly equipped molecular biology toolbox for identifying biomolecules that bind specific targets. Application of PSD has led to the discovery of manifold ligands in clinical and materials research. However, limitations of traditional phage display hinder the identification process: growth-based library biases and target-unrelated peptides often result in the dominance of parasitic sequences and the collapse of library diversity. As a proof of concept, this study describes the effective enrichment of specific peptide motifs potentially binding to arsenic using the combination of PSD and NGS. Arsenic is an environmental toxin that is used in various semiconductors as gallium arsenide, and selective recovery of this element is crucial for recycling and remediation. The development of biomolecules as specific arsenic-binding sorbents is a new approach for its recovery. Using NGS for all biopanning fractions allowed evaluation of motif enrichment, in-depth insight into the selection process, and discrimination of biopanning artefacts, e.g., the amplification-induced library-wide reduction in the proportion of hydrophobic amino acids. Application of bioinformatics tools led to the identification of an SxHS motif and a carboxy-terminal QxQ motif, which are potentially involved in the binding of arsenic. To the best of our knowledge, this is the first report of PSD combined with NGS of all relevant biopanning fractions.

Keywords: Illumina; NGS; arsenic; biopanning; interaction; motif; oxyanion; peptide; phage display; target-unrelated peptide.
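
The abstract reports motif occurrence both in reads and in unique sequences. The following is a minimal sketch of how such a count could be made for the SxHS motif and the carboxy-terminal QxQ motif; it is not the authors' pipeline. The toy read counts, the function name `motif_occurrence`, and the residues filled in for x are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Hypothetical toy fraction: 12-mer peptide -> read count (not data from the study).
fraction = Counter({
    "FHMPLTDPGQVQ": 6,   # consensus sequence named in Figure 2; ends in QxQ
    "SIHSATKGAYPV": 3,   # illustrative instance of SIHSxTKGxYPV (x filled arbitrarily)
    "AAAAAAAAAAAA": 1,   # filler sequence carrying neither motif
})

SXHS = re.compile(r"S.HS")        # SxHS at any position of the 12-mer
QXQ_CTERM = re.compile(r"Q.Q$")   # QxQ fixed at positions 10 and 12

def motif_occurrence(counts, pattern):
    """Return (share of reads, share of unique sequences) matching the pattern."""
    total_reads = sum(counts.values())
    hits = [seq for seq in counts if pattern.search(seq)]
    read_share = sum(counts[seq] for seq in hits) / total_reads
    sequence_share = len(hits) / len(counts)
    return read_share, sequence_share

for name, pattern in (("SxHS", SXHS), ("C-terminal QxQ", QXQ_CTERM)):
    reads, seqs = motif_occurrence(fraction, pattern)
    print(f"{name}: {reads:.1%} of reads, {seqs:.1%} of unique sequences")
```

Reporting both shares separates a motif carried by a few highly amplified clones (high read share, low sequence share) from one spread across many distinct peptides.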

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure A1
Read and sequence distribution of the fractions of three rounds of biopanning against on-column immobilized arsenic with Illumina sequencing (B). Top right (A), a figurative explanation is shown. The horizontal black stacked bar shows the distribution of the one hundred most occurring sequences relative to each other. Below in the white subdivided bar the read distribution is shown. Shown are 1%, 5%, 10%, 20%, …, 90%, 100% of the reads. The shaded area shows the read number occupied by the one hundred most occurring sequences. Read (r) and unique sequences (s) are given in numbers.
Figure A2
Amino acid composition of the fractions of three rounds of biopanning against on-column immobilized arsenic (A). Shown is the relative occurrence of each amino acid at each position of the randomized 12-mer sequence displayed on the outmost part of M13KE phage in the Ph.D.™-12 phage library (New England Biolabs, Ipswich, MA, USA) relative to the percentage of occurrence of the amino acids in the naïve library. (B) shows an enlarged view of the stripping fraction of biopanning round 3.
Figure A3
Logos of the fractions of three rounds of biopanning against on-column immobilized arsenic. Shown are logos, calculated using pLogo [18] based on the significance of the individual residues in context to the naïve phage library Ph.D.™-12 as background frequency.
Figure A4
Occurrence of sequences carrying the motif SxHS in the randomized 12-mer displayed on the Ph.D.™-12 phage library. The occurrence in reads (blue) and sequences (orange) of the respective fraction of three rounds of biopanning against on-column immobilized arsenic (A) and of the calculated core fractions (B) is shown.
Figure A5
Proportion of reads (blue) and sequences (orange) carrying the motif SxHSxxxxxxxx relative to all reads and sequences carrying SxHS on random positions for three rounds of biopanning against on-column immobilized arsenic and of the calculated core fractions.
Figure A6
Comparison of motif occurrence in two different lots of the naïve Ph.D.™-12 phage library (which were not used in this work) on 48hd.cloud [23]. Motifs QxQ and SxHS (green) are shown in comparison to the respective most abundant motifs (red).
Figure 1
Comparison of the amino acid composition of selected fractions of three rounds of biopanning against on-column immobilized arsenic. Shown in heatmaps is the relative occurrence of each amino acid on each position of the randomized 12-mer peptide sequence, displayed on M13KE phage of the combinatorial Ph.D.™-12 phage library (New England Biolabs, Ipswich, MA, USA) relative to the percentage of occurrence of the amino acids in the naïve library. The original amino acid percentage on each position of the naïve library is shown in (A). In (B) the relative occurrences of the following fractions are shown: amplification of the naïve library, input biopanning round 1 (BP1), elution and stripping biopanning round 3 (BP3). Figure A2, which shows heatmaps of all fractions, can be found in Appendix A.
Figure 2
Sequence logos of selected fractions of three rounds of biopanning against on-column immobilized arsenic. Shown are logos, calculated using pLogo [18] based on the significance of the individual residues in context to the naïve phage library Ph.D.™-12 as background frequency. (A) Amplification of the naïve phage library. (B) Elution and stripping fractions of three rounds of biopanning showing the enrichment of the consensus sequence FHMPLTDPGQVQ.
Figure 3
Visualization of the relative frequency of the unique sequences in the core fraction ES–I–W\naï.lib.TOP25% compared to the frequency of the respective sequences in the previously calculated core fractions. The horizontal stacked bars represent the total read number of each fraction; individual sequences are colored black/white and sorted from left to right proportional to their abundance. The size of the marked area is proportional to the frequency of the individual sequences. The area of specific sequences is colored. In total, 9/13 sequences of the core fraction ES–I–W\naï.lib.TOP25% carry the motif xxMPxTxxGQVQ (with x being any amino acid), 3/13 carry the motif SxHS either amino- or carboxy-terminal, 2/13 carry the motif SIHSxTKGxYPV, and the remaining sequence does not show similarity to the other identified sequences and is rich in threonine, histidine and leucine. The enrichment process of the sequences shows that they are of low abundance and become visible only after subtraction of sequences with higher abundance.
Figure 4
Sequence motif occurrence in reads (green), fractions of the three biopanning rounds (red, middle) and in the calculated core fractions (red, right). Motifs were calculated using MEME [19]. Shown are: the naïve Ph.D.™-12 library, the amplification of the naïve library (naï. lib. amp.) and the pass of the preceding pre-panning (negative), which was used after amplification as input for the three rounds of biopanning against on-column immobilized arsenic. For the three biopanning rounds, the respective input, wash, elution and stripping fraction are shown as well as the core fractions calculated in Section 3.6.
Figure 5
Visualization of the percentage of motif-bearing sequences in which a second motif can be found. The population of sequences to be compared is defined on the X-axis; the motifs compared for their appearance in the respective population are given on the Y-axis. Red bars give the number of sequences carrying the motif, and green bars give the number of reads of the sequences carrying the motif. Motif comparisons colored dark red show that these occur multiple times in the sequences, leading to percentages of >100%. Calculations were performed with the sequence set of the naïve Ph.D.™-12 library.
Figure 6
Occurrence of sequences carrying the motif xxxxxxxxxQxQ with two carboxy-terminal glutamines on positions 10 and 12 of the randomized 12-mer displayed on the Ph.D.™-12 phage library. The occurrence in reads (green) and sequences (red) of the respective fraction of three rounds of biopanning against on-column immobilized arsenic (A) and of the calculated core fractions (B) is shown.
Figure 7
Proportion of reads (green) and sequences (red) carrying the motif xxxxxxxxxQxQ with two fixed carboxy-terminal glutamines relative to all reads and sequences carrying QxQ on random positions for three rounds of biopanning against on-column immobilized arsenic and of the calculated core fractions.
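
The heatmap captions (Figures 1 and A2) describe the occurrence of each amino acid at each position relative to its occurrence in the naïve library. A minimal sketch of one way to compute such a table follows. The captions do not give the exact formula, so a plain frequency ratio over unique sequences is assumed here; the function names and the toy 3-mer sequences are hypothetical and stand in for the 12-mer data.

```python
from collections import Counter

def positional_frequencies(seqs, length):
    """Per-position amino acid frequencies over a list of equal-length peptides."""
    freqs = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in seqs)
        total = sum(counts.values())
        freqs.append({aa: n / total for aa, n in counts.items()})
    return freqs

def relative_occurrence(selected, naive, length):
    """Ratio of selected-fraction frequency to naive-library frequency.

    Assumption: a simple ratio; residues absent from the naive library at a
    position are skipped because their background frequency is zero.
    """
    sel = positional_frequencies(selected, length)
    bg = positional_frequencies(naive, length)
    return [
        {aa: sel[pos].get(aa, 0.0) / bg[pos][aa] for aa in bg[pos]}
        for pos in range(length)
    ]

# Hypothetical toy data (3-mers instead of 12-mers, for readability).
naive = ["ASQ", "SSQ", "AHQ", "SHA"]
selected = ["SHQ", "SHQ"]

for pos, row in enumerate(relative_occurrence(selected, naive, length=3), start=1):
    print(pos, {aa: round(ratio, 2) for aa, ratio in sorted(row.items())})
```

Values above 1 mark residues enriched at a position relative to the naïve library, and values below 1 mark depleted residues. A log ratio would be an equally plausible choice for plotting.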

References

    1. Cullen W.R. Is Arsenic An Aphrodisiac? Royal Society of Chemistry; Cambridge, UK: 2008.
    2. Ahuja S., editor. Arsenic Contamination of Groundwater. John Wiley & Sons, Inc.; Hoboken, NJ, USA: 2008.
    3. Shen S., Li X.F., Cullen W.R., Weinfeld M., Le X.C. Arsenic binding to proteins. Chem. Rev. 2013;113:7769–7792. doi: 10.1021/cr300015c.
    4. States J.C., editor. Arsenic: Exposure Sources, Health Risks and Mechanisms of Toxicity. John Wiley & Sons, Inc.; Hoboken, NJ, USA: 2015.
    5. Yamauchi H., Takata A., Cao Y., Nakamura K. The Development and Purposes of Arsenic Detoxification Technology. Springer; Singapore: 2019. pp. 199–211.
