Review
Chem Rev. 2023 Mar 8;123(5):2049-2111.
doi: 10.1021/acs.chemrev.2c00621. Epub 2023 Jan 24.

Protein-Based Biological Materials: Molecular Design and Artificial Production

Ali Miserez et al. Chem Rev.

Abstract

Polymeric materials produced from fossil fuels have been intimately linked to the development of industrial activities in the 20th century and, consequently, to the transformation of our way of living. While this has brought many benefits, the fabrication and disposal of these materials pose enormous sustainability challenges. Thus, materials that are produced in a more sustainable fashion and whose degradation products are harmless to the environment are urgently needed. Natural biopolymers, which can compete with and sometimes surpass the performance of synthetic polymers, provide a great source of inspiration. They are made of natural chemicals, under benign environmental conditions, and their degradation products are harmless. Before these materials can be synthetically replicated, it is essential to elucidate their chemical design and biofabrication. For protein-based materials, this means obtaining the complete sequences of the proteinaceous building blocks, a task that historically took decades of research. Thus, we start this review with a historical perspective on early efforts to obtain the primary sequences of load-bearing proteins, followed by the latest developments in sequencing and proteomic technologies that have greatly accelerated the sequencing of extracellular proteins. Next, four main classes of protein materials are presented, namely fibrous materials, bioelastomers exhibiting high reversible deformability, hard bulk materials, and biological adhesives. In each class, we focus on the design at the primary and secondary structure levels and discuss its interplay with the mechanical response. Finally, we discuss earlier and recent research to artificially produce protein-based materials using biotechnology and synthetic biology, including current developments by start-up companies to scale up the production of proteinaceous materials in an economically viable manner.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Timeline of discoveries of several load-bearing proteins that have gathered broad research interest as biological model systems in biomimetic materials engineering.
Figure 2
Overall approach to obtain complete sequences of load-bearing structural proteins. (A) Extraction of mRNAs from the cells and glands where proteins are stockpiled prior to secretion in the extracellular milieu. (B) Construction of a cDNA library from mRNAs extracted from the secretory cells and glands. (C) Transcriptome assembly. (D) Isolation of proteins or peptide fragments from the biological material of interest (i) and gel electrophoresis of isolated compounds (ii). In-gel digestion of isolated proteins or protein fragments is carried out for subsequent proteomic studies. (E) High-throughput proteomics of digested peptides from (D). In the past decade, tandem mass spectrometry (MS/MS) has been the most-used tool, but more traditional tools such as N-terminal sequencing can still provide highly valuable information. (F) Peptide fragments sequenced by proteomic methods are probed against the translated transcriptome using bioinformatic tools. (G) The full sequences of the candidate genes of interest coding for the proteins identified in (F) are verified by RACE PCR. (H) The newly identified proteins can be cloned in suitable vectors and recombinantly expressed, for example in E. coli (shown), yeast, or other types of host.
Figure 3
Fibrous protein materials. (A) Orb-weaving spiders can produce up to seven different silk types, namely (i) pyriform attachment silk, (ii) cylindrical egg wrapping silk, (iii) minor ampullate silk, (iv) major ampullate silk, (v) flagelliform silk, (vi) aggregate silk glue, and (vii) aciniform prey wrapping silk. Schematic representation of the 3-block spidroin architecture consisting of the non-repetitive N- and C-terminal domains and the long central repetitive block, which is shared among all silk types. (viii) Small portions of the amino acid sequences for the repeating units of four major silks described in the text are presented: Araneus ventricosus Dragline spidroin-1 (UniProtKB - A0A090BQB1); Latrodectus hesperus Pyriform spidroin-1 (UniProtKB - C7T5D2); Nephila clavipes Flagelliform spidroin-1 (UniProtKB - Q9NHW4); Latrodectus hesperus Aggregate glue spidroin-1 (UniProtKB - A0A140DL44). These sequences are from refs (−160), respectively. (B) Hagfish (here the Pacific hagfish, Eptatretus stoutii, is represented) secrete a two-component viscous slime when threatened by predators. (i) The tough slime is made of mucins and fibrous proteins. (ii) The proteins are heterodimeric coiled-coil intermediate filaments. (iii) Primary architecture of the two coiled-coil proteins EsTKα and EsTKβ. The solid rectangles represent the coiled-coil regions. Adapted with permission from ref (161). Copyright 2017 Royal Society of Chemistry. (C) Velvet worms (Onychophora) swiftly eject (i) proteinaceous fibers to capture their prey. Adapted with permission from ref (162). Copyright 2018 American Chemical Society. (ii) The fibers consist of a multi-protein complex, and the dominant proteins in the complex are large MW proteins (180–250 kDa) that are mostly disordered in solution, with a few short β-sheet domains predicted in silico. (iii) The primary structure of the main slime proteins can be divided into three main domains, i.e., long disordered domains, low sequence complexity domains enriched in Gly and Ser at the N-terminus, and interspersed repeat domains 20–30 amino acids long. Adapted from ref (163). Creative Commons CC BY.
Figure 4
Bioelastomeric protein materials. (A) Human elastin provides elasticity to vascular organs, lungs, and skin (i). (ii) Despite its high molecular mobility, the tropoelastin molecule (the precursor of elastin) has a defined tertiary structure consisting of an elongated asymmetric coil connected to a foot-like protrusion by a hinge domain. Adapted from ref (47). Copyright 2011 National Academy of Sciences. (iii) The primary structure of tropoelastin is made of alternating blocks of hydrophobic and cross-linking (KA and KP) domains. Representative peptide motifs for each domain are shown. (B) In flying insects, such as the fruit fly and the dragonfly, the wing hinge is made of the resilin protein (i). (ii) The secondary structure of resilin is largely disordered. (iii) The primary structure of resilin has three main regions. The N- and C-terminal regions, which provide elasticity, are highly repetitive (consensus motifs shown), while the central region has a Rebers & Riddiford (R&R) chitin-binding domain. Adapted with permission from ref (218). Copyright 2012 Springer-Nature. (C) Mussel byssal threads are (i) mechanically graded fibers that transition from a stiff proximal region to an elastomeric distal region. (ii) The core of a thread is made of three main proteins (PreCol-P, PreCol-NG, and PreCol-D) with different spatial distributions along the thread as schematically illustrated. Adapted with permission from ref (89). Copyright 2004 American Chemical Society. (iii) PreCols consist of a central collagenous core flanked by either stiff, silk-like domains (PreCol-D) or flexible, elastin-like domains (PreCol-P) that are packed into regular lattices. The termini of the PreCols contain His- and Dopa-rich domains that enable cross-linking via either metal ion coordination bonds or Di-Dopa covalent bonds (see Figure 5 for details on the cross-linking chemistry). Adapted with permission from ref (219). Copyright 2021 American Chemical Society. (D) Marine snails (here a whelk) lay their embryos within (i) elastomeric egg capsules. During deposition, a female will deposit a single string of egg capsules attached to a central filament (mermaid necklace). (ii) Egg capsules are made of coiled-coil intermediate filaments bundling together to form the macroscopic capsules. (iii) Coiled-coil domains are made of heptad repeats [abcdefg] where a and d are hydrophobic residues. (E) In scallops and other bivalves, the adductor muscle is made of the abductin protein (i), which is the only documented bioelastomer to work under compression loading. (ii) Abductin has been found to have a random coil structure. (iii) The primary structure consists of multiple repeats of a Gly-rich motif that has a central Met residue.
Figure 5
Cross-linking chemistry in various protein-based biological materials. (A) In elastin, cross-linking is initiated on Lys residues that are converted into allysine (ALys) by the lysyl oxidase enzyme (top panel). Spontaneous condensation between Lys and ALys leads to the intermediate cross-links lysinonorleucine, ALys aldol, and merodesmosine (middle panel). In the final step, cyclization condensation results in the cyclic cross-links desmosine and isodesmosine (bottom panel). Adapted with permission from ref (221). Copyright 1998 John Wiley & Sons. (B) Cross-links in resilin are simpler and consist of dityrosine, triggered by enzymatic oxidation of Tyr residues with peroxidase that leads to oxidative phenolic coupling. (C) Cross-links in the PreCol proteins of mussel threads are Di-Dopa and His-Metal coordination bonds. The Dopa-Fe3+ coordination bonds are particularly prominent in the protective coating of the byssus made of MFP-1. (D) In insect cuticles, cross-linking starts by oxidation of Tyr amino acids into Dopa, which is subsequently decarboxylated to dopamine and then acylated to either N-acetyldopamine (NADA) (i) or N-β-alanyldopamine (NBAD) (ii), resulting in acyldopamine derivatives. These compounds are enzymatically oxidized into o-quinones, which are then isomerized to quinone methides that can subsequently react with nucleophiles (X) on the aromatic ring (iii) or with the side-chain (iv) to produce catecholic-based cross-links. Alternatively, cross-links can be formed by reaction of the quinone methide intermediates with another available NADA (v) or other nucleophilic reactants (Y and X, vi). (i-vi) Adapted with permission from ref (222). Copyright 2010 Elsevier. Catecholamine-His adducts with covalent attachment between the imidazole and aromatic rings have also been identified as cross-links in insect cuticles, including His-dopamine (His-DA, vii), His-dihydroxyphenylethanol (His-DOPET, viii), and N-acetyl-His-NADA (NAc-His-NADA, ix). (vii-ix) Adapted with permission from ref (223). Copyright 1999 Elsevier. (E) Identified cross-links in squid beak proteins include covalent adducts between His side-chains with either the low MW compound 4-methyl catechol (4MC) or Dopa side-chains. Adapted with permission from ref (138). Copyright 2010 American Society for Biochemistry and Molecular Biology.
Figure 6
Uniaxial tensile response of coiled-coil elastomeric proteins. (A) The whelk egg capsules (shown in Figure 4D) can display large reversible deformation, but unlike most elastomers, the elasticity is not entropically dominated. (B) During straining, coiled-coils gradually unravel and eventually transition into β-sheets. Upon unloading the process is reversible and coiled-coils reform. The whole process dissipates significant elastic energy as highlighted by the large hysteresis of the stress–strain curve in (A). Adapted with permission from ref (252). Copyright 2009 Nature Publishing Group.
Figure 7
Bioadhesive proteins. (A) At the distal end of mussel threads, the adhesive plaque is constructed from different plaque proteins. (i) In the Mytilus genus, schematically represented here, MFP-3 and MFP-5 are the adhesive primers that are initially secreted and are in direct contact with the substrate. MFP-6 is a Cys-rich protein located in the adhesive layer. Adapted with permission from ref (91). Copyright 2017 The Company of Biologists Ltd. The plaque is linked to the core of the thread (made of PreCols, see Figure 4C) through the linker proteins MFP-2 and MFP-4. (ii) The byssus of mussels is assembled in a process similar to polymer injection molding. MFPs are stored in different secretory glands and secreted into the foot groove in a spatiotemporally controlled manner. Adapted from ref (283). Creative Commons CC BY. (iii) MFP-3 and -5 are rich in Dopa and contain multiple copies of the dipeptide Dopa-Lys (Y*K). Cys residues of MFP-6 are oxidized to disulfide bonds, ensuring that Dopa residues from MFP-5 remain reduced and are not oxidized into Dopa-quinone. (B) The barnacle cement (i) is a permanent adhesive made of cement proteins (CPs) that ensures robust bonding of the barnacle shell to solid substrates. (ii) The cement is made of amyloid-like cross-β nanofibrils. (iii) In acorn barnacles (M. rosa), at least five CPs have been identified. A spatial distribution of CPs across the cement has been proposed, which has however been questioned in recent years. Adapted with permission from ref (292). Copyright 2013 Taylor & Francis. (C) The sandcastle worms construct a tubular shell (i) with sand grains, comminuted shells, and a secreted polyelectrolytic bioadhesive. (ii) Oppositely charged proteins are stored in homogeneous and heterogeneous granules. Co-secretion of the preassembled adhesive packets leads to the homogeneous curing of the cement. Adapted with permission from ref (293). Copyright 2013 American Chemical Society. (iii) Polycationic protein Pc-1 and polyanionic protein Pc-2 are rich in Dopa. Pc-3 contains an extremely high (∼70%) phosphoserine (pSer) content. (D) Biofilm formation and interfacial adhesion of enterobacteria is mediated through (i) curli fibers on the cell surface. (ii) The self-assembly of curli is carried out by a complex outer membrane machinery. (iii) It is composed of multiple Csg subunits, each with very distinct functionality crucially important for the controlled growth of the curli fibers and for host cell adhesion, invasion, and colonization. Adapted with permission from ref (294). Copyright 2018 Elsevier. (E) Many fungi such as (i) Trichoderma reesei secrete a group of surface-active amphiphilic structural proteins known as (ii) hydrophobins at the air–water interface. Their primary function is to act as water-repellent coatings, but they also mediate communication of the hyphal network with its surrounding environment during growth and development. Their interfacial assemblies and their ability to adhere strongly to various surfaces are also linked to infections related to pathogenic fungi. (iii) There are two major classes of hydrophobins, Class I (UniProtKB - P52754)(295) and Class II (UniProtKB - A0A023WG46), each with very distinct sequence, molecular structure, hydropathy plots, and solubility, resulting in different biophysical characteristics. (F) Caddisfly larvae construct a protective casing tube (i) made of inorganic sediments. (ii) The inorganic particles are glued together with silk-like fibroins that form antiparallel β-hairpins stabilized by Ca2+ ions. (iii) Silk fibroins are very large MW proteins comprised of large modular blocks (D, E, and F), which are themselves made of smaller positively and negatively charged submotifs as well as hydrophobic motifs. In the negatively charged motif, Ser residues are often phosphorylated (pSer). (ii) and (iii) Adapted with permission from ref (297). Copyright 2013 American Chemical Society.
Figure 8
Side-chain molecular interactions of MFPs with solid substrates during adhesive plaque deposition under the mussel foot. Dopa can mediate H-bonds and coordination bonds with oxide surfaces (although the latter usually do not occur at the low pH under the mussel foot) as well as hydrophobic interactions with hydrophobic surfaces. Additionally, Lys and pSer can electrostatically interact with negatively charged and positively charged surfaces, respectively. Adapted with permission from ref (91). Copyright 2017 The Company of Biologists Ltd.
Figure 9
Structural studies of the barnacle cement protein MrCP20. (A) Primary structure with the secondary structure regions identified by solution NMR. (B) Tertiary structure obtained by solution NMR (i) and calculated electrostatic surface potential (ii). (i) reproduced with permission from ref (351). Copyright 2019 Royal Society. (ii) reproduced with permission from ref (340). Copyright 2020 American Chemical Society. (C) Molecular dynamics (MD) simulations of MrCP20 in the presence of CO32– and Ca2+ ions, predicting the formation of ion clusters around the surface of the protein. (D) MD simulations of MrCP20 on calcite surfaces. (C) and (D) reproduced with permission from ref (340). Copyright 2020 American Chemical Society. (E) Calcium carbonate (CaCO3) crystallization in the absence (i) or presence (ii) of MrCP20. In the presence of MrCP20, CaCO3 crystallizes as the metastable vaterite polymorph. (F) In the presence of CaCO3, MrCP20 self-assembles into nanofibrils as seen by AFM (i) and TEM (ii) imaging. (E) and (F) reproduced with permission from ref (339). Copyright 2021 American Chemical Society.
Figure 10
Hard bulk materials predominantly made of proteins. (A) The polychaete bloodworm (Glycera dibranchiata) is equipped with two pairs of hard jaws (i) for grasping prey. (ii) The jaw is a composite material made of at least one multi-task protein (MTP), melanin, Cu2+ ions, as well as the Cu-based atacamite biomineral nanofibers (not shown in this cartoon). The melanin network is interspersed with MTP and Cu2+. (iii) MTP is highly enriched in Gly and His residues arranged in short modular tri- and tetrapeptide repeats. The chemical structure of melanin is also shown. (ii) and (iii) Adapted with permission from ref (431). Copyright 2022 Elsevier. (B) The Nereis genus of polychaetes (i) has one pair of jaws. (ii) The jaws are made of cross-linked proteins and Zn2+ ions. In the mature state the proteins are mostly in the β-sheet conformation and form a cross-linked network with Zn2+ ions as coordination centers. Adapted with permission from ref (432). Copyright 2008 American Chemical Society. (iii) Nereis jaw proteins are highly enriched in Gly and His residues, the latter forming coordination bonds with Zn2+ ions. (C) The exoskeleton (cuticle) of insects, here represented by a beetle, is a multi-layer composite (i) structure. The procuticle is divided into different layers (exo-, meso-, and endocuticle) made of chitin and proteins in different ratios and levels of cross-linking depending on the layer. (ii) Cuticular proteins (CuPs) contain chitin-binding domains (CBDs) that fold (a CBD from D. gigas beak is illustrated, left) in the presence of chitin. A predicted CBD/chitin interface is also shown (right). Reproduced with permission from ref (433). Copyright 2021 Elsevier. (iii) A common primary sequence feature of cuticle proteins is the RR2 consensus. (D) Squids are equipped with two hard biotools for predatory purposes. (i) D. gigas squid beak is a biomolecular composite of proteins and chitin that are distributed in a graded fashion along the beak (ii). Adapted from ref (136). (iii) Two protein families are found in the squid beak, namely chitin-binding proteins (here illustrated with DgCBP-1, with CBDs shown as orange rectangles) and His-rich proteins (here illustrated with DgHBP-1). DgCBPs are most abundant in the proximal, soft region of the beak, while DgHBPs are more concentrated in the hard tip (rostrum) region, where they are heavily cross-linked. The C-termini of DgHBPs are highly repetitive, containing multiple copies of pentapeptide motifs enriched in Gly, His, Tyr, Phe, and Ala. (iv) The second hard tissue in squids is sucker ring teeth (SRT) that are entirely made of a protein family dubbed suckerins. (v) Suckerins self-assemble into hexagonally packed fibrils made of nanoconfined β-sheets. Adapted from ref (434). Creative Commons CC BY. (vi) Suckerins exhibit a block copolymer primary structure with alternating blocks of β-sheet forming domains enriched in Ala, His, and Thr and longer amorphous domains enriched in Gly and Tyr. Adapted with permission from ref (134). Copyright 2014 American Chemical Society.
Figure 11
Sequence homology of CPs from different arthropods containing the RR consensus (green residues). Reproduced with permission from ref (433). Copyright 2021 Elsevier.
Figure 12
Diverse material applications from biotechnologically produced spider silk proteins. Various fabrication methodologies such as aerosol, electrospinning, immersion, microfluidics, lithography, and many more have been employed to fabricate materials with length scales ranging from nanometer- to millimeter-scale. This includes nanospheres, non-woven nanofibrils, composites, films, fibers, adhesives, hydrogels, and aerogels with potential medical and industrial applications.
Figure 13
Applications of recombinant suckerins (rec-suckerins) and suckerin peptides. (A) Sucker ring teeth (SRT) and modular architecture of suckerin-12 and suckerin-19. (B) Rec-suckerin-19 can be processed into Di-Tyr cross-linked hydrogels and stiff materials (i) with a broad range of elastic moduli. Cell culture studies have shown that rec-suckerins are non-cytotoxic, with cells rapidly proliferating on rec-suckerin substrates. Reproduced with permission from ref (484). Copyright 2015 Wiley-VCH. (ii) Rec-suckerin-12 can also form hydrogels with tunable elastic modulus by incubation in various salts of the Hofmeister series. Reproduced with permission from ref (590). Copyright 2019 Wiley-VCH. (C) Rec-suckerin-19 self-assembled into drug-loaded nanoparticles (NPs) for nanomedicine applications (i). Reproduced with permission from ref (591). Copyright 2017 American Chemical Society. (ii) Rec-suckerin-12 can also be self-assembled into ca. 100 nm NPs with quasi-monodisperse size distribution when incubated in weakly kosmotropic salts. Reproduced with permission from ref (592). Copyright 2020 American Chemical Society. (D) In a reductionist approach, short peptides derived from suckerins can also be used as building blocks to construct nano- and biomaterials. (i) The short peptide A1H1 can be self-assembled into stiff mesoscale fibers (left, reproduced with permission from ref (477). Copyright 2017 American Chemical Society) made of amyloid-like cross-β nanofibrils as predicted by MD simulations (right, reproduced from ref (479). Creative Commons CC BY-NC 4.0). (ii) The short peptide GV8 can form stiff hydrogels in water (left, reproduced from ref (485). Creative Commons CC BY). Large macromolecular therapeutics, such as proteins, can be encapsulated in GV8 hydrogels and released in a controlled fashion, for example in wounds (right, reproduced with permission from ref (593). Copyright 2021 Elsevier).
Figure 14
Schematic overview of the steps involved in the recombinant production of spider silk proteins in various expression hosts. The central region of the figure illustrates the key steps involved in the recombinant production of any structural protein. For example, the gene encoding the desired spidroin from a spider is inserted into a promoter-based expression vector, which is subsequently transformed into an expression host such as E. coli. The process involves the selection of positively transformed cells, followed by small- and large-scale cultivations. In the next step, the target proteins are purified, mainly using fast protein liquid chromatography (FPLC) techniques. The structure and function of the purified proteins are then validated using techniques such as gel electrophoresis, MS/MS, solid or solution NMR, CD, FTIR, DLS, WAXS-SAXS, and many other activity assays. The outer region of the figure shows the expression hosts used for the recombinant production of spider silk proteins, which are categorized into three major groups, namely microbial protein expression systems, expression in eukaryotic cell systems, and transgenic plants and animals.
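
Step (F) of Figure 2, in which peptide fragments are probed against the translated transcriptome, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy transcript, and the use of exact substring matching are assumptions made for illustration only. Practical workflows rely on dedicated search engines and must also handle, for example, the Leu/Ile ambiguity of MS/MS data.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate a DNA sequence codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq: str) -> dict[str, str]:
    """Translate a transcript in all six reading frames (+1..+3, -1..-3)."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


def match_peptides(peptides, transcripts):
    """Return (peptide, transcript name, frame) for every exact match."""
    hits = []
    for name, seq in transcripts.items():
        for frame, protein in six_frame(seq).items():
            for pep in peptides:
                if pep in protein:
                    hits.append((pep, name, frame))
    return hits


# Hypothetical toy data: one short transcript and two candidate peptides.
transcripts = {"t1": "ATGGGTGCTGGTCAATAA"}
print(match_peptides(["GAGQ", "WWWW"], transcripts))
```

Transcripts that collect several independent peptide hits in the same reading frame would be the natural candidates to carry forward to the RACE PCR verification described in step (G).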

