2022 Jul 6;12(7):jkac078.
doi: 10.1093/g3journal/jkac078.

Breedbase: a digital ecosystem for modern plant breeding

Nicolas Morales et al. G3 (Bethesda).

Abstract

Modern breeding methods integrate next-generation sequencing and phenomics to identify plants with the best characteristics and greatest genetic merit for use as parents in subsequent breeding cycles to ultimately create improved cultivars able to sustain high adoption rates by farmers. This data-driven approach hinges on strong foundations in data management, quality control, and analytics. Of crucial importance is a central database able to (1) track breeding materials, (2) store experimental evaluations, (3) record phenotypic measurements using consistent ontologies, (4) store genotypic information, and (5) implement algorithms for analysis, prediction, and selection decisions. Because of the complexity of the breeding process, breeding databases also tend to be complex, difficult, and expensive to implement and maintain. Here, we present a breeding database system, Breedbase (https://breedbase.org/, last accessed 4/18/2022). Originally initiated as Cassavabase (https://cassavabase.org/, last accessed 4/18/2022) with the NextGen Cassava project (https://www.nextgencassava.org/, last accessed 4/18/2022), and later developed into a crop-agnostic system, it is presently used for dozens of different crops and projects. The system is web-based and available as open source software. The source code is on GitHub (https://github.com/solgenomics/, last accessed 4/18/2022) and is packaged in a Docker image for deployment (https://hub.docker.com/u/breedbase, last accessed 4/18/2022). The Breedbase system enables breeding programs to better manage and leverage their data for decision making within a fully integrated digital ecosystem.

Keywords: breeding; database; digital agriculture; digital ecosystem; genome-based breeding; genomic selection; genotyping; open source breeding software; phenotyping; predictive breeding; web-based software.

Figures

Fig. 1.
a) Breedbase platform architecture. User interface: To offer a dynamic, highly interactive user interface, several JavaScript libraries are used, including D3, jQuery, and Bootstrap. RESTful APIs, including a full BrAPI 2.0 implementation, handle communication between the front end and back end, allowing fast calculations without reloading the page. HTML5 is used for interactive graphical display, allowing instant reorganization of visual elements. The Bootstrap framework is used for modern and dynamic page templating. Middleware layer: A Perl software stack comprising Mason components to connect to the user interface; Catalyst, a web application framework; Moose, an object-oriented Perl library; and DBIx::Class, an object-relational mapper that connects to SQL code. In addition, BrAPI libraries are used. Finally, a job cluster scheduler, Slurm, is used to allocate server resources and ensure scalability. Data source layer: Breedbase operates on a relational database using Postgres. Postgres 12.0 offers “Big data” solutions including parallel query execution and optimized binary JSON data type handling. Binary JSON (JSONB) is a simple data structure designed to be efficient in storage space and scan speed. In Breedbase, JSONB is used for various data types, including genotypic (marker) information. In addition to the relational database, a standard file system space is available for flat files. Finally, other databases can communicate with a Breedbase instance, for example to provide an additional back end for marker data [e.g. Genomic Open Source Informatic Initiative (GOBii)] or to exchange germplasm information.
b) Breedbase codevelopment process. User–developer interactions are promoted using various media. Users can learn the system through online documentation (https://solgenomics.github.io/sgn/, last accessed 4/18/2022), video tutorials, or onsite training. Software development goals are extensively discussed between developers, data managers, breeders, and other appropriate stakeholders. Agile development allows short-term product release. Suggested improvements, issues, and bugs discovered in Breedbase are submitted and tracked on the public GitHub issue tracking software (https://github.com/, last accessed 4/18/2022). Software development progress is tracked using a version control system and Docker releases.
c) Cassavabase, a Breedbase instance: data content overview. Cassavabase involves national and international breeding programs (22) from various African and South American countries (15) and currently has 1,131 registered users. Cassavabase hosts various data types including high-density and low-density genotyping assays (35,000), plot-based phenotypic data points (nearly 15 million), images from plants and plots from trials (5,107) and locations (435).
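As a rough illustration of the RESTful BrAPI communication described in panel a), the following Python sketch builds a BrAPI v2 list request URL and unpacks the standard BrAPI response envelope (`metadata.pagination` plus `result.data`). The host `https://example.org`, the helper names `brapi_url` and `parse_list_response`, and the sample payload are assumptions made for illustration only; they are not taken from the Breedbase code base, and no network call is made.

```python
import json
from urllib.parse import urlencode


def brapi_url(base_url, resource, **params):
    """Build a BrAPI v2 list URL, e.g. <base>/brapi/v2/germplasm?page=0 (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/brapi/v2/{resource}"
    query = urlencode(sorted(params.items()))  # sorted for a stable parameter order
    return f"{url}?{query}" if query else url


def parse_list_response(payload):
    """Return (records, pagination) from a BrAPI v2 list response given as JSON text."""
    body = json.loads(payload)
    return body["result"]["data"], body["metadata"]["pagination"]


# Made-up payload shaped like a BrAPI v2 germplasm list response (illustrative values).
SAMPLE_PAYLOAD = json.dumps({
    "metadata": {
        "pagination": {"currentPage": 0, "pageSize": 2, "totalCount": 2, "totalPages": 1},
        "status": [],
        "datafiles": [],
    },
    "result": {
        "data": [
            {"germplasmDbId": "1", "germplasmName": "accession_A"},
            {"germplasmDbId": "2", "germplasmName": "accession_B"},
        ]
    },
})

url = brapi_url("https://example.org", "germplasm", page=0, pageSize=2)
records, pagination = parse_list_response(SAMPLE_PAYLOAD)
names = [record["germplasmName"] for record in records]
```

In practice the payload would come from an HTTP GET against a running instance; separating URL construction from response parsing keeps the sketch testable offline.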
Fig. 2.
Screenshot of the “Search Wizard” interface, a central query function on Breedbase. With the Search Wizard, the data in the database can be intersected by dimensions, such as locations, years, breeding programs, and traits. For each dimension, a number of elements can be selected. The individual selected dimensions can be stored in lists, and the combined selections can be saved as a dataset. Both lists and datasets can be used to feed data into various tools on Breedbase.
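The intersection logic described for the Search Wizard can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the Breedbase implementation: the `Observation` record, the `intersect` function, and all sample values are hypothetical. Each "list" is modeled as a set of selected elements for one dimension, and a "dataset" as a mapping from dimension name to list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One phenotypic record, tagged with the dimensions named in the caption (hypothetical model)."""
    location: str
    year: int
    program: str
    trait: str
    value: float


def intersect(observations, dataset):
    """Keep records that match every dimension in the dataset.

    Within a dimension any selected element may match (OR);
    across dimensions all selections must match (AND).
    """
    return [
        obs for obs in observations
        if all(getattr(obs, dim) in chosen for dim, chosen in dataset.items())
    ]


# Made-up records for illustration only.
OBSERVATIONS = [
    Observation("site_1", 2019, "program_A", "trait_x", 21.4),
    Observation("site_1", 2020, "program_A", "trait_y", 33.0),
    Observation("site_2", 2020, "program_B", "trait_x", 18.9),
    Observation("site_1", 2020, "program_A", "trait_x", 24.1),
]

# Each list stores the selected elements of one dimension.
location_list = {"site_1"}
year_list = {2020}
trait_list = {"trait_x"}

# A dataset combines the per-dimension selections.
dataset = {"location": location_list, "year": year_list, "trait": trait_list}
matches = intersect(OBSERVATIONS, dataset)
```

Keeping lists and datasets as plain data, separate from the query function, mirrors the caption's point that both can be saved and later fed into other tools.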

