Review
iScience. 2020 Dec 5;24(1):101890.
doi: 10.1016/j.isci.2020.101890. eCollection 2021 Jan 22.

Machine learning in plant science and plant breeding

Aalt Dirk Jan van Dijk et al. iScience.

Abstract

Technological developments have revolutionized measurements on plant genotypes and phenotypes, leading to routine production of large, complex data sets. This has led to increased efforts to extract meaning from these measurements and to integrate various data sets. Concurrently, machine learning has rapidly evolved and is now widely applied in science in general and in plant genotyping and phenotyping in particular. Here, we review the application of machine learning in the context of plant science and plant breeding. We focus on analyses at different phenotype levels, from biochemical to yield, and in connecting genotypes to these. In this way, we illustrate how machine learning offers a suite of methods that enable researchers to find meaningful patterns in relevant plant data.

Keywords: Artificial Intelligence; Plant Bioinformatics; Plant Biotechnology.

Figures

Figure 1
In plant sciences and plant breeding, variation at the genotype level (genomics data) is linked to phenotypic variation at various biochemical levels of organization (-omics data), at the cellular level, and at the macroscopic level at different scales. Both the analysis of phenotypic measurements (sections 2 and 3) and genotype-phenotype prediction (section 4) increasingly rely on ML.
Figure. Machine learning (ML) is a subfield of the broader field of artificial intelligence. Within ML, a major distinction is between supervised and unsupervised methods. Supervised methods use labeled input data. One example is classification, in which discrete labels (green squares vs. red circles) are available, and the model learns to predict the label for new objects. The curve on the right-hand side visualizes a decision boundary, which represents what the supervised model has learned. Unsupervised methods do not use labels but find groups or trends in data. One example is clustering, which, in the example shown, detects two groups.
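The distinction drawn in this caption can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the method of any study covered by the review. The toy coordinates, the function names, and the choice of a nearest-centroid classifier and two-group k-means are all assumed for the example. A nearest-centroid classifier also yields a straight decision boundary, which is simpler than the curved one drawn in the figure.

```python
from math import dist
from statistics import fmean


def centroid(points):
    """Mean position of a list of 2D points."""
    return (fmean(p[0] for p in points), fmean(p[1] for p in points))


def fit_nearest_centroid(points, labels):
    """Supervised: learn one centroid per label from labeled points."""
    grouped = {}
    for point, label in zip(points, labels):
        grouped.setdefault(label, []).append(point)
    return {label: centroid(members) for label, members in grouped.items()}


def predict(model, point):
    """Assign a new point the label of the closest learned centroid."""
    return min(model, key=lambda label: dist(model[label], point))


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled points into two groups (k-means, k=2)."""
    # Deterministic start: the lexicographically smallest and largest points.
    centers = [min(points), max(points)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for point in points:
            nearest = min((0, 1), key=lambda i: dist(centers[i], point))
            groups[nearest].append(point)
        # Keep the old center if a group happens to be empty.
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Hypothetical toy data: two well-separated clouds of 2D points.
points = [(1.0, 1.2), (0.8, 0.9), (1.3, 1.0), (4.0, 4.2), (4.3, 3.9), (3.8, 4.1)]
labels = ["green square"] * 3 + ["red circle"] * 3

model = fit_nearest_centroid(points, labels)   # uses the labels
new_label = predict(model, (1.1, 0.8))         # label for a new object
clusters = two_means(points)                   # ignores the labels

print(new_label)
print(clusters)
```

The supervised function needs the `labels` list in order to learn anything, whereas `two_means` receives only the coordinates and still recovers the two groups.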
Figure 2
Overview of biochemical and cellular measurements. A variety of “omics” (genomics, transcriptomics, proteomics, metabolomics) data can be measured. Machine learning is used to analyze these data at various levels and with various goals (bottom).
Figure 3
Overview of plant phenotyping systems. Plants can be observed at different levels (development, growth, production) using different types of sensors and sensor systems. Machine learning plays an important role in processing the sensor data to measure traits at the various levels (red box).

