NAR Genom Bioinform. 2024 Aug 9;6(3):lqae097.
doi: 10.1093/nargab/lqae097. eCollection 2024 Sep.

High-quality chromosome scale genome assemblies of two important Sorghum inbred lines, Tx2783 and RTx436

Bo Wang et al. NAR Genom Bioinform.

Abstract

Sorghum bicolor (L.) Moench is a globally significant grass crop known for its genetic diversity. High-quality genome sequences are needed to capture this diversity. We constructed high-quality, chromosome-level genome assemblies for two vital sorghum inbred lines, Tx2783 and RTx436. Using advanced single-molecule techniques, long-read sequencing and optical maps, we improved average sequence continuity 19-fold and 11-fold relative to the existing BTx623 v3.0 reference genome and obtained 19 and 18 scaffolds (N50 of 25.6 and 14.4) for Tx2783 and RTx436, respectively. Our gene annotation efforts resulted in 29 612 protein-coding genes for the Tx2783 genome and 29 265 protein-coding genes for the RTx436 genome. Comparative analyses with 26 plant genomes, comprising 18 sorghum genomes and 8 outgroup species, identified around 31 210 protein-coding gene families, about 13 956 of them specific to sorghum. Using representative models from gene trees across the 18 sorghum genomes, a total of 72 579 pan-genes were identified, with 14% core, 60% softcore and 26% shell genes. We identified 99 genes in Tx2783 and 107 genes in RTx436 that showed functional enrichment specifically in binding and metabolic processes, as revealed by the GO enrichment Pearson Chi-Square test. We detected 36 potential large inversions in the comparison between the BTx623 Bionano map and the BTx623 v3.1 reference sequence. Strikingly, these inversions were absent when comparing Tx2783 or RTx436 with the BTx623 Bionano map. The inversions were mostly in the pericentromeric region, which is known to contain low-complexity sequence that is harder to assemble, suggesting the presence of potential artifacts in the public BTx623 reference assembly. Furthermore, compared with Tx2783, RTx436 exhibited 324 883 additional single nucleotide polymorphisms (SNPs) and 16 506 more insertions/deletions (INDELs) when using BTx623 as the reference genome. We also characterized approximately 348 nucleotide-binding leucine-rich repeat (NLR) disease resistance genes in the two genomes. These high-quality genomes serve as valuable resources for discovering agronomic traits and for structural variation studies.
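
The abstract reports a scaffold N50 for each assembly. As a reminder of what that statistic measures, here is a minimal Python sketch of the usual N50 definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are hypothetical illustrations, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and accumulate them; the N50 is
    the length at which the running sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), not taken from the paper.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70: 80 + 70 = 150, half of the 300 total
```

A larger N50 with fewer scaffolds, as reported for both assemblies, indicates that more of the genome sits in long, uninterrupted sequences.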

Figures

Figure 1.
Number of scaffolds per chromosome and BUSCO assessment. (A) Number of scaffolds per chromosome for the Tx2783 and RTx436 genome assemblies. Each colored bar represents a scaffold. Most scaffold breaks occur at the centromeres. Chr00: contigs not placed on a chromosome. ChrC: chloroplast sequence. ChrM: mitochondrial sequence. (B) BUSCO assessment of the indicated sorghum genomes.
Figure 2.
RAMPAGE signal enrichment around transcription start sites (TSSs) in Tx2783 (A) and RTx436 (B) in root and shoot tissues. The x-axis represents the distance from the gene TSS in kilobase pairs; the y-axis of a RAMPAGE TSS peak plot typically represents the read count.
Figure 3.
(A) Gene Ontology enrichment of genes that are unique to Tx2783. (B) Gene Ontology enrichment of genes that are unique to RTx436.
Figure 4.
Pan-gene index using 18 sorghum genome annotations. (A) Pan-gene set growth and age of the genes. The core gene set remains stable, while the pan-gene set grows because of lineage-specific genes. (B) Distribution of core (18), softcore (2–17) and cloud (1) genes in the pan-gene set. As expected, core genes are mostly older, conserved genes, while softcore and cloud genes are lineage-specific or newly evolving genes.
Figure 5.
(A) Violin plot of NLR variation in pan-genomes of monocot species (Z. mays, S. bicolor and B. distachyon) and eudicot (A. thaliana). (B) Maximum likelihood phylogeny of NLRs containing integrated domains from sorghum lines. Dots indicate bootstrap values >80. Outer ring indicates the additional non-canonical domain present in the NLR-ID. Inner ring represents the type group the sorghum line belongs to.
Figure 6.
Gene expression analysis after sugarcane aphid infestation. (A) Heatmap of genes expressed before and after sugarcane aphid infestation. (B) Heatmap of NLR genes before and after sugarcane aphid infestation. (C) Clusters of genes expressed before and after sugarcane aphid infestation.
Figure 7.
Alignments of Bionano maps between sorghum genomes. (A) BTx623 v3.1 reference sequence and BTx623 Bionano map; (B) BTx623 v3.1 reference sequence and RTx436 Bionano map; (C) BTx623 v3.1 reference sequence and Tx2783 Bionano map; (D) BTx623 Bionano map and Tx2783 sequence; (E) BTx623 Bionano map and RTx436 sequence; (F) Tx2783 sequence and RTx436 Bionano map.
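
The Figure 4 caption defines pan-gene categories by the number of sorghum genomes in which a pan-gene is present: core (18), softcore (2–17) and cloud (1). The sketch below applies those thresholds in Python. The function name `classify_pan_gene`, the pan-gene identifiers and the presence counts are all hypothetical; this only illustrates the binning rule stated in the caption and is not the authors' implementation.

```python
from collections import Counter

N_SORGHUM_GENOMES = 18  # number of sorghum genomes in the pan-gene index


def classify_pan_gene(n_genomes_present, n_total=N_SORGHUM_GENOMES):
    """Bin a pan-gene by the number of genomes that contain it.

    Thresholds follow the Figure 4 caption: core = all genomes,
    cloud = exactly one genome, softcore = anything in between.
    """
    if not 1 <= n_genomes_present <= n_total:
        raise ValueError("presence count must be between 1 and n_total")
    if n_genomes_present == n_total:
        return "core"
    if n_genomes_present == 1:
        return "cloud"
    return "softcore"


# Hypothetical presence counts per pan-gene (not data from the paper).
toy_presence = {
    "pan_gene_a": 18,
    "pan_gene_b": 17,
    "pan_gene_c": 2,
    "pan_gene_d": 1,
    "pan_gene_e": 18,
}

summary = Counter(classify_pan_gene(n) for n in toy_presence.values())
print(dict(summary))
```

With real data, the presence counts would come from a pan-gene presence/absence table across the 18 annotations, and the category proportions could then be compared with those reported in the abstract.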

