Comparative Study
BMC Genomics. 2020 Jul 8;21(1):470.
doi: 10.1186/s12864-020-06814-5.

Improved reconstruction and comparative analysis of chromosome 12 to rectify Mis-assemblies in Gossypium arboreum

Javaria Ashraf et al. BMC Genomics. 2020.

Abstract

Background: Genome sequencing technologies have improved at an exponential pace, but precise chromosome-scale genome assembly remains a great challenge. The draft genome of cultivated G. arboreum was sequenced and assembled with a shotgun sequencing approach; however, it contains several misassemblies. To address this issue, we generated an improved reassembly of G. arboreum chromosome 12 using genetic mapping and reference-assisted approaches and evaluated this reconstruction by comparing it with the homologous chromosomes of G. raimondii and G. hirsutum.

Results: In this study, we generated a high-quality assembly of G. arboreum chromosome 12 (A_A12), 94.64 Mb in length, which comprised 144 scaffolds and contained 3361 protein-coding genes. Evaluation by syntenic and collinear analysis of the reconstructed G. arboreum chromosome A_A12 against its homologous chromosomes in G. raimondii (D_D08) and G. hirsutum (AD_A12 and AD_D12) confirmed the significantly improved quality of the current reassembly compared with the previous one. We found major misassemblies in the previously assembled chromosome 12 (A_Ca9) of G. arboreum, particularly in the anchoring and orienting of scaffolds into a pseudo-chromosome. Further, the homologous chromosomes 12 of G. raimondii (D_D08) and G. arboreum (A_A12) contained almost equal numbers of transcription factor (TF)-related genes and showed a good collinear relationship with each other. In addition, a higher rate of gene loss was found in the corresponding homologous chromosomes of tetraploid (AD_A12 and AD_D12) than of diploid (A_A12 and D_D08) cotton, suggesting that gene loss is likely a continuing process in the chromosomal evolution of tetraploid cotton.

Conclusion: This study offers a more accurate strategy for correcting misassemblies in sequenced draft genomes of cotton, which will provide further insight into its genome organization.

Keywords: Gene loss; Genetic map; Reference-assisted assembly; Syntenic relationship; Transcription factor.
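
The genetic-map anchoring idea summarized in the abstract can be illustrated with a minimal sketch. It assumes each SNP marker is already assigned to a scaffold, with a physical position on that scaffold (bp) and a genetic position on the linkage group (cM). Scaffolds are ordered by their median genetic position and oriented by the sign of the covariance between physical and genetic positions. The function name, the data layout, and this ordering rule are hypothetical illustrations, not the authors' pipeline.

```python
from collections import defaultdict
from statistics import median


def anchor_scaffolds(markers):
    """Order and orient scaffolds from (scaffold, bp, cM) marker tuples.

    Returns a list of (scaffold, orientation) pairs sorted along the
    linkage group. Orientation is '+', '-', or '?' when the markers
    cannot decide it (e.g. a single marker on the scaffold).
    """
    by_scaffold = defaultdict(list)
    for scaffold, bp, cm in markers:
        by_scaffold[scaffold].append((bp, cm))

    placed = []
    for scaffold, points in by_scaffold.items():
        mean_bp = sum(bp for bp, _ in points) / len(points)
        mean_cm = sum(cm for _, cm in points) / len(points)
        # Sign of the covariance: do cM values rise or fall with bp?
        cov = sum((bp - mean_bp) * (cm - mean_cm) for bp, cm in points)
        if cov > 0:
            orientation = "+"
        elif cov < 0:
            orientation = "-"
        else:
            orientation = "?"
        placed.append((median(cm for _, cm in points), scaffold, orientation))

    placed.sort()  # ascending median cM gives the scaffold order
    return [(scaffold, orientation) for _, scaffold, orientation in placed]


# Hypothetical toy data: (scaffold, physical position in bp, genetic position in cM)
example_markers = [
    ("scafB", 100, 12.0),
    ("scafB", 5000, 15.5),
    ("scafA", 200, 3.0),
    ("scafA", 9000, 1.0),
    ("scafC", 50, 20.0),
]
print(anchor_scaffolds(example_markers))
```

In a sketch like this, scaffolds left with '?' are the ones where a reference-assisted step (alignment to a related genome) would be needed to settle the orientation.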

Conflict of interest statement

The authors declare that there is no conflict of interest regarding the publication of this paper.

Figures

Fig. 1
Schematic diagram of the reassembly of G. arboreum chromosome 12 (A_A12). Each rectangle corresponds to a step in the chromosome reassembly procedure. Genotypic data for the 24,569 SNP markers used in a previous study [27] were first filtered for construction of linkage groups, which were then assigned to the 13 chromosomes of G. arboreum. The linkage group belonging to G. arboreum chromosome 12 was then used for reassembly. We checked the alignments of scaffolds belonging to G. arboreum chromosome 12 at the following levels: (i) alignment of G. arboreum scaffolds (obtained from the genetic map) to G. raimondii scaffolds [7], (ii) orientation of G. raimondii (obtained from the previous step) and G. arboreum scaffolds along G. raimondii chromosome D_D08 [36], and (iii) adjacency of G. arboreum scaffolds within G. hirsutum chromosome AD_A12 [8].
Fig. 2
Syntenic relationship between corresponding homologous chromosomes of different Gossypium species. Syntenic relationships between homologous chromosomes 12 of (a) G. raimondii (D_D08) and G. arboreum (A_A12), (b) G. hirsutum (AD_A12) and G. arboreum (A_A12), and (c) G. hirsutum (AD_D12) and G. arboreum (A_A12). Syntenic blocks were required to match at least five genes per block after masking repeat regions. A good syntenic relationship was found when comparing the homologous chromosomes of G. raimondii (D_D08) and G. hirsutum (AD_A12 and AD_D12) with the reassembled chromosome of G. arboreum (A_A12).
Fig. 3
Collinearity of the reassembled G. arboreum chromosome (A_A12) with the 26 chromosomes of G. hirsutum. The collinear relationship was determined by MCScan after masking the repeat regions, with G. arboreum chromosome A_A12 analyzed against all 26 chromosomes of G. hirsutum. The results indicated a good collinear relationship of the reassembled G. arboreum chromosome A_A12 with its corresponding homologous chromosomes 12 (AD_A12 and AD_D12) of G. hirsutum compared with the other chromosomes. G. arboreum chromosome 12 is shown as ‘A_A12’, while chromosomes belonging to the At and Dt sub-genomes of G. hirsutum are indicated by ‘AD_A’ and ‘AD_D’.
Fig. 4
Dotplot representation between homologous chromosomes of different cotton species. A BLASTP search (with an E-value cutoff of 1 × 10⁻⁵) was performed to identify orthologous genes. Dotplots among the homologous chromosomes of the three cotton species were then generated with MCScan. (a) G. arboreum chromosome A_A12 (Y-axis) vs G. raimondii chromosome D_D08 (X-axis), (b) G. arboreum chromosome A_A12 (Y-axis) vs G. hirsutum chromosome AD_A12 (X-axis), and (c) G. arboreum chromosome A_A12 (Y-axis) vs G. hirsutum chromosome AD_D12 (X-axis).
Fig. 5
Syntenic relationship with the previously assembled chromosome 12 of G. arboreum (A_Ca9). The previously assembled chromosome 12 (A_Ca9) of G. arboreum was used to explore the syntenic relationship with (a) the reassembled G. arboreum chromosome A_A12 and (b) G. hirsutum chromosome AD_A12. Syntenic blocks were required to match at least five genes per block. The results indicated a poor syntenic relationship of G. arboreum chromosome A_Ca9 with these two chromosomes.
Fig. 6
Chromosomal mapping of the TF-related genes on homologous chromosome 12 of three cotton species. Physical mapping of members of five major TF-related families, (a) MYB, (b) C2H2, (c) WRKY, (d) bHLH, and (e) ERF, was performed on homologous chromosome 12 of G. arboreum (A_A12), G. raimondii (D_D08) and G. hirsutum (AD_A12 and AD_D12). Genes on the positive and negative strands are shown in blue and red, respectively, while lines indicate collinear genes.
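
The figure captions mention two thresholds: a BLASTP E-value cutoff of 1e-5 and a minimum of five genes per syntenic block. The sketch below shows how such filters might be applied, assuming BLAST tabular output (outfmt 6, where the E-value is the 11th tab-separated column) and blocks represented as lists of gene pairs. MCScan applies its own logic internally, so the function names, the block representation, and the inclusive treatment of the cutoff are assumptions made for illustration only.

```python
E_VALUE_CUTOFF = 1e-5        # cutoff quoted in the Fig. 4 caption
MIN_GENES_PER_BLOCK = 5      # threshold quoted in the Fig. 2 and Fig. 5 captions


def filter_blast_hits(lines, cutoff=E_VALUE_CUTOFF):
    """Keep (query, subject, evalue) for tabular BLAST hits at or below the cutoff.

    Assumes the standard 12-column outfmt 6 layout; blank lines and
    comment lines starting with '#' are skipped.
    """
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        evalue = float(fields[10])  # 11th column holds the E-value
        if evalue <= cutoff:        # inclusive comparison is an assumption
            hits.append((fields[0], fields[1], evalue))
    return hits


def filter_blocks(blocks, min_genes=MIN_GENES_PER_BLOCK):
    """Keep candidate syntenic blocks with at least `min_genes` gene pairs."""
    return [block for block in blocks if len(block) >= min_genes]


# Hypothetical toy input: two BLASTP hits and two candidate blocks
example_lines = [
    "geneA1\tgeneD1\t90.0\t100\t5\t0\t1\t100\t1\t100\t1e-30\t200",
    "geneA2\tgeneD2\t40.0\t50\t30\t2\t1\t50\t1\t50\t0.01\t30",
]
example_blocks = [
    [("a1", "d1"), ("a2", "d2"), ("a3", "d3"), ("a4", "d4"), ("a5", "d5")],
    [("a6", "d6"), ("a7", "d7")],
]
print(filter_blast_hits(example_lines))
print(len(filter_blocks(example_blocks)))
```

Keeping the thresholds as named constants makes it easy to see how the number of retained blocks changes when either value is relaxed or tightened.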

References

    1. Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995;269(5223):496–512.
    2. Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408(6814):796–815.
    3. Sasaki T. The map-based sequence of the rice genome. Nature. 2005;436(7052):793–800.
    4. Ming R, Hou S, Feng Y, Yu Q, Dionne-Laporte A, Saw JH, Senin P, Wang W, Ly BV, Lewis KL. The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus). Nature. 2008;452(7190):991–996.
    5. Shulaev V, Sargent DJ, Crowhurst RN, Mockler TC, Folkerts O, Delcher AL, Jaiswal P, Mockaitis K, Liston A, Mane SP. The genome of woodland strawberry (Fragaria vesca). Nat Genet. 2011;43(2):109–116.
