Gigascience. 2019 Jun 1;8(6):giz073.
doi: 10.1093/gigascience/giz073.

A large interactive visual database of copy number variants discovered in taurine cattle

Arun Kommadath et al. Gigascience.

Abstract

Background: Copy number variants (CNVs) contribute to genetic diversity and phenotypic variation. We aimed to discover CNVs in taurine cattle using a large collection of whole-genome sequences and to provide an interactive database of the identified CNV regions (CNVRs) that includes visualizations of sequence read alignments, CNV boundaries, and genome annotations.

Results: CNVs were identified in each of 4 whole-genome sequencing datasets, which together represent >500 bulls from 17 breeds, using a popular multi-sample read-depth-based algorithm, cn.MOPS. Quality control and CNVR construction, performed dataset-wise to avoid batch effects, resulted in 26,223 CNVRs covering 107.75 unique Mb (4.05%) of the bovine genome. Hierarchical clustering of samples by CNVR genotypes indicated clear separation by breeds. An interactive HTML database was created that allows data filtering options, provides graphical and tabular data summaries including Hardy-Weinberg equilibrium tests on genotype proportions, and displays genes and quantitative trait loci at each CNVR. Notably, the database provides sequence read alignments at each CNVR genotype and the boundaries of constituent CNVs in individual samples. Besides numerous novel discoveries, we corroborated the genotypes reported for a CNVR at the KIT locus known to be associated with the piebald coat colour phenotype in Hereford and some Simmental cattle.
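The CNVR construction step can be illustrated with a minimal sketch. It assumes that a CNVR is the union of CNVs that overlap by at least 1 base pair on the same chromosome, which is the common definition; the abstract does not spell out the exact merging rule used. The function names `build_cnvrs` and `covered_bases` and the 1-based inclusive coordinate convention are illustrative assumptions, not part of the published pipeline.

```python
from collections import defaultdict


def build_cnvrs(cnvs):
    """Merge overlapping CNV intervals into CNV regions (CNVRs).

    `cnvs` is an iterable of (chrom, start, end) tuples with 1-based,
    inclusive coordinates (an assumption made for this sketch).
    Intervals sharing at least one base are merged; intervals that are
    merely adjacent are kept separate.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        merged = []
        for start, end in sorted(by_chrom[chrom]):
            if merged and start <= merged[-1][1]:
                # Overlaps the current region: extend its end if needed.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        regions.extend((chrom, s, e) for s, e in merged)
    return regions


def covered_bases(regions):
    """Total number of unique bases covered by non-overlapping regions."""
    return sum(end - start + 1 for _, start, end in regions)


# Hypothetical per-sample CNV calls, used only to exercise the functions.
calls = [("chr6", 100, 200), ("chr6", 150, 300), ("chr6", 400, 500)]
regions = build_cnvrs(calls)
print(regions, covered_bases(regions))
```

Under the same coordinate convention, the KIT-locus CNVR Chr6:71,747,001–71,752,000 described in Figure 4 spans 5,000 bases, which is what `covered_bases([("Chr6", 71747001, 71752000)])` returns.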

Conclusions: We present a large comprehensive collection of taurine cattle CNVs in a novel interactive visual database that displays CNV boundaries, read depths, and genome features for individual CNVRs, thus providing users with a powerful means to explore and scrutinize CNVRs of interest more thoroughly.

Keywords: CNV; beef; cattle; dairy; database; sequence visualization; structural variants; whole-genome sequencing.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1:
Batch effects amongst the 4 datasets contributing to inconsistent distribution of CNV genotypes in the analysis of the combined datasets. (a) PCA based on normalized read counts per segment showed separation by datasets and 4 outliers. (b) When datasets were combined and analysed together using cn.MOPS (N = 549 after removing PCA outliers), the distribution of CNV genotypes revealed considerable differences among datasets (only autosomal CNVs are depicted here).
Figure 2:
Distributions of CNV genotypes were more consistent across datasets that were analysed individually. When datasets were analysed individually (N = 546 after removing PCA outliers and high-coverage outlier samples in dataset A), the distribution of CNV genotypes was consistent among datasets (only autosomal CNVs are depicted here).
Figure 3:
Proportions of overlapping CNVRs amongst datasets. Pairwise comparisons of the proportions of CNVRs in each dataset (rows; ordered by dataset size) that overlap by ≥1 base pair with CNVRs of other larger datasets (columns) are presented.
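The pairwise comparison described in this caption can be sketched as follows. This is a simple illustration of the "overlap by ≥1 base pair" criterion, not the authors' code; the function names and the (chrom, start, end) tuple layout with 1-based inclusive coordinates are assumptions.

```python
def overlaps(a, b):
    """True if regions a and b share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def overlap_proportion(query, target):
    """Fraction of CNVRs in `query` that overlap any CNVR in `target`.

    Both arguments are lists of (chrom, start, end) tuples. The scan is
    quadratic, which is acceptable for a sketch; an interval tree or a
    sorted sweep would scale better for tens of thousands of regions.
    """
    if not query:
        return 0.0
    hits = sum(any(overlaps(q, t) for t in target) for q in query)
    return hits / len(query)


# Hypothetical CNVRs from a smaller dataset (rows) and a larger one (columns).
small = [("chr1", 1, 10), ("chr1", 50, 60), ("chr2", 1, 10)]
large = [("chr1", 10, 20), ("chr2", 100, 200)]
print(overlap_proportion(small, large))
```

Note that the measure is asymmetric: the proportion of dataset X's CNVRs found in dataset Y generally differs from the reverse, which is why the figure reports rows against columns rather than a single value per pair.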
Figure 4:
Prevalence and genotypes of the KIT locus CNV across breeds and datasets. The breed-wise prevalence and genotypes at CNVR Chr6:71,747,001–71,752,000, found ~45 kb upstream of the KIT gene, are depicted here. This CNVR has been reported to be associated with the piebald coat colour phenotype in Hereford (HER) and some Simmental (SIM) cattle, and occurs in high copy numbers in these breeds. Its detection in high copy number in 2 of the 22 CHA cattle in dataset B is attributed to potential issues with sourcing or handling of the respective samples.
Figure 5:
Key features of the functionality of the CNVR database. The database has an index view and a detailed view with an option to enable/disable the help function on the top right of each page. The index page (a) has a panel (Filters) that allows users to apply filters to the CNVRs such as CNVR length or the number of samples that must contain the CNVR and the ability to exclude/include specific samples based on regular expression matches. Another panel (Statistics) provides summary information on the CNVRs before and after applying the filters. The remaining panels on the index page allow users to search and sort on CNVRs, overlapping genes, and QTLs and/or samples to quickly find CNVRs associated with a particular gene/QTL. All or selected data can be exported as CSV files. CNVRs of interest can be noted as favorites; and comments can be added for individual CNVRs. All comments, filters, and/or favorites can be saved as a text file that can be reloaded later using the Settings button options on the top right of the page. Clicking on a CNVR provides a detailed view (b) with panels displaying basic statistics on the CNVR (Summary), a bar plot of the number of samples per CNV genotype (Genotype distribution), and another bar plot of the number of non-CN2 variants per breed (Breed distribution), graphical representation of the CNVR in genomic context (Overlapping genes, QTLs, and CNVs), sequence read coverage at the CNVR for up to 3 samples per genotype (IGV images), a table of all the samples indicating the CNV genotype (CNVR-specific sample list), and finally a sample view that provides, for the selected sample, a graphical representation of the CNVR and CNV in genomic context with overlapping genes and QTLs.
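The filtering behaviour of the index page can be approximated with a short sketch. The record layout (`id`, `length`, `samples`), the function name `filter_cnvrs`, and the choice to drop excluded samples before counting them are all assumptions made for illustration; the database itself is an interactive HTML application and its internal logic is not described here.

```python
import re


def filter_cnvrs(cnvrs, min_length=0, min_samples=1, exclude_pattern=None):
    """Keep CNVRs that pass length and sample-count thresholds.

    `cnvrs` is a list of dicts with hypothetical keys "id", "length"
    (in base pairs), and "samples" (names of samples carrying the CNVR).
    Samples whose names match `exclude_pattern` are removed first, so a
    CNVR seen only in excluded samples is dropped.
    """
    pattern = re.compile(exclude_pattern) if exclude_pattern else None
    kept = []
    for cnvr in cnvrs:
        samples = [
            s for s in cnvr["samples"]
            if not (pattern and pattern.search(s))
        ]
        if cnvr["length"] >= min_length and len(samples) >= min_samples:
            kept.append({**cnvr, "samples": samples})
    return kept


# Hypothetical records; sample names use breed prefixes purely as an example.
records = [
    {"id": "a", "length": 5000, "samples": ["HER_1", "HER_2", "CHA_1"]},
    {"id": "b", "length": 800, "samples": ["HER_3"]},
    {"id": "c", "length": 3000, "samples": ["CHA_2"]},
]
print(filter_cnvrs(records, min_length=1000, exclude_pattern=r"^CHA_"))
```

Applying the sample exclusion before the sample-count threshold is one possible design; a tool could equally count all carriers first and hide excluded samples only in the display.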
