Genetics. 2014 Sep;198(1):167-70.
doi: 10.1534/genetics.114.166769. Epub 2014 Jul 9.

A defined zebrafish line for high-throughput genetics and genomics: NHGRI-1

Matthew C LaFave et al. Genetics. 2014 Sep.

Abstract

Substantial intrastrain variation at the nucleotide level complicates molecular and genetic studies in zebrafish, such as the use of CRISPRs or morpholinos to inactivate genes. In the absence of robust inbred zebrafish lines, we generated NHGRI-1, a healthy and fecund strain derived from founder parents we sequenced to a depth of ∼50×. Within this strain, we have identified the majority of the genome that matches the reference sequence and documented most of the variants. This strain has utility for many reasons, but in particular it will be useful for any researcher who needs to know the exact sequence (with all variants) of a particular genomic region or who wants to be able to robustly map sequences back to a genome with all possible variants defined.

Keywords: CRISPR; SNV; genome sequence; variants; zebrafish.

Figures

Figure 1
Screenshot of the UCSC browser custom tracks for NHGRI-1. Twenty mating pairs from 6-month-old TAB-5 fish were screened to select a robust founding pair with good clutch size and healthy progeny; the most fecund pair was renamed NHGRI-1. Fin clips from the NHGRI-1 male and female were prepared as separate genomic DNA libraries and sequenced on the Illumina HiSeq 2000 by the National Institutes of Health (NIH) Intramural Sequencing Center. Both libraries were subjected to paired-end sequencing with 101-bp reads. We aligned the sequence to the zebrafish genome [Zv9 (Howe et al. 2013)] with Novoalign version 2.08.02 (http://www.novocraft.com/). We removed PCR duplicates via SAMtools version 0.1.18 (Li et al. 2009). We used bam2mpg to identify the most probable genotype (MPG) for nucleotides in both parents (Teer et al. 2010). Bases that did not have an MPG score of at least 10, coverage of at least 20×, and a ratio of MPG score to coverage >0.5 were discarded. Regions of low sequence complexity were not specifically excluded from the analysis unless they failed to meet these criteria. The bases that matched the reference and met the above criteria in both fish were used to build the BED track of invariant nucleotides. The top track indicates the bases that were invariant in both fish sequenced. The white regions indicate either variation in at least one fish or insufficient read depth to confidently call the region as invariant. The second track indicates two nonsense mutations detected in this region. The letter indicates the alternative allele, and the color indicates whether the mutation was homozygous (red) or heterozygous (blue) in the NHGRI-1 population. Both tracks are available on the ZebrafishGenomics track hub, which is hosted at http://research.nhgri.nih.gov/manuscripts/Burgess/zebrafish/downloads/NHGRI-1/hub.txt and accessible through http://genome.ucsc.edu/cgi-bin/hgHubConnect.
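The per-base filter described in this caption can be sketched as follows. This is a minimal illustration of the stated thresholds (MPG score of at least 10, coverage of at least 20×, score-to-coverage ratio above 0.5), not the authors' pipeline. The `BaseCall` record, its field names, and the two-letter genotype string are hypothetical and do not reflect the actual bam2mpg output format. Treating "matched the reference" as a homozygous-reference genotype is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseCall:
    """Hypothetical per-base genotype call for one fish (not the bam2mpg format)."""
    genotype: str     # most probable genotype, assumed to be two letters, e.g. "AA" or "AG"
    ref_base: str     # reference base at this position
    mpg_score: float  # MPG score reported for the call
    coverage: int     # read depth at this position


def passes_filter(call, min_score=10, min_coverage=20, min_ratio=0.5):
    """Return True if the call meets all three thresholds quoted in the caption."""
    if call.mpg_score < min_score or call.coverage < min_coverage:
        return False
    # The ratio test is strict: the caption says the ratio must exceed 0.5.
    return call.mpg_score / call.coverage > min_ratio


def is_invariant(male, female):
    """Return True if both fish pass the filter and both match the reference.

    Assumption: a match means a homozygous-reference genotype.
    """
    for call in (male, female):
        if not passes_filter(call):
            return False
        if call.genotype != call.ref_base * 2:
            return False
    return True
```

Under these assumptions, a position enters the invariant track only when `is_invariant` holds for the male and female calls together; a failure in either fish leaves the position in the white regions of the track.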
Figure 2
Deletion and insertion variant length distribution within exons. (A) The 3160 DIVs in exons. (B) The 2,210,080 DIVs detected genome-wide. Red bars indicate the number of deletions of a given length; blue bars represent insertions.
Figure 3
SNV overlap with publicly available data sets. This comparison incorporates only SNVs that were biallelic and for which the reference base was an unambiguous A, C, G, or T. The Bowen et al. (2012) SNVs were downloaded from http://fishbonelab.org/harris/Resources_files/parental_variants.tar; both data sets were downloaded on March 12th, 2014.
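The restriction described in this caption (biallelic SNVs with an unambiguous reference base) can be sketched as a simple filter followed by a set intersection. This is an illustration only. The tuple layout of the input records is hypothetical, "biallelic" is taken to mean a reference allele plus exactly one alternative allele, and matching SNVs between data sets by chromosome, position, and both alleles is an assumption, since the caption does not state how overlap was determined.

```python
UNAMBIGUOUS = {"A", "C", "G", "T"}


def comparable_snvs(records):
    """Keep biallelic SNVs whose reference base is an unambiguous A, C, G, or T.

    `records` is an iterable of hypothetical (chrom, pos, ref, alts) tuples,
    where `alts` is a tuple of alternative alleles observed at that position.
    """
    kept = set()
    for chrom, pos, ref, alts in records:
        if ref.upper() not in UNAMBIGUOUS:
            continue  # ambiguous reference base such as N
        if len(alts) != 1:
            continue  # more than one alternative allele: not biallelic
        kept.add((chrom, pos, ref.upper(), alts[0].upper()))
    return kept


def snv_overlap(records_a, records_b):
    """Return the SNVs shared by two data sets after filtering each one."""
    return comparable_snvs(records_a) & comparable_snvs(records_b)
```

Filtering both inputs with the same function before intersecting keeps the comparison symmetric, so neither data set contributes sites the other could never match.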

References

    1. Bedell V. M., Wang Y., Campbell J. M., Poshusta T. L., Starker C. G., et al., 2012. In vivo genome editing using a high-efficiency TALEN system. Nature 491: 114–118.
    2. Bowen M. E., Henke K., Siegfried K. R., Warman M. L., Harris M. P., 2012. Efficient mapping and cloning of mutations in zebrafish by low-coverage whole-genome sequencing. Genetics 190: 1017–1024.
    3. Chen F. C., Chen C. J., Li W. H., Chuang T. J., 2007. Human-specific insertions and deletions inferred from mammalian genome sequences. Genome Res. 17: 16–22.
    4. Cong L., Ran F. A., Cox D., Lin S., Barretto R., et al., 2013. Multiplex genome engineering using CRISPR/Cas systems. Science 339: 819–823.
    5. Doyon Y., McCammon J. M., Miller J. C., Faraji F., Ngo C., et al., 2008. Heritable targeted gene disruption in zebrafish using designed zinc-finger nucleases. Nat. Biotechnol. 26: 702–708.
