Nucleic Acids Res. 2023 Jan 6;51(D1):D1312-D1324.
doi: 10.1093/nar/gkac936.

IAnimal: a cross-species omics knowledgebase for animals

Yuhua Fu et al. Nucleic Acids Res.

Abstract

With the exponential growth of multi-omics data, its integration and utilization have brought unprecedented opportunities for the interpretation of gene regulation mechanisms and the comprehensive analysis of biological systems. IAnimal (https://ianimal.pro/), a cross-species, multi-omics knowledgebase, was developed to improve the utilization of massive public data and to simplify the integration of multi-omics information for mining the genetic mechanisms of target traits. Currently, IAnimal provides 61 191 individual omics datasets spanning the genome (WGS), transcriptome (RNA-Seq), and epigenome (ChIP-Seq, ATAC-Seq), together with genome annotation information, for 21 species, such as mice, pigs, cattle, chickens, and macaques. The scale of its total clean data has reached 846.46 TB. To better understand the biological significance of omics information, a deep learning model for IAnimal was built based on BioBERT and AutoNER to mine 'gene' and 'trait' entities from 2 794 237 abstracts, which has practical significance for comprehending how each omics layer regulates genes to affect traits. By means of user-friendly web interfaces, flexible data application programming interfaces, and abundant functional modules, IAnimal enables users to easily query, mine, and visualize characteristics in various omics, and to infer how genes play biological roles under the influence of various omics layers.
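
As a rough illustration of how mined 'gene' and 'trait' entities could be turned into gene-trait evidence, the sketch below counts how often a gene and a trait appear in the same sentence. It is not IAnimal's pipeline: the input structure, the function name `cooccurrence_counts`, and the three toy sentences are all invented for illustration, and the entity recognition step (BioBERT and AutoNER in the abstract) is assumed to have already run.

```python
from collections import Counter, defaultdict

# Toy, invented input: each record stands for one sentence whose gene and
# trait entities have already been recognized by some NER model.
tagged_sentences = [
    {"genes": ["IGF2"], "traits": ["muscle growth", "backfat thickness"]},
    {"genes": ["IGF2", "KIT"], "traits": ["muscle growth"]},
    {"genes": ["KIT"], "traits": ["coat color"]},
]


def cooccurrence_counts(sentences):
    """Count, per gene, how many sentences mention each trait alongside it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Use sets so a repeated mention within one sentence counts once.
        for gene in set(sentence["genes"]):
            for trait in set(sentence["traits"]):
                counts[gene][trait] += 1
    return counts


counts = cooccurrence_counts(tagged_sentences)
# Most frequent trait for IGF2 in the toy data.
top_igf2 = counts["IGF2"].most_common(1)
print(top_igf2)
```

Sentence-level co-occurrence is only a crude proxy for a real gene-trait relationship, so counts like these would normally be a starting point for review rather than a conclusion.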

Figures

Figure 1.
Schematic diagram of IAnimal. (A) Pipelines of data pre-processing. (B) Back-end data of various omics information stored in IAnimal. (C) Gallery of functional modules in IAnimal.
Figure 2.
Interface of the Gene Search module. (A) There are five different search modes in the Gene Search module. Users can select an appropriate mode to obtain the gene sets of interest. (B) Users can filter the query results based on features of transcriptome, genome, and literature. When there are too many query results, users can narrow down the candidate gene sets with this mode. (C) Gene information at different omics layers. This mode integrates 11 kinds of information for a queried gene, so users can infer the gene's potential biological functions quickly. (D) Visualization of the gene structure. Users can view the structure information of different gene transcripts intuitively. (E) Gene expression information. This includes the gene expression of each individual and the gene expression in different tissues.
Figure 3.
The main functions and usage of the Genome section. (A) There are four different search modes in the Variation module. Users can select an appropriate mode to search the variations of interest. (B) There are two methods of constructing subgroups. Subgroups can be constructed through breed information or sample IDs. (C) The Population module helps users customize subgroups. Through the sample information and phylogenetic tree of this module, users can better understand the population structure and construct subgroups. (D) An example of Variation module search results. The similarities or differences of gene frequencies among subgroups can be easily compared. (E) The details of a specified variant. (F) The genotype frequency of a specified locus. (G) The genotypes and basic information of all individuals. Users can select individuals of interest for downstream analysis. (H) The genotype image generated by the Genotype Plotter module. Homozygous reference, homozygous variant, heterozygous, and missing (no call) genotypes are marked in blue, yellow, dark grey and light grey, respectively.
Figure 4.
The main functions and usage of the Transcriptome section. (A) There are four different search modes in the Gene Expression module. Users can select an appropriate mode to search the gene sets of interest. (B) There are two methods of constructing subgroups. Users can construct subgroups through tissue information or sample IDs. (C) The expression levels of queried genes in customized subgroups. Users can view the expression levels of genes in different subgroups conveniently. (D) Comparison of expression levels of specified genes in different subgroups. Users can select genes and subgroups of interest from a heatmap to be displayed in a boxplot for comparison. (E) The regulatory network of the specified gene and other genes. Users can obtain and visualize the gene set associated with the specified gene through the gene ID and the GCC threshold. (F) Visualization of the regulation patterns among PTPRC, WWTR1, HSPH1 and DNAJB14 between pigs and mice by using the GCC Comparison module.
Figure 5.
The main functions and usage of the Epigenome section. (A) Enrichment signal level of queried chromosome bins in customized subgroups. (B) The enrichment peak information obtained by the Peak Search module. Users can click the view button to display the peak on JBrowse. (C) Visualization of a user-specified enrichment peak by using the JBrowse module. Users can also add other tracks to compare with this peak. (D) Example in which the Signal Plotter module was used to visualize the enrichment signals of CTCF, H3K27, H3K27ac, H3K27me3 and H3K4me3 in the LOX gene region in embryo tissue. This module can display the enrichment signals of multiple samples. (E) Example in which the Signal Comparison module was used to reveal potential links between ChIP-seq, ATAC-seq, and the expression level of the LOX gene between pigs and mice.
Figure 6.
The main functions and usage of the Literature section. (A) The results of the Entity Search module, using the IGF2 gene as an example. The results contain genes and phenotype entities, and users can click the corresponding sentence or abstract to view the detailed information. (B) The feedback interface for Gene-Trait relationships. This allows users to give feedback on the reliability of relationships between genes and traits. (C) The feedback interface for entity recognition. Users can give feedback on the accuracy of entity recognition, which will assist the continuous optimization of the named entity recognition model. (D) The word cloud image generated by the Entity Cloud module, using the IGF2 gene as an example. From the word cloud image, users can infer that IGF2 may be related to muscle growth. (E) The word cloud image generated by the Entity Cloud module, using the coat color trait as an example. From the word cloud image, users can infer that the KIT gene plays an important role in regulating this trait.
