BMC Genomics. 2019 Mar 14;20(1):217.
doi: 10.1186/s12864-019-5551-2.

TADKB: Family classification and a knowledge base of topologically associating domains

Tong Liu et al. BMC Genomics. 2019.

Abstract

Background: Topologically associating domains (TADs) are considered the structural and functional units of the genome. However, the literature lacks an integrated resource where researchers can obtain family classifications and detailed information about TADs.

Results: We built an online knowledge base, TADKB, that integrates knowledge about TADs in eleven human and mouse cell types. For each TAD, TADKB provides the predicted three-dimensional (3D) structures of chromosomes and TADs, along with detailed annotations of the protein-coding genes and long non-coding RNAs (lncRNAs) located in each TAD. In addition to the 3D chromosomal structures inferred from population Hi-C, TADKB integrates the single-cell, haplotype-resolved 3D chromosomal structures of 17 GM12878 cells. A user can submit a gene or lncRNA ID or sequence to search for the TAD(s) that contain the query gene or lncRNA. We also classified TADs into families. To do so, we used the TM-scores between reconstructed 3D structures of TADs as structural similarities and the Pearson's correlation coefficients between the fold enrichment of chromatin states as functional similarities. All of the TADs in one cell type were clustered separately on structural and on functional similarities using the spectral clustering algorithm with various predefined numbers of clusters. We compared the overlapping TADs between structural and functional clusters and found that most of the TADs in functional clusters with depleted chromatin states fall into one or two structural clusters. This novel finding indicates a connection between the 3D structures of TADs and their DNA functions in terms of chromatin states.
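The functional similarity described above can be illustrated with a minimal Python sketch. It computes Pearson's correlation coefficients between per-TAD fold-enrichment vectors and collects them into a pairwise matrix. The TAD names, the enrichment values, and the function names are hypothetical; the abstract does not say how the authors implemented this step.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def functional_similarity_matrix(enrichment):
    """Pairwise Pearson correlations between per-TAD fold-enrichment vectors.

    `enrichment` maps a TAD name to its list of fold-enrichment values,
    one value per chromatin state, in the same state order for every TAD.
    """
    names = sorted(enrichment)
    matrix = [[pearson(enrichment[a], enrichment[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical fold-enrichment values for three TADs over four chromatin states.
toy_enrichment = {
    "tad_a": [2.0, 0.5, 1.0, 0.1],
    "tad_b": [4.0, 1.0, 2.0, 0.2],  # proportional to tad_a
    "tad_c": [0.1, 1.0, 0.5, 2.0],  # tad_a's values in reverse order
}
names, sim = functional_similarity_matrix(toy_enrichment)
```

In this toy input, `tad_a` and `tad_b` have proportional profiles and therefore correlate almost perfectly, while `tad_c` correlates negatively with both. A clustering step such as spectral clustering would take a matrix like `sim` as input; how negative correlations were handled before clustering is not stated in the abstract.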

Conclusion: TADKB is available at http://dna.cs.miami.edu/TADKB/.

Keywords: Family classification; Long non-coding RNAs; Single-cell 3D genome structures; TADs; Topologically associating domains; lncRNAs.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
The webpage of TADKB that allows a user to browse all the TADs for a cell or cell line
Fig. 2
The annotation page of TADKB showing the information about a single TAD with MDS-based reconstructed 3D structure
Fig. 3
The annotation page of TADKB with the 3D structure of the chromosome displayed in single-cell Hi-C (red highlights the TAD and blue highlights the start and end positions of the 3D structure)
Fig. 4
The TADKB page showing the annotations of protein-coding genes. When a user selects one or more genes from the list in the middle, the annotations of the selected gene(s) are displayed in the panel on the right. Meanwhile, the location of the gene(s) is highlighted on the 3D structure of the TAD on the left
Fig. 5
The TADKB page showing the annotations of protein-coding genes with a TAD’s reconstructed 3D structure extracted from the 3D structure of a single-cell chromosome
Fig. 6
The TADKB page showing the annotations of lncRNAs. Three major lncRNA databases NONCODE, LNCipedia, and lncRNAdb are integrated. Different IDs from different lncRNA databases will be unified. The locations of the selected lncRNA(s) will be highlighted on the 3D structure of the TAD on the left
Fig. 7
The TADKB page showing the loops or peaks. Loops in DNA can indicate enhancer-promoter interactions
Fig. 8
The TADKB page showing the fold enrichment of chromatin states. Red indicates a fold enrichment larger than 1; blue indicates otherwise
Fig. 9
(a) Each chromatin-state cluster’s fold enrichment of 25 states. (b) The normalized overlapping TAD enrichment between chromatin-state clusters and structural clusters. (c) The original overlapping TAD numbers between chromatin-state clusters and structural clusters. (d), (e), and (f) The distribution of exponent parameters, radius of gyration, and gene density of the TADs in structural clusters. The cell type is GM12878 and the predefined numbers of chromatin-state and structural clusters are 20 and 5, respectively. Gene density is calculated by normalizing the number of protein-coding genes found within a TAD by the TAD’s number of bins
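Two of the per-TAD quantities named in the Fig. 9 caption can be sketched in a few lines of Python. Gene density follows the caption's definition directly. For radius of gyration, the sketch assumes the common equal-weight definition (root-mean-square distance of the points from their centroid); the caption does not state which exact form the authors used. All input values and function names are hypothetical.

```python
from math import sqrt


def gene_density(n_protein_coding_genes, n_bins):
    """Number of protein-coding genes in a TAD divided by the TAD's number of bins."""
    if n_bins <= 0:
        raise ValueError("a TAD must span at least one bin")
    return n_protein_coding_genes / n_bins


def radius_of_gyration(points):
    """Root-mean-square distance of 3D points from their centroid.

    Assumes every point (bead) carries equal weight.
    """
    if not points:
        raise ValueError("need at least one point")
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    mean_sq = sum(
        sum((p[i] - centroid[i]) ** 2 for i in range(3)) for p in points
    ) / n
    return sqrt(mean_sq)


# Hypothetical TAD: 6 protein-coding genes over 12 bins, with four toy bead positions.
density = gene_density(6, 12)
beads = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
rg = radius_of_gyration(beads)
```

Here the four beads lie at unit distance from a centroid at the origin, so the radius of gyration is 1, and the gene density is 6 / 12 = 0.5.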
Fig. 10
The family browsing page of TADKB listing all the families of a species. In this example, the families of human GM12878 are listed
Fig. 11
The distribution of Pearson’s correlation coefficients between two acrossCells’ fold enrichment of chromatin states and TM-scores between two acrossCells’ MDS-inferred 3D structures. TADKB provides acrossCells for 15 cell pairs from six human cell types (x labels)
Fig. 12
The acrossCells browsing page of TADKB listing all the acrossCells between two cell types. In this example, the acrossCells between human GM12878 and human HMEC are listed
Fig. 13
The searching page of TADKB that allows a user to input a query DNA sequence to search against human and mouse genomes
Fig. 14
The result page from the searching function of TADKB showing all the TADs that contain the query sequence. A user can click any of the hit TADs to view more information about it
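The final step of such a search, finding the TADs that contain a query, can be sketched as an interval-containment check. This Python sketch assumes the query sequence has already been mapped to genomic coordinates; the alignment step itself is not shown. The `Tad` record, the function name, and all identifiers and coordinates are hypothetical and do not reflect TADKB's actual data model.

```python
from typing import NamedTuple


class Tad(NamedTuple):
    tad_id: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def tads_containing(tads, chrom, start, end):
    """Return the TADs on `chrom` whose span fully contains [start, end)."""
    if start >= end:
        raise ValueError("query interval must have start < end")
    return [
        t for t in tads
        if t.chrom == chrom and t.start <= start and end <= t.end
    ]


# Hypothetical TAD table; identifiers and coordinates are made up.
toy_tads = [
    Tad("tad_1", "chr1", 0, 500_000),
    Tad("tad_2", "chr1", 500_000, 1_200_000),
    Tad("tad_3", "chr2", 0, 800_000),
]
hits = tads_containing(toy_tads, "chr1", 600_000, 650_000)
```

With this toy table, the query on chr1 falls entirely inside `tad_2`. A query that straddles a TAD boundary returns no hits under this full-containment rule; whether TADKB reports such partial overlaps is not stated in the source.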
