Skelet Muscle. 2019 May 3;9(1):10.
doi: 10.1186/s13395-019-0196-z.

Muscle Gene Sets: a versatile methodological aid to functional genomics in the neuromuscular field

Apostolos Malatras et al. Skelet Muscle.

Abstract

Background: The approach of building large collections of gene sets and then systematically testing hypotheses across these collections is a powerful tool in functional genomics, both in the pathway analysis of omics data and in uncovering the polygenic effects associated with complex diseases in genome-wide association studies. The Molecular Signatures Database includes collections of oncogenic and immunologic signatures that enable researchers to compare transcriptional datasets across hundreds of previous studies, which has led to important insights in these fields, but no such resource currently exists for neuromuscular research. In previous work, we have shown the utility of gene set approaches for understanding muscle cell physiology and pathology.

Methods: Following a systematic survey of public muscle data, we passed gene expression profiles from 4305 samples through a robust pre-processing and standardized data analysis pipeline. Two hundred eighty-two samples were discarded based on a battery of rigorous global quality controls. From among the remaining studies, 578 comparisons of interest were identified by a combination of text mining and manual curation of the study meta-data. For each comparison, significantly dysregulated genes (FDR adjusted p < 0.05) were identified.
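The significance filter in the last step can be illustrated with a minimal sketch. The abstract does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the gene names and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from one comparison (not from the paper).
p_values = {"GENE_A": 0.001, "GENE_B": 0.02, "GENE_C": 0.04, "GENE_D": 0.5}
genes = list(p_values)
adj = benjamini_hochberg([p_values[g] for g in genes])

# Keep genes passing the threshold stated in the abstract (adjusted p < 0.05).
significant = [g for g, q in zip(genes, adj) if q < 0.05]
print(significant)
```

Note that GENE_C has a raw p-value below 0.05 but does not pass after adjustment, which is the point of controlling the false discovery rate across many genes.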

Results: Lists of dysregulated genes were divided between upregulated and downregulated to give 1156 Muscle Gene Sets (MGS). This resource is available for download (www.sys-myo.com/muscle_gene_sets) and is accessible through three commonly used functional genomics platforms (GSEA, EnrichR, and WebGestalt). Basic guidance and recommendations are provided for the use of MGS through these platforms. In addition, consensus muscle gene sets were created to capture the overlap between the results of similar studies, and analysis of these highlighted the potential for novel disease-relevant findings.
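Splitting each of the 578 comparisons into an upregulated and a downregulated list gives 2 × 578 = 1156 gene sets. A rough sketch of that split is given below, writing each set as one line in the tab-separated GMT layout (set name, description, then genes). Using the sign of a log fold change to decide direction is an assumption, and the function name, gene symbols, and description text are hypothetical.

```python
def to_gmt_lines(comparison, description, log_fold_changes):
    """Split one comparison's dysregulated genes into 'up' and 'down' GMT lines."""
    # Assumption: a positive log fold change means higher expression in group 1.
    up = sorted(g for g, lfc in log_fold_changes.items() if lfc > 0)
    down = sorted(g for g, lfc in log_fold_changes.items() if lfc < 0)
    lines = []
    for direction, genes in (("up", up), ("down", down)):
        name = f"{direction}_in_{comparison}"
        # GMT layout: name <TAB> description <TAB> gene1 <TAB> gene2 ...
        lines.append("\t".join([name, description, *genes]))
    return lines


# Hypothetical significantly dysregulated genes for a single comparison.
lfc = {"Gene1": 1.8, "Gene2": -2.1, "Gene3": 0.9}
gmt_lines = to_gmt_lines("mdx_v_WT", "hypothetical source entry", lfc)
for line in gmt_lines:
    print(line)

# Two gene sets per comparison: 578 comparisons give 1156 sets.
n_sets = 2 * 578
```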

Conclusions: The MGS resource can be used to investigate the behaviour of any list of genes across previous comparisons of muscle conditions, to compare previous studies to one another, and to explore the functional relationship of muscle dysregulation to the Gene Ontology. Its major intended use is in enrichment testing for functional genomics analysis.
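As an illustration of enrichment testing, the sketch below runs a one-sided hypergeometric over-representation test of a query gene list against a single gene set. This is a generic test chosen for brevity, not the algorithm implemented by GSEA, EnrichR, or WebGestalt, and the gene identifiers and universe size are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(query, gene_set, universe_size):
    """One-sided p-value: probability of an overlap at least as large as observed."""
    query, gene_set = set(query), set(gene_set)
    k = len(query & gene_set)   # observed overlap
    n = len(query)              # size of the query list
    K = len(gene_set)           # size of the gene set
    N = universe_size           # number of genes that could have been drawn
    total = comb(N, n)
    # Sum the hypergeometric probabilities for overlaps k, k+1, ..., min(n, K).
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Hypothetical example: 20-gene universe, 5-gene set, 4-gene query sharing 3 genes.
gene_set = {"g1", "g2", "g3", "g4", "g5"}
query = {"g1", "g2", "g3", "g9"}
p = hypergeom_enrichment_p(query, gene_set, universe_size=20)
print(round(p, 4))  # 155/4845, about 0.032
```

In practice such a test would be repeated over every gene set in the collection, and the resulting p-values would then need correction for multiple testing.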

Keywords: Functional enrichment; Functional genomics; GWAS; Gene expression; Gene sets; Neuromuscular; Pathway analysis; Skeletal muscle; Transcriptomics.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Naming convention for Muscle Gene Sets. Each name was chosen to be both succinct and readily understandable. This was not an automated process—consideration was given to the name of each gene set. The first segment, before the triple underscore, has the generic form ‘up_in_Group1_v_Group2’ or ‘down_in_Group1_v_Group2’, referring to genes that were up- or downregulated in the comparison of group 1 (e.g. mdx) to group 2 (e.g. WT), for which ‘up’ indicates greater expression in group1 compared to group2, and ‘down’ means lesser expression in group1. Following the triple underscore, species name is then given, then age/timepoint and/or tissue description and/or gender (in any order). Finally, each gene set is given a MGS ID number. List of time abbreviations used: h = hour(s); d = day(s); wk. = week(s); mo = month(s); y = year(s). List of other abbreviation conventions used (ordered by appearance in the complete MGS gmt file): ctl = control; WT = wild-type; gastroc/gastr = gastrocnemius muscle; DMD = Duchenne muscular dystrophy; quad = quadriceps muscle; skel = skeletal; dysf = dysferlinopathy; EDMD = Emery-Dreifuss muscular dystrophy; EDL = extensor digitorum longus muscle; TA/tib_anterior = tibialis anterior muscle; diff = differentiation/differentiated (of myotubes); prim = primary cells; vast_lat/vastus_lat = vastus lateralis; KO = knock-out; mir = microRNA. Some study-specific abbreviations are used, which are assumed to be understandable from context or occasionally requiring reference to the source GEO entry indicated in the information column of the gmt file
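The naming convention in this caption can be sketched as a small parser. It handles only the segment before the triple underscore, because the caption states that the fields after it may appear in any order and does not give the format of the MGS ID. The function name and the example gene set name are hypothetical, and the parser assumes that group names do not themselves contain '_v_'.

```python
def parse_mgs_name(name):
    """Split an MGS-style name into direction, the two groups, and the raw annotation."""
    # Everything after the triple underscore is kept as one unparsed string.
    comparison, _, annotation = name.partition("___")
    direction, sep, groups = comparison.partition("_in_")
    if direction not in ("up", "down") or not sep:
        raise ValueError(f"unexpected direction prefix in {name!r}")
    # Assumption: the first '_v_' separates group 1 from group 2.
    group1, sep, group2 = groups.partition("_v_")
    if not sep:
        raise ValueError(f"missing '_v_' separator in {name!r}")
    return {
        "direction": direction,
        "group1": group1,
        "group2": group2,
        "annotation": annotation,
    }


# Hypothetical name following the pattern in the caption (not taken from the gmt file).
parsed = parse_mgs_name("up_in_mdx_v_WT___mouse_8wk_gastroc_MGS1")
print(parsed)
```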
Fig. 2
Proportional composition of the MGS collection broken down by tissue type, research theme, and myopathy sub-type, for human and murine species. a Tissue types. Shown are all tissue categories containing 10 or more gene sets. ‘Mixed’ indicates that the comparison is between different tissue types (e.g. gastrocnemius vs vastus lateralis). The ‘unspecified’ category indicates gene sets from studies in which the specific muscle tissue was not given in the published work. b Research themes. Shown are all themes containing 10 or more gene sets. c Myopathy sub-types. These are sub-categories of the myopathy set in b. All are shown
