PLoS One. 2014 Dec 26;9(12):e115740.
doi: 10.1371/journal.pone.0115740. eCollection 2014.

Allele Workbench: transcriptome pipeline and interactive graphics for allele-specific expression

Carol A Soderlund et al. PLoS One. 2014.

Abstract

Sequencing the transcriptome can answer various questions such as determining the transcripts expressed in a given species for a specific tissue or condition, evaluating differential expression, discovering variants, and evaluating allele-specific expression. Differential expression evaluates the expression differences between different strains, tissues, and conditions. Allele-specific expression evaluates expression differences between parental alleles. Both differential expression and allele-specific expression have been studied for heterosis (hybrid vigor), where the hybrid has improved performance over the parents for one or more traits. The Allele Workbench software was developed for a heterosis study that evaluated allele-specific expression for a mouse F1 hybrid using libraries from multiple tissues with biological replicates. This software has been made into a distributable package, which includes a pipeline, a Java interface to build the database, and a Java interface for query and display of the results. The required input is a reference genome, annotation file, and one or more RNA-Seq libraries with optional replicates. It evaluates allelic imbalance at the SNP and transcript level and flags transcripts with significant opposite directional allele-specific expression. The Java interface allows the user to view data from libraries, replicates, genes, transcripts, exons, and variants, including queries on allele imbalance for selected libraries. To determine the impact of allele-specific SNPs on protein folding, variants are annotated with their effect (e.g., missense), and the parental protein sequences may be exported for protein folding analysis. The Allele Workbench processing results in transcript files and read counts that can be used as input to the previously published Transcriptome Computational Workbench, which has a new algorithm for determining a trimmed set of gene ontology terms. 
The software with demo files is available from https://code.google.com/p/allele-workbench. Additionally, all software is ready for immediate use from an Atmosphere Virtual Machine Image available from the iPlant Collaborative (www.iplantcollaborative.org).
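One way to read "opposite directional allele-specific expression" is that a single transcript carries at least one significant SNP favoring the reference allele and at least one favoring the alternate allele. The sketch below illustrates that reading only. The `SnpCall` type, the `has_opposite_direction` function, and the 0.05 cutoff (borrowed from the figure captions) are assumptions made for illustration and are not taken from the Allele Workbench code.

```python
from typing import Iterable, NamedTuple


class SnpCall(NamedTuple):
    """Hypothetical per-SNP record for one library: ref/alt read counts and a p-value."""
    ref: int
    alt: int
    p_value: float


def has_opposite_direction(snps: Iterable[SnpCall], alpha: float = 0.05) -> bool:
    """Return True if significant SNPs point in both directions (ref > alt and alt > ref).

    Assumption: 'significant' means p_value < alpha; the paper may use a
    different rule for flagging transcripts.
    """
    significant = [s for s in snps if s.p_value < alpha]
    ref_favored = any(s.ref > s.alt for s in significant)
    alt_favored = any(s.alt > s.ref for s in significant)
    return ref_favored and alt_favored


# Usage: two significant SNPs disagree in direction, so the transcript is flagged.
mixed = [SnpCall(40, 10, 0.001), SnpCall(8, 35, 0.002), SnpCall(20, 22, 0.8)]
consistent = [SnpCall(40, 10, 0.001), SnpCall(35, 8, 0.002)]
print(has_opposite_direction(mixed), has_opposite_direction(consistent))
```

Non-significant SNPs are ignored on purpose, so a balanced SNP next to a single imbalanced one does not trigger the flag.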

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. viewAW tables.
The blue circles represent tables that can be queried in viewAW. From each table, one or more rows may be selected to view the associated table of data, which is indicated by the pointed-to circles. The “LibList” is the library counts for a selected set of genes, transcripts or SNPs, which link to the associated replicate counts.
Figure 2. viewAW transcript table.
The available columns are listed in the lower panel; checking the box next to a column name shows that column in the table. Selecting “Hide” closes the column listing. The SpNYfKid and SpNYfLiv columns are the SNP coverage p-values. The RpNYfKid and RpNYfLiv columns are the read count p-values. #SNPCov is the number of SNPs with ≥20 reads for any library, #SNPAI is the number of SNPs that are AI (p-value <0.05) for any library, and #Mis is the number of missense SNPs. #SNPCov and #SNPAI take all four libraries into account; only two libraries are shown here, and the others can be viewed by selecting their respective column boxes next to “Tissue”.
Figure 3. viewAW drawing of a gene with three transcripts and 11 variants.
The black exons are non-coding. Stacked coding exons drawn in different colors have different coordinates, e.g., the stack with two pink exons (same coordinates) and one blue exon (different coordinates). The long vertical lines represent SNPs (black) and indels (red); if the number below the variant line is followed by an “*”, the variant is AI (p-value <0.05) for at least one library, e.g., variant #2 is AI for libraries NYfBr and NYfLiv.
Figure 4. viewAW drilling down into the data.
(a) The table shows the variants for an AI transcript. The S:NYfMus column displays the ref:alt SNP coverage for library NYfMus, and the SpNYfMus column shows the corresponding p-values. There are three AI SNPs, where two are ref > alt and the other is alt
Figure 5. TCW trimmed GO set.
All 76 DE-enriched GOs are shown in the table, and the 24 green rows are the trimmed set.
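The figure captions quote two thresholds for SNP-level allelic imbalance (AI): at least 20 reads of coverage and a p-value below 0.05. The text here does not say which statistical test produces the p-value, so the sketch below assumes an exact two-sided binomial test of the ref:alt counts against an expected 50:50 ratio. The function names are hypothetical and the code is not the Allele Workbench implementation.

```python
from math import comb

MIN_READS = 20  # coverage cutoff quoted in the Figure 2 caption
ALPHA = 0.05    # AI p-value cutoff quoted in the figure captions


def binomial_two_sided_p(ref: int, alt: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for `ref` successes in `ref + alt` trials.

    Assumption: a binomial test against p = 0.5 stands in for whatever test
    the paper actually uses. Intended for modest read counts only, since the
    probability terms overflow floating point for very deep coverage.
    """
    n = ref + alt
    if n == 0:
        return 1.0
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref]
    # Sum every outcome that is at least as unlikely as the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


def is_allelic_imbalance(ref: int, alt: int) -> bool:
    """Apply the coverage cutoff first, then the p-value cutoff."""
    if ref + alt < MIN_READS:
        return False
    return binomial_two_sided_p(ref, alt) < ALPHA


# Usage: a 30:10 SNP is called AI, 12:10 is not, and 9:1 lacks coverage.
for ref, alt in [(30, 10), (12, 10), (9, 1)]:
    print(f"{ref}:{alt}", is_allelic_imbalance(ref, alt))
```

For real data, a library routine such as `scipy.stats.binomtest` would be a more robust choice than the hand-rolled sum.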
