Bioinformatics. 2019 Aug 15;35(16):2865-2867.
doi: 10.1093/bioinformatics/bty1044.

M3Drop: dropout-based feature selection for scRNASeq

Tallulah S Andrews et al. Bioinformatics.

Abstract

Motivation: Most genomes contain thousands of genes, but for most functional responses only a subset of those genes is relevant. To facilitate many single-cell RNASeq (scRNASeq) analyses, the set of genes is often reduced through feature selection, i.e. by removing genes that are subject only to technical noise.

Results: We present M3Drop, an R package that implements popular existing feature selection methods and two novel methods which take advantage of the prevalence of zeros (dropouts) in scRNASeq data to identify features. We show these new methods outperform existing methods on simulated and real datasets.
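To make the idea of dropout-based feature selection concrete, the following is a minimal sketch in Python. It is not the M3Drop implementation: the function names, the toy count matrix, and the choice of a Poisson zero probability as the expected dropout rate are all assumptions made here for illustration. The sketch computes each gene's mean expression and dropout rate (fraction of cells with a zero count), then ranks genes by how many more zeros they show than a Poisson model with the same mean would predict.

```python
import math


def dropout_summary(counts):
    """Per-gene mean, dropout rate, and excess dropout.

    counts: dict mapping a gene name to a list of counts, one per cell.
    The expected dropout rate exp(-mean) is the Poisson probability of a
    zero; this baseline is an illustrative assumption, not M3Drop's model.
    """
    summary = {}
    for gene, values in counts.items():
        n_cells = len(values)
        mean = sum(values) / n_cells
        dropout = sum(1 for v in values if v == 0) / n_cells
        expected = math.exp(-mean)
        summary[gene] = {
            "mean": mean,
            "dropout": dropout,
            "excess": dropout - expected,
        }
    return summary


def rank_by_excess_dropout(counts):
    """Return gene names sorted from most to least excess dropout."""
    summary = dropout_summary(counts)
    return sorted(summary, key=lambda g: summary[g]["excess"], reverse=True)


# Hypothetical toy data: three genes measured in eight cells.
toy_counts = {
    "geneA": [0, 0, 0, 0, 8, 8, 8, 8],  # on in half the cells, off in the rest
    "geneB": [4, 3, 5, 4, 4, 5, 3, 4],  # same mean as geneA, no zeros
    "geneC": [0, 1, 0, 1, 0, 1, 0, 1],  # low mean, zeros expected by chance
}

ranking = rank_by_excess_dropout(toy_counts)
print(ranking)
```

In this toy example geneA and geneB have the same mean, but only geneA has far more zeros than its mean would suggest, which is the kind of signal a dropout-based method can exploit. geneC also has a 50% dropout rate, yet that is unremarkable for a gene with such a low mean.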

Availability and implementation: M3Drop is freely available on GitHub as an R package and is compatible with other popular scRNASeq tools: https://github.com/tallulandrews/M3Drop.

Supplementary information: Supplementary data are available at Bioinformatics online.


Figures

Fig. 1.
Comparison of feature selection methods. (A and B) Accuracy in identifying DE genes in simulated data. (C and D) Reproducibility of features across five mouse embryo and four human pancreas datasets. (E and F) Average fold-change in expression of reproducible features. (C–F) Each point represents a pair of datasets and the horizontal lines indicate the mean across all pairs. PCA scores genes by their loadings on the top components; Gini is the method used by GiniClust (Jiang et al., 2016); Cons is the consensus across all other methods.
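The caption names the Gini index as one of the compared scoring methods. As a rough illustration only, the sketch below computes the plain Gini coefficient of one gene's counts across cells using the standard sorted-rank formula. It makes no claim to reproduce GiniClust's actual procedure, which may add normalization or other steps; the function name and example vectors are assumptions.

```python
def gini(values):
    """Plain Gini coefficient of a list of non-negative numbers.

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, where x is
    sorted ascending and i runs from 1 to n. Returns 0.0 when every value
    is zero, since inequality is undefined there (a convention chosen
    for this sketch).
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-cell counts for two genes.
concentrated = [0, 0, 0, 4]  # all expression in one cell
uniform = [1, 1, 1, 1]       # same expression in every cell

print(gini(concentrated), gini(uniform))
```

A gene expressed in only a few cells scores close to 1, while a gene expressed evenly scores close to 0, which is why a Gini-style score can highlight genes that mark rare cell populations.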

References

    1. Anders S. et al. (2013) Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nat. Protoc., 8, 1765–1786.
    2. Brennecke P. et al. (2013) Accounting for technical noise in single-cell RNA-seq experiments. Nat. Methods, 10, 1093–1095.
    3. Grün D. et al. (2014) Validation of noise models for single-cell transcriptomics. Nat. Methods, 11, 637–640.
    4. Islam S. et al. (2014) Quantitative single-cell RNA-seq with unique molecular identifiers. Nat. Methods, 11, 163–166.
    5. Jiang L. et al. (2016) GiniClust: detecting rare cell types from single-cell gene expression data with Gini index. Genome Biol., 17, 144.
