ipd: an R package for conducting inference on predicted data
- PMID: 39898809
- PMCID: PMC11842045
- DOI: 10.1093/bioinformatics/btaf055
Abstract
Summary: ipd is an open-source R software package for the downstream modeling of an outcome and its associated features where a potentially sizable portion of the outcome data has been imputed by an artificial intelligence or machine learning prediction algorithm. The package implements several recently proposed methods for inference on predicted data through a single, user-friendly wrapper function, ipd. The package also provides custom print, summary, tidy, glance, and augment methods to facilitate easy model inspection. This document introduces the ipd software package and provides a demonstration of its basic usage.
Availability: ipd is freely available on CRAN, and a development version is available on our GitHub page: github.com/ipd-tools/ipd. Full documentation, including detailed instructions and a usage vignette, is available at github.com/ipd-tools/ipd.
© The Author(s) 2025. Published by Oxford University Press.
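To make the idea of inference on predicted data concrete, the following is a minimal sketch of one well-known approach in this literature: a prediction-powered-style estimator of a mean, which corrects the average prediction on unlabeled data using the average prediction error observed on a small labeled set. It is written in Python purely for illustration and does not use or reproduce the ipd package's R interface; the function names `ppi_mean` and `simulate`, the simulated data, and the normal-approximation interval are all assumptions made for this example.

```python
import math
import random
import statistics


def ppi_mean(y_lab, f_lab, f_unlab, z=1.96):
    """Hypothetical helper: prediction-corrected estimate of E[Y] with a
    normal-approximation confidence interval (z=1.96 gives roughly 95%)."""
    n, N = len(y_lab), len(f_unlab)
    # Rectifier: prediction errors measured where true outcomes are observed.
    resid = [y - f for y, f in zip(y_lab, f_lab)]
    # Mean prediction on unlabeled data, shifted by the mean prediction error.
    estimate = statistics.fmean(f_unlab) + statistics.fmean(resid)
    # Variance adds the contributions of the two samples (assumed independent).
    se = math.sqrt(statistics.variance(f_unlab) / N + statistics.variance(resid) / n)
    return estimate, (estimate - z * se, estimate + z * se)


# Simulated data, purely illustrative: the true mean of Y is 2.0.
rng = random.Random(0)


def simulate(size):
    x = [rng.gauss(0, 1) for _ in range(size)]
    y = [2.0 + xi + rng.gauss(0, 0.5) for xi in x]
    f = [1.5 + xi for xi in x]  # predictor that is systematically 0.5 too low
    return y, f


y_lab, f_lab = simulate(200)    # small labeled set: outcomes and predictions
_, f_unlab = simulate(5000)     # large unlabeled set: predictions only

naive = statistics.fmean(f_unlab)            # treats predictions as if observed
est, (lo, hi) = ppi_mean(y_lab, f_lab, f_unlab)

print(f"naive mean of predictions: {naive:.3f}")
print(f"corrected estimate: {est:.3f}, interval: ({lo:.3f}, {hi:.3f})")
```

In this simulation the naive average of the predictions inherits the predictor's bias, while the corrected estimate uses the labeled residuals to remove it and reports an interval that reflects uncertainty from both samples.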
