Computing the polytomous discrimination index
- PMID: 33866577
- DOI: 10.1002/sim.8991
Abstract
Polytomous regression models generalize logistic models to a categorical outcome variable with more than two distinct categories. These models are currently used in clinical research, and it is essential to measure their ability to distinguish between the categories of the outcome. In 2012, Van Calster et al. proposed the polytomous discrimination index (PDI) as an extension of the binary discrimination c-statistic to unordered polytomous regression. The PDI is a summary of the simultaneous discrimination between all outcome categories. Previous implementations of the PDI are not capable of running on "big data." This article shows that the PDI formula can be manipulated to depend only on the distributions of the predicted probabilities evaluated for each outcome category and within each observed level of the outcome, which substantially improves the computation time. We present a SAS macro and an R function that can rapidly evaluate the PDI and its components. The routines are evaluated on several simulated datasets, varying the number of outcome categories and the size of the data, and on two large real-world administrative health datasets. Using simulated examples, we compare the PDI with two other discrimination indices: the M-index and the hypervolume under the manifold (HUM). We describe situations where the PDI and HUM, indices based on multiple comparisons, are superior to the M-index, an index based on pairwise comparisons, at detecting predictions that are no better than random selection or that are erroneous because of incorrect ranking.
Keywords: R function; SAS macro; discrimination; polytomous discrimination index; polytomous regression.
© 2021 John Wiley & Sons Ltd.
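The abstract's central idea, that the PDI can be computed from the distributions of predicted probabilities within each observed outcome level, can be illustrated with a minimal Python sketch. This is not the authors' SAS macro or R function; the function name `pdi`, its arguments, and the handling of ties are assumptions made for illustration. The sketch assumes labels coded 0..K-1, at least one case per category, and no tied predicted probabilities (a tie is counted as a loss here, which a full implementation would treat differently).

```python
import numpy as np

def pdi(y, probs):
    """Sketch of the polytomous discrimination index (hypothetical helper).

    y     : integer labels 0..K-1, shape (n,)
    probs : predicted probabilities, shape (n, K)
    Returns (overall PDI, list of K category-specific components).
    Assumes every category is observed and there are no ties.
    """
    y = np.asarray(y)
    probs = np.asarray(probs, dtype=float)
    k = probs.shape[1]
    components = []
    for i in range(k):
        # Predicted probability of category i for cases truly in category i.
        own = probs[y == i, i]
        wins = np.ones(own.shape[0])
        for j in range(k):
            if j == i:
                continue
            # Predicted probability of category i for cases truly in category j.
            other = np.sort(probs[y == j, i])
            # Fraction of category-j cases with a strictly smaller p_i.
            wins *= np.searchsorted(other, own, side="left") / other.shape[0]
        # Probability that the category-i case has the highest p_i in a
        # random set containing one case from each category.
        components.append(float(wins.mean()))
    return float(np.mean(components)), components

# Usage: a perfect three-category predictor.
y_demo = [0, 0, 1, 1, 2, 2]
p_demo = np.eye(3)[y_demo]
print(pdi(y_demo, p_demo)[0])  # prints 1.0
```

Because each category's contribution needs only sorted probabilities and counts, the cost grows roughly with n log n per category pair instead of with the number of possible K-case sets, which is the kind of saving the abstract describes.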