A primer on classical test theory and item response theory for assessments in medical education
- PMID: 20078762
- DOI: 10.1111/j.1365-2923.2009.03425.x
Abstract
Context: A test score is a number which purportedly reflects a candidate's proficiency in some clearly defined knowledge or skill domain. A test theory model is necessary to help us better understand the relationship that exists between the observed (or actual) score on an examination and the underlying proficiency in the domain, which is generally unobserved. Common test theory models include classical test theory (CTT) and item response theory (IRT). The widespread use of IRT models over the past several decades attests to their importance in the development and analysis of assessments in medical education. Item response theory models are used for a host of purposes, including item analysis, test form assembly and equating. Although helpful in many circumstances, IRT models make fairly strong assumptions and are mathematically much more complex than CTT models. Consequently, there are instances in which it might be more appropriate to use CTT, especially when common assumptions of IRT cannot be readily met, or in more local settings, such as those that may characterise many medical school examinations.
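The relationship between observed score and underlying proficiency described above can be written compactly. The notation below is standard textbook notation and is an assumption here, since the abstract gives no formulas: X is the observed score, T the true score, E random error, theta_i the proficiency of examinee i, and a_j and b_j the discrimination and difficulty of item j. The two-parameter logistic (2PL) model is shown only as one common IRT model; the abstract does not say which IRT models the paper covers.

```latex
% CTT: observed score = true score + random error; reliability is the
% share of observed-score variance attributable to true scores
X = T + E, \qquad \rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2}

% IRT (2PL, one common choice): probability that examinee i answers item j correctly
P(X_{ij} = 1 \mid \theta_i) = \frac{1}{1 + \exp\left[-a_j(\theta_i - b_j)\right]}
```

The CTT statement concerns the total score and needs only weak assumptions, whereas the IRT statement models each item response as a function of proficiency, which is where the stronger assumptions and added mathematical complexity come from.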
Objectives: The objective of this paper is to provide an overview of both CTT and IRT to the practitioner involved in the development and scoring of medical education assessments.
Methods: The tenets of CTT and IRT are initially described. The main uses of both models in test development and psychometric activities are then illustrated via several practical examples. Finally, general recommendations pertaining to the use of each model in practice are outlined.
Discussion: Classical test theory and IRT are widely used to address measurement-related issues that arise from commonly used assessments in medical education, including multiple-choice examinations, objective structured clinical examinations, ward ratings and workplace evaluations. The present paper provides an introduction to these models and how they can be applied to answer common assessment questions.
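As a minimal sketch of the kind of CTT item analysis the abstract mentions, the Python below computes item difficulty (proportion correct), a corrected item-total discrimination index, and Cronbach's alpha for a small multiple-choice response matrix. The data and all function names are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
from statistics import mean, pvariance

# Hypothetical scored responses: rows = examinees, columns = items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

def item_difficulty(data):
    """CTT difficulty: proportion of examinees answering each item correctly."""
    n_items = len(data[0])
    return [sum(row[j] for row in data) / len(data) for j in range(n_items)]

def pearson(x, y):
    """Pearson correlation; returns NaN if either variable has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5

def item_discrimination(data):
    """Correlation of each item with the total score on the remaining items."""
    n_items = len(data[0])
    result = []
    for j in range(n_items):
        item = [row[j] for row in data]
        rest = [sum(row) - row[j] for row in data]
        result.append(pearson(item, rest))
    return result

def cronbach_alpha(data):
    """Cronbach's alpha, an internal-consistency estimate of reliability."""
    k = len(data[0])
    item_vars = [pvariance([row[j] for row in data]) for j in range(k)]
    total_var = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

difficulty = item_difficulty(responses)
discrimination = item_discrimination(responses)
alpha = cronbach_alpha(responses)
```

All three statistics are sample dependent: the same items given to a more or less able cohort would yield different values. This is one reason IRT is often preferred for tasks such as equating, provided its assumptions can be met.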
Comment in
- Improving the flexibility and efficiency of testing. Med Educ. 2010 Jan;44(1):18-9. doi: 10.1111/j.1365-2923.2009.03551.x. PMID: 20078752.