An experimental comparison of multiple-choice and short-answer questions on a high-stakes test for medical students
- PMID: 37665413
- PMCID: PMC11208249
- DOI: 10.1007/s10459-023-10266-3
Abstract
Recent advances in automated scoring technology have made it practical to replace multiple-choice questions (MCQs) with short-answer questions (SAQs) in large-scale, high-stakes assessments. However, most previous research comparing these formats has used small examinee samples testing under low-stakes conditions. Additionally, previous studies have not reported on the time required to respond to the two item types. This study compares the difficulty, discrimination, and time requirements for the two formats when examinees responded as part of a large-scale, high-stakes assessment. Seventy-one MCQs were converted to SAQs. These matched items were randomly assigned to examinees completing a high-stakes assessment of internal medicine. No examinee saw the same item in both formats. Items administered in the SAQ format were generally more difficult than items in the MCQ format. The discrimination index for SAQs was modestly higher than that for MCQs, and response times were substantially longer for SAQs. These results support the interchangeability of MCQs and SAQs. When it is important that the examinee generate the response rather than select it, SAQs may be preferred. The results relating to difficulty and discrimination reported in this paper are consistent with those of previous studies. The results on the relative time requirements for the two formats suggest that, with a fixed testing time, fewer SAQs can be administered; this limitation may more than offset the higher discrimination that has been reported for SAQs. We additionally examine the extent to which increased difficulty may directly impact the discrimination of SAQs.
Keywords: Constructed response; Item performance; Multiple choice; Short answer.
© 2023. NBME.
Conflict of interest statement
The authors have no conflicts of interest regarding this research.