How well do workplace-based assessments support summative entrustment decisions? A multi-institutional generalisability study
- PMID: 38167833
- DOI: 10.1111/medu.15291
Abstract
Background: Assessment of the Core Entrustable Professional Activities for Entering Residency requires direct observation through workplace-based assessments (WBAs). Single-institution studies have demonstrated mixed findings regarding the reliability of WBAs developed to measure student progression towards entrustment. Factors such as faculty development, rater engagement and scale selection have been suggested to improve reliability. The purpose of this investigation was to conduct a multi-institutional generalisability study to determine the influence of specific factors on reliability of WBAs.
Methods: The authors analysed WBA data obtained for clerkship-level students across seven institutions from 2018 to 2020. Institutions implemented a variety of strategies, including selection of designated assessors, altered scales and different entrustable professional activities (EPAs). Data were aggregated by these factors. Generalisability theory was then used to examine the internal structure validity evidence of the data. An unbalanced cross-classified random-effects model was used to decompose variance components. A phi coefficient of >0.7 was used as the threshold for acceptable reliability.
Results: Data from 53 565 WBAs were analysed, and a total of 77 generalisability studies were performed. Most data came from EPAs 1 (n = 17 118, 32%), 2 (n = 10 237, 19.1%), and 6 (n = 6000, 18.5%). Low variance attributed to the learner (<10%) was found for most (59/77, 76%) analyses, resulting in a relatively large number of observations required for reasonable reliability (range = 3 to >560, median = 60). Factors such as use of designated assessors, scale or EPA were not consistently associated with improved reliability.
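To illustrate why a small share of learner variance translates into many required observations, the following minimal sketch computes a phi coefficient for the mean of n observations and searches for the smallest n that exceeds the 0.7 threshold. It assumes a simplified single-facet design in which all non-learner variance is pooled into one absolute-error term; the study itself used an unbalanced cross-classified random-effects model, so this is not the authors' analysis. The function names and the variance proportions are hypothetical and chosen only for illustration.

```python
from typing import Optional


def phi_coefficient(var_learner: float, var_error: float, n_obs: int) -> float:
    """Phi (dependability for absolute decisions) for the mean of n_obs observations.

    Assumes a simplified design: var_error pools every non-learner
    variance component (rater, occasion, residual, ...) into one term.
    """
    return var_learner / (var_learner + var_error / n_obs)


def observations_needed(
    var_learner: float,
    var_error: float,
    threshold: float = 0.7,
    max_obs: int = 10_000,
) -> Optional[int]:
    """Smallest number of observations whose phi exceeds the threshold.

    Returns None if the threshold is not reached within max_obs.
    """
    for n in range(1, max_obs + 1):
        if phi_coefficient(var_learner, var_error, n) > threshold:
            return n
    return None


if __name__ == "__main__":
    # Illustrative variance proportions only, not values from the study.
    print(observations_needed(0.05, 0.95))  # learner variance 5%  -> 45
    print(observations_needed(0.30, 0.70))  # learner variance 30% -> 6
```

Under these assumptions, the required number of observations grows quickly as the learner's share of variance shrinks, which is consistent with the wide range reported above.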
Conclusion: The results from this study describe relatively low reliability in the WBAs obtained across seven sites. Generalisability for these instruments may be less dependent on factors such as faculty development, rater engagement or scale selection. When used for formative feedback, data from these instruments may be useful. However, such instruments do not consistently provide reasonable reliability to justify their use in high-stakes summative entrustment decisions.
© 2024 The Authors. Medical Education published by Association for the Study of Medical Education and John Wiley & Sons Ltd.
