2025 Jun 23. doi: 10.25300/misq/2025/18946. Online ahead of print.

AI-Augmented Content Validation in Behavioral Research: Development and Evaluation of the RATER System

Jean-Charles Pillet et al. MIS Q.

Abstract

Content validation is an essential aspect of the scale development process that ensures that measurement instruments capture their intended constructs. However, researchers rarely undertake this core step in behavioral research because it requires costly data collection and specialized expertise. We present RATER (Replicable Approach to Expert Ratings), a free web-based system (www.contval.org) that can help the broader research community (scientists, reviewers, students) gain quick and reliable insights into the content validity of measurement instruments. Guided by psychometric measurement theory, RATER evaluates whether a scale's items correspond to their intended construct, remain distinct from other constructs, and adequately represent all aspects of the construct's content domain. The system employs two unique artificial intelligence models, RATERC and RATERD, which leverage psychometric scales from 2,443 journal articles spanning eight disciplines and two state-of-the-art large language model architectures (i.e., BERT and GPT). A set of six complementary studies confirms the RATER system's accuracy, reliability, and usefulness. We find RATER can augment the scale development and validation process, increasing the validity of findings in behavioral research.
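The abstract describes scoring whether a scale's items correspond to their intended construct and remain distinct from other constructs. As an illustration only, and not the RATER system's actual method (which uses BERT- and GPT-based models trained on 2,443 articles), the idea of item-construct correspondence can be sketched with a simple lexical similarity score: each candidate item is compared against hypothetical construct definitions, and the highest-scoring construct suggests the item's best match. The construct definitions and the `rate_item` helper below are illustrative assumptions, not part of the published system.

```python
from collections import Counter
import math
import re

def tf_vector(text):
    """Term-frequency vector over lowercase word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    shared = set(u) & set(v)
    dot = sum(u[t] * v[t] for t in shared)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

# Hypothetical construct definitions -- illustrative, not drawn from
# the RATER training corpus.
constructs = {
    "perceived usefulness": (
        "the degree to which a person believes that using a system "
        "improves job performance"
    ),
    "perceived ease of use": (
        "the degree to which a person believes that using a system "
        "is free of effort"
    ),
}

def rate_item(item):
    """Score one item against every construct definition; the top-scoring
    construct is the item's suggested correspondence."""
    iv = tf_vector(item)
    return {name: round(cosine(iv, tf_vector(defn)), 3)
            for name, defn in constructs.items()}

scores = rate_item("Using this system improves my job performance")
```

A real content-validity model would replace the term-frequency vectors with contextual embeddings so that paraphrased items (which share meaning but few words) still score highly; the lexical version above only captures the shape of the comparison.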

Keywords: behavioral research; content validity; design science research; large language models (LLMs); machine learning; psychometrics; research rigor; scale development.

