Educ Psychol Meas. 2018 Oct;78(5):805-825.
doi: 10.1177/0013164417724187. Epub 2017 Aug 4.

The Delta-Scoring Method of Tests With Binary Items: A Note on True Score Estimation and Equating

Dimiter M Dimitrov. Educ Psychol Meas. 2018 Oct.

Abstract

This article presents new developments in delta scoring (D-scoring), an approach to scoring and equating tests with binary items that is being piloted with large-scale assessments at the National Center for Assessment in Saudi Arabia. The presentation builds on previous work on delta scoring and adds procedures for scaling and equating, an item response function, and the estimation of true values and standard errors of D scores. Unlike the previous work, in which D-scoring relied on item and person parameter estimates obtained in the framework of item response theory, the approach presented here does not require item response theory calibration.

Keywords: assessment; delta-scoring; test equating; test scoring; testing.

Conflict of interest statement

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

Figure 1.
Item–person map for D scores and δi values (“deltas”) obtained with data from the administration of the general aptitude test (GAT) to 3,460 high-school graduates in Saudi Arabia (GAT consists of 72 dichotomously scored item responses, 1 = correct, 0 = incorrect).
Figure 2.
Frequency distribution of D scores obtained via Equation (1) with the binary scores of 3,000 subjects on 20 items generated with simulations under the one-parameter logistic (1PL) model, from the standard normal ability distribution, θs ~ N(0, 1).
Figure 3.
Item–person map for D scores obtained via Equation (1) with simulated binary scores of 3,000 subjects on 20 items (δi values in Table 3).
Figure 4.
Item characteristic curves (ICCs) on the D-scale for four items selected from the 20 simulated items with delta parameters in Table 3: Item 6 (δ6 = 0.2284), Item 8 (δ8 = 0.3648), Item 12 (δ12 = 0.8320), and Item 20 (δ20 = 0.6982). The ICCs are obtained via Equation (4).
Figure 5.
Standard error of measurement of D scores, computed via Equation (6), with the simulated data for 20 items in Example 1.
Figure 6.
The δi values for the 10 common items on test Form X, prior to their rescaling, and test Form Y (with the simulated data in Example 2).
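
The simulation described in the Figure 2 caption (3,000 subjects, 20 binary items, 1PL model, θ ~ N(0, 1)) can be illustrated with a minimal sketch. Equation (1) is not reproduced on this page, so the scoring step below rests on assumptions: each δi is taken to be one minus the proportion of correct responses on item i, and the D score is taken to be the δ-weighted sum of a person's binary responses divided by the sum of all δi. The item difficulties `b`, the random seed, and the function names are hypothetical and are not the values used in the article.

```python
import numpy as np


def simulate_1pl(n_persons=3000, n_items=20, seed=0):
    """Generate binary responses under the one-parameter logistic model."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n_persons)   # abilities drawn from N(0, 1)
    b = np.linspace(-2.0, 2.0, n_items)      # hypothetical item difficulties
    # 1PL probability of a correct response: 1 / (1 + exp(-(theta - b)))
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    x = (rng.random((n_persons, n_items)) < p).astype(int)
    return x, theta, b


def d_scores(x):
    """Assumed form of D-scoring; not taken from the page itself.

    delta_i = 1 - proportion correct on item i
    D_s     = sum_i(x_si * delta_i) / sum_i(delta_i)
    """
    delta = 1.0 - x.mean(axis=0)
    return x @ delta / delta.sum(), delta


x, theta, b = simulate_1pl()
d, delta = d_scores(x)
```

Under these assumptions every D score lies between 0 and 1, and a histogram of `d` gives a frequency distribution of the kind the Figure 2 caption describes. The article itself refers to bootstrap estimation of expected item difficulties, which this sketch does not attempt.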
