Sex Transm Dis. 2024 Nov 1;51(11):719-727.
doi: 10.1097/OLQ.0000000000002047. Epub 2024 Jun 17.

Multiple Imputation of Race and Hispanic Ethnicity in National Surveillance Data for Chlamydia, Gonorrhea, and Syphilis

Tracy Pondo et al.

Abstract

Background: Disease burden of sexually transmitted infections such as chlamydia, gonorrhea, and syphilis is often compared across age categories, sex categories, and race and ethnicity categories. Missing data may prevent researchers from accurately characterizing health disparities between populations. This article describes the methods used to impute race and Hispanic ethnicity in a large national surveillance data set.

Methods: All US cases of chlamydia, gonorrhea, and syphilis (excluding congenital syphilis) reported through the National Notifiable Diseases Surveillance System from the year 2019 were included in the analyses. We used fully conditional specification to impute missing race and Hispanic ethnicity data. After imputation, reported case rates were calculated, by disease, for each race and Hispanic ethnicity category using Vintage 2019 Population and Housing Unit Estimates from the US Census. We then used case counts from subsets that contained only complete race and Hispanic ethnicity information to investigate if the confidence intervals from the multiply imputed data included the observed number of cases in each race and Hispanic ethnicity category.

Results: Among the 2,553,038 cases reported in 2019, race and Hispanic ethnicity were multiply imputed for 9% of syphilis cases, 22% of gonorrhea cases, and 33% of chlamydia cases. In the subset analyses, every nonzero rate of reported cases was contained within the confidence intervals that were calculated from multiply imputed data.

Conclusions: Confidence intervals that account for the uncertainty of the predictions are an advantage of multiple imputation over complete-case analysis because a realistic variance estimate allows for valid hypothesis testing results.
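To illustrate how a confidence interval can reflect imputation uncertainty, the sketch below pools case counts for one race and Hispanic ethnicity category across several imputed data sets using Rubin's rules, then converts the result to a rate per 100,000 population. The abstract does not state which pooling formulas the authors used, so this is an assumption, as are the Poisson within-imputation variance, the normal-approximation multiplier of 1.96, and the function names `pool_counts` and `rate_ci_per_100k`.

```python
import math
from statistics import mean, variance

Z_95 = 1.96  # normal approximation (assumption; a t reference distribution is also common)


def pool_counts(imputed_counts):
    """Pool one category's case counts from m imputed data sets (Rubin's rules).

    Returns (pooled_count, total_variance).
    """
    m = len(imputed_counts)
    if m < 2:
        raise ValueError("need at least two imputed data sets")
    q_bar = mean(imputed_counts)      # pooled point estimate
    # Assumption: each count is Poisson, so its variance equals the count;
    # the average within-imputation variance is then the mean count.
    u_bar = q_bar
    b = variance(imputed_counts)      # between-imputation variance (m - 1 denominator)
    total_var = u_bar + (1 + 1 / m) * b
    return q_bar, total_var


def rate_ci_per_100k(imputed_counts, population):
    """Return (rate, lower, upper) per 100,000 population with a 95% interval."""
    q_bar, total_var = pool_counts(imputed_counts)
    half_width = Z_95 * math.sqrt(total_var)
    scale = 100_000 / population
    return (q_bar * scale, (q_bar - half_width) * scale, (q_bar + half_width) * scale)


if __name__ == "__main__":
    # Hypothetical counts from three imputed data sets and a hypothetical population.
    rate, lower, upper = rate_ci_per_100k([100, 110, 90], 1_000_000)
    print(f"{rate:.2f} per 100,000 (95% CI {lower:.2f} to {upper:.2f})")
```

The between-imputation term widens the interval when the imputed data sets disagree. A complete-case analysis has no such term.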

Conflict of interest statement

Conflict of interest and sources of funding: None declared.

Figures

Figure 1:
US chlamydia cases reported to NNDSS in 2019 per 100,000 population from 15 imputed data sets with 95% confidence intervals vs. rates for cases with reported race and Hispanic ethnicity. AIAN, non-Hispanic (NH) American Indian or Alaska Native; ASIAN, NH-Asian; BLACK, NH-Black or African American; HISP, Hispanic/Latino; MULTI, NH-Multiracial; NAHAW, NH-Native Hawaiian or Other Pacific Islander; WHITE, NH-White
Figure 2:
US gonorrhea cases reported to NNDSS in 2019 per 100,000 population from 15 imputed data sets with 95% confidence intervals vs. rates for cases with reported race and Hispanic ethnicity. AIAN, non-Hispanic (NH) American Indian or Alaska Native; ASIAN, NH-Asian; BLACK, NH-Black or African American; HISP, Hispanic/Latino; MULTI, NH-Multiracial; NAHAW, NH-Native Hawaiian or Other Pacific Islander; WHITE, NH-White
Figure 3:
US primary and secondary syphilis cases reported to NNDSS in 2019 per 100,000 population from 15 imputed data sets with 95% confidence intervals vs. rates for cases with reported race and Hispanic ethnicity. AIAN, non-Hispanic (NH) American Indian or Alaska Native; ASIAN, NH-Asian; BLACK, NH-Black or African American; HISP, Hispanic/Latino; MULTI, NH-Multiracial; NAHAW, NH-Native Hawaiian or Other Pacific Islander; WHITE, NH-White
Figure 4:
Width of the 95% confidence interval compared with the proportion of cases in each geographic area with missing race and Hispanic ethnicity data, for chlamydia cases reported to NNDSS in 2019. The ratio on the x-axis is the width of the 95% confidence interval divided by the reported chlamydia cases per 100,000 population. Confidence intervals for chlamydia cases per 100,000 population were calculated from 15 imputed data sets for each of 50 states. A) For race and Hispanic ethnicity categories with the largest number of cases, most ratios do not exceed 1.0. B) For race and Hispanic ethnicity categories with the smallest number of cases, most ratios exceed 1.0.
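The ratio described in the Figure 4 caption can be computed as in the short sketch below. The function name `ci_width_ratio` and the example numbers are hypothetical and are not taken from the article.

```python
def ci_width_ratio(lower, upper, rate):
    """Width of a confidence interval divided by the point estimate of the rate.

    All three arguments are expressed per 100,000 population.
    """
    if rate <= 0:
        raise ValueError("ratio is undefined for a zero or negative rate")
    return (upper - lower) / rate


if __name__ == "__main__":
    # Hypothetical values: a rate of 100 with an interval from 90 to 110.
    print(ci_width_ratio(90.0, 110.0, 100.0))
```

A ratio above 1.0 means the interval is wider than the rate itself, which the caption reports as typical for the categories with the fewest cases.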
