JAMA Surg. 2024 Jan 1;159(1):87-95.
doi: 10.1001/jamasurg.2023.5695.

Demographic Representation in 3 Leading Artificial Intelligence Text-to-Image Generators

Rohaid Ali et al. JAMA Surg. 2024.

Abstract

Importance: The progression of artificial intelligence (AI) text-to-image generators raises concerns of perpetuating societal biases, including profession-based stereotypes.

Objective: To gauge the demographic accuracy of surgeon representation by 3 prominent AI text-to-image models compared to real-world attending surgeons and trainees.

Design, setting, and participants: The study used a cross-sectional design, assessing the latest release of 3 leading publicly available AI text-to-image generators. Seven independent reviewers categorized AI-produced images. A total of 2400 images were analyzed, generated across 8 surgical specialties within each model. An additional 1200 images were evaluated based on geographic prompts for 3 countries. The study was conducted in May 2023. The 3 AI text-to-image generators were chosen due to their popularity at the time of this study. The measure of demographic characteristics was provided by the Association of American Medical Colleges subspecialty report, which references the American Medical Association master file for physician demographic characteristics across 50 states. Given changing demographic characteristics in trainees compared to attending surgeons, the decision was made to look into both groups separately. Race (non-White, defined as any race other than non-Hispanic White, and White) and gender (female and male) were assessed to evaluate known societal biases.

Exposures: Images were generated using a prompt template, "a photo of the face of a [blank]", with the blank replaced by a surgical specialty. Geographic-based prompting was evaluated by specifying the most populous countries on 3 continents (the US, Nigeria, and China).
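The prompt construction described above can be sketched in a few lines. This is only an illustration: the template string and the image total come from the abstract, while the specialty names are hypothetical placeholders (the abstract does not list the 8 specialties) and the even split of images across models and specialties is an assumption.

```python
# Prompt template quoted in the abstract; "[blank]" is replaced by a specialty.
TEMPLATE = "a photo of the face of a [blank]"

# Hypothetical placeholders: the abstract does not name the 8 specialties.
SPECIALTIES = [f"specialty_{i}" for i in range(1, 9)]
MODELS = ["DALL-E 2", "Midjourney", "Stable Diffusion"]
TOTAL_IMAGES = 2400  # stated in the abstract


def build_prompt(specialty: str) -> str:
    """Fill the template's blank with a surgical specialty."""
    return TEMPLATE.replace("[blank]", specialty)


def images_per_cell(total: int, n_models: int, n_specialties: int) -> int:
    """Images per model-specialty pair, assuming an even split (an assumption)."""
    cells = n_models * n_specialties
    if total % cells != 0:
        raise ValueError("total does not divide evenly across cells")
    return total // cells


if __name__ == "__main__":
    for specialty in SPECIALTIES[:2]:
        print(build_prompt(specialty))
    print(images_per_cell(TOTAL_IMAGES, len(MODELS), len(SPECIALTIES)))
```

Under the even-split assumption, each model would contribute 800 images. The wording of the geographic prompts is not given in the abstract, so it is left out of the sketch.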

Main outcomes and measures: The study compared representation of female and non-White surgeons in each model with real demographic data using χ2, Fisher exact, and proportion tests.
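As a rough illustration of what a proportion test does, the sketch below runs a one-sample, two-sided z-test of an observed proportion against a fixed reference proportion. The abstract does not say which variant of the proportion test was used, so the normal approximation here is an assumption, as are the function name and the sample size of 800 images per model (inferred from an even split of 2400 images across 3 models). It is not expected to reproduce the reported P values exactly.

```python
import math


def one_sample_proportion_ztest(successes: int, n: int, p0: float):
    """Two-sided z-test of an observed proportion against a reference p0.

    Uses the normal approximation with the standard error computed under
    the null hypothesis. Returns (z, p_value).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    p_hat = successes / n
    se = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 127 of 800 images (about 15.9%) judged female,
    # compared with the 14.7% reference for attending surgeons.
    z, p = one_sample_proportion_ztest(127, 800, 0.147)
    print(f"z = {z:.2f}, p = {p:.3f}")
```

With these assumed counts the difference is not significant at the .05 level, which is in line with the direction of the DALL-E 2 result for attending surgeons. A proportion near 0% against the same reference would give a very small P value.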

Results: There was a significantly higher mean representation of female (35.8% vs 14.7%; P < .001) and non-White (37.4% vs 22.8%; P < .001) surgeons among trainees than attending surgeons. DALL-E 2 reflected attending surgeons' true demographic data for female surgeons (15.9% vs 14.7%; P = .39) and non-White surgeons (22.6% vs 22.8%; P = .92) but underestimated trainees' representation for both female (15.9% vs 35.8%; P < .001) and non-White (22.6% vs 37.4%; P < .001) surgeons. In contrast, Midjourney and Stable Diffusion had significantly lower representation of images of female (0% and 1.8%, respectively; P < .001) and non-White (0.5% and 0.6%, respectively; P < .001) surgeons than DALL-E 2 or true demographic data. Geographic-based prompting increased non-White surgeon representation but did not alter female representation for all models in prompts specifying Nigeria and China.

Conclusion and relevance: In this study, 2 leading publicly available text-to-image generators amplified societal biases, depicting over 98% of surgeons as White and male. While 1 of the models depicted demographic characteristics comparable to those of real attending surgeons, all 3 models underestimated trainee representation. The study suggests the need for guardrails and robust feedback systems to keep AI text-to-image generators from magnifying stereotypes in professions such as surgery.

Conflict of interest statement

Conflict of Interest Disclosures: Dr Groff reported personal fees from Nuvasive Spine and SpineArt outside the submitted work. No other disclosures were reported.

Figures

Figure 1. Differences in Demographic Representation of Surgeons for DALL-E 2, Midjourney, and Stable Diffusion From True Demographic Data
AI indicates artificial intelligence. (a) Significant at P < .05 by proportion tests.
Figure 2. Representative Images for Depictions of Surgeons by DALL-E 2, Midjourney, and Stable Diffusion
Five representative images for 3 of 8 surgical specialties studied were selected at random for each model. Images for the remaining 5 specialties are reported in eFigure 1 in Supplement 1.
Figure 3. Representation of Gender and Race by DALL-E 2, Midjourney, and Stable Diffusion
An increase in the percentage of female attending surgeons in a specialty was associated with a rise in the percentage of female trainees (+1.3% for every 1.0% increase in female attending surgeons, P = .003), while there was no significant association between the percentage of non-White trainees and non-White attending surgeons within the specialties (P = .46).
