Health-Related Content in Transformer-Based Deep Neural Network Language Models: Exploring Cross-Linguistic Syntactic Bias
- PMID: 35773848
- DOI: 10.3233/SHTI220702
Abstract
This paper explores a methodology for bias quantification in transformer-based deep neural network language models for Chinese, English, and French. When queried with health-related mythbusters on COVID-19, we observe a bias that is not semantic or encyclopaedic in nature, but rather syntactic, as predicted by theoretical insights on structural complexity. Our results highlight the need to create health-communication corpora as training sets for deep learning.
Keywords: COVID-19; Corpora; Knowledge Reproduction; Language Models; Natural Language Processing.
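The paper does not include code, so the following is only a minimal sketch of how health-related mythbusters could be posed as cloze queries to a multilingual masked language model and the model's preferences compared across languages. The model name, prompts, and scoring are illustrative assumptions, not the authors' published protocol.

```python
# Illustrative sketch: probe a multilingual masked LM with COVID-19
# mythbuster-style statements and inspect which fillers it prefers.
# Model choice and prompts are assumptions for demonstration only.
from transformers import pipeline

# A masked language model covering Chinese, English, and French.
fill_mask = pipeline("fill-mask", model="bert-base-multilingual-cased")

# Mythbuster-style statements with a masked polarity word (illustrative).
prompts = {
    "en": "Drinking alcohol [MASK] protect you against COVID-19.",
    "fr": "Boire de l'alcool ne vous protège [MASK] contre la COVID-19.",
}

for lang, prompt in prompts.items():
    # The top candidate fillers and their scores show what the model
    # "prefers" to assert, independently of factual correctness.
    for candidate in fill_mask(prompt, top_k=3):
        print(lang, candidate["token_str"], round(candidate["score"], 3))
```

Comparing how such scores shift with the syntactic form of the query (e.g. negated versus affirmative phrasings) is one plausible way to separate syntactic bias from encyclopaedic knowledge, in the spirit of the abstract.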
