Sci Data. 2024 May 10;11(1):482.
doi: 10.1038/s41597-024-03321-0.

A tree-based corpus annotated with Cyber-Syndrome, symptoms, and acupoints

Wenxi Wang et al. Sci Data. 2024.

Abstract

Prolonged and excessive interaction with cyberspace threatens people's health and can lead to Cyber-Syndrome, which covers both physiological and psychological disorders. This paper aims to create a tree-shaped gold-standard corpus, designated CS-A, that annotates Cyber-Syndrome, its clinical manifestations, and the acupoints that can alleviate its symptoms or signs. The CS-A corpus defines six entities and relations subject to annotation. In total, 448 texts were annotated manually. After three rounds of updating the annotation guidelines, the inter-annotator agreement (IAA) improved significantly, reaching 86.05%. The CS-A corpus is intended to raise awareness of Cyber-Syndrome and draw attention to its subtle impact on people's health. Annotated corpora also support the development of natural language processing technology: model experiments can be built on this corpus, such as optimizing and improving models for discontinuous entity recognition and nested entity recognition. The CS-A corpus has been uploaded to figshare.
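The abstract reports an IAA score but does not say how it is computed. As a rough illustration only, the sketch below uses exact-match pairwise F1, a common agreement measure for entity annotation; this choice of metric is an assumption, not the authors' confirmed method. The function name `pairwise_f1`, the tuple layout, the labels, and the sample annotations are all hypothetical.

```python
from typing import Set, Tuple

# Hypothetical annotation record: (document id, start offset, end offset, label).
Annotation = Tuple[str, int, int, str]


def pairwise_f1(reference: Set[Annotation], other: Set[Annotation]) -> float:
    """Exact-match F1 between two annotators, treating `reference` as gold.

    Assumption: two annotations agree only when document, span, and label
    are all identical. The paper may use a different agreement measure.
    """
    if not reference and not other:
        return 1.0  # Both annotators marked nothing: full agreement.
    matched = len(reference & other)
    precision = matched / len(other) if other else 0.0
    recall = matched / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up annotations for one document; the labels are illustrative.
annotator_1 = {
    ("doc1", 0, 8, "Symptom"),
    ("doc1", 15, 27, "Cyber-Syndrome"),
    ("doc1", 40, 47, "Acupoint"),
}
annotator_2 = {
    ("doc1", 0, 8, "Symptom"),
    ("doc1", 15, 27, "Disease"),  # Same span, different label: a disagreement.
    ("doc1", 40, 47, "Acupoint"),
}

score = pairwise_f1(annotator_1, annotator_2)
print(f"Pairwise F1 agreement: {score:.4f}")
```

Under exact matching, a span with the right boundaries but a different label counts as a miss for both annotators, which is one reason tightening annotation guidelines can raise such a score across rounds.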

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Examples of entity annotation: (a) overlapping entities; (b) discontinuous entities; (c) nested entities; (d) adjectives.
Fig. 2
Examples of relationship annotation. (a) The first case of annotating the causes and Is_synon relationships in the same sentence. (b) The second case of annotating the causes and Is_synon relationships in the same sentence. (c) An example of annotation when Disease/Cyber-Syndrome entities and their abbreviations appear in different sentences. (d) An example of annotation when Disease/Cyber-Syndrome entities and their anaphors appear in different sentences.
Fig. 3
Statistics on the final number of annotated entities and relationships: (a) Number of entities. (b) Number of relations.
Fig. 4
Examples of supplementary rules after inconsistency annotation analysis. (a) An example of supplementary rules for disease annotation. (b) An example of annotation rules when symptoms/signs and their technical terms are present at the same time. (c) An example of syndromes/signs rules for (Syndrome_1 ∪ Syndrome_2 ∪ ... ∪ Syndrome_N)(body part 1 ∪ body part 2 ∪ ... ∪ body part M). (d) An example of syndromes/signs rules for (Syndrome_1 ∪ Syndrome_2 ∪ ... ∪ Syndrome_N)body part. (e) An example of syndromes/signs rules for Syndrome(body part 1 ∪ body part 2 ∪ ... ∪ body part M).
Fig. 5
Comparison of annotation agreement scores for each entity and relation. (a) Three-round entity annotation agreement scores. (b) Three-round relation annotation agreement scores.
Fig. 6
Example from the CS-A corpus. Taking the Cyber-Syndrome represented by D001251 as an example, the four acupoints tianyou, sizhukong, luoque, and yangbai can relieve its symptoms.
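The figure captions above distinguish overlapping, discontinuous, and nested entities. The sketch below shows one generic way to represent such mentions as lists of character spans and to tell the cases apart. It is not the CS-A annotation format; the sample sentence, the span offsets, and the helper names `covered`, `relation`, and `surface` are hypothetical.

```python
from typing import List, Set, Tuple

# A mention is a list of half-open character spans [start, end).
# More than one span means the mention is discontinuous.
Span = Tuple[int, int]
Mention = List[Span]


def covered(mention: Mention) -> Set[int]:
    """Return the set of character positions the mention covers."""
    return {i for start, end in mention for i in range(start, end)}


def is_discontinuous(mention: Mention) -> bool:
    """A mention made of several separate spans is discontinuous."""
    return len(mention) > 1


def relation(a: Mention, b: Mention) -> str:
    """Classify two mentions as disjoint, nested, or overlapping."""
    chars_a, chars_b = covered(a), covered(b)
    if not chars_a & chars_b:
        return "disjoint"
    if chars_a <= chars_b or chars_b <= chars_a:
        return "nested"
    return "overlapping"


def surface(text: str, mention: Mention) -> str:
    """Join the text of each span to recover the mention's surface form."""
    return " ".join(text[start:end] for start, end in mention)


# Made-up sentence fragment, not taken from the corpus.
text = "pain in the neck and shoulder"
neck_pain: Mention = [(0, 16)]                # "pain in the neck"
shoulder_pain: Mention = [(0, 11), (21, 29)]  # "pain in the" + "shoulder"
neck: Mention = [(12, 16)]                    # "neck"

print(surface(text, shoulder_pain), is_discontinuous(shoulder_pain))
print(relation(neck_pain, shoulder_pain))
print(relation(neck, neck_pain))
```

Comparing covered character positions rather than outer boundaries keeps a discontinuous mention from being reported as containing text that merely falls in the gap between its spans.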
