How Well Do Raters Agree on the Development Stage of Caenorhabditis elegans?

doi:10.1371/journal.pone.0132365

PLoS One. 2015 Jul 14;10(7):e0132365.

doi: 10.1371/journal.pone.0132365. eCollection 2015.

Annabel A Ferguson¹, Richard A Bilonick², Jeanine M Buchanich³, Gary M Marsh³, Alfred L Fisher⁴

Affiliations

¹ Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America.
² Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America; Department of Ophthalmology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America; Department of Orthodontics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America.
³ Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America; Center for Occupational Biostatistics and Epidemiology and Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America.
⁴ Division of Geriatrics, Gerontology, and Palliative Medicine, Department of Medicine, UTHSCSA, San Antonio, Texas, United States of America; Center for Healthy Aging, UTHSCSA, San Antonio, Texas, United States of America; San Antonio GRECC, STVAHCS, San Antonio, Texas, United States of America.

PMID: 26172989
PMCID: PMC4501796
DOI: 10.1371/journal.pone.0132365

Annabel A Ferguson et al. PLoS One. 2015.

Abstract

The assessment of inter-rater reliability is infrequently addressed in Caenorhabditis elegans research, despite the existence of sophisticated statistical methods and the field's strong interest in obtaining reliable and accurate data. This study applies statistical modeling as a robust means of analyzing how well worm researchers measure the stage of worm development, in terms of the two independent factors that comprise "agreement": (1) accuracy, representing trueness, a lack of systematic differences, or lack of bias, and (2) precision, representing reliability, or the extent to which random differences are small. In our study, multiple raters assessed the same sample of worms to determine the developmental stage of each animal, and we collected data linking each scorer with their assessment of each worm. To describe the agreement of the raters, we developed a structural equation model with latent variables and thresholds, which assumes that all the raters are jointly scoring each worm. This common factor model quantifies the two aspects of agreement separately. The stage-specific thresholds examine accuracy and characterize the relative biases of each rater during the scoring process. The factor loadings for each rater examine precision and characterize the random error of that rater. Within our group, we found that overall agreement was good, although certain adjustments by particular raters would have decreased systematic differences. Hence, the use of developmental stage as an experimental outcome can be both accurate and precise.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Fig 1. A common factor ordinal model to analyze rater agreement.**
This model describes the ordinal measurements (R₁, R₂, and R₃) made by three raters (1, 2, and 3), which are observed (manifest) variables denoted by squares. These variables are related to the variables μ and χ, which are latent, meaning that they are not directly observable but are included in the model because they underlie the observed values. The latent variable μ corresponds to the true worm stage on a continuous scale. It is defined as normally distributed with a mean of zero and a standard deviation of one; this standard deviation is represented by the curved arrow labelled "1" adjacent to μ. Each rater judges the stage of worm development on his or her own continuous scale, shown as the latent variables χ₁, χ₂, and χ₃. Each rater's unknown continuous scale is a linear function of μ, as indicated by the single-arrow paths pointing from μ to each χ. The slopes (path coefficients) of these linear functions are denoted by ρ₁, ρ₂, and ρ₃, and the intercepts are equal to zero. Because each χ has unit variance, these functions leave χ₁, χ₂, and χ₃ with residual variances of 1−ρ₁², 1−ρ₂², and 1−ρ₃², respectively, which are denoted by the labelled curved arrow beside each variable. The directed path from each rater's continuous scale, χ, to the observed ordinal measurement, R, is nonlinear, as denoted by the sinusoidal path. The nonlinear relationship can be described as a threshold model in which the thresholds (c_i1, c_i2, c_i3, and c_i4) for rater i control the marginal probability of each observed ordinal measurement (denoted by P(L1), P(L2), P(dauer), P(L3), and P(L4)), under the assumption that each rater's continuous judgment is normally distributed with a mean of zero and a standard deviation of one.
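
As a rough illustration of the threshold step described in this caption, the sketch below converts four thresholds on a rater's standard normal scale into the five marginal stage probabilities, and maps a single continuous judgment to its stage. The function names and the threshold values are hypothetical and chosen only for illustration; they are not estimates from the study, and the study's own fitting procedure is not reproduced here.

```python
from bisect import bisect_right
from statistics import NormalDist

STAGES = ["L1", "L2", "dauer", "L3", "L4"]


def stage_probabilities(thresholds):
    """Return P(stage) for one rater given four increasing thresholds."""
    cuts = list(thresholds)
    if len(cuts) != len(STAGES) - 1 or cuts != sorted(cuts):
        raise ValueError("expected four thresholds in increasing order")
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Cumulative probabilities at -inf, at each threshold, and at +inf.
    cumulative = [0.0] + [std_normal.cdf(c) for c in cuts] + [1.0]
    return {
        stage: cumulative[k + 1] - cumulative[k]
        for k, stage in enumerate(STAGES)
    }


def stage_for_score(score, thresholds):
    """Map a continuous judgment (chi) to the ordinal stage it falls in."""
    return STAGES[bisect_right(list(thresholds), score)]


# Hypothetical thresholds for illustration only; not estimates from the study.
example_thresholds = [-1.0, -0.3, 0.2, 0.9]
probabilities = stage_probabilities(example_thresholds)
for stage in STAGES:
    print(f"P({stage}) = {probabilities[stage]:.3f}")
print(stage_for_score(0.0, example_thresholds))
```

Under this reading, shifting one rater's thresholds relative to another's changes how often that rater assigns each stage, which is how the thresholds can express systematic differences between raters.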

**Fig 2. Still image from the sample of worms scored by the raters.**
Each rater was assigned the same sample of worms to score for developmental stage. The worms were shown in both a 40X magnification image (illustrated) as well as a short video recording of each animal. Each worm was identified by a number to facilitate each rater evaluating identical animals in the same order.

**Fig 3. Head-to-head comparison of ratings from pairs of observers.**
The developmental stage ratings of the 60 animals are compared for each pair of raters in a pairwise scatter plot matrix. Each axis runs from 1 through 5 and represents the animal stage, with 1 representing L1, 2 representing L2, 3 representing dauer, 4 representing L3, and 5 representing L4. The green line represents perfect agreement between the two observers, and points along this line represent animals that were given the same stage by both observers. In contrast, points above or below the line represent disagreement between the raters. The ordinal values are slightly "jittered" to make it easier to discern the varying density of the ratings.
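
The kind of pairwise comparison this caption describes can be sketched in a few lines. The example below computes the fraction of animals on the line of perfect agreement for each pair of raters and applies a small random jitter of the sort used for plotting. The ratings, rater names, jitter width, and function names are all made up for illustration, and exact agreement is only a simple descriptive summary, not the common factor model used in the study.

```python
import random
from itertools import combinations


def jitter(ratings, width=0.15, seed=0):
    """Add small uniform noise so overlapping ordinal points separate."""
    rng = random.Random(seed)
    return [r + rng.uniform(-width, width) for r in ratings]


def pairwise_exact_agreement(ratings_by_rater):
    """Fraction of animals given the same stage by each pair of raters."""
    result = {}
    for a, b in combinations(sorted(ratings_by_rater), 2):
        xs, ys = ratings_by_rater[a], ratings_by_rater[b]
        result[(a, b)] = sum(x == y for x, y in zip(xs, ys)) / len(xs)
    return result


# Toy ratings coded 1=L1, 2=L2, 3=dauer, 4=L3, 5=L4 (hypothetical data).
toy = {
    "rater_1": [1, 2, 3, 4, 5, 5],
    "rater_2": [1, 2, 4, 4, 5, 5],
    "rater_3": [1, 3, 3, 4, 4, 5],
}
agreement = pairwise_exact_agreement(toy)
jittered = jitter(toy["rater_1"])
for pair, fraction in agreement.items():
    print(pair, round(fraction, 3))
```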

**Fig 4. Heat map showing the pairwise ratio of the residual error estimates for all raters.**
The residual error estimate for the rater indicated in each row was divided by that of the rater in each column, and the resulting ratios were displayed as a heat map to highlight similarities and differences between raters. Each ratio is shown as the number inside the colored box. The brightness of the color indicates the relative strength of the difference between raters, with red representing a ratio greater than one and green representing a ratio less than one.
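
The row-over-column arithmetic behind such a heat map can be sketched as follows. The residual error values and the function name are hypothetical and are not taken from the study; only the division convention (row rater divided by column rater) follows the caption.

```python
def ratio_matrix(residual_errors):
    """Entry [i][j] is the row rater's error divided by the column rater's."""
    return [
        [row_error / col_error for col_error in residual_errors]
        for row_error in residual_errors
    ]


# Hypothetical residual error estimates for three raters.
errors = [0.20, 0.25, 0.40]
ratios = ratio_matrix(errors)
for row in ratios:
    print("  ".join(f"{value:.2f}" for value in row))
```

With this convention the diagonal is always one, and the entries at [i][j] and [j][i] are reciprocals, so a red cell above the diagonal is mirrored by a green cell below it.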

**Fig 5. Rater-specific thresholds estimated using the common factor model.**
The thresholds classify the worms into the L1, L2, dauer, L3, or L4 stages. Each stage represents an abstract concept encompassing size, morphologic, and behavioral features of the worm that can be perceived by a rater relative to each threshold. Threshold 1 (A) separates the L1 and L2 categories, threshold 2 (B) separates the L2 and dauer categories, threshold 3 (C) separates the dauer and L3 categories, and threshold 4 (D) separates the L3 and L4 categories.

**Fig 6. Heat map showing differences between raters for the predicted proportion of worms assigned to each stage of development.**
The brightness of the color indicates the relative strength of the difference between raters, with red as positive and green as negative. Results are shown as column minus row for each of raters 1 through 7.

**Fig 7. Comparison of the common factor model with rater behavior.**
Shown are bar graphs depicting, for each rater, the percentage of animals that the estimated common factor model predicts will be assigned to each developmental stage (left column) and the percentage of animals that the rater actually assigned to each stage (right column).

Cited by

Lee KM, Lee EJ, Kim TW, Kim H. Comparison of the Abilities of SD-OCT and SS-OCT in Evaluating the Thickness of the Macular Inner Retinal Layer for Glaucoma Diagnosis. PLoS One. 2016 Jan 26;11(1):e0147964. doi: 10.1371/journal.pone.0147964. eCollection 2016. PMID: 26812064.
Velazco-Cruz L, Ross JA. Genetic architecture and temporal analysis of Caenorhabditis briggsae hybrid developmental delay. PLoS One. 2022 Aug 11;17(8):e0272843. doi: 10.1371/journal.pone.0272843. eCollection 2022. PMID: 35951524.

Grants and funding

R01 ES017761/ES/NIEHS NIH HHS/United States
