BMC Psychiatry. 2019 Jul 3;19(1):205.

doi: 10.1186/s12888-019-2171-y.

The reproducibility of psychiatric evaluations of work disability: two reliability and agreement studies

Regina Kunz¹, David Y von Allmen², Renato Marelli³ ⁴, Ulrike Hoffmann-Richter⁵ ⁶, Joerg Jeger⁷, Ralph Mager³ ⁸, Etienne Colomb⁹, Heinz J Schaad¹⁰, Monica Bachmann², Nicole Vogel², Jason W Busse¹¹ ¹², Martin Eichhorn¹³, Oskar Bänziger¹⁴, Thomas Zumbrunn², Wout E L de Boer², Katrin Fischer¹⁵

Affiliations

¹ Department of Clinical Research, Evidence-based Insurance Medicine, University of Basel, University Hospital, 4031, Basel, Switzerland. regina.kunz@usb.ch.
² Department of Clinical Research, Evidence-based Insurance Medicine, University of Basel, University Hospital, 4031, Basel, Switzerland.
³ Swiss Society of Insurance Psychiatry, SGVP, 4051, Basel, Switzerland.
⁴ Private Practice for Psychiatry, 4051, Basel, Switzerland.
⁵ Swiss National Accident Insurance Funds, 6004, Luzern, Switzerland.
⁶ Private Practice for Psychiatry and Psychotherapy, 6004, Lucerne, Switzerland.
⁷ Institute of Medical Disability Evaluations of Central Switzerland, 6003, Lucerne, Switzerland.
⁸ Psychiatric University Hospital Basel, 4002, Basel, Switzerland.
⁹ French-Speaking Swiss Association of Practitioners in Medical Expertise (ARPEM), 1025, St Sulpice, Switzerland.
¹⁰ Institute for Medical Disability Evaluation Interlaken, 3800, Unterseen, Switzerland.
¹¹ Department of Anaesthesia, McMaster University, Hamilton, L8S 4K1, ON, Canada.
¹² Department of Health Research Methods, Evidence and Impact, McMaster University Hamilton, Hamilton, L8S 4K1, ON, Canada.
¹³ Private Practice for Psychiatry, 4057, Basel, Switzerland.
¹⁴ Zuerich Office of the Swiss National Disability Insurance, 8005, Zürich, Switzerland.
¹⁵ Institute Humans in Complex Systems, School of Applied Psychology, University of Applied Sciences Northwestern Switzerland, 4600, Olten, Switzerland.

PMID: 31266488
PMCID: PMC6607597
DOI: 10.1186/s12888-019-2171-y

Regina Kunz et al. BMC Psychiatry. 2019.

Abstract

Background: Expert psychiatrists conducting work disability evaluations often disagree on work capacity (WC) when assessing the same patient. More structured and standardised evaluations focusing on function could improve agreement. The RELY studies aimed to establish the inter-rater reproducibility (reliability and agreement) of 'functional evaluations' in patients with mental disorders applying for disability benefits and to compare the effect of limited versus intensive expert training on reproducibility.

Methods: We performed two multi-centre reproducibility studies on standardised functional WC evaluation (RELY 1 and RELY 2). Trained psychiatrists interviewed 30 patients in RELY 1 and 40 patients in RELY 2 and determined WC using the Instrument for Functional Assessment in Psychiatry (IFAP). Three psychiatrists per patient estimated WC from videotaped evaluations. We analysed reliability (intraclass correlation coefficients [ICC]) and agreement ('standard error of measurement' [SEM] and proportions of comparisons within prespecified limits) between expert evaluations of WC. The primary outcome was WC in alternative work (WC_{alternative.work}), rated from 100% to 0%. Secondary outcomes were WC in the last job (WC_{last.job}), rated from 100% to 0%; patients' perceived fairness of the evaluation (scale 10-0, higher is better); and usefulness to psychiatrists.

Results: Inter-rater reliability for WC_{alternative.work} was fair in RELY 1 (ICC 0.43; 95% CI 0.22-0.60) and RELY 2 (ICC 0.44; 0.25-0.59). Agreement was low in both studies: the 'standard error of measurement' for WC_{alternative.work} was 24.6 percentage points (20.9-28.4) in RELY 1 and 19.4 percentage points (16.9-22.0) in RELY 2. With a 'maximum acceptable difference' of 25 percentage points in WC_{alternative.work} between two experts, 61.6% of comparisons in RELY 1 and 73.6% of comparisons in RELY 2 fell within this limit. Post-hoc secondary analysis for RELY 2 versus RELY 1 showed a significant change in SEM_{alternative.work} (-5.2 percentage points WC_{alternative.work} [95% CI -9.7 to -0.6]) and in the proportion of differences ≤ 25 percentage points WC_{alternative.work} between two experts (p = 0.008). Patients perceived the functional evaluation as fair (RELY 1: mean 8.0; RELY 2: 9.4), and psychiatrists perceived it as useful.

Conclusions: Evidence from non-randomised studies suggests that intensive training in functional evaluation may increase agreement on WC between experts, but agreement still fell short of stakeholders' expectations. Training did not alter reliability. Isolated efforts in training psychiatrists may not suffice to reach the expected level of agreement. A societal discussion about achievable goals, and readiness to consider procedural changes in WC evaluations, may deserve consideration.

Keywords: Disability evaluation; Evidence-based medicine; Observer variation; Reproducibility of results; Return to work; Social security; Work capacity evaluation.
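
The reliability and agreement measures named in the Methods can be illustrated with a minimal sketch. The abstract does not state which ICC form or variance model the authors used, so the sketch below assumes a one-way random-effects ICC(1,1) and takes the SEM as the square root of the within-patient mean square. The function name and the ratings are hypothetical and are not study data.

```python
from statistics import mean


def one_way_icc_and_sem(ratings):
    """Return (ICC(1,1), SEM) for a table of ratings.

    ratings: one list per patient, each holding the same number k of
    expert work-capacity ratings in percent.
    Assumption: one-way random-effects model; the study may have used
    a different model.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 patients with the same number (>= 2) of ratings each")

    grand_mean = mean(x for row in ratings for x in row)
    patient_means = [mean(row) for row in ratings]

    # Between-patient and within-patient mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in patient_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, patient_means) for x in row
    ) / (n * (k - 1))

    icc = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    sem = ms_within ** 0.5  # in the units of the ratings (percentage points)
    return icc, sem


# Hypothetical example: three patients, two experts each.
example = [[10, 30], [50, 70], [80, 100]]
icc, sem = one_way_icc_and_sem(example)
print(f"ICC = {icc:.2f}, SEM = {sem:.1f} percentage points")  # ICC = 0.85, SEM = 14.1
```

The sketch shows why the two measures can diverge: the ICC depends on how much patients differ from one another relative to rater noise, whereas the SEM reflects rater noise alone, in percentage points of work capacity.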

Conflict of interest statement

None of the authors received support from any external organization or company for the submitted work. The authors had no financial relationships in the previous three years with any organizations that might have an interest in the submitted work; after data collection was finished (07/2016), RK became head of the Medical Competence Center of Suva, Lucerne. The authors had no other relationships or activities that could appear to have influenced the submitted work.

Figures

**Fig. 1**
Work capacity ratings in RELY 1. Thirty plots of the four psychiatrists’ ratings of the patients’ overall work capacity in their last job and in alternative work for 30 patients (c01 to c30). The dots on the left in each cell indicate the psychiatrists’ ratings in relation to the patients’ last job and the dots on the right indicate their ratings in relation to the patients’ alternative work. The lines linking the dots represent the changes in the psychiatrists’ ratings. Each psychiatrist has a different colour. Red frames: patients with maximally divergent expert ratings, where psychiatrists disagreed with each other by 100% about the extent of work capacity. This was the case for two patients in relation to their last job, and for five patients in relation to alternative work. For ‘alternative work’, one rating of patient 26 was excluded from the analysis due to a violation of the rating rules.

**Fig. 2**
Agreement between experts for varying levels of ‘maximum acceptable difference’. This figure shows how varying the limit for the ‘maximum acceptable difference’ in WC ratings affects the level of agreement. Agreement is defined as the proportion of comparisons (in percent, values in the bars) for which the WC ratings of any two experts differ by less than a prespecified limit, here the ‘maximum acceptable difference’. We used the expectations from a recent survey among stakeholders to specify the limits for ‘maximum acceptable difference’ (see Table 1 [6]). Illustrative examples from the stakeholder survey [6]: a Treating and expert psychiatrists defined 25 percentage points* in work capacity ratings between two experts as the ‘maximum acceptable difference’. In RELY 1, 61.6% (109/177) of comparisons would fall within this limit versus 73.6% (170/231) of comparisons in RELY 2. b Lawyers, judges and insurers defined 20 percentage points* in work capacity ratings between two experts as the ‘maximum acceptable difference’. In RELY 1, 59.3% (105/177) of comparisons would fall within this limit versus 65.4% (151/231) of comparisons in RELY 2. * Upper limit of the interquartile range (see Table 1).
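
The agreement measure in Fig. 2 can be sketched as a count over all pairs of experts per patient. This is an illustration rather than the authors' analysis code: the function names and the example ratings are hypothetical, and the sketch treats a difference equal to the limit as within it (the abstract writes "≤ 25 percentage points"). The pair counts are derived from the figure captions (four raters per patient, minus the excluded ratings) and appear consistent with the reported denominators of 177 and 231.

```python
from itertools import combinations
from math import comb


def proportion_within_limit(ratings_by_patient, limit):
    """Count pairwise expert comparisons whose ratings differ by <= limit.

    ratings_by_patient: one list of work-capacity ratings (percent) per patient.
    Returns (comparisons within limit, total comparisons).
    """
    within = total = 0
    for ratings in ratings_by_patient:
        for a, b in combinations(ratings, 2):
            total += 1
            if abs(a - b) <= limit:
                within += 1
    return within, total


def n_comparisons(raters_per_patient):
    """Number of expert pairs, given how many ratings each patient has."""
    return sum(comb(k, 2) for k in raters_per_patient)


# Hypothetical single patient rated by four experts: only one of six pairs agrees.
print(proportion_within_limit([[50, 50, 100, 0]], limit=25))  # (1, 6)

# RELY 1: 30 patients with four ratings each, one rating excluded for one patient.
rely1_pairs = n_comparisons([4] * 29 + [3])
# RELY 2: 40 patients; all ratings excluded for one patient, one rating for another.
rely2_pairs = n_comparisons([4] * 38 + [3] + [0])
print(rely1_pairs, rely2_pairs)  # 177 231

# Percentages reported in the caption, recomputed from the counts.
for within, total in [(109, 177), (170, 231), (105, 177), (151, 231)]:
    print(f"{within}/{total} = {100 * within / total:.1f}%")
```

Because each patient contributes up to six pairs, the comparisons are not independent of one another, which any significance test on these proportions would need to take into account.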

**Fig. 3**
Work capacity ratings in RELY 2. Forty plots of the four psychiatrists’ ratings of the patients’ overall work capacity in their last job and in alternative work for 40 patients (c01 to c40). Red frames: patients with maximally divergent ratings, where psychiatrists disagreed with each other by 100% about the extent of work capacity. This was the case for two patients in relation to their last job, and for no patient in relation to alternative work, which was the primary outcome. For ‘alternative work’, all ratings of patient 19 and one rating of patient 23 were excluded from the analysis due to violations of the rating rules.

References

1. International Social Security Association I: Country Profiles. https://www.issa.int/en/country-profiles, last accessed 14.04.2019.
2. OECD. Sickness, disability and work: breaking the barriers. A synthesis of findings across OECD countries. Paris: OECD; 2010.
3. Schandelmaier S, Fischer K, Mager R, Hoffmann-Richter U, Leibold A, Bachmann MS, Kedzia S, Jeger J, Marelli R, Kunz R, et al. Evaluation of work capacity in Switzerland: a survey among psychiatrists about practice and problems. Swiss Med Wkly. 2013;143:w13890.
4. de Boer W, Brage S, Kunz R. Insurance medicine in clinical epidemiological terms: A concept paper for discussion. Dutch J Occup Insurance Med (Tijdschrift voor Bedrijfs- en Verzekeringsgeneeskunde - TBV). 2018;26(2):97–99. doi: 10.1007/s12498-018-0040-0.
5. Spanjer J, Krol B, Brouwer S, Groothoff JW. Sources of variation in work disability assessment. Work. 2010;37(4):405–411.
