W33: The Correlation between Patient Reported Outcomes and Clinician Reported Outcomes

Eric Gemmen, Quintiles Outcome; Katie Zarzar, Roche Products Ltd in the UK; Shital Kamble, Quintiles Outcome; Rebecca Dawsey, TransPerfect: November 6, 2013

Presentation

Purpose

 

Explore the evidence of the degree of correlation between patient reported outcomes (PROs) and clinician reported outcomes (ClinROs), and how this varies by:

  • disease/therapeutic area and measure:
    • disease/symptom presence
    • symptom frequency
    • symptom severity

 

Methods

 

  • A literature review and an analysis of existing patient registry data were conducted to qualitatively assess the degree of correlation between PROs and ClinROs.
  • Statistical measures of correlation and concordance are expressed in terms of Spearman’s rho, Pearson’s r, weighted kappa, and Kendall’s tau.
  • A review of translation and linguistic validation projects involving PROs and ClinROs was also conducted to examine language-related differences and correlations between the scales.
  • The results are organized as follows:
    • Direct Comparison of PRO and ClinRO Responses
    • Relative Impact of Language on Responses to PRO and ClinRO Measures
  • Types of outcome measures are specified:
    • diagnosis, symptom presence (yes/no),
    • symptom frequency
    • symptom severity
  • Specific examples of PRO-ClinRO pairs are provided for the following disease areas:
    • Oncology
    • Depression in Parkinson’s Disease
    • Multiple sclerosis
    • Rheumatoid arthritis
    • Acne, Psoriasis, and Atopic Eczema
    • Dry eye
    • Crohn’s Disease
    • Pediatrics
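
As a rough illustration of the rank-based measures named above, the sketch below computes Spearman’s rho and Kendall’s tau for paired patient and clinician ratings. It uses the tau-b variant, which adjusts for ties; that choice is an assumption, since the summary does not say which variant each cited study used. The function names and the ratings are hypothetical and do not come from any of the studies.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b: concordant minus discordant pairs, adjusted for ties."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied on both ratings: excluded from both factors
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt(
        (untied + ties_x_only) * (untied + ties_y_only)
    )


# Hypothetical severity ratings for five patients (illustration only).
patient_ratings = [3, 1, 4, 2, 5]
clinician_ratings = [2, 1, 4, 3, 5]
print(round(spearman_rho(patient_ratings, clinician_ratings), 3))
print(round(kendall_tau_b(patient_ratings, clinician_ratings), 3))
```

For the same data, Kendall’s tau is typically smaller in magnitude than Spearman’s rho, so values reported on the two scales should not be compared directly.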

 

Limitations

  • This workshop presents a survey of the evidence relating PROs to ClinROs, drawn primarily from the medical literature. As such, detail and results are limited to what is presented in the published research articles.

Concordance of Symptom Presence and Overall Health Status (cont.)

  • Patient Reported Symptoms
  • Physician Reported Symptoms
  • Overall Health Status (via EQ-5D)

 

Strength of concordance (Kendall’s tau) between patient-reported or physician-reported symptoms and overall health status as measured by EQ-5D

  • Patient Reported
    • Fatigue 0.36
    • Nausea 0.19
    • Vomiting 0.13
    • Diarrhea 0.14
    • Constipation 0.17
    • Dyspnea 0.27
    • Appetite Loss 0.28
  • Physician Reported
    • Fatigue 0.24
    • Nausea 0.10
    • Vomiting 0.09
    • Diarrhea 0.05
    • Constipation 0.13
    • Dyspnea 0.15
    • Appetite Loss 0.22

 

Source: Basch (2010). Data from N=467 persons with breast, lung, genitourinary, or gynecologic malignant conditions across a total of 4,034 clinic visits at Memorial Sloan-Kettering Cancer Center, New York.

 

Diagnosis/Symptoms Agreement Example: Rheumatoid Arthritis

 

Agreement and Correlations between Patient-Reported and Physician-Reported Rheumatoid Arthritis Measures

Rows are patient-reported measures; columns are physician-reported measures.

| Patient-reported | SJC | TJC | DAS28 | MD-Global | CDAI | SDAI |
|---|---|---|---|---|---|---|
| SJC | 0.772b | 0.499 | 0.525 | 0.531 | 0.563 | 0.541 |
| TJC | 0.429 | 0.75b | 0.552 | 0.493 | 0.611 | 0.598 |
| RADAI | 0.393 | 0.604 | 0.56 | 0.399a | 0.667 | 0.646 |
| RAPID3 | 0.372 | 0.594 | 0.523 | 0.361a | 0.731 | 0.706 |
| RAPID4 | 0.402 | 0.625 | 0.562 | 0.395a | 0.75 | 0.726 |
| RAPID5 | 0.53 | 0.709 | 0.662 | 0.511a | 0.829 | 0.851 |
| MDHAQ | 0.246d | 0.491 | 0.442 | 0.304a | 0.531 | 0.531 |
| VAS-Global | 0.396 | 0.583 | 0.517 | 0.026c,e | 0.754 | 0.725 |
| VAS-Pain | 0.323 | 0.508 | 0.434 | 0.314a | 0.632 | 0.606 |

 

Source: Amaya-Amaya (2012). All correlations are Spearman’s rho, except: (a) correlation by Kendall’s tau; (b) agreement by Kendall’s W test; (c) agreement by weighted kappa. All P < 0.0001, except (d) P = 0.004 and (e) P = 0.241.

 

Symptom Severity Example: Dry Eyes

  • 162 dry eye subjects, and 48 controls
  • Self-assessment of severity of dry eye completed first
  • Clinicians first completed a clinical assessment, and then a clinician assessment of severity of the subject’s dry eye symptoms (Patients did not discuss their self-assessment with the clinician)

 

Source: Chalmers et al. (2005)

 

Symptom Severity Example: Depression in Parkinson’s Disease (PD)

  • 50 patients diagnosed by a movement disorder specialist with idiopathic PD

 

Scales utilized:

  • PRO
    • Beck Depression Inventory (BDI)
    • Geriatric Depression Scale (GDS)
  • ClinRO
    • Hamilton Depression Rating Scale (HAM-D)
    • Montgomery-Åsberg Depression Rating Scale (MADRS)
  • 25 respondents received the PROs first, and 25 respondents received the ClinROs first

 

Source: Cimino (2011)

 

Overall Results

  • Dry Eyes:
    • Underreporting of severity by clinicians
  • Parkinson’s Disease:
    • Strong correlation (72%) between patient and clinician ratings of depressive symptoms

 

Agreement of Self-assessed and Clinician-assessed Severity Example: Skin Disease (Acne, Psoriasis, and Atopic Eczema)

  • Objective and Study Population
    • A cross-sectional study examined psychological associations of acne, psoriasis, or atopic eczema
      • Comparison: self-assessed versus clinician-assessed (objective) skin disease severity
    • 108 patients from general and specialist dermatology practices:
      • Acne (n=41)
      • Psoriasis (n=47)
      • Atopic eczema (n=20)

 

Ref: Magin et al. 2011

 

Skin Disease: Acne, Psoriasis and Atopic Eczema

  • Objective severity assessment:
    • Leeds technique (Acne),
    • Psoriasis Area and Severity Index (PASI),
    • Six Area Six Sign Atopic Dermatitis (SASSAD) instruments.
    • Continuous scores on these instruments were converted, using accepted cut-points, to “mild,” “moderate,” and “severe”
  • Patients’ self-assessment of disease severity:
    • “mild,” “moderate,” or “severe”
  • Study Considerations and Limitations
    • Patients were recruited from both general practice and specialist dermatology practice, so the findings do not reflect only the perceptions of patients who have been “self-selected” to some extent by referral to specialist or secondary care.
    • Small sample size: pooling agreement across three different diseases may fail to detect variation in agreement between particular skin diseases.

 

Ref: Magin et al. 2011

 

Agreement of Self-reported and Clinician-reported Symptoms Example: Cancer Patients

  • Objective, Design, Patient Population
    • To examine the extent to which patient and clinician symptom scoring and their agreement could contribute to the estimation of overall survival among cancer patients.
    • Retrospective pooled analysis (n=2279) conducted using secondary data from 14 Phase III European Organization for Research and Treatment of Cancer (EORTC) randomized clinical trials (1990-2002).

 

Ref: Quinten et al. 2011

 

Example: Cancer - Baseline Symptom Assessment in the 14 Selected Trials

  • Patient Symptom Burden Assessment: EORTC Quality of Life Core questionnaire (QLQ-C30)
    • The patient rated his or her symptoms on a 4-point ordinal scale:
      • Score of 1 = “not at all,”
      • Score of 2 = “a little,”
      • Score of 3 = “quite a bit,” and
      • Score of 4 = “very much.”
  • Clinician Assessments of Patient Symptoms: National Cancer Institute’s Common Terminology Criteria for Adverse Events (NCI-CTCAE)
    • The clinician rated the patient’s symptoms on a 5-point scale:
      • Score 0 = “none or normal,”
      • Score 1 = “mild,”
      • Score 2 = “moderate,”
      • Score 3 = “severe,” and
      • Score 4 = “life threatening or disabling.”
  • For purposes of comparison, each of the following pairs was considered to be an identical response:
    • EORTC QLQ-C30 score 1 vs NCI-CTCAE score 0;
    • EORTC QLQ-C30 score 2 vs NCI-CTCAE score 1;
    • EORTC QLQ-C30 score 3 vs NCI-CTCAE score 2;
    • EORTC QLQ-C30 score 4 vs NCI-CTCAE scores 3 and 4 combined.
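
The equivalence rule above can be sketched as a small lookup that places both instruments on the patient’s 1 to 4 scale, followed by a count of identical responses. This is only an illustration of the stated pairing; the function names and the sample scores are hypothetical, and the source does not describe how the original analysis was implemented.

```python
# Pairing stated above: CTCAE 0 -> QLQ-C30 1, 1 -> 2, 2 -> 3, and 3 or 4 -> 4.
CTCAE_TO_QLQ = {0: 1, 1: 2, 2: 3, 3: 4, 4: 4}


def harmonize_ctcae(grade):
    """Map a clinician NCI-CTCAE grade (0-4) onto the QLQ-C30 1-4 scale."""
    if grade not in CTCAE_TO_QLQ:
        raise ValueError(f"unexpected CTCAE grade: {grade!r}")
    return CTCAE_TO_QLQ[grade]


def exact_agreement(patient_scores, clinician_grades):
    """Proportion of pairs counted as identical responses under the pairing."""
    if len(patient_scores) != len(clinician_grades):
        raise ValueError("score lists must have the same length")
    matches = sum(
        1
        for patient, grade in zip(patient_scores, clinician_grades)
        if patient == harmonize_ctcae(grade)
    )
    return matches / len(patient_scores)


# Hypothetical baseline scores for six patients (illustration only).
patient_qlq = [1, 2, 3, 4, 4, 2]
clinician_ctcae = [0, 1, 2, 3, 4, 3]
print(exact_agreement(patient_qlq, clinician_ctcae))
```

Because CTCAE grades 3 and 4 collapse into a single category, the harmonized scale cannot distinguish “severe” from “life threatening or disabling” clinician ratings.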

 

Ref: Quinten et al. 2011

 

Example: Cancer - The Mean Scores and 95% Confidence Intervals (CIs) for the Symptoms Pain, Fatigue, Vomiting, Nausea, Diarrhea, and Constipation

  • Clinical Symptom
    • Pain
      • Patient Score (EORTC QLQ-C30), Mean (95% CI): 2.31 (2.26 to 2.36)
      • Clinician Score (NCI-CTCAE), Mean (95% CI): 2.13 (2.07 to 2.18)
    • Fatigue
      • Patient Score (EORTC QLQ-C30), Mean (95% CI): 2.10 (2.05 to 2.15)
      • Clinician Score (NCI-CTCAE), Mean (95% CI): 1.36 (1.33 to 1.40)
    • Vomiting
      • Patient Score (EORTC QLQ-C30), Mean (95% CI): 1.11 (1.08 to 1.14)
      • Clinician Score (NCI-CTCAE), Mean (95% CI): 1.18 (1.15 to 1.21)
    • Nausea
      • Patient Score (EORTC QLQ-C30), Mean (95% CI): 1.38 (1.35 to 1.41)
      • Clinician Score (NCI-CTCAE), Mean (95% CI): 1.20 (1.16 to 1.24)
    • Diarrhea
      • Patient Score (EORTC QLQ-C30), Mean (95% CI): 1.27 (1.23 to 1.31)
      • Clinician Score (NCI-CTCAE), Mean (95% CI): 1.10 (1.08 to 1.12)
    • Constipation
      • Patient Score (EORTC QLQ-C30), Mean (95% CI): 1.50 (1.44 to 1.56)
      • Clinician Score (NCI-CTCAE), Mean (95% CI): 1.11 (1.09 to 1.14)
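
For readers who want to see how a mean with a 95% confidence interval of this kind can be obtained, here is a minimal sketch using a normal approximation. The source does not state how the published intervals were calculated, so the method, the function name, and the sample scores are all assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(scores, level=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(scores)
    centre = mean(scores)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(scores) / sqrt(n)
    return centre, centre - half_width, centre + half_width


# Hypothetical item scores on a 1-4 scale (illustration only).
scores = [1, 2, 3, 4] * 25
centre, lower, upper = mean_with_ci(scores)
print(f"{centre:.2f} ({lower:.2f} to {upper:.2f})")
```

With a pooled sample in the thousands, the standard error is small, which is consistent with the narrow intervals in the table above.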

 

Ref: Quinten et al. 2011

 

Example: Cancer - Correlation (r) and Agreement (k) between Patient (EORTC QLQ-C30) and Clinician (NCI-CTCAE) Scores for the Symptoms Pain, Fatigue, Vomiting, Nausea, Diarrhea, and Constipation

  • Clinician (NCI-CTCAE)
    • Pain: Have you had pain?
      • r: 0.58
      • k (95% Confidence Interval): 0.29 (0.26 to 0.33)
    • Pain: Did pain interfere with your daily activities?
      • r: 0.50
      • k (95% Confidence Interval): 0.27 (0.23 to 0.30)
    • Fatigue: Did you need to rest?
      • r: 0.30
      • k (95% Confidence Interval): 0.07 (0.03 to 0.10)
    • Fatigue: Have you felt weak?
      • r: 0.28
      • k (95% Confidence Interval): 0.07 (0.03 to 0.10)
    • Fatigue: Were you tired?
      • r: 0.30
      • k (95% Confidence Interval): 0.08 (0.04 to 0.11)
    • Vomiting: Have you vomited?
      • r: 0.32
      • k: 0.22 (0.13 to 0.30)
    • Nausea: Have you felt nauseated?
      • r: 0.32
      • k: 0.14 (0.10 to 0.18)
    • Diarrhea: Have you had diarrhea?
      • r: 0.20
      • k: 0.14 (0.07 to 0.20)
    • Constipation: Have you been constipated?
      • r: 0.38
      • k: 0.16 (0.11 to 0.21)
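
The k values above are agreement statistics. As a sketch of how chance-corrected agreement on an ordinal scale can be computed, the code below implements a weighted kappa with linear or quadratic disagreement weights. Whether the cited study used this exact weighting is not stated in the summary, so treat the weighting scheme, the function name, and the sample ratings as assumptions.

```python
def weighted_kappa(rater_a, rater_b, categories, power=1):
    """Weighted kappa for two raters on ordered categories.

    power=1 gives linear weights, power=2 quadratic weights.
    Undefined (division by zero) if expected disagreement is zero,
    for example when both raters use a single category throughout.
    """
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]
    return 1 - observed_disagreement / expected_disagreement


# Hypothetical harmonized scores on the 1-4 scale (illustration only).
patient = [1, 2, 2, 3, 4, 1, 2, 3]
clinician = [1, 1, 2, 2, 4, 1, 1, 3]
print(round(weighted_kappa(patient, clinician, categories=[1, 2, 3, 4]), 3))
```

A kappa near 0 indicates agreement no better than chance even when the correlation r is moderate, which is the pattern seen for the fatigue items above.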

 

Ref: Quinten et al. 2011

 

Agreement of Self-reported and Clinician-reported Symptoms Example: Cancer Patients

  • Limitations
    • No evidence-based consensus exists on how to compare scoring from patient-reported vs. clinician-reported measurements
      • The different purposes of assessment for the EORTC QLQ-C30 and the NCI-CTCAE may explain the low levels of agreement reported between patients and clinicians at baseline.
    • Generalizability: limited to a relatively asymptomatic population

 

Ref: Quinten et al. 2011

 

Summary

  • Although the study examples show only modest agreement between self-reports and clinician reports, the results suggest that clinical studies would benefit from assessing both self-reported and clinician-reported disease severity and symptom burden.
  • Compared to clinician objective severity, self-assessed severity is associated with patients’ psychological wellbeing (Skin Disease Example).
  • Furthermore, patients provide a subjective measure of symptom severity that complements clinician scoring in predicting overall survival (Cancer Example).

 

Impact of Language vs:

  • Level of Education
  • Context
  • Personal Experience
  • Age

 

Examples:

  • Pediatric Populations
    • “Average”
  • “Did freezing of gait contribute to your falling in the past 24 hours?”
  • “Regurgitation”
    • “liquid or food coming up into your throat or mouth”

 

Linguistic Validation of a Subject Diary Card for Patients Diagnosed with Crohn’s Disease - Zulu

  • Source
    • (Also mark “Yes” if any narcotics were used.)
  • Forward Translation
    • (Phinda umake “Yebo” uma kunezidakamizwa ezisetshenzisiwe.)
  • Interview Analysis
    • R1 showed concern about the word “izidakamizwa”.
  • Linguist Feedback
    • The Zulu word in this context may be associated with street/illegal drugs, instead of drugs for medical purposes. The FT and BT are updated to the safer term that merely means "opioids" or “medical drugs”.
  • Updated Forward Translation
    • (Phinda umake u- ”Yebo” uma kuneminye imithi yokwelapha esetshenzisiwe.)
  • Updated Back Translation
    • (Also mark “Yes” if any pain medication was used.)

 

Similar feedback in Afrikaans, Xhosa, and English (South Africa)

 

Linguistic Validation of TAPQoL for Parents of Children under 5 – Italian & Japanese

  • Forward Trans.
    • Coliche?
  • Back Trans.
    • Colics?
  • Clinician Feedback
    • Coliche addominali is more precise.
  • Linguist Feedback
    • Both the terms "coliche" and "coliche addominali" are correct. "Coliche addominali" is more precise and is as understandable as the term "coliche". The FT and BT are revised.
  • Updated Forward Translation
    • Coliche addominali?
  • Updated Back Translation
    • Abdominal colics?

 

  • Forward Trans.
    • 疝痛
  • Back Trans.
    • Colic
  • Clinician Feedback
    • 激しい反復性腹痛 (severe recurrent abdominal pain) 「疝痛」 is a right English term for Japanese physicians, but it seems to be a technical term which is unfamiliar to general people.
  • Linguist Feedback
    • The clinician suggests that the medical term "colic" is not widely known among laymen. However, it's important to include the medical term as well. The FT and BT are revised to include the medical term, with an explanation in parentheses.
  • Updated Forward Translation
    • 疝痛(反復する激しい腹痛)
  • Updated Back Translation
    • Colic (Repetitive severe abdominal pain)

 

Linguistic Validation of TAPQoL for Parents of Children under 5 - Italian

  • Source
    • Colic
  • Forward Translation
    • Coliche addominali?
  • Interview Analysis
    • R2 still shows confusion between stomach ache, abdominal pain and colic. He thinks this question is the same as the previous one. R3 says that colics are quite similar to the symptoms outlined in the previous question, so the two questions should be together. R5 believes that colics are diarrhea episodes. The other respondents have no difficulty.
  • Linguist Feedback
    • The FT and BT are revised to add a very clear explanation so that colic cannot be confused with another condition.
  • Updated Forward Translation
    • Coliche (dolori intensi o crampi nella regione addominale)?
  • Updated Back Translation
    • Colics (acute pains or cramps in the abdominal area)?

 

Linguistic Validation of TAPQoL for Parents of Children under 5 - Japanese

  • Source
    • Colic
  • Forward Translation
    • 疝痛(反復する激しい腹痛)
  • Interview Analysis
    • All the respondents felt odd about the technical term "疝痛."
  • Linguist Feedback
    • The term "colic" is removed from the FT and BT as it is too technical and causes confusion amongst all respondents.
  • Updated Forward Translation
    • 繰り返す激しい腹痛
  • Updated Back Translation
    • Repetitive severe abdominal cramps

 

When to consider…

  • Adaptations
    • ClinRO adapted to PRO
    • PRO adapted to ClinRO
    • Clinician-administered PRO adapted to PRO
    • PRO adapted to Clinician-administered PRO

References

  • Amaya-Amaya J. et al. Arthritis 2012; doi:10.1155/2012/935187.
  • Basch E. The Missing Voice of Patients in Drug Safety Reporting. N Engl J Med 2010; 362(10).
  • Chalmers RL, Begley CG, Edrington T, Caffery B, Nelson D, Snyder C, Simpson T. The Agreement Between Self-Assessment and Clinician Assessment of Dry Eye Severity. Cornea 2005; 24(7).
  • Chen RC et al. Patient-reported Acute Gastrointestinal Symptoms During Concurrent Chemoradiation Treatment for Rectal Cancer. Cancer 2010; 116:1879-86.
  • Cimino CR, Siders CA, Zesiewicz TA. Depressive Symptoms in Parkinson Disease: Degree of Association and Rate of Agreement of Clinician-Based and Self-Report Measures. J Geriatr Psychiatry Neurol 2011; 24:199. DOI: 10.1177/0891988711422525
  • Eddy L. Item Selection in Self-Report Measures for Children and Adolescents with Disabilities: Lessons from Cognitive Interviews. Journal of Pediatric Nursing 2011; 26(6):559-565.
  • Magin PJ, Pond CD, Smith WT, Watson AB, Goode SM. Correlation and agreement of self-assessed and objective skin disease severity in a cross-sectional study of patients with acne, psoriasis, and atopic eczema. Int J Dermatol 2011; 50(12):1486-90.
  • Quinten C, Maringwa J, Gotay CC, Martinelli F, Coens C, Reeve BB, Flechtner H, Greimel E, King M, Osoba D, Cleeland C, Ringash J, Schmucker-Von Koch J, Taphoorn MJ, Weis J, Bottomley A. Patient self-reports of symptoms and clinician ratings as predictors of overall cancer survival. J Natl Cancer Inst. 2011; 103:1851–1858.
  • Spiegel B, MD. Clinical Trial Design and Outcome Measurement. IFFGD 10th International Symposium on Functional GI Disorders. April 14, 2013.

 

The authors would like to thank TransPerfect and Quintiles Outcome for their support. No funding or grant support was received for this project.

 

Conclusions / Discussion

  • Correlation is generally stronger for discrete measures (symptom/disease presence=yes/no) than for continuous measures (severity, scales)
  • Effects of language/culture
  • Further research: Correlations over time