How empathic is your healthcare practitioner? A systematic review and meta-analysis of patient surveys

Background: A growing body of evidence suggests that healthcare practitioners who enhance how they express empathy can improve patient health and reduce medico-legal risk. However, we do not know how consistently healthcare practitioners express adequate empathy. In this study, we addressed this gap by investigating patient ratings of practitioner empathy.

Methods: We conducted a systematic review and meta-analysis of studies that asked patients to rate their practitioners’ empathy using the Consultation and Relational Empathy (CARE) measure. CARE is emerging as the most common and best-validated patient rating of practitioner empathy. We searched MEDLINE, Embase, PsycINFO, CINAHL, Science & Social Science Citation Indexes, the Cochrane Library and PubMed from database inception to March 2016. We excluded studies that did not use the CARE measure. Two reviewers independently screened titles and extracted data on average CARE scores, demographic data for patients and practitioners, and type of healthcare practitioners.

Results: Sixty-four independent studies within 51 publications had sufficient data to pool. The average CARE score was 40.48 (95% CI, 39.24 to 41.72). This ranks in the bottom 5th percentile in comparison with scores collected by CARE developers. Longer consultations (n = 13) scored 15% higher (42.60, 95% CI 40.66 to 44.54) than shorter (n = 9) consultations (34.93, 95% CI 32.63 to 37.24). Studies with mostly (>50%) female practitioners (n = 6) showed 16% higher empathy scores (42.77, 95% CI 38.98 to 46.56) than those with mostly (>50%) male (n = 6) practitioners (34.84, 95% CI 30.98 to 38.71). There were statistically significant (P = 0.032) differences between types of providers (allied health professionals, medical students, physicians, and traditional Chinese doctors). Allied health professionals (n = 6) scored the highest (45.29, 95% CI 41.38 to 49.20), and physicians (n = 39) scored the lowest (39.68, 95% CI 38.29 to 41.08). Patients in Australia, the USA, and the UK reported the highest empathy ratings (>43 average CARE), with the lowest scores (<35 average CARE) in Hong Kong.

Conclusions: Patient ratings of practitioner empathy are highly variable, with female practitioners expressing empathy to patients more effectively than male practitioners. The high variability of patient ratings of practitioner empathy is likely to be associated with variable patient health outcomes. Limitations included frequent failure to report response rates, which introduces a risk of response bias. Future work is warranted to investigate ways to reduce the variability in practitioner empathy.

Electronic supplementary material: The online version of this article (doi:10.1186/s12909-017-0967-3) contains supplementary material, which is available to authorized users.


Background
A growing number of randomized trials show that when healthcare practitioners are encouraged to enhance how they express empathy, this can reduce patient pain, [1,2] lower patient anxiety, [3] increase patient satisfaction, [4,5] improve medication adherence, [6,7] and ameliorate other patient health outcomes [8][9][10][11]. For example, Chassany's [1] empathy training intervention for general practitioners (GPs) (n = 180) reduced pain in osteoarthritis patients (n = 842) by one point on a 10-point VAS (P < 0.0001). These modest benefits are comparable to those of many pharmaceutical interventions, without the adverse events. Hence some authors have recently called for efforts to encourage empathic care [12].
Supporting the view that empathic care should be encouraged, the extent to which healthcare practitioners express empathy seems to be lacking in some cases, [13][14][15][16] and it may decline with time in practice [17]. The increased burden of paperwork, which takes up a quarter of practitioner time, [18] may be a barrier to empathic care. However, we do not know the prevalence of inadequate empathy. If adequate empathy is rare, then patients and practitioners would both likely benefit if practitioners reinforced how they display empathy. In this study, we aimed to address this gap by conducting a systematic review of patient ratings of practitioner empathy.
An obstacle to empathy research is that practitioner empathy is difficult to define theoretically [19,20]. At the same time there is an emerging consensus that empathy can be operationalized as a healthcare practitioner's ability to understand a patient's point of view, express this understanding, and make a recommendation that reflects the shared understanding [21,22]. More importantly for present purposes, while empathy is measured using different scales, [23,24] only one patient-rating of practitioner empathy demonstrated evidence of reliability, [25] internal validity and consistency: CARE [25,26]. From a patient health perspective, patient ratings of practitioner empathy are likely to be important. We therefore limited our review to studies that used the CARE measure.

Objectives
Our primary objective was to measure the extent to which patients (of any type) report their healthcare practitioners (of any type) to be empathic. Our secondary objective was to compare differences in empathy ratings between different practitioner groups (male versus female, consultation times, different types of practitioners, and practitioners in different countries).

Protocol and registration
The protocol for this review was published in PROSPERO (record no. CRD42016037456). We made two changes to the protocol. In the protocol we proposed to analyze CARE scores before and after training; however, there were insufficient studies to complete this analysis. We also had insufficient data to perform the proposed analyses comparing practitioners with 10 years' or more experience with those who had less than 10 years' experience. Neither of these changes was related to our main study aim.

Eligibility criteria
We included any study where patients rated their practitioners' empathy using the CARE measure. We included ratings of any practitioner including nurses, doctors, alternative practitioners, and medical students. We included studies in any language, provided that the translation of the CARE questionnaire was validated.
We excluded studies that used other measures of empathy, because only CARE has been validated. An added benefit of this approach is that it reduced heterogeneity. We excluded studies where practitioners were reported to have been trained in empathy prior to being rated by patients, since we were interested in pre-training empathy ratings. Where the publications included surveys of more than one group of practitioners the surveys were treated independently.
CARE asks patients to answer 10 questions about the consultation with their practitioner, such as whether the practitioner made the patient feel at ease, really listened and understood, showed compassion, and explained things clearly (see Additional file 1). Each question is answered by ticking one of five rated options (poor, fair, good, very good, or excellent) or 'does not apply'. The lowest rating ('poor') is given a score of '1', and the highest ('excellent') a score of '5'. Hence, the maximum CARE score is 50. The developers of the CARE measure have produced normative values based on administration of their questionnaire [27]. They found that the mean CARE score was 45.75, that 5% of CARE scores fell above 48.32, and that 5% fell below 40.72.
[25] and any record that includes the full name of the measure (consultation and relational empathy). Additionally, we contacted authors of studies to ask whether they were aware of any additional studies.
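To make the CARE scoring and the normative comparison described above concrete, the following is a minimal Python sketch. The rating-to-score mapping and the normative values (45.75, 40.72, 48.32) come from the text. The handling of 'does not apply' answers is an assumption, since the text does not describe it: here up to two such answers are replaced by the mean of the answered items. The function and constant names are hypothetical.

```python
from typing import Optional, Sequence

# Ratings and their scores, as described in the text (poor = 1 ... excellent = 5).
RATING_SCORES = {"poor": 1, "fair": 2, "good": 3, "very good": 4, "excellent": 5}

# Normative values quoted in the text from the CARE developers [27].
NORM_MEAN = 45.75
NORM_5TH_PERCENTILE = 40.72   # 5% of scores fell below this value
NORM_95TH_PERCENTILE = 48.32  # 5% of scores fell above this value

# ASSUMPTION (not stated in the text): at most this many "does not apply"
# answers are tolerated, and each is replaced by the mean of the answered items.
MAX_NOT_APPLICABLE = 2


def care_total(responses: Sequence[str]) -> Optional[float]:
    """Return the total CARE score (10 to 50), or None if too many items are unanswered."""
    if len(responses) != 10:
        raise ValueError("CARE has exactly 10 items")
    scores = [RATING_SCORES[r] for r in responses if r != "does not apply"]
    missing = 10 - len(scores)
    if missing > MAX_NOT_APPLICABLE:
        return None
    item_mean = sum(scores) / len(scores)
    return sum(scores) + missing * item_mean


def position_against_norms(score: float) -> str:
    """Locate a CARE score relative to the developers' 5th and 95th percentiles."""
    if score < NORM_5TH_PERCENTILE:
        return "below the 5th percentile"
    if score > NORM_95TH_PERCENTILE:
        return "above the 95th percentile"
    return "between the 5th and 95th percentiles"
```

Under this sketch, ten 'excellent' answers give the maximum of 50, and the pooled average of 40.48 reported in this review falls below the developers' 5th percentile of 40.72.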

Data collection, extraction, and management
After two authors (JH, KM) piloted the extraction sheet, two other authors (LS, AU) independently screened all titles and abstracts and extracted data. Discrepancies were resolved through discussion with a third author (JH).
We extracted data about: type of practitioner, percentage female practitioners, country, average CARE score, and individual CARE scores (where available).
We assessed risk of bias within studies by measuring response rates. It was not feasible to assess risk of bias across studies (for example, by constructing a funnel plot), since there was no reason to suspect that CARE scores would be higher or lower depending on sample size, and there were insufficient data to investigate it.
Statistical analyses were performed using the program Comprehensive Meta Analysis [28]. We provided the mean and 95% confidence interval of the CARE score. We contacted study authors via email to obtain missing data with respect to participants, outcomes, or summary data. Participant data were analysed as reported. We conducted preplanned subgroup analyses to assess the extent to which proportion of female practitioners, consultation duration, type of practitioner, and country played a role. To evaluate the predictive value of gender and consultation time with respect to CARE scores we performed a multivariable regression analysis, with gender and consultation time included as the independent variables, and CARE scores included as the dependent variable.
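As an illustration of how study-level mean CARE scores can be pooled into a single mean with a 95% confidence interval, the sketch below uses inverse-variance weighting with a DerSimonian-Laird random-effects model. This is an assumption: the text names the software used but not the pooling model, and the function name and inputs (per-study means and standard errors) are hypothetical.

```python
import math
from typing import List, Tuple


def pool_means(means: List[float], std_errors: List[float]) -> Tuple[float, float, float]:
    """Pool study means with a DerSimonian-Laird random-effects model.

    Returns (pooled mean, lower 95% limit, upper 95% limit).
    """
    k = len(means)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching standard errors")

    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum(w)

    # Cochran's Q and the DerSimonian-Laird between-study variance (tau squared).
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau squared to each study's variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * m for wi, m in zip(w_star, means)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled


# Hypothetical inputs for illustration only (not data from the included studies).
example = pool_means([38.0, 42.5, 41.0], [1.2, 0.9, 1.5])
```

A standard error for each study mean can be obtained as the reported standard deviation divided by the square root of the number of patients, where both are available.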

Sensitivity and subgroup analyses
We conducted four preplanned subgroup analyses.

1. Longer (>10 min) consultations compared with shorter (≤ 10 min) consultations. This was based on average consultation times in UK general practice [29].
2. Gender: average empathy ratings of mostly (>50%) female compared with average ratings of mostly (>50%) male practitioners.
3. When there were at least three studies within the same country, we conducted a subgroup analysis with those three countries, and compared it with the complement. We chose three studies because fewer than three makes meta-analysis problematic and increases the likelihood of basing conclusions on anomalous results.
4. Types of practitioners (physicians, medical students, alternative practitioners, etc.). If there were at least three studies that measured patient ratings of

Main results
Our search yielded 392 independent records, of which 69 studies met our inclusion criteria (see Supplemental Material). Of these, 64 independent study groups (within 51 publications) had sufficient data to be included in our meta-analysis (see Table 1, Fig. 1, Additional file 3). See Additional file 4 for excluded studies. The 64 study groups were from 15 different countries: UK (n = 23), USA (n = 6), Hong Kong (n = 9), Germany (n = 7), Australia (n = 4), China (n = 6), Ethiopia (n = 2), South Korea (n = 2), and one study from each of Brazil, Croatia, France, India, and Japan. The types of practitioners included primary care physicians, practitioners of Traditional Chinese Medicine (TCM), medical students, allied health professionals, and other specialists.
Fifty-five study groups could be included in the preplanned subgroup analysis by country.
At least three studies each measured empathy in the following types of providers: physicians, medical students, allied health professionals, and practitioners of Traditional Chinese Medicine (Table 2). There was statistically significant heterogeneity between these provider types (P = 0.032), with allied health professionals scoring the highest (n = 5, 45.29, 95% CI 41.38 to 49.20), and physicians scoring the lowest (n = 39, 39.68, 95% CI 38.29 to 41.08). We found no differences between primary care physicians, specialists, and complementary and alternative medicine (CAM) providers (P = 0.386) (see Table 3).
A multivariable regression analysis was performed to analyze the predictive value of gender and consultation time with respect to CARE scores. Consultation duration was the only significant predictor for CARE scores (Table 4).
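The structure of such a regression can be sketched as an ordinary least squares fit of mean CARE score on two study-level predictors. How gender and consultation time were coded is not stated in the text, so the coding used here (share of female practitioners, and mean consultation length in minutes) is an assumption, as are all names and the example inputs. The sketch estimates coefficients only and does not perform the significance tests reported in Table 4.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_care_regression(
    female_share: Sequence[float],
    minutes: Sequence[float],
    care: Sequence[float],
) -> List[float]:
    """Least squares fit of care = b0 + b1 * female_share + b2 * minutes.

    Returns the coefficients [b0, b1, b2] from the normal equations.
    """
    rows = [[1.0, f, t] for f, t in zip(female_share, minutes)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, care)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical study-level inputs for illustration only.
shares = [0.2, 0.8, 0.5, 0.6]
lengths = [5.0, 10.0, 15.0, 8.0]
scores = [30.0 + 4.0 * f + 0.8 * t for f, t in zip(shares, lengths)]
coefficients = fit_care_regression(shares, lengths, scores)
```

Because the example scores are generated from an exact linear rule, the fit recovers that rule; with real study data the coefficients would carry sampling error and would need standard errors before any claim about significance.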

Risk of bias
The response rate was reported in 20 of the 53 studies (38%), with the average rate being high (69%, ranging from 21% to 100%). The uncertainty about the remaining response rates entails a risk of response bias.

Discussion
We found that patient rating of practitioner empathy is highly variable, with some practitioners being reported to express empathy much less effectively to patients than others. Female practitioners, allied health professionals, those who spend more time with patients, and practitioners from Australia, the US, and the UK seem to display empathy more effectively than other practitioners. In addition, the average CARE score we identified was low in comparison with normative values, falling in the lowest 5% of CARE scores measured by the developers of the questionnaire [27]. The highly variable scores we found are likely to be associated with variable patient outcomes [9][10][11][30].

Strengths and limitations
This is the first systematic review to investigate the extent to which healthcare practitioners are empathic.
Another strength is that it relied on the only validated patient-rated measure of practitioner empathy. As such, it provides a good indication of the differences in perceived empathy across gender, disciplines, and countries.
There are also several potential limitations. First, our method is likely to have underestimated the difference between female and male practitioners. If studies with majority female practitioners resulted in greater patient-rated empathy, it is reasonable to assume that if all the practitioners were female, the difference between male and female practitioners would have been greater. In the context of this observational research, we do not know whether the additional time caused female practitioners to be more empathic, or whether female practitioners' higher empathy caused them to spend more time with patients, or whether these two factors cannot be separated. Second, response bias [26,31,32] could have affected the results. Patients who know they are rating their practitioners may wish to please their practitioners, [33] for example by giving them higher scores than they otherwise would [31,32]. The lack of response rate reporting in most of the studies makes the extent of this problem unclear. Furthermore, selection bias might have influenced the results: the CARE questionnaire could have been administered in settings where the empathy of the practitioners was believed to be anomalous (either particularly high or particularly low). Next, the comparison between countries could have been influenced by the number of studies per country. Specifically, some of the countries with low scores had very few studies (Croatia had 1, Ethiopia had 2, and India had 1). Moreover, in spite of validation of CARE translations, patients in different countries may have divergent prior expectations and beliefs about what it means to be an empathic practitioner. Finally, the comparison with normative values (resulting in the average score we found being in the lowest 5%) is problematic. Although relatively low, the average score is still above 40. Further work needs to be done to investigate the meaning of average CARE scores.