INTRODUCTION

Inadequate communication, consultation, and coordination between providers contribute to poor health outcomes and increased healthcare costs. Specialists can play important roles in patient care both directly (by seeing patients themselves) and indirectly by collaborating with primary care practitioners (PCPs) in patient-centered care.1,2 Electronic consultation (eConsult) programs have been shown to be effective in reducing face-to-face specialist encounters by providing timely and relevant input from specialists to PCPs on treatment or diagnosis.3,4,5 Promoting improved communication between PCPs and specialists through eConsults might increase system capacity, extend PCP capability, improve access, reduce wait times, contain costs, and reduce the risk of healthcare-associated outpatient clinic transmission of coronavirus and other infectious diseases.3,5,6,7,8,9,10,11

Effective intraprofessional communication is a critical component of healthcare.12,13 A good consultation can include bidirectional teaching that shares specialized knowledge about specific conditions (from specialist to PCP) and holistic, longitudinal knowledge of the patient (from PCP to specialist).14,15 However, specialists’ performance can vary widely, which may be attributable to the level of specialist engagement in the eConsult platform.9,16 Prior research has investigated PCP perceptions of the value of eConsult,3,7,17 but strategies for improving consultation skills have been under-studied, and effective approaches have yet to be identified.18,19 The pandemic has accelerated adoption of systems that reduce contact with patients, with little opportunity to develop the necessary measurement and training.

Behavioral science suggests promising approaches to improve the quality of eConsults. “Nudges” are interventions that perturb behavior in a predictable way without restricting options or altering economic incentives.20 One such nudge leverages the principle that people tend to conform to the behavior and/or expectations of their peers,21,22,23 especially when expectations about behavior are otherwise unclear, as in electronic communication.24 Communicating feedback about how one’s peers expect one to behave provides pressure to meet this expectation. Likewise, the process of evaluating a peer can provide impetus to improve one’s own performance.25 We study both types of peer effects.

Providing normative feedback about performance relative to peers has led to subsequent improvements in productivity and quality of care.22,26,27 Feedback based on peers’ evaluations may be seen as more valid than standardized metrics, potentially conferring greater accountability on the recipient.27,28 Peers’ judgments of performance can change behavior by drawing on mutual intra-specialty and intra-organizational trust.29 We applied this approach to better understand how nudges might improve communication between PCPs and specialists using the eConsult system. We conducted the first randomized trial designed to assess the extent to which specialists’ exposure to peer ratings enhances consultation quality.

METHODS

Intervention and Trial Design

The Los Angeles County Department of Health Services (LADHS) eConsult clinician steering committee collaborated with health services researchers and behavioral scientists to design the intervention, leveraging evidence that feedback is effective when it includes aspirational norms from social “in-groups” of professional peers.22,30,31 Because the act of rating other specialists’ eConsults is itself an intervention, the trial was designed to adjust for the impact of rating others. In the first phase of the trial, only one group provided ratings. In the second phase, the original rating group and a second naïve group received feedback about their performance, and a third naïve group began the rating process. By comparing the rated eConsults of raters with those of non-raters in the first phase, we could observe the effect of rating other providers’ eConsults in the absence of feedback (see Section 4 of the trial protocol; Appendix 1).

Setting

LADHS is the second largest public health provider in the USA. It is an integrated system with 23 health centers, 4 hospitals, and a network of referring federally qualified health centers serving about 670,000 patients annually. The LADHS eConsult platform allows PCPs to direct requests to a specific specialty for guidance and referrals through asynchronous messages.9 Requests are reviewed by specialist reviewers recruited by LADHS leadership. PCPs describe the case, optionally upload documentation, and ask questions of the assigned specialist. Specialists respond to eConsults in free text and may include pre-composed content. Prior to this trial, specialists were encouraged by institutional leadership to pursue five aspirational principles: responsiveness, collaboration, equity, customer service, and effective practice. However, until this trial, specialists received feedback only on their eConsult productivity (i.e., timeliness and number of eConsults completed per month).

Rating Instrument

We developed and tested a rating instrument that reflected institutional and local professional eConsult practice standards. We interviewed 10 subspecialists with high volumes of eConsults about characteristics they considered to be markers of high-quality eConsults (see interview guide; Appendix 2). Using interviewees’ examples of eConsults that they considered “bad” and “good” interactions and their rationale for these designations, we devised a set of five performance dimensions using established methodology.32 These were (1) efforts to elicit additional information from PCPs when needed; (2) adherence to institutional guidelines or “Expected Practices,” which provide consistent and targeted decision support in an effort to standardize clinical practice;33 (3) agreement with the specialist’s medical decision-making when no guideline applied; (4) inclusion of educational content for the PCP; and (5) collegiality (i.e., strengthening or weakening the interpersonal relationship between PCP and subspecialist). These dimensions overlap with those identified in other recent studies.15 The rating instrument included gate questions ensuring that only relevant consults were rated on each dimension. One investigator (MWF) drafted the rating instrument to assess these dimensions, and the entire research team suggested multiple rounds of revisions. We finalized the instrument when no additional revisions were suggested (Appendix 3).

We randomly selected 135 eConsults for double rating and found a high degree of interrater agreement on each of the five dimensions: (a) elicitation of information from PCPs (87.5%); (b) adherence to institutional guidelines (68.4%); (c) agreement with peer’s medical decision-making (94.0%); (d) educational value (88.9%); and (e) relationship building (98.0%) (eTable 1).
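For reference, percent agreement of this kind can be computed with a simple pairwise comparison across the double-rated consults; the sketch below assumes a hypothetical list of paired rater responses for a single dimension.

```python
# Minimal sketch: percent agreement between two raters on one dimension.
# The data structure and values are hypothetical, for illustration only.

def percent_agreement(pairs):
    """pairs: list of (rating_a, rating_b) tuples for double-rated eConsults."""
    if not pairs:
        return float("nan")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

# Example: hypothetical double ratings on the relationship-building dimension
relationship_pairs = [("improve", "improve"), ("same", "same"), ("same", "improve")]
print(f"Agreement: {percent_agreement(relationship_pairs):.1f}%")
```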

Rating Process and Selection of Consults

The rating instrument was integrated directly into the eConsult platform, with the list of other specialists’ eConsults to be rated displayed in a new section below each specialist’s own list of pending eConsult requests. The eConsult software module collected rating responses while allowing raters to remain anonymous. Specialists were provided with guidance on the rating process (see Quick Guide; Appendix 4) and were estimated to spend about 5 minutes rating each eConsult. eConsults were added to the task list weekly, drawn at random from the baseline period (6 to 11 months in length, depending on study arm; eFigure 1) and assigned to raters in the same discipline. By including eConsults that occurred prior to the intervention, we established baseline performance for rated specialists. Structured data about consultants’ identities and dates of service were masked to minimize bias.
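For illustration, a minimal sketch of this weekly assignment step, assuming hypothetical record structures and a hypothetical per-rater quota: baseline-period eConsults are drawn at random, matched to raters in the same discipline, and stripped of identity and date fields before display.

```python
import random

def assign_weekly_ratings(baseline_consults, raters, per_rater=3, seed=None):
    """Randomly draw baseline eConsults and assign them to same-discipline raters.

    baseline_consults: list of dicts with 'id', 'specialty', 'consultant', 'date'
    raters: list of dicts with 'id', 'specialty'
    Field names, the per-rater quota, and the exclusion of self-authored
    consults are assumptions made for this sketch.
    Returns a list of (rater_id, masked_consult) assignments.
    """
    rng = random.Random(seed)
    assignments = []
    for rater in raters:
        pool = [c for c in baseline_consults
                if c["specialty"] == rater["specialty"] and c["consultant"] != rater["id"]]
        for consult in rng.sample(pool, min(per_rater, len(pool))):
            # Mask consultant identity and date of service before showing to the rater.
            masked = {"id": consult["id"], "specialty": consult["specialty"]}
            assignments.append((rater["id"], masked))
    return assignments
```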

Performance Feedback

Past studies have successfully nudged providers to improve decisions by providing them with information about their performance on standardized quality metrics, such as guideline adherence, relative to top-performing peers, labeling recipients explicitly as either a “Top Performer” or “Not Top Performer.”22,27 However, this technique has not yet been evaluated using structured subjective peer ratings of performance. Providers in the feedback arms were sent messages that either announced their membership in an elite group of “Top Performers” or provided actionable recommendations with feedback for “Not Top Performers.”

“Top Performers” were those with peer ratings in the top tenth, including ties. Importantly, the phrase “Top Performer” or “Not Top Performer” was included in the email subject line. The body of the email included the recipient’s ratings on each dimension, ratings of top-performing peers, links to rated eConsults for reference, and suggestions for improvement when relevant (feedback templates; Appendix 5). All messages were sent from an executive physician in the health system (PG).
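As a concrete illustration of the cutoff rule, the sketch below flags specialists whose score falls at or above the top-tenth boundary so that ties at the cutoff are included; aggregating each specialist’s peer ratings into a single mean score is our assumption for this sketch, not a detail stated above.

```python
def top_performers(mean_ratings):
    """mean_ratings: dict mapping specialist id -> mean peer rating (hypothetical input).

    Returns the set of specialists in the top tenth, including ties at the cutoff.
    """
    scores = sorted(mean_ratings.values(), reverse=True)
    # Cutoff is the score held by the specialist at the top-tenth boundary.
    cutoff_index = max(0, int(len(scores) * 0.1) - 1)
    cutoff = scores[cutoff_index]
    return {sid for sid, score in mean_ratings.items() if score >= cutoff}

# Example: with ten specialists, the single best score sets the cutoff; ties are included.
example = {"a": 4.8, "b": 4.8, "c": 4.1, "d": 3.9, "e": 3.5,
           "f": 3.4, "g": 3.2, "h": 3.0, "i": 2.8, "j": 2.5}
print(top_performers(example))  # {'a', 'b'}
```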

Outcome Measures

The outcomes of interest were changes in ratings in each performance dimension before versus after feedback was received. As a secondary outcome, we analyzed improvements in rates of consultation in which a resolution was reached without a face-to-face specialist visit—a commonly used measure of eConsult effectiveness.3,4,5

Inclusion/Exclusion Criteria

All clinicians regularly using eConsult were included, across all specialties except podiatry and surgery, which were excluded.

Power Analysis

After adjusting the effective sample size to accommodate an intracluster correlation of 0.055,34 a 1-point increase in rating was detectable with 80% power and α=0.05 using 24 specialists and a total of 81 ratings per arm.
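This adjustment follows the standard design-effect formula, DEFF = 1 + (m − 1) × ICC; the sketch below reproduces the arithmetic, inferring the average cluster size from the reported 81 ratings and 24 specialists per arm (our assumption about the clustering unit).

```python
# Minimal sketch of the design-effect adjustment for clustered ratings.
icc = 0.055                  # intracluster correlation reported above
ratings_per_arm = 81
specialists_per_arm = 24
avg_cluster_size = ratings_per_arm / specialists_per_arm  # ~3.4 ratings per specialist (assumed)

design_effect = 1 + (avg_cluster_size - 1) * icc          # DEFF = 1 + (m - 1) * ICC
effective_n = ratings_per_arm / design_effect             # effective sample size per arm
print(f"DEFF = {design_effect:.3f}, effective n per arm = {effective_n:.1f}")
```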

Random Allocation and Blinding

Specialists were randomly allocated to the feedback intervention (2/3) or control (no feedback) (1/3). To minimize contamination, specialists were grouped by specialty and facility affiliation, resulting in 80 facility-specialty clusters that were randomized. It was not possible to blind specialists to the intervention assignment. Specialists were blinded to rater identity; structured data about consultants’ identities were removed from the eConsult rating interface. Each eConsult was randomly assigned within specialty. This allocation allowed us to measure the extent to which rating other specialists’ eConsults affected raters’ own performance.
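A minimal sketch of the cluster-level allocation under stated assumptions (hypothetical field names; the staggered split of the feedback group into early and late raters is omitted): specialists are grouped into facility-specialty clusters, and whole clusters are randomized, two-thirds to feedback and one-third to control.

```python
import random
from collections import defaultdict

def randomize_clusters(specialists, seed=None):
    """Cluster-randomize specialists to feedback (2/3) vs. control (1/3) arms.

    specialists: list of dicts with 'id', 'facility', 'specialty' (hypothetical fields).
    Whole facility-specialty clusters are allocated together to limit contamination.
    """
    clusters = defaultdict(list)
    for s in specialists:
        clusters[(s["facility"], s["specialty"])].append(s["id"])

    keys = list(clusters)
    random.Random(seed).shuffle(keys)
    n_feedback = round(len(keys) * 2 / 3)

    arm = {}
    for i, key in enumerate(keys):
        label = "feedback" if i < n_feedback else "control"
        for sid in clusters[key]:
            arm[sid] = label
    return arm
```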

Statistical Analysis

We used mixed-effects ordinal logistic regressions to estimate the impact of treatment. In each regression, the consult’s rating on a given dimension was the dependent variable, with independent variables for treatment group assignment at the time of the consult, time relative to the start of the trial, and the rated consultant’s history of rating others at the time of the consult. Random effects addressed repeated measures, variation across specialties, and the nesting of specialists within specialty and of eConsults within specialists. This allowed us to estimate the effects of the feedback intervention on each dimension of eConsult performance, adjusting for secular trends in ratings over time, participation in the peer rating process, specialty, and specialist-level random effects.
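As a rough sketch of this specification (not the exact model), the snippet below fits a linear mixed model in statsmodels on a synthetic dataset with assumed column names, treating the ordinal rating as numeric and including only a specialist-level random intercept; the analysis described above instead used mixed-effects ordinal logistic regression with additional specialty-level effects.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic, hypothetical dataset: one row per rated eConsult, with assumed column names.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "rating": rng.integers(1, 6, n),       # ordinal rating treated as numeric here
    "feedback": rng.integers(0, 2, n),     # 1 if feedback was received before the consult
    "months": rng.uniform(0, 18, n),       # time since trial start
    "has_rated": rng.integers(0, 2, n),    # 1 if the consultant had rated others
    "specialist": rng.integers(0, 40, n),  # specialist identifier
})

# Simplified stand-in for the mixed-effects ordinal logistic model described above:
# a linear mixed model with a random intercept for each specialist.
model = smf.mixedlm("rating ~ feedback + months + has_rated", df, groups=df["specialist"])
result = model.fit()
print(result.summary())
```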

IRB Approval

All study procedures were approved by the University of Southern California Institutional Review Board prior to commencement.

RESULTS

Participants

Figure 1 shows the flow of participants. Of the 214 specialists originally randomized, 64 were excluded from the analysis for lack of ratings, including one who withdrew.

Figure 1

CONSORT diagram. Facility-specialty clusters (m=80) were each randomized to one of three arms, ultimately comprising 214 randomized specialists.

Table 1 shows the distribution of participant characteristics by arm, including specialty. Table 2 shows the baseline performance of specialists in each arm, the number of PCPs per specialist, the proportion of consults in each rating category that resolved without a face-to-face specialist visit, and the odds that rating improvements in each dimension improved resolution outcomes.

Table 1 Distribution of Rated Specialists
Table 2 Baseline Performance: Number of Consults (%)

Intervention Administration and Adherence

Intervention and rating activities were staggered to minimize interaction among participants from different clusters and account for effects of rating other specialists’ eConsults. The first ratings were assigned in March 2017; the first feedback was delivered in August 2017.

We were unable to measure whether specialists opened the feedback emails; our intent-to-treat analysis therefore includes all specialists assigned to feedback. Because rating was optional and assignment patterns varied, the number of completed reviews was uneven across the 144 specialists assigned eConsults to rate; 45 raters completed at least one assigned review.

Primary Outcomes

We analyzed 2190 eConsults completed during the pre-intervention period (Table 2) and 1064 in the intervention period, for a total of 3254 ratings. Ratings were not solicited if the reviewer deemed the consult to be administrative rather than clinical; 92% of PCPs’ inquiries were categorized as clinical (i.e., not administrative). If the PCP’s initial question required additional information gathering (760 consults, 23.4%), the specialist’s effectiveness in eliciting additional information was rated. If the rater judged that institutional guidelines applied (1189 consults, 36.5%), the consultant’s adherence to the recommendation was rated. If no guideline applied, subjective agreement with the medical decision-making was rated instead (2065 consults, 63.5%). If the rater reported that the PCP’s question presented an educational opportunity (1441 consults, 44.3%), the educational value of the specialist’s response was rated. All eConsults included in the analysis were rated on whether the specialist’s response to the PCP would cause their professional relationship to worsen, remain the same, or improve.

Receiving normative feedback from peers improved performance on four of the five dimensions nominated for evaluation during the instrument design phase. Table 3 and Figure 2 show the adjusted odds that performance improved after the rating and feedback intervention. Receiving feedback improved performance in all rated dimensions except elicitation of information from PCPs, with significant improvement in three of the four improved domains.

Table 3 Adjusted Odds Ratio (95% CI) of Improved Ratings
Figure 2

Adjusted odds of improvement after feedback. The odds ratio for improvement for each rated dimension after feedback.

Rating other specialists’ eConsults improved raters’ own performance on two dimensions: elicitation of information (OR 1.86; 95% CI 1.04–3.35) and relationship building (OR 1.44; 95% CI 1.01–2.06).

Secondary Outcome

Across all performance dimensions, higher baseline ratings were associated with PCPs resolving the case without a face-to-face specialist visit; lower rates of face-to-face visits were significantly associated with information elicitation, educational value, and relationship building (Table 2). Feedback improved resolution outcomes, but the adjusted reductions were not significant (eTable 2).

DISCUSSION

This RCT shows that specialists can become more effective electronic consultants when given feedback, with significant improvements in ratings of medical decision-making, education, and relationship building. Most prior studies of consultation quality have used blunt instruments such as data transfer or specialist utilization.35,36 Several previous studies have successfully employed feedback using standardized measures in other clinical domains,22,26,27 but feedback based on peers’ ratings has not been explored.

Our strategy of comparing specialists to “Top Performers” leverages aspirational social norms that may be particularly motivating to professionals who identify with high standards. First, while people naturally tend to adhere to perceived social norms, highlighting the top tenth rather than average performance sets a high but achievable bar. Moreover, “Top Performer” status signals an injunctive norm without disclosing practitioners’ precise position in a distribution of peers, preventing regression among top performers.23 Second, consultation performance does not lend itself easily to objective measurement in the same way that metrics of productivity or prescribing do. Ratings from peer physicians (particularly specialists) familiar with the patient population and local practice standards may confer greater credibility than standard metrics. Furthermore, associations between our ratings and eConsult outcomes used in prior studies affirm the predictive validity of the ratings.4,9

The results we observe suggest ratings from peers are not dismissed by practitioners. Performance significantly improves after feedback, and this effect remains after accounting for participation in the rating process and associated observer effects. We show this type of information can be marshalled and framed as feedback to encourage specialists to promote relationships with PCPs that advance evidence-based practices with educational value.

Not all improvements were statistically significant, perhaps for interpretable reasons that can inform future design. In particular, the effect of feedback on concordance with institutional guidelines was not significant. Tellingly, interrater agreement on whether an institutional guideline applied was the lowest among our measures; this variability suggests that specialists’ knowledge of guidelines was imperfect. Additionally, we noted during our interviews that some specialists had never entertained the idea that institutional guidelines might apply to their own decision-making. Another promising result was that participating in the rating process itself appears to independently improve the rater’s own performance, although rating, an optional task, had uneven uptake. A larger sample may also allow further exploration of heterogeneity of effects, particularly across specialties. A further encouraging finding is that the intervention, which did not explicitly address referrals, reduced the need for face-to-face specialist encounters compared to controls. Although these reductions were not significant at this sample size, they suggest that, at scale, this nudge may increase PCP capacity, potentially outweighing its costs.

Future work may shed more light on the impact and mechanisms behind this intervention. For example, we might investigate the long-term impact on utilization, guideline adherence, and patient outcomes. We did not perform content analysis after development and validation of the rating instrument. A systematic coding and abstraction from consult text may explain which features of the communication are associated with ratings. These types of follow-up studies may reveal which of the rated dimensions are most important for improving patient and professional outcomes, optimize the instrument, and provide more specific guidance to specialists.

Limitations

While our trial was conducted in a large system with diverse facilities and participants, conclusions about generalizability require additional research. Not all health systems have an incentive to facilitate communication between PCPs and specialists. However, COVID-19 has rapidly increased the importance of remote communication, and recently introduced Medicare benefits make eConsults more accessible as a means to improve care coordination,37 increasing the potential impact. This intervention may require adaptation in other health systems to ensure rating instruments and rating processes are consistent with institutional culture and goals. Environments with different status relationships among specialists might affect the perceived credibility and impact of ratings; for example, a senior surgeon may not consider a junior co-specialist a “peer” in all settings. As with other process-based measures of clinical quality, patient outcomes may be difficult to attribute to observed improvements. Additionally, by design, some of the specialists in the “No Feedback” arm performed some ratings; this might have attenuated the observed effect size. To our knowledge, this is the first time peer rating and feedback have been used to evaluate and improve specialist communication. Because our study involved specialist peer ratings, having the referring PCPs rate the eConsults with a parallel feedback intervention might be an important area for comparative research. Finally, the face-to-face visit is an imperfect proxy for outcomes and does not fully capture the extent to which eConsult outcomes depend on cases, specialists, and specialties. While our analysis controls for temporal trends and random effects at both the specialty and specialist levels (both significant), the cluster-randomized design did not generate balanced allocation of specialties across arms, so we cannot completely rule out unobserved confounders at the specialist and specialty levels.

Adding responsibility for rating to specialists’ already demanding workloads might meet resistance; scaling and standardizing this model may be challenging. Given the expense of specialists’ time, the costs of participation in rating should be balanced against its short- and long-term value. Differences between raters in the same specialty suggest that time devoted to calibration may be required. While this approach is far less costly than previous intraprofessional training programs,18 the cost-effectiveness of the rating system we tested is unknown. Larger-scale studies may be needed to determine under what circumstances these interventions are comparatively cost-effective at a system and societal level.

Our study predates the pandemic and was conducted when virtually all specialty visits were in person. Trends show that specialists did not adopt telehealth at rates comparable to PCPs in 2020 and have more quickly resumed in-person practice at near-2019 levels.38,39 Reducing referrals will continue to be an important way to limit in-person contact, and eConsult service providers have developed resources for optimizing use of eConsults for this purpose.40 Additional research is needed to understand the impact on specialist visit referrals across modalities, including how best to tailor the selection of in-person versus eConsult care.

CONCLUSIONS

Using peer ratings, the quality of specialists’ eConsults can be measured and improved by informing specialists of their performance compared to their top-performing colleagues.