We report the results of a cross-sectional study comparing SP examiners with physician examiners in a third-year medical student internal medicine OSCE. Our results show that SP are acceptable to students as examiners in this type of examination. We found a weak but significant correlation between SP examiners' and physician examiners' scores, although SP examiners tended to score students higher than physician examiners. Using performance on the formative multiple-choice question (MCQ) examination as the endpoint, physician examiners' OSCE scores had predictive value, whereas SP examiners' scores did not.
Why do SP examiners score students higher than physician examiners? One possible explanation is that SP examiners may simply want to give students a higher mark, or at least the benefit of the doubt, as this favours a more pleasant student-SP encounter (physicians are, of course, not immune to this: they may have prior knowledge of the students and can also expect future encounters, both of which may introduce a 'halo effect' into evaluation). While this may partially explain the 'determination bias' in SP examiners' scores that inflates students' scores relative to physician examiners' scores, it is unlikely to be the sole reason, as a systematic inflation should yield scores that are higher than physician examiners' but still retain predictive validity. A more likely explanation is that, as a result of their limited training and background knowledge, SP may not be able to distinguish between students with surface knowledge and those with deep understanding of the topic. They may, therefore, inconsistently overestimate (or underestimate) students' competence at performing the required task. SP do not have the experience of seeing many students at different levels perform the same task over many years, as physician examiners do, and therefore lack the same standard for comparison. By contrast, it has previously been shown that SP examiners do not overestimate ability in more 'generic' skills, such as communication.
SP documentation of examinee performance is already an integral part of several high-stakes examinations, including the USMLE. Opinions differ, however, as to who should evaluate the various components of examinee performance. In a recent review of this topic as it relates to a high-stakes examination (the Educational Commission for Foreign Medical Graduates' Clinical Skills Assessment), Whelan et al. propose a hybrid form of evaluation in which each attribute is evaluated by the person best suited to evaluate it. They suggest that aspects of communication are best evaluated by the patient (or the patient's surrogate), whereas problem-solving skills are best evaluated by content experts, i.e., physicians. This study offers some support for the argument that clinical skills, such as physical examination skills, are better evaluated by content experts than by SP.
This study has several important limitations. First, the SP and physicians examined different stations, which introduces the possibility of performance bias related to the specific stations. To address this, we plan to compare SP and physician examiner evaluations of students on the same stations in future studies. Such a head-to-head comparison may allow us to identify stations or tasks where SP could replace physician examiners, and those where SP examiner scores are less valid. Another limitation is that this study evaluated predictive validity against a formative, rather than a summative, test of competence. Ideally, a test of 'performance' should be used as the outcome measure; such a test would be congruent with the OSCE in evaluating behaviour-based performance rather than the higher cognitive function evaluated by the MCQ.
It is unlikely that SP will completely replace physician examiners in the medical student evaluation process. However, with the growing number of medical students and physicians' increasingly busy schedules, educators may have to develop new ways to continue the evaluation process with limited physician involvement. One solution is to limit the OSCE to a formative evaluation or teaching tool, although many would argue against subordinating this reliable, valid, high-fidelity evaluation to lower-fidelity evaluations, such as written examinations. While students appear to find SP acceptable as examiners, the challenge will be to improve the predictive validity of SP evaluations. To do this, SP may require additional training to discern between students with surface and deep knowledge. If this proves unsuccessful or unfeasible, SP may have a more limited role as examiners on specific types of stations, or they may work in combination with physicians to evaluate different components of a single task.
Further studies are needed to evaluate the impact of additional training on SP examiners' ability to discern between students with surface and deep knowledge. Further studies are also needed to clearly define the potential role of SP examiners as a replacement for, or an addition to, physician examiners.