In relation to the first aim of this study, the findings suggest that the properties of the UKCAT are relatively temporally stable. As with the previously published analysis by James et al. [10], we confirmed a number of observations regarding the UKCAT scores in relation to other sociodemographic and educational variables:
1. That performance on the UKCAT and at A levels are moderately correlated.
2. That candidates from an independent or grammar school tend to achieve higher scores/grades at both the UKCAT and at A level compared to those who report a non-grammar school state education. This effect is apparent even after controlling for the effect of other predictor variables.
3. That candidates reporting themselves as of White ethnicity, on average, achieve higher A level tariffs and UKCAT scores than those describing themselves as Non-white. This effect is apparent even after controlling for the effect of other predictor variables.
4. Candidates from non-professional socioeconomic backgrounds were observed to achieve, on average, lower scores on both the UKCAT and at A level, even after controlling for the effects of other predictor variables.
5. That male sex independently and significantly predicted higher total UKCAT scores. This effect is also apparent for the verbal reasoning and quantitative reasoning UKCAT subtest scores.
When comparing the results from the present study and those reported by James et al. it should be borne in mind that our WP categories were coded in the reverse direction to the latter study [10]. However, as highlighted above, allowing for this difference the results of the two studies were largely consistent. Nevertheless a number of our observations were in contrast to the results reported by James et al. in the earlier cohort. Firstly, unlike the previous report, we did not observe that males performed significantly better at A level compared to females. Also, in the present study, males did not score significantly higher on abstract reasoning and decision making scales compared to females, conflicting somewhat with the findings of James et al. where decision making was the only subtest where a sex difference was not apparent. There are potential explanations for these apparent inconsistencies (see later).
Our second aim was to explore in further detail the sociodemographic predictors of UKCAT performance and contrast these with those observed for A level attainment. Certainly the use of UKCAT scores and A levels as continuous outcome measures has allowed for a more in-depth comparison of the two metrics of ability. However, in the event, there were only a relatively small number of additional conclusions we could draw from this more detailed approach, which also used EASL status and age as additional predictors, compared to the previous study in the 2006 cohort:
1. That UKCAT performance is independently predicted by both ethnicity and by EASL status; those individuals who report their ethnicity as ‘White’ and have English as a first language, on average, score more highly on the UKCAT than those reporting Non-white ethnicity and learning English after the age of two. In contrast, A level performance was only independently predicted by ethnicity, with those of White ethnicity achieving, on average, higher grades than those reporting Non-white ethnicity. This suggests that culture and language skills may have a somewhat larger negative impact on UKCAT compared to A level performance.
2. When raw data are analysed according to reported ethnic group (e.g. White, Asian, Black etc.), average performance at A level is generally ranked in the same way as that for UKCAT performance, with those reporting White or Chinese ethnicity achieving the highest scores/grades and those reporting Black ethnicity the lowest. The difference between these highest ranking groups and the lowest is considerable, at roughly one standard deviation for both A level and UKCAT performance.
3. Whilst candidates from an independent or grammar school tend to achieve higher scores/grades at both the UKCAT and at A level compared to those who report a non-grammar school state education, there is some suggestion from our results that this school-type bias may be more pronounced for A levels than for the UKCAT. For example, whilst the type of school attended is a significant univariate predictor of UKCAT score, the effect seems less pronounced than that for A levels; indeed the 95% confidence intervals touch but do not overlap (see Table 2). However, as highlighted earlier in the methods section, the regression coefficients derived from Tobit and linear regression may not be directly comparable and consequently some caution must be exercised in interpreting this finding.
4. Older candidates (those over 20 years at the time of application) were more likely to report, on average, poorer A level grades and lower UKCAT scores. However, as most older candidates (all except 61 individuals over 20 years) were excluded on the basis of missing A level data, this observation should be treated cautiously.
5. That compared to females, males tend to perform less well on the decision making subtest of the UKCAT. No overall sex differences for the abstract reasoning subtest were observed in this analysis.
Thus we can conclude that some socioeconomic bias in the UKCAT scores exists but that this differs in a number of respects from that observed for A level attainment. Therefore, when considering issues relating to WP in medical and dental education, the picture is more complex than simply favouring one metric of ability over another: the findings suggest that the UKCAT may be prone to more bias in some respects compared to A levels and less in others. Compared to A level performance the UKCAT may be more prone to effects related to sex. Moreover, whilst both metrics of ability show bias in favour of those reporting White ethnicity, the UKCAT may be especially sensitive to linguistic ability compared to A levels. The present sample largely took A levels in science and maths. These subjects may test language and communication skills less rigorously than the humanities. It is therefore unsurprising that the UKCAT appears to ‘penalise’ EASL status to a more significant degree compared to A levels in the present sample. In contrast, there were some suggestions from the data that the UKCAT may, as a metric of ability, be less biased in favour of candidates from an independent or grammar school background than A level grades. Thus, the UKCAT may potentially offer some complementary, if not incremental, value alongside educational attainment measures in relation to the medical and dental school selection process.
Further comparison with previous findings
This study builds on the previous work investigating sociodemographic predictors of A level and UKCAT performance [10]. Our study sample was relatively comparable with the subgroup of UKCAT candidates providing data in this earlier study. The present sample was slightly smaller in that we only included those with complete sociodemographic information, as opposed to just non-missing A level data. In practice this meant that it was mainly those under 21 years who lacked information on socioeconomic background that were excluded from the final sub-sample for analysis, as this was the principal WP variable missing, aside from A level attainment. It should be noted, however, that many older individuals would have already been excluded on the basis of missing A level data. Nevertheless, as in the present sample, the sub-group of 2006 candidates included in the previous study tended to be younger, more likely to be of White ethnicity and more likely to have attended an independent/grammar school (EASL status was not available in the 2006 cohort). As outlined earlier, our findings were largely consistent with the observations reported in this earlier study. However, it is important to consider the apparent inconsistencies in the findings between these two studies. Firstly, our lack of any observed sex difference in A level achievement, in contrast to the findings of James et al. [10], can be explained by the differing ways that the metric of A level performance was constructed. In the present study we created a tariff score by summing the UCAS tariff points for the three best exam grades (excluding general studies and critical thinking) irrespective of whether the A levels were ‘pure’ science (i.e. chemistry, biology, physics and mathematics). Indeed, James et al. report only slightly higher average tariff scores for pure science subjects in males and no sex difference in overall average tariff scores.
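The tariff construction described above can be illustrated with a minimal sketch. The points table (A = 120 down to E = 40) is an assumption based on the UCAS tariff in use at the time, and the function and variable names are hypothetical; this is not the code used in the study.

```python
# Assumed UCAS tariff points per A level grade (illustrative, not taken from the paper).
UCAS_POINTS = {"A": 120, "B": 100, "C": 80, "D": 60, "E": 40}

# Subjects the text says were excluded from the tariff score.
EXCLUDED_SUBJECTS = {"general studies", "critical thinking"}


def best_three_tariff(results):
    """Sum the tariff points of the three best eligible A level grades.

    `results` is a list of (subject, grade) pairs. Excluded subjects and
    unrecognised grades are ignored; a candidate with fewer than three
    eligible grades simply gets the sum of those available.
    """
    points = sorted(
        (
            UCAS_POINTS[grade]
            for subject, grade in results
            if subject.strip().lower() not in EXCLUDED_SUBJECTS
            and grade in UCAS_POINTS
        ),
        reverse=True,
    )
    return sum(points[:3])


if __name__ == "__main__":
    candidate = [
        ("Chemistry", "A"),
        ("Biology", "B"),
        ("General Studies", "A"),  # ignored by construction
        ("Mathematics", "A"),
        ("Physics", "C"),
    ]
    print(best_three_tariff(candidate))
```

Because the score takes the best three grades regardless of subject, a candidate's non-science grades can contribute, which is the feature the text contrasts with a ‘pure’ science tariff.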
Thus, our observations are largely consistent with those reported by James et al., though we are unable to rule out the effects of a recognised recent secular trend towards males obtaining more top grades in science A levels compared to females [15]. Secondly, in the present study, males did not score significantly higher than females on the abstract reasoning and decision making scales. This appears to contrast somewhat with the findings of James et al. where decision making was the only subtest where a sex difference was not apparent. However, these inconsistencies may be relatively trivial once the differences in analysis approach are accounted for. Firstly, the males in our cohort performed more poorly than females on the decision making items, although the magnitude of this difference was slight and the p value for significance testing (p = .027) could be considered modest given the number of observations in the analysis. Thus, it may have been the case that by using a dichotomous outcome for UKCAT scores the James study may have been underpowered to detect a slight sex difference in performance on this subscale. Similarly, in the case of the abstract reasoning subtest this previous study reported a slight (but statistically significant) tendency for males to perform more poorly on this element of the UKCAT; males had 16% lower odds of scoring above the 30th centile on this particular subtest. In contrast we observed no significant sex difference. However, we utilised UKCAT scores as a continuous, rather than as a dichotomous, metric. Although these findings are not included in our results section, as with the report by James et al., we noted a slight excess of females scoring above the 30th centile on abstract reasoning (1,397 females compared to 1,186 males). Thus, our results are largely consistent with those reported by the earlier study once the differences in methodologies are accounted for.
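The argument that dichotomising a score can reduce power to detect a slight group difference can be illustrated with a small simulation. This is a sketch under stated assumptions only: the scores are simulated as normally distributed, and the group size, effect size and the function name `simulate_power` are hypothetical values chosen for illustration, not quantities from either study.

```python
import random
import statistics
from math import sqrt


def simulate_power(n_per_group=200, effect_sd=0.2, n_sims=300, seed=0):
    """Estimate power of a continuous versus a dichotomised comparison.

    Two groups are simulated whose means differ by `effect_sd` standard
    deviations. Each simulated dataset is tested twice: once comparing
    mean scores, once comparing the proportions scoring above the pooled
    30th centile. Returns (power_continuous, power_dichotomised).
    """
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% critical value of the normal distribution
    hits_cont = 0
    hits_dich = 0
    for _ in range(n_sims):
        females = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        males = [rng.gauss(-effect_sd, 1.0) for _ in range(n_per_group)]

        # Continuous outcome: large-sample z test on the difference in means.
        se = sqrt(
            statistics.variance(females) / n_per_group
            + statistics.variance(males) / n_per_group
        )
        z_cont = (statistics.fmean(females) - statistics.fmean(males)) / se
        hits_cont += abs(z_cont) > z_crit

        # Dichotomised outcome: proportion above the pooled 30th centile,
        # compared with a two-proportion z test.
        pooled = sorted(females + males)
        cutoff = pooled[int(0.30 * len(pooled))]
        p_f = sum(x > cutoff for x in females) / n_per_group
        p_m = sum(x > cutoff for x in males) / n_per_group
        p_all = (p_f + p_m) / 2
        se_p = sqrt(p_all * (1 - p_all) * 2 / n_per_group)
        z_dich = (p_f - p_m) / se_p
        hits_dich += abs(z_dich) > z_crit
    return hits_cont / n_sims, hits_dich / n_sims


if __name__ == "__main__":
    power_cont, power_dich = simulate_power()
    print(f"continuous: {power_cont:.2f}, dichotomised: {power_dich:.2f}")
```

Under these assumptions the dichotomised test rejects the null hypothesis less often than the continuous one for the same simulated data, which is the mechanism the text proposes for the differing subtest findings.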
A previous study demonstrated that use of the UKCAT as a threshold score in the admissions process appears to ameliorate the disadvantage faced by lower socio-economic groups when applying to medical schools [5]. In addition, use of the UKCAT scores as a threshold in the admissions process was associated with increased odds of entrants being male, from a low socioeconomic status background and a state (non-grammar) school (the latter trend not reaching statistical significance). In contrast, universities placing less emphasis on use of the test were more likely to admit entrants with relatively low academic attainment and with English as a second language. These observations are generally consistent with the properties of the two performance metrics as reported in this present study. Thus, the present findings imply that it is mainly the differences in sensitivity to sociodemographic factors (i.e. bias) between A levels and the UKCAT that are driving these differences. The obvious exception to this is that in the present study we found no evidence that the UKCAT was less biased than A levels against those from a non-professional socioeconomic background. However, it is possible that if UKCAT performance is less sensitive to schooling than A level attainment then this difference may be at least partly mediating this previous observation [5]. Moreover, our present results do not explain why universities that place little emphasis on the test scores may be more likely to have entrants with below average educational performance.
Limitations
The primary limitation of this study was that analysis could only be conducted on a minority of applicants for 2009, due to missing data. This limits our confidence in the generalisability of these findings to the wider population of UKCAT candidates. In particular, the individuals with complete data were more likely to be of White ethnicity, have attended an independent/grammar school, be younger and to speak English as a first language. Thus, we must be extremely cautious in drawing any conclusions about the association between WP variables and UKCAT performance in sub-groups of candidates who belong to the opposite sociodemographic categories. Missing data modelling in conjunction with imputational approaches could have been used to inform sensitivity analysis (i.e. assess how robust the findings are under different assumptions). Such an approach has been previously employed with educational data to provide an indication of the extent to which data are missing at random as opposed to being non-ignorable [5]. However, it was felt that imputational approaches could have added a significant degree of uncertainty to the dataset, especially as more than one variable would have had to be imputed. In addition, the final sample providing data for analysis would have been difficult to compare with the previous sub-group of candidates studied [10]. Inclusion of advanced qualifications other than A levels may have modestly addressed the missingness but also potentially added a degree of complexity and possible confounding; it is uncertain to what extent other tests of educational attainment are equivalent to each other (e.g. Scottish Highers vs A levels). Thus, on balance, it was felt that restricting the analysis to those with complete data would enhance the internal validity of the findings, accepting that this would be at the expense of generalisability of the results observed.
Whilst some descriptive analysis was conducted using more broadly defined ethnic groups it would have been desirable to have detailed modelling in relation to ethnicity. Certain ethnic groups (e.g. those describing themselves as ‘Black’) are relatively under-represented in UK medical and dental education whilst others are over-represented (e.g. Asians) in relation to the national population demographics [12].
A further limitation, as stated earlier, is that the use of Tobit regression, whilst necessary with censored data, still leads to some informational loss compared to linear regression, and the coefficients and confidence intervals produced by the two approaches (i.e. Tobit and linear regression) may not be easily comparable.
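To make the distinction between the two regression approaches concrete, the following is a minimal sketch of a Tobit log-likelihood for a single predictor. It assumes the outcome is censored at a known upper ceiling; the text states only that the data are censored, so the direction and the ceiling value are assumptions, and all names are hypothetical. It is not the model fitted in the study.

```python
from math import erfc, log, pi, sqrt


def normal_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * log(2 * pi)


def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))


def tobit_loglik(y, x, intercept, slope, sigma, upper):
    """Log-likelihood of a Tobit model right-censored at `upper`.

    The latent outcome is modelled as intercept + slope * x + noise with
    standard deviation `sigma`. Observations recorded at the ceiling
    contribute the probability that the latent value is at or above it;
    all others contribute the usual normal density, as in linear regression.
    """
    total = 0.0
    for yi, xi in zip(y, x):
        mu = intercept + slope * xi
        if yi >= upper:
            total += log(normal_sf((upper - mu) / sigma))  # censored case
        else:
            total += normal_logpdf((yi - mu) / sigma) - log(sigma)
    return total


if __name__ == "__main__":
    # Hypothetical tariff scores with an assumed ceiling of 360.
    scores = [300.0, 340.0, 360.0, 360.0]
    predictor = [0.0, 1.0, 1.0, 0.0]
    print(tobit_loglik(scores, predictor, 330.0, 10.0, 25.0, upper=360.0))
```

Maximising this function over the intercept, slope and sigma gives Tobit estimates. These coefficients describe the latent, uncensored outcome rather than the observed one, which is one reason they may not be directly comparable with coefficients from linear regression on an uncensored score.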
Implications for practice and directions for future research
The UKCAT is a high stakes test; the psychometric properties of the test, in conjunction with its widespread adoption as part of the admissions process, could at least partly determine the nature of the UK’s future medical and dental workforce. Both the present and a previous study report evidence of a certain degree of sociodemographic bias in the UKCAT responses. Possibly the most consistently reported of these is the observation that males achieve higher scores on the UKCAT than females. In the UK females are currently disproportionately represented amongst medical and dental school entrants. This issue has, at times, stirred up controversial debate [16]. Currently female doctors are more likely to work part-time [17] and retire early [18] compared to their male counterparts. Nevertheless, it should be highlighted that a previous study reported that female doctors were at lower risk of professional misconduct after qualification, even after adjusting for a number of potential confounding factors [19]. Moreover, in the UK, women may outperform men in certain medical undergraduate [20] and post-graduate exams [21]. This study has highlighted some differences in the sensitivities of the UKCAT subtests to sex. Thus our findings may assist universities in making informed decisions about how much weight to place on each element of the UKCAT when selecting entrants.
The potential insensitivity of the UKCAT to educational background is certainly a factor that could help address the issue of widening participation in the professions. However, it should be highlighted that the WP agenda is not purely focused on issues of social equity and fairness; there is evidence from North American research that students drawn from minority populations may be more likely to eventually practise in areas that have been traditionally underserved by health care provision [22]. In the US attempts to address racial imbalances within the professions, including medicine, via the ‘affirmative action’ approach have proved controversial and have been the subject of a series of court cases [23]. Earlier North American researchers have suggested that the use of cognitively based aptitude tests (such as the UKCAT) will never address the under-representation of racial minorities in medical education as such instruments tend to produce similar mean raw scores according to ethnicity [24]. Rather, it has been postulated that the most plausible way of achieving a medical school population with a similar ethnic profile to the population from which they are drawn is to have quotas for each group. It has been suggested that these quotas can be fulfilled without any appreciable lowering of average academic performance at medical school [25]. In the UK such affirmative action-style approaches have not been adopted and even their legality would have to be tested.
Our findings suggest that the UKCAT may penalise those who do not speak English as a first language more severely than science-based A levels do. Nevertheless, fluency in spoken English has been reported to correlate significantly with patient and examiner ratings of global communication, which is considered a key attribute of a clinician [26]. It could therefore be argued that it is reasonable for the UKCAT to evaluate elements of linguistic ability such as verbal reasoning.
Future research should focus on obtaining further evidence regarding whether or not the UKCAT has the ability to predict undergraduate and post-graduate performance and progression, over and above that possible via traditional measures of educational attainment. Moreover, it may be that the content and delivery of the test can be modified to further decrease the sensitivity to educational background. In the field of education there is some evidence that ‘dynamic testing’ may be better at predicting an individual’s academic potential compared to traditional (‘static’) cognitive assessments. This may be especially true where a candidate’s education has been poor or disrupted [27]. Dynamic tests create a learning environment within the test structure by providing novel situations and then evaluate the nature and number of prompts, hints and clues the candidate requires in order to achieve a correct response. Such tests correlate highly with traditional ‘intelligence tests’ but provide additional information relating to cognitive flexibility and learning potential [28]; attributes obviously pertinent to medical or dental practitioners.