Impact of Progress testing on the learning experiences of students in medicine, dentistry and dental therapy
BMC Medical Education volume 18, Article number: 253 (2018)
To investigate the impact of progress testing on the learning experiences of undergraduate students in three programs, namely medicine, dentistry and dental therapy.
Participants were invited to respond to an online questionnaire to share their perceptions and experiences of progress testing. Responses were recorded anonymously, but data on their program, year of study, age, gender, and ethnicity were also captured on a voluntary basis.
A total of 167 participants completed the questionnaire, yielding a response rate of 27.2%. These included 96 BMBS students (27.4%), 56 BDS students (24.7%), and 15 BScDTH students (39.5%). A 3-Program (BMBS, BDS, BScDTH) by 8-Topic (A-H) mixed analysis of variance (ANOVA) was conducted on the questionnaire responses. This revealed statistically significant main effects of Program and Topic, as well as a statistically significant interaction between the two (i.e. the pattern of topic differences was different across programs).
Undergraduate students in medicine, dentistry, and dental therapy and hygiene regarded PT as a useful assessment to support their learning needs. However, in comparison to students in dentistry and dental therapy and hygiene, the perceptions of medical students were less positive in several aspects of PT.
Progress testing (PT) is now an established and accepted form of assessing applied knowledge in contemporary undergraduate medical curricula. Students are examined throughout the program, facilitating a longitudinal and comprehensive assessment of growth of knowledge. The standard of questions is set at the level expected of a new graduate and students in all years of a program sit the same test simultaneously [1, 3]. Growth in applied knowledge is indexed by a steady increase in scores and enables reliable and valid decision-making about progression to the next stage of the program. PT is a feedback-oriented assessment and provides extensive opportunities for remediation of poorly-performing students; this allows students and their academic supervisors to identify areas of weakness and provide feedback to improve performance in successive years.
The rationale for the development and use of PT in undergraduate medical curricula is to discourage the traditional approaches to preparation that students adopt for end-of-unit tests, and PT may offer several advantages. Whereas traditional end-of-module tests promote short-term, surface-level revision strategies, PT encourages students to acquire information throughout the duration of the module, breaking the link between learning and revision and reinforcing the spiral curriculum [6, 7]. Given that PT is aimed at testing the application of knowledge to real-life clinical situations, it may enhance students’ motivation for learning and may help reduce stress associated with assessment by avoiding high-stakes examinations.
Although these varied benefits of PT are widely reported in the literature, there are few published studies which explore the experiences and perceptions of the undergraduate medical students undertaking PT [10, 11]. The Faculty of Medicine and Dentistry, University of Plymouth uses PT for the assessment of knowledge in three different undergraduate programs and one postgraduate program: Bachelor of Medicine and Bachelor of Surgery (BMBS); Bachelor of Dental Surgery (BDS); Bachelor of Dental Therapy and Hygiene (BScDTH); and postgraduate diploma in Physician Associate (PA) studies. The undergraduate curricula for the BMBS, BDS and BScDTH programs were designed around a problem-based learning model with a spiral curriculum in successive years to allow review and repeated exposure to applied knowledge [3, 12, 13]. We have reported our experience in the development and use of PT in Medicine, Dentistry [14, 15] and Dental Therapy programs. To our knowledge, our institution is the first to use PT in BDS and BScDTH programs and we could not identify any published literature on the impact of PT on the learning experiences of students in these programs.
The aim of this study was to provide data allowing student perceptions of the impact of progress testing to be compared and contrasted across the three programs: BMBS, BDS, and BScDTH.
Ethics approval for the study was obtained from the institutional Research Ethics Committee (Reference Number 16/17–695).
The study was conducted at the Faculty of Medicine and Dentistry, University of Plymouth, United Kingdom (UK). The survey was open to responses throughout summer 2017.
Study design, sample, and materials
Invitations to participate in the study were circulated by e-mail by the Faculty Administrator to all current BMBS (n = 350), BDS (n = 227) and BScDTH (n = 38) students (total n = 615). The invitation was accompanied by a participant information sheet detailing the purpose and scope of the study along with a URL to an online version of a previously validated questionnaire designed to investigate the impact of progress testing on the learning experiences of students (Appendix). The questionnaire was hosted on Google Forms and all students were sent a reminder two weeks after the initial invitation.
All participants completed online consent forms prior to completing the survey. Responses were recorded anonymously, but data on their program, year of study, age, gender and ethnicity were also captured on a voluntary basis. Whilst neither the number of submissions nor their timestamps raised any concerns, the risks of students submitting multiple forms and of non-students submitting responses are acknowledged.
Analyses were conducted using the R statistical environment for Windows (https://www.r-project.org/).
Although the data are derived from ordinal-level Likert-scale responses, and show some violations of parametric assumptions, previous work has shown that such data can be treated as interval for the purposes of analysis of variance (ANOVA) with minimal risk of Type I and Type II errors [16, 17], and they are largely treated as such in the medical education and social sciences literature. Furthermore, corrections for homogeneity of variance and equivalent non-parametric analyses conducted on the current data lead to the same statistical conclusions as ANOVA models; we therefore present the results of these familiar models to avoid the overall conclusions being confused by statistical nuance, whilst acknowledging here the range of alternative analysis strategies for such data.
The questionnaire comprised 44 items, with 43 being scored on a Likert scale of one (strongly disagree) to five (strongly agree). Each of these items was allocated to one of eight ‘topics’ as shown in Table 1. The scoring of several items was reversed due to negative phrasing and these are indicated in the Appendix, along with the group allocation and mean score by program for each item. Item 41 investigated the resources students use to prepare for a PT, offering five options, of which any number could be selected, along with a free-text field. The data collected from this item have contributed to wider research.
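The reverse scoring described above can be illustrated with a minimal Python sketch. It assumes the conventional rule for a one-to-five scale (reversed score = 6 minus the raw score); the item names, the choice of which item is reversed, and the scores are hypothetical and are not taken from the questionnaire. The analyses in the study were run in R, so this is an illustration only.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 5  # Likert anchors described in the text


def reverse_score(score):
    """Map 1<->5 and 2<->4, leaving 3 unchanged, on a 1-5 scale."""
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError(f"score {score} is outside the 1-5 scale")
    return SCALE_MIN + SCALE_MAX - score


def topic_mean(responses, reversed_items):
    """Average one respondent's item scores after reversing flagged items."""
    adjusted = [
        reverse_score(score) if item in reversed_items else score
        for item, score in responses.items()
    ]
    return mean(adjusted)


# Hypothetical respondent: item names and scores are illustrative only.
example_responses = {"item_a": 4, "item_b": 2, "item_c": 5}
example_reversed = {"item_b"}  # assumed here to be negatively phrased
example_mean = topic_mean(example_responses, example_reversed)
print(example_mean)
```

After reversal, a higher score consistently indicates a more favorable view of the PT, which is what allows items within a topic to be averaged.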
The questionnaire was completed by 167 participants, yielding a response rate of 27.2%. These included 96 BMBS students (27.4%), 56 BDS students (24.7%), and 15 BScDTH students (39.5%). The demographics (Program, Stage, Gender and Ethnicity) of the participants are detailed in Table 2. Whilst this study only considers Program as a factor, additional demographic data are reported to allow readers to assess the generalizability of our results to other cohorts and samples.
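The response rates follow directly from the invited and responding counts given in the text. As a quick check, the arithmetic can be sketched in Python; the variable and function names are illustrative.

```python
# Invited and responding student counts as reported in the text.
invited = {"BMBS": 350, "BDS": 227, "BScDTH": 38}
responded = {"BMBS": 96, "BDS": 56, "BScDTH": 15}


def response_rate(n_responded, n_invited):
    """Return the response rate as a percentage rounded to one decimal place."""
    return round(100 * n_responded / n_invited, 1)


rates = {program: response_rate(responded[program], invited[program])
         for program in invited}
overall = response_rate(sum(responded.values()), sum(invited.values()))

for program, rate in rates.items():
    print(f"{program}: {responded[program]}/{invited[program]} = {rate}%")
print(f"Overall: {sum(responded.values())}/{sum(invited.values())} = {overall}%")
```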
Topic scores by program
Figure 1 shows the score distributions and observed mean scores (with reverse scoring applied) for each topic by program.
Variation by item and program across topics
A 3-Program (BMBS, BDS, BScDTH) by 8-Topic (A-H) mixed analysis of variance (ANOVA) was conducted using the item-level scores. This revealed statistically significant main effects of Program (F(2, 6682) = 33.20, p < 0.001) and Topic (F(7, 6682) = 44.38, p < 0.001), as well as a statistically significant interaction between the two (i.e. the pattern of topic differences was different across programs; F(14, 6682) = 8.01, p < 0.001).
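The numerator degrees of freedom reported for the three effects follow from the number of levels of each factor. A minimal Python sketch of that arithmetic, assuming a fully crossed two-factor design, is given here; the function name is illustrative, and the residual degrees of freedom (6682) are taken from the text rather than derived.

```python
def factorial_degrees_of_freedom(levels_a, levels_b):
    """Degrees of freedom for the effects in a two-factor crossed design."""
    df_a = levels_a - 1
    df_b = levels_b - 1
    return {"A": df_a, "B": df_b, "A x B": df_a * df_b}


N_PROGRAMS = 3  # BMBS, BDS, BScDTH
N_TOPICS = 8    # Topics A-H

dfs = factorial_degrees_of_freedom(N_PROGRAMS, N_TOPICS)
n_cells = N_PROGRAMS * N_TOPICS  # number of Program-by-Topic combinations
print(dfs, n_cells)
```

With three programs and eight topics this gives 2, 7 and 14 degrees of freedom for Program, Topic and their interaction, matching the F-statistics reported above.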
Variation by program and item within topics
The same ANOVA model was conducted on subsets of items by topic. Table 3 shows where there were statistically significant Program, Item, and Program-by-Item interactions for the scores within each topic. For example, the first row shows that students in different programs differed in their view of the overall value of the progress test, and their scores differed across items in this topic, but there was no Program-by-Item interaction for this group of questions (i.e. the same pattern of item differences was found across students in each program).
Of most interest to this project are the differences between Programs and Programs by Item. The former identifies differences in attitudes to progress testing between students on different programs; the latter identifies where there may be differential responding to items in each topic across students in different programs. As can be seen in Table 3, a statistically significant effect of Program was found in Topics A (overall value of Progress Test) and D (Test Behavior), and a statistically significant interaction between Program and Item was found in Topics B (Preparation Styles), C (Psychological Impact) and D (Test Behavior).
Although significant differences were found between individual items within topics, these are of less value to the research questions and were not subjected to post-hoc analysis. Tables 4, 5, 6 and 7 provide the results of post-hoc analysis by topic, showing the estimate (the difference between the means), the standard error associated with the difference between the means, the t statistic, and the p-value associated with these analyses.
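To make the quantities in these tables concrete, the following Python sketch shows how an estimate, its standard error and a t statistic relate for a single pairwise comparison. It uses a plain pooled-variance two-sample comparison on made-up scores, which is a simplification and not necessarily the procedure used in the study (the text refers to Tukey’s HSD, which pools error across all groups and adjusts p-values for multiple comparisons). The p-value is omitted because it requires the t distribution, which is not available in the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def pairwise_comparison(group_a, group_b):
    """Return the mean difference, its standard error, t and df (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    estimate = mean(group_a) - mean(group_b)
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return {
        "estimate": estimate,
        "std_error": std_error,
        "t": estimate / std_error,
        "df": n_a + n_b - 2,
    }


# Hypothetical Likert scores for one item in two programs (illustrative only).
group_a = [4, 5, 4, 3, 5, 4]
group_b = [3, 2, 3, 4, 2, 3]
result = pairwise_comparison(group_a, group_b)
print(result)
```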
Post hoc analysis by topic
Student views on the overall value of the Progress test (topic A)
Post-hoc analysis of the Program differences in the overall value of Progress Tests revealed that there were significant differences for seven of the 16 items within this topic, with the differences being between the BMBS and BDS and/or the BMBS and BScDTH program, as highlighted in Table 4. The specific differences are outlined below; where a comparison between two or more groups is omitted, it was not found to be statistically significant.
Across the three programs, students agreed that the PT was a useful form of assessment, with the BDS students agreeing significantly more with this statement (Item 1; BDS mean = 4.33 versus 4.00 and 3.41 for BScDTH and BMBS respectively). BDS (M = 3.82) and BMBS (M = 2.77) student responses differed significantly in their scores for the statement ‘The Progress Test is not a fair test’, with the BMBS students agreeing more strongly that it is not a fair test (Item 2). BMBS students were significantly more inclined than both the BDS and BScDTH students to agree that the PT questions are too clinically based to be applicable to students in the early years (Item 6; means of 3.85, 3.73, and 2.09 for BScDTH, BDS, and BMBS respectively).
BDS (M = 4.06) and BScDTH (M = 4.00) students agreed that the PT is a good way to examine what they learn day to day on the course, with the BMBS students (M = 2.09) agreeing significantly less with this statement (Item 8). BMBS students agreed significantly more than the BDS students that they would be encouraged to work harder for an examination that just tested areas they had already covered (Item 10; means of 3.95 and 2.88 for BMBS and BDS respectively). BScDTH students agreed significantly more strongly than the BMBS students that patient contact in the early years is helpful preparation for the PT (Item 28; means of 4.62 and 2.21 for BScDTH and BMBS respectively). There was no strong positive agreement that the PT is a good way of assessing an EBL/PBL based curriculum, with BMBS students (M = 2.21) agreeing significantly less than the BDS students (M = 3.45, Item 39).
Preparation styles (topic B)
Of the nine items in this topic, program scores only differed within item 42, with only the difference between the BMBS (M = 3.46) and BDS (M = 2.51) programs reaching statistical significance (Tukey’s HSD, p = 0.024). BMBS students agreed significantly more strongly than the BDS students that textbooks are not the best source for preparation as shown in Table 5.
Psychological impact of the Progress test (topic C)
Of the two items in this topic, 4 and 5, program scores only differed within item 5, with only the difference between the BMBS (M = 3.12) and BDS (M = 2.45) programs reaching statistical significance (Tukey’s HSD, p = 0.024). BMBS students agreed significantly more strongly than the BDS students that it was disheartening to sit an exam with questions to which they knew so few of the answers (Item 5, reverse scored). These results are summarized in Table 6.
Test behavior (topic D)
Of the two items in this topic, 37 and 38, program scores only differed within Item 37, with only the difference between BDS and BMBS reaching statistical significance (Tukey’s HSD, p < 0.001). BMBS students were significantly more likely than BDS students to agree with the statement that they guessed the answers to most of the items in the PT (Item 37; means of 2.95, 2.15, and 1.98 for BMBS, BScDTH, and BDS respectively). These results are depicted in Table 7.
This is the first study exploring undergraduate experiences of PT across three different programs in healthcare education. Several advantages of PT over traditional assessment methods have been highlighted in the literature [19, 20, 21]. However, most published studies focus on the philosophy, format, and metrics of PT along with the perspectives of experts [3, 22]. Notwithstanding the need to ensure that assessments are valid, reliable, and feasible, it is crucial that the assessment methods are acceptable to the stakeholders. Given that students are the key stakeholders in the assessments used in educational programs, it is imperative to gauge their perceptions and experiences to inform the future development of assessment methods. Previous studies have reported that educators and students may sometimes be at odds about the usefulness of curriculum interventions and assessments. Although a small number of studies have reported the views and experiences of undergraduate medical students on PT [10, 11, 24], there are no previous studies involving undergraduate students in Dentistry and Dental Therapy and Hygiene.
Overall, students across all programs were positive about the value of PT as a useful assessment to support their learning in their respective domains of study. However, BMBS students were less positive about the clinical context of PT compared to BDS and BScDTH students. One possible explanation for this variation is that dental students get more structured clinical exposure in early years at our institution. Dentistry is a unique pedagogical experience and training in dentistry involves performing irreversible operative procedures on patients under supervision. BDS and BScDTH students at Plymouth University start treating patients towards the end of Year 1 of their respective programs and this may account for their ability to apply knowledge to clinical situations in the early years of the program. Differences in the level of clinical exposure in early years may also account for the lower scores reported by medical students with regard to the impact of enquiry-based learning and prior learning on their preparation and performance on PT. It has been reported that early clinical exposure translates into improved perceptions about the usefulness of PT and, consequently, students are more likely to use a deep learning approach.
Another possible explanation for the lower scores reported by medical students may be related to differences in the format and standard setting of PT for the BMBS program compared to those for the BDS and BScDTH programs at our institution. Firstly, the BMBS students sit 125-item single-best-answer multiple-choice question assessments four times each academic year. On the other hand, PT for the BDS and BScDTH programs is homogeneous with regard to format, frequency and standard setting. BDS and BScDTH students sit 100-item single-best-answer multiple-choice assessments once per term (three times annually). Moreover, the final-year BMBS assessments are criterion-referenced with a pass-fail outcome, whereas earlier years are norm-referenced against set-proportion categorical grading. The BDS and BScDTH programs, however, are standard set using a combined Angoff-Hofstee procedure to generate a cut-score, around which categorical grade boundaries are constructed [27, 28]. These differences may influence perceptions of discrimination between programs. Our findings are supported by previous studies on medical students which show that the format and specific details of how PT is conducted have an impact on student learning.
Assessments are generally reported to be stressful for students. Overall, the students reported mixed perceptions regarding the psychological impact of PT, with medical students being less positive. As explained earlier, these variations between medical and dental students may be attributed to the differences in curriculum design, clinical exposure and format of PT, as well as variation in progression rules. Another study on medical students has reported that student stress associated with PT may be related to a general lack of understanding about the purpose of PT and struggles in developing a strategy to answer questions. Nevertheless, attitudes are likely to improve as the students progress through the program, developing strategies for using their time effectively in the assessment and an understanding of the underlying philosophy of PT.
One of the limitations of this study is the low response rate, especially for the BMBS program. Moreover, the data were collected from a single institution. The authors nevertheless have no reason to suspect that the sample differs in any critical respect from the wider population of students in our institution or in healthcare education more widely. Future work may benefit from sampling across multiple sites to increase sample size and from supplementing the quantitative measures with open-ended questions and qualitative exploration of differing perceptions across programs. Another potentially fruitful avenue of investigation would be to further our understanding of the views of academic staff, and explore how these may be reflected by, or otherwise influence, student perceptions. Whereas the current results begin to shed light on student perceptions of progress testing and its impact on their learning, these additional dimensions would further develop our understanding of the impact of progress testing across different domains of healthcare education.
Undergraduate students in medicine, dentistry, and dental therapy and hygiene regarded PT as a useful assessment to support their learning needs. However, in comparison to students in dentistry and dental therapy and hygiene, the perceptions of medical students were less positive in several aspects of PT. These variations may, in part, be attributed to differences in clinical exposure in early years and test standardization. Further research with a range of stakeholders is required to establish the causes of these differences and develop our understanding of the perceived value and impact of PT on the learning experiences of healthcare students in undergraduate programs.
ANOVA: Analysis of Variance
BDS: Bachelor of Dental Surgery
BMBS: Bachelor of Medicine and Bachelor of Surgery
BScDTH: Bachelor of Science Dental Therapy and Hygiene
Van der Vleuten CPM, Verwijnen GM, Wijnen WHFW. Fifteen years of experience with progress testing in a problem-based learning curriculum. Med Teach. 1996;18:103–10.
Blake JM, Norman GR, Keane DR, Barber Mueller C, Cunnington J, Didyk N. Introducing progress testing in McMaster University’s problem-based medical curriculum: psychometric properties and effect on learning. Acad Med. 1996;71:1002–7.
Freeman AC, Ricketts C. Choosing and designing knowledge assessments: experience at a new medical school. Med Teach. 2010;32(7):578–81.
Coombes L, Ricketts C, Freeman A, Stratford J. Beyond assessment: feedback for individuals and institutions based on the progress test. Med Teach. 2010;32(6):486–90.
Muijtjens AM, Schuwirth LW, Cohen-Schotanus J, van der Vleuten CP. Differences in knowledge development exposed by multi-curricular progress test data. Adv Health Sci Educ Theory Pract. 2008;13(5):593–605.
Van der Vleuten CPM. The assessment of professional competence: developments, research and practical implications. Adv Health Sci Educ. 1996;1:41–67.
Pugh D, Regehr G. Taking the sting out of assessment: is there a role for progress testing? Med Educ. 2016;50(7):721–9.
Brunk I, Schauber S, Georg W. Do they know too little? An inter-institutional study on the anatomical knowledge of upper-year medical students based on multiple choice questions of a progress test. Ann Anat. 2017;209:93–100.
Chen Y, Henning M, Yielder J, Jones R, Wearn A, Weller J. Progress testing in the medical curriculum: students’ approaches to learning and perceived stress. BMC Med Educ. 2015;15:147.
Wade L, Harrison C, Hollands J, Mattick K, Ricketts C, Wass V. Student perceptions of the progress test in two settings and the implications for test deployment. Adv Health Sci Educ Theory Pract. 2012;17(4):573–83.
Yielder J, Wearn A, Chen Y, Henning MA, Weller J, Lillis S, Mogol V, Bagg W. A qualitative exploration of student perceptions of the impact of progress tests on learning and emotional wellbeing. BMC Med Educ. 2017;17(1):148.
McHarg J, Kay EJ. The anatomy of a new dental curriculum. Br Dent J. 2008;204(11):635–8.
Ali K, Zahra D, Tredwin C, Mcilwaine C, Jones G. Use of Progress testing in a UK dental therapy and hygiene educational program. J Dent Educ. 2018;82:130–6.
Ali K, Coombes L, Kay E, Tredwin C, Jones G, Ricketts C, Bennett J. Progress testing in undergraduate dental education: the peninsula experience and future opportunities. Eur J Dent Educ. 2016;20(3):129–34.
Bennett J, Freeman A, Coombes L, Kay L, Ricketts C. Adaptation of medical progress testing to a dental setting. Med Teach. 2010;32(6):500–2.
Norman G. Likert scales, levels of measurement and the “laws” of statistics. Adv Health Sci Educ. 2010;15:625.
Gaito J. Measurement scales and statistics: resurgence of an old misconception. Psychol Bull. 1980;87:564–7.
Ali K, Cockerill J, Zahra D, Ferguson C. Do students on different programmes use different resources to prepare for the Progress test? EBMA looking ahead in Progress testing conference; 2018.
Schuwirth L, Bosman G, Henning R, Rinkel R, Wenink A. Collaboration on progress testing in medical schools in the Netherlands. Med Teach. 2010;32:476–9.
Swanson DB, Holtzman KZ, Butler A, Langer MM, Nelson MV, Chow JW, Fuller R, Patterson JA, Boohan M. The multi-school Progress testing committee. Collaboration across the pond: the multi-school progress testing project. Med Teach. 2010;32(6):480–5.
Wrigley W, van der Vleuten CP, Freeman A, Muijtjens A. A systemic framework for the progress test: strengths, constraints and issues: AMEE guide no. 71. Med Teach. 2012;34(9):683–97.
Ravesloot C, Van der Schaaf M, Muijtjens A, Haaring C, Kruitwagen C, Beek F, Bakker J, Van Schaik J, Ten Cate T. The don’t know option in progress testing. Adv Health Sci Educ Theory Pract. 2015;20:1325–38.
Muijtjens AM, Timmermans I, Donkers J, Peperkamp R, Medema H, Cohen-Schotanus J, Thoben A, Wenink AC, van der Vleuten CP. Flexible electronic feedback using the virtues of progress testing. Med Teach. 2010;32(6):491–5.
Matsuyama Y, Muijtjens AM, Kikukawa M, Stalmeijer R, Murakami R, Ishikawa S, Okazaki H. A first report of east Asian students’ perception of progress testing: a focus group study. BMC Med Educ. 2016;16(1):245.
Ali K, Zahra D, McColl E, Salih V, Tredwin C. Impact of early clinical exposure on the learning experience of undergraduate dental students. Eur J Dent Educ. 2017;22(1):e75–80. https://doi.org/10.1111/eje.12260.
Polychronopoulou A, Divaris K. Dental students’ perceived sources of stress: a multi-country study. J Dent Educ. 2009;73(5):631–9.
Verhoeven BH, Van der Steeg AFW, Scherpbier AJJA, Muijtjens AMM, Verwijnen GM, van der Vleuten CPM. Reliability and credibility of an Angoff standard setting procedure in progress testing using recent graduates as judges. Med Educ. 1999;33:832–7.
Norcini JJ. Setting standards on educational tests. Med Educ. 2003;37(5):464–9.
Pradhan G, Mendinca NL, Kar M. Evaluation of examination stress and its effect on cognitive function among first year medical students. J Clin Diagn Res. 2014;8(8):BC05–7.
The authors would like to thank the Academic Staff, Assessment Working Group and the Examinations Team, Faculty of Medicine and Dentistry, University of Plymouth for their contribution to Progress tests.
The authors would also like to thank the reviewers for their thoughtful suggestions on the methods of analysis, which have enabled us to gain confidence in the conclusions drawn.
There was no funding for the research reported in this article.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Ethics approval and consent to participate
Ethics approval for the study was obtained from the Faculty of Health and Faculty of Medicine and Dentistry, University of Plymouth Research Ethics Committee (Reference Number 16/17–695).
The study was conducted online and all participants completed an online consent form prior to providing their responses.
Consent for publication
The authors declare that they have no competing interests.