The search findings
Following the primary search, 2534 studies were found by searching the databases and hand searching the references; 2309 studies remained after duplicates were eliminated. After the titles and abstracts were reviewed, nine studies that satisfied all of the selection criteria were identified. Of the nine studies [9, 11–14, 16–19] selected for systematic review, eight [9, 11, 13, 14, 16–19] were conducted in nursing education, while the remaining study [12] involved occupational therapy students.
However, Velde and colleagues [12] did not report subscale data for the measurement tool. Therefore, this study was excluded from the meta-analysis. Consequently, eight studies were selected for the final review (Fig. 1).
Study quality
Overall, the eight selected studies were assessed for risk of bias (Fig. 2). The quality assessment revealed that one study [14] satisfied six of the risk-of-bias items, six studies [11, 13, 16–19] satisfied five items, and one study [9] satisfied only three items. Three studies [11, 16, 18] were judged as having a high risk of bias in random sequence generation because they did not randomly assign participants to the control and experimental groups. Furthermore, only one study [13] had a low risk on allocation concealment, while the remaining seven studies did not report the allocation sequence. Only one study [9] did not blind the experimental group and the investigator to the intervention program. In addition, participants in that study realized during the program that they were being observed by the researcher. Therefore, this study carried a potential risk of the Hawthorne effect, which can produce invalid results attributable to participants’ expectations. One study [13] might have both reporting bias and attrition bias: it reported selectively, i.e., it mentioned only the effective experimental results, and it did not report missing data, which indicates attrition bias.
Study characteristics
Of the eight selected studies, one was published in each of 2003, 2004, 2006, 2007, and 2008, and three were published in 2012. The studies were conducted in a wide variety of countries, including Korea [11], China [19], Thailand [14], Hong Kong [13], Taiwan [9], Turkey [16], Iran [17], and the United States [18].
Regarding research design, four studies (50 %) used a randomized pretest-posttest control group design [9, 13, 14, 16], while four (50 %) used a quasi-experimental, nonequivalent pretest-posttest control group design [11, 17–19]. Concerning the instrument used to measure critical thinking, three studies [13, 16, 17] used the CCTDI, three [11, 18, 19] used the CCTST, and two [9, 14] used both the CCTDI and the CCTST.
The participants in most studies (6 studies; 75 %) were nursing students (midwifery students in one study), while staff nurses and nurse practitioner students were the participants in the remaining two. Sample sizes ranged from 23 to 67, and the pooled sample size was 647 (experimental group = 327, control group = 320) for studies that measured the CCTDI and 452 (experimental group = 230, control group = 222) for studies that measured the CCTST.
Characteristics of educational method
Regarding the teaching and learning methods used to improve participants’ critical thinking skills, three studies used PBL [11, 13, 19], three used concept mapping [9, 16, 18], one used bioscientific multimedia [14], and one used a collaborative method [17].
The intervention period varied from 8 weeks to two semesters. Regarding PBL, Yuan et al.’s [19] implementation lasted one semester, i.e., 2 h weekly for 18 weeks, totaling 36 h, whereas Tiwari et al.’s [13] spanned two semesters, with 3–6 h weekly for 28 weeks. Lessons using concept mapping, on the other hand, were conducted for 40 min on a biweekly basis for 16 weeks [9]; alternatively, as in Wheeler and Collins’ [18] implementation, participants prepared concept maps for practical training each week during a 15-week training period following a brief orientation.
To verify the long-term effects of education, Tiwari et al. [13] measured subjects three times following the intervention, while Kaveevivitchai et al. [14] measured subjects twice after the intervention. The remaining six studies measured subjects only once, immediately after the intervention. The characteristics of the included studies are summarized in Additional file 2.
Results of the meta-analysis
The meta-analysis of the overall and subscale scores used eight CCTDI and six CCTST outcome datasets drawn from the included studies. The eight CCTDI datasets showed moderate heterogeneity (χ² = 19.08, p = .008, I² = 63 %). The random effects model analysis revealed that the teaching and learning methods used in these studies produced significantly higher scores than in the control group (SMD: 0.42, 95 % CI: 0.26–0.57, p < .00001; Fig. 3). The CCTDI cutoff and target scores were 280 and 350, respectively [9]. The experimental groups in three studies [13, 14, 17] scored higher than 280 after the non-traditional educational intervention. However, none of the experimental groups reached the target score of 350. Analysis of the CCTDI subscale scores revealed significantly greater increases than in the control group for truth-seeking (SMD: 0.32, 95 % CI: 0.17–0.47, p < .0001), open-mindedness (SMD: 0.37, 95 % CI: 0.22–0.53, p < .00001), analyticity (SMD: 0.28, 95 % CI: 0.09–0.46, p = .004), critical thinking confidence (SMD: 0.34, 95 % CI: 0.18–0.49, p < .0001), and inquisitiveness (SMD: 0.36, 95 % CI: 0.21–0.52, p < .00001), whereas the increase in maturity (SMD: 0.16, 95 % CI: −0.01–0.32, p = 0.06) was not statistically significant (Additional file 3). Although a CCTDI subscale score higher than 50 indicates a strong critical thinking disposition, only one study showed a score of 50 or higher, for ‘open-mindedness’ and ‘inquisitiveness’ [14]. The funnel plot was symmetric in shape, suggesting a lack of publication bias (Fig. 4).
The six datasets presenting the effects of teaching and learning methods on the CCTST exhibited a high level of heterogeneity (χ² = 23.32, p = .0003, I² = 79 %). Consequently, the random effects model was used for analysis, which showed that the teaching and learning methods used in these studies had a significant effect on the overall CCTST score compared with the control group (SMD: 0.29, 95 % CI: 0.10–0.48, p = 0.003; Fig. 3). Analysis of the subscale scores, however, did not reveal a significantly greater increase than in the control group (Additional file 4). Publication bias was examined using the funnel plot, whose symmetrical shape suggested a lack of bias (Fig. 5).
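The I² values reported for the two instruments can be recovered from the corresponding χ² (Cochran's Q) statistics. The following is a minimal sketch, assuming the standard Higgins formula I² = max(0, (Q − df)/Q) × 100 with df equal to the number of datasets minus one. The function name `i_squared` is illustrative and is not taken from the original analysis, which may have relied on dedicated meta-analysis software.

```python
def i_squared(q: float, n_datasets: int) -> float:
    """Return Higgins' I-squared (in percent) from Cochran's Q.

    Assumption: degrees of freedom = number of datasets - 1.
    """
    if q <= 0:
        return 0.0
    df = n_datasets - 1
    # I-squared is truncated at zero when Q is smaller than its degrees of freedom.
    return max(0.0, (q - df) / q) * 100.0


# Values reported in the text: eight CCTDI datasets, six CCTST datasets.
cctdi_i2 = i_squared(19.08, 8)
cctst_i2 = i_squared(23.32, 6)
print(f"CCTDI I-squared: {cctdi_i2:.1f} %")
print(f"CCTST I-squared: {cctst_i2:.1f} %")
```

Under these assumptions, the sketch gives roughly 63 % for the CCTDI and 79 % for the CCTST after rounding, in line with the reported values.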
Analysis of the effects by teaching and learning method revealed that concept mapping (SMD: 0.68, 95 % CI: 0.26–1.11, p = 0.002, I² = 77 %) was effective in improving critical thinking (Fig. 6). However, the effect of PBL (SMD: 0.34, 95 % CI: −0.03–0.70, p = 0.07, I² = 62 %) on critical thinking was not statistically significant.
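The p-values for the two method-specific estimates above can be checked against their confidence intervals. The following is a minimal sketch, assuming a normal approximation in which the 95 % interval is SMD ± 1.96 × SE; the helper `p_from_ci` is a hypothetical name introduced only for illustration and does not reproduce the software used in the original analysis.

```python
import math

Z_95 = 1.96  # assumed normal critical value for a two-sided 95 % interval


def p_from_ci(smd: float, lower: float, upper: float):
    """Back-calculate the standard error, z statistic, and two-sided p-value
    from a point estimate and its 95 % confidence interval."""
    se = (upper - lower) / (2 * Z_95)
    z = smd / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Estimates reported in the text.
_, z_cm, p_cm = p_from_ci(0.68, 0.26, 1.11)     # concept mapping
_, z_pbl, p_pbl = p_from_ci(0.34, -0.03, 0.70)  # PBL
print(f"Concept mapping: z = {z_cm:.2f}, p = {p_cm:.3f}")
print(f"PBL:             z = {z_pbl:.2f}, p = {p_pbl:.3f}")
```

Under this approximation, the concept mapping interval excludes zero and yields a p-value below .05, while the PBL interval includes zero and yields a p-value above .05, consistent with the reported conclusions.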