Pilot study of the DART tool - an objective healthcare simulation debriefing assessment instrument



Various rating tools aim to assess simulation debriefing quality, but their use may be limited by complexity and subjectivity. The Debriefing Assessment in Real Time (DART) tool represents an alternative debriefing aid that uses quantitative measures to estimate quality and requires minimal training to use. The DART uses a cumulative tally of instructor questions (IQ), instructor statements (IS) and trainee responses (TR). Ratios for IQ:IS and TR:[IQ + IS] may estimate the level of debriefer inclusivity and participant engagement.


Experienced faculty from four geographically disparate university-affiliated simulation centers rated video-based debriefings and a transcript using the DART. The primary endpoint was an assessment of the estimated reliability of the tool. The small sample size confined analysis to descriptive statistics and coefficients of variation (CV%) as an estimate of reliability.


Ratings for Video A (n = 7), Video B (n = 6), and Transcript A (n = 6) demonstrated mean CV% for IQ (27.8%), IS (39.5%), TR (34.8%), IQ:IS (40.8%), and TR:[IQ + IS] (28.0%). Higher CV% observed in IS and TR may be attributable to rater characterizations of longer contributions as either lumped or split. Lower variances in IQ and TR:[IQ + IS] suggest overall consistency regardless of scores being lumped or split.


The DART tool appears to be reliable for the recording of data which may be useful for informing feedback to debriefers. Future studies should assess reliability in a wider pool of debriefings and examine potential uses in faculty development.


Simulation-based medical education (SBME) allows participants to safely apply skills in a team-based context with debriefing allowing for collective reflection and learning [1, 2]. Facilitation of debriefings is viewed as a difficult skill to master. Effective debriefers are often seen to encourage reflection, uncover performance gaps and promote a discussion of how to improve management of future scenarios [3, 4].

Debriefing is recognized as an essential component of SBME delivery [3]. Assessments of debriefing quality assist in improving the future performance of debriefers [5]. A number of recognized scoring aids are commonly used to assess debriefing quality including the Objective Structured Assessment of Debriefing (OSAD), and the Debriefing Assessment for Simulation in Healthcare (DASH) tools [2, 6]. These aids assess debriefers’ performance on a Likert-scale based on specific observable behaviors [1, 6]. For instance, in the DASH debriefers are assessed globally on their ability to provide an “engaging learning environment” and explore “performance gaps” [1]. These tools provide a useful framework and are demonstrative of ideal behaviors but are not without limitations. First, they are relatively time consuming and use subjective scales. For instance, what may be considered an engaging learning environment for one rater may be viewed as challenging, onerous, or problematic by other raters. Local culture is widely understood to influence engagement and expectations during debriefings and therefore may undermine the accuracy of the various tools [7, 8]. Furthermore, similar survey tools may lead to response biases in raters [9]. These biases could diminish the reliability of Likert-scale scoring of debriefing assessment tools. We have observed this as provision of socially desirable (higher) ratings in a peer context or extreme responding (e.g., blanket scoring of 7/7 in all domains) [10, 11]. To summarize, despite widespread use of SBME for healthcare professions learning, our current assessment tools for debriefer performance are qualitative, subjective, and focus only on ideal behaviors. Therefore, a gap exists for complementary ‘quantitative’ approaches to rating performance and providing feedback. To address this issue, we propose a new scoring system - ‘The Debriefing Assessment in Real Time (DART) tool’. 
The goal of this pilot study was to explore and investigate the reliability and potential utility of the DART tool as an alternative approach to assessment of debriefing quality.


Study setting

This international study was a collaboration between the Center for Advanced Pediatric and Perinatal Education (CAPE) at Stanford University (USA) and three Australian hospital-affiliated SBME centers in Western Sydney. A supervising author (LPH) has over 25 years of SBME experience and conceived the Debriefing Assessment in Real Time (DART) tool following observation of simulation and debriefing at the National Aeronautics and Space Administration (NASA) and extensive debriefing experience with CAPE faculty [12]. As stated above, the goals were to explore and investigate the reliability and potential utility of the DART tool as an alternative approach to assessment of debriefing quality.

DART tool

The DART (Fig. 1) was developed as a real-time objective measure of debriefing performance by faculty at the Center for Advanced Pediatric and Perinatal Education (CAPE) based on practices in simulation and debriefing in non-healthcare industries. This tool scores observable sequential debriefing contributions in a cumulative fashion, including Instructor Questions (IQ), Instructor Statements (IS) and Trainee Responses (TR). Furthermore, the tool provides information to SBME supervisors on key timings, and ratios of instructor questions to statements (IQ:IS) and of trainee to instructor verbalizations (TR:[IQ + IS]) can be calculated from the tallies.

Fig. 1 CAPE Debriefing Assessment in Real Time (DART) Tool
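The cumulative tally and the two ratios described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the class name `DartTally`, its method names, and the event sequence are hypothetical, and returning `None` when a denominator is zero is an assumption made here for convenience.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DartTally:
    """Hypothetical container for the three cumulative DART counts."""
    iq: int = 0   # instructor questions
    is_: int = 0  # instructor statements ("is" is a Python keyword)
    tr: int = 0   # trainee responses

    def record(self, kind: str) -> None:
        """Add one observed contribution to the running tally."""
        if kind == "IQ":
            self.iq += 1
        elif kind == "IS":
            self.is_ += 1
        elif kind == "TR":
            self.tr += 1
        else:
            raise ValueError(f"unknown contribution type: {kind!r}")

    def iq_to_is(self) -> Optional[float]:
        """IQ:IS ratio; None if no statements were recorded (assumption)."""
        return self.iq / self.is_ if self.is_ else None

    def tr_to_instructor(self) -> Optional[float]:
        """TR:[IQ + IS] ratio; None if the instructor said nothing (assumption)."""
        total = self.iq + self.is_
        return self.tr / total if total else None


# Made-up sequence of contributions, scored in the order they occur.
tally = DartTally()
for event in ["IQ", "TR", "IS", "IQ", "TR", "TR"]:
    tally.record(event)

print(tally, tally.iq_to_is(), tally.tr_to_instructor())
```

Keeping the raw counts alongside the ratios mirrors the paper form, where the ratios are derived after the tallies are complete.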

Subject selection

Eligible subjects were interdisciplinary adult simulation faculty with a formal simulation center or university affiliation. No specific exclusion criteria were determined prior to subject selection as this was an explorative project for the generalizability of the DART tool. Subjects were faculty who volunteered their time for the pilot study.

Study overview

Two pre-filmed video examples (Video A and Video B) of post-simulation debriefing were selected for the assessment of the DART. Using the DART, subjects (n = 8) individually rated the debriefings while watching the video. Printed paper copies of the DART tool (Fig. 1) were used to score Video A and Video B in real time (in a single take) as per instructions of the tool’s designer (LPH). Videos were viewed separately on desktop computers to ensure subjects were blinded to each other’s scores. Responses were collated and tabulated by a single investigator (KB).

Video transcript

Video A was selected for additional assessment. A university staff member with training in qualitative methods professionally transcribed the video (Fig. 2). The use of a transcript for rating was intended to provide an in-depth analysis identifying areas where subjects differed the most in their recorded observations. Subjects were instructed to highlight sentences that translated to their recorded observations while they rated the transcript using the DART. In order to ensure accuracy, subjects were not limited to only reading sentences once. Upon completion, a discussion took place among subjects regarding their reasoning behind their DART scores.

Fig. 2 Video A Transcript

Calibration video selection

The two short debriefing videos were selected from Free Open Access Medical Education sources. Sample brief debriefing videos (purporting to represent good performance) from various formal simulation organizations were reviewed and, after consultation among our collaborative research group, two contemporary videos were selected for this pilot study. Video A (California Simulation Alliance) exemplified a predominantly Advocacy-Inquiry approach to debriefing, whereas Video B (The Patient Safety Institute) exemplified the D.E.B.R.I.E.F. model of debriefing:

  • Debriefing Video A (December 2018) - California Simulation Alliance Health Impact (Origin – United States). Description - 'Filmed on location at Highland Hospital in Oakland, California' [13].

  • Debriefing Video B (March 2016) - The Patient Safety Institute (Origin – United States) - Description - 'This demonstrates what a good debrief looks like using the D.E.B.R.I.E.F. method' [14].


As per the DART, subjects recorded the number of instructor questions (IQ), instructor statements (IS), and trainee responses (TR). Two different ratios were calculated from the recorded values: a ratio of instructor questions to instructor statements (IQ:IS) and a ratio of trainee responses to instructor questions and statements (TR:[IQ + IS]). In this study, inter-rater reliability of the DART was estimated by a calculated Coefficient of Variation (CV%). The CV% describes the dispersion of data relative to its mean. CV% for each reported cumulative tally and ratio was calculated using descriptive statistics (SD ÷ mean, expressed as a percentage). CV% was selected for statistical analysis rather than Intraclass Correlation Coefficients (ICC) because of the limited sample size. It is recommended to include 30 or more samples involving at least 3 raters in order to interpret the ICC accurately [15].

Values of CV% were compared within each of the three data sets (Video A, Video B, and Transcript A) in order to estimate variability in ratings. We compared the mean CV% of the recorded observations to the CV% of each calculated ratio within each data set. Additionally, we compared the CV% between the ratings for videos versus those for the transcript. Finally, we compared the mean CV% of each individual recorded observation and each calculated ratio across the data sets.
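As a minimal sketch of the CV% calculation described above (SD ÷ mean, as a percentage), the following Python computes CV% for each tally and each ratio across raters. The rater tallies are made up purely for illustration, the helper name `cv_percent` is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which form of SD was used.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation as a percentage: SD / mean * 100.

    Uses the sample SD (n - 1); this is an assumption, not stated in the study.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("CV% is undefined when the mean is zero")
    return 100 * stdev(values) / m


# Hypothetical (IQ, IS, TR) tallies from seven raters scoring one debriefing.
ratings = [
    (10, 22, 18), (12, 30, 20), (8, 18, 14), (14, 35, 25),
    (11, 24, 17), (9, 20, 16), (13, 28, 22),
]

cv = {
    "IQ": cv_percent([iq for iq, _, _ in ratings]),
    "IS": cv_percent([is_ for _, is_, _ in ratings]),
    "TR": cv_percent([tr for _, _, tr in ratings]),
    # Ratios are computed per rater first, then their spread is summarized.
    "IQ:IS": cv_percent([iq / is_ for iq, is_, _ in ratings]),
    "TR:[IQ+IS]": cv_percent([tr / (iq + is_) for iq, is_, tr in ratings]),
}

for name, value in cv.items():
    print(f"{name}: {value:.1f}%")
```

Computing each ratio per rater before taking the CV% means that a rater who consistently scores high or low on all three tallies contributes little spread to the ratio.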


Tables 1, 2 and 3 show each subject’s (n = 8) demographic characteristics, self-reported DART scores and calculated ratios for Video A (n = 7), Video B (n = 6), and the transcript of Video A (n = 6). Due to limited availability, not all subjects were able to rate each video and transcript. Subjects were experienced in simulation and debriefing, with a median of 9.0 (IQR 7.5-12.5) years of experience. There were more subjects with a physician background (n = 5) than a nursing background (n = 3).

Table 1 Video A
Table 2 Video B
Table 3 Transcript for Video A

The mean of each individual variable across all three data sets (Video A, Video B, and the Transcript), and the mean of all reported observations (IQ, IS, and TR) within a data set were calculated for analysis. We found the mean CV% for the three reported observations in Video A, Video B, and the transcript was 33.3, 41.5, and 27.1%, respectively. When comparing these values with the CV% values of both ratios, we found the CV% lower for TR:[IQ + IS] but higher for IQ:IS in each data set. Further, we found the CV% for each variable in the transcript (IQ = 21.0%, IS = 29.1%, TR = 31.3%, IQ:IS = 32.2%, TR:[IQ + IS] = 21.1%) lower than the same CV% values for Video A (IQ = 28.3%, IS = 33.0%, TR = 38.8%, IQ:IS = 42.6%, TR:[IQ + IS] = 32.0%) and Video B (IQ = 34.1%, IS = 56.3%, TR = 34.2%, IQ:IS = 47.5%, TR:[IQ + IS] = 31.0%). When comparing individual scores across each data set, the mean CV% for IQ (27.8%) was lower than the mean CV% for IS (39.5%) and for TR (34.8%). Additionally, the mean CV% for the IQ:IS ratio (40.8%) was higher than the mean CV% for either of the individual scores used in the ratio (IQ = 27.8%, IS = 39.5%).
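The summary means quoted above can be re-derived from the per-data-set CV% values reported in the same paragraph. The short Python check below does that arithmetic; the dictionary layout and variable names are hypothetical, and because the inputs are already rounded to one decimal place, a difference in the last digit from the published means is possible.

```python
from statistics import mean

# CV% values as reported in the text for each data set.
reported_cv = {
    "Video A": {"IQ": 28.3, "IS": 33.0, "TR": 38.8, "IQ:IS": 42.6, "TR:[IQ+IS]": 32.0},
    "Video B": {"IQ": 34.1, "IS": 56.3, "TR": 34.2, "IQ:IS": 47.5, "TR:[IQ+IS]": 31.0},
    "Transcript A": {"IQ": 21.0, "IS": 29.1, "TR": 31.3, "IQ:IS": 32.2, "TR:[IQ+IS]": 21.1},
}
variables = ["IQ", "IS", "TR", "IQ:IS", "TR:[IQ+IS]"]

# Mean CV% of each variable across the three data sets.
mean_by_variable = {
    v: round(mean([row[v] for row in reported_cv.values()]), 1) for v in variables
}

# Mean CV% of the three recorded observations (IQ, IS, TR) within each data set.
mean_by_data_set = {
    name: round(mean([row[v] for v in ("IQ", "IS", "TR")]), 1)
    for name, row in reported_cv.items()
}

print(mean_by_variable)
print(mean_by_data_set)
```

With these rounded inputs the Video A mean comes out at 33.4 rather than the reported 33.3, which is presumably a rounding effect; the remaining means match the values given in the text.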


In this study we explored the use of DART as a simple and objective scoring system for recorded interdisciplinary healthcare simulation debriefings. We assessed heterogeneous sources (debriefings and transcripts) and enrolled eight interdisciplinary raters from four simulation centers to estimate variation in scoring. Observed variances in IS, TR and IQ:IS were higher than in the TR:[IQ + IS] ratio. The difference may be attributable to whether raters were “lumpers” or “splitters” in their characterization of long statements as single or multiple concepts. “Lumpers” are study subjects who tended to score long statements as a single concept, and “splitters” are subjects who tended to score the longer statements as multiple concepts. Regardless of whether subjects were considered “lumpers” or “splitters”, the lower variance in TR:[IQ + IS] suggests the DART is internally consistent. Furthermore, we observed a lower mean variance for IQ in comparison to IS or TR (Tables 1, 2 and 3). The lower variance in identification of questions (IQ) suggests that debriefing raters recognize questions more readily than statements. We note that with the commonly used advocacy-inquiry (AI) approach to debriefing, debriefers often mix statements and questions together. This in turn could reduce the reliability of DART scores as well as the inferences drawn about quality from the tool. For example, a debriefer using AI may make more statements and ask fewer questions and therefore, from their DART score, appear to be a less effective or less inclusive facilitator. Of course, the opposite may be true. While we recognize this as a limitation of the DART tool for measuring debriefing quality, the tool scores could still be used as a basis for giving peer feedback to debriefer colleagues. For instance, one might share with a colleague: “I noticed that you asked 3 questions and made 25 statements about respiratory failure. This count back of your questions and statements might suggest some room for improvement in our encouragement of reflection in this debriefing. What was your thought process at the time?” Furthermore, in reflecting on why alternative debriefing assessment tools like the DART are likely to be useful in many simulation settings, we ask the reader to consider whether they ever, or often, observe either a lack of questions or lecturing by debriefers [7]. While we do not by any means claim these behaviors are universal, a predominance of the debriefer talking has been observed across the simulation sites where this study was based [16].

When comparing the video scores with the transcript scores, we found lower variances for each reported observation and calculated ratio in the transcript scores. Subjects rating the transcript had no limitations on rereading sentences, while subjects rating Videos A and B were unable to rewind or rewatch the film and were limited to watching in real time. While accuracy increases with the ability to reread and reflect, these circumstances do not represent practical use of the DART. As a result, transcript scores may underestimate the true variation in scoring, and Videos A and B may better represent the real-world use of DART [17]. It may be easier to determine the breakdown of statements when scoring a written transcript, but the variation in scoring that comes with real-time use is unlikely to preclude the tool’s usefulness in faculty feedback. Moreover, it is recognized that observer error is a significant problem in high-stakes assessments [18, 19]. Similarly, debriefing scoring could be prone to rater error. However, given the intended use of the DART for new faculty feedback in debriefing, the thresholds of acceptable error may be wider than for high-stakes assessments. As a result, in our view the CV% values observed in this study are acceptable for further work that tests the validity of the tool for faculty development. One issue that has not yet been clarified is what the various DART scores and ratios represent in terms of true debriefing quality. Sandars’ prior work on promoting reflective practice may suggest that higher cumulative tallies of questions (IQ) and participant contributions (TR) are observed in debriefings where reflection and practice change are being promoted [20].

In terms of specific problems with the DART tool, we identified greater variation in IS scores. After discussion with each rater and review of our transcript, we believe this variation may be attributable to a “lumper/splitter” phenomenon. Variation in each rater’s assessment of what counts as a single “statement” or “single concept” appears to be problematic and may have led to the higher CV% observed for IS. As an example, we can consider this statement taken from the transcript: “So let’s spend the next five- or 10-minutes debriefing. So, a reminder, debriefing is a guided reflection, and our goal is to improve how we work and care for our patients. I want to restate our basic assumption that we’re all intelligent, motivated and want to do the right thing.” When asked about their scores, raters who were “lumpers” may have considered this a single statement, giving a score of one. However, “splitters” may have considered each sentence in the quote a separate statement, giving a score of three. Implementing a standardized training protocol and calibration exercises may reduce these differences, but it is not our intention to increase cognitive load or overcomplicate the use of a tool that was designed to be easy to use [21].
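The lumper/splitter difference on this transcript excerpt can be illustrated with a small Python sketch. The two scoring functions are hypothetical stand-ins for rater behavior, and treating every sentence as one “concept” is a simplifying assumption; real raters judge concepts, not punctuation.

```python
import re

# Excerpt quoted from the Video A transcript (a single instructor turn).
turn = (
    "So let’s spend the next five- or 10-minutes debriefing. "
    "So, a reminder, debriefing is a guided reflection, and our goal is to "
    "improve how we work and care for our patients. "
    "I want to restate our basic assumption that we’re all intelligent, "
    "motivated and want to do the right thing"
)


def lumper_score(text: str) -> int:
    """A 'lumper' counts the whole uninterrupted turn as one statement."""
    return 1 if text.strip() else 0


def splitter_score(text: str) -> int:
    """A 'splitter' counts each sentence separately (assumed proxy for concepts)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return len(sentences)


print("lumper IS tally:", lumper_score(turn))
print("splitter IS tally:", splitter_score(turn))
```

Applied across a whole debriefing, the same divergence would inflate a splitter’s IS tally relative to a lumper’s, which is consistent with the higher CV% for IS discussed above.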

Comparison with other rating tools

The SBME literature outlines a range of ideal behaviors exhibited by debriefers that can promote reflective practice and improve performance [3]. Existing models of providing feedback have a key role in identifying the factors listed but may fail to provide quantitative information to debriefers seeking to understand performance. Further, existing tools (i.e., DASH and OSAD) have notable limitations in their validation studies and may be subject to “response bias”, which is a problem with Likert scales [9].

The OSAD tool [6], which has recently been validated electronically and in languages other than English, provides useful feedback to debriefers, but also uses a relatively subjective 1-5 Likert rating scale [22, 23]. Further, while the OSAD tool has been studied for a wider range of settings than the DASH tool, including pediatric simulation, one of the major validation studies used just two raters to examine the tool [6, 23]. Of note, a recently described tool known as the Simulation in Healthcare retrOaction Rating Tool (SHORT) was proposed as an alternative approach for shorter debriefings [28]. The authors simultaneously derived and validated their tool, which appears to have excellent agreement and good inter-rater reliability when used for assessing SBME.

The widely used DASH tool was validated with the use of 3 debriefing example videos that were scored by more than 100 online raters [1]. However, the DASH has neither been externally validated nor translated into other languages or formats. In addition, from a user standpoint, it is challenging to use the DASH tool for feedback after the debriefing because the attributed scores do not provide specific goals to target in the next debriefing opportunity in terms of definitive actions or targets. We suggest that the DASH, SHORT or OSAD (which highlight many of the subjective qualities expected of facilitators) could be used in combination with the DART tool to enhance feedback for novice debriefers or for peer coaching of experienced debriefers [25].

Implications for faculty development

A recent study recognized that traditional methods of SBME faculty development lack a structured approach to achieve expertise and proposed the use of DebriefLive, a virtual teaching environment that allows faculty to review their debriefing performances by observing recorded videos and scoring themselves [26]. Direct observation of debriefers by experienced faculty, faculty mentoring to achieve debriefing expertise, and targeted coaching conversations using an agreed-upon approach may all have some role in assisting with the development of skill in debriefing [5, 25, 27]. Moreover, the use of quantitative scoring systems has the potential to provide conversational substrate for all of these approaches, and may help debriefers improve at all levels of experience.

In non-healthcare settings, it is generally established that those participating in debriefings engage with each other in problem solving and the debriefer is generally a “guide on the side” rather than “a sage on the stage” [3]. However, our personal observation in healthcare simulation practice is that debriefers are frequently in the latter category. In the second video example (Video B) assessed in this study, the video publishers described the video as a “good example”, but the debriefer(s) talked for > 80% of the debriefing [14]. We have observed that the length of time talking seems to be a germane factor when assessing the quality of facilitation [16]. Using the OSAD tool or the DASH score for this video debriefing, or a similar real-life equivalent, may not have resulted in a true understanding of the issues requiring improvement (i.e. the debriefer dominating the conversation). To summarize, using quantitative data may help amplify feedback to debriefer colleagues and this in turn may help behavior change. The DART tool provides point-of-care information to debriefers, and this can either supplement the use of the OSAD, DASH or SHORT tools, or be used as a standalone matrix for debriefer feedback. The DART addresses the limitations of qualitative measures by replacing subjective scales with a cumulative scoring method, avoiding response bias, and reducing complexity. This ease of use gives the DART the potential to track debriefer progression over time by continually comparing current scores to previous ones. Based on the results of this pilot study, we plan to further assess the reliability and validity of the DART tool by expanding the number of study sites, videos and raters in a future study.


We acknowledge the limitations of our study. Firstly, the limited number of debriefings assessed restricted the quantity of ratings available for analysis. Unlike CV%, which was used in this study and may be a sub-optimal analysis, the ICC would have provided a standardized stratification system for evaluating variability [15]. Secondly, we recognize the creators of this tool are listed as authors which could have led to unrecognized implicit bias in the study. Thirdly, as discussed above, there was no standardized tool orientation used in this study. This may have led to the higher variance in some reported observations. From the experiences of conducting the studies we have made a training video and calibration exercise hosted at which is free to use. Finally, in terms of real-world extrapolation, the DART is meant to be used to evaluate debriefers in real time. However, in this study we used video debriefings and written transcripts which may not represent real world use of the tool.


The DART tool has the potential to provide reliable data about healthcare simulation debriefing. As a real-time instrument, DART can be used either alone or in conjunction with qualitative tools such as DASH or SHORT for assessing the quality of debriefings [28]. Further evaluation using a spectrum of debriefings and users should now be conducted to determine the best future role of this tool.

Availability of data and materials

All data generated or analysed during this study are included in this published article. SiLECT centre data is available on request from





Abbreviations

Australian Institute of Medical Simulation and Innovation
Center for Advanced Pediatric and Perinatal Education (CAPE)
Coefficient of Variation (CV)
Debriefing Assessment in Real Time (DART)
Debriefing Assessment for Simulation in Healthcare (DASH)
Human research and ethics committee (HREC)
Intraclass Correlation Coefficient (ICC)
Instructor Questions (IQ)
Instructor Statements (IS)
National Aeronautics and Space Administration (NASA)
Objective Structured Assessment of Debriefing (OSAD)
Simulation-based medical education (SBME)
Simulation in Healthcare retrOaction Rating Tool (SHORT)
Simulated Learning Environment for Clinical Training (SiLECT)
Trainee Responses (TR)
Western Sydney Local Health District (WSLHD)


  1. Brett-Fleegler M, Rudolph J, Eppich W, Monuteaux M, Fleegler E, Cheng A, et al. Debriefing assessment for simulation in healthcare: development and psychometric properties. Simul Healthc. 2012;7(5):288–94.

  2. Tannenbaum SI, Cerasoli CP. Do team and individual debriefs enhance performance? A meta-analysis. Hum Fact. 2013;55(1):231–45.

  3. Eppich W, Cheng A. Promoting excellence and reflective learning in simulation (PEARLS): development and rationale for a blended approach to health care simulation debriefing. Simul Healthc. 2015;10(2):106–15.

  4. Husebø S, Dieckmann P, Rystedt H, Søreide E, Friberg F. The relationship between Facilitators' questions and the level of reflection in Postsimulation debriefing. Simul Healthc. 2013;8:135–42.

  5. Cheng A, Grant V, Huffman J, Burgess G, Szyld D, Robinson T, et al. Coaching the Debriefer: peer coaching to improve debriefing quality in simulation programs. Simul Healthc. 2017;12(5):319–25.

  6. Arora S, Ahmed M, Paige J, Nestel D, Runnacles J, Hull L, et al. Objective structured assessment of debriefing: bringing science to the art of debriefing in surgery. Ann Surg. 2012;256(6):982–8.

  7. Ulmer FF, Sharara-Chami R, Lakissian Z, Stocker M, Scott E, Dieckmann P. Cultural prototypes and differences in simulation debriefing. Simul Healthc. 2018;13(4):239–46.

  8. Chung HS, Dieckmann P, Issenberg SB. It is time to consider cultural differences in debriefing. Simul Healthc. 2013;8(3):166–70.

  9. Kreitchmann RS, Abad FJ, Ponsoda V, Nieto MD, Morillo D. Controlling for response biases in self-report scales: forced-choice vs Psychometric Modeling of Likert Items. Front Psychol. 2019;10:2309.

  10. Furnham A. Response bias, social desirability and dissimulation. Personal Individ Differ. 1986;7(3):385–400.

  11. Nederhof AJ. Methods of coping with social desirability bias: a review. Eur J Soc Psychol. 1985;15(3):263–80.

  12. Halamek L, Cheng A. Debrief2Learn [Internet]. 2017 [cited 06/04/21]. Podcast. Available from:

  13. Simulation Debrief [Internet]. CSA Health Impact. 2018 [cited 30/06/21]. Available from:

  14. Simulation Instructor Course - Good Debrief (Using D.E.B.R.I.E.F. Method) [Internet]. The Patient Safety Institute. 2016 [cited 30/06/21]. Available from:

  15. Koo TK, Li MY. A guideline of selecting and reporting Intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–63.

  16. Coggins A, Hong SS, Baliga K, Halamek LP. Immediate faculty feedback using debriefing timing data and conversational diagrams. Adv Simul (Lond). 2022;7(1):7.

  17. MacLean LM, Meyer M, Estable A. Improving accuracy of transcripts in qualitative research. Qual Health Res. 2004;14(1):113–23.

  18. Norcini JJ. The death of the long case? BMJ. 2002;324(7334):408–9.

  19. Schleicher I, Leitner K, Juenger J, Moeltner A, Ruesseler M, Bender B, et al. Examiner effect on the objective structured clinical exam - a study at five medical schools. BMC Med Educ. 2017;17(1):71.

  20. Sandars J. The use of reflection in medical education: AMEE guide no. 44. Med Teach. 2009;31(8):685–95.

  21. Nair BKR, Moonen-van Loon JM, Parvathy M, Jolly BC, van der Vleuten CP. Composite reliability of workplace-based assessment of international medical graduates. Med J Aust. 2017;207(10):453.

  22. Abegglen S, Krieg A, Eigenmann H, Greif R. Objective structured assessment of debriefing (OSAD) in simulation-based medical education: translation and validation of the German version. PLoS One. 2020;15(12):e0244816.

  23. Zamjahn JB, Baroni de Carvalho R, Bronson MH, Garbee DD, Paige JT. eAssessment: development of an electronic version of the objective structured assessment of debriefing tool to streamline evaluation of video recorded debriefings. J Am Med Inform Assoc. 2018;25(10):1284–91.

  24. Runnacles J, Thomas L, Korndorffer J, Arora S, Sevdalis N. Validation evidence of the paediatric objective structured assessment of debriefing (OSAD) tool. BMJ Simul Technol Enhanc Learn. 2016;2(3):61.

  25. Cheng A, Eppich W, Kolbe M, Meguerdichian M, Bajaj K, Grant V. A conceptual framework for the development of debriefing skills: a journey of discovery, growth, and maturity. Simul Healthc. 2020;15(1):55–60.

  26. Wong NL, Peng C, Park CW, Jt P, Vashi A, Robinson J, et al. DebriefLive: a pilot study of a virtual faculty development tool for debriefing. Simul Healthc. 2020;15(5):363–9.

  27. Cheng A, Grant V, Dieckmann P, Arora S, Robinson T, Eppich W. Faculty development for simulation programs: five issues for the future of debriefing training. Simul Healthc. 2015;10(4):217–22.

  28. Riviere E, Aubin E, Tremblay SL, Lortie G, Chiniara G. A new tool for assessing short debriefings after immersive simulation: validity of the SHORT scale. BMC Med Educ. 2019;19(1):82.


The authors would like to thank Nicole King and Nathan Moore for supporting the project.


The Health Education and Training Institute (HETI) provided limited funding for simulation equipment prior to the study. None of the authors have relevant commercial conflicts of interest to declare.

Author information



K.B., A.C. and L.H. conceived the study. K.B. and A.C. extracted data from the data collection sheets and collated the data. K.B. led the analysis of results. All authors contributed to and have approved the final manuscript.

Corresponding author

Correspondence to Andrew Coggins.

Ethics declarations

Ethics approval and consent to participate

The protocols for this study were prospectively examined and approved (Ref: 2020/ETH01903) by the Western Sydney Local Health District (WSLHD) human research and ethics committee (HREC). The study was carried out in accordance with relevant guidelines and regulations (NHMRC 2022). Informed consent was obtained from all subjects according to local guidelines.

Consent for publication

Participants consented using a standard HREC process.

Competing interests

None declared.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Baliga, K., Coggins, A., Warburton, S. et al. Pilot study of the DART tool - an objective healthcare simulation debriefing assessment instrument. BMC Med Educ 22, 636 (2022).
