Skip to main content

Training and evaluating simulation debriefers in low-resource settings: lessons learned from Bihar, India



To develop effective and sustainable simulation training programs in low-resource settings, it is critical that facilitators are thoroughly trained in debriefing, a critical component of simulation learning. However, large knowledge gaps exist regarding the best way to train and evaluate debrief facilitators in low-resource settings.


Using a mixed methods approach, this study explored the feasibility of evaluating the debriefing skills of nurse mentors in Bihar, India. Videos of obstetric and neonatal post-simulation debriefs were assessed using two known tools: the Center for Advanced Pediatric and Perinatal Education (CAPE) tool and Debriefing Assessment for Simulation in Healthcare (DASH). Video data was used to evaluate interrater reliability and changes in debriefing performance over time. Additionally, twenty semi-structured interviews with nurse mentors explored perceived barriers and enablers of debriefing in Bihar.


A total of 73 debriefing videos, averaging 18 min each, were analyzed by two raters. The CAPE tool demonstrated higher interrater reliability than the DASH; 13 of 16 CAPE indicators and two of six DASH indicators were judged reliable (ICC > 0.6 or kappa > 0.40). All indicators remained stable or improved over time. The number of ‘instructors questions,’ the amount of ‘trainee responses,’ and the ability to ‘organize the debrief’ improved significantly over time (p < 0.01, p < 0.01, p = 0.04). Barriers included fear of making mistakes, time constraints, and technical challenges. Enablers included creating a safe learning environment, using contextually appropriate debriefing strategies, and team building. Overall, nurse mentors believed that debriefing was a vital aspect of simulation-based training.


Simulation debriefing and evaluation was feasible among nurse mentors in Bihar. Results demonstrated that the CAPE demonstrated higher interrater reliability than the DASH and that nurse mentors were able to maintain or improve their debriefing skills overtime. Further, debriefing was considered to be critical to the success of the simulation training. However, fear of making mistakes and logistical challenges must be addressed to maximize learning. Teamwork, adaptability, and building a safe learning environment enhanced the quality enhanced the quality of simulation-based training, which could ultimately help to improve maternal and neonatal health outcomes in Bihar.

Peer Review reports


Simulation-based training for health providers is becoming widely recognized as a tool for improving facility-based care of mothers and neonates globally [1]. Post-simulation debriefs, where learners identify clinical weaknesses, discuss team functioning, expand their knowledge base, and subsequently apply lessons learned to real cases, is the cornerstone of the learning process [2]. The World Health Organization (WHO) recommends that simulations be added to quality improvement trainings to help address skill gaps [1]. Several programs, including PRONTO (Programa de Rescate Obstétrico y Neonatal: Tratamiento Óptimo y Oportuno) International [3], Jhepiego [4], and Helping Babies Breathe (HBB) [5] have implemented simulation-based maternal and neonatal training programs in low- and middle-income countries (LMIC), including Mexico [6], Guatemala [7], Tanzania [8], and India [9], These programs have demonstrated improvements in clinical skills and in 24-h neonatal survival [10]; however, several critical implementation questions remain. In low-resource settings, how do you support and sustain debriefing competency, the most challenging skill of simulation facilitation? How is this best done at scale?

Debrief facilitation of simulations is difficult to learn, and achieving fluency and expertise requires time and experience [2]. For simulation to have its optimal effects, an experienced facilitator guides reflective learning, creates a safe learning environment, and encourages self-reflection [11, 12]. Several debrief evaluation tools have been designed, validated, and implemented in high-resource settings for simulations [13,14,15,16]. These tools provide valuable feedback, which is critical for facilitator development and for enhancing of the learning experience of future simulation participants.

Despite the known importance of effective debriefing and the growing demand for simulation-based training globally, the optimal way to train and evaluate simulation facilitators in low-resource settings is unknown [17]. Recent research has highlighted the complex role that culture plays on debrief facilitation. In low-resource settings, two important challenges exist. First, facilitators generally have limited to no previous experience with simulation-based training and rely heavily on unilateral, didactic approaches. A multi-country study demonstrated that debrief facilitators from high-power difference (i.e., hierarchical) cultures were less likely to ask open-ended questions and more likely to talk rather than facilitate discussion [18]. Second, health facilities in many LMIC settings lack a culture of non-punitive feedback, a key component of successful debriefing. In the Rwanda Emergency Triage, Assessment and Treatment plus admission care (ETAT+) trial, authors reported that reviewing mortality data with trainees was difficult because this practice made trainees feel shameful [19]. A HBB training program in Guatemala found that debriefing was a new concept for participants and suggested increased training time focused on debriefing methods and feedback for future participants [20]. In Bihar, a predominately rural Indian state with very low socioeconomic status [21] and a largely didactic model of education [22], such challenges are likely more pervasive.

Given the rapid growth of simulation-based training in low-resource settings, it is critical to have tools to accurately evaluate the debriefing abilities of facilitators. This knowledge will allow simulation programs to provide feedback to help facilitators improve their skills and maximize trainee learning. The Debriefing Assessment for Simulation in Healthcare (DASH) tool, developed at the Harvard Simulation Center, is the most widely used debrief evaluation tool and has been extensively validated in high-resource settings [13]. The DASH tool evaluates instructors on key behaviors that facilitate learning and change using six behavioral components [23]. The Center for Advanced Pediatric and Perinatal Education (CAPE) Debriefing Evaluation Tool was developed at Stanford University [14]. Compared to the DASH tool, the CAPE tool uses more objective criteria, which we hypothesize may be more accessible to less experienced debrief facilitators in LMICs. The aim of this study was to explore debrief training and evaluation in Bihar by i) evaluating the interrater reliability of the CAPE and DASH tools using video-recorded debriefing sessions conducted during simulation trainings, ii) assessing changes in nurse mentors’ debriefing skills over time, and iii) exploring barriers and enablers of simulation debriefing among nurse mentors.


Study setting

Bihar has a population of over 100 million, with 89% living in rural areas [24]. In 2012, the maternal mortality rate (MMR) was 208 per 100,000 live births in Bihar and the neonatal mortality rate (death within the first days) was 34 per 1000 live births [25]. In Bihar, each block primary health center (PHC) serves an average population of ~ 190,000. One nurse is frequently responsible for all obstetric and neonatal delivery care at a given PHC [26].

Study design

This was a mixed methods study, including quantitative and qualitative data.

Program overview

The Mobile Nurse Mentoring Program, AMANAT (meaning “something given in trust”), was a large-scale obstetric and neonatal nurse mentoring program led by CARE India in collaboration with the Government of Bihar. The AMANAT program was implemented at 320 PHCs across Bihar between 2015 and 2017 over four 8-month rounds. Rounds 1–3 were included in this analysis as Round 4 was ongoing. During each round, nurse mentors rotated in pairs between four sites, spending one week per month at each site for a total of six to eight weeks at each PHC. Each PHC had six to eight nurse mentees. Starting in week 3 on their third visit, nurse mentors facilitated a minimum of three simulations per week focused on key maternal and neonatal scenarios. Each simulation was followed by a debrief, which was recorded using a handheld camera.

Study population

A total of 120 nurse mentors participated in AMANAT rounds 1 through 3, which were conducted from March 2015 to June 2016. Nurses were selected by CARE India and the Government of Bihar to work as on-site mentors and simulation facilitators. The details of this program are described elsewhere [27]. Nurse mentees were nurses working in PHCs in Bihar, who were required to have an Auxiliary Nurse Midwife (ANM) or General Nursing and Midwifery (GNM) qualification. ANM and GNM qualifications require a secondary education with an additional two or three and a half years of nursing training, respectively.

Simulation facilitation training

Nurse mentor training was implemented using the train-the-trainer approach. Nurse mentors underwent four weeks of in-depth training with CARE India. One week was entirely devoted to Basic Emergency Obstetric and Neonatal Care (BEmONC) simulation training with PRONTO International. This training included simulation facilitation, teamwork, communication, and debriefing skills. One half-day focused exclusively on the theory of debriefing and 1.5 days allowed for the practice of debriefing skills. Nurse mentors were taught to facilitate debriefs using the diamond debriefing method, a structure that includes three phases: description, analysis, and application [28]. This approach encourages participants to reflect on their behavior, review practice guidelines, focus on teamwork and communication (based on TeamSTEPPS™) [11], and consider how to apply knowledge and skills to real-life clinical practice. Throughout the training, the key concepts from the CAPE and DASH, particularly the importance of facilitating discussion rather than lecturing, was emphasized. Nurse mentors were provided a menu of 31 SimPacks™ (simulation scenario and debriefing guide) from which they could choose. Due to time constraints, mentors did not receive individualized feedback from videos until after the round had been completed; however, four months following the initial training, nurse mentors completed an additional four-day Advanced Simulation Facilitator training with PRONTO, which focused on simulation facilitation and debriefing skills [29].

Part 1. Evaluating inter-rater reliability of the DASH and CAPE tools

Debrief monitoring

To evaluate debriefing quality, the research team randomly selected one debriefing video per mentor pair during three time points: early (months 3–4), mid (month 5), and late (months 6–7). The target sample size was 85, based on the suggestion of Bujang and Baharum that 85 items are required when the null hypothesis can be assumed to not equal zero and there are two observations per subject [30]. This sample size provides adequate power for estimating Cohen’s Kappa with 2 raters per item [31]. This study included debriefs of normal spontaneous vaginal delivery (NSVD), postpartum hemorrhage (PPH) and neonatal resuscitation (NR) simulation scenarios. Debrief videos were analyzed using the DASH and CAPE tools. These two tools were modified by a group of low-resource simulation experts at University of California San Francisco (UCSF), PRONTO, and University of Utah, with the input of clinical providers in Bihar. We made two modifications to the DASH tool. First, the 1–7 Likert scale was reduced to a 1–5 scale because evidence from the literature suggests higher data validity with 1–5 scales if respondents have variable levels of education [32]. Second, the first element of the DASH tool, ‘establishes and engaging learning environment,’ was skipped because the pre-debrief was not filmed and therefore could not be evaluated [23]. The DASH forms were then inputted into Qualtrics™ surveys for electronic data collection. Several modifications to the CAPE tool were also made. Due to logistical challenges, the following indicators were removed: ‘time between end of scenario and start of debriefing;’ ‘time when audio first rolls during debriefing;’ ‘length of debriefing to length of scenario ratio;’ and ‘percentage of scenario covered during debrief.’ The indicator, ‘percent of learning objectives covered during debriefing,’ was adjusted to reflect the ‘total number of cognitive, technical, and behavioral objectives covered’ in order to simplify coding. 
Finally, a code window of the modified CAPE tool (Appendix 1) was developed using Studiocode™. video coding software.

Two nurses (henceforth called video analysts), both based in Bihar, not involved in program implementation, and fluent in Hindi (the local language), were trained in debrief video analysis. This training consisted of a two-hour lesson on debrief theory, a detailed review of the modified DASH and CAPE tools, and coding in Studiocode™.. During rater training, the video analysts and one Hindi-speaking simulation expert triple-coded 10 debrief videos. Each watched the videos twice, first completing the modified DASH form and then the CAPE code window. With the guidance of a PRONTO expert trained in DASH evaluation, any resulting discrepancies were discussed and resolved. This process was repeated until the PRONTO expert trained in DASH-evaluation determined that both raters demonstrated proficiency with the DASH and CAPE constructs. Additionally, the video analysts participated in biweekly calls throughout the course of the project to review progress and discuss coding-related questions.

Statistical analysis

All videos selected were of sufficient quality to analyze. Any missing individual responses were excluded from analysis, with the exception of certain CAPE variables that asked about the presence of a certain component (i.e., ‘Is the analysis phase present?,’ ‘Number of times the video was paused?’). In these cases, a missing response was replaced with a zero. All videos were double-coded by the two video analysts. To mitigate rater bias, the video files presented to the video analysts were given in batches. Intra-class correlation coefficients (ICC)Footnote 1 with 95% confidence intervals (CI) were calculated for continuous CAPE variables and DASH elements. Variables lacking normal distribution were log-transformed prior to ICC calculation. ICCs < 0.40, 0.40–0.59, 0.60–0.74, and ≥ 0.75 were considered poor, fair, good, and excellent, respectively [33]. Reliability for binary variables was assessed using Cohen’s kappa with 95% CIs, with levels of agreement < 0.40, 0.40–0.70, and > 0.75 considered low, fair to good, and excellent, respectively [34]. To assess the internal consistency of the elements of the DASH scale, Cronbach’s α was calculated for both raters using all double-coded videos. Cronbach’s α was not calculated for the CAPE tool, as it contains continuous variables [35].

Part 2. Assessing changes in nurse mentors’ debriefing skills over time

Changes in nurse mentors’ debriefing skills were evaluated over each 8-month round, using unpaired debriefing videos. We hypothesized that mentors’ skills would improve over time secondary to increased practice, strengthened relationships with their learners, and the simulation refresher training conducted after month four. Only indicators that were found to have fair to excellent interrater reliability were included in the analysis to assess change over time, a decision that was made a priori to maximize accuracy. Depending on the timing of the debriefs, videos were categorized into three time-points: early (months 3–4), mid (month 5), and late (months 6–7). Trends over time for all continuous and categorical variables were assessed using linear and logistic regression, respectively, adjusted for rater. Because there was significant variation between which of the two mentor pairs led the debrief at each timepoint, a paired analysis was not possible. In more conservative models, we used generalized estimating equations (GEE) to account for correlations between double coding by raters per video. The GEE and non-GEE models yielded similar results. For simplicity of interpretation, only linear and logistic models are reported, except when differences were observed. Regression assumptions, including normality, homoscedasticity, outlier and influential analysis, were examined to detect any potential violations. All analyses were conducted in R Core Team version 0.99.903 (R Foundation for Statistical Computing, Vienna, Austria) [36].

Part 3. Exploring barriers and enablers of debriefing among nurse mentors

We explored barriers and enablers of simulation debriefing through semi-structured interviews with current AMANAT nurse mentors. Interviews took place between June and August 2016. The interview guide was designed in English, translated into Hindi, and then translated back to English to ensure accuracy. Two pilot interviews were completed to refine the interview guide. The pilot interviews were excluded from the final analysis. Interviews were conducted by two female interviewers, who had received training on study objectives and qualitative methodology. One interviewer was fluent in Hindi. Interviews were conducted in the language preferred by participants. Interviews were held in private rooms at PHCs. Interview duration ranged from 40 to 60 min.

Thematic analysis

Interviews were transcribed and, where necessary, translated to English by a bilingual Indian simulation specialist. To ensure transcription and translation accuracy, two independent staff double-checked all transcriptions and translations. Interview data were analyzed using the thematic content approach, which included four steps: data familiarization; identifying codes and themes; developing a coding scheme and applying it to the data; and organizing codes and themes [37, 38]. Two interviews were double-coded by the second author and another co-author. Any discrepancies were discussed and resolved to develop the final coding framework. The second author coded all remaining transcripts.


Written informed consent was obtained from all participants. The study was approved by the UCSF Committee on Human Research (Approval# 14–15,446) and the Indian Institute of Health Management Research Institutional Review Board.


Part 1. Evaluating the interrater reliability of the DASH and CAPE tools

Across three mentoring phases from March 2015 to June 2016, 4066 simulation debrief videos were collected. A total of 73 debrief videos were included in the analysis (Table 1).

Table 1 Characteristics of debrief videos included in the analysis

Overall, the CAPE tool had high interrater reliability than the DASH tool. Eight CAPE indicators had excellent interrater reliability (50%), while only 3 of 16 (19%) demonstrated poor reliability (Table 2). In comparison, 3 of 5 DASH elements had poor reliability and none had excellent reliability.

Table 2 Interrater reliability of CAPE and DASH variables in Bihar, India, 2015–2017 (N = 73 simulation debrief videos)

One of the most important CAPE indicators, ‘Instructor questions to instructor statements ratio,’ was not reliably coded. However, a composite indicator, ‘instructor questions and statements,’ demonstrated high reliability (data not shown).

Only two DASH indicators, ‘organize the debrief’ and ‘facilitate the debrief,’ demonstrated fair and good reliability, respectively; four of six (66%) demonstrated poor reliability (Table 2). Cronbach’s α of the DASH tool was 0.96 and 0.95 for raters 1 and 2, respectively.

Part 2. Assessing changes in nurse mentors’ debriefing skills over time

Following training, nurse mentors’ performance increased for several reliable CAPE and DASH indicators that are key to the essence of the debrief quality (Table 3). The average number of ‘instructor questions’ increased from 34 to 49 per debrief (p < 0.01). The number of ‘trainee responses’ increased from 50 to 64 per debrief (p < 0.01). The DASH indicator, ‘organize the debrief,’ increased from 3.3 to 3.5 (p = 0.04) on a 5-point Likert scale.

Table 3 Changes in nurse mentors’ debriefing skills over time in Bihar, India, 2015–2017 (N = 73 simulation debrief videos)

The majority of indicators did not change over time. For example, ‘trainee response to instructor questions and statements ratio’ changed from 0.75 to 0.78. The ‘number of times the videotape was paused’ during debriefing, ‘debrief length’, and ‘number of behavioral and technical objectives’ mentioned all remained constant (p > 0.05).

No indicators decreased significantly over time, though some trended downwards. The ‘length of tape segment played’ decreased from 4.5 to 3.3 min, and ‘use of video playback’ decreased from 85 to 73%. Additionally, 71 and 61% of debriefs had ‘all three phases present’ during month 1 and months 6–7, respectively. The most commonly omitted phase of the debrief was ‘application’ (data not shown).

Part 3. Exploring barriers and enablers of debriefing among nurse mentors

A total of 20 nurse mentors, with a median age of 24 years, were interviewed. On average, they had 14 months of mentoring experience. Only three had previous teaching experience and none had prior simulation debriefing experience. Participants were from states across India, including Delhi [7], West Bengal [4], Kerala [3], Bihar [2], Maharashtra [2], Uttar Pradesh, and Orissa [1].


Uncomfortable discussing mistakes

Many participants described that mentees disliked having their mistakes identified, especially when these mistakes were captured on video. Mentees, particularly older nurses, worried that such videos would be used to publicly display mistakes to peers. While mentors acknowledged that the videos sometimes made mentees nervous, they found them helpful in providing feedback.

"It should be good... continuing with the video, because the person... if they are doing mistake, they can observe, 'Oh yeah, they are doing.' According to me, the video should be there." (Mentor, age 22)

To mitigate anxiety, mentors tried to reassure mentees that the videos were only learning tools.

Time management

Participants commonly struggled with time management during debriefing. Several mentioned that it was difficult to keep all of the mentees engaged when debriefs were longer than 20 to 30 min. Challenges included mentees exhibiting disinterest, talking simultaneously, and arguing about clinical management. Further, because mentees were frequently scheduled to work on training days, they sometimes had to leave debrief sessions to care for patients.

Technical challenges

Several mentors described technical barriers related to video-recording. When the video, camera or laptop was not working, mentors often used mobile phones to record videos.


Create a positive learning environment

Numerous mentors highlighted the importance of creating a safe learning environment for mentees. To do this, mentors would begin debriefs by discussing what went well. Mentors framed mistakes in a constructive way and encouraged mentees to self-identify how they could improve in the future. Additionally, mentors emphasized the importance of using supportive language.

"First of all, we used to take the positive points what all they have done. Actually, I didn't used to take the negative points... I used to ask them what [they] could have done at this place." (Mentor, age 24)

“We say... 'Sister, you tell us what got missed and what should you have added,' so then she herself will tell her mistakes. Through this, what happens is that we are on the safe side. In the beginning, they used to feel very guilty... 'Sister, I made a mistake... the mistake has been recorded in the video.' Then we say, 'Sister, let mistakes happen, only then we can learn from them, but don’t repeat them.'" (Mentor, age 24)

“Don't call out mistakes. [Instead] say 'missed out.'” (Mentor, age 24)

Contextually appropriate debriefing

Several participants discussed the importance of being efficient and organized during debriefs. Mentors tried to keep debriefs as short as possible, while still covering the key points. At times, this required group management skills; for example, “If you sit, it will be done” (Mentor, age 26). When PHCs were really busy, mentors utilized flash debriefs. These pre-written debrief scripts consisted of 3 questions (what when well, what could have gone better, what will you do next time you encounter a similar clinical scenario) and rapidly covered the most important messages for a given simulation scenario.

Team building

Mentors also discussed strategies to increase the participation of mentees and other PHC providers in debriefing sessions. The majority of mentors recommended including doctors in simulations and debriefs. Additionally, several suggested beginning the debriefs with mentees summarizing the preceding simulation scenario.

Overall perception

Nearly all mentors had a positive perception of debriefs, describing them as a critical element of simulation training.

"If we do not debrief, there is no point of simulation." (Mentor, age 22)

A majority of mentors believed that debriefs helped clarify clinical weaknesses, so that mistakes that occurred during simulations would not happen while taking care of patients.

"I think debriefing is like the backbone of simulation... because with debriefing, they used to understand everything they did not understand well with the simulation… if they used to think that. ‘I have done this well,’ then in debriefing they used to realize that, ‘No, I could not do it.’ If they feel that I have done any mistake then... it used to get clear in the debrief." (Mentor, age 22)

Mentors also felt that debriefing was valuable for improving provider communication, discussing doctor-nurse and nurse-nurse hierarchy, and identifying other health system-related challenges such as human resource shortages and long distances between the delivery room and the pharmacy where necessary medications are kept.


To develop effective and sustainable simulation training programs in low-resource settings, it is critical that facilitators are thoroughly trained in debriefing. Through this unique approach using video analysis, we were able to remotely monitor and evaluate simulation debriefing in Bihar. Results suggest that the CAPE tool more reliably assessed debriefs, compared to the DASH tool. Thirteen of the 16 CAPE indicators had fair to excellent reliability (81%). This may partially be related to the fact that the DASH evaluates skills at the composite level, whereas the CAPE, which is scored at the individual item level, does not. Notably, this finding suggests that the CAPE tool’s objectivity may be especially helpful in settings where evaluators have less experience evaluating debriefers. One key indicator, ‘ratio of instructor questions to statements,’ had low reliability. However, a composite indicator of the sum of ‘instructor questions and statements’ was highly reliable, suggesting that the two video analysts were systematically categorizing questions and statements differently. For example, one video analyst was coding rhetorical questions as a statement, while the other was not. The DASH tool demonstrated high internal consistency with a Cronbach α of > 0.95, which is higher than the original high-resource validation study that found a Cronbach α of 0.89 [13]. This could suggest that the analysts scored each DASH question similarly and did not understand the different elements of the tool [39]. All indicators with high interrater reliability increased or were maintained over the 8-month mentoring period. This suggests that, as mentors improved their facilitation skills, mentees were empowered to develop the confidence required to discuss performance in simulations with peers. However, a significant improvement in mentor debriefing skills over time was not identified. 
This highlights the need for more timely and frequent debriefing feedback as well as revision of the debrief evaluation tools to better reflect the context in which mentors are working. Nevertheless, in a culture that largely utilizes a traditional didactic model of teaching [22], these findings represent meaningful progress.

Mentors identified fear of making mistakes, time-constraints, and technical challenges as key challenges to successful debriefing. Previous studies have similarly identified lack of protected time for professional development [40] and lack of feedback culture [17, 19] as significant barriers to provider training in LMIC settings. A Rwandan study found that providers who attended training outside of their usual workplace, where they were guaranteed to be free of clinical duties, had two-fold increased odds of passing practical skills assessments compared to providers who completed training in their workplace [41]. A multi-country study found that Asian simulation participants were often uncomfortable correcting other participants, especially those in authority positions, for fear of causing shame or appearing oppositional [42].

Interviews revealed several approaches to address identified barriers to enable success in this resource-constrained context. While mentees initially felt shameful about mistakes, mentors increased participation by constructively framing mistakes as learning points. This thoughtful attention to language allowed mentees to feel comfortable discussing mistakes, while still maintaining a respectful learning environment.

Findings suggest that contextually appropriate flash debriefs, which may be easily adapted to reflect trainee needs, could help overcome the important barrier of time-constraints, though future studies are required to explore whether these are equally effective from a learning perspective. This is consistent with previous studies that have recommended adaptation of debriefs to fit the environment and skill level of trainees [12]. Acceptability of this flexible approach to debriefing is critical, as government-run PHCs in India often face severe human resource shortages and, as a result, clinical duties are routinely prioritized over training [24, 43]. Additional recommendations related to increasing group participation and including doctors in both simulations and debriefs. A previous study in Bihar similarly suggested that inclusion of doctors in simulation training leads to improved communication and reduced hierarchy in PHCs [43]. This is consistent with previous studies in Rwanda and Kenya that highlighted the importance of teamwork [44] and leadership buy-in, respectively [45].

This study has several limitations. First, the video analysts did not participate in the official DASH training due to time and financial constraints. This may have been an important contributor to the low interrater reliability reported in this study. The number of debrief videos analyzed from round 2 was relatively small as a result of missing data from a third video analyst, who left after a brief period of employment; this may have resulted in an underestimation of interrater reliability or failure to detect changes in debriefing performance. Video coding was both time- and resource-intensive. Finally, interviews were conducted by members of the study team, which may have introduced social desirability bias. All mentors were informed in advance that data resulting from interviews was confidential in nature and would not be used for purposes other than research and programmatic improvement.


This study has demonstrated the feasibility of evaluating simulation debriefing in Bihar, India. Multiple CAPE indicators reliably assessed debriefing performance, showing that nurse mentors maintained or improved their facilitation skills over time. Barriers included fear of mistakes and time constraints. Enablers included having a safe learning environment, a flexible approach to debriefing, and leadership buy-in. An in-depth understanding of the barriers and enablers of debriefing is essential to improve the quality of simulation training programs in LMICs. Establishing the feasibility of debriefing and debrief evaluation is a meaningful step toward the development of successful simulation training programs and ultimately improving BEmONC skills among providers in Bihar and related low-resource settings.

Availability of data and materials

Analyses are ongoing, so the data are not yet publicly available.


  1. Model 1 with k raters


  1. World Health Organization. The world health report 2006: working together for health. Geneva: World Health Organization; 2006 [cited 2018 June 5]. Available from:

  2. Fanning RM, Gaba DM. The role of debriefing in simulation-based learning. Simul Healthc. 2007;2(2):115–25.

  3. PRONTO International. 2018 [cited 2018 June 5]. Available from:

  4. Jhpiego. 2018 [cited 2018 June 5]. Available from:

  5. Helping Babies Breathe. 2018 [cited 2018 June 5]. Available from:

  6. Fritz J, Walker DM, Cohen S, Angeles G, Lamadrid-Figueroa H. Can a simulation-based training program impact the use of evidence based routine practices at birth? Results of a hospital-based cluster randomized trial in Mexico. PLoS One. 2017;12(3):e0172623.

  7. Walton A, Kestler E, Dettinger JC, Zelek S, Holme F, Walker D. Impact of a low-technology simulation-based obstetric and newborn care training scheme on non-emergency delivery practices in Guatemala. Int J Gynaecol Obstet. 2016;132(3):359–64.

  8. Nelissen E, Ersdal H, Ostergaard D, Mduma E, Broerse J, Evjen-Olsen B, et al. Helping mothers survive bleeding after birth: an evaluation of simulation-based training in a low-resource setting. Acta Obstet Gynecol Scand. 2014;93(3):287–95.

  9. Das A, Nawal D, Singh MK, Karthick M, Pahwa P, Shah MB, et al. Impact of a nursing skill-improvement intervention on newborn-specific delivery practices: an experience from Bihar, India. Birth. 2016;43(4):328–35.

  10. Mduma E, Ersdal H, Svensen E, Kidanto H, Auestad B, Perlman J. Frequent brief on-site simulation training and reduction in 24-h neonatal mortality--an educational intervention study. Resuscitation. 2015;93:1–7.

  11. Hunter LA. Debriefing and feedback in the current healthcare environment. J Perinat Neonatal Nurs. 2016;30(3):174–8.

  12. Sawyer T, Eppich W, Brett-Fleegler M, Grant V, Cheng A. More than one way to debrief: a critical review of healthcare simulation debriefing methods. Simul Healthc. 2016;11(3):209–17.

  13. Brett-Fleegler M, Rudolph J, Eppich W, Monuteaux M, Fleegler E, Cheng A, et al. Debriefing assessment for simulation in healthcare: development and psychometric properties. Simul Healthc. 2012;7(5):288–94.

  14. Center for Advanced Pediatric & Perinatal Education (CAPE). 2018 [cited 2018 June 5]. Available from:

  15. Kolbe M, Weiss M, Grote G, Knauth A, Dambach M, Spahn DR, et al. TeamGAINS: a tool for structured debriefings for simulation-based team trainings. BMJ Qual Saf. 2013;22(7):541–53.

  16. Saylor JL, Wainwright SF, Herge EA, Pohlig RT. Peer-assessment debriefing instrument (PADI): assessing faculty effectiveness in simulation education. J Allied Health. 2016;45(3):e27–30.

  17. Rule ARL, Tabangin M, Cheruiyot D, Mueri P, Kamath-Rayne BD. The call and the challenge of pediatric resuscitation and simulation research in low-resource settings. Simul Healthc. 2017;12(6):402–6.

  18. Ulmer FF, Sharara-Chami R, Lakissian Z, Stocker M, Scott E, Dieckmann P. Cultural prototypes and differences in simulation debriefing. Simul Healthc. 2018;13(4):239–46.

  19. Hategeka C, Mwai L, Tuyisenge L. Implementing the emergency triage, assessment and treatment plus admission care (ETAT+) clinical practice guidelines to improve quality of hospital care in Rwandan district hospitals: healthcare workers' perspectives on relevance and challenges. BMC Health Serv Res. 2017;17(1):256.

  20. Perry MF, Seto TL, Vasquez JC, Josyula S, Rule ARL, Rule DW, et al. The influence of culture on teamwork and communication in a simulation-based resuscitation training at a community hospital in Honduras. Simul Healthc. 2018;13(5):363–70.

  21. Oxford Poverty and Human Development Initiative. Multidimensional Poverty Index 2016 highlights: South Asia. 2016 [cited 2018 June 5]. Available from:

  22. Evans C, Razia R, Cook E. Building nurse education capacity in India: insights from a faculty development programme in Andhra Pradesh. BMC Nurs. 2013;12:8.

  23. Simon R, Raemer DB, Rudolph JW. Debriefing Assessment for Simulation in Healthcare (DASH)© – Rater Version, Short Form. Boston, MA: Center for Medical Simulation; 2011 [cited 2018 June 5]. Available from:

  24. Sharma B. Rural health statistics. Government of India, Ministry of Health and Family Welfare, Statistics Division; 2015 [cited 2018 June 5]. Available from:

  25. Office of the Registrar General & Census Commissioner. Census of India 2011: provisional population totals. 2011 [cited 2018 June 5]. Available from:

  26. CARE M&E data.

  27. Vail B, Spindler H, Morgan MC, Cohen SR, Christmas A, Sah P, et al. Care of the mother-infant dyad: a novel approach to conducting and evaluating neonatal resuscitation simulation training in Bihar, India. BMC Pregnancy Childbirth. 2017;17(1):252.

  28. Jaye P, Thomas L, Reedy G. 'The Diamond': a structure for simulation debrief. Clin Teach. 2015;12(3):171–5.

  29. Dyer J, Spindler H, Christmas A, Shah MB, Morgan M, Cohen SR, et al. Video monitoring a simulation-based quality improvement program in Bihar, India. Clin Simul Nurs. 2018;17:19–27.

  30. Grandemange M, Costet N, Doyen M, Monfort C, Michineau L, Saade MB, et al. Blood pressure, heart rate variability, and adiposity in Caribbean pre-pubertal children. Front Pediatr. 2019;7:269.

  31. Haddad R, Concha-Benavente F, Blumenschein G Jr, Fayette J, Guigay J, Colevas AD, et al. Nivolumab treatment beyond RECIST-defined progression in recurrent or metastatic squamous cell carcinoma of the head and neck in CheckMate 141: a subgroup analysis of a randomized phase 3 clinical trial. Cancer. 2019.

  32. Weijters B, Cabooter E, Schillewaert N. The effect of rating scale format on response styles: the number of response categories and response category labels. Int J Res Mark. 2010;27(3):236–47.

  33. Kottner J, Audige L, Brorson S, Donner A, Gajewski BJ, Hrobjartsson A, et al. Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. Int J Nurs Stud. 2011;48(6):661–71.

  34. A G. Kappa statistics for multiple raters using categorical classifications. In: Proceedings of the Twenty-Second Annual Conference of the SAS Users Group; San Diego, CA, USA; 1997.

  35. Bland JM, Altman DG. Cronbach's alpha. BMJ. 1997;314(7080):572.

  36. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016.

  37. Pope C, Mays N. Qualitative research in health care. 2nd ed. London: BMJ Books; 2000.

  38. Green J, Thorogood N. Qualitative methods for health research. 3rd ed. London: Sage; 2014.

  39. Tavakol M, Dennick R. Making sense of Cronbach's alpha. Int J Med Educ. 2011;2:53–5.

  40. Hoban R, Bucher S, Neuman I, Chen M, Tesfaye N, Spector JM. 'Helping babies breathe' training in sub-Saharan Africa: educational impact and learner impressions. J Trop Pediatr. 2013;59(3):180–6.

  41. Hategekimana C, Shoveller J, Tuyisenge L, Kenyon C, Cechetto DF, Lynd LD. Correlates of performance of healthcare workers in emergency, triage, assessment and treatment plus admission care (ETAT+) course in Rwanda: context matters. PLoS One. 2016;11(3):e0152882.

  42. Shilkofski N, Hunt EA. Identification of barriers to pediatric care in limited-resource settings: a simulation study. Pediatrics. 2015;136(6):e1569–75.

  43. Morgan MC, Dyer J, Abril A, Christmas A, Mahapatra T, Das A, et al. Barriers and facilitators to the provision of optimal obstetric and neonatal emergency care and to the implementation of simulation-enhanced mentorship in primary care facilities in Bihar, India: a qualitative study. BMC Pregnancy Childbirth. 2018;18(1):420.

  44. Munabi-Babigumira S, Glenton C, Lewin S, Fretheim A, Nabudere H. Factors that influence the provision of intrapartum and postnatal care by skilled birth attendants in low- and middle-income countries: a qualitative evidence synthesis. Cochrane Database Syst Rev. 2017;11:CD011558.

  45. Mbindyo P, Gilson L, Blaauw D, English M. Contextual influences on health worker motivation in district hospitals in Kenya. Implement Sci. 2009;4:43.

Download references


The authors would like to thank the video analysis team, Renu Sharma and Manju Siju, as well as Praicey Thomas and Rohit Srivastava, for their tireless efforts in video data management and coding. Additionally, we would like to thank all of the nurse mentors and mentees for their tremendous work in promoting obstetric and neonatal care throughout Bihar. We also thank Dr. Hemant Shah and the CARE India management team for their leadership and engagement in the nurse mentoring project. Finally, we would like to thank PRONTO International Master Trainers Claudia Gerard, Jen Taylor, and Patty Spencer, as well as PRONTO International staff members Jessica Dyer and Kimberly Calkins.


This study was funded by the Bill and Melinda Gates Foundation [grant number OPP1112431, 2015]. The funding body had no role in study design, data collection, analysis, interpretation, manuscript writing, or the decision to submit the manuscript for publication.

Author information

Authors and Affiliations



JR was involved in study design, data collection, and analysis, as well as manuscript writing and revision. MM completed the qualitative analysis, contributed to study design, and played a key role in manuscript revision. SC led study design and implementation, and supported manuscript revision. HS and RG contributed greatly to study design and quantitative data analysis. HS was additionally involved in data management and providing oversight of the video coding process. AC played a key role in implementation, data collection, and manuscript revision. AD, AG, and TM helped with study design, program management, implementation, and manuscript revision. DW is the principal investigator and a major contributor to all aspects of this study and manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Julia H. Raney.

Ethics declarations

Ethics approval and consent to participate

All participants in the simulation videos provided written consent for the use of video simulation data in aggregated analyses. All nurse mentors provided written consent prior to being interviewed. Ethics approval was granted from the institutional review boards of the University of California San Francisco (14–15446) and the Indian Institute of Health Management Research.

Consent for publication

Not applicable.

Competing interests

Dilys Walker and Susanna Cohen are founding members of PRONTO International and sit on its board of directors. None of the other authors have any conflicts of interest to declare.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1: Table S1

Interrater reliability of additional CAPE variables in Bihar, India, 2015–2017 (N = 73 simulation debrief videos).

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article


Cite this article

Raney, J.H., Medvedev, M.M., Cohen, S.R. et al. Training and evaluating simulation debriefers in low-resource settings: lessons learned from Bihar, India. BMC Med Educ 20, 9 (2020).
