GENESISS 2—Generating Standards for In-Situ Simulation project: a systematic mapping review

In-situ simulation is increasingly employed in healthcare settings to support learning and improve patient, staff and organisational outcomes. It can help participants to problem solve within real, dynamic and familiar clinical settings, develop effective multidisciplinary team working and facilitate the implementation of learning into practice. There is nevertheless a reported lack of a standardised and cohesive approach across healthcare organisations. The aim of this systematic mapping review was to explore and map the current evidence base for in-situ interventions, identify gaps in the literature and inform future research and evaluation questions. A systematic mapping review of published in-situ simulation literature was conducted. Searches were conducted on MEDLINE, EMBASE, AMED, PsycINFO, CINAHL, MIDIRS and ProQuest databases to identify all relevant literature from inception to October 2020. Relevant papers were retrieved and reviewed, and extracted data were organised into broad themes. Sixty-nine papers were included in the mapping review. In-situ simulation is used 1) as an assessment tool; 2) to assess and promote system readiness and safety cultures; 3) to improve clinical skills and patient outcomes; 4) to improve non-technical skills (NTS), knowledge and confidence. Most studies included were observational and assessed individual, team or departmental performance against clinical standards. There was considerable variation in assessment methods, length of study and the frequency of interventions. This mapping highlights various in-situ simulation approaches designed to address a range of objectives in healthcare settings; most studies report in-situ simulation to be feasible and beneficial in addressing various learning and improvement objectives. There is a lack of consensus for implementing and evaluating in-situ simulation and further studies are required to identify potential benefits and impacts on patient outcomes.
In-situ simulation studies need to include detailed demographic and contextual data to consider transferability across care settings and teams and to assess possible confounding factors. Valid and reliable data collection tools should be developed to capture the complexity of team and individual performance in real settings. Research should focus on identifying the optimal frequency and length of in-situ simulations to improve outcomes and maximise participant experience.


Background
In-situ simulation (ISS) training enables teams to practise and be assessed in their own, familiar clinical environments [1,2]. ISS is often focused on training for low volume, high impact emergencies involving multidisciplinary teams (MDTs) with the aim of reinforcing knowledge and improving the functioning of the clinical team as a whole [3][4][5]. The main benefit of ISS over other traditional simulation approaches is reported as allowing participants to problem solve within their own dynamic setting, which supports the implementation of learning into practice [1,2].
Open Access. *Correspondence: Kerry.evans1@nottingham.ac.uk
ISS has been identified as a useful mechanism to explore and learn from adverse events [6][7][8][9]. Embedding ISS activities underpinned by Human Factors principles can help to focus on the organisational, procedural and contextual influences on clinical reasoning and actions [10,11]. ISS has also been developed to test the synergy or dissonance between micro and macro factors: task factors, organisational factors, internal environments and external environments [12]. ISS interventions have been reported as a mechanism to enhance patient flow, improve the design of clinical spaces, and identify latent safety threats (LSTs) within new clinical settings [13][14][15][16]. The ability to experiment and see what occurs through interactions, attunement and disturbances enables participants to try out different options and consider possible unintended outcomes [17].
Organisational resilience is focused on understanding how healthcare organisations can deliver standardised, replicable and predictable services while embracing inherent variations, disruptions and unexpected events [18]. During the Covid-19 pandemic, ISS proved useful in helping teams prepare in a rapidly emerging situation. ISS interventions included testing and implementing the use of personal protective equipment (PPE) and infection control guidelines, and supporting operational readiness of intensive care units and operating rooms [19][20][21][22][23]. ISS interventions are employed to improve the acquisition of non-technical skills (NTS), task management, situation awareness, problem-solving and decision-making, and to enhance teamwork, while testing and probing real-world organisational systems [1,18,24-27].
ISS offers a feasible and acceptable approach through which individual and team competency can be assessed through simulated scenarios in controlled and standardised clinical settings [28]. Griswold et al. [29] identify that summative assessment using ISS is suited to clinical procedures with clear chains of action and well-defined processes and standards. Clinical competency measurement and assessment tools are less well-defined for ISS and further complicated when individual performance needs to be isolated from the wider team. Concepts such as 'effective communication' are subject to interpretation, and clinical outcomes may be attributed to concepts such as teamwork and coordination in addition to individual clinical skills and knowledge [30].
Although ISS has been identified as a promising approach in healthcare settings, ISS terms and concepts require standardisation, and integrated models of learning are required to provide a more comprehensive and cohesive strategic approach [1,31,32]. The overall aim of the Generating Standards for In-Situ Simulation project phase 2 (GENESISS-2) was to develop evidence-based standards for healthcare professionals, educators and managers interested in developing and implementing ISS interventions in clinical practice. The project was commissioned by Health Education England working across the Midlands and East. A conceptual model of ISS was developed in phase one [33] which proposed four main ISS functions (Fig. 1). The aim of this systematic mapping review was to explore and map the current evidence base for ISS approaches, identify gaps in the literature and inform future research questions.

Methods
We chose to conduct a systematic mapping review to capture the wide evidence base on the main uses of ISS in healthcare. Mapping reviews are specifically designed to describe the extent of research in a field, spanning broad topic areas and research objectives, and to identify evidence gaps to be addressed by future research [34]. The report follows the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement guidelines [35]. The review protocol was registered on the PROSPERO database (CRD42019128071). Recommendations for systematic mapping reviews [36][37][38] guided the review conduct.

Search
The search strategy was developed for the MEDLINE, EMBASE, AMED, PsycINFO, CINAHL, MIDIRS and ProQuest databases. The literature search was completed in March 2019 and updated in October 2020. A summary of the search terms is included in Table 1, and supplementary file 1 provides details of the full MEDLINE search strategy.
Papers were included in the review if they met the following criteria: (i) published in English; (ii) based in an Organisation for Economic Co-operation and Development (OECD) member country (to enable greater comparability between health systems and socio-economic contexts); (iii) reported quantitative primary research, including randomised controlled trials, quasi-experimental studies, cohort studies, economic evaluations and observational quantitative studies; (iv) included healthcare practitioners as participants (individuals and teams); (v) reported simulation training or interventions conducted in any patient care setting; and (vi) reported quantitative measures of safety, governance, quality improvement, technical and non-technical skills performance, and educational or clinical outcomes. Exclusion criteria were: (i) papers reporting simulation activities conducted in educational institutions and centres, simulation laboratories, training suites or non-patient areas; and (ii) qualitative studies, secondary data analyses and literature reviews. The timeframe for inclusion was from inception to October 2020.
Papers retrieved from the literature databases were imported to an EndNote library, and duplicate records were identified. Two researchers (KE, JW) independently screened the titles and abstracts against the review inclusion and exclusion criteria. Full text papers of the remaining citations were then retrieved and independently assessed by two researchers (first stage: KE, JW; updated search: KE, AC). A third

Quality assessment
The quality of studies included in the review was evaluated using a range of established critical appraisal tools selected for the particular study design: Quality Assessment Tool for Before-After (Pre-Post) Studies with No Control Group [39]; The Cochrane Risk of Bias tool for Randomised Controlled Trials [40]; The Joanna Briggs Institute (JBI) Checklist for Quasi-Experimental Studies [41]; CASP tool for cohort studies [42]. Two independent researchers assessed study quality (first stage: KE, JW; updated search: KE, AC) and banded studies as low, medium or high quality. There was consensus between the two researchers. Although no studies were excluded on the basis of quality, the quality assessment was used to identify the strengths and limitations of the review [43]. JBI levels of evidence [44] for included studies were also reported.

Data extraction
Data extraction forms were designed and piloted before data extraction began, and extraction was completed by two independent researchers. Data extraction tables consisting of numerical and textual data presented the study characteristics, results and quality assessments.

Data analysis and synthesis
Synthesis of the extracted data was conducted in a descriptive and tabular way [45]. Categories were developed through an iterative process, focusing on the main aims or purposes of ISS interventions, illustrating the range of methods, intervention components, duration, populations, outcome measures and gaps in the research within and between each category. A description of the quantitative data is presented in tables to enhance explanation, understanding and coherence of the findings [37].

Results
The search identified 6,105 potentially eligible papers. Duplicate papers were removed (n = 1,493). The remaining papers (n = 4,612) were then screened based on the information provided by the title and abstract. Potentially eligible papers (n = 258) were retrieved for full text assessment by two independent reviewers (KE, JW) and any disagreement was resolved by discussion with a third reviewer (BB) until agreement was reached. Agreement between the two reviewers was very good (kappa = 0.9, p < 0.001). Excluded papers (n = 189) a) did not include relevant outcome measures, b) did not report ISS activities or interventions, or c) were not conducted in OECD countries. The literature search and inclusion process are detailed in the PRISMA Flow diagram [46] (Fig. 2). Sixty-nine papers met the inclusion criteria and were included in the mapping review. Findings were organised into categories to reflect the aims and objectives of the included studies using ISS: 1) as an assessment tool; 2) to assess and promote system readiness and safety cultures; 3) to improve clinical skills and patient outcomes; 4) to improve NTS, knowledge, comfort and confidence. These categories are presented in turn in the sections that follow.
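The screening counts reported above can be checked with simple arithmetic, and the agreement statistic can be illustrated with a short calculation. The following is a minimal sketch in Python and is not part of the review's methods. The function names are hypothetical, and the two-by-two agreement counts passed to `cohen_kappa` are invented purely for illustration, because the review does not report the underlying agreement table.

```python
# Illustrative sketch only: function names are hypothetical, not from the review.

def prisma_flow(identified, duplicates, full_text, excluded_full_text):
    """Return the derived counts for a simple PRISMA-style flow."""
    screened = identified - duplicates            # records left after de-duplication
    excluded_at_screening = screened - full_text  # removed on title and abstract
    included = full_text - excluded_full_text     # papers retained in the review
    return {
        "screened": screened,
        "excluded_at_screening": excluded_at_screening,
        "included": included,
    }


def cohen_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two reviewers making include/exclude decisions."""
    total = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / total   # observed agreement
    a_include = (both_include + only_a) / total        # reviewer A inclusion rate
    b_include = (both_include + only_b) / total        # reviewer B inclusion rate
    # Agreement expected by chance if the two reviewers decided independently.
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (observed - expected) / (1 - expected)


# Counts as reported in the Results.
flow = prisma_flow(identified=6105, duplicates=1493,
                   full_text=258, excluded_full_text=189)
print(flow)

# ASSUMPTION: invented agreement counts for 100 papers, for illustration only.
kappa = cohen_kappa(both_include=45, only_a=3, only_b=2, both_exclude=50)
print(round(kappa, 2))
```

With the reported counts, 4,612 records remain for title and abstract screening, and the 189 full-text exclusions leave 69 papers, the number given in the abstract. The invented agreement table is chosen only to show how a kappa close to 0.9 can arise; it says nothing about the reviewers' actual decisions.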

ISS to assess performance and identify risks
Eighteen studies conducted ISS as a method of assessment (Table 2). Studies were conducted in the US, Canada, Denmark, Sweden, the UK, Germany and Switzerland. Most studies were observational (n = 17), with one study reporting a quasi-experimental design to compare outcomes using different resuscitation equipment [47]. Sample sizes (where reported) ranged from 12 to 277 participants. Five studies reported ISS interventions to assess performance and identify risks: medication errors in emergency departments (EDs) [48], LSTs in a children's medical centre [49], paediatric and neonatology departments [50], paediatric tracheostomy care management in EDs and intensive care units (ICUs) [51], and blood transfusion policies in the operating room [52]. Four studies reported ISS interventions to assess compliance against clinical guidelines and standards: cardiac arrest guidelines [53], sepsis guidelines [54], blood transfusion policy and identification [52] and cardiopulmonary resuscitation (CPR) performance [55]. Four studies reported ISS interventions to assess clinical response and task completion time [56][57][58][59], with three studies employing a pre/post ISS evaluation to evaluate the effectiveness of training programmes [60][61][62]. ISS was used to test and assess the safety of new equipment and procedures in two studies: to assess the use of electronic health records in the ICU [63] and to assess and compare traditional and automated external defibrillator supplemented responder models [47]. One study [64] conducted ISS to assess performance-relevant effects of task distribution and communication amongst emergency teams.
Auerbach et al. [45] and Kessler et al. [50] employed voluntary participation for ISS assessments, although the authors noted that this may introduce selection bias, as individuals agreeing to participate may be more or less skilled than other staff [53]. In addition, scheduling of ISS may have resulted in providers and departments preparing for the day (training effect). Lipman et al. [53] reported that clinical timings may have been underestimated due to participation of highly skilled teams, the close proximity of clinical departments and participants to the drill area, absence of patient family members, participant knowledge of the imminent ISS activity and training conducted during daytime hours [55,58]. Involvement of participants without other clinical duties at a scheduled, announced time may limit the generalisability of the findings [53]. ISS performance was assessed by direct observation and by accessing feedback from participants. Two studies used evidence-based clinical standards to assess performance, quality and safety metrics [53,54]. Outcome measures based on established standards were reported to be easily measurable and reproducible, and to reflect clinical metrics and benchmarks. However, ISS assessment can be limited by the inability to reliably assess the impact on clinical outcomes due to the low occurrence of critical events [61], and by the poor sensitivity of outcome measures to assess communication skills in functional teams [57]. Most of the included studies used locally developed checklists, developed through previous pilot testing or amended from checklists developed for other clinical settings. Studies which reported team and system level assessments used established outcome measures including the Simulation Team Assessment Tool [53,65], the Anaesthetists' Non-Technical Skills (ANTS) taxonomy and behaviour rating tool [66,67] and the TeamSTEPPS Team Performance Observation Tool [60,68].
Authors reported benefits of conducting ISS to identify risks and hazards in clinical environments and improve the ability to detect errors. ISS was reported to help identify system susceptibilities, evaluate the effectiveness of training programmes and highlight variability in performance across different departments and systems. Overall, authors reported that ISS was beneficial as a method of assessment, providing useful information to inform future improvement initiatives.

ISS to assess and promote system readiness and safety cultures
Nine studies conducted ISS interventions with the aim of improving system or departmental performance outcomes (Table 3). Studies were conducted in Denmark, the UK and the US. All studies were observational, and data were collected via participant questionnaires and/or direct observation (or a review of audio-visual recordings) by trained assessors or experienced clinicians. Five studies were conducted in EDs [69][70][71][72][73], two in operating theatres [74,75], one in a neonatal ICU [13] and one in an obstetric unit [76]. Sample sizes (where reported) ranged from 14 to 289 participants. ISS interventions varied from single training sessions to regular training sessions over a period of months. All studies included participants from multi-professional healthcare teams. Studies reported that ISS was used as a way to assess, prepare and orient staff to new facilities [70-72, 76, 77] and promote safety cultures across departments or systems [69,[73][74][75]. All of the studies reported improvements in readiness scores and safety attitude outcomes.
Data were mainly collected via pre and post participant self-assessment questionnaires. Outcomes included identification of LSTs, assessment of departmental readiness scores, safety cultures and attitudes, orientation, and team and departmental performance. Identification of LSTs was captured via observation and via participants during ISS debriefing.
Ventre et al. [76] identified that although clinicians participated in a basic orientation to the new space, ISS provided additional opportunity to evaluate whether the electronic and information systems, equipment and devices performed adequately before opening. Kobayashi et al. [72] conducted ISS when a new ED was almost ready to open, yet with enough time remaining for adjustments and corrective actions on identified issues. However, ISS may assist not only in testing the new facility but also in designing the environments [78].
Three studies conducted ISS to improve safety compliance, cultures and attitudes [73][74][75]. Although safety and teamwork climates were reported as readily measured and amenable to improvement through ISS, it was difficult to demonstrate an association between team and safety training and patient outcomes, as improved clinical outcomes are multifactorial [74] and evaluating the role of team versus organisational processes can be challenging [73]. Paltved et al. [73] discussed how prolonged engagement with ISS interventions and longer follow-up periods may be required, as safety attitudes do not suddenly appear but emerge over time. Jaffrey et al. [75] reported that ISS emphasises the importance of safety measures and empowers participants to make changes and implement them effectively. ISS provides both a learning and a working environment which incorporates the complexity and resources found in the clinical environment and supports knowledge transfer to actual practice [73].

ISS to improve clinical skills, performance and clinical management
Seventeen studies conducted ISS interventions with the aim of improving clinical skills, performance and clinical management (Table 4). Studies were conducted in Australia, Israel, Italy, the UK and the US. Ten studies were pre/post observational studies which included ISS interventions, two were prospective cohort studies, two were randomised controlled trials (RCTs), one was an observational study with a control and one was a multicomponent quality improvement project. Studies were conducted in emergency and resuscitation teams and departments [79][80][81][82][83][84][85][86], paediatric and neonatal care settings [87][88][89], in-patient ward settings [90][91][92], coronary care [93], an obstetric unit [94] and a mental healthcare setting [2]. Where reported, ISS intervention frequency varied from single training sessions delivered over one day to repeat ISS training lasting 18 months. The length of ISS was reported to last 30 min to 3 h. Most studies included participants as multi-professional healthcare teams, with two studies including doctors and one including only nurses. Sample sizes ranged from 22 to 303 participants. ISS frequency, outcomes and authors' conclusions are presented in Table 5.
Some studies which involved more complex practices and clinical outcomes implemented regular ISS interventions over longer time periods. Andreatta et al. [87] conducted paediatric mock codes (resuscitation scenarios) on a monthly basis for 48 months and reported that hospital survival rates improved significantly over the study period. Knight et al. [84] conducted 16 paediatric ISS sessions over 18 months and reported that survival rates had improved when compared to historical controls. Other studies reporting favourable outcomes for regular ISS training included anaphylaxis management [79], sepsis management [90], response times to hospital emergencies [91], detection of arrhythmias [81], management of medical deterioration [2,89] and CPR performance [83,86].
Studies which included more easily defined or isolated tasks reported one to three ISS sessions as effective in improving: infection control practices [26]; thoracotomy procedures [93]; response times and management of postpartum haemorrhage (PPH) [94]; sedation practices [80]; and resuscitation response times [82].
Outcome measures included self-reported confidence scores, performance scores, management and leadership scores, communication, and self-reported anxiety and knowledge. Outcome measures, ISS frequency and outcome scores are presented in Table 7.
• Significant improvements in confidence scores were reported for single session [96,98,111,114], three session [112,117] or regular departmental training [2].
• Improvements in participants' performance scores were reported in six studies [24,71,96,104,108,113], with most studies conducting a single ISS intervention.
• Two studies reported significant improvements in participants' management and leadership scores following a single session [111] and three session ISS intervention [112].
• Two studies [71,118] reported an improvement in communication scores following 1-3 ISS interventions.
• Two studies reported significant improvement in anxiety scores following a single ISS intervention [104,111].
• Four studies reported a significant improvement in participants' knowledge scores following a brief ISS intervention [2,101,113,115].
Rubio-Gurung et al. [24] compared a four-hour ISS intervention to improve neonatal resuscitation across maternity units with control groups (n = 12, 6 units in each group). The median technical score was significantly higher for the ISS groups compared to the control groups. In the ISS groups, the frequency of achieving a heart rate of 90 per minute at 3 min improved significantly and the number of hazardous events decreased significantly. Four studies which compared ISS groups with control or comparison groups reported no statistically significant difference in outcomes: Gundrosen et al. [28] compared one-hour lecture-based training for nurses with ISS training on participants' situational awareness and team working (ANTS taxonomy); Crofts et al. [115] compared an ISS intervention for obstetric emergency management with training conducted in a simulation centre; Villemure et al. [118] compared ISS in post anaesthetic care units with a control group (no particular interprofessional education); Dowson et al. [112] compared regular ISS training to improve nurses' clinical confidence in the management of paediatric emergencies with a control group (mandatory resuscitation training).

ISS settings and methods
Studies conducted ISS interventions in in-patient care settings, predominantly in adult and paediatric EDs, obstetric/maternity units, cardiac response teams, adult and paediatric ICUs, and operating rooms. Data collection methods included direct observation, video review and data collected from simulation or clinical equipment. Participants' knowledge, anxiety, comfort and safety attitudes were exclusively measured by self-reported questionnaires. There was a range of methods between and within studies to measure task performance, clinical management, teamwork and communication (including assessment from direct or video observation), alongside participants' self-reported outcomes and/or clinical outcomes data.
Studies used various tools to assess performance during ISS interventions including: [126]
• Communication and collaboration [127]
The benefits and limitations of conducting ISS reported across all included studies are summarised in Table 8.
Lamé & Dixon-Woods [128] make an important distinction between research which is conducted about simulation and research conducted through simulation. The findings from this review include both of these approaches, which at times overlap, studied through various experimental designs. Research conducted about ISS (where ISS was an active intervention) included studies exploring acceptability and usefulness of ISS to clinicians and educators and evaluating the ability of ISS to identify LSTs and improve individual, team and system-level outcomes. Research conducted through ISS often included ISS as part of a multicomponent approach to improve clinical skills, performance and outcomes.
ISS outcomes were used to highlight where additional or new methods of training might be required to improve the quality of care, to identify LSTs and to explore the accuracy and efficiency of task completion over the period of a working shift. Exploring the factors that can affect variations in adherence to clinical procedures, outcomes and performance may help to uncover where and why errors occur. ISS has the potential to reveal the constraining and facilitating mechanisms which impact performance and to identify modifiable factors at the individual, departmental, institutional or system level [52][53][54].
Some multicentre studies conducted to assess clinical performance used validated tools to assess adherence to guidelines and departmental readiness scores. The ability to standardise simulation across participating sites can help to isolate independent variables and to reduce the risk of bias introduced by variations in local contexts [129]. Differences in performance can be explored between sites and be used to generate theory about why differences may occur. For example, Auerbach et al. [53] used ISS to explore how hospital characteristics related to adherence to paediatric cardiac arrest guidelines across four paediatric EDs. ISS outcomes based on clinical standards can serve as a proxy for real performance, enhancing the external validity of the study findings [54].
There were considerable variations in the frequency of ISS sessions, length of ISS sessions and use of announced and unannounced ISS. However, the length and frequency of ISS were not always reported. Studies focused on relatively straightforward, easily defined or isolated tasks reported improved outcomes after one to three ISS sessions [80,82,88,93,94]. Studies involving more complex practices or outcomes seem to require interventions over longer time periods [2,79,84,87]. This may indicate a potential benefit of ISS to support complex skills acquisition through behavioural learning strategies, where skills are developed through repetition and behaviour change occurs through feedback from the simulation activity and interaction between the task, the environment and the team.
Most of the studies included in the review used locally developed checklists, developed through previous pilot testing or amended from checklists developed for other clinical settings. In general, there was a paucity of reporting of the validity and reliability of assessment measures and tools. Studies which reported team and system level assessments adopted more established outcome measures [65,67,68,120,121,123]. Measurement methods for assessing individual competencies involved in complex care processes are less well-defined, and further complicated when individual performance needs to be isolated from the wider team. Concepts such as 'effective communication' are subject to interpretation and clinical outcomes may be attributed to concepts such as teamwork, communication and leadership in addition to clinical skills and knowledge [30]. Griswold et al. [29] identify that for clinical procedures with clear chains of action and well-defined processes and standards, summative ISS assessment is much simpler than in more "dynamic, multifactorial practices in which cognitive, procedural, and communication skills are simultaneously applied in a team environment" (Griswold et al. 2017, page 170). Criterion standards and benchmarks of quality performance need to be further developed to reliably and accurately capture the individual performance which is linked to relevant clinical competencies.
Goldshtein et al. [130] stated that literature reporting the effects of ISS interventions on patient outcomes is scarce. Surrogate endpoints, such as response times, are frequently adopted but do not truly represent the complex factors that lead to improved patient outcomes [130]. In this review, ISS was often incorporated within larger, multi-component educational improvement projects. Most studies were observational with only thirteen adopting experimental designs. Small, observational studies are often limited by the potential for introducing selection bias, observer bias and confounding. Lamé & Dixon-Woods [128] state that ISS which can reproduce situations identically before and after the intervention increases confidence that the intervention can explain the variation in outcomes. Time-series designs which collect data at multiple times before and after the intervention, or controlled studies, are required to provide greater confidence in the findings of ISS interventions [128].
Unannounced ISS (or mock drills) were mainly conducted where studies sought to carry out a system audit or to assess clinical performance against a benchmark, whereas announced ISS, which gave participants varying levels of notice and access to supportive resources, were mainly conducted as part of improvement projects or as part of clinical training. Posner et al. [32] highlight that both announced and unannounced ISS approaches can be conducted to detect LSTs, although assessment of factors such as response times and leadership assignment is more suited to unannounced ISS [55,58]. Freund et al. [105] compared unannounced to announced (one hour prior to ISS) team training and reported no significant differences on self-perceived learning and self-reported stress outcomes. It is reported that ISS can pose numerous threats to an individual's psychological safety, which can have a negative effect on learning. Participants may feel under increased scrutiny from colleagues or burdened by their other clinical work. Psychological safety can be supported by including a pre-simulation brief to discuss training objectives and expectations and to develop trust between educators and learners [32,131,132].
Cheng et al. [129] recommend an extension to the CONSORT guidelines for reporting simulation-based research to include demographics and clinical characteristics of participants and the setting. This should include participants' previous experience with simulation, skill mix, staffing, capacity pressures and other relevant features to facilitate an assessment of the external validity of the findings [53]. A review by Goldshtein et al. [130] reported that it was difficult to assess who was participating in ISS and their prior experience of ISS participation. Lipman et al. [53] reported that clinical timings evaluated in their study may have been underestimated due to participation of highly skilled teams, the close proximity of clinical departments and participants to the drill area, absence of patient family members, participant knowledge of the imminent ISS activity and the daytime hours [55]. In future studies, detailed information on other potential sources of bias and other confounding, contextual and system level factors should be presented to assist researchers, educators and clinicians to assess the relevance of the findings to other settings and participant groups [129].
ISS to help teams train, rehearse and practise for low-frequency, high-impact events was a frequently reported simulation activity in the review. The theoretical basis for ISS as a training intervention was not reported in many studies; however, ISS as a training intervention maps to concepts within cognitive learning approaches, where participants' preconceptions are explored and new or unexpected events are presented via the simulation activity to challenge those preconceptions [133]. ISS is also underpinned by situativity theory, in which knowledge transfer is considered optimal when the learning environment matches the environment in which the knowledge will be applied [28,131,134]. During the Covid-19 pandemic, ISS has been used to help staff prepare for emerging challenges. ISS interventions have helped to identify LSTs, highlight inadequacies in guidelines, protocols and policies, improve the correct use of PPE, and orientate staff to newly established Covid-19 intensive care units and wards [135,136].

Study strengths and limitations
This review should be viewed in light of several limitations. It did not include grey literature, conference abstracts or academic theses. Grey literature is likely to include practice-based ISS improvement and educational projects that would further illustrate the current uses of ISS in healthcare settings. However, this review highlights the lack of rigorous ISS intervention research and the urgent need to increase research output and methodological quality. The mapping review aimed to provide an overview of the broad published ISS literature and did not conduct the in-depth analysis of study outcomes needed to enable meaningful comparisons. The review has highlighted different categories of and approaches to ISS, identifying common outcome measures and measurement tools. Mapping reviews are distinguished by the presentation of the data in a digestible format and by an assessment of whether the total population of studies is similar enough to undertake a coherent synthesis of the current data [36]. Therefore, this review may provide a useful starting point for other researchers seeking to develop and define parameters for future ISS systematic reviews.

Conclusion
This review presents an overview of the literature on ISS interventions by mapping the study objectives, methods, outcomes, barriers and facilitators at work across different settings. The mapping review provides a useful summary for healthcare educators and researchers seeking to develop ISS strategies in healthcare settings. Additionally, it highlights important evidence gaps, including the need to (1) identify appropriate tasks capable of standardisation and reproducibility in ISS assessment scenarios; (2) capture adequate demographic data from participants to assess the impact on outcomes (e.g. work patterns, skill mix, experience, ISS experience and exposure, willingness to participate); (3) explore different methodologies in an attempt to reduce bias and confounding factors; (4) develop and validate sensitive data collection methods and tools to capture the complexity of team and individual performance in real settings; and (5) identify the optimal frequency and length of time to complete ISS, considering feasibility and acceptability in the clinical setting. This systematic mapping review has provided a useful framework to navigate the expansive and diverse research literature on a relatively new and underdefined approach: the use of ISS to assess individual, team and departmental performance. There is currently a lack of consensus on the rationale for conducting ISS interventions, and well-designed studies are required to identify the potential benefits of ISS and its impacts on patient outcomes. Overall, studies reported ISS to be feasible and beneficial in addressing various learning and improvement objectives. The components and mechanisms employed across the included studies, which were designed to address a range of objectives, can inform the future design of ISS interventions to meet specific objectives.