
Development of an integrated milestone assessment tool across multiple early-adopter programs for breaking bad news: a pilot project



The transition of the Accreditation Council for Graduate Medical Education (ACGME) to milestone assessment creates opportunities for collaboration and shared assessments across graduate medical programs. Breaking bad news is an essential communication skill that is a common milestone across almost every medical specialty. The purpose of this study was to develop and pilot an integrated milestone assessment (IMA) tool for breaking bad news using ACGME milestone criteria and to compare the IMA tool with the existing SPIKES protocol.


The IMA tool was created using sub-anchors in professionalism and interpersonal communication skills that are applicable to every specialty and to the ability to break bad news. Two cases of breaking bad news, designed to be “easy” and “intermediate” in difficulty, were used to assess basic skills in breaking bad news in first-year medical residents from six residency specialties. Eight standardized patients were trained to portray the cases in sessions held in November 2013 and May 2014. Standardized patients completed an assessment checklist to evaluate each resident’s performance in breaking bad news based on their use of the SPIKES protocol and IMA tool. Residents answered post-encounter questions about their training and comfort in breaking bad news. The association between SPIKES and IMA scores was investigated by simple linear regression models and Spearman rank correlations.


There were 136 eligible medical residents: 108 (79.4%) participated in the first session and 97 (71.3%) participated in the second session, with 96 (70.6%) residents participating in both sessions. Overall, we were able to identify residents who performed at both extremes of the assessment criteria using the integrated milestone assessment (IMA) and the SPIKES protocol. Interestingly, residents rated themselves below “comfortable” on average.


We developed an integrated milestone assessment (IMA) that was better than the SPIKES protocol at assessing the skill of breaking bad news. This collaborative assessment tool can be used as a supplemental tool in the era of milestone transformation. We aim to assess our tool in other specialties and institutions, as well as to assess other shared milestones across specialties.


Practice points

  1. The assessment of the ACGME core competencies interpersonal communication skills and professionalism is challenging in the clinical setting. The frequently required skills used when breaking bad news can be evaluated as a proxy to both competencies in nearly all medical specialties [1].

  2. Milestones 2.0 emphasizes accurate evaluations, ideally through direct observational tools. Currently, no consistent process exists for summative and formative assessment of trainees regarding the skill of breaking bad news [2,3,4].

  3. The Integrated Milestone Assessment (IMA) tool that we developed for breaking bad news would be a valuable addition to the Milestones 2.0 toolbox [3].

  4. We developed an IMA tool that closely aligned with the well-accepted SPIKES protocol, yet the internal consistency and performance differentiation of our IMA tool surpassed that of the SPIKES protocol.


In 2013, the Accreditation Council for Graduate Medical Education (ACGME) transitioned to milestones as a competency-based assessment tool for medical resident trainees. While the competencies were crafted to provide a shared model of professional advancement among physicians in training, users found it difficult to understand the meaning of competencies in the context of their specialty [3, 5,6,7,8,9]. The resulting development of specialty-specific Milestones 1.0 led to substantial variability in content and progression across milestone levels [7, 8]. These limitations led to the implementation of Milestones 2.0, which allows the use of more consistent, harmonized milestones and sub-competencies as quality assessment tools across medical residency programs [7]. While there are several Milestones 2.0 assessment tools for the six ACGME core competencies, assessment of interpersonal communication skills (ICS) and professionalism (Prof) is particularly challenging in the clinical setting because of variations in faculty frames of reference and the influence of external-to-resident performance [7,8,9]. Faculty may use themselves, other doctors, or patient outcomes as frames of reference when assessing residents or may use their gut feeling or gestalt to translate their observations to numerical assessment scores [4].

Breaking bad news is an essential communication skill that is either explicitly included or strongly implied to be included in the ICS and Prof competencies and related milestones of nearly all medical specialties. While previous studies have utilized various tools to assess residents on the use of appropriate breaking bad news techniques, the development of a validated and simplified tool that includes direct observational assessment in any clinical setting and evaluates progress toward effective delivery of bad news would be valuable for all residency programs and would be a useful addition to the Milestones 2.0 toolbox [9,10,11,12,13,14]. Our objectives were to develop and pilot an integrated milestone assessment (IMA) tool for breaking bad news using ACGME milestone criteria, to compare this tool against the SPIKES protocol (the most commonly used and reported protocol for breaking bad news), and to assess medical residents’ self-perceptions of and comfort with their ability to break bad news [1, 10, 11].


Bad news is defined as any news that drastically and negatively alters the patient’s view of his or her future [15,16,17]. Breaking bad news in a compassionate way is an essential component of the doctor-patient relationship [18,19,20]. Nonetheless, most undergraduate and graduate medical programs lack formative or summative assessments for breaking bad news, leaving many physicians unprepared to handle such conversations with patients [18, 19]. Even experienced clinicians report that having to break bad news is a source of significant stress [4, 17,18,19,20]. Additionally, trainees enter residency programs with different levels of experience, as formal instruction during medical school for breaking bad news is highly variable [20]. Because most trainees lack previous hands-on experience, an intern’s first clinical experience of delivering bad news typically occurs during residency [17, 19]. This lack of practice and experience is less than ideal for patients and their families.

Despite the lack of training in medical school, residents are expected to use an appropriate technique for breaking bad news. Breaking bad news is a skill assessed in various ways for residency milestones, regardless of specialty, and is one of the top three themes in the assessment of ICS [5, 9, 15, 16]. Yet, the approach used to evaluate competency in breaking bad news is variable, subjective, and norm-referenced.

To make assessments specific, objective, and criterion-referenced, Milestones 2.0 encourages the development of validated assessment tools that would inform an institution’s clinical competency committee of the proficiency level within each sub-competency [3]. The SPIKES protocol is a widely recognized six-step protocol that was first developed in 2000 to deliver bad news to cancer patients. It has since been adopted more widely and is now used by clinicians in various settings to communicate bad news to patients in a clear, compassionate, and supportive manner [1]. As the most popular protocol for delivery of difficult news, the SPIKES protocol has reached guideline status in the United States and a number of other countries and is used as a training guide for communication skills [1]. The SPIKES acronym stands for the following steps: Setting, Perception, Invitation, Knowledge, Empathy, and Summary. Although there are several other protocols for the delivery of bad news, such as GRIEV_ING, the ABCDE approach, and the BREAKS protocol, a review of the literature shows SPIKES to be one of the preferred protocols for teaching students to deliver bad news [10, 11, 21]. To our knowledge, a shared common assessment tool incorporated into the sub-competencies of the Milestones 2.0 ICS and Prof competencies does not exist. This project sought to pilot an innovative, harmonized ACGME milestone-based IMA tool across six of seven early-adopter programs at a single institution to evaluate medical resident proficiency in communication of bad news and to compare this new tool to the established SPIKES protocol.


This study was conducted at a single institution and qualified for institutional review board exemption. No formal training in breaking bad news existed for residents at the time of study. The milestones from six of seven early-adopter programs (emergency medicine, medicine, neurosurgery, orthopedics, radiology, and urology) were reviewed to find the most common assessment themes across programs. The breaking bad news sub-competency was present across the six programs, but at varying levels in each of the milestones and sub-competencies. Thus, instead of using each specialty's milestones for their respective residents, sub-anchors in Prof and ICS applicable to every specialty were created and labeled as IMAs (Table 1). Additionally, the SPIKES protocol was used as a measurement tool to lend validity evidence to our milestone assessment (Table 2).

Table 1 Integrated milestone assessment anchors (Supplemental content)
Table 2 SPIKES protocol (Supplemental content)

Two cases of breaking bad news were designed as part of the pilot formative assessment project to evaluate medical residents’ skills in breaking bad news. The two cases varied in difficulty, with one basic case designed to be of “easy” difficulty and another more challenging case of “intermediate” difficulty. Two separate sessions with standardized patients (SPs) were held to assess the skills of residents across the six specialties, with each learner taking the “easy” case followed by the “intermediate” case. The sessions were held in November 2013 (5 months into training) and in May 2014 (10 months into training), based largely on convenience of scheduling. Eight professional and experienced SPs were trained before each session by the medical director of the simulation center to ensure consistency in the cases and resident feedback for each session. The SPs completed two rubric-based assessments for each resident, one centered on the milestones’ competencies and one on the SPIKES protocol. Finally, residents received feedback on their performance and answered post-encounter questions about their training and comfort in breaking bad news after each case.

Statistical analysis was performed using SAS software. The association between SPIKES and IMA scores was investigated by simple linear regression models, stratified by session. Spearman rank correlations were also estimated to describe the strength of the association between scores. Cronbach’s alpha was used to measure the internal consistency of the SPs’ responses on each scoring mechanism, with the alpha values being 0.61 (session one) and 0.44 (session two) for the SPIKES protocol and 0.91 (session one) and 0.83 (session two) for the IMA tool. Item deletion was used to investigate the relative item contribution to the overall scores. A general linear mixed model was used to test for significant changes in SPIKES and IMA scores between sessions. The model included a fixed effect for session (discrete), and the residuals were assumed to follow an unstructured matrix of correlated error terms. Separate models were fitted for each score. A Bland–Altman analysis with 95% limits of agreement was used to investigate agreement between the SPIKES and IMA scores, stratified by session, by comparing differences in standardized scores. The scores were standardized within each session by subtracting the mean score and then dividing by the standard deviation (SD) of the scores.
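The internal-consistency and agreement calculations described above can be illustrated with a minimal Python sketch. This is not the SAS code used in the study: the function names, the use of NumPy, the sample-variance convention (`ddof=1`), and the 1.96 multiplier for the 95% limits are all assumptions made for illustration, and the ratings matrix is hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a 2-D array (rows = residents, columns = checklist items)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of total scores
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item removed in turn (item deletion analysis)."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]

def standardize(scores):
    """Subtract the session mean and divide by the session SD."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / scores.std(ddof=1)

def bland_altman_limits(scores_a, scores_b):
    """Bias and 95% limits of agreement on standardized score differences."""
    diff = standardize(scores_a) - standardize(scores_b)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias, bias - half_width, bias + half_width

# Hypothetical ratings for five residents on three items (not study data).
ratings = np.array([[3, 2, 3], [1, 1, 2], [2, 2, 2], [3, 3, 3], [0, 1, 1]])
print(cronbach_alpha(ratings))
print(alpha_if_item_deleted(ratings))
```

Standardizing before the Bland–Altman step matters here because the two totals are on different scales, so raw differences would mostly reflect the scale mismatch rather than disagreement between tools.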


There were 136 eligible residents across the six early-adopter programs: 108 (79.4%) residents participated in the first session and 97 (71.3%) residents participated in the second session, with 96 (70.6%) residents participating in both sessions. Table 3 summarizes the demographics of the participants. Table 4 summarizes the SPIKES score and IMA score for each of the two sessions.

Table 3 Overall demographics of session participants
Table 4 Summary of performance based on SPIKES vs. IMA assessment tools

For the SPIKES protocol, the overall Cronbach’s alpha values were 0.61 (session one) and 0.44 (session two). For the IMA tool, the overall Cronbach’s alpha values were 0.91 (session one) and 0.83 (session two).

In session one, 4 residents (3.8%) scored 0 in at least one of the three milestone areas and 44 residents (42.3%) scored 3 in all three areas. On a 1–5 Likert scale with 5 being “comfortable,” residents rated their ability to break bad news as a mean of 3.5 (SD 0.8) and their ability to deal with patients’ emotions as a mean of 3.7 (SD 0.8). In session two, 4 residents (4.1%) scored 0 in at least one of the three milestone areas and 9 (9.3%) scored 3 in all three areas. Just over half (55.7%) of the residents reported that they had not received any formal training in breaking bad news during their residency, although the large majority reported either informal training or modeling from other residents or attending physicians.

There was no statistically significant difference in SPIKES scores between sessions (P = 0.70). The mean SPIKES scores were 6.56 (SD 0.98) for session one and 6.52 (SD 0.75) for session two. However, there was a statistically significant difference in IMA scores between sessions (P < 0.001). The mean IMA scores were 6.81 (SD 2.35) for session one and 5.82 (SD 1.93) for session two. The mean difference was 1.02 points (95% CI: 0.48, 1.55). Figure 1 compares SPIKES and IMA scores for each resident stratified by session using a simple linear regression. IMA scores were significantly associated with SPIKES scores for session one (slope = 0.25, P < 0.001) and session two (slope = 0.19, P < 0.001). The correlation between scores was 0.62 and 0.53 for sessions one and two, respectively. Lastly, there was greater variability (SD) for the IMA tool total scores compared to the SPIKES protocol total scores for both sessions (Table 4).

Fig. 1 Linear regression of SPIKES to Integrated Milestone Assessment (IMA) scores (supplemental content)
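As a rough illustration of how the slope and correlation reported above relate, the following Python sketch computes a Spearman rank correlation and a least-squares slope with NumPy. The function names are hypothetical, and the final helper assumes that the reported correlations behave approximately like Pearson correlations and that SPIKES is the outcome with IMA as the predictor; neither assumption is stated in the text.

```python
import numpy as np

def rank_average(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman_rho(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return float(np.corrcoef(rank_average(x), rank_average(y))[0, 1])

def ols_slope_intercept(x, y):
    """Simple linear regression of y on x by least squares."""
    slope, intercept = np.polyfit(x, y, 1)
    return float(slope), float(intercept)

def slope_from_correlation(r, sd_x, sd_y):
    """For simple regression of y on x, slope = r * sd_y / sd_x (Pearson r)."""
    return r * sd_y / sd_x

# Reported summary values: correlation 0.62, IMA SD 2.35, SPIKES SD 0.98 (session one).
print(round(slope_from_correlation(0.62, 2.35, 0.98), 2))
```

Under those assumptions, the session-one summary statistics imply a slope of about 0.26 and the session-two values (0.53, SD 1.93, SD 0.75) about 0.21, both close to the reported slopes of 0.25 and 0.19.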


The scores generated from the SPIKES protocol and the IMA tool were moderately and positively correlated. Because the SPIKES protocol is a recognized step-by-step checklist for breaking bad news, a positive correlation supports the use of the IMA tool to assess the skill of breaking bad news. Furthermore, we believe the IMA tool is better than the SPIKES protocol, because the IMA scores more accurately reflected the difficulty of the case and because the IMA scores were internally consistent, as measured by Cronbach’s alpha. Specifically, there was a statistically significant difference in IMA scores, but not SPIKES scores, between the two sessions, suggesting that the IMA tool better differentiates between high and low performance.

The Cronbach’s alpha values for the SPIKES scores suggest relatively poor internal consistency (i.e., reliability), which could be caused by the relatively small number of items included in the checklist, poor similarity between items, or heterogeneous constructs (i.e., there may be more than one latent variable [multidimensionality] described by the SPIKES score). Item deletion analysis did not identify a component of the SPIKES score that was especially important or that could be removed from session one. However, removing “invitation” (SPIKES protocol #4, Box 2) from session two scores would increase Cronbach’s alpha to 0.57 (from 0.44). In contrast, the Cronbach’s alpha value from the IMA scores suggests a strong internal consistency (i.e., reliability) and surpassed that of the broadly used SPIKES protocol. Again, item deletion analysis did not identify a component of the IMA score that was especially important or that could be removed from either session.

In the context of Milestones 2.0, the goal is to develop shared assessment tools that can be used across specialties to accurately evaluate skills. It is our belief that the IMA tool can be applied to all specialties even though the skill of breaking bad news appears at different milestone levels across the specialties [2, 7]. As Milestones 2.0 develops more harmonized sub-competencies and sub-anchors, we propose that the IMA tool could be used to further evaluate breaking bad news as a skill commonly listed in the ICS and Prof competencies. Additionally, the IMA tool could be an excellent evaluation tool for assessing the skills of medical students and residents in breaking bad news in objective structured clinical examinations or in bedside evaluations, as SP encounters with validated tools are an established method of assessing learners [7, 22]. The data gathered could be useful to quantify performance for an institution’s clinical competency committee and could be used to provide consistent and reliable quantitative data to medical residency program directors. We believe that the incorporation of a format similar to Milestones 2.0 makes the IMA tool more familiar and user-friendly for graduate medical education evaluators compared with the format of the SPIKES protocol.


Our study is limited in its design methodology, as it was a pilot study and tested two different difficulty levels of cases at two sessions during the medical residency year. As this study was a pilot study, all participants remained anonymous at the request of the Office of Graduate Medical Education, Baylor College of Medicine, Houston, TX, USA. This anonymity prevented us from performing a matched-pair analysis on the trainees’ data. The way the cases were tested also limited our design quality: either both the “easy” and the “intermediate” cases should have been tested at the same time, or one difficulty level should have been tested at separate times. Also, residents break bad news routinely, despite the lack of formal teaching in breaking bad news; thus, some residents may gain more practice and proficiency in breaking bad news over time than others. Furthermore, residents may have received training in medical school or prior exposure to clinical settings. Additionally, the data from both sessions were reviewed retrospectively. Another limitation is that, since the sessions were months apart, the SPs knew there had been a previous encounter, which could have injected bias into the second session. Finally, we asked post-encounter questions regarding residency background and comfort with breaking bad news but did not ask pre-encounter questions.


While the skill of breaking bad news is variably present across competencies in Milestones 1.0, this skill is one that is shared by all specialties and is an excellent target for Milestones 2.0’s goal of harmonized assessments. Currently, there is no standardized approach to assess the skill of breaking bad news. We developed the IMA tool to evaluate the delivery of bad news, and it would be a valuable addition to the Milestones 2.0 toolbox. A simplified method, such as the IMA tool, may be easier to use than the SPIKES protocol and may also address more milestone language, making it more valuable to clinical competency committees and more in agreement with Milestones 2.0 requirements. Finally, breaking bad news, while explicitly mentioned in the majority of specialties’ milestones, is not the only skill in the ICS and Prof competencies. By broadening the assessment and language, it might be possible to develop a tool that can assess specific skills required to meet the competencies and the broader skills required to meet the milestones. Our data are valuable to medical residency programs across the United States to further develop and test a shared assessment for breaking bad news that can also be used in milestone assessments by clinical competency committees. As such, the IMA tool demonstrates promise as an effective evaluator of medical resident proficiency in breaking bad news. Additionally, the concept and theoretical underpinnings of this study are essential for the advancement of learner assessment and can serve as a catalyst for further advancements in graduate medical education.

Availability of data and materials

The authors confirm that the data supporting the findings of this study are available within the article.


  1. Baile, W., Buckman R., Lenzi R., Glober G., Beale E., & Kudelka A: SPIKES—A Six-Step Protocol for Delivering Bad News: Application to the Patient with Cancer. The Oncologist. 2000;5(4).

  2. Baker-Genaw K, et al. Mapping Direct Observations from Objective Structured Clinical Examinations to Milestones Across Specialties. J Grad Med Educ. 2016;8:429–34.


  3. Beeson MS. The Emergency Medicine Milestones: With Experience Comes Suggestions to Improve. Acad Emerg Med. 2016;23:1434–6.


  4. Brown R, Dunn S, Byrnes K, Morris R, Heinrich P, Shaw J. Doctors’ stress responses and poor communication performance in simulated bad-news consultations. Acad Med. 2009;84:1595–602.


  5. Chan TM, Wallner C, Swoboda TK, Leone KA, Kessler C. Assessing interpersonal and communication skills in emergency medicine. Acad Emerg Med. 2012;19:1390–402.


  6. Downing SM, Yudkowsky R. Assessment in health professions education. 2009.


  7. Edgar L, Roberts S, Holmboe E. Milestones 2.0: A Step Forward. J Grad Med Educ. 2018;10:367–9.


  8. Edgar L, Roberts S, Yaghmour N, et al. Competency crosswalk: a multispecialty review of the Accreditation Council for Graduate Medical Education milestones across four competency domains. Acad Med. 2017;21:10.


  9. Hilkert SM, Cebulla CM, Jain SG, Pfeil SA, Benes SC, Robbins SL: Breaking bad news: A communication competency for ophthalmology training programs. Surv Ophthalmol. 2016:791–798.

  10. Hobgood C, Woodyard J, Sawning S, Savage K. Delivering The News With Compassion: The GRIEV_ING Death Notification Protocol. MedEdPORTAL Publications; 2010. Available from:

  11. Kaur B. Breaking Barriers to Effective Communication of Bad News -Obstetric Outlook. Open Access Journal of Gynecology and Obstetrics. 2019;2:18–21.


  12. Kiluk J, et al. Teaching medical students how to break bad news with standardized patients. J Canc Educ. 2012;27:277–80.


  13. Kogan JR, Conforti L, Bernabeo E, et al. Opening the black box of clinical skills assessment via observation: a conceptual model. Med Educ. 2011;45:1048–60.


  14. Leep Hunderfund AN, Reed DA, Starr SR, et al. Ways to write a milestone: approaches to operationalizing the development of competence in graduate medical education. Acad Med. 2017;92:1328–34.


  15. Monden KR, Gentry L, Cox T. Delivering Bad News to Patients. Proc (Bayl Univ Med Cent). 2016;29:101–2.


  16. Nasca TJ, Philibert I, Brigham T, et al. The next GME accreditation system—rationale and benefits. N Engl J Med. 2012;366:1051–6.


  17. Ptacek JT, Eberhardt TL. Breaking bad news. A review of the literature. JAMA. 1996;276:496–502.


  18. Reed S, Kassis K, Nagel R, et al. Breaking bad news is a teachable skill in pediatric residents: A feasibility study of an educational intervention. Patient Educ Couns. 2015;98:748–52.


  19. Reed S, Kassis K, Nagel R, et al. Does emotional intelligence predict breaking bad news skills in pediatric interns? A pilot study. Med Educ Online. 2015;20:24245.


  20. Rosenbaum M, Ferguson KJ, Lobas JG. Teaching Medical Students and Residents Skills for Delivering Bad News: A Review of Strategies. Acad Med. 2004;79:107–17.


  21. Seifart C, Hofmann M, Bär T, Riera Knorrenschild J, Seifart U, Rief W. Breaking bad news-what patients want and what they get: evaluating the SPIKES protocol in Germany. Ann Oncol. 2014;25:707–11.


  22. Vora S, Lineberry M, Dobiesz VA. Standardized Patients to Assess Resident Interpersonal Communication Skills and Professional Values Milestones. West J Emerg Med. 2018;19:1019–23.




The Office of Graduate Medical Education and the Office of Curriculum’s Simulation and Standardized Patient Program at Baylor College of Medicine, Houston, TX, USA, are acknowledged for their support of this project.

IRB information

This study was found to be exempt by the Institutional Review Board of Baylor College of Medicine, Houston, TX, USA.


This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Authors and Affiliations



All authors conceived the idea and assisted in design and implementation. C.M. and D.G. performed and verified the analytical methods. A.T. supervised the findings and publication of the work. All authors discussed the results and contributed to the final manuscript.

Corresponding author

Correspondence to Anisha Turner.

Ethics declarations

Ethics approval and consent to participate

Exemption from ethics/Institutional Review Board approval and waiver of informed consent granted: The study was approved by the Baylor College of Medicine Institutional Review Board (BCM IRB), which waived the need for ethics approval and the need to obtain consent for the collection, analysis, and publication of the anonymized data for this non-interventional study under 45 CFR 46.102(d). Protocol #H-51407. All protocols were approved by the BCM IRB, and the activity was deemed a quality improvement (QI) activity designed solely for program evaluation purposes that does not constitute human subjects research. The activity was part of the educational curriculum of the medical school; therefore, informed consent was deemed not necessary. All methods were carried out in accordance with relevant guidelines and regulations.

Consent for publication


Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Turner, A., Gopakumar, S., Minard, C. et al. Development of an integrated milestone assessment tool across multiple early-adopter programs for breaking bad news: a pilot project. BMC Med Educ 24, 313 (2024).
