
Table 1 International Test Commission Guidelines and Resident Feedback Regarding Evaluation^a

From: An interpretive phenomenological analysis of formative feedback in anesthesia training: the residents’ perspective

Theme | ITC Guideline Recommendation | Resident Perspective on What Works

1

Competencies of those administering assessments

1.1

Professional and ethical standards that affect the way in which the process of testing is carried out and the way in which test users interact with others involved in the process.

“Some are chatty and some are not. Some will sort of teach intermittently as they do it, and some will not. And some are more invested in the structure, and some are not.”

“I don’t even think we got to the structured thing. But I wouldn’t say that’s not for lack of trying or out of interest on the part of either me or the staff. Like he was into it. Pulled it up and was like, ‘Okay, I’ve got this thing. We’ve got to go through it.’”

The way in which the staff (test user) engaged the resident in the assessment was not professional. If a task-completion orientation is taken by the person conducting the assessment, then the resident is going to follow that example and know that this is just a thing to complete, not an important part of the learning process.

1.2

Knowledge of and respect for the rights of the test taker.

“There’s a lot of evaluation fatigue, I think. Because literally every single day we get at least one. And then we have Wednesday and we get 3. And then we get all these other ones on top of it.”

“I think over time people will just get fatigued with it and the value that we get out of these will just slowly start to wane. We are already seeing that it can be very useful but it can also be very short and quick and it can also be done in that sort of haphazard manner.”

Residents are not a part of the program design and implementation, leading to forms and processes that might work for a department but are not sensitive to the learning needs of learners.

1.3

Knowledge of basic psychometric principles and procedures and the technical requirements of tests (e.g. reliability, validity, standardization)

“I think that this is super staff dependent. Some staff can very efficiently sort of run through things, any salient points. And other staff kind of will get to a point and then pontificate a little bit, and use it as a teachable moment and a bit of a discussion point. So I mean it kind of depends on the staff. Because it’s a staff-led sort of feedback model, it depends on them and what they’re going to do with it.”

This comment raises concerns about threats to reliability due to inconsistent use.

1.4

Knowledge of the specific requirements and processes of the testing tools relevant to one’s area of practice. This includes relevant activities of test administration, reporting, and the provision of feedback to those being assessed.

“The quality is highly variable, I think, depending on the staff and depending on the day. And even with that, some staff are good at it, some staff aren’t. And some staff don’t do it all.”

“I would say 100% give you some sense of how you’re doing, just throughout the day. Maybe 90% fill out the forms online. And the number that sit down with you at the end of the day are maybe 20%. That would actually take you aside and sit down with you and talk to you about your performance.”

“I think your study is going to come down to it’s staff dependent. Some will solicit your feedback [as a learner] and say, “How can I be a better teacher?” And some, it’s really a one-way sort of thing.”

“Some people are good at giving evaluations and it does not matter what the form is, it will be valuable. Some people are just not good at it or don’t care. You can make them do it but it’s not going to add value. Or someone might be very good at giving verbal feedback and not so good at the written feedback. But, at the end of the day, the only thing the program has on record is the written feedback. So, I think there are some limitations, but that’s kind of inherent when you work with 80 different staff.”

Although teaching and assessment of trainees’ skills are competencies, assessment is not explicitly taught or evaluated. This results in highly variable knowledge and skills.

1.5

Oral, written and interpersonal communication skills sufficient for the preparation of test takers, administration of tests, and the provision of feedback of test results.

“It wasn’t too short, it wasn’t too long. We had a great day, a standard day, got done on time, took the full half hour, not in the OR, reviewed in this (private) room. It was good. You know, it was nice structured formal feedback. There were very good points and observations discussed.”

“Some staff appreciate that there’s variability between the practices. Some staff even if they acknowledge it still chastise you sometimes in a negative way if you don’t do something how they do it. So that can sometimes be frustrating and sort of mar any kind of other feedback you’re going to get from them that day because you know it is stupid and annoying and you can’t do anything about it, I find.”

The interpersonal skills of the staff were not sufficient in this reported experience. Feedback must be about valid observations of the competencies being developed, not the preferred habits of a specific staff member. Further, to report being “chastised in a negative way” indicates that the feedback was not delivered in an effective manner.

1.6

Conduct communications with due concern for the sensitivities of the person being assessed and other relevant parties.

M: “On the form we currently get on [the online system], there’s a thing to click if you’ve had a chance to discuss this with your preceptor. I always just click yes, even though pretty much I never. You know, a lot of days you don’t actually even really have any kind of formal feedback other than just like ‘Good job today. See ya later.’”

T: “I click whatever they clicked.”

M: “Me too.” (laughs)

T: “Even if they’re like ‘we met,’ I’m like, ‘Yeah, sure we met.’”

M: “I agree with you.”

T: “Yeah. Or we didn’t and like, ‘You did say something, but whatever.’”

It is important to acknowledge the power dynamic of the preceptor over the resident. An online tool providing an option to disagree with the reporting of a preceptor is unlikely to produce results because the residents do not feel they are in a position to disagree.

“We did it [formal private conversation] in the OR… they (assessor) were like, ‘Do you care?’… [I said] Everybody just watched me do all the things you’re going to talk about to me right now. So…”

It is not considerate of residents’ potential sensitivities to ask, in front of other people, whether the feedback can be given there, when the standardized assessment tool indicates taking a small amount of time in a private space. Given power dynamics, it is unlikely that a resident would say they want the standard private time if the senior staff conducting the assessment indicates they do not want to.

1.8

Knowing when and when not to use tests.

“It was complicated to the point of not possible on our day. We just had a busier day. And all day we were kind of like, oh, we’ve got to try to find time for this, maybe we’ll fit it in here, there. Oh, we’re not supposed to do it in the OR. And then, in the end, kind of like half did it in the OR because it was a busy day. I don’t think I even got the structured thing.”

“I had the opposite experience. I was on OBs in a gynae room with like 3 cases and a fast surgeon. So we just took our time. My staff was in the corner with a folder, doing the assessment, watching me. And we literally at the end of the day could like …because we finished at like 3 pm, just went and found a room and did the whole structured feedback. There was tons of time to do it. But I can totally see if you have a busy day and lots of things are going on that it would be very difficult to do.”

Timing is a logistical constraint for these assessments. Setting up a system that does not force assessments when the time is not available would improve the quality of assessments when there is appropriate time to complete them.

1.9

Choice and evaluation of alternative tests

“I think that there are days where the ANTS part is more important and days where the observational part is more important. …Like for the stuff we don’t do a tonne of, like weird blocks or fibre optics or thoracic epidurals, or whatever it might be, those are days that [fit the ANTS] better. Having the structured feedback for those is pretty important.”

We trialed two standardized feedback forms. Having choice was contextually appropriate from the residents’ perspectives.

1.10

Knowledge, understanding and skills relating to the process of testing: What test users need to be able to do to administer, score, and interpret tests.

“If we are talking about quality, it’s not only a question of whether or not they fill it out but also what they put in there. So some staff will go on no matter what, just fill it out like in a row, kind of wherever they think you fit in. Whereas some staff are very thoughtful and you can tell they put a lot of effort into it to give you specific feedback of things you can actually work on. And then some other staff will say ‘keep reading,’ which we see a million times.”

Test users are critical to the process and residents see a high degree of variability. This influences their learning.

1.11

Report writing and feedback mechanisms that are accurate, timely, consistent and useful. Include within written reports a clear summary, and when relevant, specific recommendations.

“The feedback that’s most useful, when it was really good, usually revolves around decision-making and where we can identify points where critical decisions had to be made, the alternatives, and then giving me feedback about what can be done next time. That becomes more useful in the grand scheme of things.”

“Most staff, I would say, make an effort to make at least some sort of acknowledgement of how the day went… if there was an issue, that would be brought up at that time. Whether that gets translated in terms of written output …it takes a big steep drop-off after that. Because I think some people really sort of substitute what they’ve talked about as more meaningful and not really necessary to, you know, do the written thing if you’ve discussed it.”

“I’d say the minority would actually at the end of the day like bring you into the lounge or somewhere and sit you down, and actually do like a ‘what you did well’. Like, try to be more structured. Less than 10% actually. The vast majority are like ‘See ya!’”.

“The staff that give the more high quality feedback, it’s gotten really useful and the feedback prompts very good reflection that I then carry over to, you know, the rest of rotation. And some I’ve gotten is completely useless, and also sometimes doesn’t match between the written and oral. So at the end of the day, they’ll say “Good job, everything went well today, no issues.” And then I get the written form back and it says “I had to be in the room for technical skills.” You know, it totally doesn’t match my perception and it doesn’t match what they said. It is not reflective of me and the actual experience we had in the OR.”

The structure and forms are conducive to accurate, timely, consistent and useful feedback, though whether this happens is variable depending on the assessor’s knowledge, attitudes and skills.

2

Characteristics of Standardized Assessment Tools and Procedures

2.1

Supported by evidence of reliability and validity of their intended purpose.

“I think in terms of the timeliness and the face-to-face components of receiving feedback, I think these standardized tools ensured that that actually happened because there was something more structured that we both had to pay attention to. It was more than just like a “Good day. See ya.” So I think it did impact that significantly. All of us, even if it was only 5 min, had a point in time where we knew we were getting feedback and our preceptor knew they were giving us feedback, and it was happening on the day, face-to-face.”

Introducing a standardized form increased the consistency of feedback, which is an indication of reliability.

2.2

The assessment procedure provides evidence to support the inferences that may be drawn from the test.

“I think it’s a general consensus that some days are good and some days are bad. And for me, the way that you address that is you have a cumulative number of experiences with a certain staff. And at the end of a time period, you have a more encompassing thing when you have some time to deal with it.”

“I think that because everyone is going through a structured thing, you at least have a face-to-face time to discuss any issue that might have come up. So I think you aren’t going to run into what [participant] was talking about with regard to having a brief discussion of the day and then getting an eval online that’s like ‘What?!’ Because there were specific times where you’re going through the thing, and the staff would say, ‘I thought you did this really well’ or whatever, ‘but maybe you were a little bit… I don’t know what happened here.’ And you could be like, ‘Oh, that was because…’ And then it was discussed and it was sort of like nothing could be hidden with that because it’s face-to-face. So I think as long as you’re following some sort of structure and you have a face-to-face discussion about it, the [verbal and written feedback will be reliable].”

“We have to do quarterly… There are supposed to be reviews with our academic mentor. And [the forms submitted online] is the information that our academic mentor has about us. They read most of or all of the feedback forms that we get on these daily things. We have these meetings 4 times a year and we discuss them. So my mentor is like super on top of meeting 4 times a year. She goes through and she literally picks out things that people have written. So if it’s an informative evaluation from a given day, I would say that is useful for her because that’s how she’s evaluating me and like doing her quarterly review of how I’m doing.”

Overall, the program has safeguards against specific inaccurate forms because a collection of forms, in tandem with a mentorship relationship, is used quarterly to support resident progress.

2.3

Logistically feasible within and related to the test setting.

“I think day-to-day, to rely on the fact that you’re going to have half an hour to sit down and talk about something is unreliable.”

“The verbal feedback that’s given on the day of, it’s much more specific, it’s more precise, it’s more relevant, and a lot more useful.”

A range of scenarios was discussed. When logistically feasible, the evaluations are effective; when not, they feel forced, do not contribute to learning, and lead to evaluation fatigue.

^a Residents’ verbatim words during the focus group are in quotes. Paraphrased words from the residents’ verbatim quotes are in []. Researchers’ connections to the ITC guidelines are in italics.

The ITC guidelines outline what a quality assessment tool is, as well as the knowledge, skills, abilities, and other personal characteristics requisite of those conducting evaluations of others that have consequences for the test-takers’ work or personal lives. The guidelines clarify that these standards apply beyond what might formally be termed a “test” to any assessment procedure that provides estimates of performance and involves the drawing of inferences from samples of behavior in professional practice settings where there are substantial consequences for the person being assessed, such as medical accreditation and career progression.