Source of evidence | Definition | Relevant study phases<sup>a</sup> | Evidence collected in the study |
---|---|---|---|
Content | The degree to which the test content reflects the underlying construct it is intended to measure | Phase 1 | Test development procedure designed to ensure adequacy of the test for assessment of technical aptitude (expert blueprint, pilot testing and revision, the use of simulated tasks) |
| | | Phase 2 | Relevance and difficulty ratings of expert surgeons; ratings of how well the tasks simulated reality; general remarks or suggestions regarding the suitability of the test |
Response process | The degree to which sources of error associated with test administration were eliminated | Phase 1 | Test development procedure designed to minimize sources of error associated with test administration (detailed instructions, accommodation of the simulator to the different needs of participants, and a practice period for familiarization with the simulator) |
| | | Phase 2 & Phase 3 | Clarity of instructions ratings |
| | The appropriateness of combining different performance parameters into a composite score | Phase 3 | Correlation between the different performance parameters (success rate, time, number of mistakes, path length, and percent of time within scope) |
Internal structure | The quality of statistical and psychometric properties of the test | Phase 3 | Item analysis (reliability, item discrimination) |
Relationships with other variables | The degree to which the relationships of the test scores with other variables are consistent with the construct underlying the proposed interpretation of the test score | Phase 3 | Correlations between the test scores and interns’ characteristics (age, gender, dominant hand, desired training, previous experience with surgical simulators, and previous experience with video games) |
Feasibility | The practicality and ease with which a test or assessment can be given | Phase 2 & Phase 3 | Assessment of the appropriateness of the time limits; assessment of the appropriateness of the instructions; assessment of how comfortable the use of the simulator is; difficulty ratings of specific tasks and for the test as a whole |
Acceptability | The extent to which a test is viewed as suitable and appropriate by those who take it | Phase 3 | Relevance ratings of interns |
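
The item analysis listed under Internal structure (reliability, item discrimination) can be illustrated with a minimal sketch. The table does not state which coefficients were computed, so Cronbach's alpha and the corrected item-total correlation are assumed here as common choices. The function names and the score matrix are hypothetical and do not come from the study.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(scores):
    """Reliability estimate (assumed coefficient: Cronbach's alpha).

    scores: list of rows, one row per examinee, one column per task.
    """
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def item_discrimination(scores):
    """Corrected item-total correlation for each task.

    Each task score is correlated with the sum of the remaining tasks,
    so a task does not inflate its own discrimination index.
    """
    k = len(scores[0])
    result = []
    for j in range(k):
        item = [row[j] for row in scores]
        rest = [sum(row) - row[j] for row in scores]
        result.append(pearson(item, rest))
    return result


# Hypothetical scores: 5 examinees (rows) on 3 simulator tasks (columns).
demo_scores = [
    [4, 5, 3],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 4, 4],
]

if __name__ == "__main__":
    print("alpha:", round(cronbach_alpha(demo_scores), 3))
    print("discrimination:", [round(r, 3) for r in item_discrimination(demo_scores)])
```

Tasks with a low or negative corrected item-total correlation would be candidates for revision, since they rank examinees differently from the rest of the test.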