Table 3 Summary of effects, estimated variance components and reliability coefficients, and results of D-study (expected reliability for different measurement scenarios)

From: Validity of a new assessment rubric for a short-answer test of clinical reasoning

| Effect | Variance component | df | MS | VC (negative values set to 0) | % variance |
| --- | --- | --- | --- | --- | --- |
| Experience/level of training (e) | σ²(e) | 1 | 365.1019 | 0.22457 | 12.92556 |
| Person within experience (p:e) | σ²(p:e) | 28 | 22.2046 | 0.1565 | 9.007661 |
| Rater (r) | σ²(r) | 1 | 20.6960 | 0 | 0 |
| Question (q) | σ²(q) | 17 | 17.5271 | 0.0353 | 2.0312 |
| Item within question (i:q) | σ²(i:q) | 28 | 3.0545 | 0.0187 | 1.0746 |
| Experience*rater | σ²(e*r) | 1 | 30.4104 | 0.0372 | 2.1382 |
| Experience*question | σ²(e*q) | 17 | 8.9908 | 0.0795 | 4.5775 |
| Experience*item within question (ei:q) | σ²(e*i:q) | 28 | 1.9765 | 0.0310 | 1.7837 |
| Person*rater:experience (pr:e) | σ²(p*r:e) | 28 | 5.6480 | 0.0954 | 5.4932 |
| Person*question:experience (pq:e) | σ²(p*q:e) | 476 | 2.5138 | 0.2725 | 15.6860 |
| Person*item:experience*question | σ²(p*i:e*q) | 784 | 0.5783 | 0.0837 | 4.8198 |
| Rater*question | σ²(r*q) | 17 | 2.7036 | 0.0286 | 1.6473 |
| Rater*item:question | σ²(r*i:q) | 28 | 0.8331 | 0 | 0 |
| Experience*rater*question | σ²(e*r*q) | 17 | 0.6226 | 0 | 0 |
| Experience*rater*item:question | σ²(e*r*i:q) | 28 | 0.8834 | 0.0316 | 1.8211 |
| Person*rater*question:experience | σ²(p*r*q:e) | 476 | 0.9885 | 0.2319 | 13.3475 |
| Person*rater*item:experience*question | σ²(p*r*i:e*q) | 784 | 0.4109 | 0.4108 | 23.6467 |
| TOTAL variance | | | | 1.7374 | 100 |

G-coefficient (95 % confidence interval): 0.749

| Number of raters (random) | G-coefficient |
| --- | --- |
| 3 raters, 18 questions (fixed) | 0.818 |
| 4 raters, 18 questions (fixed) | 0.857 |
| 5 raters, 18 questions (fixed) | 0.882 |
| 6 raters, 18 questions (fixed) | 0.900 |
| 7 raters, 18 questions (fixed) | 0.913 |
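
The arithmetic behind the last two columns and the D-study rows can be sketched in Python. This is an illustration under stated assumptions, not the authors' analysis: the function names `percent_variance` and `project_g` are hypothetical, the observed G-coefficient of 0.749 is assumed to correspond to two raters (inferred from df = 1 for the rater effect), and the projection assumes that all error variance shrinks in proportion to 1/(number of raters) while the 18 questions stay fixed.

```python
# Variance components (VC column) copied from Table 3, keyed by effect notation.
VARIANCE_COMPONENTS = {
    "e": 0.22457, "p:e": 0.1565, "r": 0.0, "q": 0.0353, "i:q": 0.0187,
    "e*r": 0.0372, "e*q": 0.0795, "e*i:q": 0.0310, "p*r:e": 0.0954,
    "p*q:e": 0.2725, "p*i:e*q": 0.0837, "r*q": 0.0286, "r*i:q": 0.0,
    "e*r*q": 0.0, "e*r*i:q": 0.0316, "p*r*q:e": 0.2319, "p*r*i:e*q": 0.4108,
}


def percent_variance(components):
    """Return each component's share of the total variance, in percent."""
    total = sum(components.values())
    return {name: 100.0 * vc / total for name, vc in components.items()}


def project_g(g_observed, n_observed, n_new):
    """Project a G-coefficient to a different number of raters.

    Assumption: every error term involves the rater facet, so the
    error-to-universe-score ratio scales as 1 / n_raters.
    """
    # Error-to-universe-score variance ratio for a single rater.
    single_rater_ratio = n_observed * (1.0 - g_observed) / g_observed
    return 1.0 / (1.0 + single_rater_ratio / n_new)


if __name__ == "__main__":
    for name, pct in percent_variance(VARIANCE_COMPONENTS).items():
        print(f"{name:>10}: {pct:7.4f} %")
    for n_raters in range(3, 8):
        g = project_g(0.749, 2, n_raters)
        print(f"{n_raters} raters: G is approximately {g:.3f}")
```

Under these assumptions the projected coefficients agree with the tabulated D-study values to within about 0.001, and the percentages agree with the % variance column to within rounding of the published components.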