
Table 3 Comparison of manual (columns) and automatic (rows) rating of summary statements in the six categories and Cohen's kappa as a measure of agreement between the manual and the automatic rating

From: Automatic analysis of summary statements in virtual patients - a pilot study evaluating a machine learning approach

| Category | Automatic rating | Manual rating: 0 | Manual rating: 1 | Manual rating: 2 | Congruent rating |
|---|---|---|---|---|---|
| Semantic qualifiers | 0 | 39 | 15 | 0 | 75.2%, κ = .557 |
| | 1 | 5 | 51 | 9 | |
| | 2 | 0 | 2 | 4 | |
| Appropriate narrowing | 0 | 21 | 9 | 1 | 81.6%, κ = .458 |
| | 1 | 8 | 68 | 13 | |
| | 2 | 0 | 2 | 3 | |
| Transformation | 0 | 47 | 14 | 1 | 69.6%, κ = .484 |
| | 1 | 11 | 35 | 5 | |
| | 2 | 0 | 6 | 5 | |
| Factual accuracy | 0 | 5 | 2 | | 93.6%, κ = .366 |
| | 1 | 12 | 106 | | |
| Patient name | 0 | 78 | 10 | | 90.4%, κ = .783 |
| | 1 | 2 | 35 | | |
| Global rating | 0 | 24 | 4 | 0 | 80.0%, κ = .582 |
| | 1 | 8 | 72 | 5 | |
| | 2 | 0 | 8 | 4 | |
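For readers who want to verify the agreement figures, the sketch below recomputes the percent of congruent ratings and unweighted Cohen's kappa from the "Semantic qualifiers" confusion matrix in Table 3 (rows = automatic rating, columns = manual rating). The function name `cohens_kappa` is illustrative, not from the paper, and the calculation assumes the paper used the standard unweighted kappa.

```python
def cohens_kappa(matrix):
    """Percent agreement and unweighted Cohen's kappa for a square confusion matrix."""
    n = sum(sum(row) for row in matrix)                       # total rated statements
    p_o = sum(matrix[i][i] for i in range(len(matrix))) / n   # observed (congruent) agreement
    row_sums = [sum(row) for row in matrix]                   # automatic-rating marginals
    col_sums = [sum(col) for col in zip(*matrix)]             # manual-rating marginals
    p_e = sum(r * c for r, c in zip(row_sums, col_sums)) / n**2  # chance agreement
    return p_o, (p_o - p_e) / (1 - p_e)

# "Semantic qualifiers" matrix from Table 3
semantic_qualifiers = [
    [39, 15, 0],   # automatic rating 0
    [5, 51, 9],    # automatic rating 1
    [0, 2, 4],     # automatic rating 2
]

p_o, kappa = cohens_kappa(semantic_qualifiers)
print(f"Congruent rating: {p_o:.1%}, kappa = {kappa:.3f}")
# -> Congruent rating: 75.2%, kappa = 0.557
```

The printed values match the 75.2% and κ = .557 reported for this category; the same function applies to the other matrices in the table.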