Matters Arising · Open access

Matters arising: methodological issues on evaluating agreement between medical students’ attitudes towards drugs/alcohol use during pregnancy by Cohen’s kappa analysis

Editor's Note (28 February 2023): The authors of the published article this Correspondence article refers to were invited to submit a reply to this Correspondence article, but have declined to provide a response at this time.

The Original Article was published on 02 May 2022

Abstract

Background

The purpose of this article is to discuss the statistical methods for agreement analysis used in the article by Richelle et al. (BMC Med Educ 22:335, 2022). The authors investigated the attitudes of final-year medical students regarding substance use during pregnancy and identified the factors that influence these attitudes.

Methods

We found that the Cohen’s kappa value used to measure the agreement between these medical students’ attitudes towards drug/alcohol use during pregnancy was questionable. We therefore recommend using weighted kappa instead of Cohen’s kappa for agreement analysis in the presence of three ordered categories.

Results

The agreement improved from “good” (Cohen’s kappa) to “very good” (weighted kappa) for medical students’ attitudes towards drugs/alcohol use during pregnancy.

Conclusion

To conclude, we recognize that this does not significantly alter the conclusions of the Richelle et al. paper, but it is necessary to ensure that the appropriate statistical tools are used.


Background

We read with interest the article entitled “Factors influencing medical students’ attitudes towards substance use during pregnancy”, published in BMC Medical Education on 2 May 2022 [1]. The authors investigated the attitudes of final-year medical students regarding substance use during pregnancy and identified the factors that influence these attitudes. They focused on two items, drugs and alcohol, regarding the punishment of substance use during pregnancy. However, we found that the Cohen’s kappa value used to measure the agreement between these medical students’ attitudes towards drug/alcohol use during pregnancy was questionable. We recommend using weighted kappa instead of Cohen’s kappa for agreement analysis in the presence of three ordered categories. With weighted kappa, the agreement improves from “good” (Cohen’s kappa) to “very good”. We recognize that this does not significantly alter the conclusions of the Richelle et al. paper, but it is necessary to ensure that the appropriate statistical tools are used.

Main text

Cohen’s kappa statistic is generally suitable for evaluating agreement between two raters on nominal categories [2]. When there are more than two ordered categories, however, the weighted kappa statistic should be used to estimate inter-rater reliability [3]. In contrast to Cohen’s kappa, weighted kappa relies on predefined cell weights that reflect the degree of disagreement between the two ratings.

Cohen’s kappa is calculated as follows:

$${k}_C=\frac{\sum_{j=1}^n u_{jj}\left(i,{i}^{\prime}\right)-\sum_{j=1}^n {p}_{ij}\,{p}_{{i}^{\prime} j}}{1-\sum_{j=1}^n {p}_{ij}\,{p}_{{i}^{\prime} j}}$$
(1)

Weighted kappa is calculated as follows:

$${k}_w=1-\frac{\sum_{i=1}^n\sum_{j=1}^n{w}_{ij}{p}_{ij}}{\sum_{i=1}^n\sum_{j=1}^n{w}_{ij}{p}_i{q}_j}$$
(2)

In Eq. (1), u_jj(i, i′) is the proportion of objects placed in the same category j by both raters i and i′, p_ij is the proportion of objects that rater i assigned to category j, and n is the number of categories. In Eq. (2), w_ij are the predefined disagreement weights, p_ij is the observed proportion of objects in cell (i, j), and p_i and q_j are the marginal proportions for the two raters. Cohen [4] suggested that the kappa value be interpreted as follows: < 0.20, poor agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, good agreement; and > 0.80, very good agreement.
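As a numerical illustration, Eqs. (1) and (2) can be computed directly from a two-rater contingency table. The 3 × 3 table below is hypothetical, not the data from Richelle et al.; the code is a minimal sketch in plain Python, with unweighted (Cohen’s) kappa obtained as the special case in which every disagreement carries weight 1.

```python
# Cohen's kappa and weighted kappa from a k x k contingency table of
# two ratings. table[i][j] = number of objects rated category i by
# rater 1 and category j by rater 2. The counts below are invented
# for illustration only (categories: agree / undecided / disagree).

def kappa(table, weights=None):
    """Return Cohen's kappa (weights=None) or weighted kappa."""
    k = len(table)                                   # number of categories
    n = sum(sum(row) for row in table)               # total objects
    p = [[table[i][j] / n for j in range(k)] for i in range(k)]
    row = [sum(p[i]) for i in range(k)]              # marginals p_i
    col = [sum(p[i][j] for i in range(k)) for j in range(k)]  # marginals q_j
    if weights is None:                              # unweighted: 0/1 weights
        weights = [[0 if i == j else 1 for j in range(k)] for i in range(k)]
    d_obs = sum(weights[i][j] * p[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weights[i][j] * row[i] * col[j] for i in range(k) for j in range(k))
    return 1 - d_obs / d_exp                         # Eq. (2); Eq. (1) when 0/1

def linear_w(k):
    return [[abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]

def quadratic_w(k):
    return [[((i - j) / (k - 1)) ** 2 for j in range(k)] for i in range(k)]

table = [[60, 5, 1],
         [4, 20, 3],
         [1, 2, 24]]

print(round(kappa(table), 3))                  # unweighted (Cohen's) kappa
print(round(kappa(table, linear_w(3)), 3))     # linear weighted kappa
print(round(kappa(table, quadratic_w(3)), 3))  # quadratic weighted kappa
```

With 0/1 weights, 1 − d_obs/d_exp reduces algebraically to the familiar (p_o − p_e)/(1 − p_e), so the same function covers both equations; as in the correspondence, the weighted estimates on this table come out higher than the unweighted one.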

In the authors’ Table 1, according to their calculation, inter-rater reliability was good for medical students’ attitudes towards drug/alcohol use during pregnancy (Cohen’s kappa = 0.775, 95% confidence interval (CI) = 0.714–0.837). In our opinion, however, weighted kappa is more applicable than Cohen’s kappa in the presence of three ordered categories. We therefore computed weighted kappa statistics (linear and quadratic) to evaluate the agreement based on the authors’ data. The linear weighted kappa value was 0.804 (95% CI = 0.746–0.863), indicating very good agreement. The quadratic weighted kappa value was 0.831 (95% CI = 0.770–0.892), also indicating very good agreement. The greater the difference between the two ratings of the same object, the stronger the evidence of inconsistency. For example, the penalty for classifying “disagree” as “agree” should be substantially greater than the penalty for classifying “disagree” as “undecided”. With Cohen’s kappa, there is no difference between the former and the latter; with linear weights, the penalty for the former is twice that of the latter; with quadratic weights, it is four times as large. We therefore recommend quadratic weighted kappa for evaluating agreement, because it magnifies the penalty for disagreements that span a larger level distance.
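The penalty ratios described above follow directly from the standard weight definitions, w_ij = |i − j|/(k − 1) for linear weights and w_ij = ((i − j)/(k − 1))² for quadratic weights. A small sketch, with our own illustrative coding of the categories (0 = agree, 1 = undecided, 2 = disagree):

```python
# Disagreement weights for k = 3 ordered categories.
k = 3
linear = [[abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
quad = [[((i - j) / (k - 1)) ** 2 for j in range(k)] for i in range(k)]

# Cohen's (unweighted) kappa penalizes every disagreement equally:
unweighted = [[0 if i == j else 1 for j in range(k)] for i in range(k)]

# Ratio of the penalty for "disagree" rated as "agree" (two levels
# apart) to the penalty for "disagree" rated as "undecided" (one
# level apart), under each weighting scheme:
print(unweighted[2][0] / unweighted[2][1])  # 1.0 -> same penalty
print(linear[2][0] / linear[2][1])          # 2.0 -> twice the penalty
print(quad[2][0] / quad[2][1])              # 4.0 -> four times the penalty
```

Quadratic weighting thus concentrates the penalty on the largest level distances, which is why it most strongly reflects gross disagreements.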

Table 1 Punishment for pregnant women using drugs or alcohol

In conclusion, the authors underestimated the agreement between medical students’ attitudes towards drug/alcohol use during pregnancy; the agreement is more reasonably classified as “very good”. Nevertheless, we recognize that this does not significantly alter the conclusions of the Richelle et al. paper, but it is necessary to ensure that the appropriate statistical tools are used. We emphasize that rigorous use of the correct statistical approach is crucial for any scientific publication; applying appropriate statistical methods enhances the scientific accuracy of research results.

Availability of data and materials

Not applicable.


Abbreviations

CI:

Confidence interval

References

  1. Richelle L, Dramaix-Wilmet M, Roland M, Kacenelenbogen N. Factors influencing medical students’ attitudes towards substance use during pregnancy. BMC Med Educ. 2022;22(1):335. https://doi.org/10.1186/s12909-022-03394-8.


  2. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22(3):276–82. https://doi.org/10.11613/BM.2012.031.


  3. Marasini D, Quatto P, Ripamonti E. Assessing the inter-rater agreement for ordinal data through weighted indexes. Stat Methods Med Res. 2016;25(6):2611–33. https://doi.org/10.1177/0962280214529560.


  4. Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull. 1968;70(4):213–20. https://doi.org/10.1037/h0026256.



Acknowledgements

Not applicable.

Funding

This work was supported by the Heilongjiang Province Higher Education Teaching Reform Project (SJGY20200799), Fundamental Research Funds in Heilongjiang Provincial Universities (135509160) and Qiqihar University Degree and Postgraduate Education and Teaching Reform Research Project (JGXM_QUG_Z2019003, JGXM_QUG_Z2019002).

The funding bodies played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

Author information

Authors and Affiliations

Authors

Contributions

TY wrote the original draft of the manuscript. LY, XJ and SS were involved in the analysis and interpretation of the data. WS was a major contributor in revising the manuscript. ML contributed to the conception and design of the study. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Tianfei Yu.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Yu, T., Yang, L., Jiang, X. et al. Matters arising: methodological issues on evaluating agreement between medical students’ attitudes towards drugs/alcohol use during pregnancy by Cohen’s kappa analysis. BMC Med Educ 23, 118 (2023). https://doi.org/10.1186/s12909-023-04071-0
