Study design, setting, and participants
We translated and cross-culturally adapted the IPAMP into Japanese (J-IPAMP) and conducted a cross-sectional survey to examine its psychometric properties. We contacted two postgraduate clinical training hospitals in Japan (Oji Seikyo Hospital and Suwa Central Hospital), and both agreed to cooperate. Oji Seikyo Hospital is located in an urban area and has 159 beds, whereas Suwa Central Hospital is in a rural area and has 360 beds. From September 2021 to March 2022, we distributed an anonymous, self-administered, paper-based version of the J-IPAMP to potential participants. Patients were eligible if they were aged 20 years or older, were admitted to one of the two hospitals, and were assigned to clinical trainees (postgraduate years 1–5) during the survey period. All participants voluntarily agreed to take part before joining the study. We excluded patients whom hospital staff, based on observations during hospitalization, expected to have difficulty completing the questionnaire on their own because of severe physical (e.g., fracture of the dominant arm) or mental (e.g., severe dementia) disorders. Receptionists or surveyors at each institution delivered the questionnaires. Respondents completed the questionnaire, placed it in an envelope, and dropped it in a collection box at each institution.
Translation process
The original IPAMP has 11 items, each of which is rated on a 5-point Likert scale (1 = poor, 2 = fair, 3 = good, 4 = very good, and 5 = excellent). Factor analysis showed a two-factor model comprising “involvement and respect” (Q5, Q6, and Q8–Q11) and “compassion and rapport” (Q1–Q4 and Q7) [16].
After obtaining permission from the original author, we translated the IPAMP into Japanese following an international guideline for the cross-cultural adaptation process [21]. First, three translators (HF, DS, and KK) independently produced forward translations from English to Japanese. All three were fluent in English, familiar with the healthcare cultures in which both languages are used, and had professional experience translating questionnaires in the field of medical education [22]. Second, the translators worked together to synthesize and refine the Japanese translations (Version 1). Third, we asked professional bilingual translators who had no prior knowledge of the questionnaire to back-translate Version 1 into English. We compared the back-translated version with the original English version, proofread the translation, and produced Version 2. Fourth, we asked an expert in medical professionalism (YT) to review Version 2 and modified it based on the feedback (Version 3). Fifth, we asked the original author to review Version 3 to confirm that there were no problems with the translation process. Finally, from July to August 2021, we conducted a pilot test with 11 inpatients at Suwa Central Hospital. One author (HF) interviewed the inpatients to check whether Version 3 was intelligible and understood as intended. Because the pilot test revealed no problematic items, Version 3 was considered final. All authors confirmed the face and content validity of the tool.
Statistical analysis
The structural validity of the J-IPAMP was tested through both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). We applied the split-half validation approach, randomly splitting the sample into two independent, non-overlapping groups. Because this study sought to develop a scale optimized for the Japanese healthcare context, we performed the EFA first.
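The random split described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name `split_half`, the fixed seed, and the respondent identifiers are all hypothetical.

```python
import random


def split_half(respondent_ids, seed=0):
    """Randomly partition respondents into two non-overlapping halves."""
    ids = list(respondent_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the split reproducible
    rng.shuffle(ids)
    mid = len(ids) // 2
    # First half would go to the EFA, second half to the CFA.
    return ids[:mid], ids[mid:]


# Hypothetical identifiers for ten respondents.
efa_ids, cfa_ids = split_half(range(1, 11), seed=42)
```

Shuffling once and cutting at the midpoint guarantees that no respondent contributes to both the exploratory and the confirmatory analysis.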
Before the EFA, we computed the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy and performed Bartlett’s test of sphericity to assess whether the data were suitable for factor analysis. A KMO value greater than 0.60 [23] and a significant Bartlett’s test (p < 0.05) indicate suitability for EFA. EFA was conducted on half of the dataset using maximum likelihood estimation with promax rotation. For factor extraction, the Kaiser–Guttman criterion (eigenvalues greater than 1) and parallel analysis were employed [23]. A cut-off value of 0.40 was adopted for factor loadings.
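For reference, the two suitability checks can be written in their commonly used textbook forms. The notation is an assumption introduced here and does not come from the study: $r_{ij}$ is the correlation between items $i$ and $j$, $q_{ij}$ is the corresponding partial correlation controlling for all other items, $\mathbf{R}$ is the item correlation matrix, $n$ is the number of respondents in the EFA half, and $p$ is the number of items.

```latex
\[
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} q_{ij}^{2}},
\qquad
\chi^{2} = -\left(n - 1 - \frac{2p + 5}{6}\right) \ln \lvert \mathbf{R} \rvert,
\qquad
\mathrm{df} = \frac{p(p - 1)}{2}
\]
```

With the 11 items of the scale, Bartlett’s statistic would be referred to a chi-square distribution with $11 \times 10 / 2 = 55$ degrees of freedom. KMO approaches 1 when the partial correlations are small relative to the raw correlations, that is, when the items share common variance.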
We conducted CFA using maximum likelihood estimation on the other half of the data to assess the fit of the original two-factor model and to identify an alternative model. Model fit was evaluated using goodness-of-fit indices with the following criteria: comparative fit index (CFI) close to 0.90 or higher, Tucker–Lewis index (TLI) close to 0.90 or higher, root mean square error of approximation (RMSEA) close to 0.08 or below, and standardized root mean square residual (SRMR) close to 0.08 or below [23,24,25].
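To make the fit criteria concrete, the sketch below computes CFI, TLI, and RMSEA from the chi-square statistics of a fitted model and its baseline (independence) model, using the standard textbook definitions. It rests on several assumptions: the function names and all input values are hypothetical, RMSEA uses the N − 1 convention (some software uses N instead), "close to" is simplified to a strict threshold, and SRMR is omitted because it requires the residual correlation matrix.

```python
import math


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Return CFI, TLI, and RMSEA from model and baseline chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)  # model noncentrality estimate
    d_base = max(chi2_base - df_base, 0.0)     # baseline noncentrality estimate
    denom = max(d_model, d_base)
    cfi = 1.0 if denom == 0 else 1.0 - d_model / denom
    ratio_base = chi2_base / df_base
    tli = (ratio_base - chi2_model / df_model) / (ratio_base - 1.0)
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))  # N - 1 convention (assumption)
    return {"cfi": cfi, "tli": tli, "rmsea": rmsea}


def meets_cutoffs(idx):
    """Strict reading of the cut-offs; 'close to' is not modelled here."""
    return idx["cfi"] >= 0.90 and idx["tli"] >= 0.90 and idx["rmsea"] <= 0.08


# Hypothetical values, not results from the study.
example = fit_indices(chi2_model=86.0, df_model=43, chi2_base=1155.0, df_base=55, n=201)
acceptable = meets_cutoffs(example)
```

The degrees of freedom in the example (43 and 55) are what a two-factor model and an independence model would have for 11 items; the chi-square values and sample size are invented for illustration only.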
We used the J-IPAMP total scores and the global rating to examine criterion-related validity. The question for the global rating was as follows: “Using any number from 0 to 10, where 0 is the worst doctor possible and 10 is the best doctor possible, what number would you use to rate this doctor as a professional doctor?” Criterion-related validity was assessed using Pearson correlation coefficients between the J-IPAMP total scores and the global rating. Correlation coefficients above 0.30 were considered meaningful [26]. Finally, we used Cronbach’s alpha and omega coefficients to examine the internal consistency reliability of the scale. Values above 0.70 for both coefficients were considered to indicate satisfactory reliability [27, 28]. We chose complete case analysis because the amount of missing data was small. We analyzed the data using R version 4.2.1 (R Foundation for Statistical Computing, Vienna, Austria; www.R-project.org). We used psych version 2.2.5 and GPArotation version 2022.4-1 to perform EFA, and lavaan version 0.6-12 and semPlot version 1.1.6 to conduct CFA [29,30,31,32].
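As an illustration of the reliability and criterion-validity calculations, the following sketch computes Cronbach’s alpha and a Pearson correlation using only the Python standard library. It is not the study’s R code, the response data are invented, and the omega coefficient is omitted because it requires a fitted factor model.

```python
import math
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (complete cases only)."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance(col) for col in zip(*rows)]  # sample variance of each item
    total_var = variance([sum(r) for r in rows])       # variance of the total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical responses: five respondents, three items on the 1-5 scale.
responses = [[4, 5, 4], [3, 3, 2], [5, 5, 5], [2, 3, 3], [4, 4, 5]]
global_rating = [8, 5, 10, 6, 8]  # hypothetical 0-10 global ratings

alpha = cronbach_alpha(responses)
r = pearson_r([sum(row) for row in responses], global_rating)
```

In this framing, `alpha` would be compared against the 0.70 threshold and `r` against the 0.30 threshold stated above.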
Ethical considerations
All participants gave their oral consent before the study began. We obtained ethics approval from the Institutional Review Board of the University of Tokyo (2021074NI).