
Development and validation of teacher and student questionnaires measuring inhibitors of curriculum viability



Curriculum viability is determined by the degree to which quality standards have or have not been met, and by the inhibitors that affect attainment of those standards. The literature reports many ways to evaluate whether a curriculum reaches its quality standards, but less attention is paid to the identification of viability inhibitors in different areas of the curriculum that hamper the attainment of quality. The purpose of this study is to develop and establish the reliability and validity of questionnaires that measure the presence of inhibitors in an undergraduate medical curriculum.


Teacher and student questionnaires developed by the authors were sent to medical educationalists for qualitative expert validation and to establish their content validity. To establish the response process validity, cognitive interviews were held with teachers and students to clarify any confusion about the meaning of items in the questionnaires. Reliability and construct validity of the questionnaires were established by responses from 575 teachers and 247 final-year medical students.


Qualitative expert validation was provided by 21 experts. The initial teacher and student questionnaires, containing 62 items measuring 12 theoretical constructs and 28 items measuring 7 constructs, respectively, were modified to improve their clarity and relevance. The overall scale validity indices for the teacher and student questionnaires were .95 and .94, respectively. Following the cognitive interviews, the resultant teacher and student questionnaires were reduced to 52 and 23 items, respectively. After confirmatory factor analysis, the final version of the teacher questionnaire was reduced to 25 items measuring 6 constructs, and the student questionnaire to 14 items measuring 3 constructs. Goodness-of-fit indices were established for the final models, and Cronbach's alphas of .89 and .81 were found for the teacher and student questionnaires, respectively.


The valid and reliable curriculum viability inhibitor questionnaires for teachers and students developed in this study can be used by medical schools to identify inhibitors that hamper the achievement of quality standards in different areas of the curriculum.



Curriculum quality is typically assessed through curriculum evaluation [1], which determines the quality of a curriculum by assessing its various aspects against a particular set of standards. This process, however, does not explicitly involve finding the issues that inhibit meeting specific standards. The issues impeding the achievement of curriculum quality standards are called ‘curriculum viability inhibitors’ [2]. Together, the presence of current inhibitors in the curriculum and the degree to which relevant standards are met make up the ‘viability indicators’, which determine the curriculum viability [3]. Many questionnaires reportedly measure attainment of quality standards in different areas of the curriculum. For instance, DREEM, AMEET, HELES [3,4,5] and JHLES [6] measure the educational environment, and AIM measures the implementation of assessment [7]. Yet we did not find any questionnaires that measure the inhibitors of the curriculum. Knowledge of inhibitors is particularly useful for reviewers when an existing curriculum needs to be renewed. Curriculum developers can also consider the inhibitors during the process of curriculum development, taking preventive measures to design a curriculum that has minimal issues when implemented.

Inhibitors of curriculum quality can also be explored by interviewing stakeholders about different aspects of the curriculum. However, interviews require ample time for data collection and analysis and capture the perceptions of a rather small number of respondents compared to survey questionnaires. Certain tools developed by accreditation bodies use open-ended qualitative questionnaires to solicit the views of medical educationalists or members of medical education departments [8]. Although medical educationalists are curriculum experts in a general sense, they may not be experts in the viability inhibitors of a specific curriculum as perceived and experienced by medical students and teachers at large. Therefore, there is a need to develop questionnaires that can easily be interpreted by all stakeholders involved in identifying inhibitors. The aim of this study is therefore to develop and establish the validity and reliability of student and teacher questionnaires measuring curriculum viability inhibitors.

In an earlier study, a scoping review on curriculum viability indicators showed 37 standards and 19 inhibitors [2]. Thirteen studies dealt with standards, but only two studies described both standards and inhibitors. Thus, a Delphi study was conducted to develop consensus on curriculum viability inhibitors among experts [3].

The main stakeholders of the curriculum in a medical college are teachers, students, and educational managers. Though educational managers have a significant stake in the implementation and development of the curriculum, the curriculum is mainly implemented by the teachers and experienced by the students. Accordingly, this study addresses the following questions covering the steps of development and validation of a questionnaire [9]: (1) What items in a teacher and student questionnaire are relevant to measure curriculum viability inhibitors according to medical education experts (Expert validation)? (2) What is the content validity of the teacher and student questionnaires? (3) How do teachers and students interpret the items in the teacher and student questionnaire (Response Process Validity)? And (4) what are the construct validity and reliability of the questionnaires?


Study design and settings

Development and validation of the curriculum viability inhibitor questionnaires comprised two main phases, as shown in Fig. 1. The first phase was the development of questionnaires and getting qualitative expert feedback to refine them. The second phase was establishing the content validity, response process validity, construct validity, and reliability of the questionnaires.

Fig. 1

Phases of the study. Phases 1 and 2, showing the development and validation of the teacher and student questionnaires measuring curriculum viability inhibitors

Defining and measuring the inhibitors that constitute the theoretical constructs in the questionnaires will help an educational institution find the issues that hamper the attainment of a healthy curriculum and hence to develop ‘treatments’ for improving curriculum viability. Some of these theoretical constructs include irrelevant curriculum content, low quality assessment, lack of social interaction, and lack of sharing best practices. Table 1 shows all the 12 theoretical constructs with their descriptions.

Table 1 Inhibitors and their definitions

This study was approved by the Institutional Review Committee at Riphah International University (Appl. # Riphah/IRC/18/0394). Written informed consent was taken from all the participants.

The study ran from October 2019 to July 2020. It involved medical education experts, students, and teachers from various institutions; the details are provided in the Phase 1 and Phase 2 sections below.

Phase 1

In this phase, answering our first question, the authors developed the first version of the teacher and student questionnaires based on literature review, and refined the questionnaires after receiving qualitative feedback from expert medical educationalists.

Development and qualitative content validation of teacher and student questionnaires

Participants, materials and procedure

Out of 27 experts who were invited based on their qualifications (at least Master’s in medical education or equivalent qualification) and experience in medical education (more than 5 years), 21 (77%) responded and provided feedback on the first version of the questionnaire, with comments on the constructs and related items.

The first version of the teacher questionnaire had 62 items measuring 12 constructs, whereas the student questionnaire had 28 items measuring 7 constructs.

The first author (RAK) developed the items for measuring each inhibitor based on a scoping review [2] and a consensus-building Delphi study amongst a group of experts [3]. The co-authors (AS, UM, MAE, and JJM) then refined the questionnaire before sharing it with medical education experts through e-mail. The experts were asked to provide qualitative feedback on the questionnaire items to improve their clarity and relevance to the inhibitor if needed, and also to comment on deletion or addition of items.

Data analysis

The feedback was initially analysed by the first author by organizing the comments on the items. Changes suggested by experts were made based on the following criteria: (1) the item is easy to understand, (2) the item is relevant to the construct, (3) duplication and near-identical meanings are avoided, (4) grammatical and formatting errors are minimized, and (5) double-barreled statements are avoided. The questionnaire was then shared with the co-authors for their feedback and consensus on modifications to the items.

Based on the expert feedback, items were reworded for clarity and grammatical inaccuracies or deleted if found not relevant to the construct or having a meaning very similar to another item. Some items were shifted to another construct if they were not found suitable for their current construct. When multiple suggestions were given for a single item, the commonly suggested modification was used and was finalized by the discussion and agreement of the authors.

Phase 2

This phase comprised three steps: (1) establishing the content validity, (2) establishing the response process validity, and (3) establishing the construct validity and reliability of the questionnaires.

Step 1: establishing the content validity of teacher and student questionnaires

Participants, materials and procedure

To rank the items for content relevance and clarity, 19 out of 21 (90.5%) medical education experts from Phase 1 participated in Phase 2.

The revised questionnaires (version 2), based on the feedback from the medical education experts, had 60 items measuring 12 constructs for teachers (see Additional file 1: Appendix A) and 28 items measuring 7 constructs for students (see Additional file 1: Appendix B). For both questionnaires, Likert scales were used to rate the relevance and clarity of the items. For relevance we used: 4 = very relevant, 3 = quite relevant, 2 = somewhat relevant, and 1 = not relevant. For clarity, we used: 3 = very clear, 2 = item needs revision, and 1 = not at all clear.

The questionnaire version 2 was sent via email to 21 experts who had previously provided feedback in Phase 1, with a request to respond within 3 weeks. They were asked to score the items on the Likert scales and provide feedback to improve the items further. Out of 21 participants, 19 responded. The forms sent by 5 participants were incomplete and they were requested to send the completed forms. Only two participants complied, hence a total of 16 complete forms were included in the study.

Data analysis

To establish content validity, quantitative and qualitative data were analysed. For the quantitative component, the content validity index (CVI) for the individual items (I-CVI), and of the scale (S-CVI) were calculated [9], based on the scores given by the experts.

The I-CVI was calculated as the number of experts in agreement divided by the total number of experts, and the S-CVI was determined by averaging the I-CVI scores across all items. To calculate the I-CVI, relevance ratings of 3 or 4 were recoded as 1, and ratings of 1 or 2 were recoded as 0. For each item, the 1s were summed and divided by the total number of experts.
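As a concrete illustration of this recoding, a minimal sketch in Python; the ratings below are invented for the example and are not the study's data:

```python
# Illustrative I-CVI / S-CVI calculation (hypothetical ratings, not study data).
# Each row: one item's relevance ratings (1-4) from a panel of five experts.
ratings = [
    [4, 3, 4, 4, 3],  # item 1
    [4, 4, 2, 3, 4],  # item 2
    [2, 1, 3, 2, 2],  # item 3
]

def i_cvi(item_ratings):
    """Proportion of experts rating the item 3 or 4 (i.e. recoded as 1)."""
    agree = sum(1 for r in item_ratings if r >= 3)
    return agree / len(item_ratings)

i_cvis = [i_cvi(item) for item in ratings]
s_cvi_avg = sum(i_cvis) / len(i_cvis)  # S-CVI/Ave: mean of all I-CVIs

print([round(v, 2) for v in i_cvis])  # [1.0, 0.8, 0.2]
print(round(s_cvi_avg, 2))            # 0.67
```

On this toy panel, item 3 would fall below the .70 retention threshold applied later in the study.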

To assess the clarity of the items, for which the 3-point Likert scale was used, the content clarity average (CCA) was calculated. The CCA of an individual item was calculated by summing all the clarity ratings given to the item and dividing by the number of experts. An average clarity above 2.4 (80%) was considered very clear [10].
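The clarity average works the same way on the 3-point scale; again a sketch with hypothetical ratings:

```python
# Illustrative content clarity average (CCA) on the 3-point clarity scale.
# Hypothetical ratings (1-3) from a panel of five experts for one item.
clarity_ratings = [3, 3, 2, 3, 3]

cca = sum(clarity_ratings) / len(clarity_ratings)  # mean rating for the item
print(round(cca, 2))  # 2.8
print(cca > 2.4)      # True: above the 80% threshold, i.e. "very clear"
```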

The comments provided by the experts were categorized into general comments for the questionnaire and specific comments for the items. Based on these comments, the items were modified.

Step 2: establishing response process validity through cognitive interviews

Cognitive interviewing is a technique for verifying how respondents understand the items in a questionnaire.

Participants, materials and procedure

Interviews were held with 6 teachers (3 each from the basic and clinical sciences faculty, to represent both) and with 3 final-year MBBS students, who have the maximum exposure to the curriculum.

In version 3, the teacher questionnaire had 53 items measuring 12 constructs, and the student questionnaire had 23 items measuring 7 constructs. We used a combination of ‘think aloud’ and ‘verbal probing’ techniques [9]. The participants were asked to read each item silently and think aloud about what came to mind after reading it [11]. In verbal probing, we asked scripted and spontaneous questions after the participant had read an item [12]. We combined the two techniques because ‘think aloud’ acts as a cue for respondents, yielding additional information on the quality of the items, as explained in the procedure section below.

Test interviews were conducted with 1 co-author, 1 teacher, and 1 student using Zoom to identify possible issues related to combining think-alouds and verbal probing. The time participants needed to answer the items in the questionnaire was also determined. The average cognitive interview lasted approximately 60 min for 27 items of the teacher questionnaire and 50 min for the 23 items of the student questionnaire. We also piloted cued retrospective probing [13], in which the primary researcher replayed the recorded think-aloud to the participant and explored the items with scripted and spontaneous probes. We found that it yielded no extra benefit as a cue compared to the combination technique, and it also required more time.

The protocols for the cognitive interviews were planned based on the pilot interviews, as such interviews require sustained concentration on the part of the participants [14]. Hence, for the teacher questionnaire, we divided the 53 items between 2 participants, whereas the student questionnaire did not require division as it had only 23 items. To increase the credibility of the interview technique and reduce bias, another researcher (UM) was also present during each interview.

Data analysis

Analytic memos were created based on the think-aloud and verbal probing. These memos were coded into the following categories: (1) items with no problems in understanding, (2) items with minor problems in understanding, and (3) items with major problems in understanding [15]. These categories were assigned independently by RAK and UM. Items that required more clarity were reworded and further refined through review by the remaining co-authors (AS, MAE, and JJM). The details of the response process validity are provided in Additional file 1: Appendix C for the purpose of reproducibility.

Step 3: establishing reliability and construct validity

Participants, materials and procedure

Based on the adequate sample size (minimum of 10 participants per item) reported in the literature, our target sample was 520 teachers for 52 items and 230 final-year medical students for 23 items [16, 17] in the respective questionnaires. A total of 575 teachers from 77 medical colleges and 247 final-year students from 12 medical colleges filled out the questionnaire. We selected teachers who were currently involved in teaching and had been involved in implementing or developing the curriculum. Curriculum involvement was defined as developing a module or course and teaching, assessing, and managing it. Final-year medical students were recruited, as they have the maximum experience of the curriculum. The designation, academic qualification, teaching experience, experience in medical education, and type of curriculum practiced are shown in Table 2. Out of the 575 teachers, 526 provided complete responses, whereas 245 out of 247 students provided complete responses.

Table 2 Participant Demographics for confirmatory factor analysis of teacher questionnaire (N = 526)

The fourth version of the teacher questionnaire had 52 items measuring 12 constructs, and the student questionnaire had 23 items measuring 7 constructs. Items were scored on a 5-point Likert scale: 1 = strongly disagree, 2 = somewhat disagree, 3 = neither agree nor disagree, 4 = somewhat agree, and 5 = strongly agree. The items were shuffled so that they were not grouped by their hypothesized constructs. We also shuffled the answer options in a few items and informed the respondents. We did this to encourage respondents to read and answer each question carefully, promoting optimizing and preventing satisficing [18,19,20].
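The item-shuffling step can be sketched as follows; the construct codes and item wordings below are invented for illustration and are not taken from the questionnaires:

```python
import random

# Sketch of the item-shuffling step: present items in a random order so they
# are not grouped by their hypothesized construct (content is illustrative).
items = [
    ("EP", "Curriculum content is relevant to the learning outcomes."),
    ("SI", "Teachers rarely interact across departments."),
    ("IP", "Institutional policies are communicated clearly."),
    ("EP", "Assessments align with what is taught."),
]

random.seed(7)  # fixed seed only so this demo is reproducible
shuffled = items[:]
random.shuffle(shuffled)

# Items no longer appear grouped by construct code
print([code for code, _ in shuffled])
```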

A pilot study of the questionnaire was conducted with 20 teachers and 15 medical students to ensure the smooth working of the Qualtrics link and to resolve any difficulty browsing through the questionnaire. No issues were reported by the participants. To maximize responses, we shared the questionnaire link through different sources. The link was sent by email to the Deans and Directors of medical education of the colleges. It was also shared with master's in health professions students in their WhatsApp groups. The invitation message stressed the formative purpose and use of the evaluations and the confidential and voluntary character of participation. To encourage participation, e-mail reminders were sent on day 5 and day 10, in addition to WhatsApp reminders to the Directors of the medical education departments.

Data analysis

To ascertain the internal structure of the questionnaire, internal consistency was calculated through Cronbach’s Alpha. Then, we conducted confirmatory factor analysis (CFA) as we had specific expectations regarding (a) the number of factors (constructs/subscales), (b) which variables (items) reflect given factors, and (c) whether the factors correlated [21].

The questionnaires were evaluated using SPSS version 26 and AMOS version 26. Regarding internal consistency, a Cronbach's alpha between .50 and .70 was considered satisfactory for the scale and subscales [22,23,24]. The corrected item-total correlation (CITC) was calculated for items of subscales with low internal consistency. A CITC in the range of .2 to .4 was considered acceptable for retaining an item [25, 26].

Construct validity was established via CFA. For the goodness-of-fit of the measurement model, we measured the absolute, incremental, and parsimonious fit indices. Absolute fit indices assess the overall theoretical model against the observed data, incremental or comparative fit indices compare the hypothesised model with the baseline or minimal model, and the parsimonious fit index assesses the complexity of the model [27, 28]. The indices used for absolute fit are root mean square error of approximation (RMSEA) < .05 as a close fit, < .08 as an acceptable fit [29], and goodness-of-fit index (GFI) > .90 as a good fit [30]. For incremental fit, the indices considered acceptable are comparative fit index (CFI) > .90, adjusted goodness-of-fit index (AGFI) > .90, Tucker Lewis Index (TLI) > .90 [31], and normed fit index (NFI) > .90 [32]. For parsimonious fit, a normed chi-square (χ2/df) < 5.0 is considered acceptable [4, 33].
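To make the thresholds concrete, two of these indices can be computed by hand from standard formulas; the chi-square, degrees of freedom, and sample size below are hypothetical CFA output, and the RMSEA formula is the common sample-based form:

```python
import math

# Hypothetical CFA output (made-up numbers, not from the study)
chi_sq, df, n = 610.0, 260, 526

# Parsimonious fit: normed chi-square, acceptable below 5.0
normed_chi_sq = chi_sq / df

# Absolute fit: RMSEA, close fit below .05, acceptable below .08
rmsea = math.sqrt(max(chi_sq - df, 0.0) / (df * (n - 1)))

print(round(normed_chi_sq, 2))  # 2.35
print(round(rmsea, 3))          # 0.051
```

Both hypothetical values would pass their respective cut-offs (χ2/df < 5.0; RMSEA < .08).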


Phase 1: development of the questionnaires

Based on the feedback provided by experts on the first version of the teacher questionnaire, 5 of the 62 items were deleted because they duplicated other items; 43 items were reworded to improve clarity and understandability and to correct grammar and formatting errors; and 3 new items were added. The result was the next version with 60 items, as shown in Table 3.

Table 3 Modifications done in different versions of the teacher and student questionnaires

Regarding the student’s questionnaire, 22 of 28 items were modified while 6 items were not changed. Among the 22 items modified, 21 items were reworded for lack of clarity and grammatical inaccuracies (Table 3).

Phase 2: establishing the validity and reliability of the questionnaires

Content validity index and content clarity average of the teacher’s questionnaire

Out of 60 items, 4 had a CVI below .70 and were removed; 3 had a CVI between .70 and .79 and were modified according to the qualitative feedback of the experts and retained. The remaining items had a CVI higher than .79. However, the experts indicated that 4 items were similar in meaning to other items, and these were therefore also removed. The third version of the questionnaire thus had 53 items. The overall scale content validity index (S-CVI/Ave) of the questionnaire was .95.

Out of 53 items, 7 had a content clarity average (CCA) of 3 (100% clarity), 38 a CCA between 2.75 and 2.93, and 12 a CCA between 2.56 and 2.68. The average clarity of the scale was 2.81. Based on the qualitative feedback, 36 items in the questionnaire were again reworded to improve clarity and consistency and to correct grammatical inaccuracies (see Additional file 1: Appendix A).

Content validity index and content clarity average of the student questionnaire

Out of 28 items, 2 had a CVI below .70 and were removed. Among the remaining 26 items, 3 had a CVI between .75 and .79: two were retained after modification according to the expert feedback, and 1 was removed because of its similarity to another item. Twenty-three items had a CVI higher than .79; all were retained except 2 that had a meaning similar to other items. Overall, 5 items were deleted. Version 3 of the questionnaire had 23 items, with an S-CVI of .94.

Regarding content clarity, out of 23 items, 2 had a CCA of 2, 18 had a CCA from 2.75 to 2.93, and 3 had a CCA from 2.46 to 2.68. The average clarity of the scale was 2.88 (see Additional file 1: Appendix B).

Response process validity of Teacher’s questionnaire through cognitive interviews

Table 3 shows that after establishing the content validity, 53 items remained in the questionnaire. Of these, 42 were easily understood by the participants and required no change. Ten items needed more clarification and were therefore elaborated by adding examples. One item was deleted because its content was repeated in subsequent items.

Response process validity of Student’s questionnaire through cognitive interviews

Twenty-three items were tested for response process validity. Sixteen required no change as they had no ambiguities, whereas 7 items were modified by adding examples to them.

Establishing the construct validity and reliability of the questionnaires

The KMO statistic and Bartlett’s test of sphericity for the teacher and student questionnaires (KMO = .942 and .879, respectively) indicated an adequate sample for factor analysis. The reliability of the items before conducting CFA was .941 and .870 for the teacher and student questionnaires, respectively; hence, no items were removed [34]. A one-factor model was generated for each questionnaire but was found not to have a good fit. Next, 12- and 7-factor models, as hypothesized by the authors based on the published literature [2, 3] and expert validation, were developed and analysed. These models were reduced to 11 and 6 factors after deletion of items and use of modification indices to achieve an acceptable model. Goodness-of-fit indices were established for these models; however, factor correlations greater than 1 were found between some constructs. To correct this, closely related factors were combined. For example, ‘irrelevant curriculum content’ and ‘low-quality assessment’ had a factor correlation greater than 1 and were combined into a new factor, ‘Educational Program’. Tables 4 and 5 show the final teacher questionnaire with 25 items measuring 6 constructs and the final student questionnaire with 14 items measuring 3 constructs, along with the Cronbach’s alpha of each subscale and the Cronbach’s alpha if item deleted. The CITC of the items of ‘disciplinary culture’ was .25, and those of ‘institutional culture’ were in the range of .22 to .29. The final versions of the valid and reliable teacher and student questionnaires, which can be used for assessment of curriculum viability, are given in Additional file 1: Appendix D and Appendix E.

Table 4 Teacher questionnaire (final version) with Cronbach’s alpha if deleted
Table 5 Student questionnaire (final version) with Cronbach’s alpha if deleted

Table 6 shows the goodness-of-fit for these models, reported through χ2/df, RMSEA, CFI, NFI, TLI, GFI, and AGFI. The reliabilities of the teacher and student questionnaires were .901 and .834, respectively.

Table 6 Models and Confirmatory factor analysis indices

These indices indicate parsimonious, absolute, and incremental fit for our models, shown as structural equation models in Figs. 2 and 3, respectively. The figures show the 6- and 3-factor models with 25 and 14 items, respectively, for the teacher and student questionnaires, with all factor correlations below 1.

Fig. 2

Structural Equation Model for the Teacher Questionnaire. The figure shows factor loadings, factor correlations, and goodness-of-fit indices (parsimonious, absolute, and incremental fit) for the six-factor model containing 25 items. Abbreviations used: EP = Educational Program, DC = Disciplinary Culture, SI = Social Interaction, IP = Institutional Policies, CP = Communication Practices, FI = Faculty Involvement, AGFI = adjusted goodness-of-fit index, CFI = comparative fit index, GFI = goodness-of-fit index, NFI = normed fit index, RMSEA = root mean square error of approximation, TLI = Tucker Lewis Index, χ2/df = normed chi-square

Fig. 3

Structural Equation Model for the Student Questionnaire. The figure shows factor loadings, factor correlations, and goodness-of-fit indices (parsimonious, absolute, and incremental fit) for the three-factor model containing 14 items. Abbreviations used: EP = Educational Program, IC = Institutional Culture, SR = Student Requirements


The main objective of the study was to develop two valid and reliable questionnaires that can measure curriculum viability inhibitors, so that curriculum reviewers, developers, and implementers can use these questionnaires to identify the inhibitors in the implemented curriculum based on the feedback of faculty and students.

Many questionnaires that measure teacher and student perceptions of educational environments have been reported in the literature [5, 7, 35, 36], but none explicitly on curriculum viability inhibitors. Through this study, we have developed two valid and reliable questionnaires that collectively identify curriculum viability inhibitors. The teacher questionnaire in our study covers the educational content and assessment, faculty involvement, institutional policies, social interaction, disciplinary culture, and communication practices. Compared with the ‘Assessment of Medical Education Environment by Teachers’ (AMEET) questionnaire [4, 37], our questionnaire covers a wider range of areas of the curriculum. The AMEET addresses the educational environment in areas such as perception of teaching, learning activities, students’ learning and collaborative atmosphere, and professional self-perception. Though it covers the educational environment in detail, it does not focus on the social interaction, institutional policies, communication practices, and faculty involvement relevant to the inhibitors of the curriculum. Regarding students’ perceptions of the medical education curriculum, questionnaires that measure learning environments include the Health Professions Learning Environment Survey (HELES) [5], the Johns Hopkins Learning Environment Scale (JHLES) [6], and the Dundee Ready Educational Environment Measure (DREEM) [35]. These questionnaires focus on the learning environment of the institution. For instance, DREEM addresses students’ perceptions of learning, teachers, atmosphere, and students’ academic and social self-perceptions. The student questionnaire in our study, however, focuses specifically on the viability inhibitors that affect the curriculum, such as irrelevant curriculum content and low-quality assessment. It also addresses issues such as student requirements, the presence of strong disciplinary cultures, and lack of social interaction. Furthermore, the student questionnaire shares two constructs with the teacher questionnaire.

This study shows that teachers and students have their own perceptions of the same curriculum, as reported by Könings et al. [38]. Eight items under two constructs (Educational Program and Institutional Culture), related to learning outcomes, curricular content, assessment, disciplinary culture, and social interaction, are identical in the teacher and student questionnaires developed in our study. Thus, these questionnaires will inform program evaluators about the congruence or disagreement between students and teachers in these areas. In case of congruence, responses will strengthen the diagnosis of curriculum inhibitors; a differing opinion, however, will require further investigation, such as qualitative inquiry based on interviews or focus group discussions with the faculty on the areas where a differing opinion has been reported.

A main strength of our study was the extensive method of developing the questionnaires according to the guidelines and steps reported in the literature [9, 27, 29, 33, 39,40,41]. It also became clear that having two different questionnaires for students and teachers is necessary. Another strength was that the teacher respondents belonged to 77 medical colleges, had varied experience ranging from junior to senior academic positions, and were involved in teaching different curricula (Table 2).

Analysis of internal consistency using Cronbach’s α showed an acceptable level for the total scales (.89 and .83 for the teacher and student questionnaires, respectively) and for the subscales (.67 to .76) identified from the confirmatory factor analysis (Figs. 2 and 3): ‘educational program’, ‘social interaction’, ‘institutional policies’, ‘communication practices’, and ‘faculty involvement’ for the teacher questionnaire, and ‘educational program’ and ‘student requirements’ for the student questionnaire. This is consistent with the alpha values reported in the literature [24, 42,43,44]. Two subscales, ‘disciplinary culture’ (2 items) in the teacher questionnaire and ‘institutional culture’ (4 items) in the student questionnaire, had low internal consistency (.41 and .46, respectively). However, a subscale with a value below .40 (Cronbach’s α = .37) has previously been retained in a questionnaire when it was unidimensional with a small number of items [45], which was the case for these two subscales (Tables 4 and 5) in our study. Furthermore, Cronbach’s alpha values below .70 are common for one-dimensional scales with fewer than 10 items and have been justified in the literature [46,47,48]. In addition, both subscales were important measures of discipline and of social activities in relation to the institutional culture; hence, another reason to retain their items was to maintain content validity [46, 49]. Also, the corrected item-total correlation (CITC) for all items in these subscales was above .2, which confirmed that each item belonged to its corresponding subscale [25, 26]. CITC is another measure of internal consistency, and values between .2 and .4 indicate that the items in a subscale are good measures of the corresponding construct [26, 50].

The study was not without limitations. We recruited participants at a ratio of 1:10 per questionnaire item, which is considered adequate-to-good for sample size; however, it is generally accepted that larger samples are better [17], and a ratio of 1:20 has been recommended [51]. Recruiting more participants might therefore have yielded an even better-fitting model. Another limitation is that the confirmatory factor analysis was conducted in medical schools of mainly one country. However, the teachers and students came from 77 and 12 medical colleges, respectively, experiencing different models of curricula. It is therefore expected that these questionnaires will be valid and reliable across different curriculum models.

We advocate using these two questionnaires to identify issues in a curriculum that inhibit the achievement of quality standards. We further recommend that the construct validity of the questionnaires be established in other countries, especially where translation of the questionnaires will be required. Because students and teachers may differ in opinion about certain areas of the curriculum, we also suggest further research to identify the reasons for these differences and possible ways to resolve them, which can serve as a foundation for improving these questionnaires.


We have developed valid and reliable teacher and student questionnaires that can be used to identify the inhibitors of curriculum viability. Medical colleges can use these questionnaires to identify the inhibitors that hamper the achievement of quality standards. This will help in proposing solutions to address the inhibitors, improve the quality of the curriculum, and prepare preventively for possible issues.

Availability of data and materials

The data generated and analysed during the study are available on request.



Abbreviations

AIM: Assessment Implementation Measure

DREEM: Dundee Ready Educational Environment Measure

HELES: Health Education Learning Environment Measure

JHLES: Johns Hopkins Learning Environment Scale

PhD: Doctor of Philosophy


1. Pugsley L, Brigley S, Allery L, MacDonald J. Making a difference: researching master's and doctoral research programmes in medical education. Med Educ. 2008;42(2):157–63.

2. Khan RA, Spruijt A, Mahboob U, van Merrienboer JJG. Determining 'curriculum viability' through standards and inhibitors of curriculum quality: a scoping review. BMC Med Educ. 2019;19(1):336.

3. Khan RA, Spruijt A, Mahboob U, Al Eraky M, van Merrienboer JJG. Curriculum viability indicators: a Delphi study to determine standards and inhibitors of a curriculum. Eval Health Prof. 2020;163278720934164.

4. Shahid R, Khan RA, Yasmeen R. Establishing construct validity of AMEET (assessment of medical educational environment by the teachers) inventory. JPMA. 2019;69(34).

5. Rusticus SA, Wilson D, Casiro O, Lovato C. Evaluating the quality of health professions learning environments: development and validation of the health education learning environment survey (HELES). Eval Health Prof. 2019;163278719834339.

6. Shochet RB, Colbert-Getz JM, Wright SM. The Johns Hopkins learning environment scale: measuring medical students’ perceptions of the processes supporting professional formation. Acad Med. 2015;90(6):810–8.

7. Sajjad M, Khan RA, Yasmeen R. Measuring assessment standards in undergraduate medical programs: development and validation of AIM tool. Pak J Med Sci. 2018;34(1):164–9.

8. LCME. Standards, publications, & notification forms. LCME; 2020. p. 1–17.

9. Artino AR Jr, La Rochelle JS, Dezee KJ, Gehlbach H. Developing questionnaires for educational research: AMEE guide no. 87. Med Teach. 2014;36(6):463–74.

10. Yusoff MSB. ABC of content validation and content validity index calculation. Educ Med J. 2019;11(2):49–54.

11. Willis GB, Artino AR Jr. What do our respondents think we're asking? Using cognitive interviewing to improve medical education surveys. J Grad Med Educ. 2013;5(3):353–6.

12. Rodrigues IB, Adachi JD, Beattie KA, MacDermid JC. Development and validation of a new tool to measure the facilitators, barriers and preferences to exercise in people with osteoporosis. BMC Musculoskelet Disord. 2017;18(1):1–9.

13. Van Gog T, Paas F, Van Merriënboer JJ, Witte P. Uncovering the problem-solving process: cued retrospective reporting versus concurrent and retrospective reporting. J Exp Psychol Appl. 2005;11(4):237–44.

14. Blair J, Brick PD. Methods for the analysis of cognitive interviews. In: Proceedings of the Section on Survey Research Methods. Alexandria: American Statistical Association; 2010. p. 3739–48.

15. Haeger H, Lambert AD, Kinzie J, Gieser J. Using cognitive interviews to improve survey instruments. Association for Institutional Research Annual Forum; 2012.

16. Bentler PM, Chou C-P. Practical issues in structural modeling. Sociol Methods Res. 1987;16(1):78–117.

17. Mundfrom DJ, Shaw DG, Ke TL. Minimum sample size recommendations for conducting factor analyses. Int J Test. 2005;5(2):159–68.

18. Keusch F, Yang T. Is satisficing responsible for response order effects in rating scale questions? Survey Research Methods; 2018. p. 259–70.

19. Krosnick JA. Response strategies for coping with the cognitive demands of attitude measures in surveys. Appl Cogn Psychol. 1991;5(3):213–36.

20. Krosnick JA. Questionnaire design. In: The Palgrave handbook of survey research. Springer; 2018. p. 439–55.

21. Thompson B. Exploratory and confirmatory factor analysis. American Psychological Association; 2004.

22. Altman D. Practical statistics for medical research. London: Chapman and Hall; 1991. p. 404.

23. Streiner D, Norman GR, Cairney J. Health measurement scales: a practical guide to their development and use. Aust N Z J Public Health. 2016.

24. Taber KS. The use of Cronbach’s alpha when developing and reporting research instruments in science education. Res Sci Educ. 2018;48(6):1273–96.

25. Everitt B, Skrondal A. The Cambridge dictionary of statistics. Cambridge: Cambridge University Press; 2010.

26. Cohen RJ, Swerdlik M, Sturman E. Psychological testing and assessment: an introduction to tests and measurement. 2004.

27. Alavi M, Visentin DC, Thapa DK, Hunt GE, Watson R, Cleary M. Chi-square for model fit in confirmatory factor analysis. J Adv Nurs. 2020;76(9):2209–11.

28. Ishiyaku B, Kasim R, Harir AI. Confirmatory factoral validity of public housing satisfaction constructs. Cogent Bus Manag. 2017;4(1):1359458.

29. Loda T, Erschens R, Nikendei C, Giel K, Junne F, Zipfel S, et al. A novel instrument of cognitive and social congruence within peer-assisted learning in medical training: construction of a questionnaire by factor analyses. BMC Med Educ. 2020;20(1):214.

30. Forza C, Filippini R. TQM impact on quality conformance and customer satisfaction: a causal model. Int J Prod Econ. 1998;55(1):1–20.

31. Hopwood CJ, Donnellan MB. How should the internal structure of personality inventories be evaluated? Personal Soc Psychol Rev. 2010;14(3):332–46.

32. Islam MN, Furuoka F, Idris A. The impact of trust in leadership on organizational transformation. Glob Bus Organ Excell. 2020;39(4):25–34.

33. Marsh HW, Hocevar D. Application of confirmatory factor analysis to the study of self-concept: first- and higher order factor models and their invariance across groups. Psychol Bull. 1985;97(3):562–82.

34. Kawakami N, Thi Thu Tran T, Watanabe K, Imamura K, Thanh Nguyen H, Sasaki N, et al. Internal consistency reliability, construct validity, and item response characteristics of the Kessler 6 scale among hospital nurses in Vietnam. PLoS One. 2020;15(5):e0233119.

35. Roff S. The Dundee ready educational environment measure (DREEM)—a generic instrument for measuring students’ perceptions of undergraduate health professions curricula. Med Teach. 2005;27(4):322–5.

36. Bari A, Khan RA, Rathore AW. Postgraduate residents’ perception of the clinical learning environment; use of postgraduate hospital educational environment measure (PHEEM) in Pakistani context. J Pak Med Assoc. 2018;68(3):417–22.

37. Shehnaz SI, Premadasa G, Arifulla M, Sreedharan J, Gomathi KG. Development and validation of the AMEET inventory: an instrument measuring medical faculty members’ perceptions of their educational environment. Med Teach. 2015;37(7):660–9.

38. Könings KD, Seidel T, Brand-Gruwel S, van Merriënboer JJ. Differences between students’ and teachers’ perceptions of education: profiles to describe congruence and friction. Instr Sci. 2014;42(1):11–30.

39. Scantlebury K, Boone W, Kahle JB, Fraser BJ. Design, validation, and use of an evaluation instrument for monitoring systemic reform. J Res Sci Teach. 2001;38(6):646–62.

40. Kim H, Ku B, Kim JY, Park YJ, Park YB. Confirmatory and exploratory factor analysis for validating the phlegm pattern questionnaire for healthy subjects. Evid Based Complement Alternat Med. 2016;2016:2696019.

41. Al Ansari A, Strachan K, Hashim S, Otoom S. Analysis of psychometric properties of the modified SETQ tool in undergraduate medical education. BMC Med Educ. 2017;17(1):56.

42. Koohpayehzadeh J, Hashemi A, Soltani Arabshahi K, Bigdeli S, Moosavi M, Hatami K, et al. Assessing validity and reliability of Dundee ready educational environment measure (DREEM) in Iran. Med J Islam Repub Iran. 2014;28:60.

43. Yusoff MSB. Stability of DREEM in a sample of medical students: a prospective study. Educ Res Int. 2012;2012:509638.

44. Field A. Discovering statistics using IBM SPSS statistics. 5th ed. Sage; 2018.

45. Itani L, Chatila H, Dimassi H, El Sahn F. Development and validation of an Arabic questionnaire to assess psychosocial determinants of eating behavior among adolescents: a cross-sectional study. J Health Popul Nutr. 2017;36(1):1–8.

46. Cortina JM. What is coefficient alpha? An examination of theory and applications. J Appl Psychol. 1993;78(1):98–104.

47. Sijtsma K. On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika. 2009;74(1):107–20.

48. Schmitt N. Uses and abuses of coefficient alpha. Psychol Assess. 1996;8(4):350–3.

49. Loewenthal KM, Lewis CA. An introduction to psychological tests and scales. Psychology Press; 2018.

50. Piedmont RL. Inter-item correlations. In: Michalos AC, editor. Encyclopedia of Quality of Life and Well-Being Research. Dordrecht: Springer Netherlands; 2014. p. 3303–4.

51. Hair J, Anderson RE, Tatham RL, Black W. Multivariate data analysis with readings. Prentice Hall; 1995.



The authors thank all the students, teachers, and experts for their valuable time and contribution. We are also extremely thankful to Ms. Pamela Walter from the Scott Memorial Library, Jefferson University, Philadelphia, USA for her comments on the academic writing that helped in improving the manuscript.


No funding or grants were received from any source for this study.

Author information

Authors and Affiliations



RAK, AS, UM, and JVM conceived and designed the study. RAK did the data collection and initial analysis. RAK, AS, and UM conducted cognitive interviews. JVM, AS, UM, and MAL helped in preparing the manuscript by providing feedback. The authors read and approved the final manuscript.

Authors’ information

Rehan Ahmed Khan is an Assistant Dean of Medical Education and Professor of Surgery at Riphah International University, Pakistan. His interests include curriculum innovation, implementation, and evaluation. He holds a Master’s in medical education from the University of Glasgow and a PhD in medical education from Maastricht University.


Annemarie Spruijt is an Assistant professor at Utrecht University who has a background in veterinary medicine and did her PhD in medical and veterinary education. She takes a special interest in curriculum design, improving the quality of medical and veterinary education, and in small-group learning.


Usman Mahboob is Director of the Institute of Health Professions Education & Research (IHPER) at the Khyber Medical University, Pakistan. He is a medical doctor by profession and did his PhD in health professions education from the University of Glasgow, UK. His research interests are professionalism, approaches to teaching and learning, and curriculum development.


Mohamed Al Eraky is an Assistant Professor of Medical Education and Director of Academic Initiatives at the Vice-Presidency for Academic Affairs, Imam Abdulrahman Bin Faisal University, Saudi Arabia.

ORCID: 0000-0003-2015-7630.

Jeroen J. G. van Merrienboer is a full professor of Learning and Instruction and Research Director of the School of Health Professions Education at Maastricht University, the Netherlands. His research focuses on instructional design, the use of ICT in education, and the development of professional competencies.


Corresponding author

Correspondence to Rehan Ahmed Khan.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Institutional Review Committee at Riphah International University (Appl. # Riphah/IRC/18/0394). Written informed consent was obtained from all participants.

Consent for publication

Not Applicable.

Competing interests

Dr. Usman Mahboob (co-author) is a member of the editorial board of BMC Medical Education. The remaining authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Appendix A. Teacher questionnaire and its modification based on content validity. Appendix B. Student Questionnaire and its modification based on content validity. Appendix C. Response process validity. Appendix E. Student questionnaire (Final version).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Khan, R.A., Spruijt, A., Mahboob, U. et al. Development and validation of teacher and student questionnaires measuring inhibitors of curriculum viability. BMC Med Educ 21, 405 (2021).



Keywords

  • Curriculum
  • Standards
  • Evaluation
  • Viability inhibitors
  • Construct validity