Study design
This study used an observational design. Reporting was guided by the REporting of studies Conducted using Observational Routinely-collected Data (RECORD) checklist.
Data source
The data used in this study, the most recently compiled results from the CAPTE AAR, was obtained from CAPTE-accredited DPT programs that graduated students in 2017. The CAPTE AAR provided institutional and general information about the accredited DPT programs, including characteristics of the institution, program, and faculty. The Institutional Review Board of the University of North Carolina (ID #18–3059) determined that the data does not constitute human subjects research; thus, informed consent was waived under federal regulations [45 CFR 46.102 (d or f) and 21 CFR 56.102 (c)(e)(l)].
Institutional information
The institutional data comprised the 203 DPT programs within the US that were accredited at the time of the study. Institutional identifiers were masked in the data set, such that investigators could not link an entry to a specific institution. Selected data about each institution was provided based on the core faculty for the DPT program, defined by the AAR as “those individuals appointed to and employed primarily in the program … with the responsibility and the authority related to the curriculum” [16]. The AAR includes the number of peer-reviewed publications, the amount of NIH funding (including the total across all departments and years receiving funding), and the number of faculty members holding grants, all of which represent research productivity. All data was program-level information; no individual student data was provided.
Variables used in the modeling
The CAPTE AAR provided a comprehensive list of variables that could potentially be included in this analysis. However, because we were interested in variables that could be indicators of research productivity, only those reflective of institutional, program, and faculty characteristics were considered. An additional glossary describes these terms in more detail (see Additional file 1 [2, 17]).
Independent variables
Variables were chosen that reflected a) the institutions, b) the institutions’ program format, and c) faculty characteristics. These variables were grouped into institutional, program, and faculty characteristics for interpretation of the analyses.
Institutional characteristics
The institutional characteristics included four variables: Carnegie classification, private status, traditional institution type, and student body size (whole institution). All variables were reported as categorical data, dichotomized as follows: a) Carnegie classification (research universities/doctorate and research versus all others [other health professional schools, medical schools and medical centers, and master’s colleges and universities]), b) private status (private versus public), c) traditional institution status (traditional [Academic Health Science Center, Liberal Arts College (4-year), and Liberal Arts University] versus all others [Proprietary, Osteopathic Medical School, Professional and Technological University]), and d) student body size (> 10,000 versus < 10,000).
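To illustrate how these dichotomizations could be constructed from a program-level export of the AAR, a minimal sketch in Python/pandas is shown below. The analysis itself was performed in SPSS; the file name, column names, and category labels here are hypothetical placeholders rather than actual AAR field names.

```python
import pandas as pd

# Hypothetical program-level export of the 2017 CAPTE AAR; the file name,
# column names, and category labels are illustrative, not actual AAR fields.
df = pd.read_csv("capte_aar_2017.csv")

research_labels = {"Research Universities/Doctorate and Research"}
traditional_labels = {"Academic Health Science Center",
                      "Liberal Arts College (4-year)",
                      "Liberal Arts University"}

# Dichotomize the four institutional characteristics (1 = yes, 0 = no).
df["carnegie_research"] = df["carnegie_classification"].isin(research_labels).astype(int)
df["private"] = (df["institution_control"] == "Private").astype(int)
df["traditional_type"] = df["institution_type"].isin(traditional_labels).astype(int)
df["large_student_body"] = (df["student_body_size"] > 10_000).astype(int)
```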
Program characteristics
The program characteristics included nine variables: program format (> 6 years), number of terms, total program length, number of credits, classroom education hours, curriculum model, operating budget, total number of courses, and square footage of research space. Program format, operating budget, and curriculum model were reported as categorical data, dichotomized as follows: a) program format (> 6 years versus 6 years), b) operating budget (upper quartile [top 25%] versus lower 75%), and c) curriculum model (traditional versus all others [hybrid, systems-based, problem-based, modified problem-based, guide-based]). Number of terms, total program length, number of credits, classroom education hours, total number of courses, and square footage of research space were reported as continuous data by mean and standard deviation (SD).
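Continuing the illustrative data frame from the sketch above, the program-level dichotomizations that depend on a numeric or categorical cut could be derived as follows; again, the column names are placeholders.

```python
# Program format: > 6 years versus 6 years.
df["format_gt_6yr"] = (df["program_length_years"] > 6).astype(int)

# Operating budget: upper quartile (top 25%) versus lower 75%.
budget_q3 = df["operating_budget"].quantile(0.75)
df["budget_upper_25"] = (df["operating_budget"] > budget_q3).astype(int)

# Curriculum model: traditional versus all others.
df["traditional_curriculum"] = (df["curriculum_model"] == "Traditional").astype(int)
```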
Faculty characteristics
The faculty characteristics included four variables: total number of vacancies, faculty turnover, faculty-to-student ratio, and total full-time equivalents. All four variables were reported as continuous data by mean and SD.
Dependent variables (outcome variables of interest)
To describe the scholarly culture of the institutions in the CAPTE dataset, three outcomes were considered as proxies for research productivity: peer-reviewed publications, National Institutes of Health (NIH) funding, and the total number of faculty within the DPT program who reported holding grants; each was standardized per core faculty member. Each outcome was then stratified to separate the top 25% from the bottom 75%, with the intent of capturing the highest-performing, research-intensive institutions in the upper quartile [18]. To our knowledge, no single value in the literature provides a meaningful cutoff to represent research productivity. Indeed, only a few physical therapy programs report high productivity, and within the AAR the median score for publications and NIH funding is zero. Consequently, instead of splitting the variables at the median [19, 20], we elected the same method (quartile rank, 25% versus 75%) used in previous medical research [21,22,23]. These three outcomes were used to associate research productivity with variables in the dataset that might modify the outcomes.
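The standardization and quartile split described above could be expressed, under the same illustrative naming assumptions as the earlier sketches, as:

```python
# Standardize each productivity outcome per core faculty member, then split the
# programs at the 75th percentile (top 25% versus bottom 75%).
outcomes = ["peer_reviewed_publications", "nih_funding", "faculty_with_grants"]

for col in outcomes:
    per_faculty = df[col] / df["core_faculty_count"]
    cutoff = per_faculty.quantile(0.75)
    df[f"{col}_top25"] = (per_faculty > cutoff).astype(int)
```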
Statistical analysis methods
All analyses in this study were performed using the Statistical Package for the Social Sciences (SPSS) version 25.0 (IBM Corp., Armonk, NY, USA). Descriptive statistics (mean/SD and frequencies) were used to summarize the institutional, program, and faculty characteristics (both dependent and independent variables). To assess multicollinearity in the modeling, correlation analyses were performed between the 17 independent variables. Correlations were interpreted as negligible (r = 0.00 to 0.30), low (r = 0.30 to 0.50), moderate (r = 0.50 to 0.70), high (r = 0.70 to 0.90), or very high (r = 0.90 to 1.00), whether positive or negative [24, 25]. Variables with correlations ≥ 0.70 were removed so that they would not unduly influence the models. A p-value < 0.05 was considered significant.
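Although the screening was run in SPSS, an equivalent multicollinearity check can be sketched as follows; `independent_vars` stands in for the 17 predictors and is abbreviated here.

```python
# Pairwise Pearson correlations among the independent variables; pairs with
# |r| >= 0.70 are flagged as candidates for removal before modeling.
independent_vars = ["carnegie_research", "private", "traditional_type",
                    "large_student_body", "format_gt_6yr", "budget_upper_25"]  # abbreviated

corr = df[independent_vars].corr(method="pearson")
high_pairs = [(a, b, corr.loc[a, b])
              for i, a in enumerate(independent_vars)
              for b in independent_vars[i + 1:]
              if abs(corr.loc[a, b]) >= 0.70]

for a, b, r in high_pairs:
    print(f"{a} vs {b}: r = {r:.2f}")
```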
A comparative analysis was performed to relate the independent variables to the outcomes reflective of research productivity. Univariate logistic regression analyses were performed between the three outcome measures and each of the 17 independent variables. For each univariate analysis, p-values, odds ratios (OR), 95% confidence intervals (CI), and the percentage of explained variance (Nagelkerke’s R2) were reported. An OR measures the association between an outcome and exposure to a particular variable, with values > 1.0 indicating that exposure is associated with higher odds of the outcome [26]. Nagelkerke’s R2 is analogous to an R-squared value and indicates the explanatory power of the model [27]. Independent variables with significant univariate associations were entered into a multivariate backward stepwise logistic regression analysis for each of the three outcome measures of research productivity. In each of these three multivariate models, a p-value < 0.05 was considered significant. We adopted the criterion proposed by Harrell et al., equivalent to a minimum of 10 to 20 events per variable for logistic regression [28, 29].
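As a rough, non-SPSS illustration of the univariate screen, the sketch below fits a single-predictor logistic regression with statsmodels and reports the OR, 95% CI, p-value, and Nagelkerke’s R2 (derived from the Cox–Snell R2, since statsmodels does not report it directly). The backward stepwise step is omitted because statsmodels has no built-in equivalent of SPSS’s procedure; variable names continue the illustrative conventions of the earlier sketches.

```python
import numpy as np
import statsmodels.api as sm

def univariate_logit(y, x):
    """Single-predictor logistic regression: returns OR, 95% CI, p-value,
    and Nagelkerke's R^2 for the predictor `x` (a pandas Series)."""
    X = sm.add_constant(x.astype(float))
    fit = sm.Logit(y, X).fit(disp=False)

    n = len(y)
    r2_cox_snell = 1 - np.exp(2 * (fit.llnull - fit.llf) / n)
    r2_nagelkerke = r2_cox_snell / (1 - np.exp(2 * fit.llnull / n))

    odds_ratio = np.exp(fit.params[x.name])
    ci_low, ci_high = np.exp(fit.conf_int().loc[x.name])
    return odds_ratio, (ci_low, ci_high), fit.pvalues[x.name], r2_nagelkerke

# Screen every independent variable against one dichotomized outcome.
for var in independent_vars:
    or_, ci, p, r2 = univariate_logit(df["nih_funding_top25"], df[var])
    print(f"{var}: OR={or_:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f}), "
          f"p={p:.3f}, Nagelkerke R2={r2:.3f}")
```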