Data source
We obtained all-payer discharge data from January 1, 2003, through December 31, 2008, from the Nationwide Inpatient Sample (NIS) of the Healthcare Cost and Utilization Project of the Agency for Healthcare Research and Quality. The NIS, the largest source of all-payer hospital discharge information in the United States, contains data from approximately 7 million to 8 million hospital stays per year in 1000 hospitals in over 30 states [4]. The hospitals sampled can vary from year to year, but the sample approximates 20% of US community hospitals, including large university hospitals and smaller regional facilities. The database provides information regarding patient demographics, socioeconomic factors, admission profiles, hospital profiles, state codes, discharge diagnoses, procedure codes, total charges, and vital status at hospital discharge. Along with other hospital discharge databases, the NIS has been used to review trends in surgical care and outcomes [5], volume-outcome relationships [6], and disparities in care [7]. A data use agreement is held by the Agency for Healthcare Research and Quality, and our study was considered exempt by the Lahey Clinic Institutional Review Board.
We obtained the American Hospital Association (AHA) Annual Survey of Hospitals database to determine facility structural characteristics, service lines, staffing, and the presence of resident trainees at each hospital. The AHA database contains hospital-specific information on over 6000 hospitals and over 450 health-care systems, including 700 data elements [8]. Its purpose is to provide a comprehensive overview of hospitals while permitting the tracking of hospital performance over time. AHA data have been used extensively to study hospital-based outcomes [9], hospital policies [10], and reimbursement [11].
Study population
We considered all patients, both medical and surgical, who were discharged during the study period. Using the elective variable, we excluded patients admitted for elective reasons and retained only those with nonelective admissions [4]. Thus, patients with emergency and urgent indications for admission were included in our study.
Admission day
The data set permits identification of admission day as a weekend or weekday. We recorded this variable as admitted during a weekend (i.e., Saturday or Sunday) or a weekday (i.e., Monday through Friday) [1, 4].
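The weekend definition above can be illustrated with a minimal sketch. The data set supplies the weekend or weekday classification itself, so the calendar dates and the function name `admission_day_category` below are hypothetical and serve only to make the definition concrete.

```python
from datetime import date


def admission_day_category(admission_date: date) -> str:
    """Return "weekend" for Saturday or Sunday, otherwise "weekday"."""
    # date.weekday() numbers Monday as 0 through Sunday as 6,
    # so values 5 and 6 correspond to Saturday and Sunday.
    return "weekend" if admission_date.weekday() >= 5 else "weekday"
```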
Covariates
Our analysis adjusted for the following covariates: age, sex, race, income level, payer, major diagnostic categories (broader groupings of diagnosis-related groups), and the Charlson comorbidity index score. Age was included as a continuous variable. Sex was entered as a dichotomous variable. Race was divided into white, black, Hispanic, Asian or Pacific Islander, Native American, or other. Income level was categorized into quartiles per estimated median household income of residents in the patient’s zip code [4]. The median income quartiles are classified as follows: $0 to $38 999, $39 000 to $47 999, $48 000 to $62 999, and $63 000 or more [4].
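As an illustration of the income classification, the following sketch maps a median household income in dollars to quartiles 1 through 4 using the cut points stated above. The function name `income_quartile` and the assumption that a raw dollar value is available are hypothetical; the data set may supply the quartile directly.

```python
from bisect import bisect_right

# Lower bounds (in dollars) of quartiles 2, 3, and 4, taken from the text.
QUARTILE_LOWER_BOUNDS = [39_000, 48_000, 63_000]


def income_quartile(median_income: float) -> int:
    """Return the income quartile (1 to 4) for a zip-code median income."""
    if median_income < 0:
        raise ValueError("median income cannot be negative")
    # bisect_right counts how many lower bounds are <= the income,
    # which is the zero-based quartile index.
    return bisect_right(QUARTILE_LOWER_BOUNDS, median_income) + 1
```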
Payer was recorded as follows: Medicare, Medicaid, private including health maintenance organization, self-pay, no charge, or other [4]. Major diagnostic categories were used to adjust for diagnoses; they reflect larger groupings of diagnosis-related groups, are included in the data set, and can be downloaded for review from the US Department of Health and Human Services, Centers for Medicare and Medicaid Services [12]. Major diagnostic categories have been used to evaluate hospitalization risk [13], mortality risk [14], and other outcomes [15]. We also evaluated comorbidity with the Deyo modification of the Charlson comorbidity index [16]. Briefly, we ascertained the presence of 17 comorbid conditions and then weighted them according to the original report. An elevated Charlson comorbidity index score has been demonstrated to correlate with a higher mortality rate [17].
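The weighting step can be sketched as follows. The weights are those of the original Charlson report for the 17 conditions; the condition labels and the function name `charlson_score` are hypothetical, the mapping from diagnosis codes to conditions is omitted, and the sketch assumes a simple sum with no hierarchy between mild and severe forms of the same disease.

```python
# Weights for the 17 comorbid conditions (original Charlson weighting).
# The dictionary keys are illustrative labels, not data set variable names.
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "peripheral_vascular_disease": 1,
    "cerebrovascular_disease": 1,
    "dementia": 1,
    "chronic_pulmonary_disease": 1,
    "rheumatologic_disease": 1,
    "peptic_ulcer_disease": 1,
    "mild_liver_disease": 1,
    "diabetes": 1,
    "diabetes_with_complications": 2,
    "hemiplegia_or_paraplegia": 2,
    "renal_disease": 2,
    "any_malignancy": 2,
    "moderate_or_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6,
    "aids": 6,
}


def charlson_score(conditions) -> int:
    """Sum the weights of the comorbid conditions present for one patient."""
    present = set(conditions)  # count each condition at most once
    unknown = present - set(CHARLSON_WEIGHTS)
    if unknown:
        raise ValueError(f"unrecognized conditions: {sorted(unknown)}")
    return sum(CHARLSON_WEIGHTS[name] for name in present)
```

For example, a patient with a prior myocardial infarction and diabetes with complications would receive a score of 1 + 2 = 3 under this sketch.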
Hospital bed size categories were obtained from the American Hospital Association Annual Survey of Hospitals and based on the number of short-term acute care beds.
Staffing
Staffing levels were obtained from the American Hospital Association Annual Survey of Hospitals [8]. We analyzed the effect of full-time registered nurse and full-time physician staffing on mortality by calculating the ratio of nurses per hospital bed and the ratio of physicians per hospital bed. We categorized each of these two variables into tertiles (low, medium, or high).
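A minimal sketch of the ratio and tertile construction is given below. The function names are hypothetical, and the choice of quantile method (the default of Python's `statistics.quantiles`) and the handling of values that fall exactly on a cut point are assumptions, since the text does not specify them.

```python
from statistics import quantiles


def per_bed_ratio(full_time_staff: float, beds: float) -> float:
    """Ratio of full-time staff (nurses or physicians) per hospital bed."""
    if beds <= 0:
        raise ValueError("bed count must be positive")
    return full_time_staff / beds


def tertile_labels(ratios):
    """Label each hospital's ratio as low, medium, or high by tertile."""
    # quantiles(..., n=3) returns the two cut points between three groups.
    low_cut, high_cut = quantiles(ratios, n=3)
    labels = []
    for ratio in ratios:
        if ratio <= low_cut:
            labels.append("low")
        elif ratio <= high_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels
```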
Presence of resident trainees
The teaching status of the hospital was obtained from the American Hospital Association Annual Survey of Hospitals [8]. Presence of resident trainees was categorized into tertiles. Because half of all facilities had no residents, facilities without residents formed the lowest tertile. The middle tertile included facilities with 1–26 resident trainees. The highest tertile included facilities with 27 or more resident trainees.
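The three categories can be expressed as a short sketch using the cut points stated above. The function name `resident_category` and the category labels are hypothetical.

```python
def resident_category(n_residents: int) -> str:
    """Classify a hospital by its number of resident trainees."""
    if n_residents < 0:
        raise ValueError("resident count cannot be negative")
    if n_residents == 0:
        return "lowest"   # no resident trainees
    if n_residents <= 26:
        return "middle"   # 1 to 26 resident trainees
    return "highest"      # 27 or more resident trainees
```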
Outcome
The data set permits identification of vital status at the time of discharge. The variable is coded as died during hospitalization or did not die during hospitalization. Deaths that occurred after discharge are not identifiable from our data set [4].
Statistical analysis
Statistical analyses were performed using SAS statistical software, version 9.2 (SAS Institute Inc, Cary, North Carolina). We analyzed univariate associations with patient admission day (weekend vs. weekday) using t tests for continuous variables and χ² tests for categorical variables. Results were considered statistically significant at p < 0.05, and all statistical tests were 2-tailed. We included all covariates in our regression model. The analyses were conducted both with and without observations that had missing data. To confirm results, we performed imputation of missing data using the multiple imputation procedure from SAS Institute Inc [18]. Multiple imputation replaces missing values with plausible values that characterize the uncertainty regarding the missing data [19, 20]. This process results in valid statistical inferences that properly reflect the uncertainty due to missing values, for example, confidence intervals with the correct probability coverage. The multiply imputed data sets were then analyzed using standard logistic regression for complete data.
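To illustrate how estimates from multiply imputed data sets are commonly combined, the sketch below applies the standard pooling rules (often called Rubin's rules) to a regression coefficient estimated once per imputed data set. It is an assumption that the authors' software combined results in exactly this way; the function name `pool_rubin` and its inputs are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets.

    estimates: the coefficient estimated in each imputed data set
    variances: the squared standard error from each imputed data set
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations of equal length")
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation (sample) variance
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)
```

The between-imputation term is what widens the confidence interval to reflect uncertainty about the missing values; with no missing data, all estimates coincide and the pooled standard error reduces to the usual one.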
In the regression analysis, we tested for interactions among staffing levels, resident trainees, and admission day in their association with mortality.
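As a sketch of what an interaction term looks like in a regression design, the function below builds the predictors for one patient from a weekend indicator and a nurse staffing tertile. The variable names, the choice of the low tertile as the reference level, and the restriction to a single two-way interaction are all assumptions made for illustration.

```python
def design_row(weekend: int, nurse_tertile: str) -> dict:
    """Build main-effect and interaction predictors for one observation."""
    if weekend not in (0, 1):
        raise ValueError("weekend must be 0 or 1")
    if nurse_tertile not in ("low", "medium", "high"):
        raise ValueError("nurse_tertile must be low, medium, or high")
    # Dummy variables for staffing, with "low" as the assumed reference level.
    medium = int(nurse_tertile == "medium")
    high = int(nurse_tertile == "high")
    return {
        "weekend": weekend,
        "nurse_medium": medium,
        "nurse_high": high,
        # Interaction terms are products of the main-effect predictors.
        "weekend_x_nurse_medium": weekend * medium,
        "weekend_x_nurse_high": weekend * high,
    }
```

A nonzero coefficient on an interaction term would indicate that the association between weekend admission and mortality differs across staffing levels.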