A multicenter mixed-effects model for inference and prediction of 72-h return visits to the emergency department for adult patients with trauma-related diagnoses

Objective Emergency department (ED) return visits within 72 h may be a sign of poor quality of care and entail unnecessary use of healthcare resources. In this study, we compare the performance of two leading statistical and machine learning classification algorithms, and we use the best performing approach to identify novel risk factors of ED return visits.
Methods We analyzed 3.2 million ED encounters with at least one diagnosis under “injury, poisoning and certain other consequences of external causes” and “external causes of morbidity.” These encounters included patients 18 years or older from 128 emergency room facilities in the USA. For each encounter, we calculated the 72-h ED return status and retrieved 57 features from demographics, diagnoses, procedures, and medications administered during the course of medical care. We implemented a mixed-effects model to assess the effects of the covariates while accounting for the hierarchical structure of the data. Additionally, we investigated the predictive accuracy of the extreme gradient boosting tree ensemble approach and compared the performance of the two methods.
Results The mixed-effects model indicates that certain blunt force and non-blunt force traumas increase the risk of a return visit. Notably, patients with trauma to the head and patients with burns and corrosions have elevated risks. This is in addition to 11 other classes of both blunt force and non-blunt force traumas. In addition, measures of prior healthcare resource utilization, namely one or more prior return visits, prior ED visits, and the number of hospitalizations within the last 6 months, are associated with increased risk of returning to the ED after discharge. The area under the receiver operating characteristic curve (AUROC) of the mixed-effects model was 0.710 (CI 0.707, 0.712), whereas the gradient boosting tree ensemble had a lower AUROC of 0.698 (CI 0.696, 0.700) on the independent test data.
Conclusions The proposed mixed-effects model achieved the highest known AUROC and resulted in the identification of novel risk factors. The model outperformed the extreme gradient boosting tree, one of the leading machine learning ensemble classifiers. The risk factors we identified can help emergency departments decrease the number of unplanned return visits within 72 h.


Introduction
Emergency departments across the USA are continually working on improving the quality of care as measured by health outcomes of patients, overall patient experience, and reduction in cost to both patients and facilities. These emergency department (ED) facilities are seeing annual increases in patient census that may impact the quality of care [1][2][3]. This increase in ED utilization is coupled with existing issues of overcrowding to exacerbate the challenges of providing a high quality of care and reducing both morbidity and mortality [4][5][6][7][8][9][10][11]. These challenges are complex and multifaceted and are further worsened by return visits to the ED that are avoidable. Consequently, the rate of return visits to the ED within 72 h of a previous discharge is being used as a metric for quality of care in the ED [12][13][14][15]. Return visits to the ED may be reflective of poor quality of care but may also be caused by latent illnesses and misdiagnoses [16], unrelated new problems [17], perceived inability to access timely follow-up care, and patient uncertainty or fear about disease progression [18].
Patients with trauma/injuries have particularly high rates of potentially unnecessary return visits, with over 43.1% of corresponding revisits estimated as being avoidable in this group [12]. It is therefore important to understand and address the factors associated with return visits within this population. Several attempts have been made to address this issue, including studies focused on the role of patient demographics and socioeconomic status, mode of transportation, and level of trauma activation [19]. Further attempts have been made with a focus on patients with head injuries [20].
In this study, we specifically explored new variables in search of novel risk factors associated with ED returns for patients with trauma-related codes as captured by the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) codes of S00-T79 (injury, poisoning and certain other consequences of external causes) and V00-Y99 (external causes of morbidity). There have been no comprehensive studies on the prediction of ED return visits among patients with trauma/injuries. Existing studies analyzed risk factors for presentation to the ED after the discharge of trauma patients from the hospital [21] and the effect of head trauma on risk of ED return within 72 h [20]. The objective of this study is to address this important problem by exploring novel risk factors of ED revisits and designing a corresponding prediction model. We compared the performance of advanced statistical methods and a high-accuracy machine learning algorithm to determine the optimal classification model. We provide an assessment of model performance with recommendations on the potential implementation of the corresponding predictive models in the ED. The new variables we considered include several measures of current and past healthcare resource utilization that have been found to be associated with the related problem of hospital readmission [22][23][24].

Methods
This study was approved by CHOC Children's Hospital Institutional Review Board (IRB 180857).

Study design and setting
The data source for the study is the Cerner Health Facts database (referred to as Health Facts DB from here on). The Health Facts DB consists of data captured by the Cerner Corporation from over 100 US healthcare systems and over 650 facilities (in 2018) that is aggregated and organized into consumable datasets to facilitate research and reporting. It consists of clinical database tables that include information on ED visits, diagnoses, and medications. The descriptive and predictive multicenter models developed in this study were built using a subset of data from the database based on a priori inclusion criteria. An extensive analysis of a prior version of the database has been conducted with recommendations on its use [25].

Selection of participants, measurements, and outcomes
We retrieved ED encounters for patients 18 years or older from ED facilities in the USA from the Health Facts DB. We included EDs that contributed to the key database tables for the study (encounters, diagnoses, and medications tables) and had seen a large number of patients (threshold set a priori at 10,000). These inclusion criteria ensured both the exclusion of potentially noisy data and the inclusion of large sample centers. We included multiple index encounters and revisits within 72 h for individual patients, and each encounter that was itself a revisit within 72 h was treated as an index encounter for estimating subsequent revisits to the ED. We included demographic variables as well as proxies for socioeconomic status, prior ED and hospital utilization variables, diagnoses, and the total number of medications administered during the ED visit.
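The 72-h return status described above can be illustrated with a minimal sketch. It assumes each encounter record carries a patient identifier, an arrival time, and a discharge time, and that a return is counted when the next arrival falls within 72 h of the previous discharge. The field names (`patient_id`, `arrival`, `discharge`), the function name, and the toy records are hypothetical and are not taken from the Health Facts DB schema or from the study code.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_72h_returns(encounters, window_hours=72):
    """Return one boolean per encounter: True if the same patient's next
    ED arrival falls within `window_hours` of this encounter's discharge."""
    window = timedelta(hours=window_hours)

    # Group encounter positions by patient (field names are assumptions).
    by_patient = defaultdict(list)
    for idx, enc in enumerate(encounters):
        by_patient[enc["patient_id"]].append(idx)

    flags = [False] * len(encounters)
    for idxs in by_patient.values():
        # Order each patient's encounters chronologically.
        idxs.sort(key=lambda i: encounters[i]["arrival"])
        # Every encounter, including one that is itself a revisit,
        # acts as an index encounter for the one that follows it.
        for cur, nxt in zip(idxs, idxs[1:]):
            gap = encounters[nxt]["arrival"] - encounters[cur]["discharge"]
            flags[cur] = timedelta(0) <= gap <= window
    return flags


# Hypothetical toy records (not study data).
toy_encounters = [
    {"patient_id": "A", "arrival": datetime(2017, 1, 1, 10), "discharge": datetime(2017, 1, 1, 14)},
    {"patient_id": "A", "arrival": datetime(2017, 1, 3, 9), "discharge": datetime(2017, 1, 3, 12)},
    {"patient_id": "A", "arrival": datetime(2017, 1, 4, 8), "discharge": datetime(2017, 1, 4, 11)},
    {"patient_id": "B", "arrival": datetime(2017, 1, 2, 7), "discharge": datetime(2017, 1, 2, 9)},
    {"patient_id": "A", "arrival": datetime(2017, 1, 20, 8), "discharge": datetime(2017, 1, 20, 10)},
]
toy_flags = flag_72h_returns(toy_encounters)
```

In the toy data, the second encounter of patient A is both a revisit of the first and an index encounter for the third, which mirrors the treatment of revisits described in the text.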

Analysis
We categorized the ages of the patients based on the distribution of readmission rates by age, as shown in Fig. 1. We excluded very sparse variables (defined a priori as having fewer than 1000 responses) to prevent issues with statistical separation [26]. Sparse outcomes or analyses of rare events require exact statistical tests [27].
We assessed multicollinearity by estimating the generalized variance inflation factor (GVIF) [28,29] of the variables. In a stepwise process, we excluded the variable with the highest GVIF and reassessed multicollinearity until the GVIF of every remaining variable was below 4, a rule-of-thumb threshold based on previous studies [23]. We randomly split the data into two halves: 50% for model development and the other 50% for evaluating model performance. We implemented a mixed-effects logistic regression model and a gradient boosting tree ensemble [30,31]. We conducted variable selection on the random intercept model using stepwise minimization of the Akaike Information Criterion and used grid search for hyperparameter tuning of the gradient boosting algorithm. We assessed model performance using the area under the receiver operating characteristic curve (AUROC) as well as sensitivity and positive predictive value at a specificity of 0.90. Analyses were carried out using Apache Spark [32,33], the R Statistical Computing Programming Language [34], and Python [35].
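As an illustration of the evaluation step, the sketch below computes sensitivity and positive predictive value at a fixed specificity from predicted scores. It is one plausible way to do this and is not the authors' code: the threshold rule (the smallest observed score that keeps at least the target share of non-returning encounters at or below it), the function name, and the toy data are assumptions.

```python
import math


def metrics_at_specificity(y_true, scores, target_specificity=0.90):
    """Pick the score threshold that reaches the target specificity and
    report sensitivity and PPV for the rule `score > threshold`."""
    negatives = sorted(s for y, s in zip(y_true, scores) if y == 0)
    n_pos = sum(1 for y in y_true if y == 1)
    if not negatives or n_pos == 0:
        raise ValueError("both outcome classes are required")

    # Number of negatives that must score at or below the threshold;
    # round() guards against floating-point noise before taking the ceiling.
    k = max(1, math.ceil(round(target_specificity * len(negatives), 9)))
    threshold = negatives[k - 1]

    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s > threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s > threshold)
    tn = len(negatives) - fp
    return {
        "threshold": threshold,
        "specificity": tn / len(negatives),
        "sensitivity": tp / n_pos,
        # PPV is undefined when nothing is flagged.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Hypothetical toy scores (not study data): 10 non-returns, 4 returns.
toy_y = [0] * 10 + [1] * 4
toy_scores = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10,
              0.02, 0.05, 0.12, 0.20]
toy_metrics = metrics_at_specificity(toy_y, toy_scores)
```

With tied scores at the threshold, the achieved specificity can exceed the target, so the returned `specificity` should be checked rather than assumed.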

Characteristics of study subjects
A total of 128 ED facilities met the inclusion criteria, resulting in 2.2 million patients and 3.2 million encounters. Each facility contributed data from different periods of time between 2000 and 2017, and the average number of years of data from the facilities is 7.6 years with a standard deviation of 2.7 years. There were 64.8, 23.8, and 11.4% of patients with ages 18-49, 50-69, and 70 years or older, respectively. Note that the process of deidentification included a requirement to specify the age of patients older than 90 years as 90 to reduce the possibility of reidentification of patients by age. By sex, 50.9% of patients were male, 49.0% were female, and the remainder were of unknown sex, while 68.3, 17.5, and 14.2% were Caucasian, African American or Black, and of other races and ethnicities, respectively. The overall rate of 72-h return visits to the ED was 0.037.
There were 1.6 million encounters in the training dataset after splitting the data into two halves. In Tables 1 and 2, we provide the summary statistics on the training dataset, which includes all 57 variables we considered during model development. We excluded variables capturing surgical procedures on the endocrine, hemic/lymphatic, and mediastinum/diaphragm systems due to sparsity and potential problems with statistical separation and multicollinearity.

Main results
Our results indicate that the highest risk factors attributable to the type of trauma/injury include certain early complications of trauma such as embolisms and traumatic compartment syndrome (ICD-10-CM: T79); burns and corrosions (T20-T32); certain effects of external causes such as hypothermia, asphyxiation, and abuse (T66-T78); poisoning due to medical and biological substances (T36-T50); and injuries to the head (S00-S09). Patients with early complications of trauma have a 120% increase in the odds of a return visit; patients with burns and corrosions, effects of external causes (such as hypothermia, asphyxiation, and abuse), poisoning, and injuries to the head also have increased odds of a return visit. These risk factors are listed in Table 3 with corresponding odds ratios and 95% confidence intervals. We also found that certain injuries/traumas are associated with reduced odds of a return visit. Injuries to the shoulder and upper arm (S40-S49), injuries to the ankle and foot (S90-S99), injuries to the thorax (S20-S29), and effects of a foreign body entering through a natural orifice (T15-T19) have 5, 5, 8, and 14% decreases in the odds of a return visit, respectively. The effect of other trauma variables not captured in Table 2 did not achieve statistical significance.
In addition to these findings, patient demographics, proxies for socioeconomic status, proxies for healthcare utilization, and certain comorbidities were associated with the risk of a return visit. Older patients have increased odds of a return visit. There is a 20% increase in the odds of a return visit for male patients compared to female patients. African American/Black patients and patients of Hispanic origin have 5 and 13% decreases, respectively, in the odds of a return visit compared to Caucasians. Patients with a health insurance type other than commercial have increased odds of a return visit. Compared to patients discharged from the ED within the first hour, patients with an ED length of stay between 1 and 12 h and those with a length of stay greater than 12 h have 35 and 74% increases, respectively, in the odds of a return visit. The number of previous hospitalizations, previous ED visits, and previous return visits to the ED within the last 6 months were all risk factors for a subsequent return visit to the ED. Patients with previous hospitalizations have a 26 to 61% increase in odds, those with previous ED visits have a 30 to 114% increase in odds, and those with previous return visits have a 38 to 301% increase in odds of a subsequent return visit. Furthermore, when a patient experiences a return visit and is discharged home, the odds of a subsequent return visit increase by 50%.
Lastly, patients with comorbidities relating to the circulatory system (I00-I99) or the nervous system (G00-G99), or arising from complications of surgical and medical care (T80-T88), have 3, 9, and 36% increases in the odds of a return visit, respectively.
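The percentages reported above are changes in odds, not in probability. As a reading aid, and assuming the usual random-intercept logistic form for a model with EDs as random intercepts, the relationship between a coefficient, its odds ratio, and the reported percentage can be written as follows. The notation (encounter i in facility j, covariates x, facility effect u) is ours and is not taken from the paper.

```latex
\begin{align}
\operatorname{logit}\,\Pr(y_{ij} = 1 \mid u_j)
  &= \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2) \\
\mathrm{OR}_k &= e^{\beta_k} \\
\text{percent change in odds}_k &= 100\,(\mathrm{OR}_k - 1)
\end{align}
```

Under this reading, a 120% increase in odds corresponds to an odds ratio of 2.20, and a 14% decrease corresponds to an odds ratio of 0.86. These odds ratios are conditional on the facility, that is, they compare patients within the same ED.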
The AUROC of the mixed-effects model was 0.710 (CI 0.707, 0.712), while the AUROC of the gradient boosting tree ensemble (the machine learning algorithm) was lower at 0.698 (CI 0.696, 0.700). In Table 4, we report the performance of the model at specificities between 55 and 95% inclusive. We suggest three risk strata: high risk for patients with predicted probabilities greater than 0.0604 (nearly twice the overall rate of return visits), moderate risk for patients with predicted probabilities between 0.0417 and 0.0604, and low risk for patients with predicted probabilities less than 0.0417 (just slightly higher than the baseline risk). We expect over 50% of all at-risk patients (for return visits to the ED within 72 h) to be captured in the high and moderate risk strata with an overall number needed to evaluate (NNE) of 12.
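The risk strata above can be sketched in a few lines. The two cutoffs (0.0417 and 0.0604) come from the text; everything else is an assumption for illustration, including the function names, the handling of values exactly at a cutoff, the toy data, and the use of the common definition of NNE as the reciprocal of the positive predictive value among flagged patients.

```python
def assign_stratum(prob, moderate_cutoff=0.0417, high_cutoff=0.0604):
    """Map a predicted probability of a 72-h return to a risk stratum."""
    if prob > high_cutoff:
        return "high"
    if prob >= moderate_cutoff:
        return "moderate"
    return "low"


def flagged_summary(y_true, probs):
    """Summarize patients flagged as moderate or high risk.

    captured_fraction: share of all observed returns that were flagged.
    nne: flagged patients per observed return (reciprocal of PPV).
    """
    flagged = [y for y, p in zip(y_true, probs) if assign_stratum(p) != "low"]
    total_returns = sum(y_true)
    captured = sum(flagged)
    if total_returns == 0 or captured == 0:
        raise ValueError("need at least one flagged return to summarize")
    return {
        "n_flagged": len(flagged),
        "captured_fraction": captured / total_returns,
        "nne": len(flagged) / captured,
    }


# Hypothetical toy predictions (not study data).
toy_probs = [0.10, 0.07, 0.05, 0.045, 0.03, 0.02, 0.01, 0.065]
toy_returns = [1, 0, 0, 1, 1, 0, 0, 0]
toy_summary = flagged_summary(toy_returns, toy_probs)
```

In an EMR integration, a function like `assign_stratum` would be applied to the model's predicted probability at discharge, and the summary would be recomputed on local data to check that the capture rate and NNE hold at the deploying site.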

Limitations
There are, however, some limitations in the data/database used. We did not analyze the reasons patients return to the ED, given the multi-center nature of the dataset, the absence of clinical notes, and the large sample sizes. We relied on diagnostic codes, which are often subject to data entry errors and inconsistent use across providers and institutions. These limitations have a lesser impact as the size of the overall dataset increases. Consequently, we believe that these limitations may have a negligible impact on inference as a result of the very large sample sizes used. Furthermore, variations in clinical care across different EDs in the USA are accounted for by using a mixed-effects model with the EDs as random intercepts.

Discussion/conclusion
ED return visits within 72 h of discharge in adults may be a result of poor quality of care, poor patient education on the use of the ED, poor social determinants of health, and complex psychological/psychosocial influences. The underlying causal factors for these return visits have not been formally established, so we rely on statistical associations in order to better identify high-risk patients. In some cases, return visits are unpreventable, such as a patient returning for reasons unrelated to the initial visit, unforeseen deterioration of health unrelated to the quality of care received during the initial visit, and patient misuse of the ED, among others [16][17][18][21]. However, identifying factors associated with a high risk of return visits may help identify high-risk patients for targeted intervention, especially when the clinical resources for such interventions are scarce and expensive. In this study, we used mixed-effects regression to explore new variables in search of novel risk factors associated with ED returns for adult patients visiting the ED for trauma (or trauma-related conditions). We assessed the effect of the type of trauma, demographics, and proxies for socioeconomic status, prior ED and hospital utilization variables, diagnoses, and the total number of medications administered during the ED visit. Our results indicate that both non-blunt and blunt traumas to certain regions of the body are associated with increased odds of a return visit to the ED. The non-blunt traumas include early complications of trauma (such as air/fat embolism, traumatic shock, and traumatic compartment syndrome), burns and corrosions, trauma due to external causes such as hypothermia or asphyxiation, and poisoning resulting from medicaments and biological substances.
Factors such as poisoning/adverse effects of medications and nonmedicinal sources and trauma due to external causes (such as hypothermia and asphyxiation) may require a more strategic approach to be impactful.
Blunt traumas to certain body regions, namely the head, hand, knee, legs, and abdomen/lower back/external genitals, were associated with increased odds of a return visit. On the one hand, injuries to the head are often serious and/or alarming due to the potential for death, traumatic brain injury, concussion, and post-concussive syndrome. The gravity or morbidity associated with head injuries may result in higher odds of return visits among patients discharged home. Potential improvement in the quality of care and post-discharge follow-up of patients with head injuries may be achieved with a model such as the mixed-effects model we developed here by careful design of intervention protocols on the discharge education of patients with head injuries. On the other hand, injuries to the hand and regions of the legs may impede mobility and dexterity and are easy to aggravate in the attempt to return to routine daily activities. This is a case where education of patients on the need for rest as well as the risk of a return visit may be helpful. Regardless of the cause, patients with these risk factors may benefit the most from education, social services interventions, and interventions aimed at ameliorating the effects of poor social determinants of health. We note that misdiagnoses recorded and left uncorrected in the EMR would be present in the data used for the study. However, given the size of the data, we expect such misdiagnoses to act as ignorable noise in the study.
The result on prior healthcare utilization variables indicates that patients who may be suffering from complex chronic conditions and/or who have easier access to the healthcare system have higher odds of a return to the ED after the index visit. Patients with chronic conditions are likely to be more educated about the healthcare system (due to frequent utilization), and their return visits are expected to be due to deterioration of health and unexpected complications due to underlying conditions. These patients are likely to have the highest proportion of unpreventable revisits within 72 h. But this also calls into question the proper management of their chronic conditions and the role of primary care physicians in chronic disease management.
Results on demographics and social determinants indicate that patients from higher socioeconomic families (as captured by health insurance type) have a higher risk of a return visit. While the reason for this association is not clear, we surmise that the use of the ED may be associated with having the means to pay, to be transported, and to spend time away from work or other daily activities. This means that there may be challenges in access to care for patients of lower socioeconomic status. We found that male patients are more likely than female patients to have a return visit, and that older patients are more likely to return than their younger peers. The result on difference in sex is expected under the assumption that male patients are more likely to engage in physical (or more physically strenuous) activities that may result in exacerbation of injuries. The result on difference in risk due to age (with older patients more likely to return to the ED) does not lend itself to easy explanations even though we expect that older patients have more complex conditions. We would also expect that older patients are more likely to be admitted to the hospital from the ED, and more care may be taken by providers before discharge home directly from the ED.
The mixed-effects model can be used to rank patients on the predicted probability of returning after discharge. Patients who rank most at-risk for a return visit can be targeted for intervention in any of the following ways. First, identified risk factors may guide more detailed evaluation of the patient at the initial encounter. Second, additional discharge instructions may be provided based on patient conditions, factors that may result in deterioration of health after discharge, and more detailed post-discharge plans to mitigate unnecessary utilization of the ED. Third, the pre-discharge discussion may facilitate the transition to primary care providers and identify those patients who do not have adequate primary care access. Fourth, post-discharge phone calls, where appropriate, may resolve many issues without unneeded visits while identifying those who need prompt reassessment. These four intervention opportunities are resource-intensive, but targeted interventions based on the patients the model predicts to be most at-risk may provide the most impact in the improvement of the overall care of patients.
These risk factors, coupled with the high predictive power of the mixed-effects model (as measured by its AUROC of 0.710), indicate that the model may possess strong clinical utility. The mixed-effects model performed better than the machine learning model, most likely due to the appropriateness of a mixed-effects model in this data/study setting. Most studies on ED return visits have low model performance due to the complexity of reasons for return visits (which may include non-clinical factors not captured in the EMR). Consequently, attention should be paid to the novel variables used in the model in an attempt to improve on existing models. Our findings include novel variables on the type of trauma as well as various patterns of past healthcare utilization. We believe that this model would serve great clinical utility and may help in the identification of proper intervention protocols to reduce unnecessary utilization of the ED. We believe this work is important in three ways. (1) It provides a simple indication that there are simple risk factors (certain types of trauma) for which the risk of a return visit is high. This informs the ED provider, but the result is a long list that we would not expect providers to memorize among all the important facets of patient care. (2) The findings and corresponding models are meant to be implemented in an electronic and automated system within the EMR. This way, providers do not need to memorize or recall any of the results of the study unless a patient is at high risk of a return visit. Such an automated system would include the risk factors contributing to the patient's high risk. (3) This work provides an incremental addition to the literature on which other investigators and researchers can build.

Authors' contributions LE, WF, and DG conceived of the study. EY, LE, and CR conducted statistical and machine learning development. WF and DG provided clinical interpretation and guidance. All authors contributed to the drafting of the manuscript and its revisions. The authors read and approved the final manuscript.

Funding None
Availability of data and materials The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Ethics approval and consent to participate
This study was approved by the CHOC Children's Institutional Review Board (IRB 180857).

Consent for publication
All authors approve the submission of the article.