Development of a Clinical Decision Rule for the Early Safe Discharge of Patients with Mild Traumatic Brain Injury and Findings on Computed Tomography Brain Scan: A Retrospective Cohort Study

International guidelines recommend routine hospital admission for all patients with mild traumatic brain injury (TBI) who have injuries on computed tomography (CT) brain scan. Only a small proportion of these patients require neurosurgical or critical care intervention. We aimed to develop an accurate clinical decision rule to identify low-risk patients safe for discharge from the emergency department (ED) and facilitate earlier referral of those requiring intervention. A retrospective cohort study of case notes of patients admitted with initial Glasgow Coma Scale 13–15 and injuries identified by CT was completed. Data on a primary outcome measure of clinically important deterioration (indicating need for hospital admission) and secondary outcome of neurosurgery, intensive care unit admission, or intubation (indicating need for neurosurgical admission) were collected. Multi-variable logistic regression was used to derive models and a risk score predicting deterioration using routinely reported clinical and radiological candidate variables identified in a systematic review. We compared the performance of this new risk score with the Brain Injury Guideline (BIG) criteria, derived in the United States. A total of 1699 patients were included from three English major trauma centers. A total of 27.7% (95% confidence interval [CI], 25.5–29.9) met the primary and 13.1% (95% CI, 11.6–14.8) met the secondary outcomes of deterioration. The derived clinical decision rule suggests that patients with simple skull fractures or intracranial bleeding <5 mm in diameter who are fully conscious could be safely discharged from the ED. The decision rule achieved a sensitivity of 99.5% (95% CI, 98.1–99.9) and specificity of 7.4% (95% CI, 6.0–9.1) to the primary outcome. The BIG criteria achieved the same sensitivity, but lower specificity (5%). Our empirical models showed good predictive performance and outperformed the BIG criteria. 
This would potentially allow ED discharge of 1 in 20 patients currently admitted for observation. However, prospective external validation and economic evaluation are required.


Introduction
Over 1.4 million patients annually attend emergency departments (EDs) in the UK following head injury, of whom 95% have a normal or mildly impaired conscious level at presentation (Glasgow Coma Scale [GCS] score of 13-15). 1 The majority of ED computed tomography (CT) scans for diagnosing TBI are conducted in these patients with apparently mild injury. In this group, the prevalence of brain injuries, skull fractures, and intracranial bleeding is 7%, while only 1% of CT scans identify life-threatening TBI. 2 The management of patients with mild TBI and injuries identified by CT imaging is controversial. Some centers advocate that all patients should be admitted under specialist neurosurgical care and undergo repeat CT imaging. 3,4 The Brain Injury Guideline (BIG) criteria, a consensus-derived risk tool currently used in some centers in the United States, advocate the discharge of selected GCS 13-15 patients from the ED with injuries on CT (Supplementary Material 1). 5 We recently published a systematic review of predictors of deterioration in this cohort identifying some single factors associated with deterioration, but there was no good empirical evidence to guide post-imaging management in this group. 4 In England, national (National Institute of Health and Clinical Excellence) TBI guidelines recommend that patients with TBI identified by CT are admitted to the hospital. 1 However, they do not define which injuries are clinically significant and which patients benefit from specialist neurosurgical care. Other guidelines used internationally also recommend routine hospital admission for this group. 4 There has been a paucity of research to inform the admission and referral decisions for these TBI patients with apparently mild injuries but abnormalities on CT scan. 6 Prediction modeling may help identify low-risk patients who could be safely discharged from the ED. Modeling may also facilitate earlier identification of patients requiring neurosurgical intervention.
The study aims were to:
1. Estimate the prevalence of clinically important deterioration in GCS 13-15 patients with traumatic CT abnormalities.
2. Develop prediction models for patient deterioration that could be used to inform hospital admission and specialist referral.
3. Compare the performance of an empirically derived prediction model with the BIG criteria.

Study design
We conducted a retrospective cohort study using case-note review of TBI patients presenting to the ED between 2010 and 2017 at three major trauma centers in England: Hull University Teaching Hospital NHS Trust, Salford Royal NHS Foundation Trust, and Addenbrooke's Hospital (Cambridge University Hospitals NHS Foundation Trust). A detailed study protocol has previously been published. 6 The study was conducted and is reported in accordance with international guidelines for prognostic research. 7

Study population
Population selection. Within each study center ED, CT brain scan requests and reports were screened to identify patients with traumatic findings presenting between 2010 and 2017. Patients were matched to case records and, if they met the inclusion criteria, data were extracted on patient deterioration outcomes and candidate predictors (see below).

Inclusion criteria
Patients ≥16 years of age with a presenting GCS 13-15 who attended the ED after acute TBI and had injuries reported on CT brain scan were included. The latter was defined as: skull fractures, extradural hemorrhage, subdural hemorrhage with an acute component, intracerebral hemorrhage, contusions, subarachnoid hemorrhage, and intraventricular hemorrhage. Intracerebral, intraventricular, and subarachnoid hemorrhages were considered traumatic in etiology when a mechanism of injury or injuries indicating trauma were recorded.

Exclusion criteria
Patients were excluded where: a non-traumatic cause of intracranial hemorrhage was indicated, a pre-existing CT abnormality prevented determining whether acute injury had occurred, or they had been transferred from other hospitals.

Outcomes
Primary outcome. Deterioration up to 30 days after ED attendance was used, which was a composite including: death attributable to TBI, neurosurgery, seizure, a drop in GCS >1, intensive care unit (ICU) admission for TBI, intubation, or hospital readmission for TBI. Where reason for death, ICU admission, or readmission was unknown, it was attributed to TBI deterioration.
Secondary outcome. A composite measure indicating need for neurosurgical specialist admission was used, including: neurosurgery, ICU admission for TBI, or intubation up to 30 days after ED attendance.

Predictors
Pre-injury anticoagulant and antiplatelet therapy were combined in a variable with two categories: 1) no therapy and 2) use of either or both medications (exploratory multi-variable modeling indicated they had similar effect sizes). Comorbidity was measured using the trauma modified Charlson comorbidity index. 8 Rockwood Frailty Scale scores were assigned to patients >50 years of age using information in the case notes and data collapsed into established categories. 9,10 Supplementary Material 2 outlines how injuries described in written CT reports were categorized. Injury severity was coded using the Abbreviated Injury Scale (AIS), injury size, and presence of midline shift or mass effect. AIS codes were mapped to the Marshall classification using the method described by Lesko and colleagues and the description of midline shift. 11 An additional category of severity of up to two injuries with a combined maximal diameter <5 mm was added. TBI severity, as measured by the Marshall classification, 11 was assessed for inclusion in the final model alongside type of hemorrhage, contusion or skull fracture present, and total number of injuries. This allowed the independent predictive value of each of these components of the CT scan to be simultaneously assessed.

Sample size
A sample size requirement of 2000 patients was calculated using an estimated prevalence of deterioration of 10%. 6 Interim analysis found the actual prevalence of deterioration to be around 25%. Therefore, the target was revised to 1700 patients, equating to 425 events and allowing 42 candidate factors to be assessed on the basis of 10 events per factor. 12
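The revised target follows from simple events-per-factor arithmetic. The sketch below reproduces it using only the figures quoted above; the function name is illustrative and not part of the study's methods.

```python
def candidate_factors(n_patients, prevalence, events_per_factor=10):
    """Return (expected events, number of candidate factors supportable).

    Assumes the rule of thumb quoted in the text: 10 events per factor.
    """
    events = round(n_patients * prevalence)
    return events, events // events_per_factor


events, factors = candidate_factors(1700, 0.25)
print(events, factors)  # 425 events, 42 candidate factors
```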

Statistical analysis
Model selection. The primary and secondary outcomes of deterioration were modeled as binary variables using logistic regression. 13 We used stepwise selection to find the smallest number of candidate explanatory variables that accurately predict deterioration. Tables 1 and 2 summarize how candidate variables were included in modeling. For each candidate predictor, an unadjusted odds ratio was calculated.
(Table footnote: This category corresponds to Marshall Classification VI (volume >25 mL), which indicates a need for surgical evacuation.)
The extent of missing data on each candidate variable is shown in Table 1. Where medication use was undocumented, it was taken to indicate no pre-injury use. For other variables, we assumed missing data occurred at random. Twenty-five imputed data sets were created (based on missing data in around 25% of cases) using chained equations including all candidate variables and outcomes in the ICE STATA package (StataCorp LP, College Station, TX). 14 The midiagplots STATA function was used to compare the distributions of observed and imputed data. 15 Where continuous variables were non-normally distributed and implausible imputed values were generated, predictive mean matching was used. 14 Model selection was performed using multi-variable backward elimination with a statistical significance threshold of 0.1. All candidate predictors were initially included and imputed data sets were combined using Rubin's rules at each stage of model selection. For candidate continuous variables, rather than assume a linear relationship, the best predictive form was explored with the MFPMI function using backward elimination for fractional polynomial functions in multi-variable modeling. 16,17 Fractional polynomials were limited to 2 degrees of freedom when predicting the secondary outcome.
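As an illustration of how estimates from imputed data sets are combined, the following is a minimal Python sketch of Rubin's rules for a single coefficient. It is not the study's STATA code; the function name and the example values are hypothetical, and three imputations are used only for brevity (the study used 25).

```python
import math


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: the coefficient estimated in each imputed data set
    variances: the squared standard error from each imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled = sum(estimates) / m
    within = sum(variances) / m  # average within-imputation variance
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between  # total variance
    return pooled, math.sqrt(total)


# Hypothetical log odds ratios and variances from three imputed data sets.
estimate, se = pool_rubin([0.50, 0.54, 0.46], [0.010, 0.012, 0.011])
print(round(estimate, 3), round(se, 3))
```

The between-imputation term inflates the standard error to reflect uncertainty about the missing values, which is why pooled intervals are wider than those from any single imputed data set.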
Model performance. Model fit was assessed using the Brier score averaged across imputed data sets. 18 A score of 0 implies perfect prediction and 0.25 no predictive value.
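To make the two reference values concrete, here is a minimal sketch of the Brier score as the mean squared difference between predicted probability and observed outcome. The data are invented for illustration; the 0.25 benchmark corresponds to predicting a probability of 0.5 for every patient.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted risk and observed outcome (0/1)."""
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


outcomes = [1, 0, 0, 1]  # hypothetical: 1 = deteriorated, 0 = did not
print(brier_score([1.0, 0.0, 0.0, 1.0], outcomes))  # perfect prediction
print(brier_score([0.5, 0.5, 0.5, 0.5], outcomes))  # uninformative prediction
```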
Model discrimination (how well patients with and without deterioration were distinguished) was assessed by the C-statistic, measured by combining estimates across imputed data sets using Rubin's rules. 17,19 Calibration measures how well predictions made by models match observations. 13 The calibration slope of selected predictors was calculated in each imputed data set and averaged.
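The C-statistic can be read as the proportion of patient pairs (one who deteriorated, one who did not) in which the model assigns the higher risk to the patient who deteriorated. The sketch below computes it for a single data set with invented values; it does not show the pooling across imputed data sets described above.

```python
def c_statistic(probabilities, outcomes):
    """Proportion of (deteriorated, not deteriorated) pairs ranked correctly.

    Ties in predicted risk count as half a concordant pair.
    """
    cases = [p for p, y in zip(probabilities, outcomes) if y == 1]
    controls = [p for p, y in zip(probabilities, outcomes) if y == 0]
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted risks and outcomes for five patients.
print(c_statistic([0.9, 0.4, 0.35, 0.8, 0.1], [1, 0, 1, 1, 0]))
```

A value of 0.5 indicates discrimination no better than chance and 1.0 indicates perfect separation of the two groups.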
Sensitivity analysis. Model selection and evaluation of model performance were repeated in patients with complete data.
Internal validation. Models tend to perform better on data from which they are derived (overfitting). 13 Bootstrap internal validation with 100 bootstrap samples was performed in each imputed data set to calculate the average optimism. Model selection was repeated in each bootstrap sample, and the performance of each selected model in the original data set was subtracted from its performance in the bootstrap sample. 20,21 The pooled average difference in the calibration slope between the bootstrap samples and original data was averaged across imputed data sets. This was subtracted from the original averaged calibration slope to estimate the shrinkage factor. The shrinkage factor was applied to the derived model coefficients to adjust for optimism. 13 The C statistic was adjusted for optimism using the same method.
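The optimism adjustment described above can be sketched as follows. This is a simplified illustration for one imputed data set with three bootstrap samples rather than 100; all slopes, coefficient names, and values are hypothetical, and the treatment of the model intercept is left out.

```python
def shrinkage_factor(apparent_slope, bootstrap_slopes, original_data_slopes):
    """Apparent calibration slope minus the average bootstrap optimism.

    bootstrap_slopes: slope of each bootstrap model in its own bootstrap sample
    original_data_slopes: slope of the same model applied to the original data
    """
    differences = [b - o for b, o in zip(bootstrap_slopes, original_data_slopes)]
    optimism = sum(differences) / len(differences)
    return apparent_slope - optimism


def apply_shrinkage(coefficients, factor):
    """Multiply each model coefficient by the shrinkage factor."""
    return {name: beta * factor for name, beta in coefficients.items()}


# Hypothetical values for illustration only.
factor = shrinkage_factor(1.0, [1.0, 1.0, 1.0], [0.90, 0.94, 0.92])
shrunk = apply_shrinkage({"predictor_a": 0.50, "predictor_b": -0.20}, factor)
print(round(factor, 2), {k: round(v, 3) for k, v in shrunk.items()})
```

Shrinking coefficients toward zero makes predicted risks less extreme, which is the intended correction for a model that fits its derivation data too closely.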
Mild traumatic brain injury risk score development and comparison to the Brain Injury Guideline criteria. To use our prognostic model for making clinical decisions, we derived a risk score using optimism-adjusted coefficients. 22 To make the risk score clinically interpretable, coefficients were standardised and rounded. 22 Individual patient risk scores were calculated. A risk score for ED discharge was proposed based on the trade-off between risk of deterioration in a discharged patient and number of patients admitted for observation.
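One common way to turn regression coefficients into an integer score is to divide each coefficient by a reference value and round. The sketch below assumes this approach and uses the smallest absolute coefficient as the reference; the exact standardisation used in the study is not described here, and the predictor names and coefficients are hypothetical.

```python
def points_from_coefficients(coefficients):
    """Scale coefficients by the smallest absolute value and round to integers."""
    reference = min(abs(beta) for beta in coefficients.values())
    return {name: round(beta / reference) for name, beta in coefficients.items()}


def risk_score(points, patient):
    """Sum the points for every risk factor present in a patient."""
    return sum(points[name] for name, present in patient.items() if present)


# Hypothetical optimism-adjusted coefficients (log odds ratios).
coefficients = {"factor_a": 0.62, "factor_b": 0.35, "factor_c": 0.98}
points = points_from_coefficients(coefficients)
print(points)

patient = {"factor_a": True, "factor_b": False, "factor_c": True}
print(risk_score(points, patient))
```

A patient with none of the risk factors scores 0 under this scheme, and higher totals indicate a higher predicted risk of deterioration.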
Sensitivity and specificity of the proposed discharge score and of the BIG criteria to deterioration were calculated and compared in patients with complete data for both criteria.

Ethics
NHS Research Ethics Committee approval was granted by West of Scotland REC 4 (reference: 17/WS/0204). As a retrospective case review conducted by members of the direct care team, consent was not required.
were admitted to ICU or were intubated (secondary outcome). A total of 72 patients had deaths attributable to TBI. A total of 471 patients had data missing from at least one candidate variable. Table 2 summarizes the univariable associations between candidate variables and the primary outcome. Supplementary Material 3 presents the distributions of imputed data.

Model selection
The equivalent of 41 candidate factors were assessed in multivariable modeling to predict patient deterioration, and 34 factors were assessed in modeling to predict need for neurosurgical referral. The selected model predicting the primary outcome is presented in Table 2 and the secondary outcome in Table 3. Supplementary Material 4 presents a complete case sensitivity analysis.

Model performance
Table 4 summarizes measures of model performance. The models predicting the primary and secondary outcomes had Brier scores of 0.16 and 0.09, respectively. The model predicting composite deterioration (primary outcome) had an optimism-adjusted C-statistic of 0.75, and the model predicting need for specialist neurosurgical admission had an optimism-adjusted C-statistic of 0.85. The trade-off between the sensitivity and specificity of these models is shown in the receiver operating characteristic curves in Supplementary Material 5.

The mild traumatic brain injury risk score
Table 5 presents the weighted risk score derived from our prognostic model predicting deterioration. Hemoglobin, although a statistically significant predictor in multi-variable modeling, was not included because, given its small effect size and the range of abnormal values, its inclusion did not improve performance (Supplementary Material 6). Based on the trade-off between sensitivity and specificity, a patient risk score of 0 was used as a threshold for ED discharge. Patients at this cutoff had the following characteristics: initial GCS 15, single simple skull fracture or hemorrhage <5 mm, up to two extracranial bony or organ injuries not requiring hospital admission, not anticoagulated/taking antiplatelets, no cerebellar/brainstem injuries, and normal neurological examination (Table 5). Patients with a risk score of 1-5 had a 17.5% risk of deterioration, and patients with a risk score >5 had a 54% risk of deterioration (Supplementary Material 7).
The performance of the BIG criteria and our risk score were assessed in the 1569 patients with complete data for both classification systems. A threshold of 0 in our risk score achieved a sensitivity of 99.5% (95% CI, 98.1-99.9) and specificity of 7.4% (95% CI, 6.0-9.1) to the primary outcome. The BIG criteria for discharge achieved the same sensitivity for deterioration, but lower specificity, although the confidence intervals overlap and this may be due to chance. Table 6 summarizes the characteristics of the false negatives (patients meeting the discharge threshold who deteriorated) in both approaches. No patients recommended for discharge by either criteria died or required neurosurgery, but 1 patient recommended for discharge by the BIG criteria required intubation. The BIG criteria would have allowed discharge of 57 patients (3.6%) compared to 87 patients (5.5%) with our risk score.

Summary
To our knowledge, this is the first UK study to report the risk of deterioration in all initial mild TBI patients with traumatic injuries reported on CT brain scan and the first study internationally to develop a prognostic model and risk tool for avoiding unnecessary hospital admissions. We also report the first independent validation of the BIG criteria.
The estimated prevalence of deterioration was 27.7%. Our prognostic models for composite measures of deterioration had optimism adjusted C statistics of 0.75 and 0.85, indicating good discrimination between patients with and without deterioration or need for neurosurgical care.
Using our risk score, derived from the prognostic model, to hypothetically direct the need for hospital admission, we identified that it would appear safe to discharge from the ED patients who are fully conscious (GCS 15) with no focal neurology, who are not taking anticoagulant or antiplatelet medication, and who have a single simple skull fracture or hemorrhage <5 mm (not cerebellar or brainstem) on CT brain scan and up to two extracranial bony or organ injuries not requiring hospital admission (risk score 0). This derived decision rule achieved a sensitivity of 99.5% and specificity of 7.4% for deterioration. Categorization of patients for discharge using the BIG criteria achieved the same sensitivity, but a lower specificity. The model predicting need for neurosurgical admission (based on risk of an interventional outcome) found that higher age and frailty reduce risk. This probably reflects clinical selection of patients, with frail older patients less likely to undergo invasive interventions.
(Table footnote b: Injuries exclude superficial lacerations and abrasions, and a significant extracranial injury is defined as any injury requiring inpatient care.)

Strengths
We believe this is the largest multi-center cohort study undertaken to estimate the prevalence of a composite measure of deterioration in this population. 4 The study was powered to develop a prognostic model predicting this outcome. Candidate predictor factors were selected a priori on the basis of existing literature. 6 We followed established techniques for handling missing data, prognostic modeling, and adjusting for optimism. 7,13,16,23 Unlike risk stratification systems based solely upon CT findings, 24-26 we have assessed a range of additional patient characteristics, test results, and other clinical factors for deterioration for inclusion in our model so as to achieve the maximum predictive accuracy. Our risk score is the first empirically derived scoring system which can be used to inform admission decisions in this TBI population and incorporates both patient characteristics and other clinical risk factors alongside CT findings.

Limitations
Because of the resource implications of conducting a prospective study, we pragmatically chose a retrospective study design. Around 25% of patients had missing data, but given that these data were mainly missing through poor recording or missing notes, and therefore missing at random, imputation techniques were valid. Documentation inaccuracies may have introduced random error, but are unlikely to have introduced systematic bias.
We classified TBI severity using information in written CT reports by using AIS coding to map to a modified Marshall classification. Poor reporting of the size of injuries and extent of mass effect meant most injuries were classified as equivalent to Marshall classification II. Better systematic and standardized reporting may have allowed TBI severity to be better classified and improved the performance of the derived models. We were unable to assess whether using other scoring systems to classify TBI severity, such as the Stockholm, Helsinki, or NeuroImaging Radiological Interpretation System scoring systems, would improve the performance of the derived model. 24-26 Unlike with the Marshall classification, there is no validated way to map between AIS coding and these classification systems. However, type of injury was considered for inclusion in the model, alongside the Marshall classification and number of injuries.
Outcomes were limited to those recorded in hospital records, which may mean that patient deterioration in the community was missed. However, this is unlikely, and a check in Hull of deaths recorded in patients eligible for entry on the national trauma registry (linked to Office for National Statistics mortality reporting) found no missed deaths.
We only assessed the predictive value of routinely collected factors. We could not assess the potential predictive value of using nonroutinely collected variables identified in our review 6 or biomarkers.
Although we have internally validated our derived models, they have not been externally validated. There is debate about the best way to combine imputation of missing data and internal validation bootstrapping techniques. 21 We chose to bootstrap within imputations because of lower computational complexity. This has been shown, in simulation studies, to provide accurate estimates of the shrinkage factor. 21 Other studies 27 found imputing within bootstraps better adjusts for optimism, and therefore, despite adjusting for overfitting, our models may perform less well when applied to new data.
The lower prevalence of the secondary outcome than expected means our study may not be adequately powered to derive a model accurately predicting this outcome.

Comparison with previous literature
The estimated prevalence of clinical deterioration at 27.7% was higher than previously reported. In our review, we found the pooled prevalence of clinical deterioration to be around 10%. 4 This reflects differences in study design; previous studies used narrower outcome definitions, such as neurological deterioration or ICU intervention, 4 while we used a wide composite primary outcome aimed at encompassing need for hospital admission. We assessed an unselected GCS 13-15 population, while previous studies often restricted their inclusion criteria on the basis of GCS scores, injury severity, admitting inpatient specialty, and medication use. 6 Research assessing prognostic factors in this TBI population has frequently used sample sizes based on convenience and lacked the statistical power to assess potential predictors simultaneously. 4,28 Our study was sufficiently powered to assess over 40 candidate variables in multi-variable modeling. Previous research found that initial GCS, type of brain injury, anticoagulation, and age were the strongest predictors of adverse outcomes in this population. 4 In our multi-variable model, all these factors were also found to be predictors of deterioration.
Studies evaluating the BIG criteria in the level 1 trauma center in the United States, where it is routinely applied, found that around 10% of patients met the criteria for ED discharge and no patient that met these criteria had adverse outcomes. 5,29 In our cohort, 4% of patients met the criteria for ED discharge and 2 of these patients deteriorated. Our study cohort was, on average, older and had a lower GCS than studies previously assessing the BIG criteria, which may account for the difference in performance.

Implications
Internationally, and particularly in the United States, there is wide variation in admission practices in this group with a range of specialist admission and discharge criteria used on the basis of limited evidence. 5,30-32 Accurate risk prediction has the potential to help rationalize admission decisions in this group. Between April 2014 and June 2015, around 11,000 TBI patients were admitted to specialist neurosurgical centers in the UK and over 50% of these patients had mTBI. 33 Currently, all patients with TBI identified by CT imaging are admitted to the hospital. Therefore, despite the low specificity of our model and the high false-positive rate, application of our model could improve clinical care by reducing unnecessary hospital admissions and thereby save health service resources and reduce patient inconvenience.
Our risk tool demonstrated good predictive sensitivity (99.5%) to our primary outcome at the proposed threshold for ED discharge. This would have allowed the discharge of 87 of 1569 patients (5.5%). At this sensitivity, a negative predictive value of 97.7% was achieved (an approximately 1 in 50 chance of a discharged patient deteriorating). This may not be clinically acceptable, but no patient recommended by our risk score for discharge died or required neurosurgery or an ICU intervention. One patient recommended for discharge had a report indicating a possible second lesion and therefore may have been admitted in clinical practice. The BIG criteria achieved the same sensitivity (99.5%) to the primary outcome, but its lower specificity means that clinical application would result in fewer patients being discharged.
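The relationship between sensitivity, specificity, negative predictive value, and the proportion discharged can be made explicit with a short calculation. In the sketch below, the cell counts are hypothetical: they were chosen so that they sum to 1569 and reproduce the percentages reported above, and they are not the study's published 2x2 table.

```python
def diagnostic_accuracy(tp, fn, tn, fp):
    """Accuracy measures for a rule where 'positive' means 'admit'.

    tp: admitted by the rule and deteriorated
    fn: discharged by the rule but deteriorated
    tn: discharged by the rule and did not deteriorate
    fp: admitted by the rule but did not deteriorate
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
        "discharged_fraction": (tn + fn) / total,
    }


# HYPOTHETICAL counts, back-calculated to match the reported percentages.
result = diagnostic_accuracy(tp=418, fn=2, tn=85, fp=1064)
for name, value in result.items():
    print(f"{name}: {value:.1%}")
```

With these illustrative counts, 87 of 1569 patients fall below the discharge threshold, and roughly 2 of those 87 would deteriorate, which is the source of the approximately 1 in 50 figure.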
The high predictive accuracy of our model for the secondary outcome (area under the curve = 0.85) suggests that it could be used to inform neurosurgical admissions in this population. The acceptable level of risk of requiring invasive intervention for a patient admitted under a non-specialist team is unknown and is likely to vary between centers. The lower prevalence of this outcome means that the estimated model may be less accurate, and we regard this as a starting point for further research. Both our prognostic model and the BIG criteria should be validated prospectively before they could be used in clinical practice. A prospective study design would address the weaknesses in outcome collection highlighted earlier, including assessing the predictive value of CT severity classification systems other than the Marshall classification system, and allow the inclusion of nonroutinely collected prognostic factors, including biomarkers. Improved systematic reporting of CT scans could possibly increase the predictive accuracy of our model and further increase the performance of our risk tool. 25,34 Economic evaluation is also required to comprehensively assess the implication for both patient outcomes and resource use of using the model.

Conclusion
This is the first study to empirically derive a prognostic model for patients with mTBI and injuries identified by CT imaging and independently validate the BIG criteria. Our empirically derived risk tool performed better than the BIG criteria and could be used to safely discharge from the ED 1 in 20 patients currently routinely admitted for observation. Both our prognostic model and the BIG criteria now require prospective external validation and economic evaluation.