
External validation of a prediction model for surgical site infection after thoracolumbar spine surgery in a Western European cohort



A prediction model for surgical site infection (SSI) after spine surgery was developed in 2014 by Lee et al. This model was developed to compute an individual estimate of the probability of SSI after spine surgery based on the patient’s comorbidity profile and invasiveness of surgery. Before any prediction model can be validly implemented in daily medical practice, it should be externally validated to assess how the prediction model performs in patients sampled independently from the derivation cohort.


We included 898 consecutive patients who underwent instrumented thoracolumbar spine surgery.

We quantified overall performance using Nagelkerke’s R2 statistic and discriminative ability as the area under the receiver operating characteristic curve (AUC). To judge prediction accuracy, we computed the slope of the calibration plot.


Sixty patients developed an SSI. The overall performance of the prediction model in our population was poor: Nagelkerke’s R2 was 0.01. The AUC was 0.61 (95% confidence interval (CI) 0.54–0.68). The estimated slope of the calibration plot was 0.52.


The previously published prediction model showed poor performance in our academic external validation cohort. To predict SSI after instrumented thoracolumbar spine surgery for the present population, a better fitting prediction model should be developed.


Surgical site infection (SSI) after spinal fusion can have devastating consequences and morbidity that may yield substantial physical limitations with a distinct decrease in quality of life and overall increased health care costs [1]. SSIs can be difficult both to diagnose and to treat. One or more operative debridements combined with prolonged antibiotic treatment may be necessary to eradicate the infection [1,2,3,4].

In spine surgery, a relatively high incidence of SSIs of up to 12% is observed, depending on diagnosis, surgical approach, the use of spinal instrumentation, and the complexity of the procedure [5,6,7,8]. Prior research identified several factors associated with an increased risk of SSI: advanced age, obesity, diabetes, smoking, malnutrition, and prolonged duration of surgery [5, 6, 9,10,11]. Most of these risk factors are quantified as a relative risk or odds ratio. Such measures are difficult to use during preoperative workup to estimate an individual patient’s risk of postoperative SSI and to personalize decision-making.

A prediction model is an appropriate tool for shared decision-making during workup to evaluate the individual risk of SSI after spinal surgery and possibly to prevent SSI and its devastating consequences by taking measures before and during surgery [1]. Lee et al. developed a prediction model for SSI after spine surgery that was derived from a surgical spine register of the USA (the Spine End Results Registry). This model was developed to compute an individual estimate of the probability of SSI after spine surgery based on the patient’s comorbidity profile and invasiveness of surgery [11].

A prediction model is most valuable when it is generally applicable. However, before any prediction model can be validly implemented in daily medical practice, it should be externally validated to assess how the prediction model performs in patients sampled independently from the derivation cohort. To the best of our knowledge, the prediction model of Lee et al. has never been externally validated. The aim of the present study was to externally validate the prediction model by Lee et al. in a Western European cohort of patients who received instrumented thoracolumbar spine surgery.


Study population

For the external validation, we used the data from a prospective cohort of patients > 18 years who underwent instrumented spine surgery from January 1999 up to January 2016 in the Maastricht University Medical Centre.

All operations were performed by three experienced orthopedic surgeons specialized in spine surgery. In some cases, neurosurgeons participated in the operation. All patients underwent an instrumented posterior (posterolateral or interbody) spinal fusion of the thoracolumbar spine with or without an additional procedure (anterior fusion or release, spinal decompression, removal of instrumentation, tumor resection or (partial) corpectomy).

Patients were followed for a minimum of 1.5 years after the index operation to monitor all complications and outcomes of the procedure. All complications, extensive demographics, comorbidity, and surgical details were recorded by collecting data from all electronic and paper records of the patients. For the preexisting medical comorbidities that were used in the prediction model of Lee et al. (congestive heart failure, diabetes, rheumatoid arthritis), we used the following definitions:

Congestive heart failure—a proven decrease of ejection fraction of the heart on ultrasonography and all conditions that decrease the ejection fraction of the heart, including myocardial infarction, angina pectoris, and mitral valve disease in medical history

Diabetes mellitus—insulin-dependent and insulin-independent diabetes mellitus

Rheumatoid arthritis—rheumatoid arthritis, ankylosing spondylitis or psoriatic arthritis that had been officially diagnosed by a rheumatologist

We calculated the surgical invasiveness index (SII), as used by Lee et al., for all patients. This index is a validated instrument with a range from 0 to 48 points and is the sum of six weighted surgical components: the number of levels decompressed, fused, and instrumented anteriorly, and the number of levels decompressed, fused, and instrumented posteriorly. The weight for each component is the number of vertebral levels at which that component has been performed [12].
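As a minimal sketch of the index described above, the SII can be computed as the sum of six per-component level counts. The class and field names are hypothetical and chosen for illustration only; they do not come from Lee et al. or from the original SII publication.

```python
from dataclasses import dataclass


@dataclass
class SpineProcedure:
    """Number of vertebral levels at which each surgical component was performed."""
    anterior_decompressed: int = 0
    anterior_fused: int = 0
    anterior_instrumented: int = 0
    posterior_decompressed: int = 0
    posterior_fused: int = 0
    posterior_instrumented: int = 0


def surgical_invasiveness_index(procedure: SpineProcedure) -> int:
    """Sum the six components; each contributes its number of levels."""
    return (
        procedure.anterior_decompressed
        + procedure.anterior_fused
        + procedure.anterior_instrumented
        + procedure.posterior_decompressed
        + procedure.posterior_fused
        + procedure.posterior_instrumented
    )


# Hypothetical example: posterior decompression at 2 levels,
# posterior fusion and instrumentation at 3 levels each.
example = SpineProcedure(
    posterior_decompressed=2, posterior_fused=3, posterior_instrumented=3
)
sii = surgical_invasiveness_index(example)
```

For the hypothetical procedure shown, the index is 2 + 3 + 3 = 8 points.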

The primary outcome of interest was SSI. The diagnosis of surgical site infection in our patient cohort was based on the CDC (Centers for Disease Control and Prevention) criteria [13] and the Dutch national PREZIES (prevention of hospital infections through surveillance) network [14]. An SSI was considered to be deep if it presented at the site of the operation with involvement of the subfascial tissues. This definition is independent of return to the operating room for irrigation and debridement, in contrast to the definition used by Lee et al., who defined SSI as an infection requiring return to the operating room. We included all deep infections, even those we did not treat with a re-operation because of terminal illness. All patients had an outpatient appointment at 1 year after the index operation to be registered as “SSI” or “No SSI.”

Statistical analysis

For predictor variables that were incomplete, we used stochastic regression imputation. This ensures all observed data can be used for the analysis, preventing a potentially considerable loss of statistical precision. We used predictive mean matching to draw the values to be imputed.

Prediction model

The prediction model of Lee et al. was based on the data of the Spine End Results Registry (SERR). This is a prospectively collected registry for all surgical spine patients at the University of Washington and Harborview Medical Center who underwent surgery from January 1, 2003, to December 31, 2004. This cohort included 1745 patients. One thousand five hundred thirty-two patients were included and were followed for adverse events. Seven hundred thirty-eight (48%) patients consented to provide detailed questionnaires of their risk factors. In 794 (52%) patients, some information about their risk factors, such as smoking status and alcohol use, was missing, as the data for these patients were found either by notification from hospital staff or by medical record review.

The prediction model consisted of seven predictor variables, i.e., body mass index (BMI) classified as normal (18.5 ≤ BMI < 25.0), underweight (BMI < 18.5), overweight (25.0 ≤ BMI < 30.0), and obese (BMI ≥ 30) (in the original article, it was not clear whether a BMI of 30.0 would classify as overweight or obese, so we included it in the obese range), diagnosis group (degenerative, trauma, or other), SII score, congestive heart failure (yes or no), diabetes (yes or no), rheumatoid arthritis (yes or no), and age.
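The BMI classification above, including the choice to place a BMI of exactly 30.0 in the obese range, can be sketched as a small function. The function name and the returned labels are illustrative assumptions.

```python
def bmi_category(bmi: float) -> str:
    """Classify BMI using the cut-offs stated in the text."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    # A BMI of exactly 30.0 is treated as obese, as chosen in this study.
    return "obese"
```

With these boundaries, `bmi_category(30.0)` returns `"obese"` and `bmi_category(18.5)` returns `"normal"`.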

In order to derive the prediction formula, we needed regression coefficients, including the intercept. These parameters were not published in the manuscript nor could they be retrieved from the website, or from the authors. Therefore, we took the natural logarithm of the odds ratios presented in the manuscript. These can be used to compute a risk score that ranks patients according to their risk but that does not yield the probability of an SSI. In addition, we used our own cohort to estimate the intercept so that the average predicted probability is exactly the same as the frequency of SSI. After obtaining all regression coefficients, including the intercept, we computed each individual’s probability of an SSI using the standard logistic regression formula.
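The two steps described above can be sketched as follows: odds ratios are log-transformed into coefficients, and the intercept is then chosen so that the mean predicted probability equals the observed SSI frequency. The odds ratios and patients below are hypothetical and are not the published values. The bisection search is an assumption made for this sketch; the text does not state which numerical method was used, and a logistic regression with the risk score as an offset would be an equivalent alternative.

```python
import math


def coefficients_from_odds_ratios(odds_ratios):
    """Convert odds ratios to logistic regression coefficients (natural log)."""
    return {name: math.log(value) for name, value in odds_ratios.items()}


def risk_score(coefficients, patient):
    """Sum of coefficient * predictor value; predictors absent from `patient` count as 0."""
    return sum(coef * patient.get(name, 0.0) for name, coef in coefficients.items())


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def estimate_intercept(scores, observed_rate, lower=-20.0, upper=20.0, tolerance=1e-10):
    """Bisection search for the intercept at which the mean predicted
    probability equals the observed event rate."""
    while upper - lower > tolerance:
        middle = (lower + upper) / 2.0
        mean_predicted = sum(sigmoid(middle + s) for s in scores) / len(scores)
        if mean_predicted < observed_rate:
            lower = middle
        else:
            upper = middle
    return (lower + upper) / 2.0


# Hypothetical odds ratios and patients, for illustration only.
example_odds_ratios = {"diabetes": 2.0, "chf": 3.0}
coefs = coefficients_from_odds_ratios(example_odds_ratios)
patients = [{"diabetes": 1}, {"chf": 1}, {}, {"diabetes": 1, "chf": 1}]
scores = [risk_score(coefs, p) for p in patients]
intercept = estimate_intercept(scores, observed_rate=0.067)
```

Bisection works here because the mean predicted probability increases monotonically with the intercept, so there is exactly one value that matches the observed rate.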

Prediction model performance

We quantified the external validity of the prediction model by computing measures of overall performance, discriminative ability, and calibration. To quantify overall performance, we computed Nagelkerke’s R2 statistic. Nagelkerke’s R2 is a pseudo-R2 measure for binary outcomes.
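One common way to compute Nagelkerke’s R2 is to rescale the Cox and Snell R2 by its maximum attainable value, comparing the log-likelihood of the predictions with that of a null model that assigns every patient the observed event rate. The sketch below assumes this definition; the text does not specify the exact computation used, and the function names are illustrative.

```python
import math


def log_likelihood(outcomes, probabilities):
    """Bernoulli log-likelihood of binary outcomes given predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probabilities)
    )


def nagelkerke_r2(outcomes, probabilities):
    """Cox and Snell R2 divided by its maximum attainable value."""
    n = len(outcomes)
    event_rate = sum(outcomes) / n
    ll_null = log_likelihood(outcomes, [event_rate] * n)
    ll_model = log_likelihood(outcomes, probabilities)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell
```

Predictions that are no more informative than the overall event rate give a value of 0, and values close to 0 indicate that the predictions explain almost none of the variation in outcome.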

The prediction model’s discriminative ability was quantified as the area under the receiver operating characteristic (ROC) curve (AUC). It can be interpreted as the proportion of randomly drawn pairs of patients, one who developed an SSI and one who did not, in which the patient who developed an SSI has the higher predicted probability. For a useful model, it ranges from 0.5 (no better than chance) to 1.0 (perfect discrimination); the higher the value, the better the prediction model’s discriminative ability. As a sensitivity analysis, we computed the AUC on our sample after excluding deep infections that we did not treat with a re-operation, as they would not have been regarded as events according to the definition in the study by Lee et al.
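The pairwise interpretation of the AUC can be made concrete with a short sketch that counts concordant pairs directly. Counting ties as half a concordant pair is a standard convention assumed here; the function name is illustrative.

```python
def auc_by_pairs(outcomes, probabilities):
    """Proportion of (event, non-event) pairs in which the event has the
    higher predicted probability; ties count as half."""
    events = [p for y, p in zip(outcomes, probabilities) if y == 1]
    non_events = [p for y, p in zip(outcomes, probabilities) if y == 0]
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predictions for four patients, two of whom developed an SSI.
example_auc = auc_by_pairs([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In this toy example, three of the four possible pairs are concordant, so the AUC is 0.75. The quadratic loop is adequate for a cohort of this size; rank-based formulas give the same result faster on large datasets.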

Calibration refers to the agreement between predicted and observed probabilities. We visually inspected the calibration plot to assess whether the prediction model over- or underestimates actual risk for certain risk-based subgroups, and we computed the calibration slope, which ideally should be 1.
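A calibration slope is commonly estimated by fitting a logistic regression of the observed outcome on the model’s linear predictor and reading off the coefficient of the linear predictor. The sketch below assumes that approach and fits the two-parameter model with Newton-Raphson in plain Python; the function name, iteration limit, and toy data are illustrative assumptions.

```python
import math


def calibration_intercept_and_slope(outcomes, linear_predictors, max_iterations=50):
    """Fit logit(P(y = 1)) = a + b * LP by Newton-Raphson and return (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(max_iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for y, x in zip(outcomes, linear_predictors):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p          # gradient with respect to a
            g1 += (y - p) * x    # gradient with respect to b
            h00 += w             # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det
        step_b = (h00 * g1 - h01 * g0) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < 1e-12:
            break
    return a, b


# Toy data: 10 patients with LP = -2 (2 events) and 10 with LP = 0 (5 events).
toy_lp = [-2.0] * 10 + [0.0] * 10
toy_y = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5
toy_intercept, toy_slope = calibration_intercept_and_slope(toy_y, toy_lp)
```

A slope below 1 means the predictions are too extreme: high predicted risks are too high and low predicted risks are too low.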


The cohort comprised a total of 949 patients. Fifty-one patients were excluded: 9 patients had been diagnosed before the index operation with an infection after previous back surgery, and 42 patients were excluded because there was too little information to impute. We included a total of 898 participants for the external validation, of whom 60 (6.7%) were subsequently diagnosed with an SSI, including two deep infections not treated with a re-operation because of terminal illness. Table 1 shows baseline characteristics of all patients included in the study. The predictor variable with the highest number of missing values in our dataset before imputation was BMI (52 missing, or 5.7%). All other predictor variables were completely observed. After imputation, all records could be used for the analysis.

Table 1 Baseline characteristics of all patients included in the study

The back-transforming of the odds ratios published by Lee et al. and the estimation of the intercept based on the present cohort yielded the following formula for the prediction of the probability of an SSI after spinal surgery:

Probability of SSI after spinal surgery = 1/(1 + e^{−LP}), in which LP = − 3.73 + 1.12*CHF + 0.74*diabetes + 0.70*rheumatoid arthritis + 0.06*SII + 0.002*age + 0.48*trauma − 0.09*other + 0.79*underweight − 0.14*overweight + 0.34*obese.

For example, the probability of developing an SSI after spinal surgery for a 65-year-old overweight male who has no comorbidities, who will be operated upon due to trauma, and who has an SII score of 10 is computed as follows:

LP = − 3.73 + 1.12*0 + 0.74*0 + 0.70*0 + 0.06*10 + 0.002*65 + 0.48*1 − 0.09*0 + 0.79*0 − 0.14*1 + 0.34*0 = − 2.66. Hence, the probability of SSI after spinal surgery = 1/(1 + e^{2.66}) = 0.065 = 6.5%.
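The worked example can be reproduced with a few lines of code. The coefficients are those of the formula given in the text; the dictionary keys and function names are hypothetical labels chosen for this sketch.

```python
import math

# Coefficients taken from the formula in the text.
INTERCEPT = -3.73
COEFFICIENTS = {
    "chf": 1.12,
    "diabetes": 0.74,
    "rheumatoid_arthritis": 0.70,
    "sii": 0.06,
    "age": 0.002,
    "trauma": 0.48,
    "other": -0.09,
    "underweight": 0.79,
    "overweight": -0.14,
    "obese": 0.34,
}


def linear_predictor(patient):
    """Intercept plus coefficient * value; predictors not supplied count as 0."""
    return INTERCEPT + sum(
        coef * patient.get(name, 0) for name, coef in COEFFICIENTS.items()
    )


def ssi_probability(patient):
    """Standard logistic transformation of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(patient)))


# 65-year-old overweight trauma patient, no comorbidities, SII score of 10.
example_patient = {"sii": 10, "age": 65, "trauma": 1, "overweight": 1}
print(round(linear_predictor(example_patient), 2))  # -2.66
print(round(ssi_probability(example_patient), 3))   # 0.065
```

Sex does not appear among the predictors, so the result is the same for a male or female patient with otherwise identical characteristics.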

Prediction model performance

This model was subsequently externally validated. The overall performance was poor: Nagelkerke’s R2 was only 0.01, indicating poor predictive strength. The AUC of the model by Lee et al. applied to our cohort was 0.61 (95% confidence interval (CI) 0.54–0.68), indicating only mediocre discriminative ability (see Fig. 1). Only two patients had a deep infection but were not subsequently re-operated because of terminal illness. In the sensitivity analysis in which we excluded them from the analysis, the AUC did not differ substantially; the AUC was 0.62 (95% CI, 0.55–0.69).

Fig. 1 ROC curve of the prediction model by Lee et al. used to predict SSI

The calibration plot is shown in Fig. 2. The risks of patients at high risk (say, 20% or higher) are on average severely overestimated, as indicated by the fact that the curve lies far beneath the 45° line of perfect calibration. For example, of all patients who had an estimated probability of SSI of about 30%, only 10% actually developed SSI. The estimated slope of the calibration plot was 0.52 compared to an ideal value of 1.

Fig. 2 Calibration plot of the prediction model by Lee et al. used to predict SSI


We externally validated a previously published prediction model for SSI after spine surgery after back-transforming the published ORs and estimating an intercept specific for our site. The prediction model performed poorly on overall fit, discriminative ability, and calibration. Previously developed models often perform worse than expected on future patients, especially on patients from different settings. One explanation could be that there is a significant difference in the rate of SSI between our cohort (6.7%) and the cohort of Lee et al. (4.3%), which may have been caused by a difference in patient population. In contrast to Lee et al., we solely included instrumented spinal procedures, which are known to have a higher infection rate [15]. Lee et al. included patients from the Spine End Results Registry (SERR), a database that also contains patients operated on without instrumentation [16]. The average SII score in our sample was 1.8 points higher than in the sample of Lee et al. Our procedures were probably more invasive because we solely included instrumented procedures and more long-trajectory fusion procedures (e.g., scoliosis). Cizik et al. concluded that surgical invasiveness is the strongest risk factor for SSI after spine surgery, even after adjusting for medical comorbidities, age, and other known risk factors [16]. Lee et al. included a higher percentage of men (57%) than we did in our population (49%). It has been reported that female sex is a predictor of surgical site infection after spine surgery [17, 18]. The mean age (49.5 vs. 52.2 years; SD 16.1) and the mean body mass index (27.7 vs. 26.1; SD 4.7) were approximately the same in the two populations. The diagnosis for the index operation was also broadly similar: 54.9% of our population was treated for a degenerative condition (de novo degenerative scoliosis, degenerative spinal cord compression disorder, or one- or two-level degenerative lumbar disc disease), followed by 22.1% for trauma, as compared to 64.7% and 24.3% of the population of Lee et al., respectively. All our operations were performed using a posterior approach, combined with an anterior approach in 2.8%. This is in contrast to the population of Lee et al., in which a posterior approach was used in 58.7% and a combined approach in 22.8%.

In both studies, surgical site infection may have been underdiagnosed because patients may have been treated elsewhere for SSI without this being recorded in the database. In our study, this would only have been possible for an SSI occurring more than 1 year after the index operation, because we registered the infection status of all patients at the 1-year follow-up visit at the outpatient clinic.

A limitation of this external validation is the potential lack of similarity of definitions of predictor variables. Despite several attempts by our study group to contact them by mail, the authors of the prediction model were not able to inform us about their methods. In addition, the incidence of preexisting medical comorbidities as used in the prediction model of Lee et al. (congestive heart failure, rheumatoid arthritis, and diabetes) could not be compared because these were not further specified in the article. A second limitation is the sample size of our cohort. Even though the absolute size is quite large, the number of events (SSI) is only 60. A previous study suggests using at least 100 events and 100 non-events for an external validation study [19]. Therefore, our results may be less precise.

In prior research, more risk factors for SSI after spine surgery were identified than were used in the prediction model of Lee et al. In our opinion, some of these factors would be important to include in a model for SSI following (instrumented) surgery of the thoracolumbar spine: smoking, alcohol use, and previous spine surgery [5, 6, 20]. These factors are important in shared decision-making and communication with patients undergoing spinal surgery because some of them, such as smoking behavior, can be modified during workup.


The model presented by Lee et al. shows poor predictive performance in our cohort of Western European patients undergoing instrumented spinal surgery. For valid and accurate prediction of SSI after instrumented spine surgery in an academic center, a better prediction model should be developed, preferably with more and better-defined risk factors described earlier in the literature, and for a patient population that is more comparable with the population in our academic spine center. Once developed, such a model should also be externally validated in similar populations before it is used as a broader, more general model. Because of the low incidence of SSI in spine surgery, high-volume national and international registry data could be a valuable tool for validating new models, as they allow comparison of factors such as diagnosis, operations, comorbidity, and incidence of infection in large patient populations.



AUC: Area under the receiver operating characteristic curve

BMI: Body mass index

CDC: Centers for Disease Control and Prevention

CHF: Congestive heart failure

LP: Linear prediction

METC: Medical Ethics Committee

OR: Odds ratio

PREZIES: Prevention of hospital infections through surveillance

ROC: Receiver operating characteristic

SD: Standard deviation

SERR: The Spine End Results Registry

SII: Surgical invasiveness index

SSI: Surgical site infection

USA: United States of America

WMO: Medical Research Involving Human Subjects Act


  1. Godil SS, et al. Comparative effectiveness and cost-benefit analysis of local application of vancomycin powder in posterior spinal fusion for spine trauma: clinical article. J Neurosurg Spine. 2013;19(3):331–5.


  2. Fang XT, Wood KB. Management of postoperative instrumented spinal wound infection. Chin Med J. 2013;126(20):3817–21.


  3. Collins I, et al. The diagnosis and management of infection following instrumented spinal fusion. Eur Spine J. 2008;17(3):445–50.


  4. Chen SH, et al. Postoperative wound infection after posterior spinal instrumentation: analysis of long-term treatment outcomes. Eur Spine J. 2015;24(3):561–70.


  5. Schimmel JJ, et al. Risk factors for deep surgical site infections after spinal fusion. Eur Spine J. 2010;19(10):1711–9.


  6. Fang A, et al. Risk factors for infection after spinal surgery. Spine (Phila Pa 1976). 2005;30(12):1460–5.


  7. Weinstein MA, McCabe JP, Cammisa FP Jr. Postoperative spinal wound infection: a review of 2,391 consecutive index procedures. J Spinal Disord. 2000;13(5):422–6.


  8. Sierra-Hoffman M, et al. Postoperative instrumented spine infections: a retrospective review. South Med J. 2010;103(1):25–30.


  9. Glotzbecker MP, et al. What’s the evidence? Systematic literature review of risk factors and preventive strategies for surgical site infection following pediatric spine surgery. J Pediatr Orthop. 2013;33(5):479–87.


  10. Ho C, Sucato DJ, Richards BS. Risk factors for the development of delayed infections following posterior spinal fusion and instrumentation in adolescent idiopathic scoliosis patients. Spine (Phila Pa 1976). 2007;32(20):2272–7.


  11. Lee MJ, et al. Predicting surgical site infection after spine surgery: a validated model using a prospective surgical registry. Spine J. 2014;14(9):2112–7.


  12. Mirza SK, et al. Towards standardized measurement of adverse events in spine surgery: conceptual model and pilot evaluation. BMC Musculoskelet Disord. 2006;7:53.


  13. Mangram AJ, et al. Guideline for prevention of surgical site infection, 1999. Hospital infection control practices advisory committee. Infect Control Hosp Epidemiol. 1999;20(4):250–78. quiz 279-80


  14. Geubbels EL, et al. An operating surveillance system of surgical-site infections in the Netherlands: results of the PREZIES national surveillance network. Preventie van Ziekenhuisinfecties door surveillance. Infect Control Hosp Epidemiol. 2000;21(5):311–8.


  15. Subramanyam R, et al. Systematic review of risk factors for surgical site infection in pediatric scoliosis surgery. Spine J. 2015;15(6):1422–31.


  16. Cizik AM, et al. Using the spine surgical invasiveness index to identify risk of surgical site infection: a multivariate analysis. J Bone Joint Surg Am. 2012;94(4):335–42.


  17. Wang T, et al. Factors predicting surgical site infection after posterior lumbar surgery: a multicenter retrospective study. Medicine (Baltimore). 2017;96(5):e6042.


  18. Lieber B, et al. Preoperative predictors of spinal infection within the National Surgical Quality Inpatient Database. World Neurosurg. 2016;89:517–24.


  19. Vergouwe Y, et al. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol. 2005;58(5):475–83.


  20. Chaichana KL, et al. Risk of infection following posterior instrumented lumbar fusion for degenerative spine disease in 817 consecutive cases. J Neurosurg Spine. 2014;20(1):45–52.



Availability of data and materials

The data that support the findings of this study are available from the corresponding author on reasonable request.

Author information




Bd’A and DJ selected and completed all the data. SvK and DJ analyzed the patient data. SvK did the statistical analyses for the external validation. PW and DJ interpreted the results of the analyses. PW, DJ, and SvK wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Daniël M. C. Janssen.

Ethics declarations

Ethics approval and consent to participate

The Medical Ethics Committee (METC) azM/UM Maastricht confirmed on October 10, 2016, that the Medical Research Involving Human Subjects Act (WMO) does not apply to this study, with reference number 16-4-177. An official approval of this study by the Medical Ethics Committee is not required.

Consent for publication

Written consent to enter the study was obtained from all participants.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


Cite this article

Janssen, D.M.C., van Kuijk, S.M.J., d’Aumerie, B.B. et al. External validation of a prediction model for surgical site infection after thoracolumbar spine surgery in a Western European cohort. J Orthop Surg Res 13, 114 (2018).
