Accuracy of MRI diagnosis of early osteonecrosis of the femoral head: a meta-analysis and systematic review

Objective To evaluate the overall diagnostic value of magnetic resonance imaging (MRI) in patients with early osteonecrosis of the femoral head. Methods Qualified studies were identified by searching multiple databases, including PubMed, Cochrane, and Embase, with index words updated to December 2017; relevant literature sources were also searched. The qualified studies included prospective cohort studies and cross-sectional studies. Heterogeneity of the included studies was assessed to select the proper effect model for pooled weighted sensitivity, specificity, and diagnostic odds ratio (DOR). Summary receiver operating characteristic (SROC) analyses were performed for early osteonecrosis of the femoral head. Results Forty-three studies on the diagnostic accuracy of MRI for detecting early osteonecrosis of the femoral head were included in the meta-analysis. The global sensitivity and specificity of MRI in early osteonecrosis of the femoral head were 93.0% (95% CI 92.0%–94.0%) and 91.0% (95% CI 89.0%–93.0%), respectively. The global positive likelihood ratio and global negative likelihood ratio of MRI in early osteonecrosis of the femoral head were 2.74 (95% CI 1.98–3.79) and 0.18 (95% CI 0.14–0.23), respectively. The global DOR was 27.27 (95% CI 17.02–43.67), and the area under the SROC curve was 93.38% (95% CI 90.87%–95.89%). Conclusions This systematic review and meta-analysis evaluated the diagnostic accuracy of MRI in early osteonecrosis of the femoral head. Moderate to strong evidence indicated that MRI is associated with high diagnostic accuracy for early osteonecrosis of the femoral head.


Background
Avascular necrosis of the femoral head (ANFH), or osteonecrosis of the femoral head, is a pathologic process that is first seen in the weight-bearing area of the femur. Stress can injure the trabecular bone structure (microfracture) and impair the repair process of the femur; if not managed in time, this leads to collapse and deformation of the femur. ANFH has many etiological factors and results from interruption of the blood supply to the bone, which leads to ischemic necrosis. ANFH can be divided into traumatic and non-traumatic ANFH, and non-traumatic ANFH is further divided into steroid-induced ANFH, alcoholic ANFH, and other types. Timely treatment of early ANFH could promote recovery from the disease. In the late stage, however, ANFH results in femoral collapse, loss of hip function, and a very poor outcome that affects quality of life. Therefore, early diagnosis of ANFH is of great significance [1][2][3].
Several methods for the early diagnosis of ANFH have been proposed, including MRI, SPECT, CT, X-ray, DSA, and laser Doppler, each with different characteristics. MRI is non-invasive, rapid, and highly sensitive, and it is commonly used by clinicians [4][5][6]. Furthermore, MRI has been used in many studies on the diagnosis of early ANFH. Therefore, in this paper, a systematic review and meta-analysis of all qualified studies was performed to explore the diagnostic accuracy of MRI in early ANFH.

Search strategy
The following electronic databases were searched from their inception to December 2017: Cochrane, PubMed, and Embase. The search targeted all qualified trials that analyzed the diagnostic accuracy of MRI for early osteonecrosis of the femoral head. Other related articles and reference materials were also screened for additional available studies. The literature was searched independently by two investigators, and a third investigator was involved to reach an agreement.

Study selection
Studies that met the following criteria were included in our review: (1) prospective cohort study or cross-sectional study; (2) the study subjects were patients with suspected early osteonecrosis of the femoral head and without other serious diseases; (3) the study provided the numbers of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN); and (4) the publication was available in English or Chinese.
Studies that met any of the following criteria were excluded from our review: (1) repeat publications, or shared content and results; (2) case reports, theoretical research, conference reports, systematic reviews, meta-analyses, expert comments, and economic analyses; (3) the outcomes were not relevant; and (4) two or more of the TP, FP, FN, and TN values were zero.

Data extraction and quality assessment
Two independent investigators extracted the following data based on predefined criteria. Differences were settled by discussion with a third reviewer. The data were extracted from all the included studies and consisted of two parts: basic information and main outcomes. The basic information comprised the author name, the sample size, the percentage of male patients, and the age. The main outcomes were the clinical outcomes. A 2 × 2 contingency table was constructed for each selected study, in which the results of the gold standard and of MRI were classified as positive or negative. The data included true positive (TP), false positive (FP), false negative (FN), and true negative (TN). In studies in which a single cell of the 2 × 2 contingency table had a value of 0, 0.5 was added to all of the cells for calculation. Sensitivity, specificity, and likelihood ratios were calculated, and the diagnostic odds ratio (DOR) was used as the measure of diagnostic accuracy. A DOR of 1 indicates a test without discriminatory power, and the higher the DOR, the better the discriminatory performance of the assessed diagnostic test. The studies were assessed by two reviewers independently, and any difference was resolved by discussion.
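The per-study measures described above follow directly from the four cells of the 2 × 2 table. The following is a minimal Python sketch of that calculation, including the 0.5 continuity correction; the function name, the result class, and the counts in the usage note are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    lr_positive: float
    lr_negative: float
    dor: float


def diagnostic_accuracy(tp: float, fp: float, fn: float, tn: float) -> Accuracy:
    """Compute accuracy measures from one 2 x 2 contingency table."""
    # Continuity correction: if a cell is zero, add 0.5 to all four cells.
    # (Studies with two or more zero cells were excluded by the review,
    # so this sketch does not treat that case specially.)
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (cell + 0.5 for cell in (tp, fp, fn, tn))

    sensitivity = tp / (tp + fn)          # diseased hips correctly detected
    specificity = tn / (tn + fp)          # healthy hips correctly cleared
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    dor = (tp * tn) / (fp * fn)           # equals lr_positive / lr_negative

    return Accuracy(sensitivity, specificity, lr_positive, lr_negative, dor)
```

For example, diagnostic_accuracy(45, 5, 5, 45) describes a hypothetical study with sensitivity and specificity of 0.9 each and a DOR of 81, while diagnostic_accuracy(20, 0, 4, 30) triggers the correction because the FP cell is zero.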

Statistical analysis
All statistical analyses were performed in Stata 10.0 (TX, USA). The chi-squared test and the I² statistic were used to assess the heterogeneity of the study results and to determine the analysis model (fixed-effects model or random-effects model). When the chi-squared test P value was ≤ 0.05 and the I² value was > 50%, heterogeneity was defined as high and the data were assessed with the random-effects model. When the chi-squared test P value was > 0.05 and the I² value was ≤ 50%, heterogeneity was defined as acceptable and the data were assessed with the fixed-effects model. For further assessment of heterogeneity, diagnostic threshold analysis was performed based on the Spearman correlation between the logit of sensitivity and the logit of [1-specificity]. When a threshold effect occurs, the sensitivity and specificity of the included studies exhibit a negative correlation (or, equivalently, a positive correlation between sensitivity and [1-specificity]). Therefore, a strong positive correlation between sensitivity and [1-specificity] suggests the presence of a threshold effect. When heterogeneity caused by a threshold effect was observed, a summary receiver operating characteristic (SROC) curve was plotted. This approach is appropriate because the global sensitivity and specificity values may be overestimated when a threshold effect is present; in such cases, analysis of the points in the ROC plane, together with the SROC curve, is recommended. Deeks' funnel plot asymmetry test was used to assess publication bias.
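The heterogeneity rule above can be illustrated with a short Python sketch. It uses the standard inverse-variance forms of Cochran's Q and I²; the exact procedure that Stata applied in this review is not stated, so this is an assumption. The inputs are per-study effect estimates (for example, log DORs) and their variances. The rule in the text does not cover mixed cases (for example, P ≤ 0.05 with I² ≤ 50%); falling back on I² alone in those cases is a choice made for this sketch only. All function names are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with df degrees of
    freedom, from the series for the regularized lower incomplete gamma
    function. Adequate for a sketch; very small tail values lose precision."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def heterogeneity(effects: list[float], variances: list[float]) -> tuple[float, float, float]:
    """Return (Q, I2 in percent, P value of Q) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2, chi2_sf(q, df)


def choose_model(p_value: float, i2: float) -> str:
    """Apply the rule from the text; mixed cases fall back on I2 (assumption)."""
    if p_value <= 0.05 and i2 > 50.0:
        return "random-effects"
    if p_value > 0.05 and i2 <= 50.0:
        return "fixed-effects"
    return "random-effects" if i2 > 50.0 else "fixed-effects"
```

Calling heterogeneity on two hypothetical studies with effects [0.0, 2.0] and variances [0.1, 0.1] gives Q = 20 and I² = 95%, for which choose_model selects the random-effects model.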

Characteristics of included studies
A total of 2092 articles were retrieved by the index words. After screening the titles and abstracts, 1986 articles were excluded, leaving 106 articles for further selection. During full-text screening, 63 articles were excluded for the following reasons: unqualified outcomes [7], theoretical research or review [8], and no clinical outcome [9]. Finally, 43 studies with 3133 hips were included in the meta-analysis. The selection process is presented in Fig. 1. The main characteristics of the included studies are summarized in Table 1. The basic information included number of hips, age, and gender.

Diagnostic accuracy
All the included studies reported the accuracy of MRI for early osteonecrosis of the femoral head. Based on the correlation (Spearman's R = −0.209, P = 0.589) between the logit of sensitivity and the logit of [1-specificity], there was no threshold effect.
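A threshold-effect check of this kind can be sketched in Python using the standard library only. The sketch computes the Spearman coefficient as the Pearson correlation of average ranks; it does not compute the P value, and the function names and the inputs in the usage note are hypothetical rather than data from the included studies.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def threshold_spearman(sensitivities: list[float], specificities: list[float]) -> float:
    """Spearman correlation between logit(sensitivity) and logit(1 - specificity).
    A strong positive value suggests a threshold effect."""
    x = [logit(s) for s in sensitivities]
    y = [logit(1.0 - s) for s in specificities]
    return pearson(average_ranks(x), average_ranks(y))
```

With hypothetical studies in which sensitivity rises as specificity falls, such as sensitivities [0.80, 0.85, 0.90, 0.95] and specificities [0.95, 0.90, 0.85, 0.80], the coefficient is close to 1, the pattern expected under a threshold effect.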
Based on the chi-squared test (Q = 125.33, P < 0.001) and the I² statistic (I² = 66.5%), heterogeneity was high, so we chose the random-effects model to analyze the positive likelihood ratio; the global positive likelihood ratio was 2.74 (95% CI 1.98–3.79, Fig. 4). Therefore, a positive MRI result increased the odds of early osteonecrosis of the femoral head by a factor of 2.74. Based on the chi-squared test (Q = 69.58, P = 0.005) and the I² statistic (I² = 39.6%), heterogeneity was low, so we chose the fixed-effects model to analyze the negative likelihood ratio. The global negative likelihood ratio was 0.18 (95% CI 0.14–0.23, Fig. 5), which is close to zero. Specifically, a negative MRI result multiplied the odds of early osteonecrosis of the femoral head by only 0.18. Based on the chi-squared test (Q = 59.71, P = 0.037) and the I² statistic (I² = 29.7%), heterogeneity was low, so we chose the fixed-effects model to analyze the DOR; the global DOR was 27.27 (95% CI 17.02–43.67, Fig. 6). Thus, the odds of a positive MRI result were 27.27-fold higher among individuals with early osteonecrosis of the femoral head than among those without the disease. The area under the SROC curve was 93.38% (95% CI 90.87%–95.89%, Fig. 7), indicating high accuracy.
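Likelihood ratios act on odds: the post-test odds of disease equal the pre-test odds multiplied by the likelihood ratio of the observed result. The Python sketch below applies the pooled likelihood ratios reported above to a pre-test probability of 30%, which is a hypothetical value chosen only for illustration and does not come from the included studies.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability into a post-test probability."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


POOLED_LR_POSITIVE = 2.74      # global positive likelihood ratio from this review
POOLED_LR_NEGATIVE = 0.18      # global negative likelihood ratio from this review
HYPOTHETICAL_PRE_TEST = 0.30   # assumption for illustration only

after_positive = post_test_probability(HYPOTHETICAL_PRE_TEST, POOLED_LR_POSITIVE)
after_negative = post_test_probability(HYPOTHETICAL_PRE_TEST, POOLED_LR_NEGATIVE)
```

Under this assumed pre-test probability, a positive MRI result raises the probability of disease to about 0.54, and a negative result lowers it to about 0.07.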

Conclusions
Several systematic reviews and meta-analyses have been published on the diagnostic accuracy of MRI for early osteonecrosis of the femoral head. Li et al. [51] found that the sensitivity and specificity of MRI were 95% (95% CI 94–96%) and 77% (95% CI 70–83%), respectively. Moreover, the DOR was 31.89 (95% CI 17.32–58.70), and the area under the SROC curve was 0.9166. MRI was associated with high diagnostic accuracy in patients with suspected early ANFH. Song et al. [52], who included 21 articles, reported that MRI was more effective than CT in diagnosing ANFH, with a statistically significant difference between them (OR, 0.13; 95% CI 0.03–0.51). Su et al. [53], who included 8 studies of 515 patients, found that the difference in ANFH positive rate between CT and MRI was statistically significant (OR, 0.12; 95% CI 0.04–0.33), as was the difference in early-stage positive rate (OR, 0.45; 95% CI 0.26–0.78). Therefore, MRI appears to be a promising diagnostic tool for avascular necrosis of the femoral head.
However, there were several limitations in this analysis: (1) the inclusion and exclusion criteria for patients differed between studies, (2) data on patients' previous diseases and treatments were unavailable, (3) all the included studies were English or Chinese articles, which may be a source of bias, (4) the proficiency of technicians varied between studies, and (5) pooled data were used for analysis, and individual patients' data were unavailable, which limited a more comprehensive analysis.
Fig. 6 Forest plot showing the diagnostic odds ratio of MRI of early osteonecrosis of the femoral head
In summary, this systematic review and meta-analysis indicates that MRI, as a diagnostic method, is associated with high accuracy for detecting ANFH. More high-quality studies and randomized controlled trials with large samples are warranted for further evaluation.

Ethics approval and consent to participate
Not applicable.