Objectives:
Although stereotactic radiosurgery (SRS) is often an effective treatment for brain metastases (BMs), it carries a significant risk of radiation necrosis. A major challenge in managing patients with BMs after SRS is the lack of non-invasive diagnostic and surveillance methods that can distinguish radionecrosis from recurrent disease without a surgical biopsy. We therefore aimed to design a deep ensemble learning model to identify radiation necrosis in BM patients showing post-SRS radiographic progression. The model integrates patient clinical features and genomic profiles with standard post-SRS follow-up MR images to differentiate radionecrosis from true recurrence, offering a non-invasive strategy to guide appropriate treatment selection.
Methods:
We assessed 90 BMs from 62 non-small cell lung cancer (NSCLC) patients, with 27 BMs manifesting biopsy-confirmed post-SRS local recurrence. For each patient, we collected clinical features (including patient age, BM location, and SRS prescription) and genomic features (including 7 NSCLC driver mutations). We first analyzed the 3-month post-SRS high-resolution T1+c volume: a 3D volume-of-interest (VOI) centered on each BM was determined based on the SRS V60% isodose volume. A bespoke deep neural network (DNN) resembling the U-net's encoding path was then trained for radionecrosis/recurrence prediction using the 3D VOI. Prior to the binary prediction output, latent variables in the DNN were extracted as 1024 deep features. The ensemble learning model comprises two sub-models with the same DNN architecture: in each sub-model, the extracted 1024 deep features were fused with either clinical features ('D+C' sub-model) or genomic features ('D+G' sub-model). To overcome the dimensionality mismatch that often arises when fusing data from different sources, we employed positional encoding (PE), a vector-expanding encoding scheme, to bring the feature spaces to optimized sizes. The post-fusion features in each sub-model then yielded a logit result (i.e., radionecrosis/recurrence) after fully connected layers. The ensemble's final output was synthesized from the two sub-models' logits via logistic regression. Model training was conducted with an 8:2 train/test split, and we developed 10 model versions for robustness evaluation. Performance metrics, including sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (ROC AUC), were evaluated in a comparison study against 1) the DNN result using image-only deep features; and 2) the 'D+C' and 'D+G' sub-model results, each using post-fusion features from two sources.
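The fusion and ensembling steps described above can be illustrated with a minimal sketch. It is not the study's implementation. It assumes a sinusoidal form of PE, a hypothetical `num_freqs` setting, placeholder feature values, and a plain gradient-descent logistic regression standing in for whatever solver was actually used. All function names and numbers here are illustrative only.

```python
import math

def positional_encoding(x, num_freqs):
    """Expand one scalar (assumed normalized to [0, 1]) into 2 * num_freqs values."""
    encoded = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * x
        encoded.extend([math.sin(angle), math.cos(angle)])
    return encoded

def fuse(deep_features, low_dim_features, num_freqs):
    """Concatenate deep features with positionally encoded low-dimensional features."""
    fused = list(deep_features)
    for value in low_dim_features:
        fused.extend(positional_encoding(value, num_freqs))
    return fused

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit_stack(logit_pairs, labels, lr=0.1, epochs=500):
    """Fit logistic regression on (D+C logit, D+G logit) pairs by gradient descent."""
    w_dc = w_dg = bias = 0.0
    n = len(labels)
    for _ in range(epochs):
        g_dc = g_dg = g_b = 0.0
        for (z_dc, z_dg), y in zip(logit_pairs, labels):
            err = sigmoid(w_dc * z_dc + w_dg * z_dg + bias) - y
            g_dc += err * z_dc
            g_dg += err * z_dg
            g_b += err
        w_dc -= lr * g_dc / n
        w_dg -= lr * g_dg / n
        bias -= lr * g_b / n
    return w_dc, w_dg, bias

def predict_proba(weights, pair):
    """Ensemble probability of the positive class for one pair of sub-model logits."""
    w_dc, w_dg, bias = weights
    return sigmoid(w_dc * pair[0] + w_dg * pair[1] + bias)

# Hypothetical inputs: 1024 placeholder deep features and 3 normalized clinical values.
deep = [0.0] * 1024
clinical = [0.62, 0.25, 0.40]
fused = fuse(deep, clinical, num_freqs=8)
print(len(fused))  # 1024 + 3 * 2 * 8 = 1072

# Hypothetical sub-model logits and labels (toy values, not study data).
logit_pairs = [(2.0, 1.5), (1.0, 2.5), (-1.5, -0.5), (-2.0, -1.0)]
labels = [1, 1, 0, 0]
weights = fit_logit_stack(logit_pairs, labels)
probs = [predict_proba(weights, pair) for pair in logit_pairs]
```

In this sketch each low-dimensional feature grows from 1 value to `2 * num_freqs` values, which is one way the clinical or genomic vector could be brought closer in size to the 1024 deep features before fusion.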
Results:
The deep ensemble model performed well on the test set: ROC AUC = 0.88±0.04, sensitivity = 0.83±0.16, specificity = 0.85±0.08, and accuracy = 0.84±0.04. This surpassed the image-only DNN result (AUC: 0.71±0.05, sensitivity: 0.66±0.32), the 'D+C' result (i.e., deep feature-clinical feature fusion) (AUC: 0.82±0.03, sensitivity: 0.64±0.16), and the 'D+G' result (i.e., deep feature-genomic feature fusion) (AUC: 0.83±0.02, sensitivity: 0.76±0.22).
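For reference, the reported metrics follow the usual confusion-matrix definitions, summarized as mean ± standard deviation. The sketch below is illustrative only: the labels and predictions are toy values rather than study data, which class is treated as positive is an assumption, and the use of the sample standard deviation (instead of the population form) is also an assumption.

```python
from statistics import mean, stdev

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }

def summarize(values):
    """Mean and sample standard deviation across model versions."""
    return mean(values), stdev(values)

# Toy test-set labels and predictions for one hypothetical model version.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)

# Toy per-version accuracies, summarized as mean and standard deviation.
avg, sd = summarize([0.80, 0.90, 0.85])
```

With 10 model versions, `summarize` would be applied once per metric to the 10 per-version values.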
Conclusion(s):
This radiogenomic deep ensemble model effectively differentiates BM radionecrosis from recurrence using 3-month post-SRS T1+c MR images. This result underscores the potential applications of artificial intelligence in clinical decision-making tools for BM management. The model's implications in clinical settings warrant further investigation.