Purpose
Cervical cancer is the fourth most frequently diagnosed cancer and the fourth leading cause of cancer-related death in women worldwide [1]. External beam radiation therapy (EBRT) and brachytherapy (BT) are effective treatments for cervical cancer [2, 3]. The American Brachytherapy Society (ABS) recommends post-operative adjuvant BT after non-radical surgery, and for close or positive margins, large or deeply invasive tumors, and parametrial or vaginal involvement [4]. Several reports have shown that a vaginal cuff BT boost is associated with a reduced recurrence rate in the post-operative setting for high-risk patients with early-stage cervical cancer [5, 6]. Currently, three-dimensional (3D) image-guided BT (IGBT) allows an adaptive treatment planning process and offers advantages over the conventional two-dimensional (2D) image-based method [7]. Applicator reconstruction is a critical step in IGBT treatment planning [8].
The accuracy of applicator reconstruction has a significant impact on the dosimetric results of an IGBT treatment plan because of the steep dose gradients [9-11]. In general, applicator reconstruction is performed manually by physicists. This localization process is a time-consuming part of the IGBT workflow [12] and is subject to operator-dependent variability. Automatic applicator reconstruction is therefore needed in the IGBT workflow to ensure the accuracy, consistency, and efficiency of treatment planning.
In recent years, deep learning (DL)-based frameworks have been applied in radiotherapy (RT) with strong results, including automatic segmentation of target volumes and organs at risk [13], automatic RT planning [14], and prediction of irradiation toxicity and prognosis [15]. The advantage of DL is its ability to handle unseen cases by automatically learning generalizable features from labeled training samples [16-18]. DL also plays a major role in brachytherapy [19], and several studies have addressed automatic applicator reconstruction in the IGBT workflow using DL methods [20-27]. However, previous studies mostly evaluated DL models using geometric metrics and subjective assessment, and few reported the dosimetric differences introduced by automatic applicator reconstruction.
The purpose of the present study was to evaluate the accuracy of a DL model for automatic reconstruction of metallic interstitial needles in post-operative cervical cancer patients, using both geometric and dosimetric metrics within the IGBT workflow.
Material and methods
The workflow of this study is illustrated in Figure 1 and comprises three key steps. First, an experienced physicist annotated the metallic needles to provide the standard applicator reconstruction. Second, the DL-based model was trained and validated on the standard data, and automatic reconstructions of the interstitial needles were generated. Third, the accuracy of the DL-based auto-reconstruction was assessed using geometric metrics and dosimetric differences.
Data annotation
Data from 70 post-operative cervical cancer patients, collected between August 2021 and July 2022, were used in this study. Three interstitial needles were used in every patient, with a prescription dose of 12-30 Gy (6 Gy/fraction). CT images of the 70 patients were reconstructed with a 512 × 512 matrix size and 3 mm slice thickness using a Philips Brilliance Big Bore CT scanner system (Philips Healthcare, Best, The Netherlands). The number of CT slices ranged from 69 to 118. Interstitial needles were delineated manually by an intern physicist using the Oncentra treatment planning system, version 4.3 (Elekta AB, Stockholm, Sweden), and labeled 1, 2, and 3. To establish the standard delineation, all manually delineated needles were reviewed and approved by a senior physicist.
Deep learning-based auto-reconstruction
We present an adaptive DL model based on nnU-Net (no-new-Net) to reconstruct needles for post-operative cervical cancer BT. Figure 2 depicts the design of the DL model used in this study. nnU-Net [28] defines a dataset fingerprint and a pipeline fingerprint. The pipeline fingerprint parameters fall into three categories: blueprint, inferred, and empirical parameters. By default, nnU-Net generates three U-Net configurations: a 2D U-Net, a 3D U-Net, and a 3D U-Net cascade. The 2D U-Net and 3D U-Net take full-resolution images as input, whereas the 3D U-Net cascade first performs coarse segmentation on a low-resolution image and then refines it on the full-resolution image.
In this work, we focused on the 3D U-Net for applicator reconstruction because the metallic needles are depicted at high resolution in the 3D CT images of our dataset. The 70 patients were divided into training, validation, and testing sets in a ratio of 50 : 10 : 10. To enhance image contrast and improve the visibility of the interstitial needles, histogram equalization was applied to the CT scans in the training and validation sets using digital image processing software.
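The data split and the contrast enhancement described above can be illustrated with a short, dependency-free sketch. It is not the authors' implementation: the paper does not name the image processing software or its settings, so the function names, the random seed, and the assumption that CT intensities have already been rescaled to integer grey levels are all hypothetical. The equalization shown is the textbook global histogram equalization.

```python
import random


def split_patients(patient_ids, n_train=50, n_val=10, n_test=10, seed=0):
    """Shuffle patient IDs and split them into train/validation/test lists.

    The 50 : 10 : 10 sizes follow the paper; the seed is an assumption.
    """
    if n_train + n_val + n_test != len(patient_ids):
        raise ValueError("split sizes must sum to the number of patients")
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def equalize_histogram(image, levels=256):
    """Global histogram equalization of a 2D image.

    `image` is a list of rows of integer grey levels in [0, levels).
    """
    flat = [v for row in image for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1

    # Cumulative distribution of grey levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(flat)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]

    # Look-up table mapping each old grey level to its equalized value.
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
           for c in cdf]
    return [[lut[v] for v in row] for row in image]


train, val, test = split_patients(range(1, 71))
print(len(train), len(val), len(test))  # 50 10 10
```

Splitting by patient rather than by slice keeps all slices of one patient in the same subset, which avoids leakage between training and testing data.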
Geometric assessment
Geometric agreement of the interstitial needle reconstructions was assessed using the Dice similarity coefficient (DSC), the 95% Hausdorff distance (95% HD), and the Jaccard coefficient (JC) [29]. DSC and JC quantify the spatial overlap between two regions as follows:
$$\mathrm{DSC} = \frac{2\,|A \cap B|}{|A| + |B|},$$
$$\mathrm{JC} = \frac{|A \cap B|}{|A \cup B|},$$
where A and B are the manually segmented region and the DL auto-segmented region, respectively. For complete overlap, DSC and JC both equal 1; as the overlap decreases, both values approach 0.
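As an illustration of the two overlap formulas, the following minimal sketch computes DSC and JC for two segmentations represented as sets of voxel indices. The function name, the set-based representation, and the convention for two empty masks are assumptions made for this example, not details taken from the study.

```python
def dice_and_jaccard(voxels_a, voxels_b):
    """Return (DSC, JC) for two segmentations given as voxel index sets."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        # Assumed convention: two empty segmentations agree perfectly.
        return 1.0, 1.0
    intersection = len(a & b)
    union = len(a | b)
    dsc = 2 * intersection / (len(a) + len(b))
    jc = intersection / union
    return dsc, jc


# Toy example: two 4-voxel needle segments shifted by one voxel along z.
manual = {(0, 0, z) for z in range(0, 4)}
auto = {(0, 0, z) for z in range(1, 5)}
dsc, jc = dice_and_jaccard(manual, auto)
print(dsc, jc)  # 0.75 0.6
```

For a single pair of regions the two metrics are linked by JC = DSC / (2 − DSC), so they rank individual cases identically; mean values over several patients need not satisfy this relation exactly.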
The Hausdorff distance (HD) was used to quantify the accuracy of the digitized needle trajectories. To exclude outlier distances, the 95% HD (HD95, in mm) was used, i.e., the largest surface-to-surface separation among the closest 95% of surface points. The smaller the HD95, the better the segmentation. HD was computed as:
$$\mathrm{HD}(A, B) = \max\bigl(h(A, B),\, h(B, A)\bigr),$$
with h defined as
$$h(A, B) = \max_{a \in A} \min_{b \in B} d(a, b),$$
where a and b are points on the surfaces of A and B, and d(a, b) is the distance between them.
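The definitions of HD and HD95 can be sketched in a few lines of plain Python. This is a brute-force illustration under stated assumptions: surfaces are given as lists of 3D points in mm, d(a, b) is the Euclidean distance, and HD95 is taken as the larger of the two directed 95th percentiles (other conventions exist, and the paper does not say which one its software used). All function names are hypothetical.

```python
import math


def directed_distances(points_a, points_b):
    """For every point a in A, the distance to its nearest point in B."""
    return [min(math.dist(a, b) for b in points_b) for a in points_a]


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def hausdorff(points_a, points_b):
    """HD = max(h(A, B), h(B, A)) with h the directed max-min distance."""
    return max(max(directed_distances(points_a, points_b)),
               max(directed_distances(points_b, points_a)))


def hausdorff_95(points_a, points_b):
    """HD95 (assumed convention): max of the two directed 95th percentiles."""
    return max(percentile(directed_distances(points_a, points_b), 95),
               percentile(directed_distances(points_b, points_a), 95))


# Toy example: two short point sets along the z axis (coordinates in mm).
a = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
b = [(0.0, 0.0, 0.0), (0.0, 0.0, 3.0)]
print(hausdorff(a, b))  # 2.0
```

The nested loops cost O(|A|·|B|) distance evaluations, which is acceptable for thin needle surfaces but would call for a spatial index on large organ contours.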
Dosimetric comparison
The Oncentra treatment planning system was used to compute and optimize the BT plans based on the standard manual needles (original plans). Radioactive source positions and dwell times were transferred from the original plans to the automatically reconstructed needles to generate the DL plans. Dose-volume histograms (DVHs) were used to investigate the dosimetric differences between the original plans and the DL plans. For the high-risk clinical target volume (HR-CTV), we focused on D90% and D98%, the doses covering 90% and 98% of the HR-CTV volume, respectively. For organs at risk (OARs), we focused on D2cc, D1cc, and D0.1cc, the minimum doses received by the most irradiated 2 cc, 1 cc, and 0.1 cc of each OAR, respectively. OARs included the bladder, rectum, sigmoid colon, and small intestine.
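The DVH parameters above can be illustrated with a simplified sketch that reads them directly from a list of voxel doses. It assumes equally sized voxels and no interpolation, so it will not reproduce the values of a treatment planning system exactly; the function names and the toy numbers are hypothetical.

```python
import math


def dose_at_volume_percent(doses, percent):
    """D_x%: minimum dose received by the hottest `percent` % of voxels."""
    ordered = sorted(doses, reverse=True)
    k = max(1, math.ceil(len(ordered) * percent / 100))
    return ordered[k - 1]


def dose_at_volume_cc(doses, volume_cc, voxel_volume_cc):
    """D_Vcc: minimum dose received by the hottest `volume_cc` of a structure."""
    ordered = sorted(doses, reverse=True)
    k = max(1, round(volume_cc / voxel_volume_cc))
    if k > len(ordered):
        raise ValueError("structure is smaller than the requested volume")
    return ordered[k - 1]


# Toy structure: ten voxels of 0.5 cc each, doses in Gy.
doses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
print(dose_at_volume_percent(doses, 90))   # 2.0
print(dose_at_volume_cc(doses, 2.0, 0.5))  # 7.0
```

The dosimetric difference between the two plans for one structure is then simply the parameter from the DL plan minus the same parameter from the original plan.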
Statistical analysis
IBM SPSS Statistics software (version 26.0, IBM Inc., Armonk, NY, USA) was used for statistical analysis, and results were summarized as mean ± standard deviation (SD). The Wilcoxon signed-rank test, a paired non-parametric test, was used to compare the dosimetric parameters of the two methods; p < 0.05 was considered statistically significant. Spearman's correlation analysis was applied to assess the relationships between the geometric metrics and the dosimetric differences.
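For readers without access to SPSS, the two tests can be sketched in plain Python. This is an illustrative re-implementation, not the computation performed in the study: the Wilcoxon p-value here is exact (obtained by enumerating all sign assignments, which is practical for small samples such as ten test patients) and may differ from the value a statistics package reports. All function names and the toy data are hypothetical.

```python
from itertools import product


def average_ranks(values):
    """Ranks starting at 1; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test; zero differences dropped.

    Returns (W, p) where W is the smaller of the two signed-rank sums.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    if not diffs:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Under H0 every sign pattern is equally likely: count those at least
    # as extreme as the observed statistic.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed + 1e-12:
            extreme += 1
    return observed, extreme / 2 ** len(ranks)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Toy paired data (arbitrary units), not values from the study.
original = [11, 23, 32, 46, 55]
dl_plan = [10, 21, 35, 42, 50]
print(wilcoxon_signed_rank(original, dl_plan))  # (3.0, 0.3125)
```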
Results
Evaluation of geometric metrics
The geometric accuracy of the DL auto-reconstruction of metallic needles is presented in Figure 3. For the three needles, automatic reconstruction achieved average DSC values of 0.88 ± 0.03, 0.89 ± 0.02, and 0.90 ± 0.02; 95% HD values of 0.77 ± 0.12 mm, 0.73 ± 0.13 mm, and 0.71 ± 0.07 mm; and JC values of 0.81 ± 0.04, 0.80 ± 0.34, and 0.81 ± 0.03, respectively.
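As a quick consistency check, and assuming that the overall values quoted in the Discussion (DSC of 0.89 and 95% HD of 0.74 mm) are unweighted means of the three per-needle averages, the arithmetic works out as follows:

```latex
\overline{\mathrm{DSC}} = \frac{0.88 + 0.89 + 0.90}{3} = 0.89,
\qquad
\overline{\mathrm{HD}}_{95} = \frac{0.77 + 0.73 + 0.71}{3}\ \mathrm{mm} \approx 0.74\ \mathrm{mm}.
```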
Evaluation of dosimetric metrics
Table 1 presents the comparison of dosimetric parameters between the two methods using the Wilcoxon signed-rank test. No statistically significant dosimetric differences were found for any of the BT planning structures (p > 0.05). Figure 4 illustrates 3D views of the three metallic needles for the manual and automatic reconstructions. The applicator reconstructions produced by the DL model were in good agreement with the manual approach. Examples of dose distributions from the manual and DL-based methods are shown in Figure 5.
Table 1
Correlation analysis between geometric and dosimetric metrics
The results of the Spearman correlation analysis between the geometric metrics and the dosimetric differences (Δdose) are presented in Table 2. The analysis showed only weak correlations between the dosimetric differences and the geometric metrics for all BT planning structures.
Table 2
Discussion
Brachytherapy is a key component of treatment for post-operative cervical cancer with high-risk factors. 3D-IGBT can produce an optimal dose distribution in the target region while decreasing the radiation dose to healthy tissues [30]. However, IGBT procedures involve a growing number of real-time steps that require more technical and staff resources. Applicator reconstruction is one of these real-time steps in 3D-IGBT planning, and its accuracy depends on the experience and subjective judgment of the physicist. This highlights the importance of a rapid and accurate reconstruction method that would improve the IGBT workflow through automation. In this work, we proposed a DL-based model for automatic needle reconstruction during CT-based interstitial IGBT treatment planning.
Various applicators are implanted before IGBT planning, such as intra-uterine and ovoid tubes, vaginal applicators, ring applicators, and interstitial needles. Different methods or DL models yield different reconstruction results for a given applicator. We summarized the auto-segmentation results for brachytherapy applicators reported in the published literature; the DSC and HD values of the different methods are compared in Table 3. Image thresholding and density-based clustering were applied to segment the tandem and ovoid applicator, with HD ≤ 1 mm [31]. A DSD-U-Net model [26] was proposed to reconstruct the intra-uterine and ovoid tubes and achieved an average DSC of 0.92. A U-Net model [32] was used to automatically segment the Fletcher applicator, with an average DSC of 0.89. A 2D U-Net algorithm [33] was tested for needle reconstruction on MR images, with an average DSC of 0.59 and an HD of 4.2 mm. Two-phase DL-based segmentation and object-tracking algorithms were adopted to reconstruct the interstitial needles in CT-guided prostate brachytherapy, with a DSC of 0.95 between the network output and the ground truth [34]. In the present work, the nnU-Net model was trained to reconstruct metallic needles on CT images and achieved an average DSC of 0.89 and a 95% HD of 0.74 mm. Compared with tandem and ovoid applicators, needle applicators are more difficult to localize in 3D images because of their slender shape and small volume (about 2 cc). Peroni [35] reported the ranges of DSC values that generally denote good agreement as a function of structure volume; for example, a DSC of 0.4-0.6 indicates good agreement when the structure volume is 1-5 cc. Against this benchmark, our DL model achieved high geometric accuracy for needle reconstruction. This mainly benefits from the cross-validation performed by nnU-Net, which selects the best ensemble during the training process.
Table 3
Author(s) [Ref.] | Methods or DL models | Image type | Applicators | Enrolled patients | Results |
---|---|---|---|---|---|
Deufel et al. [31] | Image thresholding and density-based | CT | Tandem and ovoid | 10 patients from Mayo Clinic for testing | DSC not used; HD ≤ 1 mm |
Zhang et al. [26] | DSD-U-Net | CT | Tandem and ovoid | 91 cases from Tianjin Medical University Cancer Institute and Hospital; 32 internal cases for testing | DSC: 0.92; HD: 2.3 mm |
Hu et al. [32] | U-Net | CT | Fletcher | 60 cases from Sichuan Cancer Hospital; 10 independent cases for testing | DSC: 0.89; HD: 1.66 mm |
Shaaer et al. [33] | 2D U-Net | MRI | Interstitial plastic needles | 20 cases from Odette Cancer Centre | DSC: 0.59; HD: 4.2 mm |
Mohammad Mahdi et al. [34] | Two-phase DL models | CT | Interstitial plastic needles | 25 cases from Shohada-e Tajrish Educational Hospital; 5 internal cases for testing | DSC: 0.95; HD not used |
Our method | nnU-Net (3D) | CT | Interstitial metal needles | 70 cases from Women’s Hospital in China; 10 internal cases for testing | DSC: 0.89; HD: 0.74 mm |
Dosimetric evaluation is necessary for automatic reconstruction in the IGBT workflow. Yoganathan et al. [36] demonstrated the importance of dosimetric over geometric evaluation for an automation task in cervical cancer BT. Schindel et al. [37] reported that reconstruction uncertainty could cause dosimetric changes greater than 10% in MRI-based BT. Therefore, we compared the dose distributions of the original BT plan and the DL plan for every planning structure. The Wilcoxon signed-rank test indicated no significant dosimetric differences in the HR-CTV or OARs between the two methods, and Spearman correlation analysis showed only weak correlations between the geometric metrics and the dosimetric differences. These findings suggest that automatic reconstruction of metallic needles is a feasible alternative to manual reconstruction.
In this work, we investigated the performance of DL-based automatic reconstruction of metal needles in post-operative cervical cancer patients treated with IGBT. Such an automatic method could improve accuracy and efficiency and decrease uncertainties in the adaptive IGBT process. Moreover, intelligent methods may promote the development of BT, and applicator auto-reconstruction is one of the essential components of fully automatic IGBT planning.
This study has several limitations. First, the auto-reconstruction approach may not be suitable for other applicators, such as vaginal or Fletcher applicators, mainly because the DL model was trained on a single dataset; enlarging the training data to include the various applicators used in the IGBT workflow could make the model more robust. Second, the model was developed and evaluated on CT images only, and its performance on other imaging modalities has not been evaluated. For different imaging settings, re-training of the DL model is recommended to ensure similar performance.
Conclusions
This study demonstrated that our DL-based reconstruction method can precisely localize metal interstitial needles in 3D CT images for post-operative cervical cancer IGBT. The proposed automatic approach can reduce variability and relieve physicists of labor-intensive tasks.