Pairwise machine learning-based automatic diagnostic platform utilizing CT images and clinical information for predicting radiotherapy locoregional recurrence in elderly esophageal cancer patients

Dataset and preparation

Inclusion criteria and study population

Ethical approval for this study was obtained from the ethics committee of the Fourth Affiliated Hospital of Hebei Medical University. Following the World Health Organization definition, we used an age cutoff of 75 years to classify patients as elderly. Because esophageal squamous cell carcinoma (ESCC) accounts for almost 90% of all esophageal cancer (EC) cases in China [13], we limited our analysis to patients with pathologically confirmed ESCC. The inclusion criteria were: (1) age 75 years or older; (2) Eastern Cooperative Oncology Group performance status (ECOG PS) of 2 or less; (3) no prior history of cancer; (4) radiation therapy administered for the first time; (5) absence of distant organ metastasis, except for supraclavicular lymph node metastasis; and (6) absence of severe lung, heart, or liver disorder. The exclusion criteria were: (1) diagnosis of esophageal fistula with accompanying esophageal stent implantation; (2) receipt of low-dose palliative radiotherapy; (3) receipt of preoperative or postoperative adjuvant radiotherapy; and (4) poor visualization quality on CT images.

We collected the medical records of ESCC patients aged 75 years or older who underwent radical radiotherapy at the Fourth Affiliated Hospital of Hebei Medical University between January 2017 and December 2019. A total of 130 eligible patients, aged 75–90 years, were enrolled. Clinical stage was determined according to the 8th edition of the American Joint Committee on Cancer (AJCC)/Union for International Cancer Control (UICC) classification scheme [14].

Treatment

Of the enrolled patients, 89 received radiotherapy alone, 28 received concurrent chemoradiotherapy, and 13 received sequential chemoradiotherapy. All treatment plans were delivered with three-dimensional conformal or intensity-modulated radiotherapy. Dose constraints for the organs at risk and the definition of the radiation target volume followed the relevant protocols. For the whole group, the prescription doses to the Planning Target Volume (PTV) and Gross Tumor Volume (GTV) ranged from 50.0 to 64.0 Gy (median 60 Gy). The PTV received 1.8–2.0 Gy/fraction and the GTV received 1.95–2.15 Gy/fraction, five times weekly. A physicist completed the treatment plan as needed, and a senior physician approved it. The chemotherapy regimens were as follows [5]: TS-1, cisplatin combined with paclitaxel (TP), and cisplatin combined with 5-fluorouracil (FP). The final choice of regimen was based mainly on expert decision and treatment intention. Among the patients receiving concurrent chemotherapy, 22 received TS-1 and 6 received the TP regimen. Among the patients receiving sequential chemotherapy, 7 received the TP regimen, 4 received TS-1, and 2 received the FP regimen.

Image processing

The CT images of each patient were reviewed using ITK-SNAP software (http://www.itksnap.org). A radiation therapist with 15 years of experience in esophageal cancer (EC) imaging (A.D.Z.) reviewed all images and delineated the outline of the tumor slice by slice on the pre-treatment contrast-enhanced CT images, excluding the air within the esophagus. To evaluate inter-observer agreement, 20 patients were randomly selected from the entire cohort and independently segmented by a radiologist with 15 years of experience (Y.L.). Image pre-processing and radiomics feature extraction were performed using the pyradiomics package (version 2.1.2; https://pyradiomics.readthedocs.io/en/2.1.2/). After normalization, resampling, and quantization, quantitative features based on the original image were extracted from the region of interest (ROI) of each patient. The extracted features covered first-order statistics, shape features, and texture features. The texture features comprised the gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), gray level dependence matrix (GLDM), and neighboring gray tone difference matrix (NGTDM). The following clinical risk factors were included: ECOG PS, age, sex, history of alcohol and tobacco use, family history, tumor length, tumor location, tumor volume, T stage, N stage, supraclavicular lymph node status, TNM stage, PTV dose, GTV dose, receipt of chemotherapy, maximal wall thickness (MWT) before radiotherapy (RT), and node size (NS) before RT.
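As a rough illustration of the quantization step and of what a first-order feature is, the sketch below bins a handful of ROI intensities with a fixed bin width and computes three first-order statistics. It is not the pyradiomics implementation: the function names, the toy intensities, and the bin width of 25 are assumptions chosen for the example, since the text does not report the pre-processing settings.

```python
import math
from collections import Counter

def discretize(values, bin_width):
    """Fixed-bin-width quantization; the lowest intensity falls in bin 1."""
    lowest_bin = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - lowest_bin + 1 for v in values]

def first_order_features(values, bin_width):
    """Compute a few first-order statistics for the voxels inside an ROI."""
    n = len(values)
    mean = sum(values) / n
    energy = sum(v * v for v in values)
    # Entropy is computed on the histogram of the discretized intensities.
    counts = Counter(discretize(values, bin_width))
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "energy": energy, "entropy": entropy}

# Hypothetical ROI intensities (toy values, not patient data).
roi = [12.0, 30.0, 31.0, 55.0, 58.0, 80.0, 81.0, 99.0]
bins = discretize(roi, bin_width=25)
features = first_order_features(roi, bin_width=25)
```

Texture classes such as the GLCM are built on the same discretized intensities, which is why the choice of bin width affects every texture feature downstream.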

Model establishment

To improve the accuracy and stability of the model, we optimized the choice of algorithm at three stages: data standardization, dimensionality reduction, and feature selection. For data standardization, we compared normalization to unit, normalization to 0-center, and normalization to unit with 0-center. For dimensionality reduction, we evaluated principal component analysis (PCA) and the Pearson correlation coefficient (PCC) method. For feature selection, we compared analysis of variance (ANOVA), recursive feature elimination, and Relief. The best combination was then used to establish the model. Finally, 10 classifiers were compared, including support vector machine, linear discriminant analysis, logistic regression, and naive Bayes. The training process was as follows. First, the full data set was divided into a training set and a test set at a ratio of 7:3, and the model was trained on the training set with fivefold cross-validation, for which the training set was randomly divided into five subsets (folds) of approximately equal size. Second, the model was trained on four folds and validated on the remaining one, rotating until each fold had served as the validation set. For each fold, the performance metrics (such as accuracy and area under the curve) were recorded, and the metrics were then averaged across all five folds to obtain a single estimate of model performance. Finally, the performance of the established model was evaluated on the test set generated at the start.
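The splitting procedure described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function names, the fixed random seed, and the placeholder scoring function are assumptions, and the split is not stratified because the text does not say whether stratification was used.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and hold out a fraction as the test set."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(train_indices, k=5, seed=0):
    """Yield (training, validation) index lists; each fold validates once."""
    rng = random.Random(seed)
    shuffled = list(train_indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # near-equal fold sizes
    for held_out in range(k):
        training = [idx for f, fold in enumerate(folds)
                    if f != held_out for idx in fold]
        yield training, folds[held_out]

def cross_validate(train_indices, fit_and_score, k=5, seed=0):
    """Average the per-fold metric returned by fit_and_score(train, val)."""
    scores = [fit_and_score(tr, va)
              for tr, va in k_fold_indices(train_indices, k, seed)]
    return sum(scores) / len(scores)

# 130 patients at a 7:3 ratio gives 91 training and 39 test cases.
train_idx, test_idx = train_test_split_indices(130)

# Placeholder scorer (hypothetical): a real one would fit a classifier on
# the training indices and return its AUC or accuracy on the validation fold.
mean_score = cross_validate(train_idx, lambda tr, va: len(va) / (len(tr) + len(va)))
```

The test indices are set aside before any fold is formed, so the cross-validated estimate and the final test-set evaluation never share samples.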

To improve the accuracy and robustness of the model on a small sample set, we employed a pairwise machine learning analysis based on metric learning. This method predicts locoregional recurrence (LR) following RT by assessing the similarity between typical cases (templates) and the other cases. Seven representative cases were selected as templates, comprising cases with and without LR. These templates were then paired with the other samples to calculate the distance metrics. Pairs whose members belonged to the same group were called “positive pairs,” and pairs whose members belonged to different groups were called “negative pairs” (Formula 1). Finally, based on the classification of each pair as positive or negative and on the label of the template, the final category of each sample was determined by a voting scheme (Formulas 2 and 3).

$$\mathbf{F}_{P}\ (\text{or}\ \mathbf{F}_{N}) = \mathbf{F}_{I} - \mathbf{F}_{i} \quad \left(I \in \{1, 2, 3, \dots, 7\};\ i = 1, 2, 3, \dots, n\right)$$

(1)

$$P_{i} = \text{mean}\left(P_{P} + (1 - P_{N})\right) \quad (i = 1, 2, \dots, n)$$

(2)

$$C_{i} = \left\{\begin{array}{l} \text{if } P_{i} < 0.5 \text{ and } L_{I} \in 1(0), \text{ then } C_{i} \in 0(1) \\ \text{if } P_{i} > 0.5 \text{ and } L_{I} \in 1(0), \text{ then } C_{i} \in 1(0) \end{array}\right.$$

(3)

where $\mathbf{F}_{I}$ and $\mathbf{F}_{i}$ represent the feature vectors of the $I$th template and the $i$th sample, $\mathbf{F}_{P}$ represents the feature vector of a positive pair, $\mathbf{F}_{N}$ represents the feature vector of a negative pair, $P_{P}$ and $P_{N}$ represent the predicted probabilities of a positive pair and a negative pair, $P_{i}$ represents the average probability that the $i$th sample belongs to a positive pair, $L_{I}$ represents the label of the $I$th template, and $C_{i}$ represents the class of the $i$th sample. The optimal model was selected according to diagnostic performance. The whole modeling process is shown in Fig. 1.
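One possible reading of the template-based voting scheme can be sketched as follows. Each template is paired with the sample through a difference vector, a pair classifier estimates the probability that the pair is a positive (same-class) pair, the sample takes the template's label when that probability exceeds 0.5 and the opposite label otherwise, and the seven template votes are combined by majority. Everything concrete here is an assumption made for illustration: the distance-based `toy_same_class_probability` merely stands in for the trained pair classifier, the two-dimensional feature vectors are invented, and the handling of a probability of exactly 0.5 is a choice the text leaves open.

```python
import math

def pair_features(template, sample):
    """Difference vector between a template and a sample (pair features)."""
    return [t - s for t, s in zip(template, sample)]

def toy_same_class_probability(diff):
    """Hypothetical stand-in for the trained pair classifier:
    the smaller the difference vector, the likelier a positive pair."""
    distance = math.sqrt(sum(d * d for d in diff))
    return math.exp(-distance)

def predict_by_voting(sample, templates, same_class_probability):
    """Assign a class (1 = LR, 0 = no LR) by majority vote over templates."""
    votes = []
    for features, label in templates:
        p_same = same_class_probability(pair_features(features, sample))
        # Positive pair: inherit the template label; otherwise flip it.
        # A probability of exactly 0.5 is treated as a negative pair here.
        votes.append(label if p_same > 0.5 else 1 - label)
    return 1 if 2 * sum(votes) > len(votes) else 0

# Seven hypothetical templates: four with LR (label 1), three without (0).
templates = [
    ([1.0, 1.0], 1), ([1.1, 0.9], 1), ([0.9, 1.2], 1), ([1.2, 1.1], 1),
    ([0.0, 0.0], 0), ([0.1, -0.1], 0), ([-0.2, 0.1], 0),
]

pred_a = predict_by_voting([0.9, 1.1], templates, toy_same_class_probability)
pred_b = predict_by_voting([0.1, 0.0], templates, toy_same_class_probability)
```

Because every sample is compared against all templates, a cohort of n samples yields 7n training pairs for the pair classifier, which is the appeal of the pairwise formulation for a small sample set.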

Fig. 1

Flowchart of data preprocessing and model establishment. A Manual delineation of the esophageal cancer and 3D view; B Extraction of radiomics features, including first-order, shape, and texture features; C Sample-data pairing and establishment of the model; D Evaluation of the model

Statistical analysis

To assess the accuracy and repeatability of the tumor delineation, we calculated the average Dice coefficient, together with the intraclass correlation coefficient (ICC) for the volume, surface area, maximum diameter, minimum diameter, and other morphological indicators of the two sets of delineations. Baseline differences in clinical characteristics between the training set and the test set were evaluated using the Chi-squared test, Fisher’s exact test, or Mann–Whitney U test, as appropriate. Decision curve analysis (DCA) was performed on the test set to assess the clinical utility of the model by quantifying the net benefit across a range of threshold probabilities. Goodness of fit was assessed using the Hosmer–Lemeshow test. Model performance was assessed via the area under the receiver operating characteristic curve (AUC). A p value of < 0.05 was considered statistically significant. Statistical analyses were performed using the keras and pingouin packages in Python 3.10 and the pROC package in R (version 4.2.1).
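Three of the quantities named above have short closed-form definitions, sketched here in plain Python with toy inputs. The study itself used the pingouin and pROC packages, so this is only an illustration of the definitions. The function names, the voxel coordinates, and the labels and scores are invented for the example.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two segmentations given as collections of voxels."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one
    (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))

def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit at one threshold probability."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Toy inputs (hypothetical): two delineations and six scored cases.
reader_1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
reader_2 = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

dice = dice_coefficient(reader_1, reader_2)
area = auc(labels, scores)
benefit = net_benefit(labels, scores, threshold=0.5)
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing the result with the treat-all and treat-none strategies.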


ABDOMINAL RADIOLOGY
