This multicenter retrospective study was conducted at 21 institutions. The study was approved by the institutional review board of Osaka University (approval number: 22171, approval date: July 26, 2022) and the participating hospitals and was performed in accordance with the guidelines outlined in the Declaration of Helsinki.
We used data from consecutive EGC patients who were treated with surgery, ESD with additional surgery, or ESD alone between 2010 and 2021 and whose lesions were histologically confirmed as endoscopic curability C-2. EGC was defined as an adenocarcinoma limited to the mucosa or submucosa, irrespective of LNM [17]. Exclusion criteria were as follows: special histological types of gastric cancer (e.g., neuroendocrine neoplasms, carcinoma with lymphoid stroma, adenocarcinoma of the fundic gland type [18, 19]), esophagogastric junction cancer, synchronous advanced cancer (in the stomach or other organs), synchronous EGC with endoscopic curability C-2, postoperative stomach, and missing data. Patients in the surgery group who had undergone preoperative chemotherapy were excluded. In the ESD-alone group, we excluded patients with follow-up periods < 3 years (unless they died of known causes within that time) and patients who received adjuvant chemotherapy after ESD. Finally, cases in which surgery was performed without lymphadenectomy (i.e., local resection only) were excluded even if the patients could be followed up for ≥ 3 years, because local resection of the stomach is described as an investigational treatment in the Japanese gastric cancer treatment guidelines [6] and was not commonly performed. Such cases might therefore have followed an atypical clinical course during surveillance.
Definition of endoscopic curability C-2
After endoscopic or surgical resection, histopathological evaluation was performed according to the Japanese classification system at each institution [17]. Specimens resected by ESD were sectioned at 2 mm intervals, whereas surgically resected specimens were sectioned at 5 mm intervals. Lymphovascular involvement was first examined by hematoxylin and eosin staining, and in cases with inconclusive findings, immunohistochemical staining was added.
A resected EGC was considered curative when it was resected en bloc (in one piece), had neither cancer-positive margins nor lymphovascular involvement, and met one of the following conditions: (i) mucosal differentiated cancer without ulceration; (ii) mucosal differentiated cancer with ulceration, ≤ 30 mm in diameter; (iii) mucosal undifferentiated cancer without ulceration, ≤ 20 mm in diameter; or (iv) shallow (< 500 μm from the muscularis mucosae) submucosal differentiated cancer, ≤ 30 mm in diameter.
Otherwise, the resected EGC was considered to be in a state of endoscopic curability C (noncurative). If a positive horizontal margin was the only noncurative factor, the lesion was categorized as endoscopic curability C-1. All other noncurative lesions were categorized as endoscopic curability C-2, and we included only patients with C-2 lesions. This definition of endoscopic curability C-2 was based on the Japanese gastric cancer treatment guidelines [6].
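The curability rules above can be read as a small decision procedure. The following is a minimal sketch of that logic as stated in this section, not the guideline text itself; the `Lesion` fields and function names are hypothetical, mixed-type histology is not modeled, and piecemeal resection is treated here as a noncurative factor other than a positive horizontal margin.

```python
from dataclasses import dataclass


@dataclass
class Lesion:
    # All field names are illustrative assumptions, not taken from the study.
    en_bloc: bool
    horizontal_margin_positive: bool
    vertical_margin_positive: bool
    lymphovascular_involvement: bool
    differentiated: bool
    depth: str  # "M", "SM1" (< 500 um) or "SM2" (>= 500 um)
    ulceration: bool
    size_mm: float


def meets_curative_condition(lesion: Lesion) -> bool:
    """Check conditions (i) to (iv) on histology, depth, ulceration and size."""
    if lesion.depth == "M":
        if lesion.differentiated:
            # (i) no ulceration, any size; (ii) ulceration, <= 30 mm
            return (not lesion.ulceration) or lesion.size_mm <= 30
        # (iii) undifferentiated, no ulceration, <= 20 mm
        return (not lesion.ulceration) and lesion.size_mm <= 20
    if lesion.depth == "SM1":
        # (iv) shallow submucosal differentiated cancer, <= 30 mm
        return lesion.differentiated and lesion.size_mm <= 30
    return False


def endoscopic_curability(lesion: Lesion) -> str:
    """Return 'curative', 'C-1' or 'C-2' following the text of this section."""
    other_noncurative = (
        not lesion.en_bloc
        or lesion.vertical_margin_positive
        or lesion.lymphovascular_involvement
        or not meets_curative_condition(lesion)
    )
    if other_noncurative:
        return "C-2"
    # Only the horizontal margin can still make the resection noncurative.
    return "C-1" if lesion.horizontal_margin_positive else "curative"
```

For example, a 25 mm differentiated SM2 lesion with clean margins is returned as "C-2" because it meets none of conditions (i) to (iv).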
Data collection
The following data were collected: age, sex, tumor location, size, histological type, invasion depth, histopathological ulceration, lymphatic involvement, and vascular involvement. Histological types were classified as follows: (i) well-differentiated tubular adenocarcinoma (tub1); (ii) moderately differentiated tubular adenocarcinoma (tub2); (iii) papillary adenocarcinoma (pap); (iv) poorly differentiated adenocarcinoma (por); (v) signet-ring cell carcinoma (sig); and (vi) mucinous adenocarcinoma (muc). When more than one histological type was present in the tumor, the two most dominant histological types were recorded in descending order of predominance (e.g., tub2 > tub1). The tub1, tub2, and pap types were categorized as differentiated types, and por, sig, and muc were categorized as undifferentiated types. If the lesion had both differentiated and undifferentiated components, it was regarded as a mixed type. Invasion depth was classified into three categories: tumor limited to the mucosa (M), tumor invading the submucosa to a depth of < 500 μm from the muscularis mucosae (SM1), and tumor invading the submucosa to a depth ≥ 500 μm (SM2). Vertical margins were also investigated in patients who underwent ESD (with or without additional surgery). For the ESD-alone group, the development of metastatic recurrence in the lymph nodes and/or other organs during follow-up was also surveyed. Data were obtained from the medical records of each participating institution between August 2022 and December 2022.
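The histology and depth categories described above could be derived from the recorded fields roughly as follows. This is an illustrative sketch only; the function names and input conventions (a list of recorded histological types, a layer name plus a depth in micrometers) are assumptions.

```python
DIFFERENTIATED = {"tub1", "tub2", "pap"}
UNDIFFERENTIATED = {"por", "sig", "muc"}


def histology_category(types):
    """Map the recorded histological types (dominant first) to a category."""
    has_diff = any(t in DIFFERENTIATED for t in types)
    has_undiff = any(t in UNDIFFERENTIATED for t in types)
    if has_diff and has_undiff:
        return "mixed"
    if has_diff:
        return "differentiated"
    if has_undiff:
        return "undifferentiated"
    raise ValueError(f"unrecognized histological types: {types}")


def depth_category(layer, sm_depth_um=None):
    """Return 'M', 'SM1' (< 500 um) or 'SM2' (>= 500 um)."""
    if layer == "mucosa":
        return "M"
    if layer == "submucosa":
        if sm_depth_um is None:
            raise ValueError("submucosal invasion requires a measured depth")
        return "SM1" if sm_depth_um < 500 else "SM2"
    raise ValueError(f"unexpected layer: {layer}")
```

With these helpers, a lesion recorded as tub2 > por is categorized as a mixed type, and a submucosal lesion measured at exactly 500 μm falls into SM2.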
Definitions of outcome
The outcome selected to develop the ML model was LNM. For the surgery or ESD with additional surgery groups, it was defined as the presence of histologically identified metastases in the resected lymph nodes. For the ESD-alone group, it was defined as the development of metastatic recurrence in the lymph nodes and/or other organs diagnosed by computed tomography during follow-up. When patients in the ESD-alone group did not develop metastatic recurrence during a follow-up period of ≥ 3 years, LNM was considered negative. Patients with follow-up periods < 3 years were excluded from the ESD-alone group, except for those who died of known causes.
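The outcome definition can be summarized as a labeling rule per treatment group. The sketch below is one possible reading; the function and argument names are hypothetical, and treating patients who died of known causes within 3 years without recurrence as LNM-negative is an assumption, since the text states only that they were retained.

```python
def lnm_label(group, nodes_positive=None, recurrence=None,
              followup_years=None, died_of_known_cause=False):
    """Return True/False for LNM, or None if the patient is excluded."""
    if group in ("surgery", "ESD with additional surgery"):
        # Histologically identified metastases in the resected lymph nodes.
        return bool(nodes_positive)
    if group == "ESD alone":
        # Short follow-up is excluded unless the patient died of a known cause.
        if followup_years < 3 and not died_of_known_cause:
            return None
        # Assumption: retained patients without recurrence count as negative.
        return bool(recurrence)
    raise ValueError(f"unknown treatment group: {group}")
```

Returning `None` for excluded patients keeps the exclusion step explicit, so the caller can filter such records before model training.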
Development of the ML model
We created two datasets: a training cohort used to build the ML model and a validation cohort used to compare the performance of the ML model with that of the eCura system. The former included all patient groups (surgery, ESD with additional surgery, or ESD alone), whereas the latter included only patients who underwent ESD (with or without additional surgery). The reasons for this were as follows: (i) the intended target population of both our ML model and the eCura system was patients who underwent noncurative ESD, and (ii) in the eCura system, a positive vertical margin is a risk factor, which is assessable only in lesions resected by ESD. We randomly separated patients who underwent ESD (with or without additional surgery) into training and validation groups.
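A cohort split of this kind might look like the sketch below. The split ratio is not stated in this section, so `validation_fraction=0.5` is purely an assumed placeholder, as are the record layout and the function name.

```python
import random

ESD_GROUPS = {"ESD alone", "ESD with additional surgery"}


def split_cohorts(patients, validation_fraction=0.5, seed=0):
    """Split patient records (dicts with a 'group' key) into two cohorts.

    All surgery patients go to training; ESD patients are divided at random.
    validation_fraction is an assumed value, not taken from the study.
    """
    surgery = [p for p in patients if p["group"] == "surgery"]
    esd = [p for p in patients if p["group"] in ESD_GROUPS]
    rng = random.Random(seed)
    rng.shuffle(esd)  # shuffles the local copy, not the caller's list
    n_val = int(len(esd) * validation_fraction)
    validation = esd[:n_val]
    training = surgery + esd[n_val:]
    return training, validation
```

Fixing the seed makes the random assignment reproducible, which matters when the validation cohort is later used for a head-to-head comparison.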
The ML model was constructed as a neural network with two hidden layers using Scikit-learn (https://scikit-learn.org), an ML library for Python. The training data were divided into four parts during the model training process, and parameter tuning was performed through fourfold cross-validation. We used the Adam optimizer for optimization. After parameter tuning, the first and second hidden layers comprised 6 and 18 nodes, respectively. The final inference model was an ensemble model (simple averaging) of the four models obtained through fourfold cross-validation. Hyperparameters of our ML model are listed in Supplementary Table S1.
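As a rough illustration of this setup, the sketch below trains one scikit-learn `MLPClassifier` per fold with hidden layers of 6 and 18 nodes and the Adam solver, then averages the predicted probabilities of the four models. The synthetic data, stratified fold assignment, `max_iter`, and random seeds are assumptions for demonstration; the actual hyperparameters are those in Supplementary Table S1.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier


def fit_fold_models(X, y, n_splits=4, seed=0):
    """Train one two-hidden-layer network per cross-validation fold."""
    models = []
    # Stratified folds are an assumption; the text says only "fourfold".
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fold, (train_idx, _held_out_idx) in enumerate(folds.split(X, y)):
        model = MLPClassifier(
            hidden_layer_sizes=(6, 18),  # node counts reported in the text
            solver="adam",
            max_iter=500,                # assumed value
            random_state=seed + fold,
        )
        model.fit(X[train_idx], y[train_idx])
        models.append(model)
    return models


def ensemble_predict_proba(models, X):
    """Simple averaging of the positive-class probability across models."""
    return np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)


# Synthetic stand-in data with seven input features (not study data).
rng = np.random.default_rng(0)
X_demo = rng.random((300, 7))
y_demo = (X_demo[:, 0] + X_demo[:, 3] + rng.normal(0, 0.3, 300) > 1.0).astype(int)

models = fit_fold_models(X_demo, y_demo)
probs = ensemble_predict_proba(models, X_demo)
```

In a real tuning loop, the held-out indices of each fold would be used to score candidate hyperparameters before the four fold models are combined.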
For model development, we initially used age, sex, tumor location, lesion size, dominant histology, presence or absence of mixed-type histology, invasion depth, lymphatic involvement, vascular involvement, histopathological ulceration, vertical margin, and treatment method as input parameters. Through parameter tuning within the training dataset, the best predictions were achieved using the following seven factors: lesion size, dominant histology (tub2 or others), presence or absence of mixed-type histology, invasion depth (M, SM1, or SM2), lymphatic involvement (positive or negative), vascular involvement (positive or negative), and treatment method (surgery, or ESD with/without additional surgery). Most of our data were encoded as binary variables (i.e., 0 or 1), except for invasion depth and lesion size. For invasion depth, ordinal encoding was used: 1 for SM2, 0.5 for SM1, and 0 for M. Lesion size was scaled to a range of 0 to 1 by dividing the raw value (in mm) by 100.
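The encoding of the seven input factors can be sketched as follows. The depth codes and the division of lesion size by 100 come from the text; which level of each binary variable maps to 1, the feature order, and the function name are assumptions.

```python
DEPTH_CODE = {"M": 0.0, "SM1": 0.5, "SM2": 1.0}  # ordinal codes from the text


def encode_features(size_mm, dominant_histology, mixed_type, depth,
                    lymphatic, vascular, treatment):
    """Encode one patient as a list of seven values in [0, 1]."""
    return [
        size_mm / 100.0,                             # lesion size scaling
        1.0 if dominant_histology == "tub2" else 0.0,  # tub2 vs others (assumed 1 = tub2)
        1.0 if mixed_type else 0.0,
        DEPTH_CODE[depth],
        1.0 if lymphatic else 0.0,
        1.0 if vascular else 0.0,
        1.0 if treatment == "surgery" else 0.0,      # assumed 1 = surgery
    ]


example = encode_features(25, "tub2", False, "SM1", True, False, "ESD")
```

Note that dividing by 100 keeps the size within 0 to 1 only for lesions up to 100 mm; larger lesions would need clipping or a different scale.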
Statistical analysis
The Chi-squared and Fisher exact tests were used to compare categorical data, and the Kruskal–Wallis and Mann–Whitney U tests were used to compare continuous data. The area under the receiver operating characteristic curve (AUC) was used to measure the performance of the prediction models, and DeLong’s test was used to compare the AUCs. P values < 0.05 were considered statistically significant. Analyses were performed using JMP Pro version 16 (SAS Institute, Cary, NC, USA) or EZR version 1.61 (Saitama Medical Center, Jichi Medical University, Japan).
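To make the AUC concrete, it equals the probability that a randomly chosen LNM-positive patient receives a higher predicted score than a randomly chosen LNM-negative patient, with ties counted as one half. The sketch below computes it directly from that definition; it is a plain-Python illustration, not the JMP Pro or EZR routine used in the study, and it does not implement DeLong's test.

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # A pair counts 1 if the positive case scores higher, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # prints 0.75
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; rank-based formulas give the same value more efficiently.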