An overview of this study is presented in Fig. 1. The dataset consisted of a development set and a test set. To develop a tissue segmentation model, 366 WSIs of ESD specimens from 140 patients with EGC were retrospectively enrolled. The scanned WSIs were manually annotated using QuPath v0.2.0-m2 software [13] by S.A., a gastrointestinal histopathologist with 12 years of experience. For each WSI, 2 to 10 regions of interest, each approximately 0.5 mm² in size, were selected (Fig. 2a). Most tumor areas and some non-tumor areas were included. A total of 2257 regions of interest were marked, and pixel-level annotation was performed for the tumor and muscularis mucosa (Fig. 2b), generating 83,839 patch images. An example of a manually annotated tumor region is shown in Fig. 2. The development dataset was randomly divided in a 5:1 ratio into a training set and an internal validation set, and tissue segmentation performance at the patch level was evaluated on the internal validation set.
Fig. 1 Overview of the data and deep learning framework for tissue segmentation presented in this study. (a) Sample training images with annotations used to develop the tissue segmentation model. Red annotation represents tumor; green annotation represents muscularis mucosa. (b) Description of the dataset for tissue segmentation model development and the dataset for evaluation of WSI-level tumor detection and submucosal invasion detection. (c) Deep learning framework for the tissue segmentation model. An H&E image is fed to the tissue segmentation model, which outputs a three-channel probability map, with one channel each for the tumor, muscularis mucosa, and others classes. The loss between the output channels and the ground truth mask is backpropagated through the model to update its weights during training
Fig. 2 An example of a manually annotated tumor region. (a) A total of 2257 regions of interest were marked, and (b) pixel-level annotation was performed for the tumor (red) and muscularis mucosa (green) by a gastrointestinal pathologist
Next, a detection algorithm for tumor and submucosal invasion at the WSI level was developed (Fig. 3). Its performance was tested using another test dataset comprising 111 WSIs from 61 patients. It consisted of 76 tumor WSIs and 35 non-tumor WSIs. The presence of tumor and submucosal invasion in each WSI was annotated using QuPath v0.2.0-m2 software [13] by S.A., and the prediction of tumor and submucosal invasion by the algorithm was compared with human annotation. This study was approved by the Institutional Review Board of Samsung Medical Center (IRB no. 2021-02-146). The requirement for informed consent was waived by the institutional review board because the samples were anonymized.
Fig. 3 Pipeline for WSI-level tumor detection and submucosal invasion detection. The H&E WSI is first tiled into patches, which are input to the trained tissue segmentation model to predict tumor and muscularis mucosa. The output patches are then stitched together to generate a WSI-level tissue segmentation. If the output WSI contains a tumor segment, tumor is considered detected. For submucosal invasion detection, boxes divided into grids are drawn along the detected muscularis mucosa; if a large proportion of the grids in a box is filled with tumor, the box is considered a submucosal invasion spot. The final overlaid WSI shows the predicted tumor and muscularis mucosa overlaid on the H&E WSI
Dataset
The development and test cohorts were obtained from the Samsung Medical Center from ESD specimens received between January 2020 and June 2024. The ESD specimens were cut at 2-mm intervals and serially embedded. The number of sections per slide was typically three (range: 1–4). The number of slides per patient ranged from 3 to 26. In the development cohort, representative slides from each patient were selected. For the test cohort, either representative slides from the patients or all slides from some patients were used. The H&E slides were scanned using an APERIO AT2 (Vista, CA, USA) at 20× magnification. All cases were histologically diagnosed as adenocarcinoma, and cases with pre-malignant lesions or other tumor types, such as neuroendocrine tumors, were excluded. The clinicopathological information of the dataset is presented in Table 1.
Table 1 Clinicopathologic information of dataset
The cases in the development set were randomly selected during the study period. Based on the Lauren classification, 82.1% were classified as intestinal type, and 7.9% as diffuse type. Since gastric cancers often exhibit heterogeneous histologic features, each WSI was reviewed for subcomponents. Among the 140 cases, poorly differentiated components (≥ 5%) were identified in 18.6%. There was mucosal cancer in 67.9% and submucosal invasion in 32.1%. The test set was intentionally enriched with tumors exhibiting submucosal invasion (70.5%). According to the Lauren classification, 91.8% were classified as intestinal type, and 1.6% as diffuse type. Poorly differentiated components (≥ 5%) were identified in 16.4%. Finally, slides from 10 patients were selected from the test dataset to evaluate the performance of the pathologists. There were three mucosal cancers, and seven with submucosal invasion. The three mucosal cancers were histopathologically well-differentiated tubular gastric foveolar adenocarcinomas.
Tissue segmentation model and training details
Three widely used semantic segmentation models (U-Net++ [14], DeepLabV3+ [15], and SegFormer [16]) were selected to segment the gastric tissue H&E images into three classes: tumor, muscularis mucosa, and others. U-Net++ and DeepLabV3+ are convolutional neural network-based models, for which EfficientNet-B3 [17] was used as the backbone. SegFormer is a transformer-based model, for which MiT-B1 was used as the backbone. These models were implemented using the PyTorch framework [18] and the Segmentation Models PyTorch library [19]. They were trained and evaluated on the patch-level dataset (Fig. 1b) to identify the best-performing model for further testing on detection of tumor and submucosal invasion at the WSI level.
The model-training process is illustrated in Fig. 1c. An H&E-stained image is fed into the tissue segmentation model to produce a three-channel probability map of the same size, with each channel representing a specific class. The model iteratively improves predictions by minimizing the loss between the output probability map and the input mask, which corresponds to the ground truth segmentation derived from human annotation. The pixel-wise categorical cross-entropy loss function is defined as:
$$\text{Loss}=-\sum_{i=1}^{H}\sum_{j=1}^{W}\sum_{c=1}^{C} y_{i,j,c}\,\log\left(\hat{y}_{i,j,c}\right)$$
(1)
where \(H\) and \(W\) are the height and width of the image, \(C\) is the number of classes, \(y_{i,j,c}\) is a binary indicator (0 or 1) of whether class label \(c\) is the correct classification for pixel \((i, j)\), and \(\hat{y}_{i,j,c}\) is the predicted probability that pixel \((i, j)\) belongs to class \(c\).
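As an illustration of Eq. 1, the following minimal sketch evaluates the pixel-wise categorical cross-entropy on a toy 1 × 2 image using only the Python standard library. The function name, the nested-list layout, and the `eps` clamp are assumptions made for this example and are not taken from the study's implementation. Deep learning frameworks often average this sum over pixels rather than returning the raw sum.

```python
import math

def pixelwise_cross_entropy(y_true, y_prob, eps=1e-12):
    """Sum of -y * log(y_hat) over all pixels (i, j) and classes c (Eq. 1).

    y_true: H x W x C nested lists of one-hot labels (0 or 1).
    y_prob: H x W x C nested lists of predicted class probabilities.
    eps:    hypothetical clamp to avoid log(0); not part of Eq. 1.
    """
    loss = 0.0
    for row_true, row_prob in zip(y_true, y_prob):
        for px_true, px_prob in zip(row_true, row_prob):
            for y, p in zip(px_true, px_prob):
                if y:  # only the correct class contributes for one-hot labels
                    loss -= y * math.log(max(p, eps))
    return loss

# Toy 1 x 2 image; class order assumed to be (tumor, muscularis mucosa, others).
y_true = [[[1, 0, 0], [0, 0, 1]]]
y_prob = [[[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]]
loss = pixelwise_cross_entropy(y_true, y_prob)  # equals -ln(0.8) - ln(0.5)
```

Only the probability assigned to the annotated class of each pixel enters the sum, so confident correct predictions contribute values close to zero.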
For model training, the RAdam optimizer [20] was used with an initial learning rate of 0.0001 and a batch size of 32. The learning rate was multiplied by a factor of 0.5 every 30 epochs. Training was stopped when the validation loss no longer improved. Common image augmentation techniques, such as blurring, rotation, flipping, color jitter, and CutMix [21], were applied to enhance the robustness and generalization ability of the model.
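The schedule described above can be sketched as follows, reading "reduced by 0.5" as multiplication by 0.5 at every 30th epoch. The early-stopping rule is only one possible reading of "no improvement in the validation loss": the `patience` value of 10 epochs is an assumption for illustration and is not stated in the text.

```python
def learning_rate(epoch, base_lr=1e-4, factor=0.5, step=30):
    """Step decay: multiply the learning rate by `factor` every `step` epochs."""
    return base_lr * factor ** (epoch // step)

def should_stop(val_losses, patience=10):
    """Return True when the best validation loss is at least `patience` epochs old.

    `patience` is a hypothetical parameter; the source does not report one.
    """
    if len(val_losses) <= patience:
        return False
    best_epoch = val_losses.index(min(val_losses))
    return len(val_losses) - 1 - best_epoch >= patience

# Learning rate at the start of epochs 0, 30, and 60
schedule = [learning_rate(e) for e in (0, 30, 60)]
```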
Submucosal invasion detection algorithm
The process of detecting submucosal invasion at the WSI level is shown in Fig. 3. First, the WSI was divided into patches matching the size used to train the tissue segmentation model. These patches were then fed into the model to predict tumor and muscularis mucosa at the pixel level. To conserve memory and computational resources during post-processing, the predicted patch maps were scaled down and reassembled into a complete WSI-level map.
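The tiling, down-scaling, and reassembly steps might look like the sketch below. It is a simplified illustration rather than the study's pipeline: the patch size, the scale factor, the nearest-neighbour down-scaling, and the label encoding (0 = others, 1 = tumor, 2 = muscularis mucosa) are all assumptions, and the patch size is assumed to be divisible by the scale factor.

```python
def tile_origins(width, height, patch):
    """Top-left (x, y) of each non-overlapping patch covering the slide."""
    return [(x, y) for y in range(0, height, patch) for x in range(0, width, patch)]

def downscale(label_patch, factor):
    """Nearest-neighbour down-scaling: keep every `factor`-th pixel."""
    return [row[::factor] for row in label_patch[::factor]]

def stitch(pred_patches, width, height, factor):
    """Place down-scaled patch predictions into one WSI-level label map.

    pred_patches maps each patch origin (x, y) to a 2-D list of class labels.
    """
    out_w = (width + factor - 1) // factor   # ceiling division
    out_h = (height + factor - 1) // factor
    canvas = [[0] * out_w for _ in range(out_h)]
    for (x, y), pred in pred_patches.items():
        small = downscale(pred, factor)
        for dy, row in enumerate(small):
            for dx, label in enumerate(row):
                cy, cx = y // factor + dy, x // factor + dx
                if cy < out_h and cx < out_w:  # ignore padding beyond the slide edge
                    canvas[cy][cx] = label
    return canvas

# Toy 4 x 4 "slide" with 2 x 2 patches, each filled with a single predicted label
origins = tile_origins(4, 4, 2)
labels = [1, 2, 0, 1]
preds = {o: [[k, k], [k, k]] for o, k in zip(origins, labels)}
wsi_map = stitch(preds, 4, 4, factor=2)
```

Working on the reduced map keeps memory use proportional to the down-scaled slide area rather than to the full-resolution WSI.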
The algorithm for detecting submucosal invasion using the identified tumor and muscularis mucosa regions is described below. Submucosal invasion occurs when a tumor breaks through the muscularis mucosa and infiltrates the submucosa. The detection method focuses on the extent to which tumor is distributed around the muscularis mucosa. Specifically, if tumor is present in a substantial portion of the area surrounding a given point along the muscularis mucosa, a high likelihood of submucosal invasion is inferred. The bottom-right image in Fig. 3 illustrates the concept of this algorithm. Three spots are centered on detected muscularis mucosa pixels, with a square grid representing the surrounding area of each. Spot “a” has no tumor around it; therefore, it cannot be an invasion spot. Spots “b” and “c” both have tumor in their grids, but spot “b” has a higher proportion of grids with tumor than spot “c”. Therefore, spot “b” is more likely to be a submucosal invasion spot. According to the algorithm, if more than 50% of the grids around a spot contain tumor, it is likely to be a submucosal invasion spot.
The detailed submucosal invasion detection algorithm is as follows. For each point on the detected muscularis mucosa, an \(n\times n\) grid is centered on that point. For each grid, we count the number of squares, \(m\), in which tumor is detected. A higher \(m\) indicates a higher probability of submucosal invasion. The cutoff for \(m\) was defined as \(\delta\), and \(p\) was defined as the probability that a WSI contains submucosal invasion. Two approaches were used to calculate \(p\):
1. Max-based approach: The maximum \(m\) value in the WSI, denoted by \(m_{\max}\), is used. The probability \(p\) is then calculated as:
$$p = \begin{cases}\displaystyle \frac{m_{\max}}{n^{2}}, & \text{if } m_{\max} > \delta, \\ 0, & \text{if } m_{\max} \leq \delta.\end{cases}$$
(2)
2. Mean-based approach: If there are \(k\) grids where \(m_{i}>\delta\), the probability \(p\) is calculated as:
$$p = \begin{cases}\displaystyle \left( \sum_{i=1}^{k} \frac{m_{i}}{n^{2}} \right)\! \bigg/ k, & \text{if } k > 0, \\ 0, & \text{if } k = 0.\end{cases}$$
(3)
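The two approaches can be sketched on a small label map as follows. This is an illustrative reading of Eqs. 2 and 3, not the authors' code: it assumes that \(m\) is normalized by the number of squares \(n^{2}\), that a square counts as tumor if it contains at least one tumor pixel, that labels are encoded as 0 = others, 1 = tumor, 2 = muscularis mucosa, and that `cell` (the side length of one square in pixels) is a free parameter. The 50% criterion mentioned earlier would correspond to \(\delta = n^{2}/2\) under these assumptions.

```python
def count_tumor_squares(label_map, cy, cx, n, cell=1, tumor=1):
    """Count squares m of an n x n grid centered on (cy, cx) that contain tumor."""
    h, w = len(label_map), len(label_map[0])
    half = n * cell // 2
    top, left = cy - half, cx - half
    m = 0
    for gy in range(n):
        for gx in range(n):
            y0, x0 = top + gy * cell, left + gx * cell
            # Assumption: one tumor pixel is enough to mark the square as tumor.
            has_tumor = any(
                label_map[y][x] == tumor
                for y in range(max(y0, 0), min(y0 + cell, h))
                for x in range(max(x0, 0), min(x0 + cell, w))
            )
            m += has_tumor
    return m

def invasion_probability(label_map, n, delta, mode="max", cell=1, tumor=1, mm=2):
    """Probability p that the slide contains submucosal invasion (Eqs. 2 and 3)."""
    counts = [
        count_tumor_squares(label_map, y, x, n, cell, tumor)
        for y, row in enumerate(label_map)
        for x, label in enumerate(row)
        if label == mm  # one grid per detected muscularis mucosa pixel
    ]
    above = [m for m in counts if m > delta]
    if not above:
        return 0.0
    if mode == "max":
        return max(above) / (n * n)                        # Eq. 2
    return sum(m / (n * n) for m in above) / len(above)    # Eq. 3

# Toy map: tumor (1) directly above two muscularis mucosa pixels (2)
toy = [
    [1, 1, 1, 0],
    [0, 2, 2, 0],
    [0, 0, 0, 0],
]
p_max = invasion_probability(toy, n=3, delta=1, mode="max")
p_mean = invasion_probability(toy, n=3, delta=1, mode="mean")
```

In the toy map the two grids contain 3 and 2 tumor squares out of 9, so the max-based value reflects the single most suspicious spot, whereas the mean-based value averages over all spots exceeding the cutoff.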
Application of the deep learning model in histopathological diagnosis
The performance of the pathologists in the diagnosis using WSIs of slides from ESDs with and without deep learning model assistance was evaluated. Three pathologists (YJ. C., I. H, and JM. N) evaluated 57 digitally scanned WSIs of 10 ESDs from the test dataset and were requested to mark the tumor area and complete the diagnostic format used in routine diagnostic practice. With the assistance of the deep learning model, tissue segmentation was provided; a heat-map flagging tumor and muscularis mucosa over the WSI could be turned on and off in QuPath v0.2.0-m2 software [13]. The prediction value (presence/absence) for submucosal invasion was not provided. The second evaluation was performed after a 4-week wash-out period. Diagnostic accuracy and mean diagnostic time were compared.
Statistical methods
To evaluate the performance of the deep learning models developed in this study, the Dice coefficient was calculated to assess the accuracy of tumor and muscularis mucosa segmentation at the patch level. Area under the receiver operating characteristic curve (AUROC) values were computed to measure the diagnostic accuracy of the model for detection of tumor and submucosal invasion at the WSI level. The optimal thresholds for sensitivity and specificity were determined using Youden’s J statistic.
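For reference, the Dice coefficient of a predicted mask \(A\) and a ground truth mask \(B\) is \(2|A\cap B|/(|A|+|B|)\), and Youden's J is sensitivity + specificity − 1. The sketch below shows both on flat lists. The function names, the convention that an empty pair of masks scores 1.0, and the rule that a slide is called positive when its score is at or above the threshold are assumptions made for illustration.

```python
def dice(pred, truth):
    """Dice coefficient of two flat binary masks (lists of 0/1)."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * inter / total if total else 1.0  # assumed convention for empty masks

def youden_threshold(scores, labels):
    """Threshold maximizing J = sensitivity + specificity - 1.

    A case is called positive when score >= threshold. Requires at least one
    positive and one negative label.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_j, best_t = -1.0, None
    for t in sorted(set(scores)):
        tp = sum(s >= t and y == 1 for s, y in zip(scores, labels))
        tn = sum(s < t and y == 0 for s, y in zip(scores, labels))
        j = tp / pos + tn / neg - 1
        if j > best_j:  # ties keep the lowest threshold
            best_j, best_t = j, t
    return best_t, best_j

# Toy values chosen for illustration only
d = dice([1, 1, 0, 0], [1, 0, 0, 0])
threshold, j = youden_threshold([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```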
Paired t tests were used to compare diagnostic times, with and without the assistance of the deep learning model. Statistical analyses were performed using SPSS 26.0 (IBM, Armonk, NY, USA), with a significance level of P < 0.05.
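The paired t statistic itself is straightforward to compute: it is the mean of the per-case differences divided by its standard error, with \(n-1\) degrees of freedom. The sketch below is illustrative only; the analyses in this study were run in SPSS, the function name is hypothetical, and the example times are invented toy values, not study data. Obtaining the P value additionally requires the t distribution, which is omitted here.

```python
import math
import statistics

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched samples x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)           # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1

# Invented toy diagnostic times (minutes) without and with model assistance
without_model = [12, 15, 11]
with_model = [10, 12, 9]
t_stat, dof = paired_t(without_model, with_model)
```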