Machine learning-enabled classification of lung cancer cell lines co-cultured with fibroblasts using a lightweight convolutional neural network for initial diagnosis

Cell lines, patient-derived lung cancer cells, culture, and reagents

We obtained primary human dermal fibroblasts (HDF-a) and the NCI-H460 (large cell carcinoma), A549 (adenocarcinoma), and H520 (squamous cell carcinoma) cell lines from ATCC. These cell lines were cultured in RPMI medium supplemented with 10% fetal bovine serum at 37 °C with 5% CO2. Patient-derived lung cancer cells, HCC 4087 and HCC 4190, were provided by UT Southwestern Medical Center. Both are classified as adenocarcinoma, confirmed via a patient-derived xenograft (PDX) model and comprehensive genetic analysis: HCC 4087 harbors the KRAS oncogene, while HCC 4190 carries the EGFR oncogene. These cells were cultured in RPMI medium supplemented with 10% fetal bovine serum. For 3D printing of the magnetic cell culture platform, we used an Anycubic Photon Mono printer and Anycubic 3D Printer 405 nm UV-Curing Resin, both purchased from Amazon. The convolutional neural network (CNN) was trained on two on-device platforms without network or cloud connectivity: a laptop equipped with an Nvidia GeForce GTX 1050 with 4 GB of video RAM (Intel i5 9300, 8 GB system RAM) and a laptop equipped with an AMD Ryzen 5800HS with 12 GB system RAM.

Lung cancer and fibroblast co-culture for cancer outgrowth collection and data augmentation for machine learning

Magnetic co-culture device preparation and cell seeding

A 12-well magnet holder was created using 3D-printed UV resin. Magnets were inserted into the holder, which was then sterilized. Inside a biosafety cabinet, a 12-well plate was placed atop the magnet holder. Sterilized stainless-steel tubes were positioned over the magnets within each well [17]. Each well was seeded with 5,000 cancer cells (H460, A549, H520, HCC 4087, or HCC 4190) in 3 µL inside the tube and 20,000 HDF-a fibroblasts in 40 µL outside the tube. Cells were incubated at 37 °C with 5% CO2 for 3 h to allow attachment to the well plate. After incubation, the tubes were carefully removed using tweezers within the biosafety cabinet. Brightfield images were taken to confirm correct seeding and to record the initial state of each well (Day 0, Fig. 1c).

Fig. 1

Lung cancer growth collection, image modification, and augmentation for machine learning. a Original images taken sequentially, as indicated by the numbers in the corner (2 × 2 grid at 2.5 × magnification). b Combined image after applying a stitching program. Scale bar = 1 mm. c Fluorescent image showing differentially stained cells prior to seeding: cancer cells (H460) in red and healthy HDF-a fibroblasts in green. d Original image resized to 512 × 512 pixels (RS) and then augmented by horizontal flipping (HF) and vertical flipping (VF). e Examples of original image rotations at 90°, 180°, and 270° for data augmentation

Culturing and imaging

Following the initial imaging, 1.5 mL of fresh RPMI medium supplemented with 10% FBS was added to each well. The cells were incubated for 9 days. Daily images were captured at a consistent time for each well. Images were taken in a 2 × 2 grid at 2.5 × magnification, centered on the seeding island formed by the tube (Fig. 1a). These images were stitched together using the stitching program developed by Preibisch et al. [18]. To ensure consistency, the images were centered and cropped to a uniform area using Adobe Photoshop.

Data augmentation for machine learning

To expand the dataset for machine learning, we performed data augmentation on the collected cancer outgrowth images. The initial dataset consisted of images from 86 wells (H460), 42 wells (A549), 44 wells (H520), 37 wells (HCC 4087), and 42 wells (HCC 4190). Because of the limited number of images, we augmented the data with rotation and mirroring transformations. Each image was resized to 512 × 512 pixels and then flipped horizontally and vertically. The original and flipped images were each rotated by 90, 180, and 270 degrees, creating a dataset 12 times larger than the original. This augmentation resulted in the following datasets for machine learning: H460 (8,292 augmented images for training and 1,919 images for validation), A549 (4,200 images for training and 839 images for validation), H520 (4,308 images for training and 959 images for validation), HCC 4087 (3,564 images for training and 792 images for validation), and HCC 4190 (3,372 images for training and 696 images for validation). The purpose of this extensive data augmentation was to enhance the generalization capability of the convolutional neural network (CNN) model, allowing it to better classify new data (i.e., unseen images used for validation). The chosen augmentations were justified by the radial and random outgrowth patterns of the cancer cells, ensuring that flipped and rotated configurations still reflect natural cancer outgrowth behaviors (Fig. 1).
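As an illustration of this pipeline, the 12-fold augmentation (resize, horizontal/vertical flips, and 0°/90°/180°/270° rotations) can be sketched in Python with Pillow. The folder layout, file naming, and grayscale conversion below are assumptions made for the sketch, not the exact scripts used in this work.

```python
# Sketch of the 12-fold augmentation: resize to 512 x 512, keep the original
# plus horizontal and vertical flips, and rotate each by 0/90/180/270 degrees.
# Paths, naming scheme, and grayscale conversion are illustrative assumptions.
from pathlib import Path
from PIL import Image

def augment_image(path: Path, out_dir: Path, size: int = 512) -> None:
    """Write the 12 augmented variants of one stitched outgrowth image."""
    out_dir.mkdir(parents=True, exist_ok=True)
    base = Image.open(path).convert("L").resize((size, size))   # RS
    variants = {
        "rs": base,                                   # resized original
        "hf": base.transpose(Image.FLIP_LEFT_RIGHT),  # horizontal flip
        "vf": base.transpose(Image.FLIP_TOP_BOTTOM),  # vertical flip
    }
    for tag, img in variants.items():
        for angle in (0, 90, 180, 270):
            img.rotate(angle).save(out_dir / f"{path.stem}_{tag}_{angle}.png")

if __name__ == "__main__":
    for well_image in Path("stitched/H460").glob("*.png"):  # hypothetical folder
        augment_image(well_image, Path("augmented/H460"))
```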

TinyVGG convolutional neural network parameters for cancer classification

The most basic structure of a CNN model includes the following layers, with the output of each layer feeding into the next: a convolutional layer, an activation function, a pooling layer, a flattening layer, a dense layer, and a loss function. We used the lightweight TinyVGG model [19] (shown in Fig. 2) because it is designed for image classification tasks while requiring fewer computational resources. This allows it to operate locally on a secured laptop without network or cloud connectivity and simplifies training by having fewer parameters to adjust.

Fig. 2

Schematic of the TinyVGG model architecture for lung cancer classification. The TinyVGG model for lung cancer classification is initialized with random weights. Training images, depicting lung cancer outgrowth over surrounding fibroblasts (e.g., 8,292 augmented H460 images for training), are processed in batches over a designated number of epochs. The batch size refers to the number of images processed before updating the model’s weights. An epoch represents a complete pass through the entire training dataset. After each epoch, the validation set (e.g., 1,919 augmented H460 images for validation) is used to assess the model’s accuracy on unseen data (validation accuracy). The model iteratively optimizes its weights to minimize the training loss. The architecture includes multiple layers, with Conv2D denoting 2D convolution layers

Convolution

The convolutional layers have customizable parameters such as output channels, kernel size, stride length, and padding. These parameters define the layer’s filters: the number of filters equals the number of input channels multiplied by the number of output channels, each filter’s dimensions are set by the kernel size, and the stride and padding determine the size of the output feature map. Each kernel performs a mathematical operation on a patch of pixels using a matrix with the kernel dimensions and translates across the image by a number of pixels given by the stride. Padding adds a border of blank pixels around the edge of the image to retain the image size and preserve edge information. Without padding, the output shrinks at the edges because fewer kernel positions cover the border pixels.

For example, a 512 × 512 × 1 pixel input image with 3 output channels, a kernel size of 3 × 3 pixels, a stride of 1 pixel, and padding of 1 pixel generates an output of 512 × 512 × 3. Changing the kernel size to 4 × 4 pixels and the stride to 2 pixels results in a 256 × 256 × 3 output feature map, halving the width and height dimensions of the output tensor. The feature maps are learned based on the model’s loss function, with more feature maps meaning more trainable parameters.
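The shape arithmetic in this example can be verified with a short PyTorch snippet; the snippet is illustrative and independent of the exact layer settings listed in Table 1.

```python
# Verifying the worked example with torch.nn.Conv2d on a batch of one
# grayscale 512 x 512 image (tensor layout N x C x H x W).
import torch
import torch.nn as nn

x = torch.randn(1, 1, 512, 512)

same_size = nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3, stride=1, padding=1)
halved = nn.Conv2d(in_channels=1, out_channels=3, kernel_size=4, stride=2, padding=1)

print(same_size(x).shape)  # torch.Size([1, 3, 512, 512])
print(halved(x).shape)     # torch.Size([1, 3, 256, 256])
```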

In our model, the kernel size, stride length, padding, and filters for the convolutional 2D layers are listed in Table 1. The kernel sizes were chosen to give the largest feature maps. Layer 1 (first Conv2D in Fig. 2, first blue bar) uses a kernel size, stride, and padding that reduce the spatial size by half (from 512 × 512 to 256 × 256) to lower model complexity, while Layers 2–4 (2nd to 4th blue bars in Fig. 2) maintain the image size between input and output. Further reductions in size between Layer 2 and Layer 3 are due to the max pooling functions (orange bar in Fig. 2). Output channels are maintained at 10 to increase the number of filters after the initial convolution (e.g., from 10 to 100 filters) and capture higher-order features. H × W × C = Height × Width × Channel(s).
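A hedged PyTorch sketch of a TinyVGG-style network consistent with this description is shown below. Only the details stated in the text are fixed (a first convolution that halves 512 × 512 to 256 × 256, size-preserving Layers 2–4, 10 output channels, and max pooling); the grouping into two blocks, the second pooling layer, and the five-class output are assumptions and may differ from the exact configuration in Table 1.

```python
# Hedged sketch of a TinyVGG-style classifier consistent with Fig. 2 and the
# description above; layer grouping and the 5-class output are assumptions.
import torch
import torch.nn as nn

class TinyVGGSketch(nn.Module):
    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.block_1 = nn.Sequential(
            nn.Conv2d(1, 10, kernel_size=4, stride=2, padding=1),   # 512 -> 256
            nn.ReLU(),
            nn.Conv2d(10, 10, kernel_size=3, stride=1, padding=1),  # 256 -> 256
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),                  # 256 -> 128
        )
        self.block_2 = nn.Sequential(
            nn.Conv2d(10, 10, kernel_size=3, stride=1, padding=1),  # 128 -> 128
            nn.ReLU(),
            nn.Conv2d(10, 10, kernel_size=3, stride=1, padding=1),  # 128 -> 128
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),                  # 128 -> 64
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                          # 10 x 64 x 64 -> 40,960 values
            nn.Linear(10 * 64 * 64, num_classes),  # one logit per cell line
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.block_2(self.block_1(x)))

if __name__ == "__main__":
    logits = TinyVGGSketch()(torch.randn(1, 1, 512, 512))
    print(logits.shape)  # torch.Size([1, 5])
```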

Table 1 List of utilized convolutional layer parameters for lung cancer classification

Activation

The activation function applies an element-wise operation to the feature maps and always follows a convolutional layer. It is often assumed and not shown in visual representations of a CNN model. Common activation functions include the ReLU (rectified linear unit), sigmoid, and hyperbolic tangent functions. The ReLU function, used in our model, sets all negative values to zero and passes positive values through unchanged (f(x) = max(0, x)).
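A minimal example of the element-wise ReLU operation:

```python
# ReLU applied element-wise to a feature map: negatives become zero,
# positives pass through unchanged.
import torch
import torch.nn as nn

feature_map = torch.tensor([[-2.0, -0.5],
                            [ 0.5,  3.0]])
print(nn.ReLU()(feature_map))  # negatives are zeroed; 0.5 and 3.0 are unchanged
```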

Pooling

Pooling layers (orange bars in Fig. 2) are used after convolutional blocks (multiple convolutional layers and activation functions) to reduce the computational load of training and mitigate overfitting. The parameters of a pooling layer are its kernel size and stride. Our model uses a 2 × 2 pixel kernel with a stride of 2, which halves the width and height of each output feature map, retaining 25% of the starting data. The two common types of pooling are average pooling and max pooling. Average pooling outputs the average of the values within the 2 × 2 kernel, while max pooling outputs the highest value in the kernel. Max pooling, which preserves the most prominent features, is the type used in our model.
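The difference between the two pooling types can be seen on a small toy feature map; this snippet is illustrative only.

```python
# Max vs. average pooling with a 2 x 2 kernel and stride 2: each halves the
# height and width of the feature map, keeping 25% of the values.
import torch
import torch.nn as nn

x = torch.arange(16.0).reshape(1, 1, 4, 4)       # one 4 x 4 feature map
print(nn.MaxPool2d(kernel_size=2, stride=2)(x))  # largest value in each window
print(nn.AvgPool2d(kernel_size=2, stride=2)(x))  # average of each window
```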

Flatten and dense layers

The flatten layer (gray bar in Fig. 2) transforms the 3D matrix of data into a 1D vector so that the dense layer (yellow bar in Fig. 2) can use the extracted features to identify patterns for cancer classification. The dense layer creates the parameters that are adjusted to produce the output classification (H460 classification in Fig. 2). We use the Linear function in PyTorch, which models y = xAᵀ + b, where x is the input, y is the output, A is a learnable weight matrix (ᵀ denotes its transpose), and b is a learnable bias, applied across the flattened tensor. Essentially, each value in the flattened vector has a learnable weight associated with it that the model tunes during training, and the weighted values are summed to produce the output score for each class.
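As a sketch, the flatten and dense steps map a 3D feature map to one score per class; the 10 × 64 × 64 feature size and the five output classes below are assumptions matching the architecture sketch above and the five cell lines studied.

```python
# Flatten a feature map to a 1D vector, then apply a linear (dense) layer
# computing y = x A^T + b; feature size and class count are assumptions.
import torch
import torch.nn as nn

features = torch.randn(1, 10, 64, 64)       # batch x channels x H x W
flat = nn.Flatten()(features)               # shape: (1, 40960)
dense = nn.Linear(in_features=10 * 64 * 64, out_features=5)
logits = dense(flat)                        # one score per cell line
print(flat.shape, logits.shape)             # (1, 40960) and (1, 5)
```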

Loss function and accuracy

The loss function gauges the effectiveness of the weights in the dense layer. We used the Cross Entropy function, which compares the predicted class of an input image (based on the output of the dense layer) with the ground truth value (label) of the image class. The magnitude of the difference between the true classification value and the output of the algorithm is the “Loss”. The algorithm undergoes training sessions where the input data is fed into the model, and the weights are periodically adjusted to minimize the loss function. Accuracy is another common metric in classification that indicates how often the algorithm correctly classifies images. While accuracy is not related to the magnitude of classification errors, it is useful for diagnosing training issues and interpreting the acceptability of loss values.
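A minimal example of the cross-entropy loss and accuracy computation on a toy batch (the logits and labels below are made up for illustration):

```python
# Cross-entropy loss compares dense-layer outputs (logits) against ground-truth
# labels; accuracy counts how often the highest-scoring class matches the label.
import torch
import torch.nn as nn

logits = torch.tensor([[2.1, 0.3, -1.0, 0.0, 0.5],   # two images, five classes
                       [0.2, 1.7,  0.1, 0.0, 0.3]])
labels = torch.tensor([0, 1])                         # ground-truth class indices

loss = nn.CrossEntropyLoss()(logits, labels)
accuracy = (logits.argmax(dim=1) == labels).float().mean()
print(loss.item(), accuracy.item())
```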

Training and validation

Typically, the dataset is split for training and validation, with approximately 80% of the data set aside for training and the remainder for validation. The training dataset informs the algorithm’s weights, while the validation dataset serves as the optimization metric. This split helps reduce overfitting, as the validation dataset should be “unknown” to the algorithm regarding weighting and provides initial feedback on the model’s potential efficacy.
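A generic sketch of such an 80/20 split using PyTorch utilities is shown below; in this work the augmented training and validation images were organized per cell line, so the folder layout, transforms, and batch size here are assumptions.

```python
# Generic 80/20 train/validation split of an image-folder dataset; folder
# layout, transforms, and batch size are illustrative assumptions.
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Grayscale(),          # single-channel input, as in the model
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("augmented", transform=transform)  # hypothetical path

n_train = int(0.8 * len(dataset))
train_set, val_set = random_split(dataset, [n_train, len(dataset) - n_train])

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32, shuffle=False)
```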
