## Problem Description

Histopathological analysis is critical for cancer diagnosis: pathologists examine tissue samples under a microscope to detect and classify cancerous cells. This process faces challenges such as time constraints, expert dependency, and inter-observer variability. In 2020, lung and colon cancers affected 4.19 million people worldwide, resulting in 2.7 million deaths. The shortage of pathologists, especially in developing regions, creates bottlenecks in timely diagnosis. AI can transform this field by providing rapid, consistent preliminary assessments, and recent advances in machine learning have demonstrated promising results for histopathological image analysis. This project aims to develop an automated classification system that can accurately distinguish between benign and malignant tissues in lung and colon samples, potentially supporting early detection and reducing the diagnostic delays that impact patient outcomes.
## Model Architecture

The model is a custom Convolutional Neural Network (CNN) defined in `model.py`, designed specifically for this histopathology classification task. The architecture consists of the following key components:

- Input: Accepts images with the number of channels specified in `config.INPUT_CHANNELS`.
- Convolutional Blocks (4 Blocks): The core of the network comprises four sequential convolutional blocks. Each block follows a similar pattern:
  - Two `Conv2d` layers with 3x3 kernels and padding=1. These layers extract features from the input. The number of output filters increases with each block (32 -> 32 in Block 1, 32 -> 64 in Block 2, 64 -> 128 in Block 3, 128 -> 256 in Block 4).
  - `BatchNorm2d` layers after each convolution to stabilize learning and improve convergence.
  - `ReLU` activation functions after each BatchNorm layer to introduce non-linearity.
  - A `MaxPool2d` layer (2x2 kernel, stride 2) at the end of each block to reduce spatial dimensions and provide some translation invariance.
  - `Dropout` layers after each pooling layer, with increasing probability (p=0 for Blocks 1 & 2, p=0.15 for Block 3, p=0.2 for Block 4) to regularize the network and prevent overfitting in deeper layers.
- Flattening: After the final convolutional block, the feature maps are flattened into a 1D vector. The size of this vector is calculated dynamically based on the input image size (`config.IMG_SIZE`) and the effect of the pooling layers.
- Fully Connected Layers:
  - A `Linear` layer maps the flattened features to an intermediate size of 512 nodes, followed by a `ReLU` activation.
  - A `Dropout` layer with a higher probability (p=0.4) is applied before the final classification layer for further regularization.
  - A final `Linear` layer maps the 512 features to the number of output classes (`config.NUM_CLASSES`, which is 5). This layer outputs the raw logits for each class.
This architecture progressively extracts more complex features while reducing spatial dimensions and uses techniques like BatchNorm and Dropout to facilitate stable training and improve generalization on the histopathology images.
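The block structure described above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the exact contents of `model.py`; the `INPUT_CHANNELS`, `IMG_SIZE`, and `NUM_CLASSES` constants below are stand-ins for the values in `config.py` (the image size used here is an assumed placeholder).

```python
import torch
import torch.nn as nn

# Stand-ins for config.py values; IMG_SIZE is an assumed placeholder.
INPUT_CHANNELS, IMG_SIZE, NUM_CLASSES = 3, 64, 5

def conv_block(in_ch, out_ch, p_drop):
    """Two Conv-BN-ReLU stages, then 2x2 max pooling and dropout."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2, stride=2),
        nn.Dropout(p=p_drop),
    )

class HistoCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(INPUT_CHANNELS, 32, 0.0),   # Block 1
            conv_block(32, 64, 0.0),               # Block 2
            conv_block(64, 128, 0.15),             # Block 3
            conv_block(128, 256, 0.2),             # Block 4
        )
        # Four 2x2 poolings halve each spatial dimension four times.
        flat = 256 * (IMG_SIZE // 16) ** 2
        self.classifier = nn.Sequential(
            nn.Linear(flat, 512),
            nn.ReLU(),
            nn.Dropout(p=0.4),
            nn.Linear(512, NUM_CLASSES),  # raw logits
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))
```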
## Dataset

- Dataset: I have used the "Lung and Colon Cancer Histopathological Images" dataset.
- Download: It can be found on Kaggle or the original GitHub repo.
- Setup:
  - Download the data.
  - Organise the `Dataset` folder as follows:

    ```
    Dataset/
    ├── colon_image_sets/
    │   ├── colon_aca/*.jpeg
    │   └── colon_n/*.jpeg
    └── lung_image_sets/
        ├── lung_aca/*.jpeg
        ├── lung_n/*.jpeg
        └── lung_scc/*.jpeg
    ```
- Citation:
Borkowski AA, Bui MM, Thomas LB, Wilson CP, DeLand LA, Mastorides SM. Lung and Colon Cancer Histopathological Image Dataset (LC25000). arXiv:1912.12142 [eess.IV], 2019.
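The folder layout above maps onto five classes across two subset directories. A minimal sketch of collecting (path, label) pairs from that layout might look like this; the project's actual loading and transform logic lives in `dataset.py`, and the function name here is illustrative:

```python
from pathlib import Path

# The five class folder names from the dataset layout.
CLASSES = ["colon_aca", "colon_n", "lung_aca", "lung_n", "lung_scc"]

def collect_samples(root="Dataset"):
    """Walk the two subset directories and return (image_path, label) pairs."""
    samples = []
    for subset in ("colon_image_sets", "lung_image_sets"):
        for class_dir in sorted(Path(root, subset).iterdir()):
            label = CLASSES.index(class_dir.name)
            samples.extend((p, label) for p in sorted(class_dir.glob("*.jpeg")))
    return samples
```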
## Installation

- Clone: Get the project files.

  ```
  git clone https://github.com/anuvindpramod/project_anuvind_pramod.git
  cd project_anuvind_pramod
  ```

- Dataset: Set up the `Dataset` folder as described above.
- Environment: Create and activate a Python environment (e.g., using `conda` or `venv`).
- Install: Use the provided `requirements.txt` file.

  ```
  pip install -r requirements.txt
  ```
## Training

I used a two-stage approach:

- Initial Learning: First, I focused on getting the model to learn the basic patterns, training without strong regularization (e.g., no augmentations) to check its capacity. (This stage requires manual setup if you want to repeat it.)
- Fine-tuning: The pre-trained model was then fine-tuned carefully using the code in `train.py`. This involved loading the weights of the best-performing model from the pre-training step, applying data augmentation (such as the flips and rotations defined in `dataset.py`), using a lower learning rate, and employing early stopping to get the best generalization performance based on validation metrics. Checkpoints are saved in the `_checkpoints` directory.
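The early-stopping criterion used in the fine-tuning stage can be illustrated with a small self-contained sketch. The real loop in `train.py` also runs the training and validation epochs and saves checkpoints; the function name and the patience value here are illustrative assumptions:

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the epoch index at which training stops, given per-epoch
    validation losses: stop once the loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs."""
    best, bad_epochs = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, bad_epochs = loss, 0  # improvement: reset the counter
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran all epochs
```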
## Usage

- Train: Start the training process (uses settings from `config.py`).

  ```
  python train.py
  ```

- Test: Evaluate the best model on the test set.

  ```
  python test.py
  ```

- Predict: Run prediction on sample images placed in the `data/` folder. The `data/` folder contains all 5 class folders, each holding 10 images of that class.

  ```
  python predict.py
  ```
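Since the final layer outputs raw logits, prediction reduces to a softmax followed by an argmax over the five classes. A minimal sketch of that step (the class list matches the dataset folders; the function itself is illustrative, not the exact code in `predict.py`):

```python
import torch

CLASSES = ["colon_aca", "colon_n", "lung_aca", "lung_n", "lung_scc"]

def predict_label(logits: torch.Tensor) -> str:
    """Map a 1D tensor of raw class logits to the predicted class name."""
    probs = torch.softmax(logits, dim=0)  # normalize logits to probabilities
    return CLASSES[int(torch.argmax(probs))]
```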
## Evaluation Metrics

The performance of the model is evaluated using the following metrics (calculated in `train.py`'s `calculate_and_print_metrics` function and used by `test.py`):
- Overall Accuracy: The proportion of correctly classified images out of the total.
- Precision, Recall (Sensitivity), F1-Score: Calculated per class, as macro averages, and as weighted averages. These metrics assess the model's ability to correctly identify positive instances and avoid false positives/negatives for each category.
- Confusion Matrix: A table showing the actual vs. predicted class counts, revealing specific misclassification patterns.
- Specificity: Calculated per class, measuring the model's ability to correctly identify true negative instances.
- AUC (Area Under the ROC Curve): Calculated using the One-vs-Rest (OvR) strategy with macro averaging, providing a measure of the model's ability to distinguish between classes across different thresholds.
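The metrics above can be computed with scikit-learn along roughly these lines. This is an illustration with a toy 3-class example, not the exact code in `calculate_and_print_metrics`; note that per-class specificity is not a built-in scikit-learn scorer and is derived from the confusion matrix:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

# Toy 3-class example standing in for the 5 histopathology classes.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])
# One probability row per sample (rows sum to 1), required for OvR AUC.
y_prob = np.array([[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1],
                   [0.2, 0.3, 0.5], [0.1, 0.2, 0.7], [0.1, 0.1, 0.8]])

acc = accuracy_score(y_true, y_pred)
macro_precision = precision_score(y_true, y_pred, average="macro")
macro_recall = recall_score(y_true, y_pred, average="macro")
macro_f1 = f1_score(y_true, y_pred, average="macro")
cm = confusion_matrix(y_true, y_pred)
auc = roc_auc_score(y_true, y_prob, multi_class="ovr", average="macro")

# Per-class specificity = TN / (TN + FP), derived from the confusion matrix.
tn = cm.sum() - cm.sum(axis=0) - cm.sum(axis=1) + np.diag(cm)
fp = cm.sum(axis=0) - np.diag(cm)
specificity = tn / (tn + fp)
```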
## Results

The model was evaluated on a held-out test set comprising 5000 images (20% of the total dataset, 1000 images per class). The evaluation script (`test.py`) loaded the best weights saved during training (`_checkpoints/best_val_weights.pth`).
Key results on the test set are:
- Overall Accuracy: 0.9958 (99.58%)
- Macro Average F1-Score: 0.9958
- AUC (Macro OVR): 0.9999
Performance Highlights:

- The model achieved near-perfect classification for the `lung_n`, `colon_aca`, and `colon_n` classes (1.000 F1-score).
- Performance for `lung_aca` (F1: 0.9894) and `lung_scc` (F1: 0.9896) was also very high.
- Specificity was excellent across all classes (ranging from 0.9958 to 1.0000).
- The confusion matrix showed minimal errors: the primary confusion was between `lung_aca` and `lung_scc` (17 `lung_aca` images predicted as `lung_scc`, 4 `lung_scc` images predicted as `lung_aca`).
These results indicate that the trained model generalizes very well to unseen data, achieving high accuracy and discriminative power across all five histopathological categories.
## Project Structure

- `config.py`: Key settings and hyperparameters.
- `dataset.py`: Handles loading and transforming the data.
- `model.py`: Defines the CNN architecture.
- `train.py`: Runs the model training loop.
- `test.py`: Evaluates the model on the test set.
- `predict.py`: Predicts classes for given image paths.
- `interface.py`: Standardizes names for grading compatibility.
- `_checkpoints/`: Stores saved model weights.
- `data/`: Contains all 5 class folders, each with 10 images of that class; used when running `predict.py`.
- `Dataset/`: Where the full dataset is placed.
- `requirements.txt`: List of Python packages used.
- `README.md`: This file.