Development of a Dual Magnification Deep Learning Model for Accurate Classification of Breast Cancer from Histopathological Images

Authors

  • Abdulkadiri Mohammed Jameel, Department of Computer Engineering, Edo State University Iyamho, Edo State, Nigeria
  • Chukwuemeka Chijioke Obasi, Department of Computer Engineering, Edo State University Iyamho, Edo State, Nigeria
  • Daniel Aliu, Department of Computer Engineering, Edo State University Iyamho, Edo State, Nigeria

Keywords

Breast cancer, Deep learning, Histopathology, Dual magnification, EfficientNetV2-S, StyleGAN2-ADA, Class imbalance

Abstract

Breast cancer diagnosis from histopathological images requires accurate and generalizable computational models. In this study, a dual-magnification deep learning approach was developed using paired 40× and 400× image patches from the BreakHis dataset. The model employed a dual-branch EfficientNetV2-S architecture to capture both macro- and micro-level features. Class imbalance, a key limitation of BreakHis, was mitigated through synthetic augmentation with StyleGAN2-ADA, which generated realistic minority-class patches after stain normalization and patch extraction. The dataset comprised 41 training pairs, 9 validation pairs, and 9 test pairs; the model was trained for 100 epochs with a batch size of 8.
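The class-balancing idea can be illustrated with a minimal sketch. It assumes that balancing means topping up every class to the size of the largest one, which the abstract does not specify, and the per-class counts are hypothetical placeholders rather than figures from the study. The function `synthetic_quota` is an illustrative name; it only computes how many synthetic patches a generator such as StyleGAN2-ADA would need to supply per class.

```python
def synthetic_quota(class_counts):
    """Return how many synthetic samples each class needs to match the largest class."""
    target = max(class_counts.values())
    return {label: target - n for label, n in class_counts.items()}


# Hypothetical counts for illustration only (not taken from the study).
example_counts = {"benign": 10, "malignant": 25}
quota = synthetic_quota(example_counts)
print(quota)
```

Under this assumption only the minority class receives synthetic samples, while the majority class is left unchanged.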

Without augmentation, validation accuracy plateaued at 77.78% despite perfect training accuracy, indicating overfitting. With GAN-based balancing, validation accuracy improved to 96.5%, with an average F1-score of 0.95. For binary classification (benign vs. malignant, n = 76), the model achieved 97.3% accuracy, an AUROC of 0.982, and an AUPRC of 0.978. In an extended evaluation on four histopathological subtypes (n = 109 samples), performance reached 92.5% accuracy with a macro-F1 of 0.924, supported by one-vs-rest AUROC (≈0.95) and AUPRC (0.92–0.95) curves. These results demonstrate the potential of combining dual-scale feature extraction with GAN-based class balancing to enhance breast cancer histopathology classification. The study acknowledges the limitations of the dataset and emphasizes the need for future validation on larger multi-institutional cohorts.
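For readers less familiar with the reported metrics, the following sketch shows how accuracy and macro-F1 are computed from a confusion matrix. The matrix `toy_cm` is a hypothetical example, not the study's data; it was chosen so that 7 of 9 samples are correct, which gives 77.78%, the same value as the unaugmented validation accuracy quoted above. The actual confusion matrices are not given in the abstract.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def macro_f1(cm):
    """Unweighted mean of per-class F1 scores (rows = true class, columns = predicted)."""
    k = len(cm)
    scores = []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(k)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true c, predicted otherwise
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / k


# Hypothetical 2-class confusion matrix with 9 samples, 7 of them correct.
toy_cm = [[4, 1],
          [1, 3]]
print(f"accuracy = {accuracy(toy_cm):.4f}, macro-F1 = {macro_f1(toy_cm):.4f}")
```

Macro-F1 weights every class equally regardless of its size, which is why it is commonly reported alongside accuracy on imbalanced datasets.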

Published

2025-11-25