The Journal of Engineering Research (TJER), Vol. 19, No. 1, (2022) 41-53 
 

*Corresponding author’s e-mail: s121293@student.squ.edu.om 
   
   
  DOI:10.53540/tjer.vol19iss1pp41-53 

 
RECENT CNN-BASED TECHNIQUES FOR BREAST CANCER HISTOLOGY IMAGE CLASSIFICATION

ArunaDevi Karuppasamy1,*, Abdelhamid Abdesselam1, Rachid Hedjam1, 
 Hamza Zidoum 1, and Maiya Al-Bahri2    

1 Department of Computer Science, Sultan Qaboos University,  
2 Department of Pathology, Sultan Qaboos University Hospital,  

Muscat, Sultanate of Oman 
 
 
ABSTRACT: Histology images are extensively used by pathologists to assess abnormalities and detect
malignancy in breast tissues. Convolutional Neural Networks (CNNs), meanwhile, are by far the preferred
models for image classification and interpretation. Based on these two facts, we surveyed recent CNN-based
methods for breast cancer histology image analysis. The survey focuses on two major issues usually faced by
CNN-based methods, namely the design of an appropriate CNN architecture and the lack of a sufficiently large
labelled dataset for training the model. Regarding the design of the CNN architecture, methods examining breast
histology images adopt three main approaches: designing the CNN architecture manually from scratch, using
pre-trained models, and adopting an automatic architecture design. Methods addressing the lack of labelled
datasets are grouped into four categories: methods using pre-trained models, methods using data augmentation,
methods adopting weakly supervised learning, and methods adopting feedforward filter learning. Research works
from each category and their reported performance are presented in this paper. We conclude the paper by
indicating some future research directions related to the analysis of histology images.
 

Keywords: Breast cancer; CNN; Deep learning; Histology image Classification; Machine learning. 
 

Recent CNN-Based Techniques for Breast Cancer Image Classification

ArunaDevi Karuppasamy*, Abdelhamid Abdesselam, Rachid Hedjam, Hamza Zidoum, and Maiya Al-Bahri

Abstract (translated from the Arabic): Pathologists use histology images extensively to assess abnormalities and
detect malignant tumours in breast tissue. Convolutional neural networks, on the other hand, are by far the
preferred models for classifying and interpreting images. Based on these two facts, we surveyed recent CNN-based
methods for analysing breast cancer histology images. The survey focuses on two main problems usually faced by
CNN-based methods: designing a suitable CNN architecture and the lack of a sufficient labelled dataset for
training the model. Regarding CNN architecture design, methods examining breast histology images adopt three
main approaches: manual design of the CNN architecture, use of pre-trained models, and automatic architecture
design. Methods addressing the shortage of labelled datasets are grouped into four categories: methods using
pre-trained models, methods using data augmentation, methods adopting weakly supervised learning, and methods
adopting feedforward filter learning. In this paper, we present research works from each category and their
reported performance. We conclude the paper by pointing to some future research directions related to the
analysis of histology images.

Keywords: Breast cancer; Convolutional neural networks; Histology image classification; Machine learning; Deep learning.

 
1. INTRODUCTION 
 
Breast cancer is the most common cancer among
women in developed countries (IARC, 2020)
(Intercollegiate and Network, 2005) (Sickles, 1997).
Currently, pathologists examine histology images to
interpret the tissue appearance and detect abnormal
conditions. These images usually have a high
resolution, different magnification scales, various
acquisition processes, and different staining methods,
which set them apart from other medical images (see
Figure 1). Besides, the result of a histology image
analysis may vary between pathologists, and
sometimes between analyses done by the same
pathologist (Gurcan et al., 2009, Germain, 2017).
This inter-observer and intra-observer variability
introduces uncertainty into the decision-making
process. Hence, there is a need to develop effective
and efficient computer-aided techniques to analyze and
interpret histology images.

Machine Learning (ML) is a discipline of Artificial 
Intelligence that has proven to be very efficient in 
solving classification problems. It allows computers to 
learn from data without being explicitly programmed. 
Supervised learning is an ML category in which a 
model is trained on a set of inputs (features) with 
known outcomes (labels). Once the training is 
completed, the defined model will be capable of 
making predictions when fed with new unseen data. 
Traditionally, features are explicitly selected and 
extracted by the user (see Figure 2).  

Extracted features are mostly related to intensity,
morphology, and texture (Boucheron, 2008)
(Dundar et al., 2011) (Liu et al., 2011). Examples of
these features include: (i) Gabor-wavelet filters and
density measures features (Marugame et al., 2009), (ii)
Gaussian Markov random field and fractal dimension
features (Al-Kadi, 2010), (iii) intensity, morphological,
co-occurrence, and run-length features (Irshad, 2013),
(iv) Haar-like features and Gaussian filters features
(Vink et al., 2013), and (v) local binary patterns,
morphometric features, entropy features and gray-level
co-occurrence matrix (Tashk et al., 2015) (Bruno
et al., 2016) (Peikari et al., 2017).

With the advent of Deep Learning (DL), image
classification and object recognition achieved
unprecedented levels of accuracy (Lecun et al., 2015).
Similarly, its application to healthcare, especially for
analyzing medical images, showed excellent
performance in solving various problems such as
segmentation, interpretation and registration (Kim M et
al., 2019). The success of DL is mainly due to the
ability of modern computers to process huge amounts
of data and to extract features automatically at different
abstraction levels. The Convolutional Neural Network
(CNN) is a DL technique specially adapted to process
images and videos. The first CNN, LeNet (Y. LeCun
et al., 1998), was proposed in 1998 to classify
handwritten digits, but the real surge in CNN
popularity started with AlexNet (Krizhevsky et al.,
2012) when it won the ImageNet challenge in 2012.
Since then, several deeper and better-performing CNNs
have been developed: VGG16 (K. Simonyan and A.
Zisserman, 2014), GoogleNet (C. Szegedy et al., 2015
and 2016), and ResNet (He et al., 2016).
 

Figure 1. Benign and malignant samples from the BreaKHis dataset at various magnifications (F. Spanhol, 2016).

Figure 2. The traditional machine learning process for breast cancer histology image classification. 


 
Figure 3.  The number of papers addressing the problem of 
breast cancer histology image classification 
published between 2013 and 2021.  

CNN-based methods, including those addressing
histopathology image classification, usually face two
major challenges: (1) there is no systematic approach
for designing an appropriate CNN architecture, and (2)
there is a need for a large-scale training dataset, which
is unfortunately not available for many real-world
problems, including those related to the medical field.

This review aims at exploring how these two major 
challenges have been addressed in the literature, 
especially by those research works using breast cancer 
histopathological images. To the best of our 
knowledge, this is the first article conducting such a 
study. 

The selection of the reviewed papers is based on the
search keywords “Deep Learning, Histopathology and
Breast cancer” in Springer, ScienceDirect, and IEEE
Xplore for the period 2013-2021. The total
number of publications on breast cancer
histopathology classification gradually increased
during that period (see Figure 3). The search returned
987 research papers; we removed 136 duplicates
and report only the papers related to the
classification of breast cancer using CNNs.

The remainder of the paper is organized as follows:
Section 2 describes the main components of a CNN
model and introduces the datasets used by the methods
included in this survey. Section 3 describes the
proposed models for breast cancer histology image
classification and categorizes them based on the way
their architecture is designed. Section 4 describes
methods proposed to address the problem of the lack of
labelled datasets. The performance of all surveyed
works, in terms of accuracy and training time, is
reported. The paper concludes by summarizing the main
characteristics of the proposed approaches and
presenting some future research directions related to
the analysis of histology images.

 
2. CONVOLUTIONAL NEURAL NETWORK FOR HISTOLOGY IMAGE CLASSIFICATION

 
CNNs are capable of automatically learning features
from raw input data. They consist of several feature
extraction layers, where the first layers extract basic
features such as edges and blobs and deeper layers
extract more complex and abstract features. A CNN
includes three main building blocks: (i) convolutional
and pooling layers, (ii) fully-connected layers, and (iii)
a classification layer (see Figure 4). The convolution
operation is the most expensive computation of a CNN;
it extracts information by convolving the input data
with a set of filters. A feature map is obtained by
applying an activation function to the output of the
convolution operation. This ensures non-linearity and,
for activation functions such as ReLU, enforces a
sparse representation of the feature map.
Early models used Sigmoid and hyperbolic tangent 
functions as activation functions (LeCun et al., 2012). 
Recent works adopted other activation functions such 
as ReLU (Krizhevsky et al., 2012), Leaky 
ReLU(LReLU) (Maas et al., 2013), Parametric-
ReLU(PReLU) (He et al., 2015), Randomised-ReLU 
(RReLU) (Xu et al., 2015), S-shaped ReLU(S-ReLU) 
(Jin et al., 2016), Maxout (Goodfellow et al., 2013), 
and  Exponential Linear Unit (ELU) (Trottier et al., 
2017). A pooling operation usually follows 
convolutional and activation operations to reduce the 
size of the feature maps. There are different pooling 
types in CNN (Maas et al., 2013) such as max-pooling, 
mean-pooling, stochastic pooling, spatial pyramid 
pooling, and deformation pooling. Max pooling 
(Boureau, Bach, et al., 2010)(Boureau, Ponce, et al., 
2010)  is the most used method; it consists of replacing 
a pooling window by the maximum value in the 
window. Then comes a set of fully-connected layers,
where each neuron in a layer is connected to all neurons
of the following layer. The last feature map is
passed to the first fully-connected layer in the form of
a 1-D vector (Kröse et al., 1993) to produce an output
more appropriate to the classification task, which is
performed by applying a function such as Softmax
for multi-class classification or Sigmoid for binary
classification. Alternatively, a traditional classification
model such as Logistic Regression (LR), Support
Vector Machines (SVM) or Random Forests (RF) can
perform the classification.
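
The pipeline described above (convolution, activation, pooling, flattening, fully-connected layer, Softmax) can be sketched in a few lines of plain Python. This is only an illustrative toy, not the implementation of any surveyed method: the 6x6 input, the edge filter, and the untrained weights are all made-up values, and a single channel and a single filter are assumed.

```python
import math

def conv2d(image, kernel):
    """'Valid' 2-D convolution as used in CNNs (i.e. cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def relu(fmap):
    """Activation: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: each window is replaced by its maximum."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def flatten(fmap):
    """Turn the last feature map into a 1-D vector."""
    return [v for row in fmap for v in row]

def dense(vector, weights, biases):
    """Fully-connected layer: every input is connected to every output."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy input: a 6x6 image containing a vertical edge.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hypothetical 3x3 filter that responds to vertical edges.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

feature_map = max_pool(relu(conv2d(image, kernel)))  # 6x6 -> 4x4 -> 2x2
features = flatten(feature_map)                       # vector of length 4
# Untrained, made-up weights for a two-class output layer.
weights = [[0.5, 0.5, 0.5, 0.5], [-0.5, -0.5, -0.5, -0.5]]
probs = softmax(dense(features, weights, [0.0, 0.0]))
```

In a real CNN the filter and the fully-connected weights are learned by backpropagation rather than fixed by hand, and each layer holds many filters operating on multi-channel inputs.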

 
Figure 4. Convolutional neural network for breast cancer histology image classification.


Table 1. Datasets and links for breast cancer histology classification.

| Dataset | Link |
|---|---|
| ICPR-2012 | http://ludo17.free.fr/mitos_2012/ |
| TUPAC2016 | http://tupac.tue-image.nl/ |
| BreakHis 2016 | https://web.inf.ufpr.br/vri/databases/breastcancer-histopathological- |
| Camelyon16 | https://camelyon16.grand-challenge.org/ |
| Patchcamelyon | https://patchcamelyon.grand-challenge.org/Download/ |
| BACH-ICIAR18 | https://iciar2018-challenge.grand-challenge |
| Kaggle data repository | https://www.kaggle.com/c/histopathologic-cancer-detection |

 
Most of the models discussed in this paper are
trained and tested on breast cancer datasets publicly
available on the Internet. Their links are shown in
Table 1.

Three main performance metrics are used by the
methods surveyed in this paper: classification
accuracy, F-score, and Area Under the Curve
(AUC), where

1. \[ \text{Accuracy} = \frac{\text{number of correct classifications}}{\text{total number of tested images}} \tag{1} \]

2. \[ \text{F-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \tag{2} \]

where precision is the fraction of correctly classified
images among those predicted as positive, and recall is
the fraction of correctly classified images out of all
positive images in the test dataset.

3. Area Under the Curve (AUC) indicates how well
the positive classes are separated from the
negative classes.                                                 (3)

F-score and AUC are more appropriate for
evaluating the classification performance on datasets
with imbalanced classes.
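
The three metrics can be illustrated with a small sketch on made-up data. The labels and scores below are hypothetical. The AUC is assumed here to be the area under the ROC curve, computed as the fraction of (positive, negative) pairs in which the positive sample receives the higher score. The 0.5 decision threshold is also an assumption.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    """Equation (1): correct classifications over all tested images."""
    tp, _, _, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def f_score(y_true, y_pred):
    """Equation (2): harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def auc(y_true, scores):
    """ROC AUC as a pairwise ranking statistic (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical imbalanced test set: 2 positive and 8 negative images.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

acc = accuracy(y_true, y_pred)
f1 = f_score(y_true, y_pred)
area = auc(y_true, scores)

# A trivial classifier that always predicts "negative".
always_negative = [0] * len(y_true)
baseline_acc = accuracy(y_true, always_negative)
baseline_f1 = f_score(y_true, always_negative)
```

On this toy set, the always-negative classifier reaches the same accuracy as the thresholded model while its F-score drops to zero, which is the behaviour that makes F-score and AUC preferable on imbalanced data.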
 
3.  CNN ARCHITECTURE DESIGN  
 
Based on the way architectures are designed, we can 
categorize CNN-based methods for classifying cancer 
histology images into three categories: (i) Manually-
designed CNN architectures, (ii) Pre-trained CNN 
architectures, and (iii) Automatically-designed CNN 
architectures.  
 
• Manually-Designed CNN Architectures 

Most CNN-based histology image classification
methods belong to this category, as shown in Table 2.
They use relatively shallow architectures (a small
number of layers) and backpropagation algorithms for
filter learning. They are mostly trained on datasets of
limited size for a limited number of epochs and achieve
moderate to good levels of performance.

(Cireşan et al., 2013) proposed a 6-layer CNN
(four convolutional and pooling layers and two
fully-connected layers) to detect mitosis in a dataset of 50
images. It was trained for one day with fewer than 30
epochs using an optimized GPU implementation. This
method won the ICPR2012 competition by achieving
an F1-score of 0.78. (Cruz-Roa et al., 2014) designed a
3-layer CNN to automatically detect invasive ductal 
carcinoma from whole-slide images (WSI) of breast 
cancer. The dataset consists of 162 WSIs from which 
about 165000 patches of size 100x100 were extracted 
and used for model training and testing. The proposed 
CNN was trained with 25 epochs and achieved an F1-
score of 0.71. (Wang et al., 2014) proposed a cascade 
ensemble that combines hand-crafted features 
(morphology, colour, and texture features) with 
features extracted from a 3-layer CNN. These two 
feature sets were passed to a random forest classifier to 
detect mitotic nuclei on the ICPR dataset consisting of 
50 High Power Field images (scanned into 2084x2084 
pixels RGB images). The model was trained for 9
epochs, which required nearly 18 hours without a GPU
implementation. They reported an F1-score of 0.73.
(Litjens et al., 2016) proposed a patch-based CNN
model to classify prostate cancer and to detect breast
cancer micro-metastases and macro-metastases in
sentinel lymph nodes. Their dataset consists of 271
WSIs from which 2.6 million patches of size 128x128
were obtained through data augmentation. For
metastases detection, the CNN was trained for 12
epochs, each epoch requiring 200 minutes on a GeForce
GTX970, and obtained an AUC of 0.88. (Araújo et al.,
2017) proposed an 8-layer CNN model that classifies 
histology images into four categories: i) normal tissue, 
ii) benign lesion, iii) in situ carcinoma, and iv) invasive 
carcinoma. Their dataset consists of 269 high 
resolution images (2049x1536 pixels) obtained from 
the Bio-imaging 2015 breast histology classification 
challenge dataset from which 70000 (512x512) patches 
were extracted. The CNN model was trained for 50
epochs and achieved a validation accuracy of 80.6%
for binary classification (non-carcinoma and
carcinoma). The accuracy improved to 83.3% when
using the CNN as a feature extractor and an SVM
as a classifier. (Bayramoglu et al., 2017) proposed
two 6-layer CNN architectures for breast cancer
histology image classification that are independent
of the magnification level. A single-task
CNN (3 convolutional layers followed by 3
fully-connected layers) was used to predict malignancy, and
a multi-task CNN (3 convolutional layers followed by
two sequential fully-connected layers) was used to predict
both malignancy and image magnification level
simultaneously. The CNNs were trained with images
from different magnification levels (40x, 100x, 200x,
and 400x) of the BreaKHis dataset.

 

Table 2. Overview of manually-designed CNN architectures used in breast cancer histology image classification.

The authors reported an average recognition accuracy
of 83.25% for the single-task CNN and, for the
multi-task CNN, averages of 82.13% and 80.10% for
benign or malignant classification and
magnification detection, respectively. (Wahab et al.,
2017) proposed a two-phase CNN to handle
imbalanced histology data. In the first phase, the CNN
was trained for 5 epochs with 80x80 pixel non-mitosis
patches to classify them into three imbalanced classes:
easy, normal and hard. The resulting classes
are then balanced and combined with augmented
mitosis patches to retrain the CNN for
25 additional epochs (15 hours on a 3.4 GHz i7-3770
CPU) to classify the patches into mitotic and
non-mitotic nuclei. The patches were extracted from two
datasets: 50 high-power fields (HPFs) from the
ICPR2012 contest dataset and 73 HPFs from the
TUPAC16 dataset. The authors reported an F1-score
of 0.79. (Cruz-Roa et al., 2018) proposed a framework
called HASHI for invasive breast cancer detection in 
WSI. The training data is obtained by applying a 
regular sampling on 600 labelled WSIs. The patches 
resulting from a pseudorandom sampling of the new 
unseen WSIs are passed to a 2-layer CNN classifier; 
resulting predictions are used to build an interpolated 
probability map.  Dense sampling is further applied to 
regions with high uncertainty, which produces an 
improved probability map estimation. The authors
reported an AUC of 0.90. (Roy et al., 2019) proposed
an 8-layer CNN model for binary (non-malignant and
malignant) and multi-class (normal, benign, in-situ and
invasive) classification on the ICIAR-2018 dataset,
which consists of 500 images (400 for training and 100
for testing) of size 2048x1536. The model was trained
for about 70 epochs on a CPU-based system with a
Xeon processor and 128 GB of RAM, yielding accuracies
of 92.5% and 90% for binary and multi-class
classification, respectively.
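
Several of the methods above train on patches cut from much larger images. A minimal sliding-window sketch shows how the number of patches depends on the patch size and the stride. The image size (2048x1536) and patch size (512x512) are taken from the figures quoted in this section. The strides are assumptions chosen for illustration and are not taken from any of the cited works.

```python
def patch_origins(width, height, patch, stride):
    """Top-left corners of all patches that fit entirely inside the image."""
    xs = range(0, width - patch + 1, stride)
    ys = range(0, height - patch + 1, stride)
    return [(x, y) for y in ys for x in xs]

def extract_patches(image, patch, stride):
    """Cut a 2-D image (list of rows) into square patches."""
    height, width = len(image), len(image[0])
    return [[row[x:x + patch] for row in image[y:y + patch]]
            for x, y in patch_origins(width, height, patch, stride)]

# Patch counts for one 2048x1536 image and 512x512 patches.
non_overlapping = len(patch_origins(2048, 1536, 512, 512))  # stride = patch size
half_overlap = len(patch_origins(2048, 1536, 512, 256))     # assumed 50% overlap

# Tiny worked example: a 4x4 "image" cut into 2x2 patches.
tiny = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]
tiny_patches = extract_patches(tiny, patch=2, stride=2)
```

Overlapping windows multiply the amount of training data drawn from each image, which is one reason the patch counts reported in the literature are far larger than the number of source images.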
 
• Pre-trained CNN Architectures  

This category of methods implements transfer learning,
where features learned to solve a problem from one
domain constitute the baseline for the features used to
solve a problem from a different but related domain
(Mehra, 2018). These techniques take an existing CNN
architecture, usually trained on a large dataset, and
re-train it on a smaller dataset related to the problem under
investigation. During re-training, typically only the
weights of the last few layers are updated using the
backpropagation approach for filter learning. These
CNNs are usually deep (more than seven layers) and
initially trained for a long time with a large number
of epochs, which makes them perform very well on
the original problem. Retraining usually takes much
less time (fewer epochs) on much smaller datasets,
and the resulting models

  
| Reference | CNN Architecture | Dataset | Performance |
|---|---|---|---|
| (Cireşan et al., 2013) | 6-layer CNN (4 convolutional layers and 2 fully-connected layers) | ICPR2012 (50 images) | F1-score: 0.78 |
| (Cruz-Roa et al., 2014) | 3-layer CNN (2 convolutional layers and 1 fully-connected layer) | Hospital of the University of Pennsylvania and The Cancer Institute of New Jersey dataset (162 images) | F-score: 0.71; Acc: 84.23% |
| (Wang et al., 2014) | 3-layer CNN (2 convolutional layers and 1 fully-connected layer) | ICPR dataset (50 images) | F-score: 0.73 |
| (Litjens et al., 2016) | 6-layer CNN (4 convolutional layers and 2 fully-connected layers) | Radboud University Medical Center dataset (173 images) | AUC: 0.88 |
| (Araújo et al., 2017) | 8-layer CNN (5 convolutional layers and 3 fully-connected layers) | Bioimaging 2015 breast histology classification challenge dataset (269 images) | Acc: 77.8% (multi-class), 83.3% (binary) |
| (Bayramoglu et al., 2017) | Two 6-layer CNNs | BreakHis dataset (7909 images) | Average recognition rate: 83.25% |
| (Wahab et al., 2017) | Two-phase CNN model | ICPR2012 dataset (50 images) and TUPAC16 dataset (73 images) | F-score: 0.79 |
| (Cruz-Roa et al., 2018) | 2-layer CNN | Hospital of the Univ. of Pennsylvania (239 images), Case Western Reserve Univ. (110 images), Cancer Institute of New Jersey (52 images), and The Cancer Genome Atlas (195 images) | Acc: 0.90; Dice coefficient: 76% |
| (Roy et al., 2019) | Patch-based classifier with CNN (6 convolutional layers and 2 fully-connected layers) | ICIAR-2018 dataset (500 images) | Acc: 92.5% (binary), 90% (multi-class) |



Table 3. Overview of pre-trained CNN architectures used in breast cancer histology image classification. 

| Reference | CNN Architecture | Dataset | Performance |
|---|---|---|---|
| (Fabio Alexandre Spanhol et al., 2016) | AlexNet | BreakHis dataset (7909 images) | Acc: 85.6 (40x), 83.0 (100x), 83.1 (200x), 80.8 (400x) |
| (Fabio A Spanhol et al., 2017) | BVLC CaffeNet model (pre-trained AlexNet as a CNN) along with DeCaf features | BreakHis dataset (7909 images) | Acc: 84.3 (40x), 84.7 (100x), 84.1 (200x), 81.6 (400x) |
| (Song et al., 2017) | Component Selective Encoding CNN (combination of Fisher vector and VggNet19) | BreakHis dataset (7909 images) | Acc: 87.5 (40x), 88.6 (100x), 85.5 (200x), 85.0 (400x) |
| (Bejnordi et al., 2017) | CAS-CNN (a combination of Wide-ResNet and VggNet) | Radboud University Medical Centre dataset (224 images) | AUC: 0.96 (binary); Acc: 89% (binary), 81.3% (multi-class) |
| (Han et al., 2017) | Class Structure-based Deep Convolutional Neural Network (GoogleNet) | BreakHis dataset (7909 images) | Acc: 95.8 (40x), 96.9 (100x), 96.7 (200x), 94.9 (400x) |
| (Vang et al., 2018) | InceptionV3 and Dual-path network | ICIAR-2018 dataset (400 images) | Acc: 87.5% |
| (Motlagh et al., 2018) | ResNet152 | TMA database (6,402) and BreakHis dataset (7909 images) | TMA Acc: 99.8%; BreakHis Acc: 98.7% (binary), 94.6% (multi-class) |
| (Vesal et al., 2018) | ResNet50 | BACH-2018 dataset (320 images) | Acc: 97.50% |
| (Kausar et al., 2019) | VggNet16 | ICIAR 2018 (400 images) and BreaKHis dataset (7909 images) | Acc: 98.2% (ICIAR2018), 96.85% (40x, BreaKHis) |
| Budak et al. (2019) | FCN based on AlexNet and Bi-LSTM (Bidirectional Long Short-Term Memory) | BreaKHis dataset (7909 images) | Acc: 95.69 (40x), 93.61 (100x), 96.32 (200x), 94.29 (400x) |
| Alom et al. (2019) | Inception Recurrent Residual CNN (IRRCNN) | Bioimaging Challenge 2015 | Acc: 99.05% (binary), 98.59% (multi-class) |
| Vo et al. (2019) | Inception and gradient boosting trees | BreaKHis dataset (7909 images) | Acc: 93.5 (40x), 95.3 (100x), 96.1 (200x), 91.1 (400x) |
| (Anwar et al., 2020) | ResNet50, HoG, WPD | BreakHis dataset (7909 images) | Acc: 97.10 (40x), 97.56 (100x), 96.41 (200x), 94.32 (400x) |
| (Chakraborty et al., 2020) | ResNet50 | Kaggle data repository (220,025 image patches) | Acc: 96.48%; F1-score: 96.32% |
| Laxmisagar and Hanumantharaju (2021) | MobileNet2.10ex | Bioimaging Challenge 2015 | Acc: 88.92% |
| Munien and Viriri (2021) | EfficientNet | ICIAR2018 dataset | Acc: 98.33% |



are usually of high performance. The most re-used 
CNN architectures are AlexNet (Krizhevsky et al., 
2012), VggNet (Simonyan and Zisserman, 2014), 
GoogLeNet (Szegedy et al., 2015), and ResNet   (He et 
al., 2016) (See Table 3). 
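
The idea of re-using frozen layers and re-training only the last layer can be sketched with a deliberately tiny example. Everything here is hypothetical: `frozen_features` merely stands in for the early layers of a pre-trained CNN, with made-up fixed weights, and the four training samples are toy data. Only the final logistic layer is updated by gradient descent.

```python
import math

def frozen_features(x):
    """Stand-in for frozen pre-trained layers; these weights are never updated."""
    fixed_weights = [[1.0, -1.0], [0.5, 0.5]]  # hypothetical "pre-trained" values
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in fixed_weights]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(samples, labels, epochs=200, lr=0.5):
    """Re-train only the last layer (a logistic unit) on top of frozen features."""
    feats = [frozen_features(x) for x in samples]  # computed once, then reused
    w, b = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            grad = p - y  # gradient of the cross-entropy loss w.r.t. the logit
            w = [wi - lr * grad * fi for wi, fi in zip(w, f)]
            b -= lr * grad
    return w, b

def predict(x, w, b):
    f = frozen_features(x)
    return 1 if sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) >= 0.5 else 0

# Toy "target-domain" dataset with four labelled samples.
samples = [(2.0, 0.0), (3.0, 1.0), (0.0, 2.0), (1.0, 3.0)]
labels = [1, 1, 0, 0]

head_w, head_b = train_head(samples, labels)
predictions = [predict(x, head_w, head_b) for x in samples]
```

Because the frozen part is evaluated once and only a handful of parameters are trained, re-training needs far fewer epochs and far less data than training the whole network, which matches the behaviour described above.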

AlexNet is an 8-layer CNN trained on the ILSVRC
dataset (Deng et al., 2009). The dataset consists of
images from 1000 different classes: 1.2 million images are
used for training, 50,000 images for validation and
150,000 images for testing. Training took 6 days for
90 epochs, and the model achieved a winning error rate of
15.3%. A second version of AlexNet, trained on the
CIFAR-10 dataset that contains 60000 images (50000
for training and 10000 for testing) from 10 classes, has
also been used as a pre-trained model (Krizhevsky and
Hinton, 2010) (Krizhevsky, 2009). VggNet is available
in two architectures, with 16 and 19 layers; they were
trained on the ILSVRC14 challenge dataset for 74
epochs and won second place by obtaining an error rate
of 14.7% and 7.3% respectively. GoogLeNet is a
22-layer architecture also trained on the ILSVRC14
challenge dataset and won the competition by
obtaining an error rate of 6.7%. ResNet is a deep
network that is available in different versions, the most 
common ones have 50, 101, and 152 layers 
respectively. The ResNet50 was trained on the 
ImageNet dataset and achieved an error rate of 
5.25%. The ResNet101 model was trained using 80k
iterations for detection and segmentation on the COCO
2015 challenge dataset and won the COCO 2015
competitions by achieving a 28% relative improvement
on object detection. The ResNet152 model was trained
on the ILSVRC15 challenge dataset for 60 × 10^4
iterations and won the ILSVRC15 competition by
achieving an error rate of 3.57%. In the works surveyed
here, classification of histology images with pre-trained
networks generally yields better performance than
manually designed CNNs.

Fabio Alexandre Spanhol et al. (2016) applied the
AlexNet pre-trained on the CIFAR-10 dataset to a
multi-magnification dataset called BreaKHis that was
introduced by (Fabio A Spanhol et al., 2015). A subset
of 1000 patches of size 64x64 was used to train the
model for about 80,000 iterations to obtain image-level
accuracies of 85.6 ± 4.8% (40x), 83.0 ± 3.9% (100x),
83.1 ± 1.9% (200x), and 80.8 ± 3% (400x). (Fabio A
Spanhol et al., 2017) applied a modified version of
AlexNet (the order of the pooling and normalization
layers was exchanged) pre-trained on the ImageNet dataset.
On the BreaKHis dataset, their model achieved
image-level accuracies of 84.3 ± 2.9% (40x), 84.7 ± 4.4%
(100x), 84.1 ± 1.5% (200x), and 81.6 ± 3.7% (400x).
(Song et al., 2017) proposed a model that extracts
features from the final convolutional layer of
VggNet19 and encodes them as a Fisher vector to feed
an SVM classifier. The proposed model was evaluated on the
BreaKHis dataset and obtained binary classification
accuracies of 87.5 ± 1.6% (40x), 88.6 ± 3.6% (100x),
85.5 ± 2.0% (200x), and 85.0 ± 4.6% (400x). (Bejnordi
et al., 2017) proposed a cascaded CNN (CAS-CNN) model,
which is a combination of a modified version of ResNet called
Wide-ResNet (Zagoruyko and Komodakis, 2016) and
VggNet. The proposed CAS-CNN model was trained
and tested on 224 WSIs (100 normal/benign, 69 DCIS,
and 55 IDC WSIs) from Radboud University Medical
Center, the Netherlands. An accuracy of 89% was obtained
for binary classification (benign and cancer) and 81%
for multi-class classification (benign, Ductal Carcinoma
In Situ, and Invasive Ductal Carcinoma). (Han et al., 2017)
proposed a Class Structure-based Deep CNN
(CSDCNN), which is based on a pre-trained
GoogLeNet, for binary and multi-class classification.
Data augmentation was applied to the BreaKHis
dataset to increase the number of training images. Moreover,
over-sampling based on a Gaussian distribution was
applied to deal with the class imbalance. The
training took 10 hours on the augmented dataset on an
Intel i7 machine with an NVIDIA Quadro K22200 GPU and achieved
image-level accuracies of 95.8 ± 3.1% (40x), 96.9 ± 1.9%
(100x), 96.7 ± 2.0% (200x), and 94.9 ± 2.8% (400x).
(Vang et al., 2018) used the pre-trained GoogLeNet
model on the ICIAR-2018 challenge dataset, which
contains 400 microscopy images of size 2048x1536
pixels. The training was conducted on 4 GPUs (2
NVIDIA TitanX GPUs and 2 NVIDIA GTX 1080Ti)
for 30 epochs to achieve an accuracy of 87.5% for
multi-class classification (normal, benign, in-situ carcinoma
and invasive carcinoma). (Motlagh et al., 2018) used
the pre-trained ResNet152 to extract hierarchical 
features on TMA (Tissue Micro Array) and BreaKHis 
datasets. The model was trained on an ASUS GeForce
GTX1080 for 3000 epochs. It achieved accuracies of
98.7% and 96.4% for binary and multi-class classification,
respectively. (Vesal et al., 2018) applied ResNet50 to
the BACH-2018 challenge dataset, which consists of
400 microscopy images of size 2048x1536 pixels. From the 320
training images, 67,200 patches of size 512x512 were
extracted for training. The model was trained for about
100 epochs and achieved an accuracy of 97.5%.
(Kausar et al., 2019) proposed a modified version of
VggNet16 by replacing the last three fully-connected
layers with a global pooling layer. The input of the
model consists of Haar wavelet-decomposed images.
The ICIAR2018 and BreaKHis datasets were used to
train the model for about 150 epochs on an NVIDIA Tesla
M40 GPU, achieving accuracies of 98.2% and 96.85%
on the two datasets, respectively. (Budak et al., 2019)
used an integrated model combining AlexNet and
Bi-LSTM (Bidirectional Long Short-Term Memory) to
classify the BreaKHis dataset. The authors reported that
the pre-trained AlexNet was re-trained for 20 epochs on
an NVIDIA Quadro P6000 GPU and achieved
accuracies of 95.69% (40x), 93.61% (100x),
96.31% (200x), and 94.29% (400x). (Alom et al., 2019)
proposed a model that combines Inception-V4, ResNet
and Recurrent CNN. The model was re-trained for 150
epochs on the BreaKHis dataset using a machine with
56 GB of RAM and an NVIDIA GeForce GTX-980 Ti GPU.
It achieved accuracies of 97.95% (40x), 97.57% (100x),
97.32% (200x), and 97.36% (400x). (Vo et al., 2019) proposed



a model based on Inception-v3 and ResNet-152. It was
trained on the BreaKHis dataset for 50 epochs on a
GeForce GTX 1080 Ti and yielded an accuracy of 93.5%
(40x), 95.3% (100x), 96.1% (200x), and 91.1% (400x).
(Anwar et al., 2020) proposed a model that combines
features produced by ResNet50, wavelet packet
decomposition, and histograms of gradients. PCA is
then applied to reduce the feature dimensionality.
Data augmentation was applied to the BreaKHis
dataset to obtain 237,270 training patches. The authors
reported accuracies of 97.10% (40x), 97.56% (100x),
96.41% (200x), and 94.32% (400x). (Chakraborty et al.,
2020) proposed a Dual Channel Residual CNN (DCRCNN)
which decomposes input images into two separate stain
channels (Eosin and Hematoxylin) (Ruifrok, 2019).
The stain channels are fed into two separate ResNet50
networks; the resulting features are then fused for
classification. The proposed model was applied to the
Kaggle data repository (Kaggle Dataset, 2018), which
contains 220,025 images of size 96x96. The authors
trained the model for 50 epochs and achieved an
accuracy and F1-score of 96.48% and 96.32%,
respectively. Laxmisagar and Hanumantharaju (2021)
proposed a model that incorporates MobileNet2.10e
with a fully convolutional deep neural network on the
Bio-imaging Challenge 2015 dataset. This model was
trained for 280 epochs in Google Colab on a GPU with
2496 CUDA cores, and achieved an accuracy of 88.92%.
(Munien and Viriri, 2021) proposed a model based on
EfficientNet-B2, trained for 50 epochs in Google Colab
on an NVIDIA Tesla K80 GPU, which achieved an accuracy
of 98.3% on the ICIAR2018 dataset.
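
To illustrate the stain decomposition step used by dual-channel models such as DCRCNN, the following minimal Python sketch separates an RGB image into Hematoxylin and Eosin concentration maps by colour deconvolution. It is not the implementation of the surveyed authors: the function name `separate_stains`, the stain vectors (commonly quoted reference values, used here as an assumption) and the third residual channel are illustrative choices.

```python
import numpy as np

# Optical-density vectors of the stains (rows: Hematoxylin, Eosin, residual).
# These numbers are commonly quoted reference values and are an assumption
# of this sketch; real slides usually need stain vectors estimated per dataset.
STAIN_OD = np.array([
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
])
STAIN_OD = STAIN_OD / np.linalg.norm(STAIN_OD, axis=1, keepdims=True)


def separate_stains(rgb):
    """Return per-pixel stain concentrations for an (H, W, 3) RGB image.

    `rgb` holds intensities in [0, 255]. The output has the same shape;
    channel 0 is Hematoxylin, channel 1 is Eosin, channel 2 is the residual.
    """
    rgb = np.clip(np.asarray(rgb, dtype=np.float64), 1.0, 255.0)  # avoid log(0)
    optical_density = -np.log10(rgb / 255.0)  # Beer-Lambert law
    # optical_density = concentrations @ STAIN_OD, so invert the stain matrix.
    return optical_density @ np.linalg.inv(STAIN_OD)
```

In a dual-channel setting, the first two output channels would be passed to the two separate networks, and their features fused afterwards.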
 
• Automatically-Designed CNN Architecture  

Neural Architecture Search (NAS) is a state-of-the-art
approach for designing efficient CNN-based
solutions; it was recently introduced to automate and
optimize the hyper-parameters and the architecture
design of neural networks (Zoph and Le, 2017)
(Wistuba, 2019). NAS methods can be categorized
according to three dimensions (Elsken et al. 2019):
search space, search strategy and performance
evaluation strategy. The search space defines which
architectures a NAS method might discover. The search
strategy defines how the search space is explored to
find an optimized architecture in terms of parameters
such as the number of layers, the number of filters,
the sizes of the filters and pooling windows, and the
strides. The performance evaluation strategy estimates
the performance of a candidate neural network
architecture in terms of predefined criteria such as
accuracy. The reader is referred to (Elsken
et al. 2019) for more details on these strategies.
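
The three NAS dimensions can be made concrete with a deliberately simple sketch. The code below uses random search as the search strategy and a placeholder scoring function as the performance evaluation strategy; the search space values, the function names and the scoring formula are all assumptions made for illustration and do not correspond to any of the surveyed methods.

```python
import random

# Search space: the set of architectures the method is allowed to discover.
# The candidate values below are illustrative assumptions.
SEARCH_SPACE = {
    "num_layers": [2, 4, 6],
    "num_filters": [16, 32, 64],
    "filter_size": [3, 5],
    "stride": [1, 2],
}


def sample_architecture(rng):
    """Search strategy (random search): draw one value per dimension."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def estimate_performance(arch):
    """Performance evaluation strategy (placeholder).

    A real system would train the candidate CNN and return, for example,
    its validation accuracy. This hypothetical score only makes the sketch
    runnable without any training.
    """
    penalty = (
        abs(arch["num_layers"] - 4)
        + abs(arch["num_filters"] - 32) / 16
        + abs(arch["filter_size"] - 3)
    )
    return 1.0 / (1.0 + penalty)


def random_search(num_trials, seed=0):
    """Keep the best-scoring architecture among `num_trials` random samples."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_trials):
        arch = sample_architecture(rng)
        score = estimate_performance(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score
```

More elaborate strategies (reinforcement learning, Bayesian optimization, evolutionary or bio-inspired algorithms) replace `sample_architecture`, while cheaper proxies replace full training in `estimate_performance`.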

(Koohbanani et al., 2018) applied the NAS approach
to design a neural network for breast cancer histology
image classification. In this work, a Bayesian
optimization approach (Zoph and Le, 2017) was applied
to estimate the optimal values of the main
hyper-parameters of the model (filter size, dropout
layer, learning rate, and momentum). The optimized
model was used to detect metastasis on the
CAMELYON16 dataset, which consists of 400 WSI (270
WSI for training and 130 WSI for testing). The authors
reported an AUC of 0.99. (Oyelade and Ezugwu, 2021)
proposed a NAS-based model that uses the Ebola
Optimization Search Algorithm to explore the search
space and optimize the hyper-parameters of deep neural
networks applied to the breast cancer histopathology
images of the BreaKHis and BACH datasets. The
optimized model was trained for 500 epochs on both
datasets and attained an accuracy of 100%.

 
4. LACK OF LABELED HISTOLOGY DATASETS
 
To achieve high prediction performance, a CNN model
requires large labelled training datasets, which are,
unfortunately, not available for many problem domains,
including the medical domain. There are two main
reasons behind the lack of labelled histology image
datasets: 1) Labeling histology images is
time-consuming and requires medical experts, which
makes the task very expensive. 2) Medical data are
usually confidential and therefore not easily available
to the research community. Several approaches have
been adopted to address this problem; the most
common methods used for histology image
classification are patch generation, data augmentation
and pre-trained CNNs. Two other approaches
addressing this problem are weakly supervised
learning and feed-forward filter learning.

•  Commonly Used Approaches Dealing with 
Small Histology Datasets 

Histology images are in general of large size
(Whole-Slide Images, WSI, of about 2040 x 1536
pixels), which allows increasing the number of dataset
samples by generating thousands of smaller labelled
images (patches) from the original WSI dataset (Litjens
et al., 2016) (Araújo et al., 2017) (Wahab et al., 2017)
(Vesal et al., 2018). Data augmentation is another
commonly adopted method for increasing the size of
image datasets. It consists of generating additional
labelled images by applying various image
transformations, such as rotation, translation, zooming,
and cropping, to the original dataset (Fabio Alexandre
Spanhol et al., 2016) (Vang et al., 2018) (Anwar et al.,
2020). Finally, pre-trained models are often adopted
because they can lead to high accuracy even when
re-trained on relatively small datasets (Song et al.,
2017) (Bejnordi et al., 2017).
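
Patch generation and data augmentation can be sketched in a few lines of NumPy. The code below is a minimal illustration under stated assumptions: the function names, the sliding-window scheme and the choice of rotations and flips are ours, not those of a specific surveyed method. Each patch is assumed to inherit the label of the image it was cut from.

```python
import numpy as np


def extract_patches(image, patch_size, stride):
    """Cut an (H, W, C) image into square patches with a sliding window."""
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)


def augment(patch):
    """Return 8 variants of a square patch: 4 rotations, each with its mirror."""
    variants = []
    for quarter_turns in range(4):
        rotated = np.rot90(patch, quarter_turns)  # rotates the two spatial axes
        variants.append(rotated)
        variants.append(np.fliplr(rotated))
    return np.stack(variants)
```

With these settings, a 2040 x 1536 image cut into non-overlapping 512 x 512 patches (stride 512) gives 9 patches, and the eight rotations and flips then give 72 training samples. A smaller stride produces overlapping patches and a correspondingly larger dataset.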

• CNN using Weakly-Supervised Learning 
Weakly supervised learning includes three categories
(Zhou, 2018): 1) incomplete supervision, in which only
a subset of the training data is given with labels; 2)
inexact supervision, in which the training data are
given with only coarse-grained labels; and 3) inaccurate
supervision, in which the given labels are not always
ground truth (Albarqouni et al., 2016).



Table 3. Overview of methods dealing with the lack of large labelled datasets in breast cancer histology classification.

| Reference | Supervision / Filter Learning | Dataset | Performance |
|---|---|---|---|
| (Albarqouni et al., 2016) | Inaccurate supervision | MICCAI-AMIDA13 challenge dataset (311 images) | F1-Score: 0.74; AUC: 0.869 |
| (Das et al., 2018) | Inexact supervision (MIL) | BreaKHis dataset (7909 images) | Acc (%): 89.52 (40x), 89.06 (100x), 88.84 (200x), 87.67 (400x) |
| (Sudharshan et al., 2019) | Inexact supervision (MIL) | BreaKHis dataset (7909 images) | Acc (%): 86.9 (40x), 85.7 (100x), 85.7 (200x), 83.4 (400x) |
| (Jaiswal et al., 2019) | Incomplete supervision | PatchCamelyon (327680 patches of images) | AUC: 0.98 |
| (Shi et al., 2016) | PCA | MGH dataset (66 images) | AUC: 0.88; Acc (%): 81.51 ± 2.91 |
| (Huang et al., 2017) | PCA | Netherlands Cancer Institute (778 images), Vancouver General Hospital (664 images) and public dataset [Stanford Tissue Microarray Consortium Web Portal and the UCSB Bio-Segmentation Benchmark dataset] (45 images) | Acc (%): 85 |


Incomplete supervision is implemented via two
major techniques: active learning, in which a human
expert labels selected unlabeled instances considered
the most valuable, and semi-supervised learning, which
uses labelled data and some basic assumptions about
the data distribution to predict the labels of unlabeled
data. For more details, the reader is referred to (Zhou,
2018). Inexact supervision uses the multiple-instance
learning (MIL) technique introduced by (Dietterich et
al., 1997). In MIL, the CNN inputs consist of multiple
instances of images (called a bag of images), and the
bag labels are obtained using either instance-based or
bag-based methods. For more details, the reader is
referred to (Herrera et al., 2016). In inaccurate
supervision, training data may contain erroneous
labels. A typical scenario is when some of the labels
are obtained through the crowdsourcing technique
introduced by (Brabham, 2008), in which non-expert
users registered on a common platform provide labels
for the unlabeled data. All published works applying
methods grouped under this category use the
backpropagation approach for filter learning.
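
As an illustration of the instance-based route in MIL, the sketch below derives a bag label from per-instance scores. It assumes that each instance (patch) has already been given a malignancy score in [0, 1] by some classifier; the function names, the 0.5 threshold and the two pooling rules are illustrative assumptions rather than the procedure of any surveyed work.

```python
import numpy as np


def bag_score(instance_scores, pooling="max"):
    """Aggregate per-instance scores into one bag-level score."""
    scores = np.asarray(instance_scores, dtype=float)
    if scores.size == 0:
        raise ValueError("a bag must contain at least one instance")
    if pooling == "max":
        # Standard MIL assumption: one positive instance makes the bag positive.
        return float(scores.max())
    if pooling == "mean":
        # Softer rule: the bag follows the average evidence of its instances.
        return float(scores.mean())
    raise ValueError("pooling must be 'max' or 'mean'")


def predict_bag(instance_scores, threshold=0.5, pooling="max"):
    """Return 1 (positive bag) or 0 (negative bag)."""
    return int(bag_score(instance_scores, pooling) >= threshold)
```

For a bag with instance scores 0.1, 0.2 and 0.9, max pooling labels the bag positive, whereas mean pooling (average 0.4) labels it negative, which shows how strongly the pooling rule shapes the bag-level decision.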

Albarqouni et al. (2016) proposed a 5-layer CNN
model (AggNet) for mitosis detection. A
crowdsourcing platform, called Crowdflower, was
developed to annotate the unlabeled images. Users
registered on this platform are able to access
samples of labelled images and participate in the
annotation process. This model was trained for 100
epochs on a GeForce GT 750M GPU on the
MICCAI-AMIDA13 challenge dataset, which consists of
318 HPF images for training and 22 HPF images for
testing, along with 5500 labelled patches from the
Crowdflower platform (550 patches annotated by 10
participants). The authors reported an AUC of 0.87.
(Das et al., 2018) proposed a MIL-CNN with new
multiple instance-pooling layers that extract features
from the final fully connected layer for classification.
A bag with multiple instances is obtained by extracting
regions of interest (ROI) from WSI. Each patient is
assigned a bag containing 25 to 60 randomly chosen
patches of size 224x224 extracted from the BreaKHis
dataset. Their model achieved patient-level accuracies
of 89.52% (40x), 89.6% (100x), 88.4% (200x) and
87.67% (400x). (Sudharshan et al., 2019) applied to the
BreaKHis dataset the MIL-CNN model initially proposed
by (Sun et al. 2016). They considered two different
settings: in the first, each patient is considered as a
bag labelled by its diagnosis, while in the second,
1000 patches are randomly extracted from each image
to constitute a bag that is assigned the image label.
The model was trained for 80 to 120 epochs and
achieved patient-level accuracies of 86.9 ± 5.4% (40x),
85.7 ± 4.8% (100x), 85.9 ± 3.9% (200x), and
83.4 ± 5.3% (400x), and image-level accuracies of
86.1 ± 4.2% (40x), 83.8 ± 3.1% (100x), 80.2 ± 2.6%
(200x), and 80.6 ± 4.6% (400x). (Jaiswal et al., 2019)
applied a modified DenseNet-201 proposed by (G.
Huang and Weinberger, 2018), in which they substituted
the last two fully connected layers with global-max
and global-average pooling layers. The model was
first trained on a small labelled dataset; the trained
model was then applied to a batch of the unlabeled
dataset to obtain a pseudo-labelled dataset, which was
used to explore discriminating features of the
unlabeled dataset. During training, the backpropagation
algorithm optimizes an objective function that
combines the loss from the predictions on the
pseudo-labelled dataset with that from the labelled
dataset. Their model was applied to the PCam
(PatchCamelyon) dataset, which includes labelled and
unlabeled images and consists of 327,680 patches of
size 96x96 pixels. They trained the model for 5 to 7
epochs on an NVIDIA Tesla P100 and reported an AUC
of 0.98.
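
The pseudo-labelling idea described above can be sketched as follows. This is a simplified illustration and not the implementation of (Jaiswal et al., 2019): the confidence threshold `min_confidence`, the weighting factor `weight` and all function names are assumptions introduced here, and the model outputs are represented by plain probability arrays.

```python
import numpy as np


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of integer labels under predicted probs."""
    probs = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    labels = np.asarray(labels, dtype=int)
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels])))


def pseudo_labels(probs, min_confidence=0.9):
    """Assign the most probable class to confidently predicted samples.

    Returns the pseudo-labels of the retained samples and a boolean mask
    marking which unlabeled samples were retained (assumed selection rule).
    """
    probs = np.asarray(probs, dtype=float)
    keep = probs.max(axis=1) >= min_confidence
    return probs.argmax(axis=1)[keep], keep


def combined_loss(labeled_probs, labels, unlabeled_probs, pseudo, keep, weight=0.5):
    """Supervised loss plus a weighted loss on the pseudo-labelled samples."""
    supervised = cross_entropy(labeled_probs, labels)
    if not np.any(keep):
        return supervised
    unsupervised = cross_entropy(np.asarray(unlabeled_probs)[keep], pseudo)
    return supervised + weight * unsupervised
```

In training, the pseudo-labels would be produced by an earlier version of the model, and `combined_loss` would be minimized by backpropagation on the current model's predictions.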

• Methods Using Forward Filter Learning 
In this approach, predefined filters are used in the
convolutional layers instead of learning them through
the traditional backpropagation algorithm. These
methods are trained for a short period at a low
computational cost and can perform well with
small-sized datasets. Two pioneering research works
adopted this approach for learning filters, namely
ScatNet (Bruna and Mallat, 2013) and PCANet
(Chan et al., 2015). The former used wavelet
operators to define the filters, while the latter used
the PCA method to define them.
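
A PCA-based filter bank in the spirit of PCANet can be obtained without backpropagation, as the following sketch suggests. It is a simplified single-stage illustration under our own assumptions (grayscale inputs, dense patch sampling, the function name `learn_pca_filters`) and omits the hashing and histogram stages of the full method.

```python
import numpy as np


def learn_pca_filters(images, filter_size, num_filters):
    """Learn convolution filters as leading principal components of patches.

    `images` is an (N, H, W) array of grayscale images. The result has shape
    (num_filters, filter_size, filter_size).
    """
    k = filter_size
    patches = []
    for image in images:
        height, width = image.shape
        for i in range(height - k + 1):
            for j in range(width - k + 1):
                patch = image[i:i + k, j:j + k].reshape(-1)
                patches.append(patch - patch.mean())  # remove the patch mean
    data = np.stack(patches)                    # (num_patches, k * k)
    covariance = data.T @ data / data.shape[0]
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigenvectors = np.linalg.eigh(covariance)
    leading = eigenvectors[:, ::-1][:, :num_filters]
    return leading.T.reshape(num_filters, k, k)
```

The filters are computed in a single pass over the data, which is why such methods train quickly and can cope with small datasets; the returned filters would then be used as fixed convolution kernels.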

Shi et al. (2016) proposed a method inspired by
PCANet called Color pattern Random Binary
Hashing-Based PCANet (C-RBH-PCANet). In this method,
filters learned using a PCA-based technique were
applied to a 2-block CNN (each block including a
convolutional layer, a non-linearity layer and a pooling
layer). They reported an AUC and accuracy of 0.88 and
81.51%, respectively, on a dataset that consists of 1000
patches from 66 WSI of size 800x1800 from the MGH
(Massachusetts General Hospital) dataset. (Huang et
al., 2017) proposed a 5-layer CNN model. The filters
were learned using principal component analysis on
public datasets (the University of Pittsburgh School of
Medicine, UPSM, dataset and the UCSB Bio-Segmentation,
UCSB-BS, Benchmark dataset) and a sample of
different types of histology images obtained through
the Google search engine. They achieved an accuracy
of 85% on 12,000 randomly selected patches of size
50x50 pixels from three datasets (the Netherlands
Cancer Institute dataset, the Vancouver General
Hospital dataset, and the UPSM and UCSB-BS public
datasets).
 
5.  DISCUSSION  
In this paper, the focus was on how CNN-based
methods addressed the two major challenges faced by
any CNN-based model, namely the design of an
appropriate CNN architecture and the acquisition of an
adequate labelled dataset.

Regarding the CNN architecture design, the survey
showed that the early methods adopted a handcrafted
approach that builds an adequate CNN architecture
from scratch by trial and error. These methods are
characterized by their relatively small depths (mostly
fewer than 7 layers), reasonable training times and
numbers of epochs, and moderate to good accuracy
(mostly from 70% to 90%). Tuning the values of the CNN
hyperparameters, such as the number of layers and the
number and sizes of the filters, is tedious and
time-consuming. Besides, the identified architectures
are difficult to adapt to new datasets. With the advent
of high-performing deep CNNs such as AlexNet (8
layers), VggNet (16 and 19 layers), GoogleNet (22
layers) and ResNet (152 layers), transfer learning
became the preferred approach for designing image
classifiers. The proposed methods attained better
accuracies on breast cancer histology images (from
mid-80% to mid-90%) with much less design effort, as
most of the time the design consists of retraining
and/or modifying the last few layers of a predesigned
CNN architecture. The main drawback of these methods
is the large number of learned weights that have to be
stored. Very recently, Neural Architecture Search
(NAS) techniques have been proposed. They aim to
automatically design optimized CNN architectures with
efficient hyperparameter settings that suit the dataset
under consideration. The designed CNNs can achieve
very high classification accuracies; the only work we
found addressing breast cancer histology image
classification reported an accuracy of 99%. The main
drawback of these methods resides in the significant
search time required to identify the appropriate
architecture.

Regarding the acquisition of an adequate labelled
dataset, the survey showed that the proposed breast
cancer histology image classification methods
addressed the problem in different ways. Some methods
make use of the large size of the Whole-Slide Images
or WSI (about 2040 x 1536) to extract a significant
number of small labelled patches of a few hundred
pixels in each dimension. The resulting images are
then used to train and test the designed CNN models,
which leads to accuracies in the range of mid-80% to
mid-90%. Other methods use pre-trained models, which
usually do not need a large number of images for
training because of the huge size of the dataset used
to design the pre-trained CNN. When there is a need to
further increase the size of the dataset, two approaches
are applied: data augmentation and weakly supervised
learning. In data augmentation, new images/patches
are generated from the collected labelled data by
applying image transformations such as rotations,
zooming, illumination variations, cropping, etc. The
classification accuracies obtained are usually very high
(from 85% to 99%). Weakly supervised learning acquires
labelled images through three techniques:
crowdsourcing, bagging or the semi-supervised
technique proposed by Chapelle et al. Methods
adopting these techniques achieved good classification
accuracies (from 83% to 89%). Recently, feedforward
approaches for learning filters were proposed to
overcome the drawbacks of the traditional
backpropagation method, including the need for large
labelled datasets. The proposed methods achieve good
accuracy levels (from 81% to 85%) with few layers
and labelled datasets of reasonable size.
 
6. CONCLUSION 

 
In this paper, we surveyed recent CNN-based methods
for breast cancer histology image classification. The
surge in high-performance GPU computing, coupled
with the availability of large amounts of training data,
has enabled CNNs to achieve state-of-the-art
performance in classification tasks. Although the
feature representations of pre-trained models have been
learned from natural images, these models achieve
high classification accuracy on histological images and
are therefore widely applied. Recently, training with
few samples has been receiving more attention from the
research community working with histology images
because of the limited size of this type of dataset.
This survey showed that automatic architecture design
and feedforward filter learning approaches have not
been fully explored for the analysis of breast histology
images, despite the impressive results obtained by the
former in classifying images and the promising
perspectives offered by the latter for dealing with the
lack of labelled histology image datasets. Therefore,
we believe these two directions need to be further
explored and could be of great potential and value in
the future.
 
CONFLICT OF INTEREST 
 
The authors declare that there are no conflicts of 
interest regarding this publication.  
 
FUNDING 
 
No funding was received for this study.
 
REFERENCES

Albarqouni, S., Baur, C., Achilles, F., Belagiannis,
V., Demirci, S., and Navab, N. (2016). AggNet: Deep
Learning From Crowds for Mitosis Detection in Breast
Cancer Histology Images. IEEE Transactions on
Medical Imaging, 35(5), 1313–1321.
https://doi.org/10.1109/TMI.2016.2528120

Alom, M. Z., Yakopcic, C., Nasrin, M. S., Taha, T. 
M., and Asari, V. K. (2019). Breast cancer 
classification from histopathological images with 
inception recurrent residual convolutional neural 
network. Journal of Digital Imaging, 32(4), 605–617. 

Anwar, F., Attallah, O., Ghanem, N., and Ismail,
M. A. (2020). Automatic breast cancer classification
from histopathological images. 2019 International
Conference on Advances in the Emerging Computing
Technologies, AECT 2019, 16–21.
https://doi.org/10.1109/AECT47998.2020.9194194

Araújo, T., Aresta, G., Castro, E., Rouco, J., 
Aguiar, P., Eloy, C., Polónia, A., and Campilho, A. 
(2017). Classification of breast cancer histology 
images using convolutional neural networks. PLOS 
One, 12(6). 

Bayramoglu, N., Kannala, J., and Heikkila, J.
(2017). Deep learning for magnification independent
breast cancer histopathology image classification.
Proceedings - International Conference on Pattern
Recognition, 2440–2445.
https://doi.org/10.1109/ICPR.2016.7900002

Bejnordi, B. E., Zuidhof, G., Balkenhol, M.,
Hermsen, M., Bult, P., van Ginneken, B.,
Karssemeijer, N., Litjens, G., and van der Laak, J.
(2017). Context-aware stacked convolutional neural
networks for classification of breast carcinomas in
whole-slide histopathology images. 1–13.
https://doi.org/10.1117/1.JMI.4.4.044504

Boureau, Y.-L., Bach, F., LeCun, Y., and Ponce, J. 
(2010). Learning mid-level features for recognition. 
2010 IEEE Computer Society Conference on Computer 
Vision and Pattern Recognition, 2559–2566. 

Boureau, Y.-L., Ponce, J., and LeCun, Y. (2010). A 
theoretical analysis of feature pooling in visual 
recognition. Proceedings of the 27th International 
Conference on Machine Learning (ICML-10), 111–
118. 

Brabham, D. C. (2008). Crowdsourcing as a model
for problem-solving: An introduction and cases.
Convergence, 14(1), 75–90.
https://doi.org/10.1177/1354856507084420

Budak, Ü., Cömert, Z., Rashid, Z. N., Şengür, A.,
and Çıbuk, M. (2019). Computer-aided diagnosis
system combining FCN and Bi-LSTM model for
efficient breast cancer detection from histopathological
images. Applied Soft Computing, 85, 105765.

Chakraborty, S., Aich, S., Kumar, A., Sarkar, S., 
Sim, J. S., and Kim, H. C. (2020). Detection of 
cancerous tissue in histopathological images using 
Dual-Channel Residual Convolutional Neural 
Networks (DCRCNN). International Conference on 
Advanced Communication Technology, ICACT, 2020, 
197–202. 
https://doi.org/10.23919/ICACT48636.2020.9061289 

Cireşan, D. C., Giusti, A., Gambardella, L. M.,
and Schmidhuber, J. (2013). Mitosis detection in breast
cancer histology images with deep neural networks.
International Conference on Medical Image
Computing and Computer-Assisted Intervention,
411–418.

Cruz-Roa, A., Basavanhally, A., González, F.,
Gilmore, H., Feldman, M., Ganesan, S., Shih, N.,
Tomaszewski, J., and Madabhushi, A. (2014).
Automatic detection of invasive ductal carcinoma in
whole slide images with convolutional neural
networks. 9041(216), 904103.
https://doi.org/10.1117/12.2043872

Cruz-Roa, A., Gilmore, H., Basavanhally, A., 
Feldman, M., Ganesan, S., Shih, N., Tomaszewski, J., 
Madabhushi, A., and González, F. (2018). High-
throughput adaptive sampling for whole-slide 
histopathology image analysis (HASHI) via 
convolutional neural networks: Application to invasive 
breast cancer detection. PLOS One, 13(5). 

Das, K., Conjeti, S., Roy, A. G., Chatterjee, J., and
Sheet, D. (2018). Multiple instance learning of deep
convolutional neural networks for breast
histopathology whole slide classification. Proceedings
- International Symposium on Biomedical Imaging,
2018-April(Isbi), 578–581.
https://doi.org/10.1109/ISBI.2018.8363642

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and
Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical
image database. 2009 IEEE Conference on Computer
Vision and Pattern Recognition, 248–255.

Dietterich, T. G., Lathrop, R. H., and Lozano-Pérez,
T. (1997). Solving the multiple instance
problem with axis-parallel rectangles. Artificial
Intelligence, 89(1–2), 31–71.
https://doi.org/10.1016/s0004-3702(96)00034-3

Goodfellow, I. J., Warde-Farley, D., Mirza, M., 
Courville, A., and Bengio, Y. (2013). Maxout 
Networks. ArXiv Preprint ArXiv:1302.4389. 

Han, Z., Wei, B., Zheng, Y., Yin, Y., Li, K., and Li, 
S. (2017). Breast Cancer Multi-classification from 
Histopathological Images with Structured Deep 
Learning Model. Scientific Reports, 7(1), 1–10. 
https://doi.org/10.1038/s41598-017-04075-z 

He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep 
residual learning for image recognition. Proceedings of 
the IEEE Conference on Computer Vision and Pattern 
Recognition, 770–778. 

He, K., Zhang, X., Ren, S., and Sun, J. (2015). 
Delving deep into rectifiers: Surpassing human-level 
performance on imagenet classification. Proceedings 
of the IEEE International Conference on Computer 
Vision, 1026–1034. 

Herrera, F., Ventura, S., Bello, R., Cornelis, C.,
Zafra, A., Sánchez-Tarragó, D., and Vluymans, S.
(2016). Multiple Instance Learning. In Multiple
Instance Learning.
https://doi.org/10.1007/978-3-319-47759-6_2

Huang, Y., Zheng, H., Liu, C., Ding, X., and 
Rohde, G. K. (2017). Epithelium-stroma classification 
via convolutional neural networks and unsupervised 
domain adaptation in histopathological images. IEEE 
Journal of Biomedical and Health Informatics, 21(6), 
1625–1632. 
https://doi.org/10.1109/JBHI.2017.2691738 

IARC. (2020). GLOBOCAN 2020: New Global
Cancer Data.
https://www.uicc.org/news/globocan-2020-new-global-cancer-data

Scottish Intercollegiate Guidelines Network. (2005).
SIGN Guideline No. 84: Management of breast cancer in
women. December.

Jaiswal, A. K., Panshin, I., Shulkin, D., Aneja, N., 
and Abramov, S. (2019). Semi-Supervised Learning for 
Cancer Detection of Lymph Node Metastases. 
http://arxiv.org/abs/1906.09587 

Jin, X., Xu, C., Feng, J., Wei, Y., Xiong, J., and 
Yan, S. (2016). Deep learning with s-shaped rectified 
linear activation units. Thirtieth AAAI Conference on 
Artificial Intelligence. 

Kaggle dataset. (2018).
https://www.kaggle.com/c/histopathologic-cancer-detection

Kausar, T., Wang, M. J., Idrees, M., and Lu, Y. 
(2019). HWDCNN: Multi-class recognition in breast 
histopathology with Haar wavelet decomposed image 
based convolution neural network. Biocybernetics and 
Biomedical Engineering, 39(4), 967–982. 
https://doi.org/10.1016/j.bbe.2019.09.003 

Koohbanani, N. A., Qaisar, T., Shaban, M.,
Gamper, J., and Rajpoot, N. (2018). Significance of
hyperparameter optimization for metastasis detection
in breast histology images. In Computational
Pathology and Ophthalmic Medical Image Analysis
(pp. 139–147). Springer.

Krizhevsky, A. (2009). Learning Multiple Layers
of Features from Tiny Images. … Science Department,
University of Toronto, Tech. …, 1–60.
https://doi.org/10.1.1.222.9220

Krizhevsky, A., and Hinton, G. (2010).
Convolutional deep belief networks on cifar-10.
Unpublished Manuscript, 1–9.
http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Convolutional+Deep+Belief+Networks+on+CIFAR-10#0

Krizhevsky, A., Sutskever, I., and Hinton, G. E. 
(2012). Imagenet classification with deep 
convolutional neural networks. Advances in Neural 
Information Processing Systems, 1097–1105. 

Kröse, B., and van der Smagt, P. (1993). An
introduction to neural networks.

Laxmisagar, H. S., and Hanumantharaju, M. C. 
(2021). Design of an Efficient Deep Neural Network 
for Multi-level Classification of Breast Cancer 
Histology Images. In Intelligent Computing and 
Applications (pp. 447–459). Springer. 

LeCun, Y. A., Bottou, L., Orr, G. B., and Müller, 
K.-R. (2012). Efficient backprop. In Neural networks: 
Tricks of the trade (pp. 9–48). Springer. 

Litjens, G., Sánchez, C. I., Timofeeva, N.,
Hermsen, M., Nagtegaal, I., Kovacs, I., Kaa, C. H. Van
De, Bult, P., and Ginneken, B. Van. (2016). Deep
learning as a tool for increased accuracy and efficiency
of histopathological diagnosis. Nature Publishing
Group, January, 1–11.
https://doi.org/10.1016/j.media.2020.101813

Maas, A. L., Hannun, A. Y., and Ng, A. Y. (2013). 
Rectifier nonlinearities improve neural network 
acoustic models. Proc. Icml, 30(1), 3. 

Mehra, R. (2018). Breast cancer histology images
classification: Training from scratch or transfer
learning? ICT Express, xxxx.
https://doi.org/10.1016/j.icte.2018.10.007

Motlagh, N. H., Jannesary, M., Aboulkheyr, H., 
Khosravi, P., Elemento, O., Totonchi, M., and 
Hajirasouliha, I. (2018). Breast cancer 
histopathological image classification: A deep learning 
approach. BioRxiv, 242818. 

Munien, C., and Viriri, S. (2021). Classification of 
Hematoxylin and Eosin-Stained Breast Cancer 
Histology Microscopy Images Using Transfer 
Learning with EfficientNets. Computational 
Intelligence and Neuroscience, 2021. 

Oyelade, O. N., and Ezugwu, A. E. (2021). A
bioinspired neural architecture search based
convolutional neural network for breast cancer
detection using histopathology images. Scientific
Reports, 11(1), 1–28.
https://doi.org/10.1038/s41598-021-98978-7

Roy, K., Banik, D., Bhattacharjee, D., and
Nasipuri, M. (2019). Patch-based system for
Classification of Breast Histology images using deep
learning. Computerized Medical Imaging and
Graphics, 71, 90–103.
https://doi.org/10.1016/j.compmedimag.2018.11.003

Ruifrok, A. C. (2019). Quantification of
histochemical staining by color deconvolution. Journal
of Chemical Information and Modeling, 53(9),
1689–1699.

Shi, J., Wu, J., Li, Y., Zhang, Q., and Ying, S. 
(2016). Histopathological image classification with 
color pattern random binary hashing-based PCANet 
and matrix-form classifier. IEEE Journal of 
Biomedical and Health Informatics, 21(5), 1327–1337. 

Sickles, E. A. (1997). Breast Cancer Screening
Outcomes in Women Ages 40–49: Clinical
Experience With Service Screening Using Modern
Mammography. 99–104.

Simonyan, K., and Zisserman, A. (2014). Very
Deep Convolutional Networks for Large-Scale Image
Recognition. 1–14.
https://doi.org/10.1016/j.infsof.2008.09.005

Song, Y., Zou, J. J., Chang, H., and Cai, W. (2017). 
Adapting fisher vectors for histopathology image 
classification. 2017 IEEE 14th International 
Symposium on Biomedical Imaging (ISBI 2017), 600–
603. 

Spanhol, F. (2016). A Dataset for Breast Cancer
Histopathological Image Classification. 63(November
2015), 1455–1462.
https://doi.org/10.1109/TBME.2015.2496264

Spanhol, Fabio A, Cavalin, P. R., Oliveira, L. S., 
Petitjean, C., and Heutte, L. (2017). Deep Features for 
Breast Cancer Histopathological Image 
Classification. 1868–1873. 

Spanhol, Fabio A, Oliveira, L. S., Petitjean, C., and 
Heutte, L. (2015). A dataset for breast cancer 
histopathological image classification. IEEE 
Transactions on Biomedical Engineering, 63(7), 1455–
1462. 

Spanhol, Fabio Alexandre, Oliveira, L. S.,
Petitjean, C., and Heutte, L. (2016). Breast cancer
histopathological image classification using
Convolutional Neural Networks. Proceedings of the
International Joint Conference on Neural Networks,
2016-Octob, 2560–2567.
https://doi.org/10.1109/IJCNN.2016.7727519

Sudharshan, P. J., Petitjean, C., Spanhol, F.,
Eduardo, L., Heutte, L., and Honeine, P. (2019).
Multiple instance learning for histopathological breast
cancer image classification. Expert Systems With
Applications, 117, 103–111.
https://doi.org/10.1016/j.eswa.2018.09.049

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed,
S., Anguelov, D., Erhan, D., Vanhoucke, V., and
Rabinovich, A. (2015). Going deeper with
convolutions. Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, 1–9.

Trottier, L., Giguère, P., Chaib-draa, B., and others.
(2017). Parametric exponential linear unit for deep
convolutional neural networks. 2017 16th IEEE
International Conference on Machine Learning and
Applications (ICMLA), 207–214.

Vang, Y. S., Chen, Z., and Xie, X. (2018). Deep
Learning Framework for Multi-class Breast Cancer
Histology Image Classification. Lecture Notes in
Computer Science (Including Subseries Lecture Notes
in Artificial Intelligence and Lecture Notes in
Bioinformatics), 10882 LNCS, 914–922.
https://doi.org/10.1007/978-3-319-93000-8_104

Vesal, S., Ravikumar, N., Davari, A., … (2018).
Classification of breast cancer histology images using
transfer learning. Springer.
https://link.springer.com/chapter/10.1007/978-3-319-93000-8_92

Vo, D. M., Nguyen, N. Q., and Lee, S. W. (2019).
Classification of breast cancer histology images using
incremental boosting convolution networks.
Information Sciences, 482, 123–138.
https://doi.org/10.1016/j.ins.2018.12.089

Wahab, N., Khan, A., and Lee, Y. S. (2017).
Two-phase deep convolutional neural network for
reducing class skewness in histopathological images
based breast cancer detection. Computers in Biology
and Medicine, 85(November 2016), 86–97.
https://doi.org/10.1016/j.compbiomed.2017.04.012

Wang, H., Cruz-Roa, A., Basavanhally, A., 
Gilmore, H., Shih, N., Feldman, M., Tomaszewski, J., 
Gonzalez, F., and Madabhushi, A. (2014). Mitosis 
detection in breast cancer pathology images by 
combining handcrafted and convolutional neural 
network features. Journal of Medical Imaging, 1(3), 
034003. https://doi.org/10.1117/1.JMI.1.3.034003 

Wistuba, M. (n.d.). A Survey on Neural 
Architecture Search. 

Xu, B., Wang, N., Chen, T., and Li, M. (2015). 
Empirical evaluation of rectified activations in 
convolutional network. ArXiv Preprint 
ArXiv:1505.00853. 

Zagoruyko, S., and Komodakis, N. (2016). Wide 
Residual Networks. British Machine Vision 
Conference 2016, BMVC 2016, 2016-Septe, 87.1-
87.12. https://doi.org/10.5244/C.30.87 

Zhou, Z.-H. (2018). A brief introduction to weakly 
supervised learning. National Science Review, 5(1), 
44–53. 

Zoph, B., and Le, Q. V. (2017). Neural architecture 
search with reinforcement learning. 5th International 
Conference on Learning Representations, ICLR 2017 - 
Conference Track Proceedings, 1–16. 

 