Malaria is a deadly mosquito-borne disease caused by Plasmodium parasites, which are transmitted through the bites of infected female mosquitoes. According to WHO reports, malaria parasites infect more than 200 million people and cause more than 400,000 deaths every year. Let’s put AI to a good cause and build a model that can classify blood cell images as healthy or infected with malaria.
Throughout this post we will use a Jupyter notebook for data exploration, model building, training, and testing. You can follow our other post to get started with Jupyter notebooks.
Dataset
We will use a high-quality image dataset provided by researchers at the Lister Hill National Center for Biomedical Communications (LHNCBC). This malaria dataset has 27,558 cell images with an equal number of parasitized and uninfected cells. You can find more information about the dataset here, and you can click this link to download it directly. After extracting the data from cell_images.zip, the folder structure looks like:
```
/content/cell_images
├── Parasitized  [13780 entries exceeds filelimit, not opening dir]
└── Uninfected  [13780 entries exceeds filelimit, not opening dir]
```
Exploring Dataset
The dataset contains an equal number of parasitized and uninfected images. However, the images are not all the same size: they range from 46×46×3 up to 384×395×3. So we need to resize the images before feeding them to our model. One thing to consider while resizing is the trade-off between size, memory/computation, and accuracy: the smaller the size, the lower the memory consumption and computation, but features may be lost during downsizing, which lowers accuracy. Unlike humans, CNN models are quite good at classifying small images (see CIFAR-10 classification performance comparisons). A useful rule of thumb is to set the input size near the minimum of the image size distribution, since upscaling (zooming) small images tends to hurt quality. Let’s dive into the code.
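As a quick, standalone sanity check (a sketch assuming the extracted folder layout shown above; the sample size of 500 files is an arbitrary choice), you can scan part of one folder and look at the range of image heights and widths:

```python
import os
import cv2

DATA_DIR = '/content/cell_images/'
heights, widths = [], []
for name in os.listdir(DATA_DIR + 'Parasitized/')[:500]:   # a small sample is enough for a rough idea
    if name.endswith('.png'):
        img = cv2.imread(DATA_DIR + 'Parasitized/' + name)
        if img is not None:
            heights.append(img.shape[0])
            widths.append(img.shape[1])
print(min(heights), max(heights), min(widths), max(widths))
```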
Import necessary modules
```python
%matplotlib inline
import numpy as np
np.random.seed(1000)

import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

# For CNN model creation
import keras
from keras.layers import Convolution2D, MaxPooling2D, Flatten, Dense, BatchNormalization, Dropout
from keras.models import Sequential
from keras.utils import to_categorical

# For working with images
import os
import cv2

# For sampling random images to visualize
import random
```
We are going to use Keras with a TensorFlow backend for the CNN model, OpenCV for image manipulation, and matplotlib for data visualization.
```python
# Constant dataset information
DATA_DIR = '/content/cell_images/'
SIZE = 100
dataset = []
label = []

parasitized_images = os.listdir(DATA_DIR + 'Parasitized/')
for i, image_name in enumerate(parasitized_images):
    try:
        if image_name.split('.')[1] == 'png':
            image = cv2.imread(DATA_DIR + 'Parasitized/' + image_name)
            image = cv2.resize(image, (SIZE, SIZE))
            dataset.append(np.array(image))
            label.append(0)
    except Exception as e:
        print("Could not read image {} with name {}".format(i, image_name))
        print(e)

uninfected_images = os.listdir(DATA_DIR + 'Uninfected/')
for i, image_name in enumerate(uninfected_images):
    try:
        if image_name.split('.')[1] == 'png':
            image = cv2.imread(DATA_DIR + 'Uninfected/' + image_name)
            image = cv2.resize(image, (SIZE, SIZE))
            dataset.append(np.array(image))
            label.append(1)
    except Exception as e:
        print("Could not read image {} with name {}".format(i, image_name))
        print(e)
```
After extracting the data from the zip file, two folders named Parasitized and Uninfected are created for the parasitized and uninfected cells respectively, so we need to read data from both of these directories.
For each of the two directories we iterate over all files and process only the PNG images. Each image is read with OpenCV’s imread function and resized, which is essential because the model expects a fixed input shape; the resized image is then appended to the dataset list. Likewise, we append 0 (parasitized) or 1 (uninfected) to the label list, since this is a binary classification problem. We will later convert this label vector to a binary matrix representation.
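For instance, keras.utils.to_categorical (which we call later, just before the train/test split) performs exactly this conversion from integer labels to one-hot rows:

```python
from keras.utils import to_categorical

print(to_categorical([0, 1, 1, 0]))
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [1. 0.]]
```

Now let’s visualize a few random samples from the dataset: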
```python
def plot_figures(figures, nrows=1, ncols=1):
    """Plot a list of figures.

    Parameters
    ----------
    figures : list of (title, image) tuples
    ncols : number of columns of subplots wanted in the display
    nrows : number of rows of subplots wanted in the figure
    """
    fig, axeslist = plt.subplots(ncols=ncols, nrows=nrows)
    for ind, (title, image) in enumerate(figures):
        axeslist.ravel()[ind].imshow(image / 255.0)
        axeslist.ravel()[ind].set_title(title)
        axeslist.ravel()[ind].set_axis_off()
    plt.tight_layout()  # optional

figures = [('infected' if not label[ran] else 'healthy', dataset[ran])
           for ran in random.sample(range(0, len(dataset) - 1), 9)]
plot_figures(figures, 3, 3)
```
This block shows 9 random images from the dataset along with their labels. The list comprehension at the end builds a list of tuples, each consisting of (<label>, <image>) for a randomly sampled index. The plot_figures function takes this list and shows each image in a subplot. The output of this block is:

Based on the sample images above, we can notice some subtle differences between cells infected with malaria and healthy cells. We will train our deep learning model to detect and learn these patterns.
Building Model
Before building and training our model, let’s make sure the training data is in random order.
```python
# Convert the lists to numpy arrays so they can be indexed with a permutation
dataset = np.array(dataset)
label = np.array(label)

n = np.arange(len(dataset))
np.random.shuffle(n)

dataset = dataset[n]
label = label[n]
```
In the above block we convert the lists to numpy arrays and shuffle the dataset and the labels with the same index permutation, so that image-label pairs stay aligned and the two classes are well mixed before splitting the data.
Now, let’s build our model.
```python
l2_reg = keras.regularizers.l2(l=0.001)

label = to_categorical(label)
X_train, X_test, Y_train, Y_test = train_test_split(dataset, label, test_size=0.3)
print(np.array(Y_train).shape)

classifier = Sequential()

classifier.add(Convolution2D(32, (3, 3), input_shape=(SIZE, SIZE, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), data_format="channels_last"))
classifier.add(BatchNormalization(axis=-1))
classifier.add(Dropout(0.1))

classifier.add(Convolution2D(64, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), data_format="channels_last"))
classifier.add(BatchNormalization(axis=-1))
classifier.add(Dropout(0.1))

classifier.add(Convolution2D(128, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), data_format="channels_last"))
classifier.add(BatchNormalization(axis=-1))
classifier.add(Dropout(0.1))

classifier.add(Convolution2D(128, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), data_format="channels_last"))
classifier.add(BatchNormalization(axis=-1))
classifier.add(Dropout(0.1))

classifier.add(Flatten())

classifier.add(Dense(activation='relu', kernel_regularizer=l2_reg, units=256))
classifier.add(BatchNormalization(axis=-1))
classifier.add(Dropout(0.25))

classifier.add(Dense(activation='relu', kernel_regularizer=l2_reg, units=512))
classifier.add(BatchNormalization(axis=-1))
classifier.add(Dropout(0.25))

classifier.add(Dense(activation='softmax', units=2))

classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
classifier.summary()
```
After converting the labels with to_categorical and splitting the data, we define a sequential model (classifier = Sequential()) and append the different layers to it. First of all, let’s discuss the different layers used in our model:
Convolution Layer: The convolution layer is the main building block of a convolutional neural network. It contains a set of filters (32 filters in the first layer of the model above) whose parameters need to be learned. These filters extract different features from the image by sliding over the entire image (the convolution operation). After every convolution operation, which is a linear function, we apply the ReLU activation function, which introduces non-linearity into the convolutional layer.
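As a quick standalone sketch (mirroring the first convolution layer of the model above, with SIZE = 100), Keras reports the resulting feature-map shape and the number of learnable parameters:

```python
from keras.models import Sequential
from keras.layers import Convolution2D

# 32 filters of size 3x3 applied to a 100x100 RGB input
demo = Sequential()
demo.add(Convolution2D(32, (3, 3), input_shape=(100, 100, 3), activation='relu'))
demo.summary()
# Output shape: (None, 98, 98, 32) -> 32 feature maps
# Parameters:   3*3*3*32 + 32 = 896 learnable weights and biases
```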
Pooling Layer: The pooling layer makes our model robust to translation of features in the image. Put simply, in the sample images above the dark stain-like feature of an infected cell can appear anywhere in the image. Pooling is also performed by a filter-like window sliding over the entire feature map, but the output of a single pooling operation is either the average of all elements under the window or the element with the maximum value, depending on the layer type. For example, in the model above we use MaxPooling2D with a window of size (2, 2).
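As a toy numpy sketch of the max-pooling operation itself (not how Keras implements it internally), 2×2 max pooling with stride 2 keeps the largest value in each 2×2 block:

```python
import numpy as np

feature_map = np.array([[1, 3, 2, 0],
                        [4, 6, 5, 1],
                        [7, 2, 9, 3],
                        [0, 8, 4, 4]])
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))   # max over each 2x2 block
print(pooled)   # [[6 5]
                #  [8 9]]
```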
Batch Normalization Layer: Its definition is a little complicated and full of mathematical jargon, so let’s start with basic normalization first. Normalization is the process of rescaling input features to a common distribution. The best example I could come up with is this black and colored cat example.
Suppose we build a deep learning model to classify cat vs. not-cat and train it only on black cats. When we test the model with a colored cat, it classifies it as not-cat. Why? Because the test image distribution is different from the distribution of the images used for training, and the model has learned color as one of the features. How do we overcome this problem? We normalize the data by subtracting the mean from every element and dividing by the standard deviation. This gives the input a distribution with mean 0 and standard deviation close to 1, and much of the color-specific bias is removed from the dataset. Different types of normalization techniques can be found at this link.
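A quick numpy check of this standardization (the values below are made up):

```python
import numpy as np

x = np.array([120., 130., 200., 90., 160.])      # hypothetical raw feature values
x_norm = (x - x.mean()) / x.std()                 # subtract the mean, divide by the standard deviation
print(round(x_norm.mean(), 6), round(x_norm.std(), 6))   # ~0.0 and ~1.0
```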
Likewise, batch normalization is a method used to normalize the activations of the previous layer (in the hidden layers) over each batch, i.e. it applies a transformation that keeps the mean activation close to 0 and the activation standard deviation close to 1. More on this topic can be found in this post.
Dropout Layer: Dropout is a regularization technique used to avoid overfitting. It works by dropping random neurons in the hidden (and sometimes visible) layers during training. Doing so removes those neurons’ contribution from both the forward pass and the backward pass, i.e. from the activations and from the weight updates.
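A rough numpy sketch of the idea behind (inverted) dropout; this is only an illustration with made-up numbers, not Keras’s internal implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
activations = rng.random((1, 8))            # hypothetical activations of one layer for a single sample
rate = 0.25                                 # fraction of neurons to drop, like Dropout(0.25) above

keep_mask = rng.random(activations.shape) >= rate   # keep each neuron with probability 1 - rate
dropped = activations * keep_mask / (1 - rate)      # rescale so the expected activation stays the same
print(dropped)                              # roughly a quarter of the values are zeroed out
```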
Flatten Layer: All the layers discussed so far preserve the 2D spatial information of the image, but at some point we need output units with an activation function like softmax, which gives a probability distribution over the classes, and such a layer needs a one-dimensional vector as input. The flatten layer converts those 2D feature maps into a one-dimensional feature vector.
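A tiny shape illustration (the shapes here are arbitrary, not the exact shapes of our network; classifier.summary() prints the real ones):

```python
import numpy as np

feature_maps = np.zeros((10, 10, 128))   # e.g. 128 feature maps of size 10x10
flattened = feature_maps.reshape(-1)     # what a Flatten layer does per sample
print(flattened.shape)                   # (12800,)
```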
Dense Layer: This is a normal fully connected layer. The final output layer is a fully connected layer with 2 neurons and softmax as the activation function; it gives us a probability distribution over the two classes for each input.
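For intuition, here is what softmax does to two raw output scores (the logits below are made up):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, -1.0])   # hypothetical raw outputs of the final 2-unit Dense layer
print(softmax(logits))           # ~[0.953, 0.047]: a probability distribution over the two classes
```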
Our model consists of four convolution layers, each followed by a max pooling layer, batch normalization after the non-linearity, and dropout for regularization. Tuning hyper-parameters such as the depth of the network, the number of learnable filters, the filter sizes, and the choice of activation functions is beyond the scope of this post; I will write another post about hyperparameter tuning in CNNs in the future. So let’s stick with this architecture for now.
Compiling and Training
Before training our model, we need to configure parameters such as the loss function, the optimizer, and the metrics used to measure how well our model is performing. This is done by the compile() call at the end of the model-definition block above.
Now let’s train our model
```python
history = classifier.fit(np.array(X_train),
                         np.array(Y_train),
                         batch_size=64,
                         verbose=1,
                         epochs=20,
                         validation_split=0.1,
                         shuffle=False)

print("Test Accuracy: {:.2f}%".format(classifier.evaluate(np.array(X_test), np.array(Y_test))[1] * 100))
```
We get a validation accuracy of about 96%, which is good.

Now let’s plot the basic performance metrics of the model:
```python
fig, ax = plt.subplots()
print(history.history.keys())
fig.set_size_inches(8, 8)
fig.set_dpi(80)

ax.plot(history.history['loss'], '--')
ax.plot(history.history['val_acc'], ':')
ax.plot(history.history['acc'], '-.')
ax.plot(history.history['val_loss'], '-')

ax.legend(['loss', 'val_acc', 'acc', 'val_loss'], loc=0)
ax.set_title("Model Performance")
ax.set_xlabel("Epochs")
fig.show()
```

Testing And Saving Model
At this point we have defined our model and trained it with the input data. Now it’s time to actually test our model and see the results. Let’s do it:
```python
test_samp = X_test[:4]
# The labels are not needed for prediction, but we can keep them to validate the predictions
test_label = Y_test[:4]

pred = classifier.predict(test_samp)
pred = np.argmax(pred, axis=-1)

figures = [('infected' if not pred[ind] else 'healthy', img) for ind, img in enumerate(test_samp)]
plot_figures(figures, 2, 2)
```
Here we take 4 sample images from the test set and run classifier.predict on them. As pointed out earlier, the result of the prediction is a probability distribution over the classes and looks something like:
```python
array([[0.98511374, 0.0148863 ],
       [0.9828425 , 0.01715751],
       [0.94831526, 0.0516848 ],
       [0.15194416, 0.84805584]], dtype=float32)
```
Here np.argmax() finds, for each row, the index with the maximum value, which represents the predicted class (0 = infected, 1 = healthy).
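For example, applied to the probability array shown above (the values are copied from that output), np.argmax picks the larger entry in each row:

```python
import numpy as np

pred = np.array([[0.98511374, 0.0148863 ],
                 [0.9828425 , 0.01715751],
                 [0.94831526, 0.0516848 ],
                 [0.15194416, 0.84805584]])
print(np.argmax(pred, axis=-1))   # [0 0 0 1] -> first three classified as infected, the last as healthy
```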

And finally it’s time to save our model.
```python
classifier.save("<path to file>.h5")
```
This will save our model architecture, weights, training configuration, etc. into a single HDF5 file.
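A minimal sketch of loading the saved model back and reusing it (the path is whatever you passed to save()):

```python
from keras.models import load_model

classifier = load_model("<path to file>.h5")      # restores architecture, weights and training configuration
pred = classifier.predict(np.array(X_test[:4]))   # ready to predict again without retraining
```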
*****
If you have any questions or comments regarding the post (or something else), please feel free to reach out through the comments.