# Anomaly Detection with Autoencoders

(C) 2020 - Umberto Michelucci, Michela Sperti

This notebook is part of the book Applied Deep Learning: a case based approach, 2nd edition from APRESS by U. Michelucci and M. Sperti.

This notebook is referenced in Chapters 25 and 26 of the book.

## Notebook learning goals

At the end of this notebook you will be able to build a simple anomaly detection algorithm using autoencoders built with Dense layers in Keras.

Datasets used:

• MNIST dataset

• Fashion MNIST dataset

## Libraries Import

import numpy as np
import tensorflow.keras as keras
import pandas as pd
import time
import sys
import seaborn as sns
import matplotlib.pyplot as plt

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

import tensorflow as tf


## MNIST and Fashion MNIST datasets

For this notebook we will use two datasets:

• MNIST, a collection of handwritten digits

• Fashion MNIST, a collection of grayscale images of clothing items

Both can be imported easily with Keras. Below you can see how easy it is using tensorflow.keras.datasets.

from tensorflow.keras.datasets import mnist
(mnist_x_train, mnist_y_train), (mnist_x_test, mnist_y_test) = mnist.load_data()



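As usual, we apply the typical normalisation to the dataset, as you can see below: the pixel values are rescaled to the range [0, 1] and each $$28 \times 28$$ image is flattened into a vector of 784 values. At this point in the book you should be able to understand the code below easily.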

mnist_x_train = mnist_x_train.astype('float32') / 255.
mnist_x_test = mnist_x_test.astype('float32') / 255.
mnist_x_train = mnist_x_train.reshape((len(mnist_x_train), np.prod(mnist_x_train.shape[1:])))
mnist_x_test = mnist_x_test.reshape((len(mnist_x_test), np.prod(mnist_x_test.shape[1:])))
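
To make the effect of the rescaling and the reshape concrete, here is a minimal sketch that applies the same two steps to a small random array. The array `fake_images` and its size are assumptions made for illustration only; it is a stand-in for the real images so that the example runs without downloading anything.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 5 grayscale images of 28x28 pixels,
# stored as unsigned 8-bit integers in the range [0, 255].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Step 1: convert to float32 and rescale the pixel values to [0, 1].
scaled = fake_images.astype('float32') / 255.

# Step 2: flatten each 28x28 image into a vector of 784 values.
flat = scaled.reshape((len(scaled), np.prod(scaled.shape[1:])))

print(flat.shape)              # (5, 784)
print(flat.min(), flat.max())  # both values lie in [0, 1]
```

The flattening is needed because the Dense layers used later expect one vector per sample rather than a two-dimensional image.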

from tensorflow.keras.datasets import fashion_mnist
(fashion_x_train, fashion_y_train), (fashion_x_test, fashion_y_test) = fashion_mnist.load_data()



Note that we apply the same normalisation to the Fashion MNIST dataset as to the classical MNIST.

fashion_x_train = fashion_x_train.astype('float32') / 255.
fashion_x_test = fashion_x_test.astype('float32') / 255.
fashion_x_train = fashion_x_train.reshape((len(fashion_x_train), np.prod(fashion_x_train.shape[1:])))
fashion_x_test = fashion_x_test.reshape((len(fashion_x_test), np.prod(fashion_x_test.shape[1:])))


## Problem to be solved

Now let’s create a special dataset made of the 10000 images of the MNIST test dataset and one single image from the Fashion MNIST dataset. Our goal in this notebook is to find this image automatically, without looking at the images ourselves. Can we do it?

x_test = np.concatenate((mnist_x_test, fashion_x_test[0].reshape(1,784)))
print(x_test.shape)

(10001, 784)
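
The reshape in the cell above matters because a single image has shape (784,) and must become a (1, 784) row before it can be stacked under the other images. A minimal sketch with toy arrays (`normal` and `other` are hypothetical stand-ins, not the real datasets) shows the mechanics and confirms that the appended sample ends up in the last row:

```python
import numpy as np

# Hypothetical stand-ins: 4 "normal" samples and 3 samples from another source,
# each flattened to 784 values like the images in this notebook.
normal = np.zeros((4, 784), dtype='float32')
other = np.ones((3, 784), dtype='float32')

# other[0] has shape (784,), so it is reshaped to (1, 784) before
# being concatenated along the first axis with the (4, 784) array.
mixed = np.concatenate((normal, other[0].reshape(1, 784)))

print(mixed.shape)       # (5, 784)
print(mixed[-1].mean())  # 1.0: the appended sample is the last row
```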


All the images in the MNIST dataset are handwritten digits. Below you can see an example of one.

plt.imshow(mnist_x_test[10].reshape(28,28))


The images in Fashion MNIST, instead, are all grayscale images of clothing items. In particular, the one we are adding to the handwritten digits is the image of a shoe, which can be seen below.

plt.imshow(fashion_x_test[0].reshape(28,28))


## Function to create the autoencoders

Now we need to create the Keras models. An autoencoder is made of two main parts: an encoder and a decoder. The function create_autoencoders() below returns three separate models:

• the encoder

• the decoder

• the complete model, in which the encoder and decoder are joined in one model.

def create_autoencoders(feature_layer_dim = 16):
    input_img = Input(shape = (784,), name = 'Input_Layer')
    # The layer encoded has a dimension equal to feature_layer_dim and contains
    # the encoded input (therefore the name)
    encoded = Dense(feature_layer_dim, activation = 'relu', name = 'Encoded_Features')(input_img)
    decoded = Dense(784, activation = 'sigmoid', name = 'Decoded_Input')(encoded)

    # Complete model (input -> reconstruction) and encoder (input -> features)
    autoencoder = Model(input_img, decoded)
    encoder = Model(input_img, encoded)

    # Stand-alone decoder: it reuses the output layer of the autoencoder,
    # so it shares the same weights
    encoded_input = Input(shape = (feature_layer_dim,))
    decoder_layer = autoencoder.layers[-1]
    decoder = Model(encoded_input, decoder_layer(encoded_input))

    return autoencoder, encoder, decoder
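
To make the roles of the two parts more tangible, the sketch below reproduces in plain NumPy what a Dense layer computes, namely activation(x · W + b), for an encoder with ReLU and a decoder with sigmoid. It is not the Keras implementation: the weights are random and untrained, and all names (`W_enc`, `b_enc`, `W_dec`, `b_dec`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def relu(z):
    # ReLU activation: negative values are set to zero.
    return np.maximum(z, 0.0)

def sigmoid(z):
    # Sigmoid activation: squashes values into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
input_dim, feature_dim = 784, 16

# Randomly initialised (untrained) weights, for illustration only.
W_enc = rng.normal(scale=0.05, size=(input_dim, feature_dim))
b_enc = np.zeros(feature_dim)
W_dec = rng.normal(scale=0.05, size=(feature_dim, input_dim))
b_dec = np.zeros(input_dim)

# Three fake flattened images with pixel values in [0, 1).
x = rng.random((3, input_dim))

encoded = relu(x @ W_enc + b_enc)           # encoder: 784 -> 16 features
decoded = sigmoid(encoded @ W_dec + b_dec)  # decoder: 16 -> 784 pixels

print(encoded.shape)  # (3, 16)
print(decoded.shape)  # (3, 784)
```

The sigmoid on the output keeps every reconstructed pixel between 0 and 1, which matches the range of the normalised input images.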


## Autoencoder with layers with $$(784,64,784)$$ neurons

As a first step let’s create an autoencoder with the layer dimensions of $$(784, 64, 784)$$.

autoencoder, encoder, decoder = create_autoencoders (64)

keras.utils.plot_model(autoencoder, show_shapes=True)


As for any Keras model, we need to compile the model and then fit it to the data. As you can see, we don’t need any custom code to work with autoencoders. A simple model definition $$\rightarrow$$ compile $$\rightarrow$$ fit is enough. Note that the input images are also used as the targets, since the network is trained to reproduce its input.

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

history = autoencoder.fit(mnist_x_train, mnist_x_train,
epochs=30,
batch_size=256,
shuffle=True,
validation_data=(mnist_x_test, mnist_x_test),
verbose = 0)
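
The loss chosen in the compile step, binary cross-entropy, treats each pixel value in [0, 1] as a target and penalises reconstructions that are far from it. The following is a minimal NumPy sketch of that idea, not the Keras implementation; the function name, the clipping constant `eps` and the toy arrays are assumptions of this example.

```python
import numpy as np

def binary_crossentropy(x, x_rec, eps=1e-7):
    # Clip the reconstruction to avoid log(0); eps is an assumption of this sketch.
    x_rec = np.clip(x_rec, eps, 1.0 - eps)
    per_pixel = -(x * np.log(x_rec) + (1.0 - x) * np.log(1.0 - x_rec))
    # Average over the pixels of each sample.
    return per_pixel.mean(axis=-1)

# One toy "image" with 4 pixels, plus a good and a bad reconstruction of it.
x = np.array([[0.0, 1.0, 1.0, 0.0]])
good = np.array([[0.1, 0.9, 0.9, 0.1]])
bad = np.array([[0.9, 0.1, 0.1, 0.9]])

loss_good = binary_crossentropy(x, good)
loss_bad = binary_crossentropy(x, bad)

print(loss_good, loss_bad)  # the bad reconstruction has the larger loss
```

During training the optimizer adjusts the weights to reduce this quantity, which pushes the reconstruction towards the input.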

encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)


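We can now calculate the reconstruction error of image $$j$$ ($$\textrm{RE}^{[j]}$$) as the mean squared difference between the original and the reconstructed pixels:

$\textrm{RE}^{[j]} = \frac{1}{m}\sum_{i=1}^{m}\left(x_i^{[j]}-x_{rec,i}^{[j]}\right)^2$

where $$x_i^{[j]}$$ is the $$i^{th}$$ pixel value of the $$j^{th}$$ image, $$x_{rec,i}^{[j]}$$ is the corresponding pixel of its reconstruction, $$m = 784$$ is the number of pixels, and the sum is over all the pixels.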

RE = ((x_test - decoded_imgs)**2).mean(axis=1)
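
As a small worked example, the sketch below computes the reconstruction error for two toy "images" of only 4 pixels each (the arrays are hypothetical and chosen so that the result can be checked by hand). It also shows that dividing the sum of squared differences by the number of pixels $$m$$ gives the same result as calling mean(axis=1).

```python
import numpy as np

# Two hypothetical "images" with m = 4 pixels each, and their reconstructions.
x = np.array([[0.0, 1.0, 0.0, 1.0],
              [0.5, 0.5, 0.5, 0.5]])
x_rec = np.array([[0.0, 1.0, 0.0, 0.0],   # one pixel is wrong by 1.0
                  [0.5, 0.5, 0.5, 0.5]])  # perfect reconstruction

m = x.shape[1]

# Explicit form of the formula: sum of squared differences divided by m.
RE_explicit = ((x - x_rec) ** 2).sum(axis=1) / m

# Vectorised form used in this notebook.
RE_mean = ((x - x_rec) ** 2).mean(axis=1)

# First image: (1.0)^2 / 4 = 0.25, second image: 0.0
print(RE_explicit)
print(RE_mean)
```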


The $$\textrm{RE}$$ for the image of the shoe we added can be easily printed since it is the last element of the vector RE.

RE[-1]

0.05830472


It is easy to see that this is by far the highest $$\textrm{RE}$$ among all 10001 images. We can check this by sorting the reconstruction errors.

RE_sorted = np.sort(RE)
print(RE_sorted[9990:])

[0.01794755 0.01799489 0.01815483 0.01842355 0.01906578 0.01941952
0.02057061 0.02083137 0.02164584 0.024418   0.05830472]
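
Sorting the values shows how large the errors are, but not which images they belong to. One possible way to recover the positions is np.argsort, which returns the indices that would sort the array. The sketch below uses a small hypothetical error vector (`RE_toy`) rather than the real one.

```python
import numpy as np

# Hypothetical reconstruction errors for 6 samples; sample 3 is the outlier.
RE_toy = np.array([0.010, 0.018, 0.012, 0.058, 0.024, 0.009])

# argsort returns the indices that would sort the array in ascending order,
# so reversing it lists the samples from highest to lowest error.
ranking = np.argsort(RE_toy)[::-1]
top_two = ranking[:2]

print(top_two)          # indices of the two largest errors
print(RE_toy[top_two])  # their error values
```

Applied to the vector of reconstruction errors of this notebook, the first index returned this way would point to the image that the autoencoder reconstructs worst.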


The second highest reconstruction error (0.024418) is less than half of the $$\textrm{RE}$$ for the added image (0.05830472). Below you can see the original image and the one that the trained autoencoder has reconstructed. The reconstructed image does not look like the original at all.

fig = plt.figure(figsize = (14,7))

# Left panel: the original image
plt.subplot(1, 2, 1)
plt.title("Original Image", fontsize = 16)
plt.imshow(x_test[10000].reshape(28,28))

# Right panel: the reconstruction produced by the autoencoder
plt.subplot(1, 2, 2)
plt.title("Reconstructed Image", fontsize = 16)
plt.imshow(decoded_imgs[10000].reshape(28,28))


The autoencoder, on the other hand, reconstructs handwritten digits well, as you can see below.

fig = plt.figure(figsize = (14,7))

# Left panel: the original image
plt.subplot(1, 2, 1)
plt.title("Original Image", fontsize = 16)
plt.imshow(x_test[500].reshape(28,28))

# Right panel: the reconstruction produced by the autoencoder
plt.subplot(1, 2, 2)
plt.title("Reconstructed Image", fontsize = 16)
plt.imshow(decoded_imgs[500].reshape(28,28))


The image with the second highest reconstruction error (and its reconstructed version) is shown below. The reason for the high error is clear: this image does not look like a handwritten digit at all! It could even count as an outlier in the dataset.

fig = plt.figure(figsize = (14,7))


<matplotlib.image.AxesImage at 0x7f9362d7ca20>