Neural Network with Keras

We put considerable effort into programming a neural network with numpy that can classify handwritten digits. Many others have done this before, and since neural networks are the basis for many applications today, a large number of APIs (application programming interfaces) exist for them. Python plays a leading role here. In the following, we use the interface provided by the keras module. keras sits on top of the actual machine learning API, which in our case is tensorflow. It makes tensorflow a bit friendlier to use, and the example below shows how much shorter our code becomes with the keras and tensorflow APIs.

[14]:
from keras.datasets import mnist
from keras.utils import to_categorical

from keras import Sequential
from keras.layers import Dense

import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline
%config InlineBackend.figure_format = 'retina'


plt.rcParams.update({'font.size': 18,
                     'axes.titlesize': 20,
                     'axes.labelsize': 20,
                     'axes.labelpad': 1,
                     'lines.linewidth': 2,
                     'lines.markersize': 10,
                     'xtick.labelsize' : 18,
                     'ytick.labelsize' : 18,
                     'xtick.top' : True,
                     'xtick.direction' : 'in',
                     'ytick.right' : True,
                     'ytick.direction' : 'in'
                    })

MNIST Data Set (Keras)

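This loads the same data as in our previous notebook, except that keras provides the loading function directly. Each 28×28 image is flattened into a vector of 784 values and scaled to the range [0, 1], and the labels are one-hot encoded.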

[9]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.reshape((60000, 28*28))
x_train = x_train.astype('float32')/255

x_test = x_test.reshape((10000, 28*28))
x_test = x_test.astype('float32')/255

# one-hot encoding
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
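
To make the one-hot encoding step concrete, here is a minimal numpy sketch of what an encoding like the one produced by to_categorical looks like for integer labels. The helper name one_hot and the sample labels are illustrative assumptions, not part of keras.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Hypothetical helper: one row per label, with a single 1 at the label's index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Fancy indexing sets encoded[row, label] = 1 for every row at once.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([5, 0, 4])
print(example)
```

Each row contains exactly one 1, so the label 5 becomes a vector whose entry at index 5 is 1 and whose other nine entries are 0. This matches the 10 output neurons of the network, one per digit.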

Build the model

The next few lines create the whole neural network: an input layer with 28·28 = 784 values, a hidden layer with 64 neurons, and an output layer with 10 neurons.

[10]:
model = Sequential([
    Dense(64, activation='sigmoid', input_shape=(28 * 28, )),
    Dense(10, activation='softmax')
])
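
For comparison with our earlier numpy implementation, the following sketch spells out the forward pass that a network of this shape computes, and counts its trainable parameters. The random initialization and the names W1, b1, W2, b2 are assumptions for illustration; keras uses its own initializers and stores the weights internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameters with the same shapes as the two Dense layers (values are illustrative).
W1 = rng.normal(scale=0.1, size=(784, 64))
b1 = np.zeros(64)
W2 = rng.normal(scale=0.1, size=(64, 10))
b2 = np.zeros(10)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row maximum for numerical stability before exponentiating.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x):
    hidden = sigmoid(x @ W1 + b1)      # hidden layer, 64 neurons
    return softmax(hidden @ W2 + b2)   # output layer, 10 class probabilities

x = rng.random((5, 784))               # five fake "images" with pixel values in [0, 1)
probs = forward(x)
n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape, n_params)           # (5, 10) 50890
```

The parameter count is 784·64 + 64 = 50240 for the hidden layer plus 64·10 + 10 = 650 for the output layer.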

Compile the model

The compile method configures the model for training. Here you specify the optimizer, in this case stochastic gradient descent (SGD), as well as the loss function and the metrics to report during training.

[11]:
model.compile(optimizer='SGD',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
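
As a rough illustration of the two ingredients chosen here, the sketch below computes a categorical cross-entropy and a single plain gradient descent update in numpy. The function names, the clipping constant eps, and the example gradient are assumptions; the learning rate of 0.01 is the default of the keras SGD optimizer.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(y_pred)); eps avoids log(0)."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def sgd_step(param, grad, learning_rate=0.01):
    """One plain gradient descent update: move against the gradient."""
    return param - learning_rate * grad

# Two one-hot labels (digits 3 and 7) and a network that guesses uniformly.
y_true = np.eye(10)[[3, 7]]
uniform = np.full((2, 10), 0.1)
loss_uniform = categorical_crossentropy(y_true, uniform)
print(round(loss_uniform, 4))   # ln(10), about 2.3026

# Illustrative parameter vector and gradient.
w = np.array([1.0, -2.0])
w_new = sgd_step(w, grad=np.array([0.5, -0.5]))
print(w_new)
```

A network that assigns probability 0.1 to every digit has a loss of ln(10) ≈ 2.30, which gives a reference point for judging the loss values reported during training.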

Train the model

Finally, the fit method allows us to train the model for a specified number of epochs.

[12]:
model.fit(x_train, y_train, epochs=10)
Epoch 1/10
1875/1875 [==============================] - 2s 804us/step - loss: 1.5120 - accuracy: 0.6947
Epoch 2/10
1875/1875 [==============================] - 1s 779us/step - loss: 0.7621 - accuracy: 0.8422
Epoch 3/10
1875/1875 [==============================] - 1s 770us/step - loss: 0.5596 - accuracy: 0.8684
Epoch 4/10
1875/1875 [==============================] - 1s 744us/step - loss: 0.4720 - accuracy: 0.8819
Epoch 5/10
1875/1875 [==============================] - 1s 776us/step - loss: 0.4225 - accuracy: 0.8901
Epoch 6/10
1875/1875 [==============================] - 1s 789us/step - loss: 0.3907 - accuracy: 0.8961
Epoch 7/10
1875/1875 [==============================] - 2s 803us/step - loss: 0.3681 - accuracy: 0.8996
Epoch 8/10
1875/1875 [==============================] - 1s 794us/step - loss: 0.3509 - accuracy: 0.9026
Epoch 9/10
1875/1875 [==============================] - 1s 781us/step - loss: 0.3373 - accuracy: 0.9058
Epoch 10/10
1875/1875 [==============================] - 1s 770us/step - loss: 0.3261 - accuracy: 0.9081
[12]:
<tensorflow.python.keras.callbacks.History at 0x7fb436b23a90>
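
The number 1875 in the progress bars is the number of mini-batches per epoch. Since no batch_size was passed to fit, keras uses its default of 32 samples per batch. The short calculation below reproduces that number; the variable names are chosen for illustration.

```python
import math

n_samples = 60000    # size of the MNIST training set
batch_size = 32      # keras default when fit is called without batch_size
epochs = 10

steps_per_epoch = math.ceil(n_samples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)   # 1875 18750
```

Each step corresponds to one weight update, so the ten epochs above amount to 18750 gradient descent updates.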

Testing the model

We can now use the trained model to predict the digit in an image with the model.predict function. It returns an array of 10 numbers, which represent the model's confidence that the image shows each of the digits \(0,\ldots,9\). The index of the largest value is therefore the predicted digit.

[15]:
i=32
plt.imshow(x_test[i,:].reshape(28,28), cmap='gray')
print("prediction: ",np.argmax(model.predict(x_test[i,:].reshape(1,784))))
prediction:  3
[Figure: the selected test image, x_test[32], shown in grayscale]
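
The same argmax idea extends from a single image to a whole batch, which also gives a simple way to estimate accuracy. The sketch below uses a small hand-written array of confidences in place of real model output. In the notebook, probs would come from model.predict(x_test) and y_true from the one-hot encoded y_test; the values here are hypothetical.

```python
import numpy as np

# Hypothetical confidences for three images (each row sums to 1); not real model output.
probs = np.array([
    [0.01, 0.02, 0.05, 0.80, 0.01, 0.05, 0.01, 0.02, 0.02, 0.01],
    [0.02, 0.03, 0.05, 0.05, 0.05, 0.05, 0.05, 0.60, 0.05, 0.05],
    [0.05, 0.45, 0.05, 0.05, 0.05, 0.05, 0.05, 0.15, 0.05, 0.05],
])
# One-hot labels for the true digits 3, 7 and 7.
y_true = np.eye(10)[[3, 7, 7]]

predicted = probs.argmax(axis=1)     # index of the largest confidence per row
true = y_true.argmax(axis=1)         # undo the one-hot encoding
accuracy = np.mean(predicted == true)
print(predicted, true, accuracy)
```

Here the third image is misclassified as a 1, so two of the three predictions are correct.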