The effect of two hyperparameters on the learning performance of Convolutional Neural Networks
Saari, Mikko Lauri Henrik
Kahden hyperparametrin vaikutus konvoluutioneuroverkkojen oppimiskykyyn
The topic of this work was to study the effect of two hyperparameters on the learning performance of Convolutional Neural Networks (CNNs) in the CIFAR-10 image recognition problem. The first hyperparameter chosen for study was the depth of the CNN, which is known to affect the amount of feature extraction. The second was the use of a regularization technique called Dataset Augmentation (DA), which artificially increases the amount of available training data so that more features can be extracted and learned from the members of each class. The hypothesis was that increasing the depth and the amount of training data would improve learning accuracy, in particular the testing accuracy, which measures how well the model generalizes the learned features. The work was implemented with the high-level deep learning library Keras, whose source code is freely available on GitHub. A ready-made Python implementation of the required CNN was found and modified slightly by adding a few lines of code. The depth of the network was first increased without Dataset Augmentation, by adding convolutional layers one by one from four to eight. The same procedure was then repeated with augmentation enabled. The results agreed only in part with the hypothesis. Increasing the depth improved both training and testing accuracy when Dataset Augmentation was not used. When it was used, training accuracy still improved, but testing accuracy dropped, even below that achieved by the original network. The natural conclusion is that both increasing the depth and artificially increasing the amount of data affect overfitting; instead of applying both at the same time, only one of them should be used.
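The Dataset Augmentation idea described above can be illustrated with a minimal, framework-free sketch: each training image yields extra artificial copies (here, a horizontal mirror image), enlarging the training set without collecting new data. The function names and the toy image are illustrative assumptions, not taken from the thesis code, which used Keras's built-in augmentation facilities.

```python
def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left-to-right."""
    return [row[::-1] for row in image]

def augment_dataset(images):
    """Return the original images plus one flipped copy of each,
    doubling the amount of training data artificially."""
    augmented = []
    for img in images:
        augmented.append(img)
        augmented.append(horizontal_flip(img))
    return augmented

# A 2x3 toy "image": two rows of three pixel intensities.
toy = [[1, 2, 3],
       [4, 5, 6]]

data = augment_dataset([toy])
# The augmented set holds the original image and its mirror image.
```

In practice (e.g. with Keras), augmentation also includes random shifts, rotations, and zooms applied on the fly during training, but the principle is the same: the network sees more varied examples of each class than the raw dataset contains.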