Acoustic Event Classification Using Deep Neural Networks
Audio information retrieval has been a popular research subject over the last decades, and acoustic event classification, as a subfield of this area, accounts for a considerable share of that research. In this thesis, acoustic event classification using deep neural networks is investigated. Neural networks (NNs) have been used in several pattern recognition tasks, covering both function approximation and classification. Owing to their stacked, layer-wise structure, they have been shown to model highly nonlinear relations between the inputs and outputs of a system with high performance. Even though several works imply an advantage of deeper networks over shallow ones in terms of recognition performance, advances in training deep architectures have been made only recently. These methods outperform conventional methods such as hidden Markov models (HMMs) and Gaussian mixture models (GMMs) in terms of acoustic event classification performance. In this thesis, the effects of several NN classifier parameters on classification accuracy are examined, such as the number of hidden layers, the number of units in the hidden layers, the batch size, and the learning rate. The effects of implementation parameters, such as the types of features, the number of adjacent frames, and the number of most energetic frames, are also investigated. A classification accuracy of 61.1% has been achieved with certain parameter values. In the case of deep belief networks (DBNs), applying greedy, layer-wise, unsupervised training before standard supervised training, in order to initialize the network weights in a better way, provided a 2-4% improvement in classification performance. An NN with randomly initialized weights before supervised training was shown to be considerably powerful in acoustic event classification tasks compared to conventional methods. DBNs provided even better classification accuracies, which justifies their significant potential for further research on the topic.
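Two of the implementation parameters named above, the number of adjacent frames and the number of most energetic frames, can be illustrated with a minimal sketch. The abstract does not specify how these steps are implemented, so everything here is an assumption for illustration: the function names `stack_adjacent_frames` and `most_energetic_frames` are hypothetical, edge frames are padded by repetition, and the per-frame energy values are taken as given rather than computed by any particular formula. Plain Python lists are used to keep the example dependency-free.

```python
from typing import List

Frame = List[float]


def stack_adjacent_frames(frames: List[Frame], context: int) -> List[Frame]:
    """Concatenate each frame with `context` neighbours on each side.

    Assumption: frames at the edges are padded by repeating the first
    or last frame, so every output vector has the same length.
    """
    n = len(frames)
    stacked: List[Frame] = []
    for t in range(n):
        window: Frame = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp index to [0, n - 1]
            window.extend(frames[idx])
        stacked.append(window)
    return stacked


def most_energetic_frames(
    frames: List[Frame], energies: List[float], k: int
) -> List[Frame]:
    """Keep the k frames with the highest energy, in their original order.

    Assumption: `energies[i]` is a precomputed energy value for `frames[i]`.
    """
    ranked = sorted(range(len(frames)), key=lambda i: energies[i], reverse=True)
    keep = sorted(ranked[:k])  # restore temporal order
    return [frames[i] for i in keep]


if __name__ == "__main__":
    feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(stack_adjacent_frames(feats, context=1))
    print(most_energetic_frames(feats, energies=[0.1, 0.9, 0.5], k=2))
```

With `context=1`, each two-dimensional feature vector becomes a six-dimensional input vector, which is one plausible way a network could see temporal context around each frame.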