N-Dimensional Polynomial Neural Networks and their Applications
Metadata
Author
Ben Abdallah, Habib
Date
2022-04-06
Citation
Ben Abdallah, Habib. N-Dimensional Polynomial Neural Networks and their Applications; A thesis submitted to the Faculty of Graduate Studies of The University of Winnipeg in partial fulfillment of the requirements of the degree of Master of Science, Department of Applied Computer Science, University of Winnipeg. Winnipeg, Manitoba, Canada: University of Winnipeg, 2022. DOI: 10.36939/ir.202204211510.
Abstract
In addition to being extremely non-linear, modern machine learning problems require millions
if not billions of parameters to solve or at least to get a good approximation of the solution,
and neural networks are known to assimilate that complexity by deepening and widening their
topology in order to increase the level of non-linearity needed for a better approximation. However,
compact topologies are always preferred to deeper ones as they offer the advantage of using
fewer computational units and fewer parameters. This compactness comes at the price of reduced
non-linearity and thus, of limited solution search space. This thesis proposes the N-Dimensional
Polynomial Neural Network (NDPNN) model that uses automatic polynomial kernel estimation
for N-Dimensional Convolutional Neural Networks (NDCNNs) and introduces a high degree of
non-linearity from the first layer, which can compensate for the need for deep and/or wide topologies.
We first theoretically formalized the 1DPNN model, which can process 1-dimensional
signals, and we demonstrated that its inherent non-linearity enables it to yield better results
with less computational and spatial complexity than a regular 1DCNN on various classification
and regression problems related to audio signals, even though it introduces more computational
and spatial complexity on a neuronal level. The experiments were conducted on three publicly
available datasets and demonstrate that the proposed 1DPNN model can extract more relevant
information from the data than a 1DCNN in less time and with less memory. We subsequently
extended the theoretical foundation of the 1DPNN to the NDPNN, which can process 2D signals such
as images and 3D signals such as videos. We also derived a general polynomial degree
reduction formula that we used to develop a heuristic algorithm, which enables the degree
reduction of any pre-trained NDPNN. This algorithm compresses an NDPNN without altering
its performance, thus making the model faster and lighter. Following that, we used 2DPNNs
and 3DPNNs to tackle the problem of plant species recognition on a publicly available plant
species recognition dataset composed of 40,000 images of different sizes covering 8 plant
species. As a result, we created a novel method, called Variably Overlapping Time-Coherent
Sliding Window (VOTCSW), that transforms a dataset composed of images with variable size
to a 3D representation with fixed size that is suitable for convolutional neural networks, and
we demonstrated that this representation is more informative than resizing the images of the
dataset to a given size. We theoretically formalized the use cases of the method as well as its
inherent properties and proved that it has an oversampling and a regularization effect on the
data. By combining the VOTCSW method with 3DPNNs, we were able to create a model that
achieved a state-of-the-art accuracy of 99.9% on the considered dataset, surpassing well-known
architectures such as ResNet and Inception. Furthermore, we established that the currently
available plant species dataset could not be used for machine learning in its present form, due
to a substantial class imbalance between the training set and the test set. Hence, we created
a specific preprocessing and a model development framework that enabled us to improve the
accuracy from 49.23% to 99.9%. The contributions of this thesis are the creation of a novel
generic model called NDPNN that can extract more information from data than an NDCNN with
less computational and spatial complexity, the evaluation of the performance of NDPNNs on
audio signals, images and videos, the creation of a general direct polynomial reduction formula,
the design of a heuristic algorithm for NDPNN compression that generates faster and lighter
models, the formalization of an image transformation method that circumvents image resizing
without altering fine-grained information, and the production of a state-of-the-art 3DPNN for
plant species recognition.
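
As an illustration of the polynomial kernel idea summarized above, the following is a minimal NumPy sketch of a single 1D polynomial neuron. It assumes the neuron sums, over degrees d = 1..D, the sliding-window correlation of the element-wise d-th power of the input with a degree-specific kernel. This formulation, the function name `poly_conv1d`, and its parameters are illustrative assumptions rather than the exact definition given in the thesis; activation functions and training are omitted.

```python
import numpy as np


def poly_conv1d(x, weights, bias=0.0):
    """Hypothetical 1D polynomial neuron (illustrative assumption only).

    x       : 1D signal of length n
    weights : array of shape (degree, k), one kernel of size k per degree
    bias    : scalar added to every output sample
    returns : 1D array of length n - k + 1
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    degree, k = weights.shape

    # Start from the bias, then accumulate one term per polynomial degree.
    out = np.full(x.size - k + 1, bias, dtype=float)
    for d in range(1, degree + 1):
        # 'valid' correlation slides the kernel over x**d without flipping it,
        # which matches the usual convolutional-layer convention.
        out += np.correlate(x ** d, weights[d - 1], mode="valid")
    return out


# Degree-2 example: the first kernel picks x[i], the second picks x[i+1]**2.
signal = [1, 2, 3, 4]
kernels = [[1, 0],   # degree-1 kernel
           [0, 1]]   # degree-2 kernel
print(poly_conv1d(signal, kernels))  # values 5, 11, 19
```

With a single degree-1 kernel, this sketch reduces to an ordinary convolutional neuron, so the extra kernels for higher degrees are what add non-linearity (and parameters) at the level of a single neuron.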