Learning Algorithm for LesserDNN, a DNN with Quantized Weights
Abstract
This paper presents LesserDNN, a model that uses the set of floating-point values {-1.0, -0.5, -0.25, -0.125, -0.0625, 0.0625, 0.125, 0.25, 0.5, 1.0} as quantized weights, together with a new learning algorithm for the proposed model.
In previous studies on deep neural networks (DNNs) with quantized weights, quantization was applied only at the inference stage because the networks were trained with gradient descent.
Gradient descent requires differentiable weights, so quantized weights cannot be used directly during training.
To address this issue, we devised an algorithm based on simulated annealing.
Since simulated annealing imposes no differentiability requirement, LesserDNN can use quantized weights during training, and with quantized weights and this simulated annealing-based algorithm, the learning process becomes a combinatorial search over the weight set. The proposed algorithm was applied to train networks on the MNIST handwritten digit dataset; the models trained with the simulated annealing-based algorithm and quantized weights reached the same level of accuracy as gradient descent-based comparison methods. Additional tests on the CIFAR-10 dataset also yielded good results, further demonstrating the algorithm.
Thus, because backpropagation is not required, LesserDNN has a simple design and a small implementation footprint while still achieving high accuracy.
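For illustration, the following minimal Python sketch defines the quantized weight set stated in the abstract (signed powers of two from 2^-4 up to 2^0) and a nearest-value projection onto it. The helper name and the rounding rule are our own illustration; the paper selects weights from the set directly during training rather than rounding trained floats.

```python
import numpy as np

# The quantized weight set used by LesserDNN (from the abstract):
# signed powers of two from 2^-4 up to 2^0.
WEIGHT_SET = np.array([-1.0, -0.5, -0.25, -0.125, -0.0625,
                       0.0625, 0.125, 0.25, 0.5, 1.0])

def project_to_weight_set(w):
    """Map each real-valued weight to the nearest value in WEIGHT_SET.
    (Illustrative helper, not the paper's training procedure.)"""
    w = np.asarray(w, dtype=np.float64)
    idx = np.abs(w[..., None] - WEIGHT_SET).argmin(axis=-1)
    return WEIGHT_SET[idx]

print(project_to_weight_set([0.3, -0.07, 0.9]))  # -> [ 0.25 -0.0625  1.0 ]
```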
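The abstract does not give implementation details of the learning algorithm, but the core idea, treating training as a combinatorial search with simulated annealing over the discrete weight set, can be sketched as below. All names, the loss-function interface, the single-weight neighborhood move, and the geometric cooling schedule are our assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
WEIGHT_SET = np.array([-1.0, -0.5, -0.25, -0.125, -0.0625,
                       0.0625, 0.125, 0.25, 0.5, 1.0])

def anneal(loss_fn, shape, steps=10_000, t0=1.0, alpha=0.999):
    """Minimal simulated annealing over weights drawn from WEIGHT_SET.

    loss_fn maps a weight array to a scalar loss (e.g., training error).
    Hyperparameters here are illustrative assumptions.
    """
    # Start from a random assignment of quantized weights.
    w = rng.choice(WEIGHT_SET, size=shape)
    loss = loss_fn(w)
    t = t0
    for _ in range(steps):
        # Propose a neighbor: re-draw one randomly chosen weight from the set.
        cand = w.copy()
        flat = cand.reshape(-1)
        flat[rng.integers(flat.size)] = rng.choice(WEIGHT_SET)
        cand_loss = loss_fn(cand)
        # Metropolis acceptance: always accept improvements; accept
        # worse moves with probability exp(-delta / t).
        delta = cand_loss - loss
        if delta <= 0 or rng.random() < np.exp(-delta / t):
            w, loss = cand, cand_loss
        t *= alpha  # geometric cooling
    return w, loss

# Toy usage: choose four quantized weights whose sum approaches 0.5.
w, final_loss = anneal(lambda w: (w.sum() - 0.5) ** 2, shape=(4,))
```

Because the search only ever assigns values from WEIGHT_SET, no gradient (and hence no backpropagation) is needed, which matches the abstract's claim of a simple, small implementation.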
DOI: https://doi.org/10.31449/inf.v49i1.7145
