Integrated Speaker and Speech Recognition for Wheel Chair Movement Using Artificial Intelligence
Abstract
A speech signal results from constrictions of the vocal tract, and different constrictions produce different sounds. A speech signal therefore carries two kinds of information: the speaker's identity and the meaning of what is said. For applications such as a voice-operated wheelchair, both the speaker and the speech must be recognized before the wheelchair moves. Automation of wheelchairs is a present-day requirement, as the number of people with disabilities such as spinal injuries, amputations, and hand impairments is increasing, and these users need assistance to move their wheelchairs. A voice-operated wheelchair is one solution. The aim of this study is to use a speaker- and speech-dependent system to control the wheelchair and to minimize the risk of accidents. We propose a system in which both the speaker (patient) and the speech (commands) are recognized from acoustic features, namely Mel Frequency Cepstral Coefficients (MFCC). The features are optimized with the Artificial Bee Colony (ABC) algorithm to achieve good accuracy, with an artificial intelligence technique serving as the classifier. We tested the system on a standard dataset (TIDIGITS) and on our own dataset. The proposed work is further validated by generating control signals to actuate the wheelchair in a real-time scenario.
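The abstract outlines a pipeline of MFCC feature extraction, ABC-based feature optimization, and a neural classifier that maps commands to wheelchair control signals. The following is a minimal sketch of that pipeline, assuming Python with librosa and scikit-learn (the paper does not name its tools); the command labels, dataset layout, and function names are hypothetical, and the ABC optimization step is omitted for brevity.

```python
# Minimal sketch of the recognition pipeline described in the abstract:
# MFCC feature extraction followed by a neural-network classifier for
# spoken wheelchair commands. Libraries, labels, and dataset layout are
# assumptions; the paper's ABC feature optimization is not shown here.

import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

# Hypothetical command vocabulary for wheelchair movement.
COMMANDS = ["forward", "backward", "left", "right", "stop"]

def extract_mfcc(wav_path: str, n_mfcc: int = 13) -> np.ndarray:
    """Return a fixed-length feature vector: the per-coefficient mean and
    standard deviation of the MFCCs over the whole utterance."""
    signal, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train(labelled_wavs: list[tuple[str, str]]) -> MLPClassifier:
    """Fit a small neural network on (path, command) pairs recorded by
    the enrolled speaker, making the system speaker-dependent."""
    X = np.array([extract_mfcc(path) for path, _ in labelled_wavs])
    y = np.array([label for _, label in labelled_wavs])
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    clf.fit(X, y)
    return clf

def recognise(clf: MLPClassifier, wav_path: str) -> str:
    """Map an utterance to a command label, which would then be
    translated into a motor control signal (e.g. 'forward')."""
    return clf.predict(extract_mfcc(wav_path).reshape(1, -1))[0]
```

In the paper's full system, the MFCC feature set would additionally be pruned or weighted by the ABC algorithm before classification, and a rejection threshold on utterances from non-enrolled speakers would provide the speaker-recognition safeguard described above.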
Full Text: PDF
DOI: https://doi.org/10.31449/inf.v42i4.2003
This work is licensed under a Creative Commons Attribution 3.0 License.