Impact of Gaussian Noise for Optimized Support Vector Machine Algorithm Applied to Medicare Payment on Raspberry Pi

Shrirang Ambaji Kulkarni, Varadraj Gurpur, Christian King, Andriy Koval

Abstract


A relatively large dataset coupled with an accurate but computationally slow machine learning algorithm poses a considerable challenge for the Internet of Things (IoT). Deep Learning Neural Networks (DLANNs), meanwhile, are known for good accuracy but are computationally intensive by nature. Based on this argument, the purpose of this article is to apply a pipelined Support Vector Machine (SVM) learning algorithm for benchmarking public health data on an IoT device. The SVM is a very well-performing machine learning algorithm, but it is constrained by long training times, and its performance is susceptible to noise. The software pipelined architecture was applied to the SVM to minimize its computational time on a resource-constrained device such as the Raspberry Pi. The model was tested on a Medicare dataset with added Gaussian noise to assess the impact of noise. The classification results for Total Medicare Standardized Payment Amount indicated that the proposed pipelined SVM model outperformed the DLANN model by 79.74% in terms of computational time. The SVM also achieved a better area under the curve (AUC) than the other models, outscoring Logistic Regression by 7.2% and the DLANN model by 22.65%.
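The noise experiment described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic stand-ins for classifier scores, the helper names `add_gaussian_noise` and `auc` are hypothetical, and the noise level `sigma=3.0` is an arbitrary assumption. It only shows the general idea of perturbing values with zero-mean Gaussian noise and comparing the AUC before and after.

```python
import random


def add_gaussian_noise(values, sigma, rng):
    """Return a copy of values with zero-mean Gaussian noise (std sigma) added."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Counts the fraction of (positive, negative) pairs in which the positive
    example has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic stand-in data (an assumption, not the Medicare dataset):
# negatives are centred at 0, positives at 2, both with unit variance.
rng = random.Random(0)
labels = [0] * 200 + [1] * 200
scores = ([rng.gauss(0.0, 1.0) for _ in range(200)]
          + [rng.gauss(2.0, 1.0) for _ in range(200)])

clean_auc = auc(scores, labels)
noisy_auc = auc(add_gaussian_noise(scores, sigma=3.0, rng=rng), labels)

print(f"AUC without noise: {clean_auc:.3f}")
print(f"AUC with Gaussian noise: {noisy_auc:.3f}")
```

In a real study the scores would come from the trained model's decision function on noisy input features, rather than from noise added directly to the scores as done here for brevity.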







DOI: https://doi.org/10.31449/inf.v45i4.3747

This work is licensed under a Creative Commons Attribution 3.0 License.