Intelligent analysis and processing technology of big data based on clustering algorithm

Zheng Zheng, Fukai Cao, Song Gao, Amit Sharma


This paper studies intelligent big data analysis and processing based on clustering algorithms. It proposes an attribute category clustering method built on hierarchical clustering: attribute categories with similar fault type distributions are merged, which reduces the data dimension, and the merged attributes are then binarized. To address the large number of missing values in continuous data, a data completion method based on the attribute distribution function is adopted. Then, from the perspective of selecting and estimating project unit prices in construction enterprises, the paper reviews and summarizes the data mining process suited to the characteristics of project cost data and puts forward a clustering-based method for analyzing and processing such data. Finally, the processed data sets undergo bottom-up hierarchical clustering analysis to obtain the analysis results. The experimental results show that the proposed preprocessing method based on attribute clustering effectively merges attributes, reduces the dimension after binary transformation, and reduces the amount of data while preserving the information it carries.
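The attribute-merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the L1 distance between fault type distributions, the stopping threshold, and the names `merge_categories` and `binarize` are all assumptions made for illustration. The sketch merges, bottom-up, the attribute categories whose fault type distributions are closest, then encodes each category as a binary vector over the merged groups.

```python
from collections import Counter
from itertools import combinations


def fault_distribution(records, group, fault_types):
    """Relative frequency of each fault type among records whose category is in `group`."""
    counts = Counter(fault for cat, fault in records if cat in group)
    total = sum(counts.values())
    return [counts[f] / total for f in fault_types]


def l1_distance(p, q):
    """Assumed similarity measure: L1 distance between two distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))


def merge_categories(records, threshold):
    """Bottom-up merge of categories with similar fault type distributions.

    `records` is a list of (category, fault_type) pairs. Merging stops once
    the closest pair of groups is farther apart than `threshold`
    (a hypothetical stopping rule).
    """
    fault_types = sorted({fault for _, fault in records})
    groups = [frozenset([c]) for c in sorted({cat for cat, _ in records})]

    def gap(pair):
        a, b = pair
        return l1_distance(
            fault_distribution(records, a, fault_types),
            fault_distribution(records, b, fault_types),
        )

    while len(groups) > 1:
        a, b = min(combinations(groups, 2), key=gap)
        if gap((a, b)) > threshold:
            break
        # Replace the two closest groups with their union.
        groups = [g for g in groups if g not in (a, b)] + [a | b]
    return groups


def binarize(category, groups):
    """One binary indicator per merged group instead of one per raw category."""
    return [1 if category in g else 0 for g in groups]


if __name__ == "__main__":
    # Hypothetical toy data: two pump categories fail in the same way.
    records = [
        ("pump_A", "leak"), ("pump_A", "leak"), ("pump_A", "wear"),
        ("pump_B", "leak"), ("pump_B", "leak"), ("pump_B", "wear"),
        ("valve", "jam"), ("valve", "jam"),
    ]
    groups = merge_categories(records, threshold=0.5)
    print(groups)
    print(binarize("pump_B", groups))
```

In this toy data, three raw categories collapse into two groups, so the binary encoding needs two columns instead of three. On real data the choice of distance and threshold would determine how aggressively attributes are merged.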


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.