Federated Learning with Privacy Preservation in Large-Scale Distributed Systems Using Differential Privacy and Homomorphic Encryption
Abstract
This study proposes a privacy-preserving machine learning algorithm for large-scale distributed systems, based on federated learning. The algorithm lets participants jointly train high-quality models without sharing their raw data, addressing the challenges posed by increasingly stringent data privacy and security regulations. To protect individual users, we apply differential privacy with a privacy budget of ε = 1.0 and add Gaussian noise to each model update, so that even a malicious party who obtains a model update is limited in what it can infer about any individual user. The communication protocol additionally uses homomorphic encryption. To evaluate the system in a real-world environment, we built a distributed experimental platform of multiple physical servers and tested it on several publicly available datasets, including MNIST, Federated EMNIST, and Federated CIFAR-10/100. The federated learning system reaches an accuracy of 97.3%, slightly below the 98.2% of the centralized learning method, a trade-off we consider acceptable given the privacy benefits of federated learning. After malicious clients are introduced, accuracy drops only slightly, to about 96.8%, which indicates that the system is robust. Communication averages 150 MB per iteration and 30 GB in total; average client CPU utilization is about 70%, and GPU utilization is about 80%. These figures reflect efficient use of computing resources and a balance between privacy protection and model performance.
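The differential privacy step described in the abstract (Gaussian noise added to model updates under a budget of ε = 1.0) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the classical Gaussian mechanism, in which each update is first clipped to an L2 norm bound and then perturbed with noise of scale σ = C·sqrt(2·ln(1.25/δ))/ε. The clipping bound `clip_norm`, the failure probability `delta`, the plain averaging step, and all function names are illustrative assumptions that do not appear in the abstract. The classical calibration is proven for ε strictly below 1, so ε = 1.0 sits at its boundary, and a real system would likely rely on a tighter privacy accountant.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale the update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def gaussian_sigma(epsilon, delta, clip_norm):
    """Noise scale of the classical Gaussian mechanism.

    clip_norm plays the role of the L2 sensitivity of one client's update.
    delta is an assumed parameter; the abstract only states epsilon.
    """
    return clip_norm * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize_update(update, epsilon=1.0, delta=1e-5, clip_norm=1.0, rng=None):
    """Clip a client update, then add independent Gaussian noise per coordinate."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(epsilon, delta, clip_norm)
    clipped = clip_update(update, clip_norm)
    return [x + rng.gauss(0.0, sigma) for x in clipped]


def federated_average(updates):
    """Unweighted coordinate-wise mean of client updates (assumed aggregation rule)."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical clients, each holding a 4-dimensional model update.
    client_updates = [[0.2, -0.1, 0.4, 0.0],
                      [0.1, 0.3, -0.2, 0.5],
                      [-0.3, 0.2, 0.1, 0.1]]
    noisy = [privatize_update(u, epsilon=1.0, rng=rng) for u in client_updates]
    print(federated_average(noisy))
```

With only three clients the averaged update is dominated by noise. Averaging reduces the noise standard deviation roughly in proportion to the square root of the number of clients, which is one reason this style of protection is better suited to large-scale deployments.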
DOI: https://doi.org/10.31449/inf.v49i13.7358

This work is licensed under a Creative Commons Attribution 3.0 License.