Data Mining Technology for Privacy Protection in Distributed Scenarios

Lanqin Wang; Yanmei Lv

doi:10.31449/inf.v48i23.6918

Abstract

In the Internet era, data mining is an important means of attracting and retaining users. However, data is spread across different platforms that are incompatible with each other, and user privacy is easily leaked during mining. To address this issue, a distributed data mining method based on differential privacy is proposed. In this method, a central node aggregates frequent itemset data from the top m items reported by the branch nodes. A decision tree algorithm is used for data classification; it sets privacy budgets, optimizes count queries, and filters attributes by importance. The experimental results showed that, for data mining, the improved algorithm had an average increase of 0.1 in accuracy, an average increase of 0.115 in relative error, and an average decrease of 0.08% in privacy leakage probability. For data classification, accuracy increased by an average of 0.28 and privacy leakage probability decreased by an average of 0.06%. These results indicate that the improved algorithm can significantly improve the accuracy of data mining and classification, significantly reduce the privacy budget they require, lower the probability of privacy leakage, and greatly improve the security of user data.
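The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes Laplace noise on per-item counts, a publicly known item universe, and transactions truncated to at most max_len items so that the count sensitivity is bounded. All function and parameter names (laplace_noise, branch_top_m, central_aggregate, universe, max_len) are hypothetical.

```python
import random
from collections import Counter

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def branch_top_m(transactions, universe, m, epsilon, max_len, rng):
    """Branch node: return its m items with the highest noisy counts.

    Assumption: each transaction is truncated to at most `max_len`
    distinct items, so one transaction changes the count vector by at
    most `max_len` in L1 norm (the sensitivity used for the noise).
    """
    counts = Counter()
    for t in transactions:
        for item in sorted(set(t))[:max_len]:
            counts[item] += 1
    scale = max_len / epsilon
    # Noise is added to every item in the public universe, including
    # items with a true count of zero.
    noisy = {item: counts[item] + laplace_noise(scale, rng) for item in universe}
    top = sorted(noisy, key=noisy.get, reverse=True)[:m]
    return {item: noisy[item] for item in top}

def central_aggregate(branch_reports, m):
    """Central node: sum the noisy counts reported by the branches
    and return the m items with the largest totals."""
    totals = Counter()
    for report in branch_reports:
        for item, value in report.items():
            totals[item] += value
    return sorted(totals, key=totals.get, reverse=True)[:m]

if __name__ == "__main__":
    rng = random.Random(0)
    universe = ["a", "b", "c", "d"]
    branch1 = [["a", "b"], ["a", "b"], ["a", "c"]]
    branch2 = [["a"], ["b", "d"], ["a", "b"]]
    reports = [
        branch_top_m(b, universe, m=2, epsilon=1.0, max_len=3, rng=rng)
        for b in (branch1, branch2)
    ]
    print(central_aggregate(reports, m=2))
```

In this sketch the raw transactions never leave a branch node; only noisy counts for its top m items reach the central node. A smaller epsilon gives stronger privacy but noisier rankings, which is the accuracy and privacy trade-off the abstract refers to.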

DOI: https://doi.org/10.31449/inf.v48i23.6918

This work is licensed under a Creative Commons Attribution 3.0 License.

Informatica is financially supported by the Slovenian research agency from the Call for co-financing of scientific periodical publications.
