Journal of Agricultural Big Data ›› 2024, Vol. 6 ›› Issue (2): 170-184.doi: 10.19788/j.issn.2096-6369.200001

Previous Articles     Next Articles

Survey of Differential Privacy Algorithms and Applications for High- Dimensional Data Publishing

LONG Chun1,2,*(), QIN ZeXiu1,2, LI LiSha1,2, LI Jing1, YANG Fan1, WEI JinXia1,2, FU YuHao1   

  1. 1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received:2024-01-30 Accepted:2024-06-03 Online:2024-06-26 Published:2024-07-03
  • Contact: LONG Chun

Abstract:

With the further development of big data and machine learning technologies, handling high-dimensional data with complex structures, relationships, and rich semantic information containing dozens to hundreds of features has become a challenge. Safely utilizing such high-dimensional data, while ensuring the privacy of individuals, has become a significant topic today. Upon reviewing existing literature, we found numerous reviews on differential privacy technology itself, but few on the algorithms and applications of differential privacy specifically tailored for high-dimensional data. Therefore, this paper provides a review of the application of differential privacy in the field of high-dimensional data, aiming to delve into the strengths and weaknesses of different methods in protecting the privacy of high-dimensional data and to guide future research directions for differential privacy algorithms tailored for high-dimensional data publishing. Firstly, this paper introduces the principles and characteristics of differential privacy, summarizing the current research work on the technology itself. Then, it analyzes the application of differential privacy in high-dimensional data environments from the perspectives of data dimensionality reduction and data synthesis, discussing the challenges and issues faced by differential privacy and proposing preliminary solutions to better address the issues of privacy protection and data analysis in the current high-dimensional data landscape. Lastly, potential future research directions are proposed to facilitate technological exchange and further advancements in the application of differential privacy in high-dimensional data settings.

Key words: differential privacy, high-dimensional data, perturbation mechanism, privacy allocation