CAAI Transactions on Intelligence Technology (Sep 2022)

Medical data publishing based on average distribution and clustering

  • Tong Yi
  • Minyong Shi
  • Haibin Zhu

DOI
https://doi.org/10.1049/cit2.12094
Journal volume & issue
Vol. 7, no. 3
pp. 381 – 394

Abstract

Most data publishing methods do not consider sensitivity protection, so an adversary can breach privacy through a sensitivity attack. To address this problem, this paper presents a medical data publishing method based on sensitivity determination. To protect sensitivity, the sensitivity of disease information is determined from its semantics. To balance information utility against privacy, the new method focuses on protecting sensitive values with high sensitivity and assigns highly sensitive disease information to groups as evenly as possible. The experiments are conducted on two real-world datasets whose records include various patient attributes. To measure sensitivity protection, the authors define a metric that evaluates the degree of sensitivity disclosure. In addition, the additional information loss and discernibility metrics are used to measure the availability of the released tables. The experimental results indicate that the new method provides better privacy than the traditional one while information utility is guaranteed. Beyond value protection, the proposed method provides sensitivity protection and usable releases of medical data.
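
The idea of assigning highly sensitive disease information to groups as evenly as possible can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give its steps, and the sketch omits the clustering named in the title. The function names, the `disease` field, the numeric sensitivity scores, and the `threshold` parameter are all assumptions made for illustration; in the paper, sensitivity is determined by semantics rather than supplied by hand.

```python
def distribute_evenly(records, sensitivity, num_groups, threshold=0.5):
    """Round-robin sketch (hypothetical, not the paper's method).

    records     -- list of dicts, each with a "disease" key (assumed layout)
    sensitivity -- dict mapping disease name to a score in [0, 1] (assumed)
    threshold   -- scores at or above this count as highly sensitive (assumed)
    """
    groups = [[] for _ in range(num_groups)]
    high = [r for r in records if sensitivity[r["disease"]] >= threshold]
    low = [r for r in records if sensitivity[r["disease"]] < threshold]

    # Deal the most sensitive records first, one per group in turn, so that
    # no group holds more than one extra highly sensitive record.
    high.sort(key=lambda r: sensitivity[r["disease"]], reverse=True)
    for i, rec in enumerate(high):
        groups[i % num_groups].append(rec)

    # Place each remaining record into whichever group is currently smallest.
    for rec in low:
        min(groups, key=len).append(rec)
    return groups


def high_counts(groups, sensitivity, threshold=0.5):
    """Number of highly sensitive records in each group."""
    return [
        sum(1 for r in g if sensitivity[r["disease"]] >= threshold)
        for g in groups
    ]


# Hypothetical scores and records, for illustration only.
scores = {"HIV": 0.9, "cancer": 0.8, "flu": 0.1, "cold": 0.05}
patients = (
    [{"id": i, "disease": "HIV"} for i in range(3)]
    + [{"id": 10 + i, "disease": "cancer"} for i in range(2)]
    + [{"id": 20 + i, "disease": "flu"} for i in range(2)]
    + [{"id": 30 + i, "disease": "cold"} for i in range(2)]
)
groups = distribute_evenly(patients, scores, num_groups=3)
counts = high_counts(groups, scores)
```

With five highly sensitive records and three groups, the per-group counts differ by at most one, so an adversary who learns a target's group gains little about whether the target has a highly sensitive disease. A real publishing method would also need to generalize quasi-identifiers within each group, which this sketch does not attempt.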

Keywords