CAAI Transactions on Intelligence Technology (Sep 2022)
Medical data publishing based on average distribution and clustering
Abstract
Most data publishing methods do not consider sensitivity protection, so an adversary can breach privacy through a sensitivity attack. To address this problem, this paper presents a medical data publishing method based on sensitivity determination. The sensitivity of disease information is determined from its semantics. To balance information utility against privacy, the new method focuses on protecting the sensitive values with high sensitivity and assigns highly sensitive disease information to groups as evenly as possible. Experiments are conducted on two real-world datasets whose records include various patient attributes. To measure sensitivity protection, the authors define a metric that evaluates the degree of sensitivity disclosure. In addition, the additional information loss and discernibility metrics are used to measure the utility of the released tables. The experimental results indicate that the new method provides better privacy than the traditional one while preserving information utility. Beyond value protection, the proposed method provides sensitivity protection and a usable release of medical data.
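The idea of assigning highly sensitive disease information to groups as evenly as possible can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `distribute_evenly`, the `is_highly_sensitive` predicate, and the hand-picked set of highly sensitive diseases are all hypothetical, and the paper determines sensitivity from semantics rather than from a fixed list. The sketch only shows a simple round-robin assignment that keeps the number of highly sensitive records per group within one of each other.

```python
def distribute_evenly(records, is_highly_sensitive, num_groups):
    """Split records into num_groups groups (illustrative sketch only).

    Highly sensitive records are dealt out round-robin, so the count of
    such records in any two groups differs by at most one. The remaining
    records are then added to whichever group is currently smallest.
    """
    if num_groups <= 0:
        raise ValueError("num_groups must be positive")

    groups = [[] for _ in range(num_groups)]
    high = [r for r in records if is_highly_sensitive(r)]
    low = [r for r in records if not is_highly_sensitive(r)]

    # Round-robin assignment spreads highly sensitive records evenly.
    for i, record in enumerate(high):
        groups[i % num_groups].append(record)

    # Fill with the remaining records, keeping group sizes balanced.
    for record in low:
        min(groups, key=len).append(record)

    return groups


# Hypothetical sensitivity labels, chosen by hand for illustration.
HIGH_SENSITIVITY = {"HIV", "cancer"}

patients = [
    {"age": 34, "disease": "HIV"},
    {"age": 51, "disease": "flu"},
    {"age": 47, "disease": "cancer"},
    {"age": 29, "disease": "cold"},
    {"age": 62, "disease": "HIV"},
    {"age": 38, "disease": "flu"},
]

groups = distribute_evenly(
    patients, lambda r: r["disease"] in HIGH_SENSITIVITY, num_groups=3
)
```

With this assignment, no single group concentrates the highly sensitive values, so learning which group a patient belongs to reveals less about whether that patient has a highly sensitive disease. A real method would also need to account for quasi-identifier similarity within groups to limit information loss, which this sketch ignores.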
Keywords