BMC Medical Informatics and Decision Making (Jul 2017)
Security controls in an integrated Biobank to protect privacy in data sharing: rationale and study design
Abstract
Abstract Background With the goal of realizing genome-based personalized healthcare, we have developed a biobank that integrates personal health, genome, and omics data along with biospecimens donated by volunteers of 150,000. Such a large-scale of data integration involves obvious risks of privacy violation. The research use of personal genome and health information is a topic of global discussion with regard to the protection of privacy while promoting scientific advancement. The present paper reports on our plans, current attempts, and accomplishments in addressing security problems involved in data sharing to ensure donor privacy while promoting scientific advancement. Methods Biospecimens and data have been collected in prospective cohort studies with the comprehensive agreement. The sample size of 150,000 participants was required for multiple researches including genome-wide screening of gene by environment interactions, haplotype phasing, and parametric linkage analysis. Results We established the T ohoku M edical M egabank (TMM) data sharing policy: a privacy protection rule that requires physical, personnel, and technological safeguards against privacy violation regarding the use and sharing of data. The proposed policy refers to that of NCBI and that of the Sanger Institute. The proposed policy classifies shared data according to the strength of re-identification risks. Local committees organized by TMM evaluate re-identification risk and assign a security category to a dataset. Every dataset is stored in an assigned segment of a supercomputer in accordance with its security category. A security manager should be designated to handle all security problems at individual data use locations. The proposed policy requires closed networks and IP-VPN remote connections. Conclusion The mission of the biobank is to distribute biological resources most productively. This mission motivated us to collect biospecimens and health data and simultaneously analyze genome/omics data in-house. The biobank also has the mission of improving the quality and quantity of the contents of the biobank. This motivated us to request users to share the results of their research as feedback to the biobank. The TMM data sharing policy has tackled every security problem originating with the missions. We believe our current implementation to be the best way to protect privacy in data sharing.
Keywords