BMC Medical Genomics (Oct 2018)

Secure top most significant genome variants search: iDASH 2017 competition

  • Sergiu Carpov,
  • Thibaud Tortech

DOI
https://doi.org/10.1186/s12920-018-0399-x
Journal volume & issue
Vol. 11, no. S4
pp. 47 – 55

Abstract

Read online

Abstract Background One of the 3 tracks of iDASH Privacy & Security Workshop 2017 competition was to execute a whole genome variants search on private genomic data. Particularly, the search application was to find the top most significant SNPs (Single-Nucleotide Polymorphisms) in a database of genome records labeled with control or case. In this paper we discuss the solution submitted by our team to this competition. Methods Privacy and confidentiality of genome data had to be ensured using Intel SGX enclaves. The typical use-case of this application is the multi-party computation (each party possessing one or several genome records) of the SNPs which statistically differentiate control and case genome datasets. Results Our solution consists of two applications: (i) compress and encrypt genome files and (ii) perform genome processing (top most important SNPs search). We have opted for a horizontal treatment of genome records and heavily used parallel processing. Rust programming language was employed to develop both applications. Conclusions Execution performance of the processing applications scales well and very good performance metrics are obtained. Contest organizers selected it as the best submission amongst other received competition entries and our team was awarded the first prize on this track.

Keywords