Secure k-ish Nearest Neighbors Classifier
Author(s)
Shaul, Hayim; Feldman, Dan; Rus, Daniela
Publisher with Creative Commons License
Creative Commons Attribution
Abstract
The k-nearest neighbors (kNN) classifier predicts the class of a query, q, by taking the majority class of its k nearest neighbors in an existing (already classified) database, S. In secure kNN, q and S are owned by two different parties, and q is classified without sharing data. In this work we present a classifier based on kNN that is more efficient to implement with homomorphic encryption (HE). The efficiency of our classifier comes from a relaxation we make: we consider the κ nearest neighbors for κ ≈ k, with a probability that increases as the statistical distance between a Gaussian and the distribution of the distances from q to S decreases. We call our classifier k-ish Nearest Neighbors (k-ish NN). For the implementation we introduce a double-blinded coin-toss, in which both the bias and the output of the toss are encrypted. We use it to approximate the average and variance of the distances from q to S in a scalable circuit whose depth is independent of |S|. We believe these to be of independent interest. We implemented our classifier in an open source library based on HElib and tested it on a breast tumor database. Our classifier has accuracy and running time comparable to the current state-of-the-art (non-HE) MPC solutions, which have better running time but worse communication complexity. It also has communication complexity similar to that of naive HE implementations, which have worse running time.
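To make the relaxation concrete, the following is a minimal plaintext sketch and not the authors' HE implementation. It assumes one plausible reading of the abstract: if the distances from q to S are roughly Gaussian with mean mu and standard deviation sigma, then about a k/n fraction of them falls below a threshold T = mu + sigma * Phi^{-1}(k/n), so a vote over all points closer than T uses κ ≈ k neighbors. The function names, the Euclidean metric, and the threshold formula are illustrative assumptions. Nothing here is encrypted, and the mean and variance are computed exactly rather than approximated with the double-blinded coin-toss.

```python
import math
from collections import Counter
from statistics import NormalDist, mean, pstdev


def knn_classify(q, S, labels, k):
    """Exact kNN: majority label among the k points of S closest to q."""
    order = sorted(range(len(S)), key=lambda i: math.dist(q, S[i]))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]


def kish_nn_classify(q, S, labels, k):
    """Plaintext k-ish NN sketch (assumed threshold rule, see lead-in).

    Returns (predicted_label, kappa), where kappa is the number of
    neighbors that actually voted. Requires 0 < k < len(S).
    """
    dists = [math.dist(q, s) for s in S]
    mu, sigma = mean(dists), pstdev(dists)
    # Under a Gaussian model, roughly k of the n distances lie below T.
    T = mu + sigma * NormalDist().inv_cdf(k / len(S))
    votes = [labels[i] for i, d in enumerate(dists) if d < T]
    if not votes:
        # Degenerate case: no point is below T, so fall back to the nearest one.
        votes = [labels[min(range(len(S)), key=dists.__getitem__)]]
    return Counter(votes).most_common(1)[0][0], len(votes)


# Toy 1-D database: class 0 clusters near 0, class 1 clusters near 10.
S = [(0.0,), (1.0,), (2.0,), (8.0,), (9.0,), (10.0,)]
labels = [0, 0, 0, 1, 1, 1]
q = (0.5,)

exact_label = knn_classify(q, S, labels, k=2)
kish_label, kappa = kish_nn_classify(q, S, labels, k=2)
```

On this toy data both classifiers agree on the label, but the k-ish variant votes with κ = 3 neighbors rather than exactly k = 2, because the six distances are far from Gaussian. The sketch needs only comparisons against a single threshold rather than a sort of all distances, which is the kind of saving the abstract attributes to the relaxation.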
Date issued
2020
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Proceedings on Privacy Enhancing Technologies
Publisher
Walter de Gruyter GmbH