A. Contribution

  1. Problem addressed by the paper

The paper seeks a generic solution for secure supervised machine learning classification over sensitive data that remain encrypted.

  2. Solution proposed in the paper. Why is it better than previous work?

The proposed solution builds classifiers from generic, reusable building blocks. Previous works mostly cover only the training phase, assume a weaker security model, or cover only a specific classifier. This paper focuses on the testing phase (classification), assumes a stronger security model, and covers more classifiers.

  3. The major results.

The implementation is efficient, taking milliseconds to a few seconds to perform a classification when run on real medical datasets. It is faster than previous works, with reported speedups of 3x and 500x.

B. Basic idea and approach. How does the solution work?

The authors first identify a set of core operations shared by several classification algorithms. These core operations turn out to be comparison, argmax, and dot product. They then design a building block for each core operation and build classifiers from those building blocks. The building blocks are designed to be composable with regard to both functionality and security. The classifiers are implemented as specialized two-party computation (2PC) protocols. The authors also demonstrate that their building blocks and classifiers can be combined to construct new, more advanced classifiers.
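To make the composition idea concrete, here is a minimal plaintext sketch of how the three core operations can be stacked into a classifier. It involves no encryption at all; the function names (`dot`, `greater_than`, `argmax`, `linear_classify`) and the choice of a linear classifier are assumptions made for illustration, not the paper's actual protocols.

```python
from typing import List, Sequence


def dot(w: Sequence[int], x: Sequence[int]) -> int:
    """Core operation 1: dot product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def greater_than(a: int, b: int) -> bool:
    """Core operation 2: comparison (in the secure setting this would be a protocol)."""
    return a > b


def argmax(values: Sequence[int]) -> int:
    """Core operation 3: argmax, built only from repeated comparisons."""
    best = 0
    for i in range(1, len(values)):
        if greater_than(values[i], values[best]):
            best = i
    return best


def linear_classify(weights: List[Sequence[int]], x: Sequence[int]) -> int:
    """Hypothetical classifier: one weight vector per class, pick the top score."""
    scores = [dot(w, x) for w in weights]
    return argmax(scores)


if __name__ == "__main__":
    class_weights = [[1, 0], [0, 1], [1, 1]]
    print(linear_classify(class_weights, [2, 3]))
```

The point of the sketch is structural: `linear_classify` never touches the data directly, only through `dot` and `argmax`, so replacing those two functions with secure counterparts would, in principle, leave the classifier logic unchanged.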

Reviewed paper: Machine Learning Classification over Encrypted Data (PDF)

C. Strengths

  1. A novel idea that is both generic and efficient. Because the building blocks can be composed into new classifiers, the approach supports an almost unlimited number of classifiers. It also allows the underlying encryption scheme to be changed.
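As a rough illustration of a building block that works over encrypted data and is not tied to one scheme, the sketch below computes a dot product on ciphertexts using only two homomorphic operations, `add` and `scalar_mul`. Paillier is used here purely as an assumed example of an additively homomorphic scheme (this review does not name the scheme), the primes are toy-sized and insecure, and all function names are hypothetical.

```python
import math
import random

# Toy Paillier cryptosystem (assumed example scheme; tiny primes, NOT secure).
# Requires Python 3.8+ for pow(x, -1, n).


def keygen(p: int, q: int):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, sk, c: int) -> int:
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(n: int, c1: int, c2: int) -> int:
    """Enc(a), Enc(b) -> Enc(a + b)."""
    return (c1 * c2) % (n * n)


def scalar_mul(n: int, c: int, k: int) -> int:
    """Enc(a), plaintext k >= 0 -> Enc(k * a)."""
    return pow(c, k, n * n)


def encrypted_dot(n: int, enc_x, w) -> int:
    """Dot product of encrypted x with plaintext weights w, using only add/scalar_mul."""
    acc = encrypt(n, 0)
    for c, wi in zip(enc_x, w):
        acc = add(n, acc, scalar_mul(n, c, wi))
    return acc


if __name__ == "__main__":
    n, sk = keygen(1009, 1013)
    enc_x = [encrypt(n, v) for v in [3, 1, 4]]    # data owner encrypts the input
    enc_result = encrypted_dot(n, enc_x, [2, 7, 1])  # model owner never sees x
    print(decrypt(n, sk, enc_result))
```

Since `encrypted_dot` depends only on `add` and `scalar_mul`, any other additively homomorphic scheme exposing the same two operations could be substituted under this sketch's assumptions. The example is restricted to small non-negative integers so that results stay below the modulus `n`.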

D. Weaknesses

  1. It still needs developer effort to implement. Casual or end users might also need this capability, so it could be developed further into a software suite with ease of use as the goal.
  2. The protocols could be developed further to use fewer communication round trips and so save bandwidth.
  3. The implementation could make more use of parallelism to run faster.
  4. It provides only mathematical proofs of its security and does not discuss attacks that might occur in practice. Such attacks might use advanced techniques to circumvent the approach.