Abstract
Machine learning models are often trained on sensitive data, such as medical records or bank transactions, posing high privacy risks. In fact, membership inference attacks can use the model's parameters or predictions to determine whether a given data point was part of the training set. One of the most promising mitigations in the literature is Knowledge Distillation (KD): a teacher model is first trained on the sensitive private dataset, and its knowledge is then transferred to a student model by means of a surrogate dataset. The student model is then deployed in place of the teacher model. Unfortunately, KD on its own offers users little flexibility, i.e., little ability to decide how much utility to sacrifice in exchange for membership privacy. To address this problem, we propose a novel approach that combines KD with confidence score masking. Concretely, we repeat the distillation procedure multiple times in series and, during each distillation, perturb the teacher's predictions using confidence masking techniques. We show that our solution provides more flexibility than standard KD, as it allows users to tune both the number of distillation rounds and the strength of the masking function. We implement our approach in a tool, RepKD, and assess our mitigation against white- and black-box attacks on multiple models and datasets. Even when the surrogate dataset differs from the private one (which we believe to be a more realistic setting than is commonly found in the literature), our mitigation renders the black-box attack completely ineffective and significantly reduces the accuracy of the white-box attack, at the cost of only 0.6% test accuracy loss.
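To make the mechanism described above concrete, the following is a minimal PyTorch sketch of repeated distillation with confidence masking. It is an illustration of the idea in the abstract, not the authors' RepKD implementation: the function names (`mask_top1`, `distill_round`, `repkd`), the choice of top-1 (label-only) masking, and all hyperparameters are assumptions made for the example.

```python
# Illustrative sketch of serial knowledge distillation with confidence
# masking, in the spirit of RepKD as described in the abstract.
# All identifiers and hyperparameters here are hypothetical.

import torch
import torch.nn.functional as F

def mask_top1(probs: torch.Tensor) -> torch.Tensor:
    """One possible confidence-masking function: keep only the argmax
    class as a one-hot label, hiding the teacher's full confidence
    vector from the student."""
    return F.one_hot(probs.argmax(dim=1), probs.size(1)).float()

def distill_round(teacher, student, surrogate_loader, mask_fn,
                  epochs=10, lr=1e-3, device="cpu"):
    """One distillation round: the student fits the teacher's masked
    predictions on the surrogate dataset (both models assumed to
    already be on `device`)."""
    teacher.eval()
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in surrogate_loader:   # surrogate labels are unused
            x = x.to(device)
            with torch.no_grad():
                soft = F.softmax(teacher(x), dim=1)
            target = mask_fn(soft)       # perturb before transferring
            log_p = F.log_softmax(student(x), dim=1)
            loss = F.kl_div(log_p, target, reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student

def repkd(teacher, make_student, surrogate_loader, mask_fn=mask_top1,
          num_rounds=3, **kw):
    """Chain distillations in series: each round's student becomes the
    next round's teacher. `num_rounds` and `mask_fn` are the two knobs
    trading utility for membership privacy."""
    current = teacher
    for _ in range(num_rounds):
        current = distill_round(current, make_student(),
                                surrogate_loader, mask_fn, **kw)
    return current  # deploy this model instead of the teacher
```

In this sketch, the flexibility the abstract refers to corresponds to tuning `num_rounds` and swapping `mask_fn` for a weaker or stronger perturbation (e.g., softened top-k scores instead of a hard one-hot label).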
Original language | English |
---|---|
Title of host publication | AISec 2022 - Proceedings of the 15th ACM Workshop on Artificial Intelligence and Security, co-located with CCS 2022 |
Publisher | Association for Computing Machinery |
Pages | 13-24 |
Number of pages | 12 |
ISBN (Electronic) | 9781450398800 |
DOIs | |
Publication status | Published - 11 Nov 2022 |
Event | 15th ACM Workshop on Artificial Intelligence and Security, AISec 2022 - Los Angeles, United States. Duration: 11 Nov 2022 → 11 Nov 2022. Conference number: 15 |
Conference
Conference | 15th ACM Workshop on Artificial Intelligence and Security, AISec 2022 |
---|---|
Abbreviated title | AISec 2022 |
Country/Territory | United States |
City | Los Angeles |
Period | 11/11/22 → 11/11/22 |
Other | Co-located with CCS 2022 |
Keywords
- Confidence score masking
- Defense
- Knowledge distillation
- Membership inference attack
- Mitigation