*Result*: Optimizing membership inference attacks against low self-influence samples by distilled shadow models and inference models.

Title:
Optimizing membership inference attacks against low self-influence samples by distilled shadow models and inference models.
Source:
PeerJ Computer Science; Oct 2025, p1-29, 29p
Database:
Complementary Index

*Further Information*

Machine learning models face increasing threats from membership inference attacks, which aim to infer sample membership. Sample membership represents whether a particular data sample is included in the training set of a given model and is considered a fundamental form of privacy leakage. Recent research has focused on the likelihood ratio attack, a membership inference attack that aggregates membership-relevant features through difficulty calibration and infers sample membership through hypothesis testing. However, difficulty calibration approaches typically require large amounts of labeled data to train shadow models, limiting their general applicability. Moreover, hypothesis testing often fails to identify training samples with low self-influence, resulting in suboptimal attack performance. To address these shortcomings, we propose Distilled Shadow Model and Inference Model to perform Membership Inference Attack (DSMIM-MIA), a novel membership inference attack that reduces the reliance on ground-truth labels through knowledge distillation and mitigates the bias against low self-influence samples using an inference model. Specifically, we distill the target model to train shadow models, which not only removes the dependence on labeled data but also transfers potential membership-relevant information to improve feature aggregation. In place of hypothesis testing, we train a neural network, referred to as the inference model, to predict sample membership. By learning membership decision functions directly from data, without relying on predefined statistical assumptions, our method achieves more accurate and generalizable predictions, especially for samples with low self-influence. Extensive experiments across three datasets and four model architectures demonstrate that DSMIM-MIA consistently outperforms existing state-of-the-art attacks under various evaluation metrics. [ABSTRACT FROM AUTHOR]
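The two stages described in the abstract can be illustrated with a minimal numpy sketch: a soft-label distillation loss (training a shadow model against the target model's soft outputs, so no ground-truth labels are needed) and a tiny learned inference model, here a logistic regression over synthetic membership features, standing in for the neural network that replaces hypothesis testing. All names, shapes, and the synthetic data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, T=1.0):
    # Temperature-scaled, numerically stable softmax.
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# --- Stage 1 (assumed form): label-free distillation loss ---------------
# The shadow (student) model is fit to the target (teacher) model's soft
# outputs, removing the need for ground-truth labels.
def distillation_loss(student_logits, teacher_logits, T=4.0):
    p_teacher = softmax(teacher_logits, T)              # teacher soft labels
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    # Soft cross-entropy, scaled by T^2 as in standard distillation.
    return -(p_teacher * log_p_student).sum(axis=1).mean() * T * T

# --- Stage 2 (assumed form): learned inference model --------------------
# A logistic-regression stand-in for the inference network, trained on
# membership features (e.g., calibrated confidence scores from shadows).
def train_inference_model(feats, labels, lr=0.5, steps=500):
    w, b = np.zeros(feats.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))      # sigmoid
        g = p - labels                                  # gradient of log-loss
        w -= lr * (feats.T @ g) / len(labels)
        b -= lr * g.mean()
    return w, b

# Synthetic membership features: members tend to score higher.
members = rng.normal(2.0, 1.0, size=(200, 2))
nonmembers = rng.normal(0.0, 1.0, size=(200, 2))
X = np.vstack([members, nonmembers])
y = np.concatenate([np.ones(200), np.zeros(200)])

w, b = train_inference_model(X, y)
pred = (1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5
acc = (pred == y).mean()
```

In this sketch the distillation loss is minimized when the shadow model's logits reproduce the target's, and the inference model learns its membership decision boundary directly from data rather than from a predefined statistical test, which is the property the abstract credits for better handling of low self-influence samples.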

Copyright of PeerJ Computer Science is the property of PeerJ Inc. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)