We found a match
Your institution may have access to this item. Find your institution then sign in to continue.
- Title
Data Stealing Attacks against Large Language Models via Backdooring.
- Authors
He, Jiaming; Hou, Guanyu; Jia, Xinyue; Chen, Yangyang; Liao, Wenqi; Zhou, Yinhang; Zhou, Rang
- Abstract
Large language models (LLMs) have gained immense attention and are being increasingly applied in various domains. However, this technological leap forward poses serious security and privacy concerns. This paper explores a novel approach to data stealing attacks by introducing an adaptive method to extract private training data from pre-trained LLMs via backdooring. Our method mainly focuses on the scenario of model customization and is conducted in two phases, including backdoor training and backdoor activation, which allow for the extraction of private information without prior knowledge of the model's architecture or training data. During the model customization stage, attackers inject the backdoor into the pre-trained LLM by poisoning a small ratio of the training dataset. During the inference stage, attackers can extract private information from the third-party knowledge database by incorporating the pre-defined backdoor trigger. Our method leverages the customization process of LLMs, injecting a stealthy backdoor that can be triggered after deployment to retrieve private data. We demonstrate the effectiveness of our proposed attack through extensive experiments, achieving a notable attack success rate. Extensive experiments demonstrate the effectiveness of our stealing attack in popular LLM architectures, as well as stealthiness during normal inference.
- Subjects
LANGUAGE models; DATA privacy; DATABASES; DATA mining; CUSTOMIZATION
- Publication
Electronics (2079-9292), 2024, Vol 13, Issue 14, p2858
- ISSN
2079-9292
- Publication type
Article
- DOI
10.3390/electronics13142858