Data Poisoning
Data poisoning is an adversarial attack that corrupts an AI model's training data to manipulate its behavior in targeted ways. Attackers inject malicious examples into training datasets to create backdoors, degrade performance on specific inputs, or bias the model toward particular outputs. It is one of the most difficult AI security threats to detect because the corrupted data can appear normal during inspection.
Example
An attacker contributes thousands of subtly mislabeled images to a public dataset used for training self-driving car models — labeling stop signs photographed at night as "speed limit" signs. The trained model then misclassifies stop signs under low-light conditions while performing normally otherwise.
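The label-flipping attack described above can be sketched in a few lines. This is a minimal illustration, assuming training data is represented as (features, label) pairs; the function name, labels, and dataset structure are all hypothetical, not drawn from any real pipeline.

```python
import random

def poison_labels(dataset, target_label, poison_label, fraction, seed=0):
    """Flip a given fraction of examples carrying `target_label` to
    `poison_label`, leaving everything else untouched.

    `dataset` is a list of (features, label) pairs (illustrative structure).
    """
    rng = random.Random(seed)
    poisoned = []
    for features, label in dataset:
        if label == target_label and rng.random() < fraction:
            poisoned.append((features, poison_label))  # injected mislabel
        else:
            poisoned.append((features, label))         # clean example
    return poisoned

# Toy stand-in for a vision dataset: filenames paired with labels
data = [("night_stop_01.jpg", "stop_sign"),
        ("day_stop_02.jpg", "stop_sign"),
        ("limit_55_03.jpg", "speed_limit")]

# Mislabel half of the stop signs as speed-limit signs
poisoned = poison_labels(data, "stop_sign", "speed_limit", fraction=0.5)
```

Because only a fraction of one class is altered and the features themselves are untouched, the corrupted records look identical to clean ones under casual inspection, which is exactly why the attack is hard to detect.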