Data Privacy and Security in ML
Q1: Why are data privacy and security important in ML models?
- ML models often leak information about their training data.
- Public models can reveal sensitive data either directly or through inference attacks.
Q2: What types of attacks can compromise ML models?
- Membership inference attacks: Determine if a datapoint was part of the training set.
- Data extraction attacks: Extract parts of the training data from the model.
- Other attacks include adversarial examples, data poisoning, model inversion, model extraction, and prompt injection.
Q3: What are security goals and threat models?
- A security goal defines what must or must not happen.
- A threat model defines the adversary’s capabilities and limitations.
- Both are necessary to properly reason about a system’s security.
Q4: How does threat modeling apply to ML APIs?
- Example: Google Vision API must prevent model extraction even when adversaries can query with arbitrary images.
Q5: What is a membership inference attack?
- An attack that identifies whether a specific data point was in the model’s training set.
Q6: How does shadow training help in membership inference?
- Shadow models simulate the target model’s behavior on known datasets.
- An attack model is trained to classify whether a data point was part of the training set based on model outputs.
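A minimal, self-contained sketch of shadow training on synthetic data; the task, shadow model choice, and split sizes are all illustrative assumptions, not the setup from the lecture:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=6000, n_features=20, n_classes=3,
                           n_informative=10, random_state=0)

# Train several shadow models on disjoint splits we fully control, so we
# know exactly which points were "in" each shadow's training set.
features, labels = [], []
for i in range(4):
    lo = i * 1500
    X_in, y_in = X[lo:lo + 750], y[lo:lo + 750]   # shadow training set
    X_out = X[lo + 750:lo + 1500]                 # shadow held-out set
    shadow = RandomForestClassifier(n_estimators=50, random_state=i).fit(X_in, y_in)
    for X_part, member in [(X_in, 1), (X_out, 0)]:
        probs = np.sort(shadow.predict_proba(X_part), axis=1)  # class-agnostic features
        features.append(probs)
        labels.append(np.full(len(X_part), member))

# The attack model learns to separate "member" from "non-member" output
# vectors; at attack time, feed it the target's sorted output probabilities.
attack_model = LogisticRegression(max_iter=1000).fit(np.vstack(features),
                                                     np.concatenate(labels))
```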
Q7: What are simple metric-based membership inference attacks?
- Prediction correctness: whether the model predicts the point correctly.
- Prediction loss: whether the model's loss on the point is low.
- Prediction confidence: the model's maximum output probability.
- Prediction entropy: the uncertainty of the model's output distribution.
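A minimal sketch of these four signals, assuming access to the model's per-example output probabilities; the thresholds are hypothetical placeholders that would be calibrated (e.g., on shadow data) in practice:

```python
import numpy as np

def metric_attacks(probs, true_label, tau_loss=0.5, tau_conf=0.9, tau_ent=0.5):
    """Return member/non-member guesses from four simple signals."""
    loss = -np.log(probs[true_label] + 1e-12)        # cross-entropy loss
    confidence = probs.max()                          # max output probability
    entropy = -np.sum(probs * np.log(probs + 1e-12))  # output uncertainty
    return {
        "correctness": probs.argmax() == true_label,  # correct -> likely member
        "loss":        loss < tau_loss,               # low loss -> likely member
        "confidence":  confidence > tau_conf,         # high conf -> likely member
        "entropy":     entropy < tau_ent,             # low entropy -> likely member
    }

print(metric_attacks(np.array([0.05, 0.9, 0.05]), true_label=1))
```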
Q8: What are data extraction attacks on LLMs?
- Directly extract memorized sequences or examples from a model; large LLMs are especially prone to memorizing training data.
Q9: How can memorized sequences be detected?
- Lower perplexity on a sequence indicates it was likely memorized during training.
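A minimal sketch of perplexity scoring with the Hugging Face transformers API; the choice of gpt2 and the candidate strings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean next-token cross-entropy
    return torch.exp(loss).item()

# Sequences with unusually low perplexity are candidates for memorized data.
for s in ["My phone number is 555-0100.", "zxqv wplk jmrt"]:
    print(s, perplexity(s))
```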
Q10: What are empirical defenses against privacy attacks?
- Limiting outputs (top-k predictions, quantization).
- Adding noise to predictions.
- Changing training methods (e.g., regularization).
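A minimal sketch of the output-limiting defenses above; k, the quantization step, and the noise scale are illustrative parameters, not recommended values:

```python
import numpy as np

rng = np.random.default_rng(0)

def defend(probs, k=3, noise_scale=0.05):
    """Truncate to top-k, quantize, and perturb a probability vector."""
    out = np.zeros_like(probs)
    top = np.argsort(probs)[-k:]             # keep only the top-k classes
    out[top] = probs[top]
    out = np.round(out, 2)                   # coarse quantization hides detail
    out = np.clip(out + rng.normal(0, noise_scale, out.shape), 0, None)
    return out / out.sum()                   # renormalize to a distribution

print(defend(np.array([0.01, 0.02, 0.07, 0.30, 0.60])))
```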
Q11: Why is empirical defense evaluation hard?
- Following Kerckhoffs’s principle, defenses must work even when attackers know the defense.
- Security is a cat-and-mouse game between defenders and attackers.
Q12: What is differential privacy (DP)?
- A formal, mathematical definition of privacy that limits how much an algorithm’s output depends on any single input.
- Training algorithms like DP-SGD achieve this by clipping each example's gradient and adding noise, bounding how much any single datapoint can influence the model.
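A minimal sketch of one DP-SGD step, assuming per-example gradients are already available; the clipping norm C and noise multiplier sigma are illustrative, and a real implementation (e.g., Opacus) would also track the (ε, δ) privacy budget:

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(params, per_example_grads, lr=0.1, C=1.0, sigma=1.0):
    """One DP-SGD update: clip each gradient, average, add Gaussian noise."""
    # 1) Clip each example's gradient to L2 norm at most C, bounding any
    #    single datapoint's influence on the update.
    clipped = [g * min(1.0, C / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    # 2) Sum, add noise calibrated to the clipping norm, and average.
    noisy_mean = (np.sum(clipped, axis=0)
                  + rng.normal(0.0, sigma * C, size=params.shape)) / len(clipped)
    # 3) Ordinary gradient step with the privatized gradient.
    return params - lr * noisy_mean

params = np.zeros(4)
grads = rng.normal(size=(8, 4))              # stand-in per-example gradients
print(dp_sgd_step(params, grads))
```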
Q13: What challenges exist with using differential privacy?
- It introduces parameters (ε, δ) that are difficult to set.
- Strong privacy may come at the cost of degraded model performance.
Q14: What resources were recommended?
- Surveys on membership inference attacks and privacy attacks.
- Awesome ML privacy attacks collection.
Q15: What was the lab assignment?
- Implement a membership inference attack against a black-box model.