Robust Safety Classifier for Large Language Models: Adversarial Prompt Shield
