Universal and Transferable Adversarial Attacks on Aligned Language Models

Universal and Transferable Adversarial Attacks on Aligned Language Models