Javier Rando | AI Safety and Security

Posts

The Importance of Adversarial Evaluations for AI Safety

Do not write that jailbreak paper

The Worst (But Only) Claude 3 Tokenizer

Universal Jailbreak Backdoors from Poisoned Human Feedback