1 result found
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
Created: 2024-06-13
4 commits to main branch, last one 4 months ago