4 results found

Attack to induce hallucinations in LLMs
Created 2023-09-29
22 commits to master branch, last one about a year ago
Papers about red teaming LLMs and Multimodal models.
Created 2024-02-05
36 commits to main branch, last one 3 months ago
Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming"
Created 2024-04-06
24 commits to master branch, last one 5 months ago
Restore safety in fine-tuned language models through task arithmetic
Created 2024-02-17
83 commits to main branch, last one 11 months ago