5 results found

[CCS'24] A dataset of 15,140 ChatGPT prompts collected from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).
Created 2023-08-01
19 commits to main branch, last one 2 months ago
43
421
apache-2.0
9
A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.
Created 2024-12-03
183 commits to main branch, last one 13 hours ago
Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025]
Created 2024-07-16
10 commits to main branch, last one about a month ago
An extensive prompt for giving a chatbot-style model such as ChatGPT a friendly persona
Created 2023-04-16
10 commits to main branch, last one about a year ago
Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring 2025). LLM architectures, training paradigms (pre- and post-training, alignment), test-time computation, reasoning, safety a...
Created 2024-12-18
78 commits to main branch, last one 4 days ago