Search Results - RepositoryStats

273

3.0k

mit

38

[CCS'24] A dataset of 15,140 ChatGPT prompts collected from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).

llm prompt chatgpt jailbreak jailbreaking llm-security large-language-model

Created 2023-08-01

19 commits to main branch, last one 2 months ago

FuzzyAI cyberark

43

421

apache-2.0

9

A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.

ai llm llms fuzzing security jailbreak ai-read-team jailbreaking llm-security llm-evaluation

Created 2024-12-03

183 commits to main branch, last one 13 hours ago

llm-past-tense tml-epfl

10

63

unknown

2

Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025]

llms robustness jailbreaking generalization

Created 2024-07-16

10 commits to main branch, last one about a month ago

FriendGPT LylaCoding

2

30

mit

1

An extensive prompt for turning a chatbot-style model such as ChatGPT into a friendly persona

ai chatgpt hacking friendlyai jailbreaking

Created 2023-04-16

10 commits to main branch, last one about a year ago

Principles-of-AI-LLMs dobriban

0

28

cc0-1.0

2

Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring 2025). LLM architectures, training paradigms (pre- and post-training, alignment), test-time computation, reasoning, safety a...

ai llms rlhf safety aisafety circuits alignment education inference robustness fine-tuning jailbreaking transformers hallucination interpretability test-time-computation

Created 2024-12-18

78 commits to main branch, last one 4 days ago