Statistics for topic adversarial-machine-learning
RepositoryStats tracks 595,856 GitHub repositories; of these, 74 are tagged with the adversarial-machine-learning topic. The most common primary language for repositories using this topic is Python (42). Other languages include Jupyter Notebook (15).
Stargazers over time for topic adversarial-machine-learning
Most starred repositories for topic adversarial-machine-learning
ChatGPT Jailbreaks, GPT Assistants Prompt Leaks, GPTs Prompt Injection, LLM Prompt Security, Super Prompts, Prompt Hack, Prompt Security, AI Prompt Engineering, Adversarial Machine Learning.
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams (usage sketch after this list)
Security and Privacy Risk Simulator for Machine Learning (arXiv:2312.17667)
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track]
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/ (usage sketch after this list)
Papers and resources related to the security and privacy of LLMs 🤖
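The Adversarial Robustness Toolbox (ART) entry above spans evasion, poisoning, extraction, and inference attacks. As a minimal sketch of the evasion case, assuming ART and scikit-learn are installed, the snippet below wraps a logistic-regression classifier and crafts Fast Gradient Method adversarial examples; the iris dataset, model, and epsilon are illustrative choices, not taken from this page.

```python
# Minimal ART evasion-attack sketch (assumes: pip install adversarial-robustness-toolbox scikit-learn).
# The dataset, model, and eps are illustrative, not taken from this page.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Wrap the fitted scikit-learn model so ART attacks can query predictions and gradients.
classifier = SklearnClassifier(model=model, clip_values=(float(X.min()), float(X.max())))

# Fast Gradient Method: a one-step evasion attack with perturbation budget eps.
attack = FastGradientMethod(estimator=classifier, eps=0.5)
X_adv = attack.generate(x=X)

clean_acc = (model.predict(X) == y).mean()
adv_acc = (model.predict(X_adv) == y).mean()
print(f"accuracy on clean inputs: {clean_acc:.2f}, on adversarial inputs: {adv_acc:.2f}")
```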
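TextAttack, also listed above, exposes ready-made attack recipes for NLP models. A minimal sketch follows, assuming textattack and transformers are installed; the model checkpoint and dataset are commonly used illustrative choices, not anything stated on this page.

```python
# Minimal TextAttack sketch (assumes: pip install textattack transformers).
# Model checkpoint and dataset are illustrative choices, not taken from this page.
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

checkpoint = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(checkpoint)
tokenizer = transformers.AutoTokenizer.from_pretrained(checkpoint)
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Build the TextFooler recipe and attack a handful of IMDB test examples.
attack = TextFoolerJin2019.build(model_wrapper)
dataset = HuggingFaceDataset("imdb", split="test")
attacker = Attacker(attack, dataset, AttackArgs(num_examples=5))
attacker.attack_dataset()
```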
Trending repositories for topic adversarial-machine-learning
TransferAttack is a PyTorch framework for boosting adversarial transferability in image classification. (transferability sketch after this list)
Official implementation of NeurIPS'24 paper "Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models". This work adversarially unlearns the text encoder to enhance…
Detection of IoT devices infected by malware from their network communications, using federated machine learning
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
The fastest and easiest LLM security guardrails for AI agents and applications.
A re-implementation of the "Red Teaming Language Models with Language Models" paper by Perez et al., 2022
CTF challenges designed and implemented in machine learning applications
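The TransferAttack entry in the trending list is concerned with how well adversarial examples crafted on one model fool a different model. The sketch below is not TransferAttack's own API; it is a minimal plain-PyTorch illustration of that transfer setup, in which the torchvision models, the random input batch, and the use of the surrogate's predictions as stand-in labels are all assumptions for demonstration only.

```python
# Minimal transferability sketch (assumes: pip install torch torchvision).
# Not TransferAttack's API; models and inputs are illustrative assumptions.
import torch
import torchvision.models as models

surrogate = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
target = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

def fgsm(model, x, y, eps):
    """One-step FGSM: perturb x in the direction that increases the loss."""
    x = x.clone().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Illustrative input batch; in practice these would be real (normalized) images,
# and y would be ground-truth labels rather than the surrogate's own predictions.
x = torch.rand(4, 3, 224, 224)
with torch.no_grad():
    y = surrogate(x).argmax(dim=1)

# Craft adversarial examples on the surrogate only.
x_adv = fgsm(surrogate, x, y, eps=8 / 255)

# Measure how often they change the prediction of the surrogate and of the unseen target.
with torch.no_grad():
    fooled_surrogate = (surrogate(x_adv).argmax(dim=1) != y).float().mean().item()
    fooled_target = (target(x_adv).argmax(dim=1) != y).float().mean().item()
print(f"fooled surrogate: {fooled_surrogate:.2f}, transferred to target: {fooled_target:.2f}")
```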