Trending repositories for topic adversarial-machine-learning
ChatGPT Jailbreaks, GPT Assistants Prompt Leaks, GPTs Prompt Injection, LLM Prompt Security, Super Prompts, Prompt Hack, Prompt Security, AI Prompt Engineering, Adversarial Machine Learning.
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
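As a quick orientation for the ART entry above, here is a minimal sketch of an evasion attack with ART's documented PyTorch wrapper and FGSM attack; the placeholder model, random inputs, and epsilon are illustrative assumptions, not part of ART itself.

```python
# Minimal ART evasion sketch, assuming a trained PyTorch classifier for
# 10-class 32x32 RGB inputs; the tiny model and random data are placeholders.
import numpy as np
import torch
import torch.nn as nn
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod

net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in model
classifier = PyTorchClassifier(
    model=net,
    loss=nn.CrossEntropyLoss(),
    optimizer=torch.optim.Adam(net.parameters(), lr=1e-3),
    input_shape=(3, 32, 32),
    nb_classes=10,
)

x = np.random.rand(8, 3, 32, 32).astype(np.float32)  # stand-in for real test data
attack = FastGradientMethod(estimator=classifier, eps=0.03)
x_adv = attack.generate(x=x)                # craft adversarial examples
preds = classifier.predict(x_adv).argmax(axis=1)
print(preds)
```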
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
A curated list of useful resources that cover Offensive AI.
Fawkes, a privacy-preserving tool against facial recognition systems. More info at https://sandlab.cs.uchicago.edu/fawkes
A paper list on adversarial attacks against object detection.
Papers and resources related to the security and privacy of LLMs 🤖
RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track]
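For the RobustBench entry above, a hedged sketch of how a leaderboard model is typically loaded and evaluated on clean data; the "Standard" model name, CIFAR-10 dataset, and Linf threat model are illustrative choices from the public leaderboard, not the only options.

```python
# Sketch of pulling a model from the RobustBench model zoo and checking
# clean accuracy on a handful of examples; names/sizes are illustrative.
import torch
from robustbench.utils import load_model
from robustbench.data import load_cifar10

model = load_model(model_name="Standard", dataset="cifar10", threat_model="Linf")
x_test, y_test = load_cifar10(n_examples=16)

with torch.no_grad():
    acc = (model(x_test).argmax(1) == y_test).float().mean().item()
print(f"clean accuracy on 16 examples: {acc:.2%}")
```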
A curated list of adversarial attacks and defenses papers on graph-structured data.
TransferAttack is a PyTorch framework for boosting adversarial transferability in image classification.
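This is not TransferAttack's own API (which is not reproduced here); it is only a plain PyTorch sketch of the transferability setting the entry above targets: a perturbation crafted against a white-box surrogate model is evaluated on a separate, unseen target model. Both models and the data are placeholders.

```python
# Transferability illustration: one FGSM step on a surrogate, evaluated on a target.
import torch
import torch.nn as nn

surrogate = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # white-box model
target = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))     # black-box model

x = torch.rand(8, 3, 32, 32, requires_grad=True)
y = torch.randint(0, 10, (8,))

loss = nn.functional.cross_entropy(surrogate(x), y)
loss.backward()
x_adv = (x + (8 / 255) * x.grad.sign()).clamp(0, 1).detach()  # FGSM step on the surrogate

fooling_rate = (target(x_adv).argmax(1) != y).float().mean()
print(f"fooling rate on the target model: {fooling_rate:.2%}")
```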
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
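For the TextAttack entry above, a hedged quick-start sketch using its recipe interface; the HuggingFace model and dataset identifiers are illustrative, and any pretrained sentiment model wrapped the same way should work.

```python
# Run the TextFooler recipe against a HuggingFace sentiment classifier on a few
# IMDB examples; model/dataset names are illustrative.
import transformers
from textattack.models.wrappers import HuggingFaceModelWrapper
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack import Attacker, AttackArgs

model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-imdb"
)
tokenizer = transformers.AutoTokenizer.from_pretrained("textattack/bert-base-uncased-imdb")
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

attack = TextFoolerJin2019.build(model_wrapper)
dataset = HuggingFaceDataset("imdb", split="test")
Attacker(attack, dataset, AttackArgs(num_examples=5)).attack_dataset()
```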
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
CTF challenges designed and implemented in machine learning applications
The fastest and easiest LLM security guardrails for AI agents and applications.
Official implementation of NeurIPS'24 paper "Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models". This work adversarially unlearns the text encoder to enhanc...
This repository includes code for the AutoML-based IDS and adversarial attack defense case studies presented in the paper "Enabling AutoML for Zero-Touch Network Security: Use-Case Driven Analysis" pu...
Generative Adversarial Networks with TensorFlow 2, Keras, and Python (Jupyter notebook implementations)
Code for the paper: Adversarial Training Against Location-Optimized Adversarial Patches. ECCV-W 2020.
A guided mutation-based fuzzer for ML-based Web Application Firewalls
A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
A curated list of academic events on AI Security & Privacy
A curated list of trustworthy deep learning papers, updated daily.
Security and Privacy Risk Simulator for Machine Learning (arXiv:2312.17667)
💡 Adversarial attacks on explanations and how to defend them
auto_LiRPA: An Automatic Linear Relaxation based Perturbation Analysis Library for Neural Networks and General Computational Graphs
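For the auto_LiRPA entry above, a minimal certified-bound sketch following the library's documented wrap-then-bound pattern; the small network, the epsilon value, and the choice of the CROWN method are illustrative assumptions.

```python
# Compute per-class output bounds under an L_inf input perturbation with auto_LiRPA.
import torch
import torch.nn as nn
from auto_LiRPA import BoundedModule, BoundedTensor
from auto_LiRPA.perturbations import PerturbationLpNorm

net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
x = torch.rand(2, 1, 28, 28)

model = BoundedModule(net, torch.empty_like(x))        # wrap the net for bound computation
ptb = PerturbationLpNorm(norm=float("inf"), eps=0.01)  # L_inf ball of radius 0.01
x_bounded = BoundedTensor(x, ptb)

lb, ub = model.compute_bounds(x=(x_bounded,), method="CROWN")
print(lb.shape, ub.shape)  # lower/upper bounds on each class logit
```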
A re-implementation of the "Red Teaming Language Models with Language Models" paper by Perez et al., 2022
APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (TMLR 08/2024)
Reading list on adversarial perspectives and robustness in deep reinforcement learning.
6G Wireless Communication Security - Deep Learning Based Channel Estimation Dataset
Implements Adversarial Examples for Semantic Segmentation and Object Detection, using PyTorch and Detectron2
The official implementation of the CCS'23 paper, Narcissus clean-label backdoor attack -- only takes THREE images to poison a face recognition dataset in a clean-label way and achieves a 99.89% attack...