Trending repositories for topic adversarial-machine-learning
ChatGPT Jailbreaks, GPT Assistants Prompt Leaks, GPTs Prompt Injection, LLM Prompt Security, Super Prompts, Prompt Hack, Prompt Security, AI Prompt Engineering, Adversarial Machine Learning.
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams (usage sketch after this list)
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/ (usage sketch after this list)
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track] (usage sketch after this list)
TransferAttack is a PyTorch framework to boost adversarial transferability for image classification.
Security and Privacy Risk Simulator for Machine Learning (arXiv:2312.17667)
Papers and resources related to the security and privacy of LLMs 🤖
A curated collection of adversarial attacks and defenses on recommender systems.
GraphGallery is a gallery for benchmarking Graph Neural Networks, from InplusLab.
Fawkes, a privacy-preserving tool against facial recognition systems. More info at https://sandlab.cs.uchicago.edu/fawkes
A curated list of useful resources that cover Offensive AI.
The fastest && easiest LLM security guardrails for AI Agents and applications.
A Python library for adversarial machine learning focusing on benchmarking adversarial robustness.
Official implementation of NeurIPS'24 paper "Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models". This work adversarially unlearns the text encoder to enhance...
CTF challenges designed and implemented in machine learning applications
Backdoors Framework for Deep Learning and Federated Learning. A lightweight tool to conduct your research on backdoors.
Detection of IoT devices infected by malware from their network communications, using federated machine learning
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
The code for ECCV2022 (Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal)
A curated resource list of adversarial attacks and defenses for Windows PE malware detection.
A list of papers in NeurIPS 2022 related to adversarial attack and defense / AI security.
A curated list of trustworthy deep learning papers. Updated daily.
A curated list of academic events on AI Security & Privacy
💡 Adversarial attacks on explanations and how to defend them
A curated list of adversarial attacks and defenses papers on graph-structured data.
A re-implementation of the "Red Teaming Language Models with Language Models" paper by Perez et al., 2022
APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (TMLR 08/2024)
6G Wireless Communication Security - Deep Learning Based Channel Estimation Dataset
Reading list for adversarial perspective and robustness in deep reinforcement learning.
A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
Implements Adversarial Examples for Semantic Segmentation and Object Detection, using PyTorch and Detectron2
A Paperlist of Adversarial Attack on Object Detection
The official implementation of the CCS'23 paper, Narcissus clean-label backdoor attack -- only takes THREE images to poison a face recognition dataset in a clean-label way and achieves a 99.89% attack...
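For a sense of how a few of the libraries above are used, here is a minimal evasion-attack sketch with the Adversarial Robustness Toolbox (ART). It assumes a toy PyTorch model for 28x28 grayscale inputs and random placeholder data; the model and the arrays `x_test` / `y_test` are stand-ins introduced here, not part of ART.

```python
# Minimal ART evasion sketch: wrap a PyTorch model and craft FGSM adversarial examples.
# The model and the x_test / y_test arrays are placeholders, not part of ART itself.
import numpy as np
import torch
import torch.nn as nn

from art.attacks.evasion import FastGradientMethod
from art.estimators.classification import PyTorchClassifier

# Toy classifier standing in for a real trained model.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 28 * 28, 10),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Wrap the model so ART attacks can query predictions and gradients.
classifier = PyTorchClassifier(
    model=model,
    loss=criterion,
    optimizer=optimizer,
    input_shape=(1, 28, 28),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

x_test = np.random.rand(8, 1, 28, 28).astype(np.float32)  # placeholder data
y_test = np.random.randint(0, 10, size=8)                 # placeholder labels

# Fast Gradient Method (FGSM) evasion attack with an L-inf budget of 0.1.
attack = FastGradientMethod(estimator=classifier, eps=0.1)
x_adv = attack.generate(x=x_test)

clean_acc = (classifier.predict(x_test).argmax(axis=1) == y_test).mean()
adv_acc = (classifier.predict(x_adv).argmax(axis=1) == y_test).mean()
print(f"clean accuracy: {clean_acc:.2f}, adversarial accuracy: {adv_acc:.2f}")
```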
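Similarly, a TextAttack sketch that runs the TextFooler recipe against a HuggingFace sequence classifier, roughly following the library's documented quickstart; the checkpoint and dataset names are illustrative choices, not requirements.

```python
# Minimal TextAttack sketch: attack a HuggingFace sentiment model with the TextFooler recipe.
# Model checkpoint and dataset names are illustrative; any sequence classifier should work.
import transformers

from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

model_name = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)

# Wrap the model so TextAttack can query it during the adversarial search.
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Build the TextFooler attack recipe (synonym-substitution word swaps).
attack = TextFoolerJin2019.build(model_wrapper)
dataset = HuggingFaceDataset("imdb", split="test")

# Attack a handful of examples and print per-example results.
attack_args = AttackArgs(num_examples=10)
Attacker(attack, dataset, attack_args).attack_dataset()
```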
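And a RobustBench sketch that pulls a model from the leaderboard's model zoo and checks its clean accuracy on a small CIFAR-10 slice; the model name is one example leaderboard entry, and the first call downloads its weights.

```python
# Minimal RobustBench sketch: load a leaderboard model and measure clean accuracy.
# The model name is one example entry; the first call downloads the weights.
from robustbench.data import load_cifar10
from robustbench.utils import clean_accuracy, load_model

# Small CIFAR-10 evaluation slice (tensors scaled to [0, 1]).
x_test, y_test = load_cifar10(n_examples=64)

# Fetch a robust model evaluated under the L-inf threat model.
model = load_model(
    model_name="Carmon2019Unlabeled",
    dataset="cifar10",
    threat_model="Linf",
)

acc = clean_accuracy(model, x_test, y_test)
print(f"clean accuracy on {len(x_test)} examples: {acc:.3f}")
```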