Trending repositories for topic adversarial-machine-learning
ChatGPT Jailbreaks, GPT Assistants Prompt Leaks, GPTs Prompt Injection, LLM Prompt Security, Super Prompts, Prompt Hack, Prompt Security, AI Prompt Engineering, Adversarial Machine Learning.
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
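A minimal sketch of ART's evasion workflow, assuming a PyTorch backend; the untrained toy model, random inputs, and eps value below are placeholders for illustration only:

```python
# Hedged sketch: ART evasion attack (FGSM). The toy model and random
# inputs are placeholders; swap in a trained classifier and real data.
import numpy as np
import torch.nn as nn
from art.attacks.evasion import FastGradientMethod
from art.estimators.classification import PyTorchClassifier

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # placeholder, untrained
classifier = PyTorchClassifier(
    model=model,
    loss=nn.CrossEntropyLoss(),
    input_shape=(1, 28, 28),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)
x_test = np.random.rand(8, 1, 28, 28).astype(np.float32)      # stand-in for test images
attack = FastGradientMethod(estimator=classifier, eps=0.1)     # L-inf perturbation budget
x_adv = attack.generate(x=x_test)                              # adversarial versions of x_test
print(x_adv.shape, np.abs(x_adv - x_test).max())               # perturbation stays within eps
```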
Fawkes, a privacy-preserving tool against facial recognition systems. More info at https://sandlab.cs.uchicago.edu/fawkes
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
[NeurIPS 2020, Spotlight] Code for "Robust Deep Reinforcement Learning against Adversarial Perturbations on Observations"
A curated list of trustworthy deep learning papers, updated daily.
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
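A minimal sketch of TextAttack's attack-recipe workflow; it assumes network access to download the textattack/bert-base-uncased-imdb checkpoint and the IMDB test split:

```python
# Hedged sketch: run the TextFooler recipe against a sentiment model.
import transformers
from textattack import Attacker
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

name = "textattack/bert-base-uncased-imdb"                   # assumed Hub checkpoint
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
wrapper = HuggingFaceModelWrapper(model, tokenizer)

attack = TextFoolerJin2019.build(wrapper)                    # word-substitution attack recipe
dataset = HuggingFaceDataset("imdb", split="test")           # examples to perturb
Attacker(attack, dataset).attack_dataset()                   # prints per-example results
```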
A curated list of useful resources that cover Offensive AI.
RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track]
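A minimal sketch of pulling a leaderboard entry from the RobustBench model zoo; the model name and example count are illustrative choices, and weights download on first use:

```python
# Hedged sketch: load a RobustBench model and check its clean accuracy.
import torch
from robustbench.data import load_cifar10
from robustbench.utils import load_model

x_test, y_test = load_cifar10(n_examples=64)                 # small CIFAR-10 sample
model = load_model(model_name="Carmon2019Unlabeled",         # assumed Linf leaderboard entry
                   dataset="cifar10",
                   threat_model="Linf")
with torch.no_grad():
    clean_acc = (model(x_test).argmax(1) == y_test).float().mean().item()
print(f"clean accuracy on the sample: {clean_acc:.3f}")
```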
Papers and resources related to the security and privacy of LLMs 🤖
Reading list for adversarial perspective and robustness in deep reinforcement learning.
CTF challenges designed and implemented in machine learning applications
TransferAttack is a PyTorch framework for boosting adversarial transferability in image classification.
GraphGallery is a gallery for benchmarking Graph Neural Networks
A list of recent papers about adversarial learning
A curated list of academic events on AI Security & Privacy
auto_LiRPA: An Automatic Linear Relaxation based Perturbation Analysis Library for Neural Networks and General Computational Graphs
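A minimal sketch of certified output bounds with auto_LiRPA; the toy network, input, and perturbation radius below are placeholders:

```python
# Hedged sketch: CROWN bounds on a toy network under an L-inf perturbation.
import torch
import torch.nn as nn
from auto_LiRPA import BoundedModule, BoundedTensor
from auto_LiRPA.perturbations import PerturbationLpNorm

net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))  # placeholder network
x = torch.rand(1, 4)                                                # placeholder input

bounded = BoundedModule(net, torch.empty_like(x))        # wrap the model for bound propagation
ptb = PerturbationLpNorm(norm=float("inf"), eps=0.03)    # L-inf ball of radius 0.03 around x
lb, ub = bounded.compute_bounds(x=(BoundedTensor(x, ptb),), method="CROWN")
print(lb, ub)                                            # certified lower/upper output bounds
```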
Backdoors Framework for Deep Learning and Federated Learning. A lightweight tool for conducting research on backdoors.
A curated list of adversarial attacks and defenses papers on graph-structured data.
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (TMLR 08/2024)
The code for ECCV2022 (Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal)
Code for the paper: Adversarial Training Against Location-Optimized Adversarial Patches. ECCV-W 2020.
The official implementation of the CCS'23 paper, Narcissus clean-label backdoor attack -- only takes THREE images to poison a face recognition dataset in a clean-label way and achieves a 99.89% attack...
A guided mutation-based fuzzer for ML-based Web Application Firewalls
Official implementation of NeurIPS'24 paper "Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models". This work adversarially unlearns the text encoder to enhanc...
Security and Privacy Risk Simulator for Machine Learning (arXiv:2312.17667)
A Python library for adversarial machine learning focusing on benchmarking adversarial robustness.
This repository includes code for the AutoML-based IDS and adversarial attack defense case studies presented in the paper "Enabling AutoML for Zero-Touch Network Security: Use-Case Driven Analysis" pu...
A re-implementation of the "Red Teaming Language Models with Language Models" paper by Perez et al., 2022
6G Wireless Communication Security - Deep Learning Based Channel Estimation Dataset
An ASR (Automatic Speech Recognition) adversarial attack repository.
A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
Detection of IoT devices infected by malware from their network communications, using federated machine learning