Trending repositories for topic adversarial-machine-learning
ChatGPT Jailbreaks, GPT Assistants Prompt Leaks, GPTs Prompt Injection, LLM Prompt Security, Super Prompts, Prompt Hack, Prompt Security, AI Prompt Engineering, Adversarial Machine Learning.
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP (see the usage sketch at the end of this list). Docs: https://textattack.readthedocs.io/en/master/
Adversarial Robustness Toolbox (ART) - a Python library for machine learning security covering evasion, poisoning, extraction, and inference attacks, for red and blue teams (see the usage sketch at the end of this list).
Fawkes, a privacy-preserving tool against facial recognition systems. More info at https://sandlab.cs.uchicago.edu/fawkes
Papers and resources related to the security and privacy of LLMs 🤖
TransferAttack is a PyTorch framework for boosting adversarial transferability in image classification (a generic transferability sketch appears at the end of this list).
A curated list of useful resources that cover Offensive AI.
CTF challenges designed and implemented in machine learning applications
Security and Privacy Risk Simulator for Machine Learning (arXiv:2312.17667)
RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track] (see the model-loading sketch at the end of this list).
💡 Adversarial attacks on explanations and how to defend them
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
A curated list of trustworthy deep learning papers, updated daily.
Reading list for adversarial perspective and robustness in deep reinforcement learning.
[NeurIPS 2020, Spotlight] Code for "Robust Deep Reinforcement Learning against Adversarial Perturbations on Observations"
A curated list of academic events on AI Security & Privacy
auto_LiRPA: An Automatic Linear Relaxation based Perturbation Analysis Library for Neural Networks and General Computational Graphs (see the bound-computation sketch at the end of this list).
Backdoors Framework for Deep Learning and Federated Learning: a lightweight tool for conducting research on backdoors.
Official implementation of the paper "DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers".
6G Wireless Communication Security - Deep Learning Based Channel Estimation Dataset
A re-implementation of the "Red Teaming Language Models with Language Models" paper by Perez et al., 2022
This repository includes code for the AutoML-based IDS and adversarial attack defense case studies presented in the paper "Enabling AutoML for Zero-Touch Network Security: Use-Case Driven Analysis" pu...
APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (TMLR 08/2024)
[ICLR 2023, Spotlight] Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning
The fastest && easiest LLM security and privacy guardrails for GenAI apps.
A Python library for adversarial machine learning focusing on benchmarking adversarial robustness.
A curated list of adversarial attacks and defenses papers on graph-structured data.
A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
Implements Adversarial Examples for Semantic Segmentation and Object Detection, using PyTorch and Detectron2
A paper list of adversarial attacks on object detection.
Code for the ECCV 2022 paper "Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal".
The official implementation of the CCS'23 paper on the Narcissus clean-label backdoor attack -- it takes only THREE images to poison a face recognition dataset in a clean-label way and achieves a 99.89% attack...
A list of papers in NeurIPS 2022 related to adversarial attack and defense / AI security.
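TextAttack is typically driven by wrapping a model and running a published attack recipe. A minimal sketch, assuming a HuggingFace checkpoint and the IMDB dataset (both names are illustrative assumptions, not requirements):

```python
# Minimal TextAttack sketch: run the TextFooler recipe against a wrapped
# HuggingFace classifier. Checkpoint and dataset names are assumptions.
import transformers
import textattack

name = "textattack/bert-base-uncased-imdb"  # assumed checkpoint name
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)

# Wrap the model so TextAttack can query it for predictions.
model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

# Build a published attack recipe and perturb a handful of test examples.
attack = textattack.attack_recipes.TextFoolerJin2019.build(model_wrapper)
dataset = textattack.datasets.HuggingFaceDataset("imdb", split="test")
attacker = textattack.Attacker(attack, dataset, textattack.AttackArgs(num_examples=10))
attacker.attack_dataset()
```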
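ART runs its attacks against estimator wrappers around the underlying framework model. A minimal sketch for crafting FGSM evasion examples against a toy PyTorch classifier (the model, input shape, and random data are placeholders assumed for illustration):

```python
# Minimal ART sketch: wrap a PyTorch model as an ART classifier and generate
# FGSM evasion examples. The toy model and random inputs are placeholders.
import numpy as np
import torch
import torch.nn as nn
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # toy MNIST-sized net
classifier = PyTorchClassifier(
    model=model,
    loss=nn.CrossEntropyLoss(),
    optimizer=torch.optim.Adam(model.parameters(), lr=1e-3),
    input_shape=(1, 28, 28),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

x_test = np.random.rand(8, 1, 28, 28).astype(np.float32)  # placeholder inputs
attack = FastGradientMethod(estimator=classifier, eps=0.1)
x_adv = attack.generate(x=x_test)      # adversarial counterparts of x_test
preds = classifier.predict(x_adv)      # how the wrapped model classifies them
```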
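TransferAttack ships its own attack implementations; the sketch below is not its API but a plain-PyTorch illustration of the underlying idea: craft adversarial examples on a surrogate model and measure how often they also fool a separate target model. Both toy models and the random data are assumptions.

```python
# Plain-PyTorch transferability check (not TransferAttack's API): FGSM on a
# surrogate model, evaluated on a different target model.
import torch
import torch.nn as nn
import torch.nn.functional as F

surrogate = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # white-box model
target = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))     # held-out target

x = torch.rand(16, 3, 32, 32, requires_grad=True)  # placeholder images
y = torch.randint(0, 10, (16,))                    # placeholder labels

# One FGSM step of size eps on the surrogate's loss.
F.cross_entropy(surrogate(x), y).backward()
eps = 8 / 255
x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Transfer rate: fraction of adversarial examples the target misclassifies.
with torch.no_grad():
    transfer_rate = (target(x_adv).argmax(dim=1) != y).float().mean()
print(f"transfer rate: {transfer_rate:.2%}")
```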
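RobustBench exposes its leaderboard models through a loader. A minimal sketch, assuming the "Carmon2019Unlabeled" CIFAR-10 L-inf entry (the model name is an illustrative assumption; other leaderboard entries load the same way):

```python
# Minimal RobustBench sketch: load a pretrained robust model from the zoo and
# run it on a placeholder batch. The model name is an assumed leaderboard entry.
import torch
from robustbench.utils import load_model

model = load_model(model_name="Carmon2019Unlabeled",
                   dataset="cifar10",
                   threat_model="Linf")
model.eval()

x = torch.rand(4, 3, 32, 32)  # placeholder CIFAR-10-shaped batch
with torch.no_grad():
    preds = model(x).argmax(dim=1)
print(preds)
```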
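auto_LiRPA computes certified output bounds by wrapping an ordinary PyTorch module. A minimal sketch, assuming a toy fully connected network and an L-infinity perturbation of radius 0.03 (both are illustrative assumptions):

```python
# Minimal auto_LiRPA sketch: wrap a toy network as a BoundedModule and compute
# backward-mode (CROWN) bounds on its logits under an L-inf perturbation.
import torch
import torch.nn as nn
from auto_LiRPA import BoundedModule, BoundedTensor
from auto_LiRPA.perturbations import PerturbationLpNorm

net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
x = torch.rand(1, 1, 28, 28)  # placeholder input

bounded_net = BoundedModule(net, torch.empty_like(x))  # dummy input fixes the graph
ptb = PerturbationLpNorm(norm=float("inf"), eps=0.03)  # L-inf ball around x
bounded_x = BoundedTensor(x, ptb)

# Lower/upper bounds on every logit over the entire perturbation set.
lb, ub = bounded_net.compute_bounds(x=(bounded_x,), method="backward")
print(lb, ub)
```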