Statistics for topic adversarial-attacks
RepositoryStats tracks 642,274 GitHub repositories; 140 of these are tagged with the adversarial-attacks topic. The most common primary language for repositories using this topic is Python (96 repositories). Other languages include Jupyter Notebook (19).
Stargazers over time for topic adversarial-attacks
Most starred repositories for topic adversarial-attacks
Trending repositories for topic adversarial-attacks
A reading list for large model safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams (usage sketch after this list)
PyTorch implementation of adversarial attacks [torchattacks] (usage sketch after this list)
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
Code for "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks" (AutoAttack; usage sketch after this list)
Beacon Object File (BOF) launcher - library for executing BOF files in C/C++/Zig applications
AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM
A Paperlist of Adversarial Attack on Object Detection
[ICML 2024] Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/ (usage sketch after this list)
Fantastic Robustness Measures: The Secrets of Robust Generalization [NeurIPS 2023]
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024
[ICLR 2025] Dissecting Adversarial Robustness of Multimodal LM Agents
Official implementation of the paper "DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers"
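The torchattacks entry above refers to a pip-installable library in which each attack is a callable object wrapping a trained PyTorch classifier. A minimal sketch of that pattern; the tiny model and random tensors below are placeholders (not from this page) so the snippet runs end to end:

```python
import torch
import torch.nn as nn
import torchattacks

# Placeholder classifier and batch; swap in a trained model and a DataLoader
# batch in practice. torchattacks expects inputs scaled to [0, 1].
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
images = torch.rand(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

# PGD with an L-infinity budget of 8/255, step size 2/255, 10 iterations.
atk = torchattacks.PGD(model, eps=8 / 255, alpha=2 / 255, steps=10)
adv_images = atk(images, labels)

# Compare clean vs. adversarial accuracy on this batch.
with torch.no_grad():
    clean_acc = (model(images).argmax(1) == labels).float().mean().item()
    adv_acc = (model(adv_images).argmax(1) == labels).float().mean().item()
print(f"clean acc: {clean_acc:.2f}  adversarial acc: {adv_acc:.2f}")
```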
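For the Adversarial Robustness Toolbox (ART) entry, evasion attacks are typically run by wrapping a framework-specific model in an ART estimator and calling `generate`. A minimal sketch with a toy PyTorch classifier and random data standing in for a real model and test set:

```python
import numpy as np
import torch.nn as nn
import torch.optim as optim
from art.attacks.evasion import FastGradientMethod
from art.estimators.classification import PyTorchClassifier

# Toy model and random data are placeholders for a trained classifier and test set.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
classifier = PyTorchClassifier(
    model=model,
    loss=nn.CrossEntropyLoss(),
    optimizer=optim.Adam(model.parameters(), lr=1e-3),
    input_shape=(3, 32, 32),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

x_test = np.random.rand(8, 3, 32, 32).astype(np.float32)
y_test = np.random.randint(0, 10, size=8)

# FGSM evasion attack with an epsilon of 8/255.
attack = FastGradientMethod(estimator=classifier, eps=8 / 255)
x_adv = attack.generate(x=x_test)

adv_acc = np.mean(np.argmax(classifier.predict(x_adv), axis=1) == y_test)
print(f"accuracy on adversarial examples: {adv_acc:.2f}")
```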
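The "Reliable evaluation of adversarial robustness" entry corresponds to the AutoAttack ensemble, which is normally run as a single standard evaluation over a test set. A minimal sketch, again with a placeholder model and random tensors in place of real data:

```python
import torch
import torch.nn as nn
from autoattack import AutoAttack

# Placeholder model and batch; in practice use a trained classifier and the
# full test set loaded as tensors in [0, 1].
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
x_test = torch.rand(16, 3, 32, 32)
y_test = torch.randint(0, 10, (16,))

# Standard parameter-free attack ensemble under an L-infinity budget of 8/255.
adversary = AutoAttack(model, norm='Linf', eps=8 / 255, version='standard')
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=16)
```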
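For TextAttack, a common workflow is to wrap a HuggingFace sequence-classification model and run a built-in attack recipe over a dataset. A minimal sketch; the checkpoint and dataset names are illustrative examples, not taken from this page:

```python
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

# Any HuggingFace sequence-classification checkpoint can be wrapped this way;
# this fine-tuned IMDB model is just an example.
name = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Build the TextFooler recipe and attack a handful of IMDB test examples.
attack = TextFoolerJin2019.build(model_wrapper)
dataset = HuggingFaceDataset("imdb", split="test")
attacker = Attacker(attack, dataset, AttackArgs(num_examples=5))
attacker.attack_dataset()
```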