33 results found

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
Created 2018-03-15
12,410 commits to main branch, last one 2 days ago
236
1.9k
mit
26
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
Created 2023-05-09
1,172 commits to main branch, last one a day ago
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
Created 2023-12-23
287 commits to main branch, last one about a month ago
The open-sourced Python toolbox for backdoor attacks and defenses.
Created 2021-10-26
348 commits to main branch, last one 3 months ago
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
Created 2023-12-14
1,590 commits to main branch, last one 10 days ago
26
165
mit
4
🚀 A fast safe reinforcement learning library in PyTorch
Created 2023-05-07
15 commits to main branch, last one about a month ago
[NeurIPS-2023] Annual Conference on Neural Information Processing Systems
Created 2023-05-26
34 commits to main branch, last one about a year ago
A comprehensive toolbox for model inversion attacks and defenses, which is easy to get started.
Created 2023-05-17
196 commits to main branch, last one about a month ago
Code of the paper: A Recipe for Watermarking Diffusion Models
Created 2023-03-17
33 commits to main branch, last one about a year ago
AI Verify
Created 2023-06-03
941 commits to main branch, last one 14 days ago
7
108
cc-by-sa-4.0
5
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)
Created 2024-06-09
193 commits to main branch, last one 15 days ago
Official code repo for the O'Reilly Book - Machine Learning for High-Risk Applications
Created 2022-10-07
333 commits to main branch, last one about a year ago
A toolkit for tools and techniques related to the privacy and compliance of AI models.
Created 2021-04-28
149 commits to main branch, last one 4 months ago
[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
Created 2024-02-09
28 commits to main branch, last one about a month ago
A project to add scalable state-of-the-art out-of-distribution detection (open set recognition) support by changing two lines of code! Perform efficient inferences (i.e., do not increase inference tim...
Created 2019-08-16
50 commits to master branch, last one 2 years ago
The official implementation for ICLR23 paper "GNNSafe: Energy-based Out-of-Distribution Detection for Graph Neural Networks"
Created 2023-01-24
15 commits to main branch, last one about a year ago
10
67
apache-2.0
5
[ICCV2021 Oral] Fooling LiDAR by Attacking GPS Trajectory
Created 2020-10-06
27 commits to master branch, last one 2 years ago
[NeurIPS'24 & ICMLW'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
Created 2024-06-06
15 commits to main branch, last one about a month ago
A curated list of awesome academic research, books, code of ethics, data sets, institutes, newsletters, principles, podcasts, reports, tools, regulations and standards related to Responsible, Trustwor...
Created 2021-09-05
270 commits to main branch, last one 3 days ago
[ACM MM22] Towards Robust Video Object Segmentation with Adaptive Object Calibration, ACM Multimedia 2022
Created 2022-07-01
43 commits to main branch, last one about a year ago
Principal Image Sections Mapping. Convolutional Neural Network Visualisation and Explanation Framework
Created 2021-01-22
44 commits to master branch, last one about a year ago
A project to improve out-of-distribution detection (open set recognition) and uncertainty estimation by changing a few lines of code in your project! Perform efficient inferences (i.e., do not increas...
Created 2022-05-10
40 commits to master branch, last one 2 years ago
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models. ECCV 2024
Created 2023-11-27
24 commits to main branch, last one 3 months ago
A curated list of valuable resources from our studies at the University of Tehran (UT), School of Electrical and Computer Engineering (ECE)
Created 2024-07-11
66 commits to main branch, last one 2 months ago
3
39
bsd-2-clause
8
Official code of "StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis" (CVPR 2022)
Created 2022-03-22
8 commits to main branch, last one 2 years ago
Code of the paper: Finetuning Text-to-Image Diffusion Models for Fairness
Created 2023-12-03
6 commits to main branch, last one 6 months ago
10
36
unknown
2
[TPAMI, 2023] Fear-Neuro-Inspired Reinforcement Learning for Safe Autonomous Driving
Created 2023-06-30
32 commits to master branch, last one 11 months ago