Trending repositories for topic interpretability
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
A game theoretic approach to explain the output of any machine learning model.
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libr...
Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)
Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models
A curated list of resources for activation engineering
[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct
Fit interpretable models. Explain blackbox machine learning.
An awesome repository and a comprehensive survey on interpretability of LLM attention heads.
The nnsight package enables interpreting and manipulating the internals of deep learned models.
Stanford NLP Python library for understanding and improving PyTorch models via interventions
A curated list of awesome responsible machine learning resources.
Stanford NLP Python library for Representation Finetuning (ReFT)
Wanna know what your model sees? Here's a package for applying EigenCAM and generating heatmaps from the new YOLO V11 model
Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP.
A collection of research materials on explainable AI/ML
[ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation"
Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction - CVPR 2024
An Open-Source Library for the interpretability of time series classifiers
This repository introduces MentaLLaMA, the first open-source instruction following large language model for interpretable mental health analysis.
Adversarial attacks on explanations and how to defend them
A JAX research toolkit for building, editing, and visualizing neural networks.
Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs)
For OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research.
PyTorch Implementation of CausalFormer: An Interpretable Transformer for Temporal Causal Discovery
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring 2025). LLM architectures, training paradigms (pre- and post-training, alignment), test-time computation, reasoning, safety a...
The source code of the paper: Trend attention fully convolutional network for remaining useful life estimation in the turbofan engine PHM of CMAPSS dataset. Signal selection, Attention mechanism, and Inte...
This is the official repository for HypoGeniC (Hypothesis Generation in Context) and HypoRefine, which are automated, data-driven tools that leverage large language models to generate hypothesis for o...
A system for automating the design of predictive modeling pipelines tailored for clinical prognosis.
[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
[NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models".
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
ConceptVectors Benchmark and Code for the paper "Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces"
Interpretable ML package for concise, transparent, and accurate predictive modeling (sklearn-compatible).
[ICCV 2021 Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-bas...
Decomposing and Editing Predictions by Modeling Model Computation
[NeurIPS 2024] Knowledge Circuits in Pretrained Transformers
ADHDeepNet is a model that integrates temporal and spatial characterization, attention modules, and explainability techniques, optimized for EEG-based ADHD diagnosis. Neural Architecture Search (NAS), ...
Transparent medical image AI via an image-text foundation model grounded in medical literature
TrustyAI Explainability Toolkit
Time series explainability via self-supervised model behavior consistency
Stanford NLP Python library for understanding and improving PyTorch models via interventions
CausalGym: Benchmarking causal interpretability methods on linguistic tasks
Generating and validating natural-language explanations for the brain.
Sparse and discrete interpretability tool for neural networks