Statistics for topic interpretability
RepositoryStats tracks 584,797 GitHub repositories; 166 of these are tagged with the interpretability topic. The most common primary language for repositories using this topic is Python (82), followed by Jupyter Notebook (46).
Stargazers over time for topic interpretability (chart)
Most starred repositories for topic interpretability
A game theoretic approach to explain the output of any machine learning model. (A minimal usage sketch follows this list.)
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, classification, object detection, segmentation, image similarity, and more. (A generic class-activation-map sketch follows this list.)
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions
For the OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research. (A minimal SAE training sketch follows this list.)
Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction (CVPR 2024).
This repository introduces MentaLLaMA, the first open-source instruction-following large language model for interpretable mental health analysis.
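The first entry above is a game-theoretic (Shapley-value) explainer. As a minimal sketch of how such a tool is typically driven, assuming the widely used shap package together with scikit-learn; the listing does not preserve repository names, so this library choice is illustrative rather than confirmed:

```python
# Minimal Shapley-value explanation sketch; `shap` and scikit-learn are
# assumed dependencies (pip install shap scikit-learn), chosen for illustration.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

# Toy regression data: y depends strongly on feature 0, weakly on feature 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])

# Each row of attributions plus the expected value recovers the prediction:
# prediction ≈ expected_value + sum(shap_values[i])
print(shap_values.shape)  # (5, 3): one attribution per sample per feature
print(explainer.expected_value)
```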
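The computer-vision explainability entry covers techniques in the class-activation-mapping family. A generic Grad-CAM sketch, assuming a torchvision ResNet-18 with its layer4 block as the target layer (illustrative assumptions, not that repository's API):

```python
# Generic Grad-CAM sketch: weight a conv layer's feature maps by the
# spatially pooled gradient of a class score. Model and layer are assumptions.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # untrained net, no download
feats, grads = {}, {}

# Capture the target layer's activations and the gradient flowing into them.
model.layer4.register_forward_hook(lambda m, i, o: feats.update(v=o.detach()))
model.layer4.register_full_backward_hook(
    lambda m, gi, go: grads.update(v=go[0].detach())
)

x = torch.randn(1, 3, 224, 224)   # stand-in for a preprocessed image
score = model(x)[0, 281]          # arbitrary target class logit
score.backward()

w = grads["v"].mean(dim=(2, 3), keepdim=True)   # pooled gradients per channel
cam = torch.relu((w * feats["v"]).sum(dim=1))   # weighted sum of feature maps
cam = cam / (cam.max() + 1e-8)                  # normalize to [0, 1]
print(cam.shape)                                # (1, 7, 7) localization map
```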
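The SAE entry concerns training sparse autoencoders on model activations to recover interpretable features. A minimal generic sketch, assuming a single-layer ReLU encoder/decoder with an L1 sparsity penalty; this describes the general technique, not that repository's implementation:

```python
# Minimal sparse autoencoder (SAE) sketch in PyTorch. Architecture and
# penalty are generic assumptions, not a specific repository's design.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        # ReLU keeps feature activations non-negative; with the L1 penalty
        # below this encourages sparse, more interpretable features.
        f = torch.relu(self.encoder(x))
        return self.decoder(f), f

sae = SparseAutoencoder(d_model=64, d_hidden=256)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_coef = 1e-3

acts = torch.randn(1024, 64)  # stand-in for cached model activations
for _ in range(100):
    x_hat, f = sae(acts)
    # Reconstruction error plus a sparsity penalty on feature activations.
    loss = ((x_hat - acts) ** 2).mean() + l1_coef * f.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```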
Trending repositories for topic interpretability
This is the official repository for HypoGeniC (Hypothesis Generation in Context), an automated, data-driven tool that leverages large language models to generate hypotheses for open-domain re...
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
[NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models".
A toolkit for quantitative evaluation of data attribution methods.
A JAX research toolkit for building, editing, and visualizing neural networks.
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction. (See the sketch after this list.)
An awesome repository & comprehensive survey on the interpretability of LLM attention heads.
Decomposing and Editing Predictions by Modeling Model Computation
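"The Truth Is In There" studies layer-selective rank reduction: replacing selected weight matrices with low-rank SVD truncations. A minimal sketch of that operation on one linear layer, assuming PyTorch (the layer shape and rank here are illustrative):

```python
# Low-rank weight truncation sketch: replace a weight matrix with its best
# rank-k approximation via SVD. Layer size and rank are illustrative choices.
import torch
import torch.nn as nn

def rank_reduce_(linear: nn.Linear, rank: int) -> None:
    """Replace linear.weight in place with its best rank-`rank` approximation."""
    with torch.no_grad():
        U, S, Vh = torch.linalg.svd(linear.weight, full_matrices=False)
        # Keep only the top-`rank` singular components (Eckart–Young optimum).
        W_low = U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]
        linear.weight.copy_(W_low)

layer = nn.Linear(512, 512)
rank_reduce_(layer, rank=64)
print(torch.linalg.matrix_rank(layer.weight.detach()))  # ≈ 64
```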