5 results found

A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.
This repository has been archived
Created 2024-11-21
3 commits to main branch, last one 4 months ago
45
562
mit
5
A library for making RepE control vectors
Created 2024-01-21
28 commits to main branch, last one 2 months ago
This repository collects all relevant resources about interpretability in LLMs
Created 2024-06-30
56 commits to main branch, last one 5 months ago
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
Created 2024-02-17
13 commits to main branch, last one 5 months ago
5
41
apache-2.0
8
SANSA - sparse EASE for millions of items
Created 2023-07-11
71 commits to main branch, last one 2 months ago