2 results found

This repository collects all relevant resources about interpretability in LLMs.
Created 2024-06-30
56 commits to main branch, last one 6 days ago
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
Created 2024-02-17
13 commits to main branch, last one about a month ago