Mechanistic Interpretability

[Franco and Crovella, 2025]
 
Gabriel Franco and Mark Crovella (2025).
Pinpointing Attention-Causal Communication in Language Models.
In: Proceedings of NeurIPS. San Diego, CA. doi:TBD
[Franco and Crovella, 2024]
 
Gabriel Franco and Mark Crovella (2024).
Sparse Attention Decomposition Applied to Circuit Tracing.
Technical Report No. 2410.00340. doi:10.48550/arXiv.2410.00340