Implicit Personalization in Language Models Collection Works on detecting, attributing and controlling implicit personalization in language models • 20 items • Updated 14 days ago • 1
Interpreting Language Models Through Concept Descriptions: A Survey Paper • 2510.01048 • Published Oct 1 • 2
RelP: Faithful and Efficient Circuit Discovery via Relevance Patching Paper • 2508.21258 • Published Aug 28 • 3
Infherno: End-to-end Agent-based FHIR Resource Synthesis from Free-form Clinical Notes Paper • 2507.12261 • Published Jul 16 • 1
Table Understanding and (Multimodal) LLMs: A Cross-Domain Case Study on Scientific vs. Non-Scientific Data Paper • 2507.00152 • Published Jun 30 • 1
Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework Paper • 2506.15538 • Published Jun 18 • 1
ELI-Why Collection 🧠 ELI-Why: Evaluating the Pedagogical Utility of Language Model Explanations ACL Findings 2025 • 4 items • Updated Jun 11 • 3
Through a Compressed Lens: Investigating the Impact of Quantization on LLM Explainability and Interpretability Paper • 2505.13963 • Published May 20 • 1
Truth or Twist? Optimal Model Selection for Reliable Label Flipping Evaluation in LLM-based Counterfactuals Paper • 2505.13972 • Published May 20 • 1
Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods Paper • 2505.01198 • Published May 2 • 2
Do Large Language Models Latently Perform Multi-Hop Reasoning? Paper • 2402.16837 • Published Feb 26, 2024 • 28
Enhancing Automated Interpretability with Output-Centric Feature Descriptions Paper • 2501.08319 • Published Jan 14 • 11
QE4PE: Word-level Quality Estimation for Human Post-Editing Paper • 2503.03044 • Published Mar 4 • 6
A Primer on the Inner Workings of Transformer-based Language Models Paper • 2405.00208 • Published Apr 30, 2024 • 12
Interpretability & Analysis of LMs Collection Outstanding research in LM interpretability and evaluation, summarized • 135 items • Updated 10 days ago • 116