Nils Feldhus

nfel

https://nfelnlp.github.io

AI & ML interests

Interpretability, Explainability, Natural Language Generation

Recent Activity

upvoted a collection 27 days ago

👤 Implicit Personalization in Language Models

liked a Space 29 days ago

aaron0eidt/ELIA

authored a paper 3 months ago

Interpreting Language Models Through Concept Descriptions: A Survey

Organizations

upvoted a collection 27 days ago

👤 Implicit Personalization in Language Models

Collection

Works on detecting, attributing and controlling implicit personalization in language models • 20 items • Updated 14 days ago • 1

upvoted a paper 3 months ago

Interpreting Language Models Through Concept Descriptions: A Survey

Paper • 2510.01048 • Published Oct 1 • 2

upvoted a paper 4 months ago

RelP: Faithful and Efficient Circuit Discovery via Relevance Patching

Paper • 2508.21258 • Published Aug 28 • 3

upvoted a paper 5 months ago

Infherno: End-to-end Agent-based FHIR Resource Synthesis from Free-form Clinical Notes

Paper • 2507.12261 • Published Jul 16 • 1

upvoted 3 papers 6 months ago

Table Understanding and (Multimodal) LLMs: A Cross-Domain Case Study on Scientific vs. Non-Scientific Data

Paper • 2507.00152 • Published Jun 30 • 1

Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework

Paper • 2506.15538 • Published Jun 18 • 1

GeistBERT: Breathing Life into German NLP

Paper • 2506.11903 • Published Jun 13 • 4

upvoted a collection 6 months ago

ELI-Why

Collection

🧠 ELI-Why: Evaluating the Pedagogical Utility of Language Model Explanations ACL Findings 2025 • 4 items • Updated Jun 11 • 3

upvoted 2 papers 7 months ago

Through a Compressed Lens: Investigating the Impact of Quantization on LLM Explainability and Interpretability

Paper • 2505.13963 • Published May 20 • 1

Truth or Twist? Optimal Model Selection for Reliable Label Flipping Evaluation in LLM-based Counterfactuals

Paper • 2505.13972 • Published May 20 • 1

upvoted 2 papers 8 months ago

Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods

Paper • 2505.01198 • Published May 2 • 2

Do Large Language Models Latently Perform Multi-Hop Reasoning?

Paper • 2402.16837 • Published Feb 26, 2024 • 28

upvoted a paper 9 months ago

Enhancing Automated Interpretability with Output-Centric Feature Descriptions

Paper • 2501.08319 • Published Jan 14 • 11

upvoted a paper 10 months ago

QE4PE: Word-level Quality Estimation for Human Post-Editing

Paper • 2503.03044 • Published Mar 4 • 6

upvoted an article about 1 year ago

Article

What We Learned About LLM/VLMs in Healthcare AI Evaluation:

Nov 8, 2024

upvoted a paper over 1 year ago

A Primer on the Inner Workings of Transformer-based Language Models

Paper • 2405.00208 • Published Apr 30, 2024 • 12

upvoted a collection almost 2 years ago

🔍 Interpretability & Analysis of LMs

Collection

Outstanding research in LM interpretability and evaluation, summarized • 135 items • Updated 10 days ago • 116
