Matthew Kowal

AI Interpretability Researcher

Incoming researcher at Goodfire.AI, working on mechanistic interpretability. Previously a Member of Technical Staff at FAR.AI, working on AI safety with a focus on interpretability and LLM persuasion. I completed my PhD at York University (funded by an NSERC CGS-D scholarship), studying the interpretability of multimodal and video understanding systems under the supervision of Kosta Derpanis. Past internships at Toyota Research Institute and Ubisoft La Forge.

Research

Concept Influence: Leveraging Interpretability to Improve Performance and Efficiency in Training Data Attribution
Preprint

We leverage interpretability methods to improve the performance and efficiency of training data attribution.

Interpreting Physics in Video World Models
Preprint

We interpret how video world models learn and represent physical concepts.

Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry
ICLR 2026

A deep dive into the semantics and geometry of concepts in large vision models.

It's the Thought that Counts: Evaluating the Attempts of Frontier LLMs to Persuade on Harmful Topics
Preprint

We introduce the AttemptPersuadeEval (APE) and show that frontier models will attempt to persuade users on harmful topics.

Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
ICML 2025

We train SAEs to discover universal and unique concepts across different models.

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
ICML 2025

We design SAEs that resolve the instability of learned dictionaries across different training runs.

Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models
CVPR 2024 Highlight

Unsupervised discovery of concepts and their interlayer connections.

Understanding Video Transformers via Universal Concept Discovery
CVPR 2024 Highlight

We discover universal spatiotemporal concepts in video transformers.

A Deeper Dive Into What Deep Spatiotemporal Networks Encode
CVPR 2022

We develop a new metric for quantifying static and dynamic information in deep spatiotemporal models.

Global Pooling, More than Meets the Eye: Position Information is Encoded Channel-Wise in CNNs
ICCV 2021

We show how spatial position information is encoded along the channel dimensions after pooling layers.

Shape or Texture: Understanding Discriminative Features in CNNs
ICLR 2021

We develop a new metric for quantifying the shape and texture information encoded in CNNs.

Feature Binding with Category-Dependent MixUp for Semantic Segmentation and Adversarial Robustness
BMVC 2020 Oral

Source-separation augmentation improves semantic segmentation and adversarial robustness.

News

2026
Joining Goodfire.AI to work on mechanistic interpretability!
Aug 2025
Officially defended my PhD!
May 2025
Two papers accepted to ICML 2025, both improving interpretability methods with SAEs: Universal SAEs and Archetypal SAE!
2024
Joined FAR.AI as a Research Resident working on understanding LLM persuasive capabilities.
2024
Gave invited talks at David Bau's Lab (Northeastern) and Thomas Serre's Lab (Brown).
2024
Paper accepted to TPAMI: Quantifying and Learning Static vs. Dynamic Information in Deep Spatiotemporal Networks.
2024
Two papers accepted as Highlights at CVPR 2024: VTCD and VCC!
2023
Research internship at Toyota Research Institute in Palo Alto.
2023
Awarded the NSERC CGS-D Scholarship ($105,000).
2022
Paper accepted to CVPR 2022, with a Spotlight at the XAI4CV Workshop.
2021
Papers accepted to ICCV 2021 and ICLR 2021.
2020
Paper accepted as an Oral at BMVC 2020.