Research

Fundamental interpretability research to understand and intentionally design advanced AI systems

August 28, 2025

Finding the Tree of Life in Evo 2

Michael Pearce

Elana Simon

Michael Byun

Daniel Balsam

August 21, 2025

Discovering Undesired Rare Behaviors via Model Diff Amplification

Santiago Aranguri

Thomas McGrath

August 5, 2025

The Circuits Research Landscape: Results and Perspectives

Jack Lindsey

Emmanuel Ameisen

Neel Nanda

Stepan Shabalin

Mateusz Piotrowski

June 28, 2025

Towards Scalable Parameter Decomposition

Lucius Bushnaq

Dan Braun

Lee Sharkey

June 11, 2025

Replicating Circuit Tracing for a Simple Known Mechanism

Max Loeffler

Owen Lewis

Thomas McGrath

Connor Watts

Jack Merullo

May 27, 2025

Painting With Concepts Using Diffusion Model Latents

Nick Cammarata

Mark Bissell

Nam Nguyen

Max Loeffler

Eric Ho

April 15, 2025

Under the Hood of a Reasoning Model

Dron Hazra

Max Loeffler

Murat Cubuktepe

Levon Avagyan

Liv Gorton

February 20, 2025

Interpreting Evo 2: Arc Institute's Next-Generation Genomic Foundation Model

Liv Gorton

Nicholas Wang

Nam Nguyen

Myra Deng

Eric Ho

December 23, 2024

Mapping the Latent Space of Llama 3.3 70B

Thomas McGrath

Daniel Balsam

Liv Gorton

Murat Cubuktepe

Myra Deng

September 25, 2024

Understanding and Steering Llama 3 with Sparse Autoencoders

Thomas McGrath

Daniel Balsam

Myra Deng

Eric Ho

