Welcome! This is a viewer for the sparse autoencoder features trained in this paper.

Pick a feature:

Interesting features:

GPT-4

Technical knowledge

GPT-2 small

Safety-relevant features (found via attribution methods)