saev - Sparse Auto-Encoders for Vision

Implementation of sparse autoencoders (SAEs) for vision transformers (ViTs) in PyTorch.

About

saev is a package for training sparse autoencoders (SAEs) on vision transformers (ViTs) in PyTorch. It also includes an interactive webapp for looking through a trained SAE's features.
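
For readers new to SAEs, here is a minimal sketch of the idea in PyTorch. It is illustrative only, not saev's actual architecture or training loss; names like d_vit, d_hidden, and l1_coeff are placeholders.

```python
import torch
import torch.nn as nn


class TinySparseAutoencoder(nn.Module):
    """Illustrative sparse autoencoder over ViT activations (not saev's implementation)."""

    def __init__(self, d_vit: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_vit, d_hidden)  # overcomplete: d_hidden >> d_vit
        self.decoder = nn.Linear(d_hidden, d_vit)

    def forward(self, x: torch.Tensor):
        f = torch.relu(self.encoder(x))  # sparse feature activations
        x_hat = self.decoder(f)          # reconstruction of the ViT activation
        return x_hat, f


def sae_loss(x, x_hat, f, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that encourages sparse codes.
    return (x - x_hat).pow(2).mean() + l1_coeff * f.abs().mean()
```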

Originally forked from HugoFry, who forked it from Joseph Bloom.

Read logbook.md for a detailed log of my thought process.

See related-work.md for a list of works training SAEs on vision models. Please open an issue or a PR if any work is missing.

Installation

Installation is supported with uv. saev will likely work with plain pip, conda, etc., but I will not formally support them.

To install, clone this repository (maybe fork it first if you want).

In the project root directory, run uv run python -m saev --help. The first invocation should create a virtual environment and show a help message.
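
Concretely, the steps look something like this (a sketch; the clone URL assumes HTTPS cloning of the repository path above):

```sh
# Clone the repository (or your fork) and enter the project root.
git clone https://github.com/samuelstevens/saev.git
cd saev

# The first invocation creates a virtual environment and prints the CLI help.
uv run python -m saev --help
```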

Using saev

See the docs for an overview.

I recommend using the llms.txt file to ask questions with any LLM provider. For example, you can run curl https://samuelstevens.me/saev/llms.txt | pbcopy on macOS to copy the text, then paste it into https://claude.ai and ask any question you have.
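
The same command as a copy-pasteable snippet, plus an untested Linux equivalent using xclip:

```sh
# macOS: copy the docs bundle to the clipboard, then paste it into your LLM chat.
curl https://samuelstevens.me/saev/llms.txt | pbcopy

# Linux (assumes xclip is installed).
curl https://samuelstevens.me/saev/llms.txt | xclip -selection clipboard
```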

Roadmap

  1. Train models with data scaling (norm, mean) turned on.
  2. Train models on ViT-L/14 datasets.
  3. Semantic segmentation baseline with linear probe.
  4. ADE20K experiment to demonstrate faithfulness.
