Working version of faithfulness dashboard with SAE directions!
samuelstevens committed Nov 23, 2024
1 parent 6407c04 commit 7d346f4
Showing 7 changed files with 523 additions and 256 deletions.
22 changes: 22 additions & 0 deletions contrib/faithfulness/README.md
@@ -0,0 +1,22 @@
# Faithfulness

This module demonstrates that SAE features are faithful: the underlying vision model does in fact depend on these features to make its predictions.

It demonstrates this through an interactive dashboard and through larger-scale quantitative experiments.

## Dashboard

First, record activations for the ADE20K dataset.

```sh
uv run python -m saev activations \
--model-group clip \
--model-ckpt ViT-B-16/openai \
--d-vit 768 \
--n-patches-per-img 196 \
--layers -2 \
--dump-to /local/scratch/$USER/cache/saev \
--n-patches-per-shard 2_4000_000 \
data:ade20k-dataset \
--data.root /research/nfs_su_809/workspace/stevens.994/datasets/ade20k/images
```
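The numeric flags in the command above can be sanity-checked with a short sketch. It assumes a 224x224 input image (not stated in this README) with the 16x16 patches implied by `ViT-B-16`, which gives the 196 patches passed to `--n-patches-per-img`. It also shows how Python's `int()` reads the literal `2_4000_000`, and estimates the size of one shard under the further assumption that each patch is stored as a 768-wide float32 vector. The helper names `n_patches` and `shard_bytes` are hypothetical and are not part of `saev`; how `saev` actually parses the flag and lays out shards on disk is not shown here.

```python
def n_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def shard_bytes(n_patches_per_shard: int, d_vit: int, bytes_per_value: int = 4) -> int:
    """Rough size of one shard, assuming one d_vit-wide vector per patch.

    bytes_per_value=4 assumes float32 storage (an assumption, not from the README).
    """
    return n_patches_per_shard * d_vit * bytes_per_value


# Assumed: 224x224 input with 16x16 patches (ViT-B/16).
patches = n_patches(224, 16)

# Python's int() accepts single underscores between digits, whatever the grouping.
shard_patches = int("2_4000_000")

print(f"patches per image: {patches}")
print(f"patches per shard: {shard_patches:,}")
print(f"approx. shard size: {shard_bytes(shard_patches, 768) / 1e9:.1f} GB")
```

If the estimated shard size is larger than expected for the scratch disk, the digit grouping in `--n-patches-per-shard` is worth double-checking before running the command.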