Releases: adamkarvonen/SAEBench
v0.4.0
v0.4.0 (2025-02-22)
Chore
- chore: making test less flaky (3effa18)
- chore: fix updated torch types (4c46da6)
- chore: fixing linting errors and adding precommit hook (85f6241)
Feature
- feat: allow setting the artifacts path (2a4b4dc)
Fix
- fix: gracefully handle slashes in model filename for autointerp (5d6464a)
- fix: fix typing and updating mdl for saelens >=5.4.0 (802d1c3)
- fix: load probe class with weights_only = False (f05bf40)
- fix: Update README to include eval output schema update instructions (f0adee2)
- fix: Update json schema jsons (2b2a6d3)
Unknown
- Merge pull request #60 from chanind/deflaking-test
  chore: making test less flaky (963f2e8)
- Remove threshold from state dict if we aren't using it (d91a218)
- Merge pull request #59 from chanind/artifacts-path-option
  feat: allow setting the artifacts path (53901a2)
- Merge pull request #58 from chanind/fixing-types
  chore: fix updated torch types (849018f)
- Merge pull request #57 from chanind/fix-slash-in-model-name-autointerp
  fix: gracefully handle slashes in model filename for autointerp (11b2e38)
- adding artifacts_path to unlearning eval (ce1de32)
- By default we don't use a threshold for custom topk SAEs (60579ed)
- Merge pull request #56 from chanind/type-fixes
  fix: fix typing and updating mdl for saelens >=5.4.0 (0888d07)
- Merge pull request #55 from chanind/precommit-check
  chore: fixing linting errors and adding precommit hook (7ac7ced)
- Fix SAE Bench SAEs repo names (18dc457)
- Prevent potential division by zero (92315dd)
- Add optional pinned dependencies (e74f0cf)
- Calculate featurewise statistics in demo (5204b48)
- Improve documentation on custom SAE usage (f15fe53)
- Merge pull request #53 from adamkarvonen/hide_absorption_stddev
  hide stddev from default display for absorption (155afbc)
- hide stddev from default display for absorption (d970f05)
- Merge pull request #52 from adamkarvonen/update_scr_tpp
  update scr_tpp_schema to show top 20 by default (f551e7b)
- update scr_tpp_schema to show top 20 by default (59320e2)
- Merge pull request #51 from adamkarvonen/update_schema_jsons
  fix: Update eval output schema jsons (7b2021c)
- Add computational requirements (9b621a9)
- Improve graphing notebook, include matryoshka results in graphs (f2d1d98)
- Merge pull request #50 from chanind/lint-and-type-check
  chore: Adding formatting, linting and type checking (a0fb5e9)
- adding README and Makefile with helpers (7452eca)
- fixing linting and type-checking issues (e663e3a)
- formatting with ruff (14dad45)
- Check that unlearning data exists before running unlearning eval (294b25c)
- Improve export notebook (e2b0b3c)
- Improve graphing utils (661920d)
- Fix spelling (8c0df93)
- Add standard deviation for absorption / autointerp, store results per class for sparse probing / tpp for potential error bars (141aff7)
- Use GPU probing in correct location (ec5efa8)
v0.3.2
v0.3.1
v0.3.0
v0.3.0 (2025-01-13)
Feature
- feat: Add a frac alive calculation to core (0399550)
Unknown
- added absorption fraction metric (#48)
  feat: added absorption fraction metric
- Small fixes
- remove unused FeatureAbsorptionCalculator._filter_prompts function
  Co-authored-by: Demian Till (7545ee3)
v0.2.0
v0.1.0
v0.1.0 (2025-01-09)
Feature
- feat: EvalOutput and EvalConfig base classes to allow easy JSON schema export (537219a)
Fix
- fix: eval_result_unstructured should be optional (38e81b0)
- fix: dump to json file correctly (5f1cf15)
Unknown
- feat: Setting up Python packaging and autodeploy with Semantic Release (e52a418)
- Merge branch 'main' into packaging (9bc22a4)
- Merge branch 'main' into packaging (bb10234)
- Update SAE Bench demo to use new graphing functions (9bbfdc5)
- switching to poetry and setting up CI (a9af271)
- Add option to pass in arbitrary sae_class (e450661)
- Mention dictionary_learning (c140e71)
- Update graphing notebook to work with filenames (dc6f951)
- deprecate graphing notebook (67118ee)
- migrating to sae_bench base dir (bb8e145)
- Use a smaller batch size for unlearning (3a099d2)
- Reduce memory usage by only caching required activations (f026998)
- Remove debugging check (8ea7162)
- Add sanity checks before major run (0908b18)
- Improve normalization check (16a3c0e)
- Add normalization for batchtopk SAEs (6a031bd)
- Add matroyshka loader (1078899)
- Add pythia 160m (b219497)
- simplify process of evaluating dictionary learning SAEs (c2dca52)
- Add a script to run evals on dictionary learning SAEs (3f4139b)
- Make the layer argument optional (e53675d)
- Add batch_top_k, top_k, gated, and jump_relu implementations (9a7fce8)
- Add a function to test the saes (864b4b3)
- Update demo for new relu sae setup (5d04ce5)
- Ensure loaded SAEs are on correct dtype and device (a5d6d62)
- Create a base SAE class (8fcc9fe)
- Add blog post link (2d47229)
- cleanup README (0e724df)
- Clean up graphing notebook (c08f3f5)
- Graph results for all evals in demo notebook (29ac97b)
- Clean up for release (1c9822c)
- Include baseline pca in every graph. (a45afd2)
- Clean up plot legends, support graphing subplots (7ade8b0)
- Merge pull request #45 from adamkarvonen/update_jsonschemas
  update jsonschemas (879c7ca)
- update jsonschemas (a14d465)
- Use notebook as default demo, mention in README (298796b)
- Minor fixes to demo (05808c7)
- Add missing batch size argument (877f2e7)
- Fixes for changes to eval config formats (e0cb629)
- Add an optional best of k graphing cell (081b59c)
- Ignore any folder containing "eval_results" (12f8d66)
- Add cell to add training tokens to config dictionaries (38173c9)
- Also plot all sae bench checkpoints (93563e0)
- Add eval links (2216f99)
- rename core results to match convention (51e47fd)
- Ignore autointerp with generations when downloading (aa20644)
- Use != instead of > for L0 measurement (83504b7)
- Add utility cell for removing llm generations (67c9b03)
- Add utility cell for splitting up files by release name (3cc51ea)
- Add force rerun option to core, match sae loading to other evals (8676d5d)
- Improve plotting of results (89e5567)
- Consolidate SAE loading and output locations (293b385)
- Plot generator for SAE Bench (c2cb78e)
- Add utility notebook for adding sae configs (8508a01)
- Improve custom SAE usage (e959f65)
- Improve graphing (490cd2a)
- Fix failing tests (ed88f65)
- match core output filename with others (8ca0787)
- Remove del sae flag (feaf1f8)
- Add current status to repo (9c95af7)
- Add sae config to output file (b2fbd6d)
- Add a flag for k sparse probing batch size (6f2e38f)
- Merge pull request #44 from adamkarvonen/absorption-tweaks-2
  improving memory usage of k-sparse probing (6ae8235)
- Merge pull request #43 from adamkarvonen/fake_branch
single...