
[Bug Report] b_dec_init_method="geometric_median" incompatible with normalize_activations="expected_average_only_in" #439

Open
keltin13 opened this issue Mar 1, 2025 · 0 comments · May be fixed by #440

keltin13 commented Mar 1, 2025


Describe the bug
Setting b_dec_init_method="geometric_median" and normalize_activations="expected_average_only_in" in the LanguageModelSAERunnerConfig causes an error because estimated_norm_scaling_factor has not been set.

Code example

You can either reproduce this in the test_sae_training_runner_works_with_huggingface_models test in tests/training/test_sae_training_runner.py, or in the "Training a Sparse Autoencoder" notebook.

To reproduce in the tests, add these lines to the build_sae_cfg call in test_sae_training_runner_works_with_huggingface_models (tests/training/test_sae_training_runner.py):

        normalize_activations="expected_average_only_in",
        b_dec_init_method="geometric_median",

To reproduce in the "Training a Sparse Autoencoder" tutorial, just set b_dec_init_method="geometric_median". With normalize_activations set to any value other than "expected_average_only_in", the notebook runs without error; with normalize_activations="expected_average_only_in", it raises the error below.
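Equivalently, a minimal config fragment that triggers the bug (sae-lens assumed installed; the model/dataset fields are omitted here and should be filled in as in the tutorial, so this is a fragment rather than a complete script):

```python
# Config fragment only -- fill in model/dataset fields as in the tutorial.
from sae_lens import LanguageModelSAERunnerConfig, SAETrainingRunner

cfg = LanguageModelSAERunnerConfig(
    # ... model, dataset, and training fields as in the tutorial ...
    b_dec_init_method="geometric_median",
    normalize_activations="expected_average_only_in",
)
sparse_autoencoder = SAETrainingRunner(cfg).run()  # raises ValueError (see below)
```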

Both ways result in the same error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-7c36e6ae5326> in <cell line: 0>()
     60 )
     61 # look at the next cell to see some instruction for what to do while this is running.
---> 62 sparse_autoencoder = SAETrainingRunner(cfg).run()

/usr/local/lib/python3.11/dist-packages/sae_lens/sae_training_runner.py in __init__(self, cfg, override_dataset, override_model, override_sae)
     83                     )
     84                 )
---> 85                 self._init_sae_group_b_decs()
     86         else:
     87             self.sae = override_sae

/usr/local/lib/python3.11/dist-packages/sae_lens/sae_training_runner.py in _init_sae_group_b_decs(self)
    169 
    170         if self.cfg.b_dec_init_method == "geometric_median":
--> 171             layer_acts = self.activations_store.storage_buffer.detach()[:, 0, :]
    172             # get geometric median of the activations if we're using those.
    173             median = compute_geometric_median(

/usr/local/lib/python3.11/dist-packages/sae_lens/training/activations_store.py in storage_buffer(self)
    476         if self._storage_buffer is None:
    477             self._storage_buffer = _filter_buffer_acts(
--> 478                 self.get_buffer(self.half_buffer_size), self.exclude_special_tokens
    479             )
    480 

/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)
    114     def decorate_context(*args, **kwargs):
    115         with ctx_factory():
--> 116             return func(*args, **kwargs)
    117 
    118     return decorate_context

/usr/local/lib/python3.11/dist-packages/sae_lens/training/activations_store.py in get_buffer(self, n_batches_in_buffer, raise_on_epoch_end, shuffle)
    707         # every buffer should be normalized:
    708         if self.normalize_activations == "expected_average_only_in":
--> 709             new_buffer_activations = self.apply_norm_scaling_factor(
    710                 new_buffer_activations
    711             )

/usr/local/lib/python3.11/dist-packages/sae_lens/training/activations_store.py in apply_norm_scaling_factor(self, activations)
    423     def apply_norm_scaling_factor(self, activations: torch.Tensor) -> torch.Tensor:
    424         if self.estimated_norm_scaling_factor is None:
--> 425             raise ValueError(
    426                 "estimated_norm_scaling_factor is not set, call set_norm_scaling_factor_if_needed() first"
    427             )

ValueError: estimated_norm_scaling_factor is not set, call set_norm_scaling_factor_if_needed() first
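For context, here is a self-contained mock of the ordering issue (plain Python, not sae_lens itself; the class and method names mirror the library, but the bodies are simplified assumptions). The point is that the geometric_median b_dec init reads storage_buffer, which lazily fills and normalizes the buffer before the scaling factor has been estimated:

```python
# Mocked sketch of the initialization-order issue -- NOT sae_lens code.
# Method names mirror ActivationsStore; bodies are simplified assumptions.

class MockActivationsStore:
    def __init__(self, normalize_activations):
        self.normalize_activations = normalize_activations
        self.estimated_norm_scaling_factor = None  # not estimated yet
        self._storage_buffer = None

    def apply_norm_scaling_factor(self, activations):
        # Mirrors the check that raises in the traceback above.
        if self.estimated_norm_scaling_factor is None:
            raise ValueError(
                "estimated_norm_scaling_factor is not set, "
                "call set_norm_scaling_factor_if_needed() first"
            )
        return [a * self.estimated_norm_scaling_factor for a in activations]

    def set_norm_scaling_factor_if_needed(self):
        if self.normalize_activations == "expected_average_only_in":
            self.estimated_norm_scaling_factor = 2.0  # placeholder estimate

    def get_buffer(self):
        buffer = [1.0] * 8  # stand-in for a batch of activations
        if self.normalize_activations == "expected_average_only_in":
            buffer = self.apply_norm_scaling_factor(buffer)
        return buffer

    @property
    def storage_buffer(self):
        # Lazily fills the buffer, which triggers normalization.
        if self._storage_buffer is None:
            self._storage_buffer = self.get_buffer()
        return self._storage_buffer


store = MockActivationsStore("expected_average_only_in")
try:
    _ = store.storage_buffer  # geometric_median b_dec init reads this too early
except ValueError as e:
    print(f"ValueError: {e}")

store.set_norm_scaling_factor_if_needed()  # estimating the factor first avoids it
print(len(store.storage_buffer))  # prints 8
```

In this sketch, estimating the scaling factor before the first storage_buffer access avoids the error, which matches the message's suggestion to call set_norm_scaling_factor_if_needed() first.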

System Info

On my machine:

  • sae-lens was cloned from GitHub and installed with Poetry as specified by the contributing guidelines.
  • macOS
  • Python 3.10.10

On Colab:

  • sae-lens (v5.5.2) and transformer-lens (v2.15.0) were installed via pip
  • Linux
  • Python 3.11.11

Additional context

None

Checklist

  • I have checked that there is no similar issue in the repo (required)