perf: support early-stopping in HF models #430

chanind · 2025-02-17T23:28:58Z

Description

This PR adds early stopping to the Hugging Face wrapper. It is a performance improvement: when all we want is to extract activations from an intermediate layer, the forward pass can stop at that layer instead of running the whole model.

Fixes #429
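
For readers unfamiliar with the technique, here is a minimal sketch of one common way to stop a PyTorch forward pass early: a forward hook raises an exception that carries the captured activation. It is not taken from this PR's diff. `StopForward`, `run_until_layer`, and the toy `Block` stack are hypothetical names used only for illustration, and the toy model stands in for a real Hugging Face model.

```python
import torch
from torch import nn


class StopForward(Exception):
    """Raised by the hook to abort the forward pass; carries the activation."""

    def __init__(self, activation: torch.Tensor):
        super().__init__("forward pass stopped early")
        self.activation = activation


def run_until_layer(model: nn.Module, layer: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Run `model` on `x`, stopping as soon as `layer` has produced its output."""

    def hook(module: nn.Module, args: tuple, output: torch.Tensor) -> None:
        # Abort the rest of the forward pass once this layer's output exists.
        raise StopForward(output)

    handle = layer.register_forward_hook(hook)
    try:
        with torch.no_grad():
            model(x)
    except StopForward as stop:
        return stop.activation
    finally:
        # Always remove the hook so later forward passes run normally.
        handle.remove()
    raise RuntimeError("layer was never reached during the forward pass")


# Toy stand-in for a stack of decoder blocks (hypothetical, not an HF model).
calls = []


class Block(nn.Module):
    def __init__(self, idx: int):
        super().__init__()
        self.idx = idx
        self.linear = nn.Linear(4, 4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        calls.append(self.idx)  # record which blocks actually ran
        return self.linear(x)


blocks = nn.Sequential(*[Block(i) for i in range(4)])
acts = run_until_layer(blocks, blocks[1], torch.randn(2, 4))
```

After this runs, `calls` contains only the indices of the first two blocks, so the last two were never executed. Carrying the activation on the exception keeps the sketch short; a real wrapper may prefer to cache activations through its existing hook mechanism and use the exception only as a stop signal.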

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update
performance improvement

Checklist:

I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing tests pass locally with my changes
I have not rewritten tests relating to key interfaces which would affect backward compatibility

You have tested formatting, typing and tests

I have run make check-ci to check format and linting. (you can run make format to format code if needed.)

rhaps0dy · 2025-02-17T23:34:09Z

sae_lens/load_model.py

+        stop_hook = None
+        if stop_at_layer is not None:
+            if return_type != "logits":
+                raise NotImplementedError(


The fact that it's a NotImplementedError instead of e.g. ValueError or RuntimeError implies that it can be supported in the future -- is that true?

If it's true, what is it -- returning (None, None) instead?

Yeah, there's nothing keeping this from being implemented in the future - it's just that SAELens currently doesn't use more of the TLens API, so there's no point in implementing support for more options here beyond what SAELens actually uses. This class is just a wrapper around Hugging Face models to make them look like a TLens model for SAELens to work with.
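
The distinction discussed in this thread can be sketched as follows. This is an illustration, not the PR's code: `validate_early_stop_args` and `KNOWN_RETURN_TYPES` are hypothetical, and the set of return types is an assumption modelled on the TLens-style `return_type` argument. The idea is that `ValueError` marks an argument that can never be valid, while `NotImplementedError` marks a combination that is meaningful but not wired up yet.

```python
from typing import Optional

# Assumption: the return types a TLens-style forward() might accept.
KNOWN_RETURN_TYPES = {"logits", "loss", "both", None}


def validate_early_stop_args(
    return_type: Optional[str] = "logits",
    stop_at_layer: Optional[int] = None,
) -> None:
    """Reject invalid arguments and not-yet-supported combinations differently."""
    if return_type not in KNOWN_RETURN_TYPES:
        # Never valid: the caller passed something meaningless.
        raise ValueError(f"unknown return_type: {return_type!r}")
    if stop_at_layer is not None and return_type != "logits":
        # Meaningful, but this wrapper does not support it (yet).
        raise NotImplementedError(
            "stop_at_layer is only supported with return_type='logits'"
        )
```

Callers can then catch `NotImplementedError` to fall back to a full forward pass, while a `ValueError` still surfaces as a bug in the calling code.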

codecov · 2025-02-17T23:38:10Z

Codecov Report

Attention: Patch coverage is 80.95238% with 8 lines in your changes missing coverage. Please review.

Project coverage is 74.55%. Comparing base (8760f9e) to head (c29576d).

Files with missing lines	Patch %	Lines
sae_lens/load_model.py	80.95%	3 Missing and 5 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #430      +/-   ##
==========================================
+ Coverage   74.44%   74.55%   +0.10%     
==========================================
  Files          19       19              
  Lines        3146     3183      +37     
  Branches      456      464       +8     
==========================================
+ Hits         2342     2373      +31     
- Misses        648      650       +2     
- Partials      156      160       +4


anthonyduong9 · 2025-03-02T01:02:24Z

sae_lens/load_model.py

@@ -72,6 +75,7 @@ def __init__(self, model: torch.nn.Module, tokenizer: PreTrainedTokenizerBase):
        super().__init__()
        self.model = model
        self.tokenizer = tokenizer
+        self.decoder_block_matcher = guess_decoder_block_matcher(model)


Is effectively getting the layer names from _guess_block_matcher_from_layers() the best we can do? I'm afraid it's error-prone and implicit.
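
To make the concern concrete, here is a rough sketch of what a name-based heuristic of this kind might look like. It is an assumption for illustration only and does not reproduce the actual `guess_decoder_block_matcher` or `_guess_block_matcher_from_layers` implementations. `guess_block_template` is a hypothetical helper that takes module names (for a torch model, the names yielded by `model.named_modules()`) and returns a template only when one indexed prefix clearly dominates.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Matches module names that end in a numeric index, e.g. "model.layers.12".
_INDEXED = re.compile(r"^(?P<prefix>.*?)\.(?P<num>\d+)$")


def guess_block_template(module_names: Iterable[str]) -> Optional[str]:
    """Guess a "prefix.{num}" template for the repeated blocks, or None if unclear."""
    counts: Counter = Counter()
    for name in module_names:
        match = _INDEXED.match(name)
        if match:
            counts[match.group("prefix")] += 1
    if not counts:
        return None  # no indexed modules at all
    (best, n_best), *rest = counts.most_common(2)
    if rest and rest[0][1] == n_best:
        return None  # two prefixes tie: refuse to guess rather than pick one
    return best + ".{num}"


names = [
    "model", "model.embed_tokens", "model.layers",
    "model.layers.0", "model.layers.0.self_attn", "model.layers.0.mlp",
    "model.layers.1", "model.layers.1.self_attn", "model.layers.1.mlp",
    "model.norm", "lm_head",
]
template = guess_block_template(names)
```

Returning `None` on a tie is one way to make the guess less implicit: the caller can then require an explicit block name instead of silently hooking the wrong module.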

perf: support early-stopping in HF models

c29576d

chanind requested a review from anthonyduong9 February 17, 2025 23:28
