[Bug Report] SAE training tutorial metrics do not match linked run #276
Comments
+1. I've been toying around with the library to reproduce the results from the wandb tutorial run as well as these runs. I have replicated the hyperparameters that were set in the gpt2 runs (linked above), to no avail. I suspect that later versions of the library introduced changes that need different hyperparameters, but I don't have a good theory. Side note: @naterush, your wandb run is private, so other users cannot see the results!
I've also run it several times and not managed to get anything with good loss curves – it plateaus very quickly around MSE loss of 200 and L1 loss of 165.
Odd, I'll take a look.
On Thu, Oct 3, 2024, 10:34 AM Kaden Uhlig wrote:
I've also run it several times and not managed to get anything with good
loss curves – it plateaus very quickly around MSE loss of 200 and L1 loss
of 165.
Any updates?
Describe the bug
Hey. Working through the training tutorial without any changes, I'm unable to train a basic SAE with loss numbers as good as those in the linked run. I'm not sure if this is numerical instability, if something has changed, or if my differences are actually not consequential, so I'm opening this issue to get to the bottom of it!
My steps:
Differences between my training run and yours
There are a lot more differences, but I'm wondering if you have thoughts on why this is. I'm new to SAE work generally, so any helpful tips here would be appreciated.
Code example
See notebook here
System Info
Checklist