Question about amateur context window #4

Open
marcofarina84 opened this issue Nov 14, 2022 · 3 comments

Comments

@marcofarina84

Dear @XiangLi1999 and @ari-holtzman,
If I understand the paper correctly, section 3.4 says that the amateur (student) model is conditioned on a context window that starts from the last token of the prompt. I cannot find any trace of this choice in the code: for instance, here and here the whole input is passed to the amateur model, exactly as it is passed to the expert.
I cannot find the corresponding study in the ablation script either.

Am I missing some argument/logic that sets the amateur's context window somewhere else in the code?

Best,
Marco

@XiangLi1999
Owner

It's handled here:

student_lm.prepare_inputs_for_generation = ignore_prefix_prepare_inputs_for_generation

@marcofarina84
Author

Great, thanks!
Just one last clarification: I might be misunderstanding the code, but it seems that the function feeds only the last generated token to the amateur, so the amateur is computing $p(x_i \mid x_{i-1})$. Can you confirm this?
Section 3.4 of the paper, however, seems to state that the amateur is conditioned on the last token of the prompt plus all the generated tokens.
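
To make the two readings concrete, here is a sketch in notation that is assumed for illustration rather than taken from the paper: the prompt is $x_1, \dots, x_n$, the token being generated is $x_i$ with $i > n$, and $p_{\text{amateur}}$ denotes the amateur model's next-token distribution.

```latex
% Reading 1: the amateur sees only the most recent token
p_{\text{amateur}}(x_i \mid x_{i-1})

% Reading 2 (section 3.4 as described above): last prompt token plus all generated tokens
p_{\text{amateur}}(x_i \mid x_n, x_{n+1}, \dots, x_{i-1})
```

The two expressions coincide only for the first generated token, $i = n + 1$, where the conditioning context is just $x_n$ in both cases.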

@XiangLi1999
Owner

Hi,

I think the code is doing what section 3.4 states: conditioning on the last token of the prompt plus the generated tokens. You can verify this by printing the past_key_values argument. This works because of the caching implementation in Hugging Face: once a token has been generated, it is encoded into past_key_values, which saves redundant computation.
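
A toy illustration of why feeding only the newest token is compatible with conditioning on all earlier ones. This is a pure-Python sketch of single-head attention, not the Hugging Face implementation: `embed`, `attend`, `full_forward`, and `cached_step` are hypothetical names, and keys and values are simply the raw embeddings, with no learned projections and only one layer.

```python
import math

def attend(query, keys, values):
    """Single-head dot-product attention of one query over the given keys/values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]

def embed(token_id, dim=4):
    """Deterministic toy embedding (an assumption; stands in for learned weights)."""
    return [math.sin(token_id * (d + 1)) for d in range(dim)]

def full_forward(token_ids):
    """Recompute the last position's output from the whole sequence at once."""
    vectors = [embed(t) for t in token_ids]
    return attend(vectors[-1], vectors, vectors)

def cached_step(new_token_id, cache):
    """Feed only the newest token; earlier tokens are visible only through the cache."""
    vec = embed(new_token_id)
    cache["keys"].append(vec)
    cache["values"].append(vec)
    return attend(vec, cache["keys"], cache["values"])

# Toy sequence: think of it as the last prompt token followed by generated tokens.
sequence = [7, 3, 9, 4]

cache = {"keys": [], "values": []}
for token in sequence:
    incremental = cached_step(token, cache)  # one token per call

recomputed = full_forward(sequence)  # whole sequence in one call

max_diff = max(abs(a - b) for a, b in zip(incremental, recomputed))
print("outputs match:", max_diff < 1e-12)
print("cached positions:", len(cache["keys"]))
```

In this sketch the input to each call is a single token, yet the output at the last position still depends on every token stored in the cache, which is the same mechanism the comment above attributes to `past_key_values`.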
