Hi,
Thanks for the repo! Could you please point out which lines of code implement the "rank classification" idea used for evaluating the multiple-choice-style tasks?
The paper describes it like this on page 6:
For tasks that involve choosing the correct completion from several options (e.g. multiple choice question answering), we follow Brown et al. (2020) and use rank classification to evaluate our model: we compute the log-likelihood of each of the target options under the fine-tuned model and select the option with the highest log-likelihood as the prediction. For simplicity, we do not apply length normalization to the log-likelihoods of the target options.
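To make sure I am reading this correctly, here is a minimal sketch of my understanding (generic PyTorch / Hugging Face code, not taken from this repo; `model` is assumed to be an encoder-decoder LM and each example is scored one at a time, without padding):

```python
import torch
import torch.nn.functional as F

def option_score(model, input_ids, target_ids):
    # Sum of the log-likelihoods of the target tokens given the input
    # (no length normalization, as described in the paper).
    with torch.no_grad():
        logits = model(input_ids=input_ids, labels=target_ids).logits  # (1, T, vocab)
    log_probs = F.log_softmax(logits, dim=-1)
    # Pick out the log-prob assigned to each gold target token and sum them.
    return log_probs.gather(-1, target_ids.unsqueeze(-1)).sum().item()

def rank_classify(model, input_ids, all_option_target_ids):
    # Prediction = index of the option whose target sequence gets the highest total log-likelihood.
    scores = [option_score(model, input_ids, tgt) for tgt in all_option_target_ids]
    return max(range(len(scores)), key=lambda i: scores[i])
```

Is this roughly what the evaluation code does, and if so, where does it live in the repo?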
Thank you!
As a follow-up, could you also give a short tutorial on how to use the same idea to evaluate other LMs (say, a fine-tuned BART), to make sure the comparisons are fair?
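For instance, would something along these lines be a fair way to do it? (Just a sketch of what I have in mind, assuming a fine-tuned BART checkpoint loaded through transformers; the checkpoint name, prompt, and options below are placeholders, not from this repo.)

```python
import torch
import torch.nn.functional as F
from transformers import BartForConditionalGeneration, BartTokenizer

# Placeholder checkpoint; in practice this would be the fine-tuned model being compared.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base").eval()

def option_log_likelihood(prompt, option):
    # Total log-likelihood of `option` given `prompt`, with no length normalization.
    inputs = tokenizer(prompt, return_tensors="pt")
    targets = tokenizer(option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(**inputs, labels=targets).logits  # (1, T, vocab)
    log_probs = F.log_softmax(logits, dim=-1)
    return log_probs.gather(-1, targets.unsqueeze(-1)).sum().item()

prompt = "The movie was great. Is this review positive or negative?"
options = ["positive", "negative"]
prediction = max(options, key=lambda o: option_log_likelihood(prompt, o))
print(prediction)
```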