Could you provide scripts to reproduce the results? #7

Open
@hzhwcmhf

@XiangLi1999 Thanks for your great work!

I am trying to reproduce the results on wikitext but have run into some problems.

I ran your script:

python run_generation.py --model_name_or_path gpt2-xl --model_type gpt2 --length 256 --prompt_file wikitext --student_name_or_path gpt2 --st_coef 1.0 --student_temperature 0.5 --outfile outputs/temp_out.jsonl --ignore_prefix no

and then evaluated the output file with:

python eval_script.py ./outputs/temp_out.jsonl

The output is

{'name': './outputs/temp_out.jsonl', 'rep-2': 9.5, 'rep-3': 1.87, 'rep-4': 0.4, 'diversity': 0.8845241939999999, 'mauve': 0.8812567264373257, 'coherence': 0.5913593170305366} (I disabled the other metrics)

which differs from the results reported in the paper (coherence = 0.59 vs. 0.69).
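
As a sanity check on the evaluation output, the reported diversity is consistent with the product of (1 - rep-n/100) over n = 2, 3, 4. The sketch below assumes that definition; it is inferred from the numbers above rather than read out of eval_script.py, and `diversity_from_rep` is a hypothetical helper name.

```python
# Values copied from the eval_script.py output above (rep-n given in percent).
metrics = {"rep-2": 9.5, "rep-3": 1.87, "rep-4": 0.4}


def diversity_from_rep(metrics, ns=(2, 3, 4)):
    """Assumed definition: product over n of (1 - rep-n / 100)."""
    result = 1.0
    for n in ns:
        result *= 1.0 - metrics[f"rep-{n}"] / 100.0
    return result


# Should agree with the 'diversity' field (about 0.8845) up to float rounding.
print(diversity_from_rep(metrics))
```

If this definition is right, the diversity and rep-n numbers are internally consistent, so the discrepancy seems to be limited to coherence.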

I found that ./outputs_ignorePrefix_ccnews_256/wikitext_results/wikitext_gpt2-1.0-t0.5_gpt2-xl_256.jsonl produces the correct metric values. May I ask two questions:

  1. What is the generation script used to produce the correct outputs?
  2. What do the values in the filename wikitext_gpt2-1.0-t0.5_gpt2-xl_256.jsonl mean? For example, 256 seems to be the output length and 0.5 the student temperature. What does 1.0 indicate?
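
To make the second question concrete, here is a minimal sketch of how I am currently reading the filename. The field names are my own guesses, not the repository's naming scheme, and the 1.0 field is deliberately left as `unknown`; `parse_result_name` is a hypothetical helper.

```python
import re

# Guessed layout: <prompt>_<student>-<unknown>-t<student temperature>_<teacher>_<length>.jsonl
PATTERN = re.compile(
    r"^(?P<prompt>[^_]+)_(?P<student>.+?)-(?P<unknown>[\d.]+)"
    r"-t(?P<student_temperature>[\d.]+)_(?P<teacher>.+)_(?P<length>\d+)\.jsonl$"
)


def parse_result_name(filename):
    """Split a result filename into guessed fields; returns None if it does not match."""
    match = PATTERN.match(filename)
    return match.groupdict() if match else None


fields = parse_result_name("wikitext_gpt2-1.0-t0.5_gpt2-xl_256.jsonl")
for key, value in fields.items():
    print(f"{key}: {value}")
```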
