Describe the issue

I have been trying to recreate the dataset, but for other repositories. As a first test I am checking whether I can manage it for fastapi, since it is a large Python repository. What I have done so far:
I fetched the pull requests using the /collect/ submodule, which generated a JSONL file for me; let's call its location ./data/fastapi-task-instances.jsonl.all.
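For reference, this is roughly how I sanity-checked the collected file (just my own quick check; the keys named in the comment are only what I expect to see, not a guaranteed list):

```python
import json

# Peek at the first collected instance and list its keys.
with open("./data/fastapi-task-instances.jsonl.all") as f:
    first = json.loads(f.readline())

print(sorted(first.keys()))  # expecting e.g. repo, instance_id, base_commit, patch, test_patch, ...
```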
Afterwards I went to the /versioning/ submodule and ran get_versions.py on this dataset, which gave me the file ./data/fastapi-task-instances.json.
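To confirm the versioning step worked, I checked the new field like this (a sketch; I am assuming the versioned file is also line-delimited JSON):

```python
import json

# Every instance should now carry a non-empty "version" field.
with open("./data/fastapi-task-instances.json") as f:
    versions = {json.loads(line).get("version") for line in f}

print(versions)  # hypothetical output: {"0.65", "0.88", ...} -- no None entries expected
```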
I then converted this last dataset, containing all task instances with their proper version field added, into a Hugging Face dataset saved locally at ./data/fastapi_hf, containing only a train split. The current keys are:
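Roughly, the conversion looked like this (a sketch of my own script, not anything from the repo; I am again assuming line-delimited JSON as input):

```python
import json
from datasets import Dataset, DatasetDict

# Load the versioned task instances and wrap them in a single "train" split.
with open("./data/fastapi-task-instances.json") as f:
    instances = [json.loads(line) for line in f]

DatasetDict({"train": Dataset.from_list(instances)}).save_to_disk("./data/fastapi_hf")
```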
Afterwards, in the /inference/make_datasets/ submodule, I ran create_text_dataset.py on this file, which also added the FAIL_TO_PASS and PASS_TO_PASS columns to the dataset. I saved the resulting dataset as "./data/fastapi-text-ds"; this one contains a train and a validation split.
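This is how I looked at the result (assuming the output directory is a save_to_disk folder from the datasets library, which is how it loads for me):

```python
from datasets import load_from_disk

text_ds = load_from_disk("./data/fastapi-text-ds")
print(text_ds)                        # train and validation splits
print(text_ds["train"].column_names)  # FAIL_TO_PASS / PASS_TO_PASS should appear here
```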
Afterwards I ran run_api.py on ./data/fastapi-text-ds, which generated a new dataset containing the responses from the LLM. It is saved under "./output/fastapi.jsonl" and contains the model_patch, our main goal.
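As far as I understand, each line of that predictions file needs at least instance_id, model_name_or_path and model_patch for the evaluation harness, so I checked the first line like this:

```python
import json

# Inspect the first prediction produced by run_api.py.
with open("./output/fastapi.jsonl") as f:
    pred = json.loads(f.readline())

print(pred["instance_id"], pred["model_name_or_path"])
print(pred["model_patch"][:200])  # start of the generated diff
```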
So the final step now would be to run:

    python3 -m swebench.harness.run_evaluation \
        --dataset_name ./data/fastapi_hf \
        --predictions_path ./outputs/output/fastapi.jsonl \
        --max_workers 4 \
        --run_id first_pilot_run \
        --split train
So, assuming all the steps above are correct (please let me know if I have missed any), I am facing two errors here. The first one I solved: it was simply that dataset_name was not expected to be loaded from disk in utils.py. The issue I am still facing, and only noticed recently, is that the PASS_TO_PASS and FAIL_TO_PASS fields of the main dataset contain only empty strings.
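This is roughly how I confirmed it, loading the same dataset that run_evaluation loads:

```python
from datasets import load_from_disk

ds = load_from_disk("./data/fastapi_hf")["train"]
print(ds.column_names)
print(repr(ds[0]["PASS_TO_PASS"]))  # prints '' for every instance I checked
print(repr(ds[0]["FAIL_TO_PASS"]))  # same here
```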
The error comes from test_spec.py, around line 300:
    def _from_json_or_obj(key: str) -> Any:
        """If key points to string, load with json"""
        if isinstance(instance[key], str):
            return json.loads(instance[key])
        return instance[key]
This happens because instance["PASS_TO_PASS"] is equal to "", which I believe should not be the case.
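Just to spell out the failure mode: a populated field would be a JSON-encoded list of test identifiers, while an empty string makes json.loads raise. The test names below are made up, only to show the expected shape:

```python
import json

# Expected shape of PASS_TO_PASS / FAIL_TO_PASS (hypothetical test names):
json.loads('["tests/test_example.py::test_one", "tests/test_example.py::test_two"]')

# What my instances currently contain:
json.loads("")  # json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```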
I am not sure where the evaluation of the test_patch should have taken place. The paper describes why this process happens, but not exactly at which stage/part it should happen.
I would be very grateful for some support on this matter: whether I have missed any steps, and how to finally run the evaluation on the dataset I have currently collected.
Thank you in advance
Suggest an improvement to documentation
No response
Is the repository not maintained anymore? Because how to populate PASS_TO_PASS and FAIL_TO_PASS is crucial for recreating the experiments from the research paper.