Uncaught KeyError: 'lastCommit' #1853

Closed
severo opened this issue Nov 22, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@severo
Collaborator

severo commented Nov 22, 2023

Seen in datasets-server
image

The exception was not caught by huggingface_hub. It seems to be temporary.

severo changed the title from KeyError: "'lastCommit'" to KeyError: 'lastCommit' Nov 22, 2023
severo changed the title from KeyError: 'lastCommit' to Uncaught "KeyError: 'lastCommit'" Nov 22, 2023
severo changed the title from Uncaught "KeyError: 'lastCommit'" to Uncaught KeyError: 'lastCommit' Nov 22, 2023
severo added the bug label Nov 22, 2023
@Wauplin
Contributor

Wauplin commented Nov 22, 2023

Hmm, do you have the full value for line 16 (to see where it failed in huggingface_hub)?
I quickly searched the codebase for lastCommit, and in the only 2 places where we parse it, we do kwargs.pop("lastCommit", None), so it shouldn't fail with a KeyError 🤔
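
To illustrate the reasoning, here is a minimal sketch of why pop with a default cannot raise, while plain subscript access on the same key does. The tree_item dict is a made-up payload for illustration, not an actual Hub response.

```python
# Hypothetical tree item that is missing the "lastCommit" key.
tree_item = {"path": "data.parquet", "type": "file"}

# pop() with a default never raises on a missing key; it returns the default.
popped = dict(tree_item).pop("lastCommit", None)

# Subscript access on a missing key raises KeyError.
try:
    tree_item["lastCommit"]["date"]
    raised = False
    message = ""
except KeyError as err:
    raised = True
    # str() of a KeyError is the repr of the key, quotes included.
    message = str(err)

print(popped, raised, message)
```

The quoted message is the same shape as the 'lastCommit' string in the issue title, which suggests the failure comes from a subscript lookup rather than from one of the pop calls.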

@lhoestq
Member

lhoestq commented Nov 22, 2023

{
    "_id": {
        "$oid": "655ba8c06fde7831e3431602"
    },
    "config": "20231101.ast",
    "dataset": "wikimedia/wikipedia",
    "kind": "config-parquet-and-info",
    "split": null,
    "content": {
        "error": "'lastCommit'"
    },
    "dataset_git_revision": "8cc706e4538bc6ec75df1bda9ea8c8057df1841f",
    "details": {
        "error": "'lastCommit'",
        "cause_exception": "KeyError",
        "cause_message": "'lastCommit'",
        "cause_traceback": [
            "Traceback (most recent call last):\n",
            "  File \"/src/services/worker/src/worker/job_manager.py\", line 170, in process\n    job_result = self.job_runner.compute()\n",
            "  File \"/src/services/worker/src/worker/job_runners/config/parquet_and_info.py\", line 1296, in compute\n    compute_config_parquet_and_info_response(\n",
            "  File \"/src/services/worker/src/worker/job_runners/config/parquet_and_info.py\", line 1155, in compute_config_parquet_and_info_response\n    builder = load_dataset_builder(\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/datasets/load.py\", line 1814, in load_dataset_builder\n    dataset_module = dataset_module_factory(\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/datasets/load.py\", line 1511, in dataset_module_factory\n    raise e1 from None\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/datasets/load.py\", line 1488, in dataset_module_factory\n    return HubDatasetModuleFactoryWithoutScript(\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/datasets/load.py\", line 1080, in get_module\n    builder_configs, default_config_name = create_builder_configs_from_metadata_configs(\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/datasets/load.py\", line 558, in create_builder_configs_from_metadata_configs\n    config_data_files_dict = DataFilesDict.from_patterns(\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/datasets/data_files.py\", line 686, in from_patterns\n    DataFilesList.from_patterns(\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/datasets/data_files.py\", line 591, in from_patterns\n    resolve_pattern(\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/datasets/data_files.py\", line 345, in resolve_pattern\n    fs, _, _ = get_fs_token_paths(pattern, storage_options=storage_options)\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/core.py\", line 640, in get_fs_token_paths\n    paths = [f for f in sorted(fs.glob(paths)) if not fs.isdir(f)]\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/spec.py\", line 602, in glob\n    allpaths = self.find(root, maxdepth=depth, withdirs=True, detail=True, **kwargs)\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/spec.py\", line 495, in find\n    for _, dirs, files in self.walk(path, maxdepth, detail=True, **kwargs):\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/spec.py\", line 418, in walk\n    listing = self.ls(path, detail=True, **kwargs)\n",
            "  File \"/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/hf_file_system.py\", line 317, in ls\n    \"last_modified\": parse_datetime(tree_item[\"lastCommit\"][\"date\"]),\n",
            "KeyError: 'lastCommit'\n"
        ]
    },
    "error_code": "UnexpectedError",
    "http_status": {
        "$numberInt": "500"
    },
    "job_runner_version": {
        "$numberInt": "4"
    },
    "progress": null,
    "updated_at": {
        "$date": {
            "$numberLong": "1700605111421"
        }
    }
}
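
The last frame of the traceback shows the failing expression, tree_item["lastCommit"]["date"] in hf_file_system.py. Below is a minimal sketch of a tolerant lookup, assuming the key can simply be absent from a tree item. The helper names parse_date and last_modified are hypothetical, parse_date is only a stand-in for huggingface_hub's parse_datetime, and the timestamp format is an assumption. This is not necessarily how the library handles it.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def parse_date(value: str) -> datetime:
    # Stand-in parser; assumes timestamps shaped like "2023-11-22T10:00:00.000Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


def last_modified(tree_item: Dict[str, Any]) -> Optional[datetime]:
    # Hypothetical helper: return None instead of raising when "lastCommit" is absent.
    last_commit = tree_item.get("lastCommit")
    if not last_commit or "date" not in last_commit:
        return None
    return parse_date(last_commit["date"])


# Made-up tree items, one with and one without commit information.
with_commit = {"path": "a.parquet", "lastCommit": {"date": "2023-11-22T10:00:00.000Z"}}
without_commit = {"path": "b.parquet"}

print(last_modified(with_commit))
print(last_modified(without_commit))
```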

@Wauplin
Contributor

Wauplin commented Nov 22, 2023

Thanks! This issue has been fixed by @mariosasko in #1809. The fix is already on main but not in 0.19.4.

Wauplin closed this as completed Nov 22, 2023