Repository Not Found for url: https://huggingface.co/api/models/ds4sd/docling-models/revision/v2.1.0. #923

Open
thistleknot opened this issue Feb 8, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@thistleknot

Bug

...

Steps to reproduce

from docling.document_converter import DocumentConverter

source = r"C:\Users\User\Documents\wiki\wiki\RPG\microlite\2020\Microlite2020-Core-Rules-1.02.pdf"  # document per local path or URL
converter = DocumentConverter()
result = converter.convert(source)
print(result.document.export_to_markdown())  # output: "## Docling Technical Report[...]"

...

Docling version

2.20.0
Python 3.10
Windows 11
...

Python version

...

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
File h:\python310\lib\site-packages\huggingface_hub\utils\_http.py:406, in hf_raise_for_status(response, endpoint_name)
    405 try:
--> 406     response.raise_for_status()
    407 except HTTPError as e:

File h:\python310\lib\site-packages\requests\models.py:1024, in Response.raise_for_status(self)
   1023 if http_error_msg:
-> 1024     raise HTTPError(http_error_msg, response=self)

HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/ds4sd/docling-models/revision/v2.1.0

The above exception was the direct cause of the following exception:

RepositoryNotFoundError                   Traceback (most recent call last)
Cell In[2], line 5
      3 source = r"C:\Users\User\Documents\wiki\wiki\RPG\microlite\2020\Microlite2020-Core-Rules-1.02.pdf"  # document per local path or URL
      4 converter = DocumentConverter()
----> 5 result = converter.convert(source)
      6 print(result.document.export_to_markdown())  # output: "## Docling Technical Report[...]"

File h:\python310\lib\site-packages\pydantic\_internal\_validate_call.py:38, in update_wrapper_attributes.<locals>.wrapper_function(*args, **kwargs)
     36 @functools.wraps(wrapped)
     37 def wrapper_function(*args, **kwargs):
---> 38     return wrapper(*args, **kwargs)

File h:\python310\lib\site-packages\pydantic\_internal\_validate_call.py:111, in ValidateCallWrapper.__call__(self, *args, **kwargs)
    110 def __call__(self, *args: Any, **kwargs: Any) -> Any:
--> 111     res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
    112     if self.__return_pydantic_validator__:
    113         return self.__return_pydantic_validator__(res)

File h:\python310\lib\site-packages\docling\document_converter.py:203, in DocumentConverter.convert(self, source, headers, raises_on_error, max_num_pages, max_file_size, page_range)
    185 @validate_call(config=ConfigDict(strict=True))
    186 def convert(
    187     self,
   (...)
    193     page_range: PageRange = DEFAULT_PAGE_RANGE,
    194 ) -> ConversionResult:
    195     all_res = self.convert_all(
    196         source=[source],
    197         raises_on_error=raises_on_error,
   (...)
    201         page_range=page_range,
    202     )
--> 203     return next(all_res)

File h:\python310\lib\site-packages\docling\document_converter.py:226, in DocumentConverter.convert_all(self, source, headers, raises_on_error, max_num_pages, max_file_size, page_range)
    223 conv_res_iter = self._convert(conv_input, raises_on_error=raises_on_error)
    225 had_result = False
--> 226 for conv_res in conv_res_iter:
    227     had_result = True
    228     if raises_on_error and conv_res.status not in {
    229         ConversionStatus.SUCCESS,
    230         ConversionStatus.PARTIAL_SUCCESS,
    231     }:

File h:\python310\lib\site-packages\docling\document_converter.py:261, in DocumentConverter._convert(self, conv_input, raises_on_error)
    252 _log.info(f"Going to convert document batch...")
    254 # parallel processing only within input_batch
    255 # with ThreadPoolExecutor(
    256 #    max_workers=settings.perf.doc_batch_concurrency
    257 # ) as pool:
    258 #   yield from pool.map(self.process_document, input_batch)
    259 # Note: PDF backends are not thread-safe, thread pool usage was disabled.
--> 261 for item in map(
    262     partial(self._process_document, raises_on_error=raises_on_error),
    263     input_batch,
    264 ):
    265     elapsed = time.monotonic() - start_time
    266     start_time = time.monotonic()

File h:\python310\lib\site-packages\docling\document_converter.py:302, in DocumentConverter._process_document(self, in_doc, raises_on_error)
    298 valid = (
    299     self.allowed_formats is not None and in_doc.format in self.allowed_formats
    300 )
    301 if valid:
--> 302     conv_res = self._execute_pipeline(in_doc, raises_on_error=raises_on_error)
    303 else:
    304     error_message = f"File format not allowed: {in_doc.file}"

File h:\python310\lib\site-packages\docling\document_converter.py:323, in DocumentConverter._execute_pipeline(self, in_doc, raises_on_error)
    319 def _execute_pipeline(
    320     self, in_doc: InputDocument, raises_on_error: bool
    321 ) -> ConversionResult:
    322     if in_doc.valid:
--> 323         pipeline = self._get_pipeline(in_doc.format)
    324         if pipeline is not None:
    325             conv_res = pipeline.execute(in_doc, raises_on_error=raises_on_error)

File h:\python310\lib\site-packages\docling\document_converter.py:289, in DocumentConverter._get_pipeline(self, doc_format)
    283 # TODO this will ignore if different options have been defined for the same pipeline class.
    284 if (
    285     pipeline_class not in self.initialized_pipelines
    286     or self.initialized_pipelines[pipeline_class].pipeline_options
    287     != pipeline_options
    288 ):
--> 289     self.initialized_pipelines[pipeline_class] = pipeline_class(
    290         pipeline_options=pipeline_options
    291     )
    292 return self.initialized_pipelines[pipeline_class]

File h:\python310\lib\site-packages\docling\pipeline\standard_pdf_pipeline.py:88, in StandardPdfPipeline.__init__(self, pipeline_options)
     73 if (ocr_model := self.get_ocr_model(artifacts_path=artifacts_path)) is None:
     74     raise RuntimeError(
     75         f"The specified OCR kind is not supported: {pipeline_options.ocr_options.kind}."
     76     )
     78 self.build_pipe = [
     79     # Pre-processing
     80     PagePreprocessingModel(
     81         options=PagePreprocessingOptions(
     82             images_scale=pipeline_options.images_scale
     83         )
     84     ),
     85     # OCR
     86     ocr_model,
     87     # Layout model
---> 88     LayoutModel(
     89         artifacts_path=artifacts_path,
     90         accelerator_options=pipeline_options.accelerator_options,
     91     ),
     92     # Table structure model
     93     TableStructureModel(
     94         enabled=pipeline_options.do_table_structure,
     95         artifacts_path=artifacts_path,
     96         options=pipeline_options.table_structure_options,
     97         accelerator_options=pipeline_options.accelerator_options,
     98     ),
     99     # Page assemble
    100     PageAssembleModel(options=PageAssembleOptions()),
    101 ]
    103 # Picture description model
    104 if (
    105     picture_description_model := self.get_picture_description_model(
    106         artifacts_path=artifacts_path
    107     )
    108 ) is None:

File h:\python310\lib\site-packages\docling\models\layout_model.py:54, in LayoutModel.__init__(self, artifacts_path, accelerator_options)
     51 device = decide_device(accelerator_options.device)
     53 if artifacts_path is None:
---> 54     artifacts_path = self.download_models() / self._model_path
     55 else:
     56     # will become the default in the future
     57     if (artifacts_path / self._model_repo_folder).exists():

File h:\python310\lib\site-packages\docling\models\layout_model.py:89, in LayoutModel.download_models(local_dir, force, progress)
     87 if not progress:
     88     disable_progress_bars()
---> 89 download_path = snapshot_download(
     90     repo_id="ds4sd/docling-models",
     91     force_download=force,
     92     local_dir=local_dir,
     93     revision="v2.1.0",
     94 )
     96 return Path(download_path)

File h:\python310\lib\site-packages\huggingface_hub\utils\_validators.py:114, in validate_hf_hub_args.<locals>._inner_fn(*args, **kwargs)
    111 if check_use_auth_token:
    112     kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
--> 114 return fn(*args, **kwargs)

File h:\python310\lib\site-packages\huggingface_hub\_snapshot_download.py:232, in snapshot_download(repo_id, repo_type, revision, cache_dir, local_dir, library_name, library_version, user_agent, proxies, etag_timeout, force_download, token, local_files_only, allow_patterns, ignore_patterns, max_workers, tqdm_class, headers, endpoint, local_dir_use_symlinks, resume_download)
    225     raise LocalEntryNotFoundError(
    226         "Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and "
    227         "outgoing traffic has been disabled. To enable repo look-ups and downloads online, set "
    228         "'HF_HUB_OFFLINE=0' as environment variable."
    229     ) from api_call_error
    230 elif isinstance(api_call_error, RepositoryNotFoundError) or isinstance(api_call_error, GatedRepoError):
    231     # Repo not found => let's raise the actual error
--> 232     raise api_call_error
    233 else:
    234     # Otherwise: most likely a connection issue or Hub downtime => let's warn the user
    235     raise LocalEntryNotFoundError(
    236         "An error happened while trying to locate the files on the Hub and we cannot find the appropriate"
    237         " snapshot folder for the specified revision on the local disk. Please check your internet connection"
    238         " and try again."
    239     ) from api_call_error

File h:\python310\lib\site-packages\huggingface_hub\_snapshot_download.py:155, in snapshot_download(repo_id, repo_type, revision, cache_dir, local_dir, library_name, library_version, user_agent, proxies, etag_timeout, force_download, token, local_files_only, allow_patterns, ignore_patterns, max_workers, tqdm_class, headers, endpoint, local_dir_use_symlinks, resume_download)
    146 try:
    147     # if we have internet connection we want to list files to download
    148     api = HfApi(
    149         library_name=library_name,
    150         library_version=library_version,
   (...)
    153         headers=headers,
    154     )
--> 155     repo_info = api.repo_info(repo_id=repo_id, repo_type=repo_type, revision=revision, token=token)
    156 except (requests.exceptions.SSLError, requests.exceptions.ProxyError):
    157     # Actually raise for those subclasses of ConnectionError
    158     raise

File h:\python310\lib\site-packages\huggingface_hub\utils\_validators.py:114, in validate_hf_hub_args.<locals>._inner_fn(*args, **kwargs)
    111 if check_use_auth_token:
    112     kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
--> 114 return fn(*args, **kwargs)

File h:\python310\lib\site-packages\huggingface_hub\hf_api.py:2748, in HfApi.repo_info(self, repo_id, revision, repo_type, timeout, files_metadata, expand, token)
   2746 else:
   2747     raise ValueError("Unsupported repo type.")
-> 2748 return method(
   2749     repo_id,
   2750     revision=revision,
   2751     token=token,
   2752     timeout=timeout,
   2753     expand=expand,  # type: ignore[arg-type]
   2754     files_metadata=files_metadata,
   2755 )

File h:\python310\lib\site-packages\huggingface_hub\utils\_validators.py:114, in validate_hf_hub_args.<locals>._inner_fn(*args, **kwargs)
    111 if check_use_auth_token:
    112     kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
--> 114 return fn(*args, **kwargs)

File h:\python310\lib\site-packages\huggingface_hub\hf_api.py:2533, in HfApi.model_info(self, repo_id, revision, timeout, securityStatus, files_metadata, expand, token)
   2531     params["expand"] = expand
   2532 r = get_session().get(path, headers=headers, timeout=timeout, params=params)
-> 2533 hf_raise_for_status(r)
   2534 data = r.json()
   2535 return ModelInfo(**data)

File h:\python310\lib\site-packages\huggingface_hub\utils\_http.py:454, in hf_raise_for_status(response, endpoint_name)
    435 elif error_code == "RepoNotFound" or (
    436     response.status_code == 401
    437     and response.request is not None
   (...)
    444     # => for now, we process them as `RepoNotFound` anyway.
    445     # See https://gist.github.com/Wauplin/46c27ad266b15998ce56a6603796f0b9
    446     message = (
    447         f"{response.status_code} Client Error."
    448         + "\n\n"
   (...)
    452         " make sure you are authenticated."
    453     )
--> 454     raise _format(RepositoryNotFoundError, message, response) from e
    456 elif response.status_code == 400:
    457     message = (
    458         f"\n\nBad request for {endpoint_name} endpoint:" if endpoint_name is not None else "\n\nBad request:"
    459     )

RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-67a77bf7-6e28b5cf441444b8142bc699;bb1fd4a4-a602-460b-b033-5d2d8f4f066e)

Repository Not Found for url: https://huggingface.co/api/models/ds4sd/docling-models/revision/v2.1.0.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid credentials in Authorization header
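
The last line of the message, `Invalid credentials in Authorization header`, suggests the Hub rejected a token that the client sent, rather than the repository actually being missing. Below is a minimal sketch for checking where such a token might be coming from. It assumes huggingface_hub's usual locations: the `HF_TOKEN` and legacy `HUGGING_FACE_HUB_TOKEN` environment variables, and a `token` file under `HF_HOME` (defaulting to `~/.cache/huggingface`). The helper name `find_hf_token_sources` is hypothetical and not part of any library.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_hf_token_sources(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> list[str]:
    """List places where a Hugging Face token appears to be configured.

    Hypothetical helper: the locations checked are assumptions based on
    huggingface_hub's documented defaults, not something confirmed in this issue.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    sources = []

    # Environment variables that huggingface_hub is assumed to read.
    for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        if env.get(name):
            sources.append(f"env:{name}")

    # Token file written by a CLI login; HF_TOKEN_PATH or HF_HOME may relocate it.
    hf_home = Path(env.get("HF_HOME") or home / ".cache" / "huggingface")
    token_file = Path(env.get("HF_TOKEN_PATH") or hf_home / "token")
    if token_file.is_file():
        sources.append(f"file:{token_file}")

    return sources
```

Calling `find_hf_token_sources()` with no arguments inspects the current environment. If it reports any source, unsetting that variable or logging out (for example with `huggingface-cli logout`) and retrying may be worth a try.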

@thistleknot thistleknot added the bug Something isn't working label Feb 8, 2025
@huijunwu

I saw the same error with Docling 2.21.0 on Linux when running `docling-tools models download`:

(venv-torch-2.4.1) pig@z790:~/study-notes-2023-05$ docling --version
Docling version: 2.21.0
Docling Core version: 2.18.0
Docling IBM Models version: 3.3.1
Docling Parse version: 3.3.0
Python: cpython-312 (3.12.3)
Platform: Linux-6.11.0-17-generic-x86_64-with-glibc2.39
(venv-torch-2.4.1) pig@z790:~/study-notes-2023-05$ docling-tools models download
Downloading layout model...                                                                                                                                             model_downloader.py:36
Traceback (most recent call last):
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/huggingface_hub/utils/_http.py", line 406, in hf_raise_for_status
    response.raise_for_status()
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/ds4sd/docling-models/revision/v2.1.0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/bin/docling-tools", line 8, in <module>
    sys.exit(app())
             ^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/typer/main.py", line 338, in __call__
    raise e
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/typer/main.py", line 321, in __call__
    return get_command(self)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/typer/core.py", line 728, in main
    return _main(
           ^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/typer/core.py", line 197, in _main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/typer/main.py", line 703, in wrapper
    return callback(**use_params)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/docling/cli/models.py", line 77, in download
    output_dir = download_models(
                 ^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/docling/utils/model_downloader.py", line 37, in download_models
    LayoutModel.download_models(
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/docling/models/layout_model.py", line 89, in download_models
    download_path = snapshot_download(
                    ^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/huggingface_hub/_snapshot_download.py", line 232, in snapshot_download
    raise api_call_error
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/huggingface_hub/_snapshot_download.py", line 155, in snapshot_download
    repo_info = api.repo_info(repo_id=repo_id, repo_type=repo_type, revision=revision, token=token)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 2704, in repo_info
    return method(
           ^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 2489, in model_info
    hf_raise_for_status(r)
  File "/home/pig/study-notes-2023-05/venv-torch-2.4.1/lib/python3.12/site-packages/huggingface_hub/utils/_http.py", line 454, in hf_raise_for_status
    raise _format(RepositoryNotFoundError, message, response) from e
huggingface_hub.errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-67abc5e4-30f0316e65d15f3b7d25f956;4bcbcf5c-1f80-4119-8376-0ba7256bac08)

Repository Not Found for url: https://huggingface.co/api/models/ds4sd/docling-models/revision/v2.1.0.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid credentials in Authorization header
(venv-torch-2.4.1) pig@z790:~/study-notes-2023-05$ 
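
The `Invalid credentials in Authorization header` line hints that a stored token, not the repository, may be the problem. One way to check is to repeat the failing `snapshot_download` call with authentication disabled. This is a sketch only: it assumes `huggingface_hub` is installed and that `token=False` makes the client send no `Authorization` header (my understanding of its API). The repo ID and revision are taken from the traceback, and `download_without_token` is a hypothetical helper.

```python
from pathlib import Path
from typing import Any

# Values copied from the traceback above.
REPO_ID = "ds4sd/docling-models"
REVISION = "v2.1.0"


def anonymous_download_kwargs() -> dict[str, Any]:
    """Arguments for snapshot_download with authentication switched off."""
    return {
        "repo_id": REPO_ID,
        "revision": REVISION,
        # Assumption: False tells huggingface_hub not to send any stored token.
        "token": False,
    }


def download_without_token() -> Path:
    """Hypothetical helper: fetch the model snapshot anonymously."""
    # Imported lazily so this module loads even where huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(**anonymous_download_kwargs()))
```

If `download_without_token()` succeeds, the 401 most likely comes from a stored token rather than from the repository itself.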
