List changed size during iteration #48

Closed
@azscrs

Bug Report

dvc pull -R test_resources/
ERROR: unexpected error - list changed size during iteration

Description

Running the above command intermittently raises a RuntimeError that aborts the pull operation. The behavior is nondeterministic: the exact same command, with the exact same imported targets, sometimes succeeds.

Reproduce

At "setup" time

  1. cd test_resources
  2. cd task
  3. mkdir dataset && cd dataset
  4. dvc import dataset/images
  5. dvc import dataset/labels/train.json
  6. dvc import dataset/labels/test.json
  7. repeat steps 3-5 for each dataset
  8. cd ..
  9. repeat steps 2-8 for each task

At build time

  1. dvc pull -R test_resources/

Expected

A test_resources/task/dataset/images
A test_resources/task/dataset/train.json
...
X files added and Y files fetched
Process exited with code 0

Environment information

Output of dvc doctor:

DEBUG: Version info for developers:
DVC version: 2.18.0 (pip)
Platform: Python 3.8.10 on Linux-5.15.0-1028-aws-aarch64-with-glibc2.29
Supports:
  webhdfs (fsspec = 2022.8.2),
  http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
  https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3)
Cache types: symlink
Cache directory: ext4 on /dev/root
Caches: local
Remotes: https
Workspace directory: ext4 on /dev/root
Repo: dvc, git

Also reproduced on Platform: Python 3.8.10 on Windows-10-10.0.17763-SP0

Additional Information (if any):
Build log

+ dvc pull -v -R test_resources/compatibility_tests
ERROR: unexpected error - list changed size during iteration
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/dvc/cli/__init__.py", line 185, in main
    ret = cmd.do_run()
  File "/usr/local/lib/python3.8/dist-packages/dvc/cli/command.py", line 22, in do_run
    return self.run()
  File "/usr/local/lib/python3.8/dist-packages/dvc/commands/data_sync.py", line 31, in run
    stats = self.repo.pull(
  File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 48, in wrapper
    return f(repo, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/dvc/repo/pull.py", line 34, in pull
    processed_files_count = self.fetch(
  File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 48, in wrapper
    return f(repo, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/dvc/repo/fetch.py", line 45, in fetch
    used = self.used_objs(
  File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 429, in used_objs
    for odb, objs in self.index.used_objs(
  File "/usr/local/lib/python3.8/dist-packages/dvc/repo/index.py", line 240, in used_objs
    for odb, objs in stage.get_used_objs(
  File "/usr/local/lib/python3.8/dist-packages/dvc/stage/__init__.py", line 663, in get_used_objs
    for odb, objs in out.get_used_objs(*args, **kwargs).items():
  File "/usr/local/lib/python3.8/dist-packages/dvc/output.py", line 946, in get_used_objs
    return self.get_used_external(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/dvc/output.py", line 1001, in get_used_external
    return dep.get_used_objs(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/dvc/dependency/repo.py", line 97, in get_used_objs
    used, _ = self._get_used_and_obj(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/dvc/dependency/repo.py", line 134, in _get_used_and_obj
    object_store, _, obj = build(
  File "/usr/local/lib/python3.8/dist-packages/dvc_data/build.py", line 241, in build
    details = fs.info(path)
  File "/usr/local/lib/python3.8/dist-packages/dvc_objects/fs/base.py", line 346, in info
    return self.fs.info(path)
  File "/usr/local/lib/python3.8/dist-packages/funcy/objects.py", line 50, in __get__
    return prop.__get__(instance, type)
  File "/usr/local/lib/python3.8/dist-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/usr/local/lib/python3.8/dist-packages/dvc/fs/dvc.py", line 454, in fs
    return _DvcFileSystem(**self.fs_args)
  File "/usr/local/lib/python3.8/dist-packages/fsspec/spec.py", line 76, in __call__
    obj = super().__call__(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/dvc/fs/dvc.py", line 138, in __init__
    self._datafss[key] = DataFileSystem(index=repo.index.data["repo"])
  File "/usr/local/lib/python3.8/dist-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/usr/local/lib/python3.8/dist-packages/dvc/repo/index.py", line 197, in data
    remote = self.repo.cloud.get_remote_odb(out.remote)
  File "/usr/local/lib/python3.8/dist-packages/dvc/data_cloud.py", line 41, in get_remote_odb
    return self._init_odb(name)
  File "/usr/local/lib/python3.8/dist-packages/dvc/data_cloud.py", line 64, in _init_odb
    return get_odb(cls(**config), fs_path, **config)
  File "/usr/local/lib/python3.8/dist-packages/dvc_objects/fs/base.py", line 78, in __init__
    self.fs_args.update(self._prepare_credentials(**kwargs))
  File "/usr/local/lib/python3.8/dist-packages/dvc_objects/fs/implementations/http/__init__.py", line 86, in _prepare_credentials
    client_kwargs["connector"] = aiohttp.TCPConnector(
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/connector.py", line 767, in __init__
    super().__init__(
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/connector.py", line 265, in __init__
    self._cleanup_closed()
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/connector.py", line 403, in _cleanup_closed
    self._cleanup_closed_handle = helpers.weakref_handle(
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/helpers.py", line 610, in weakref_handle
    return loop.call_at(when, _weakref_handle, (weakref.ref(ob), name))
  File "/usr/lib/python3.8/asyncio/base_events.py", line 705, in call_at
    heapq.heappush(self._scheduled, timer)
RuntimeError: list changed size during iteration
------------------------------------------------------------
Process exited with code 255
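
Note on the last frame: the error is raised by heapq.heappush on the event loop's _scheduled list, not by a for loop. CPython's C implementation of heapq raises this RuntimeError when the list's length changes while it is comparing elements. One plausible reading, which is an assumption and not confirmed anywhere in this report, is that another thread modifies _scheduled while call_at is pushing onto it. The sketch below does not involve DVC, aiohttp, or threads. It only shows, in a single thread, that heappush fails this way when the heap is mutated mid-comparison. MutatingTimer and reproduce are hypothetical names made up for the illustration.

```python
import heapq


class MutatingTimer:
    """Hypothetical stand-in for a scheduled timer; comparing it mutates the heap."""

    def __init__(self, when, heap):
        self.when = when
        self.heap = heap

    def __lt__(self, other):
        # Simulate a concurrent writer: the list grows while heappush is
        # still in the middle of comparing elements.
        self.heap.append(self)
        return self.when < other.when


def reproduce():
    """Return the exception raised by heappush, or None if nothing was raised."""
    scheduled = []
    # First push: the heap has a single element, so no comparison happens.
    heapq.heappush(scheduled, MutatingTimer(2.0, scheduled))
    try:
        # Second push: the new item is compared with its parent, __lt__
        # appends to the list, and the C implementation notices the size change.
        heapq.heappush(scheduled, MutatingTimer(1.0, scheduled))
    except RuntimeError as exc:
        return exc
    return None


if __name__ == "__main__":
    print(repr(reproduce()))
```

This assumes CPython with the C-accelerated heapq module, which is the default. If the threading reading is right, it would also fit the nondeterminism described above, since the failure would depend on thread timing.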
