Closed
Bug Report
dvc pull -R test_resources/
ERROR: unexpected error - list changed size during iteration
Description
Running the above command raises a RuntimeError, which aborts the pull operation. The behavior is nondeterministic: the exact same command (with the exact same imported targets) sometimes succeeds.
Reproduce
At "setup" time
1. cd test_resources
2. cd task
3. mkdir dataset && cd dataset
4. dvc import dataset/images
5. dvc import dataset/labels/train.json
6. dvc import dataset/labels/test.json
7. repeat steps 3-6 for each dataset
8. cd ..
9. repeat steps 2-8 for each task
At build time
dvc pull -R test_resources/
Expected
A test_resources/task/dataset/images
A test_resources/task/dataset/train.json
...
X files added and Y files fetched
Process exited with code 0
Environment information
Output of dvc doctor:
DEBUG: Version info for developers:
DVC version: 2.18.0 (pip)
Platform: Python 3.8.10 on Linux-5.15.0-1028-aws-aarch64-with-glibc2.29
Supports:
webhdfs (fsspec = 2022.8.2),
http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3)
Cache types: symlink
Cache directory: ext4 on /dev/root
Caches: local
Remotes: https
Workspace directory: ext4 on /dev/root
Repo: dvc, git
Also reproducible on Platform: Python 3.8.10 on Windows-10-10.0.17763-SP0
Additional Information (if any):
Build log
+ dvc pull -v -R test_resources/compatibility_tests
ERROR: unexpected error - list changed size during iteration
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/dvc/cli/__init__.py", line 185, in main
ret = cmd.do_run()
File "/usr/local/lib/python3.8/dist-packages/dvc/cli/command.py", line 22, in do_run
return self.run()
File "/usr/local/lib/python3.8/dist-packages/dvc/commands/data_sync.py", line 31, in run
stats = self.repo.pull(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 48, in wrapper
return f(repo, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/pull.py", line 34, in pull
processed_files_count = self.fetch(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 48, in wrapper
return f(repo, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/fetch.py", line 45, in fetch
used = self.used_objs(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 429, in used_objs
for odb, objs in self.index.used_objs(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/index.py", line 240, in used_objs
for odb, objs in stage.get_used_objs(
File "/usr/local/lib/python3.8/dist-packages/dvc/stage/__init__.py", line 663, in get_used_objs
for odb, objs in out.get_used_objs(*args, **kwargs).items():
File "/usr/local/lib/python3.8/dist-packages/dvc/output.py", line 946, in get_used_objs
return self.get_used_external(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/output.py", line 1001, in get_used_external
return dep.get_used_objs(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/dependency/repo.py", line 97, in get_used_objs
used, _ = self._get_used_and_obj(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/dependency/repo.py", line 134, in _get_used_and_obj
object_store, _, obj = build(
File "/usr/local/lib/python3.8/dist-packages/dvc_data/build.py", line 241, in build
details = fs.info(path)
File "/usr/local/lib/python3.8/dist-packages/dvc_objects/fs/base.py", line 346, in info
return self.fs.info(path)
File "/usr/local/lib/python3.8/dist-packages/funcy/objects.py", line 50, in __get__
return prop.__get__(instance, type)
File "/usr/local/lib/python3.8/dist-packages/funcy/objects.py", line 28, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
File "/usr/local/lib/python3.8/dist-packages/dvc/fs/dvc.py", line 454, in fs
return _DvcFileSystem(**self.fs_args)
File "/usr/local/lib/python3.8/dist-packages/fsspec/spec.py", line 76, in __call__
obj = super().__call__(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/fs/dvc.py", line 138, in __init__
self._datafss[key] = DataFileSystem(index=repo.index.data["repo"])
File "/usr/local/lib/python3.8/dist-packages/funcy/objects.py", line 28, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/index.py", line 197, in data
remote = self.repo.cloud.get_remote_odb(out.remote)
File "/usr/local/lib/python3.8/dist-packages/dvc/data_cloud.py", line 41, in get_remote_odb
return self._init_odb(name)
File "/usr/local/lib/python3.8/dist-packages/dvc/data_cloud.py", line 64, in _init_odb
return get_odb(cls(**config), fs_path, **config)
File "/usr/local/lib/python3.8/dist-packages/dvc_objects/fs/base.py", line 78, in __init__
self.fs_args.update(self._prepare_credentials(**kwargs))
File "/usr/local/lib/python3.8/dist-packages/dvc_objects/fs/implementations/http/__init__.py", line 86, in _prepare_credentials
client_kwargs["connector"] = aiohttp.TCPConnector(
File "/usr/local/lib/python3.8/dist-packages/aiohttp/connector.py", line 767, in __init__
super().__init__(
File "/usr/local/lib/python3.8/dist-packages/aiohttp/connector.py", line 265, in __init__
self._cleanup_closed()
File "/usr/local/lib/python3.8/dist-packages/aiohttp/connector.py", line 403, in _cleanup_closed
self._cleanup_closed_handle = helpers.weakref_handle(
File "/usr/local/lib/python3.8/dist-packages/aiohttp/helpers.py", line 610, in weakref_handle
return loop.call_at(when, _weakref_handle, (weakref.ref(ob), name))
File "/usr/lib/python3.8/asyncio/base_events.py", line 705, in call_at
heapq.heappush(self._scheduled, timer)
RuntimeError: list changed size during iteration
------------------------------------------------------------
Process exited with code 255
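
For context on the last frames of the traceback above: the error is raised inside heapq.heappush while asyncio's call_at adds a timer to the loop's _scheduled list. In CPython, the C implementation of heapq raises this RuntimeError when the list changes length while it is comparing entries. One plausible reading, which this report does not confirm, is that a second thread scheduled a timer on the same event loop at the same moment (call_at is not thread-safe). The sketch below reproduces only the error message in isolation. MutatingTimer and demo are hypothetical names, and the comparison method stands in for the concurrent modification.

```python
import heapq


class MutatingTimer:
    """Hypothetical stand-in for an asyncio timer handle.

    Its comparison grows the heap, imitating a second thread that
    schedules a timer while heappush is still comparing entries.
    """

    def __init__(self, when, heap):
        self.when = when
        self.heap = heap

    def __lt__(self, other):
        # Simulated concurrent modification of the shared heap.
        self.heap.append(None)
        return self.when < other.when


def demo():
    scheduled = []
    # First push: the heap is empty, so no comparison takes place.
    heapq.heappush(scheduled, MutatingTimer(2.0, scheduled))
    try:
        # Second push: the new entry is compared with its parent, the
        # comparison changes the list length, and heappush notices.
        heapq.heappush(scheduled, MutatingTimer(1.0, scheduled))
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(demo())
```

This relies on CPython's C-accelerated heapq; a pure-Python heapq would not perform the size check. It says nothing about where in DVC a second thread might come from.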