Recursive Globbing at GCS bucket top level results in TypeError #431

Open
@FullMetalMeowchemist

The following works:

from cloudpathlib import CloudPath

path = CloudPath("gs://top-level-bucket-name/second-level/")
filepaths = path.rglob("*")
assert list(filepaths)

However, when I run the same recursive glob at the bucket root, for example:

path = CloudPath("gs://top-level-bucket-name/")
filepaths = path.rglob("*")
assert list(filepaths)

I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[75], line 1
----> 1 list(paths)

File ~/.pyenv/versions/3.10.12/envs/dai/lib/python3.10/site-packages/cloudpathlib/cloudpath.py:497, in CloudPath.rglob(self, pattern, case_sensitive)
    492 pattern_parts = PurePosixPath(pattern).parts
    493 selector = _make_selector(
    494     ("**",) + tuple(pattern_parts), _posix_flavour, case_sensitive=case_sensitive
    495 )
--> 497 yield from self._glob(selector, True)

File ~/.pyenv/versions/3.10.12/envs/dai/lib/python3.10/site-packages/cloudpathlib/cloudpath.py:458, in CloudPath._glob(self, selector, recursive)
    457 def _glob(self, selector, recursive: bool) -> Generator[Self, None, None]:
--> 458     file_tree = self._build_subtree(recursive)
    460     root = _CloudPathSelectable(
    461         self.name,
    462         [],  # nothing above self will be returned, so initial parents is empty
    463         file_tree,
    464     )
    466     for p in selector.select_from(root):
    467         # select_from returns self.name/... so strip before joining

File ~/.pyenv/versions/3.10.12/envs/dai/lib/python3.10/site-packages/cloudpathlib/cloudpath.py:453, in CloudPath._build_subtree(self, recursive)
    450         continue
    452     nodes = (p for p in parts)
--> 453     _build_tree(file_tree, next(nodes, None), nodes, is_dir)
    455 return dict(file_tree)

File ~/.pyenv/versions/3.10.12/envs/dai/lib/python3.10/site-packages/cloudpathlib/cloudpath.py:441, in CloudPath._build_subtree.<locals>._build_tree(trunk, branch, nodes, is_dir)
    438     trunk[branch] = Tree() if is_dir else None  # leaf node
    440 else:
--> 441     _build_tree(trunk[branch], next_branch, nodes, is_dir)

File ~/.pyenv/versions/3.10.12/envs/dai/lib/python3.10/site-packages/cloudpathlib/cloudpath.py:441, in CloudPath._build_subtree.<locals>._build_tree(trunk, branch, nodes, is_dir)
    438     trunk[branch] = Tree() if is_dir else None  # leaf node
    440 else:
--> 441     _build_tree(trunk[branch], next_branch, nodes, is_dir)

TypeError: 'NoneType' object is not subscriptable

This only happens with a recursive glob. A plain glob("*") at the bucket root works and returns the root-level blob paths.
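To illustrate where the `TypeError` in the traceback can come from, here is a minimal standalone sketch modeled on the two `_build_tree` lines visible above. It is a reconstruction, not the library's actual code: the `Tree` helper, the `build_subtree` wrapper, and the `listing` data are assumptions made for illustration. It shows one hypothetical way to hit the same error: a name is first recorded as a file (stored as `None`) and a later entry then uses that same name as a prefix of a deeper path. Whether this is what happens in the bucket from this report has not been confirmed.

```python
from collections import defaultdict


def Tree():
    # Nested defaultdict: missing directory nodes are created on first access.
    return defaultdict(Tree)


def build_tree(trunk, branch, nodes, is_dir):
    # Modeled on the _build_tree lines shown in the traceback (assumed shape).
    next_branch = next(nodes, None)
    if next_branch is None:
        trunk[branch] = Tree() if is_dir else None  # leaf node
    else:
        build_tree(trunk[branch], next_branch, nodes, is_dir)


def build_subtree(listing):
    # listing: hypothetical (relative_path, is_dir) pairs standing in for
    # whatever the storage client returns for a recursive listing.
    file_tree = Tree()
    for rel_path, is_dir in listing:
        nodes = iter(rel_path.split("/"))
        build_tree(file_tree, next(nodes, None), nodes, is_dir)
    return dict(file_tree)


# "a" is stored as a file first, then reused as a prefix of a deeper path.
listing = [("a", False), ("a/b/c.txt", False)]

try:
    build_subtree(listing)
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
```

With this input, the second entry walks into `trunk["a"]`, which is already `None`, so the next `trunk[branch]` lookup fails. A listing without such a collision, for example `[("d", True), ("d/e.txt", False)]`, builds a tree without error.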
