The fsspec find interface accepts maxdepth, but the adlfs implementation does not use this parameter, so it always lists every file under the directory recursively.
pyarrow's wrapper for fsspec filesystems uses fs.find to list a directory, and it relies on the maxdepth parameter when the user specifies recursive=False in pyarrow.
import pyarrow.fs

# fsspec_fs is assumed to be an adlfs filesystem instance created elsewhere.
fs = pyarrow.fs.PyFileSystem(pyarrow.fs.FSSpecHandler(fsspec_fs))
file_infos = fs.get_file_info(
    pyarrow.fs.FileSelector("path", recursive=False)
)
# All files under path are listed recursively, rather than just the top level.
An easy fix is to adopt the gcsfs implementation:
https://github.com/fsspec/gcsfs/blob/ad684a5b3f25d46eeb5c3aebdbe647056a5e312b/gcsfs/core.py#L1441-L1444
For the adlfs implementation, see adlfs/adlfs/spec.py, line 1128 in f15c37a.
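To make the suggested fix concrete, here is a minimal sketch of depth filtering applied to a flat, recursive listing. It is not the adlfs or gcsfs code: `filter_by_maxdepth` and the sample paths are hypothetical, and the sketch assumes that find has already collected every path under the root as "/"-separated strings.

```python
def filter_by_maxdepth(root, paths, maxdepth=None):
    """Keep only paths at most `maxdepth` levels below `root`.

    Hypothetical helper for illustration; not part of adlfs, gcsfs, or fsspec.
    """
    if maxdepth is None:
        # No limit requested: keep the full recursive listing.
        return list(paths)
    if maxdepth < 1:
        raise ValueError("maxdepth must be at least 1")
    # An entry directly inside root has exactly one more "/" than root,
    # so a depth of N allows at most root's separator count plus N.
    limit = root.rstrip("/").count("/") + maxdepth
    return [p for p in paths if p.rstrip("/").count("/") <= limit]


# Made-up listing standing in for what a recursive find would return.
listing = [
    "container/path/a.parquet",
    "container/path/b.parquet",
    "container/path/sub/c.parquet",
    "container/path/sub/deeper/d.parquet",
]

# With maxdepth=1, only the entries directly under container/path remain.
top_level = filter_by_maxdepth("container/path", listing, maxdepth=1)
```

Filtering after the fact only fixes the result: the full recursive listing is still fetched, so this does not reduce the number of requests made to the storage service.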