Closed
Description
reduce() returns a Dask DataFrame rather than a NestedFrame.
Reproducing example:
import nested_pandas as npd
import numpy as np
import nested_dask as nd
import pandas as pd
import pyarrow as pa
a = npd.NestedFrame({"a": pd.Series([1,2,3], dtype=pd.ArrowDtype(pa.int64()))}, index=[0,0,1])
b = npd.NestedFrame({"b": pd.Series([1,2], dtype=pd.ArrowDtype(pa.int64()))}, index=[0,1])
ndf = b.add_nested(a, name="test")
nddf = nd.NestedFrame.from_pandas(ndf, npartitions=1)
def mean_arr(b, arr):
    return {"b": b, "mean": np.mean(arr)}
reduced = nddf.reduce(mean_arr, "b", "test.a", meta={"b": int, "mean": float})
type(reduced)  # reports a Dask DataFrame type; expected a nested_dask NestedFrame
Versions:
nested_dask = 0.2.1
nested_pandas = 0.2.2