reduce returns a Dask DataFrame, not a NestedFrame #54

Closed
@smcguire-cmu

Description

Calling reduce on a nested_dask NestedFrame returns a plain Dask DataFrame instead of a nested_dask NestedFrame.

Reproducing example:

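import nested_pandas as npd
import numpy as np
import nested_dask as nd
import pandas as pd
import pyarrow as pa

# Nested layer "a" (two rows for index 0, one for index 1) and base layer "b".
a = npd.NestedFrame({"a": pd.Series([1, 2, 3], dtype=pd.ArrowDtype(pa.int64()))}, index=[0, 0, 1])
b = npd.NestedFrame({"b": pd.Series([1, 2], dtype=pd.ArrowDtype(pa.int64()))}, index=[0, 1])

# Pack "a" into "b" as the nested column "test", then convert to nested_dask.
ndf = b.add_nested(a, name="test")
nddf = nd.NestedFrame.from_pandas(ndf, npartitions=1)

def mean_arr(b, arr):
    # Per-row reduction: keep the base value and average the nested values.
    return {"b": b, "mean": np.mean(arr)}

reduced = nddf.reduce(mean_arr, "b", "test.a", meta={"b": int, "mean": float})
# Expected a nested_dask NestedFrame; a Dask DataFrame is reported instead.
print(type(reduced))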

Versions:
nested_dask = 0.2.1
nested_pandas = 0.2.2
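
For context, here is a minimal, library-free sketch of one common way a subclass method can end up returning its parent type. It is only an illustration of the general pattern and not a diagnosis of how nested_dask implements reduce. BaseFrame, NestedLike, map_rows, reduce_unwrapped, and reduce_rewrapped are all hypothetical names made up for this example.

```python
# Toy illustration only: BaseFrame and NestedLike are hypothetical stand-ins,
# not classes from dask, nested_dask, or nested_pandas.


class BaseFrame:
    """Stand-in for a generic parent collection type."""

    def __init__(self, rows):
        self.rows = list(rows)

    def map_rows(self, func):
        # Hard-codes the parent class, so a subclass loses its type here.
        return BaseFrame(func(row) for row in self.rows)


class NestedLike(BaseFrame):
    """Stand-in for a subclass that adds a reduce-style method."""

    def reduce_unwrapped(self, func):
        # Delegates to the parent and returns its result unchanged.
        return self.map_rows(func)

    def reduce_rewrapped(self, func):
        # Re-wraps the parent's result in the subclass before returning it.
        return type(self)(self.map_rows(func).rows)


frame = NestedLike([1, 2, 3])
unwrapped = frame.reduce_unwrapped(lambda x: x * 2)
rewrapped = frame.reduce_rewrapped(lambda x: x * 2)

print(type(unwrapped).__name__)  # BaseFrame
print(type(rewrapped).__name__)  # NestedLike
```

If the real cause is similar, the analogous fix would be for reduce to convert its result back to a NestedFrame before returning it, but that is a guess based on the symptom alone.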

Labels

bug: Something isn't working
