Improve support for reduce returning nested information #132

Closed
@dougbrn

Feature request
We expect users who apply their functions via reduce to sometimes want more complex outputs. For example, if I'm applying Lomb-Scargle to all lightcurves in my nested dataset, I may want a result that has a max power column, a frequency of max power column, and a nested column holding the full periodogram. Presently, we don't really support this: reduce primarily expects the user to return a dictionary-style output, which limits the user to producing list columns rather than nested columns. For example:

from nested_pandas.datasets import generate_data
import numpy as np

ndf = generate_data(3,20)

def complex_output(flux):
    return {"max_flux": np.max(flux), "flux_quantiles": np.quantile(flux, [0.1, 0.2, 0.3, 0.4, 0.5])}

ndf.reduce(complex_output, "nested.flux")

	max_flux	flux_quantiles
0	98.744076	[15.293187217097268, 21.834338973710633, 25.02...
1	98.502034	[6.337989346945357, 8.019180689729948, 9.69707...
2	99.269021	[12.42551556001139, 15.901779148332189, 26.199...

With #131 it would be easier to convert this to a nested output, but ideally reduce would have a more native ability to produce nested structures for these types of functions. One way to facilitate this is to read more into the user-defined dictionary output to determine the structure of the resulting NestedFrame. For example, a user could instead specify this reduce function:

def complex_output(flux):
    return {"max_flux": np.max(flux), "quantiles": {"flux_quantiles": np.quantile(flux, [0.1, 0.2, 0.3, 0.4, 0.5])}}

The JSON-like nesting of dictionaries would signal that the user wants a "quantiles" nested column with a "flux_quantiles" field. I'm not entirely sure of the full implementation plan, but this seems the most intuitive from the user's perspective.
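To make the idea concrete, here is a rough sketch of how the per-row dictionaries could be interpreted. It is not an implementation plan and uses no nested_pandas internals: `split_reduce_output` and `collect_rows` are hypothetical helper names, and the "index" key used to tie nested rows back to their parent row is an assumption made for illustration. Top-level scalar or list values become base columns, while each inner dictionary becomes a long-format table that could later be packed into a nested column.

```python
from typing import Any, Dict, List, Tuple


def split_reduce_output(result: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Dict[str, Any]]]:
    """Separate one row's output into base values and nested-column fields.

    Hypothetical helper: a value that is itself a dict is read as a request
    for a nested column named after its key.
    """
    base: Dict[str, Any] = {}
    nested: Dict[str, Dict[str, Any]] = {}
    for key, value in result.items():
        if isinstance(value, dict):
            nested[key] = value
        else:
            base[key] = value
    return base, nested


def collect_rows(results: List[Dict[str, Any]]):
    """Stack per-row outputs into base columns and long-format nested tables.

    Each nested table carries an "index" list (an assumed convention here)
    recording which parent row every nested value belongs to.
    """
    base_cols: Dict[str, List[Any]] = {}
    nested_cols: Dict[str, Dict[str, List[Any]]] = {}
    for row_id, result in enumerate(results):
        base, nested = split_reduce_output(result)
        for name, value in base.items():
            base_cols.setdefault(name, []).append(value)
        for nest_name, fields in nested.items():
            # All fields of one nested column must have the same length per row.
            lengths = {len(values) for values in fields.values()}
            if len(lengths) != 1:
                raise ValueError(f"fields of '{nest_name}' differ in length")
            table = nested_cols.setdefault(nest_name, {"index": []})
            table["index"].extend([row_id] * lengths.pop())
            for field, values in fields.items():
                table.setdefault(field, []).extend(values)
    return base_cols, nested_cols


# Example with two rows shaped like the second complex_output above
# (values are made up for illustration).
results = [
    {"max_flux": 3.0, "quantiles": {"flux_quantiles": [1.0, 2.0]}},
    {"max_flux": 5.0, "quantiles": {"flux_quantiles": [4.0, 4.5]}},
]
base_cols, nested_cols = collect_rows(results)
```

Here `base_cols` holds one value per row for "max_flux", and `nested_cols["quantiles"]` holds the flattened "flux_quantiles" values alongside their parent row index. How such tables would actually be packed into a nested column is left open.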

Also consider #101 as an alternate interface.

Before submitting
Please check the following:

  • I have described the purpose of the suggested change, specifying what I need the enhancement to accomplish, i.e. what problem it solves.
  • I have included any relevant links, screenshots, environment information, and data relevant to implementing the requested feature, as well as pseudocode for how I want to access the new functionality.
  • If I have ideas for how the new feature could be implemented, I have provided explanations and/or pseudocode and/or task lists for the steps.

Metadata

Labels

enhancement: New feature or request
