Add an option to order by ascending/descending prediction in cumulative effect curves #204

HectorLira · 2022-07-25T22:23:13Z

Describe the feature and the current state.

In the causal validation module and the curves file, it would be useful to add an ascending parameter for the cumulative effect and cumulative gain curves.

Currently, rows are always sorted by prediction in descending order:

ordered_df = df.sort_values(prediction, ascending=False).reset_index(drop=True)

If we add an ascending: bool = False argument to cumulative_effect_curve, cumulative_gain_curve, relative_cumulative_gain_curve, and effect_curves, a user could choose whether these curves are computed with the prediction column sorted in ascending or descending order.

Will this change a current behavior? How?

Not unless the user explicitly sets ascending=True. In that case, the cumulative effect and cumulative gain curves will be computed with the prediction column sorted in ascending order.

A model's prediction is not necessarily positively related to the effect being measured, so an option to reverse the ordering allows effects and gains to be computed correctly when predictions and outcomes are negatively related.

One current workaround is to negate the prediction column:

df["prediction"] = -df["prediction"]

after which the curves are computed correctly. But this feels like a hack, and it is probably something we want to solve more cleanly.
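To make the equivalence concrete, here is a minimal sketch on a made-up four-row DataFrame (the column values are illustrative only). It suggests that negating the prediction and sorting descending yields the same row order as sorting the original prediction ascending, at least when there are no tied predictions; with ties, the relative order of the tied rows is not guaranteed to match.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "prediction": [0.2, 0.9, 0.5, 0.1],
    "outcome": [4, 1, 2, 5],
})

# Workaround: negate the prediction, then use the existing descending sort.
negated = df.assign(prediction=-df["prediction"])
workaround_order = negated.sort_values("prediction", ascending=False).index.tolist()

# Proposed behavior: sort the untouched prediction in ascending order.
proposed_order = df.sort_values("prediction", ascending=True).index.tolist()

# Both approaches visit the rows in the same order (no ties in this example).
same_order = workaround_order == proposed_order
```

The difference is that the proposed argument leaves the prediction column unmodified, so downstream code that reads the column still sees the model's original scores.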

Additional Information

The new definition of cumulative_effect_curve would look like this:

@curry
def cumulative_effect_curve(df: pd.DataFrame,
                            treatment: str,
                            outcome: str,
                            prediction: str,
                            min_rows: int = 30,
                            steps: int = 100,
                            effect_fn: EffectFnType = linear_effect,
                            ascending: bool = False) -> np.ndarray:
    """
    Orders the dataset by prediction and computes the cumulative effect curve according to that ordering

    Parameters
    ----------
    df : Pandas' DataFrame
        A Pandas' DataFrame with target and prediction scores.

    treatment : Strings
        The name of the treatment column in `df`.

    outcome : Strings
        The name of the outcome column in `df`.

    prediction : Strings
        The name of the prediction column in `df`.

    min_rows : Integer
        Minimum number of observations needed to have a valid result.

    steps : Integer
        The number of cumulative steps to iterate when accumulating the effect

    effect_fn : function (df: pandas.DataFrame, treatment: str, outcome: str) -> int or Array of int
        A function that computes the treatment effect given a dataframe, the name of the treatment column and the name
        of the outcome column.

    ascending : bool
        Whether the prediction column should be ordered ascending or not. Default is False.


    Returns
    ----------
    cumulative effect curve: Numpy's Array
        The cumulative treatment effect according to the predictions ordering.
    """

    size = df.shape[0]
    ordered_df = df.sort_values(prediction, ascending=ascending).reset_index(drop=True)
    n_rows = list(range(min_rows, size, size // steps)) + [size]
    return np.array([effect_fn(ordered_df.head(rows), treatment, outcome) for rows in n_rows])
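As a rough illustration of how the new argument might behave, the sketch below re-implements the proposed body as a standalone function with no dependency on the library. The names `cumulative_effect_curve_sketch` and `slope_effect` are hypothetical stand-ins (the latter is a simple regression slope used in place of `linear_effect`, whose exact implementation is not shown in this issue), and the data is synthetic, built so that the treatment effect shrinks as the prediction grows.

```python
import numpy as np
import pandas as pd


def slope_effect(df: pd.DataFrame, treatment: str, outcome: str) -> float:
    """Hypothetical stand-in for linear_effect: slope of outcome on treatment."""
    t = df[treatment].to_numpy(dtype=float)
    y = df[outcome].to_numpy(dtype=float)
    return np.cov(t, y)[0, 1] / np.var(t, ddof=1)


def cumulative_effect_curve_sketch(df, treatment, outcome, prediction,
                                   min_rows=30, steps=100,
                                   effect_fn=slope_effect, ascending=False):
    """Standalone copy of the proposed body, without @curry or library types."""
    size = df.shape[0]
    ordered_df = df.sort_values(prediction, ascending=ascending).reset_index(drop=True)
    n_rows = list(range(min_rows, size, size // steps)) + [size]
    return np.array([effect_fn(ordered_df.head(rows), treatment, outcome) for rows in n_rows])


# Synthetic data: the effect of the treatment DECREASES as the prediction grows,
# i.e. the prediction is negatively related to the effect.
prediction = np.arange(100, dtype=float)
treatment = np.tile([0, 1], 50)              # alternating control / treated rows
sensitivity = 1.0 - prediction / 100.0       # per-row treatment effect
data = pd.DataFrame({
    "prediction": prediction,
    "treatment": treatment,
    "outcome": sensitivity * treatment,
})

desc_curve = cumulative_effect_curve_sketch(
    data, "treatment", "outcome", "prediction", min_rows=20, steps=10)
asc_curve = cumulative_effect_curve_sketch(
    data, "treatment", "outcome", "prediction", min_rows=20, steps=10, ascending=True)
```

With this data, the ascending curve starts from the rows with the largest effect, while the default descending curve starts from the rows with the smallest effect. Both curves end at the same value because the last point always covers the whole dataset.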

HectorLira added the enhancement New feature or request label Jul 25, 2022

MarianaBlaz linked a pull request Dec 20, 2022 that will close this issue: Add ascending parameter causal validation #220 (Open, 3 tasks)
