Revised metrics proposal #15
Thanks for putting this together! This is very thorough and explains the difficulties very well. I hope my comments make sense :-)
api/Updated-Metrics.md (outdated):

We would provide operations to calculate differences in various ways (all of these results are a Pandas Series):
```python
>>> result.differences()
```
Very confusing to me. Defaults to `relative_to="max"`? Why not let the user specify? It reads a lot better with the explicit specification as well.
Do you mean you'd like to have no default at all or a different one than "max"?
The user can specify, but I believe doing relative to `max` matches our current behaviour. I would be happy with not having a default too, if that was the consensus.
Our current behavior is that we are taking differences relative to the minimum, not the maximum (that's why all of our differences are positive). That's how I wrote the proposal here. In other words, the "relative to" part is whatever you subtract from all the group-level metric values.
But maybe it's confusing? An alternative term could be `origin`. Another idea would be to say `from` instead of `relative_to`?
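To make those semantics concrete, here is a minimal sketch of the behaviour described in this thread; the helper below is purely illustrative, not the proposed implementation:

```python
import pandas as pd

by_group = pd.Series({'group_a': 0.60, 'group_b': 0.75})

def differences(values, relative_to='min'):
    # 'relative_to' selects the reference value that is subtracted from
    # every group-level metric value (assumed names, illustration only)
    reference = {'min': values.min(), 'max': values.max()}[relative_to]
    return values - reference

differences(by_group, relative_to='min')
# group_a    0.00
# group_b    0.15    <- all non-negative, matching the current behaviour
```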
api/Updated-Metrics.md (outdated):
```python
>>> result.differences(relative_to='overall', aggregate='max')
0.1870
>>> result.differences(relative_to='overall', abs=True, aggregate='max')
0.2436
```
Why is this different from the line above with no `abs`? The value was positive, so that should be identical, right?
Without `abs` the values were -0.2436 and 0.1870, so the `max` aggregate returns 0.1870. With `abs` the values are 0.2436 and 0.1870, so it returns 0.2436.
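The arithmetic, spelled out:

```python
diffs = [-0.2436, 0.1870]
max(diffs)                   # 0.1870: -0.2436 loses the comparison
max(abs(d) for d in diffs)   # 0.2436: the sign is stripped first
```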
api/Updated-Metrics.md (outdated):

There would be a similar method called `ratios()` on the `GroupedMetric` object:
```python
>>> result.ratios()
```
same thing, don't like the default at all :-(
And the same from me: I'm happy to change if there's agreement that requiring the argument makes more sense.
api/Updated-Metrics.md (outdated):

- `ratio_order=` determines how to build the ratio. Values are
  - `sub_unity` to make the larger value the denominator
  - `super_unity` to make the larger value the numerator
  - `from_relative` to make the value specified by `relative_to=` the denominator
Both this line and the next say `relative_to`; is that intended?
I am a bit worried about the complexity of the `ratio_order` argument, especially for new users.
Yes. The change is between numerator and denominator.
Is this a question of naming, or whether we need all the cases? I'm not overly happy with the names myself, but I do think that all four cases might be useful.
A bit of both, I guess... but I think you're right that you need all four cases.
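For reference, the four orderings under discussion could look like the sketch below; `to_relative` is a guessed name for the fourth option, since the quoted list above is truncated:

```python
hi, lo = 0.8, 0.6   # larger and smaller group-level metric values
ref = 0.8           # the value selected by relative_to=

sub_unity = lo / hi       # 0.75, always <= 1
super_unity = hi / lo     # 1.33..., always >= 1
from_relative = lo / ref  # relative_to value as the denominator
to_relative = ref / lo    # relative_to value as the numerator (assumed name)
```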
api/Updated-Metrics.md (outdated):

This would make the following all equivalent:
```python
>>> r1 = group_summary(sklearn.metrics.accuracy_score, y_true, y_pred, sensitive_features=A_sex)
>>> r2 = group_summary('accuracy_score', y_true, y_pred, sensitive_features=A_sex)
```
I presume this maps to the sklearn metrics? What if I want to define my own metric? Will that also work (although not with the string, of course)?
Let me add an 'in addition' to the introductory sentence.
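As a sketch of what that would mean in practice (assuming the `metric(y_true, y_pred)` callable convention used elsewhere in this proposal):

```python
# A hypothetical user-defined metric; any such callable should be usable
# wherever group_summary() accepts a function, just not via a string name
def fraction_positive(y_true, y_pred):
    return sum(y_pred) / len(y_pred)

r3 = group_summary(fraction_positive, y_true, y_pred, sensitive_features=A_sex)
```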
api/Updated-Metrics.md (outdated):

The question is whether users prefer:
```python
>>> diff = group_summary(skm.recall_score, y_true, y_pred, sensitive_features=A).difference(aggregate='max')
```
Obviously I'm just one data point, but I strongly prefer this over the version below.
+1
Indeed - people were leaning that way, but since there's always the question of 'Do we really need a new concept?' I just wanted to call this out.
Thanks for the proposal. I think it's a good step. I'd wrap the lines though 🙈
api/Updated-Metrics.md (outdated):
```python
y_true = [0, 1, 0, 0, 1, 1, ...]
y_pred = [1, 0, 0, 1, 0, 1, ...]
```
Since we're dealing with metrics, should we separate floating-point predictions (`decision_function` or `predict`) from discrete ones? (Assuming some metrics need soft values and some work with hard values.)

Maybe simply add:
```python
# or soft values, depending on what the underlying metric expects
y_pred = [1, 0, 0, 1, 0, 1, ...]
```
I don't think that we want separate `group_summary()` functions for different types of underlying metrics, but I agree about the need to highlight that we just slice up the `y_pred` list regardless of the types of its items, if that's what you meant. I was trying to write that below, but I'll make it more explicit.
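A minimal sketch of that slicing behaviour (illustrative only; the real implementation details are not part of this proposal):

```python
import numpy as np

# y_pred is split purely by group membership; the elements could equally
# be hard labels or soft scores, since they are never inspected
y_pred = np.asarray([0.9, 0.2, 0.4, 0.8])
groups = np.asarray(['a', 'a', 'b', 'b'])

for g in np.unique(groups):
    print(g, y_pred[groups == g])  # each metric call sees only this slice
```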
api/Updated-Metrics.md
Outdated
weights = [ 1, 2, 3, 2, 2, 1, ...] | ||
``` | ||
|
||
We actually seek to be very agnostic as to the contents of the `y_true` and `y_pred` arrays. |
agnostic in what way? I think the proposal later explains this, but it may be easier to follow if that's moved up here.
I was meaning in exactly the same way as your suggested comment. Let me rephrase.
api/Updated-Metrics.md (outdated):

We propose adding an extra argument to `differences()` and `ratios()`, to provide a `condition_on=` argument.

Suppose we have a DataFrame, `A_3`, with three sensitive features: Sex, Race and Income Band (the latter having values 'Low' and 'High').
This could represent a loan scenario where discrimination based on income is allowed, but within the income bands, other sensitive groups must be treated equally.
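A hedged sketch of how that might be invoked, using the names from the quoted passage (the exact signature is still under discussion):

```python
result = group_summary(skm.recall_score, y_true, y_pred, sensitive_features=A_3)

# Differences computed only within each income band, never across bands
diffs = result.differences(condition_on='Income Band')
```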
Or use abstract names and avoid mentioning `sex`, `race` or `income` altogether.
api/Updated-Metrics.md (outdated):

This should generalise to the other methods described above.

One open question is how extra arguments should be passed to the individual metric functions, including how to handle the `indexed_params=`.
you can look at https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest/slep006/proposal.html and scikit-learn/scikit-learn#16079 to get some ideas.
If I'm reading that correctly, we would ditch the `**kwargs` and have an explicit `props=` dictionary, and probably rename `indexed_params=` to `sample_props=`. When evaluating multiple metrics at once, I still think we'd need a list of dictionaries and a list of lists for these, though.
api/Updated-Metrics.md (outdated):

We do not intend to change the API invoked by the user.
What will change is the return type.
Rather than a `Bunch`, we will return a `GroupedMetric` object, which can offer richer functionality.
I'm not sure whether we're already at the stage of discussing naming, but since we decided to discard `.eval()`, we might want to think about a name that better reflects that the class currently called `GroupedMetric` contains results/scores. The current name suggests that the class corresponds to some grouped metric itself rather than to results/scores. Similarly, I'm not sure about `group_summary()`, because in my view `GroupedMetric` is not really a summary? My suggestion would be to come up with a few alternatives and ask some people who are not as deeply involved with the metrics API so far which one they find most intuitive.
This is a great time to be discussing naming, I think..... I'll see if I can come up with a few possibilities.
Meeting 2020-09-08

Thanks to all for attending today. The main agreement is that we should think about a 'data exploration' interface (provided by the object) and a 'scorer' interface (provided by wrapper functions for the objects). More minor points:

Next Steps

I will take @MiroDudik's list of scenarios from above, and start to rough out how things might look in each API:

One thing which had been confusing me from before: these are not all expected to be one-liners, but may well be more complex. The idea is to identify which pieces are relevant to the metrics API, and whether we like how that looks.
Meeting 2020-09-14

Thank you @hildeweerts @adrinjalali @MiroDudik and @amueller for our meeting today. We agreed that:

It also sounded like we should keep the current set of functions built with … According to @adrinjalali, the PR for SLEP006 Option 4 will probably be in the release of scikit-learn next April. We should aim to be compatible with that in our scorer objects.

Next Steps

I will update this proposal based on the above points. I will also start a rough implementation of `GroupedMetric`.
Meeting 2020-09-21

Thanks to @adrinjalali @hildeweerts @amueller and @MiroDudik for our meeting today. Points agreed:

Still TBD:

Yet to be discussed:
I was under the impression we're okay with restricting the metrics' signature and asking users to use `functools.partial`.
Meeting 2020-09-18

Thanks to @adrinjalali @hildeweerts and @MiroDudik for our meeting today. We started by discussing the `overall` property:

```python
>>> target.overall
         recall_score  precision_score
overall      0.733333         0.717391
```

The motivation behind this decision was to ensure that we have a consistent type returned. However, doing so comes at the cost of introducing a 'magic' row name of 'overall'. Making this choice also gives some trouble doing operations such as `target.by_group - target.overall`:

```python
         recall_score  precision_score
overall           NaN              NaN
pp                NaN              NaN
q                 NaN              NaN
```

The intended operation in this case is actually:

```python
>>> target.by_group - target.overall.iloc[0]
      recall_score  precision_score
SF 0
pp       0.0057971      -0.00905797
q      -0.00606061       0.00988142
```

Continuing with the `to_overall` case:

```python
>>> target.difference(method='to_overall')
         recall_score  precision_score
overall    0.00606061       0.00988142

>>> (target.by_group - target.overall.iloc[0]).abs().max()
recall_score       0.006061
precision_score    0.009881
dtype: float64
```

These are not a problem when there are conditional features, which ensure that there are matching portions of the DataFrame index. Together, these suggest that when there are no conditional features, the return type of … We agreed that for now, we should drop the …

When specifying multiple functions in the `GroupedMetric` constructor:

```python
fns = [skm.recall_score, skm.precision_score, skm.fbeta_score]
sample_params = [{'sample_weight': wgts},
                 {'sample_weight': wgts},
                 {'sample_weight': wgts}]

result = metrics.GroupedMetric(fns, Y_test, Y_pred,
                               sensitive_features=...,
                               conditional_features=...,
                               sample_params=sample_params)
```

We agreed that using dictionaries instead of arrays would make more sense. Something like:

```python
fns = {
    'recall': skm.recall_score,
    'prec': skm.precision_score,
    'fbeta_0.5': functools.partial(skm.fbeta_score, beta=0.5)
}

sample_params = {
    'recall': {'sample_weight': wgts},
    'prec': {'sample_weight': wgts},
    'fbeta_0.5': {'sample_weight': wgts}
}
```

There is one question as to whether …

Now that the API is pretty much finalised, I will open a new issue to discuss the naming.
Meeting 2020-10-06

Thanks to @adrinjalali @hildeweerts and @MiroDudik for our discussion today. We had a long talk about … One question which did arise: how to specify the … Related to these, although not essential for API design: do we provide all the …

We agreed to keep 'empty' combinations of features in the result tables as NaNs; users can easily drop those rows if they want. In future, we might add an optional argument to drop them when the objects are created.

When it comes to naming, we still have issue #17. I have the impression that no-one particularly likes `GroupedMetric`.

Next steps:
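The 'drop those rows' step is a one-liner for the user; a sketch, assuming `by_group` is the pandas DataFrame discussed above:

```python
# Rows for empty feature combinations are NaN and can simply be dropped
populated = result.by_group.dropna()
```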
Meeting 2020-10-12

Thanks to @MiroDudik and @adrinjalali for our meeting today. We agreed on the following:

We also discussed …

Next steps:
Personally, I would even suggest that …
Meeting 2020-10-19

Thanks to @adrinjalali, @hildeweerts and @MiroDudik for our meeting today. We were reviewing the current active PR implementing the new metrics API.

For the PR itself, the following issues were raised:

@hildeweerts raised other issues with the documentation format, pointing out that the entire … @MiroDudik and @riedgar-ms will work to resolve the issues raised above, and get the PR merged. @adrinjalali and @hildeweerts will also review, but the merge will not be delayed for that (we can make further tweaks next week).
Meeting 2020-10-26

Thanks to @hildeweerts @adrinjalali and @MiroDudik for another productive meeting. We discussed the following:

What to do with control_levels when there are none

If the user wants to do their own manipulations of the … We decided that …

Whether to allow strings for level names

This is related to the previous point. The …

How to handle single callables

@MiroDudik pointed out that if a user makes a call such as:

```python
acc = MetricsFrame(skm.accuracy_score, y_true, y_pred, sensitive_features=A)
```

then having to write things like

```python
acc.overall['accuracy_score']
acc.by_group['accuracy_score']['group_0']
acc.group_max()['accuracy_score']
```

is tedious for people messing around with notebooks. It would be good to drop the extra redirection. However, @riedgar-ms was concerned that if a more general library is attempting to consume Fairlearn in a general way, then any logic we put in to handle the single-function case would then have to be undone by that higher-level library (introducing maintenance overhead and places for bugs to hide).

We agreed that if the user passes in a dictionary of callables, then we keep the same behaviour, even if there's only one entry in the dictionary. However, if the user passes in a callable, then DataFrames become Series, and Series become scalars, so that users don't have to keep typing `['accuracy_score']`.

The initial implementation of this is likely to be messy (at this point in time, just set a flag and process all output before it's returned to the user). However, @riedgar-ms will provide a full set of tests, so that the internals can be improved in the future.
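A sketch of the agreed behaviour (the class and argument names follow the discussion above; 'group_0' is illustrative):

```python
# Bare callable: one level of indexing disappears
acc = MetricsFrame(skm.accuracy_score, y_true, y_pred, sensitive_features=A)
acc.overall              # scalar
acc.by_group['group_0']  # scalar

# Dictionary, even with a single entry: DataFrame behaviour is kept
accd = MetricsFrame({'accuracy_score': skm.accuracy_score},
                    y_true, y_pred, sensitive_features=A)
accd.overall['accuracy_score']  # Series indexed by metric name
```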
Meeting 2020-11-02

Thank you to @adrinjalali @hildeweerts and @MiroDudik for another productive discussion. We agreed that the best way forward is to release what we have as … We also discussed …
Create a fresh proposal for metrics, based on discussions in meetings, and focusing on 'what does the user type?' rather than the implementation.