---
title: The grammar of interactive explanatory model analysis
#subtitle: Landing Page
#date: "`r Sys.Date()`"
output: rmdformats::html_docco
---
<!-- Place this tag in your head or just before your close body tag. -->
<script async defer src="https://buttons.github.io/buttons.js"></script>
<style type="text/css">
  .main-container {
    max-width: 1250px;
  }
  @media (min-width: 992px) {
    .col-md-offset-1 {
      margin-left: 0%;
    }
    .col-md-10 {
      width: 100%;
    }
  }
  h1.title {
    text-transform: none !important;
  }
  .r2d3 {
    position: relative !important;
  }
  a {
    color: #337ab7;
  }
</style>

## Paper

Hubert Baniecki, Dariusz Parzych, Przemyslaw Biecek.
*The grammar of interactive explanatory model analysis*.
**Data Mining and Knowledge Discovery**, 2023. https://doi.org/10.1007/s10618-023-00924-w

## Abstract

*The growing need for in-depth analysis of predictive models leads to a series of new methods for explaining their local and global properties. Which of these methods is the best? It turns out that this is an ill-posed question. One cannot sufficiently explain a black-box machine learning model using a single method that gives only one perspective. Isolated explanations are prone to misunderstanding, leading to wrong or simplistic reasoning. This problem is known as the Rashomon effect and refers to diverse, even contradictory, interpretations of the same phenomenon. Surprisingly, most methods developed for explainable and responsible machine learning focus on a single aspect of the model behavior. In contrast, we showcase the problem of explainability as an interactive and sequential analysis of a model. This paper proposes how different Explanatory Model Analysis (EMA) methods complement each other and discusses why it is essential to juxtapose them. The introduced process of Interactive EMA (IEMA) derives from the algorithmic side of explainable machine learning and aims to embrace ideas developed in cognitive sciences. We formalize the grammar of IEMA to describe potential human-model interaction. It is implemented in a widely used human-centered open-source software framework that adopts interactivity, customizability and automation as its main traits. We conduct a user study to evaluate the usefulness of IEMA, which indicates that an interactive sequential analysis of a model may increase the accuracy and confidence of human decision making.*
<img src="https://raw.githubusercontent.com/MI2DataLab/iema/main/figures/blackbox.png" alt="Black-Box" style="max-width:75%;">

## IEMA

<img src="https://raw.githubusercontent.com/MI2DataLab/iema/main/figures/iema.png" alt="The Grammar of Interactive Explanatory Model Analysis" style="max-width:75%;">
<img src="https://raw.githubusercontent.com/MI2DataLab/iema/main/figures/long.gif" alt="modelStudio.gif" style="max-width:99%;">
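
The figures above summarize the grammar of IEMA: explanations are not consumed in isolation but chained into a sequential analysis of the model. As a minimal sketch of one such path written in plain `DALEX` code (this is not code from the paper; it assumes the `explainer` object built in the Dashboard chunk below, and the variable names are only examples from the FIFA 20 data):

```r
library(DALEX)

# a single observation to analyse; assumes the FIFA 20 data shipped with DALEX
player <- fifa["R. Lewandowski", ]

# 1. local attribution: which variables drive this one prediction?
bd <- predict_parts(explainer, new_observation = player, type = "break_down")
plot(bd)

# 2. what-if analysis: how would the prediction change if a variable changed?
cp <- predict_profile(explainer, new_observation = player)
plot(cp, variables = c("age", "movement_reactions"))

# 3. global view: does the model rely on these variables in general?
vi <- model_parts(explainer)
plot(vi)
```

Each step answers a question raised by the previous one, which is exactly the kind of juxtaposition that `modelStudio` automates in a single dashboard.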

## User study

<img src="https://raw.githubusercontent.com/MI2DataLab/iema/main/figures/table4.png" style="max-width:75%;">
<img src="https://raw.githubusercontent.com/MI2DataLab/iema/main/figures/table5.png" style="max-width:75%;">
<img src="https://raw.githubusercontent.com/MI2DataLab/iema/main/figures/figure14.png" style="max-width:75%;">

## Dashboard

The dashboard below is generated by a hidden R chunk: it fits a `gbm` model predicting (log) player value on the FIFA 20 data from `DALEX`, wraps it in an explainer, and builds an interactive `modelStudio` dashboard for selected players.

```{r echo=FALSE, warning=FALSE, message=FALSE}
library(DALEX)

data <- fifa
data$wage_eur <- data$overall <- data$potential <- data$nationality <- NULL
data$value_eur <- log10(data$value_eur)

set.seed(1313)
library(gbm)

# make a model
model <- gbm(
  value_eur ~ .,
  data = data,
  n.trees = 300,
  interaction.depth = 4,
  distribution = "gaussian"
)

# wrap the model into an explainer
explainer <- DALEX::explain(
  model,
  data = data[, -1],
  y = 10^data$value_eur,
  predict_function = function(m, x) 10^predict(m, x, n.trees = 300),
  label = "gbm",
  verbose = FALSE
)

library(modelStudio)

# use parallelMap to speed up the computation
options(
  parallelMap.default.mode = "socket",
  parallelMap.default.cpus = 4,
  parallelMap.default.show.info = FALSE
)

# pick observations
fifa_selected <- data[1:40, ]
fifa_selected <- rbind(
  fifa_selected["R. Lewandowski", ],
  fifa_selected[rownames(fifa_selected) != "R. Lewandowski", ]
)

# make a studio for the model
iema_ms <- modelStudio(
  explainer,
  new_observation = fifa_selected,
  B = 50, # raised for more detailed results, could be 15
  parallel = TRUE,
  rounding_function = signif, digits = 5,
  options = ms_options(
    # show_boxplot = FALSE,
    margin_left = 160,
    margin_ytitle = 100,
    ms_title = "Interactive Studio for GBM model on FIFA-20 data"
  )
)

iema_ms
```
Created using **modelStudio**: https://github.com/ModelOriented/modelStudio
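
The studio is rendered as an `r2d3` HTML widget, so it can also be written out as a standalone HTML file and shared outside of an R Markdown document. A minimal sketch (assuming the `iema_ms` object from the chunk above; the file name is arbitrary):

```r
library(r2d3)

# write the interactive dashboard to an HTML file that opens in any browser
save_d3_html(iema_ms, file = "iema-dashboard.html")
```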

## References

For a description of the Interactive EMA process, refer to our [DMKD article](https://doi.org/10.1007/s10618-023-00924-w):

```
@article{baniecki2023grammar,
  title   = {The grammar of interactive explanatory model analysis},
  author  = {Hubert Baniecki and Dariusz Parzych and Przemyslaw Biecek},
  journal = {Data Mining and Knowledge Discovery},
  year    = {2023},
  pages   = {1--37},
  url     = {https://doi.org/10.1007/s10618-023-00924-w}
}
```

If you use `modelStudio`, please cite our [JOSS article](https://joss.theoj.org/papers/10.21105/joss.01798):

```
@article{baniecki2019modelstudio,
  title   = {{modelStudio: Interactive Studio with Explanations for ML Predictive Models}},
  author  = {Hubert Baniecki and Przemyslaw Biecek},
  journal = {Journal of Open Source Software},
  year    = {2019},
  volume  = {4},
  number  = {43},
  pages   = {1798},
  url     = {https://doi.org/10.21105/joss.01798}
}
```