<!DOCTYPE HTML>
<!--
Spectral by HTML5 UP
html5up.net | @ajlkn
Free for personal and commercial use under the CCA 3.0 license (html5up.net/license)
-->
<html>
<head>
<title>Systems Engineering HCI</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<!--[if lte IE 8]><script src="assets/js/ie/html5shiv.js"></script><![endif]-->
<link rel="stylesheet" href="assets/css/main.css" />
<!--[if lte IE 8]><link rel="stylesheet" href="assets/css/ie8.css" /><![endif]-->
<!--[if lte IE 9]><link rel="stylesheet" href="assets/css/ie9.css" /><![endif]-->
</head>
<body>
<!-- Page Wrapper -->
<div id="page-wrapper">
<!-- Header -->
<header id="header">
<h1><a href="index.html">Systems Engineering </a></h1>
<nav id="nav">
<ul>
<li class="special">
<a href="#menu" class="menuToggle"><span>Menu</span></a>
<div id="menu">
<ul>
<li><a href="index.html">Overview</a></li>
<li><a href="Requirements2.html">Requirements</a></li>
<li><a href="Research2.html">Research</a></li>
<li><a href="HCI.html">HCI</a></li>
<li><a href="Design.html">Design</a></li>
<li><a href="Testing.html">Testing</a></li>
<li><a href="Evaluation2.html">Evaluation</a></li>
<li><a href="Management.html">Management</a></li>
</ul>
</div>
</li>
</ul>
</nav>
</header>
<!-- Main -->
<article id="main">
<header>
<h2>Evaluation</h2>
</header>
<section class="wrapper style5">
<div class="inner">
<section>
<h4>Summary of Achievements</h4>
<p>
Over the past six months, we implemented all of the “must have” and “should have” requirements. Due to time constraints, we completed only one of the two “could have” requirements.
</p>
<p>
The feature we were unable to implement is the following: “The chatbot is conversational and user can interact with it in a natural human-like way”. Had this feature been implemented, the bot would have been able to hold dialogues with the user by remembering what the user had previously said.
</p>
<section>
<img src="images/achievements.jpg" style="margin-left: 10%;width: 80%;">
</section>
<hr />
<h4>A List of Known Bugs</h4>
<p>
We have found one known bug, which affects only one type of question: the bot sometimes asks again for information that the user has already given. When the bot is asked a question about the energy price that includes a date and time, it sometimes fails to extract the date entity and asks the user to enter it again. We believe this happens because the bot struggles to tell whether a number (for example, 18) is the day of the month or the hour (18 o’clock). We trained the bot further to help it distinguish the hour from the day of the month, but it still fails occasionally.
</p>
<section>
<div class="table-wrapper">
<table>
<thead>
<tr>
<th>ID</th>
<th>Bug Description</th>
<th>Priority</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>When asked a complete question about energy price, the bot still asks for the date and time</td>
<td>Medium</td>
</tr>
</tbody>
</table>
</div>
</section>
<hr />
<h4>Individual Contribution Table</h4>
<section>
<img src="images/individual.jpg" style="margin-left: 10%;width: 80%;">
</section>
<hr />
<h4>Critical evaluation of the project</h4>
<hr />
<h4>Chat Bot Evaluation</h4>
<h5>
User interface/User experience
</h5>
<p>
We believe that our bot will have a great impact on the users of Renewables.AI once it is integrated into the online platform. Our main focus was to make the bot use concise and simple vocabulary so that any user could use it without encountering language difficulties. Moreover, the bot responds to every question, and it informs the user if it cannot understand the question or if it needs additional information.
</p>
<p>
It is also important to note that the bot will be the first thing on the web app that interacts with the user, which should trigger their curiosity; they may even start using it without knowing exactly what it does. Once users understand what the bot can do for them, they may be inclined to come back for future conversations.
</p>
<section>
<img src="images/bot1.png" style="margin-left: 20%;width: 60%;">
</section>
<br>
<p>
As a result, having this chat bot on the platform will likely increase the average session length of the users on the platform. The bot could be very helpful for the users who want to quickly find out a specific piece of information because the user would not be required to browse through the platform to find it. Moreover, the bot’s ability to display graphs gives the user a clear insight into their plant’s performance data by showing trends and other important statistics.
</p>
<section>
<img src="images/bot2.png" style="margin-left: 20%;width: 60%;">
</section>
<br>
<h5>
Functionality
</h5>
<p>
The bot performs well: it understands the users’ requests and responds to them accordingly. It can ask follow-up questions whenever it does not recognise a request, and it can also access the database to get the right data and display it to the user in a meaningful manner. The conversation between the bot and the user cannot yet be categorised as natural, human-like conversation, since the chatbot’s main purpose is to aid the user with information. This component performed well in the performance and stress tests we conducted.
</p>
<h5>
Maintainability
</h5>
<p>
In terms of the maintenance cost for the bot, we calculated the approximate cost of keeping the chatbot and the “Renewables.AI” webapp running on Azure.
</p>
<p>
For the chatbot, we found that the owners of Renewables.AI would have to spend approximately 250 pounds per month on the standard package in order to deliver a responsive chatbot to the public. This package includes unlimited messages and hosting on Azure.
</p>
<p>
Moreover, in order to host the webapp on Azure, making sure that the confidential data is secured and that the application runs effectively, the owners would be required to spend approximately 550 pounds per month. That would include 250GB of data storage for the user data as well as solar energy information such as the data received from the DarkSky API.
</p>
<p>
The total maintenance cost of approximately 800 pounds per month should be low for the owners, since we expect the application to be profitable and numerous solar farm owners to invest in it in order to help their businesses grow.
</p>
<hr />
<h4>
Project Management
</h4>
<p>
Our team delivered the key requirements of the project to our clients on time. We used the Agile methodology, focusing on team collaboration and emphasising incremental delivery. We used Messenger as a communication channel, where we updated each other on the current status of our tasks and asked for help whenever needed. We also had a shared Google Drive folder where we kept important documents such as bi-weekly reports, project diagrams and video scripts. In addition, we had a Microsoft Teams channel with our clients, where we arranged Skype or in-person meetings and uploaded documents and reports on our work. We used GitHub as a version control hosting service, where we kept our code in private repositories.
</p>
<hr />
<h4>Data Science Model Evaluation</h4>
<p>
The data science part of the project involved predicting the hourly energy output of solar energy plants for the next 48 hours, given 2 years of past hourly irradiance and energy data for a given plant as well as an hourly irradiance forecast for the next 48 hours.
</p>
<p>
We considered models from two categories of forecasting methods:
</p>
<ul>
<li>
Time series methods
</li>
<li>
Regression analysis.
</li>
</ul>
<p>
For the time series methods we considered:
</p>
<ul>
<li>
ARIMA
</li>
<li>
Prophet
</li>
</ul>
<p>
Whereas for regression analysis we considered:
</p>
<ul>
<li>
Linear regression
</li>
<li>
Random Forest regression
</li>
</ul>
<p>
We decided to evaluate and compare the performance of all 4 models above.
</p>
<p>
Before evaluating the performance of each model, we explored the data further so that we could better understand the potential issues the models might face.
</p>
<p>
The graph below shows the irradiance (kWh/m^2) against time for a 2 year time period for plant 3:
</p>
<section>
<img src="images/ds1.png" style="margin-left: 20%;width: 60%;">
</section>
<p>
The graph below shows the energy output (kWh) against time for a 2 year time period for plant 3:
</p>
<section>
<img src="images/ds2.png" style="margin-left: 20%;width: 60%;">
</section>
<br>
<p>
The two graphs clearly indicate that there is a strong positive linear correlation between the irradiance at a plant and the energy output of that plant. In fact, we calculated that the Pearson product-moment correlation coefficient (PPMCC) for the two variables was 0.981. The graph below shows the first 2 days of the data above, with the irradiance plotted on the same graph as the energy output.
</p>
<section>
<img src="images/ds3.png" style="margin-left: 20%;width: 60%;">
</section>
<br>
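<p>
As an illustration of how a coefficient like the one quoted above can be computed, here is a minimal sketch in plain Python. The function name <code>pearson_r</code> and the sample readings are hypothetical; they are not taken from the project code or data set.
</p>

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical hourly readings for one day (not project data).
irradiance = [0.0, 0.1, 0.4, 0.7, 0.9, 0.6, 0.2, 0.0]
energy = [0.0, 55.0, 210.0, 380.0, 470.0, 330.0, 100.0, 0.0]
r = pearson_r(irradiance, energy)
```

<p>
A value of <code>r</code> close to 1 indicates a strong positive linear relationship, which is the pattern described above for irradiance and energy output.
</p>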
<p>
Given the strong correlation between the two variables, we initially thought that a regression-based approach to forecasting would produce the best results. However, after thinking more deeply about the problem domain, we realised that a regression-based approach would not always do so. The reason can be seen in the two-year graphs above, in the period just before July, where the irradiance is consistently high yet the energy output of the solar farm is 0. This indicates that the solar farm had an outage; perhaps it went down for maintenance work or broke unexpectedly.
</p>
<p>
During this period the correlation between irradiance and energy output breaks down, which means that a regression-based method that considers only the irradiance at the plant will produce inaccurate forecasts for the plant. By contrast, a time series method that is based on temporal dependencies and past energy values can detect the plant outage and outperform a regression-based method. Moreover, a time series method does not need to be completely retrained whenever new data arrives, whereas a regression-based model does, which is inefficient.
</p>
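<p>
The effect of an outage on an irradiance-only model can be sketched as follows. This is an illustrative example under stated assumptions, not the project's implementation: the helper names, the <code>min_irradiance</code> threshold, the <code>kwh_per_unit</code> factor and the readings are all hypothetical.
</p>

```python
def outage_hours(irradiance, energy, min_irradiance=0.2):
    """Return the indices of hours with sunlight but no energy output."""
    return [
        i
        for i, (irr, out) in enumerate(zip(irradiance, energy))
        if irr >= min_irradiance and out == 0
    ]


def irradiance_only_forecast(irradiance, kwh_per_unit):
    """A regression that sees only irradiance: output proportional to it."""
    return [kwh_per_unit * irr for irr in irradiance]


# Hypothetical readings: the plant goes down from hour 2 onwards.
irradiance = [0.0, 0.5, 0.8, 0.8, 0.5, 0.0]
energy = [0.0, 250.0, 0.0, 0.0, 0.0, 0.0]

flagged = outage_hours(irradiance, energy)
forecast = irradiance_only_forecast(irradiance, kwh_per_unit=500.0)
# Absolute error per hour: large exactly where the outage occurs.
errors = [abs(f - e) for f, e in zip(forecast, energy)]
```

<p>
In this sketch the irradiance-only forecast stays high during the flagged hours because nothing in its input signals the outage, whereas a method that also looks at recent energy values has the zeros available to it.
</p>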
<h5>Model Evaluation of Linear Regression vs Random Forest Regression</h5>
<p>
In statistics, the coefficient of determination, denoted R<sup>2</sup>, is the proportion of the variance in the dependent variable that is predictable from the independent variable(s). It measures how closely the predicted values track the actual values: the closer the value is to 1, the more of the variance in the actual values the model explains. However, a high R<sup>2</sup> does not guarantee that the model will perform well on new data, because it could have overfitted the data provided.
</p>
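<p>
For reference, the coefficient of determination and the mean absolute error (MAE) reported in this section can be computed as in the minimal sketch below. The function names and the sample values are hypothetical and are not taken from the project code or data.
</p>

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical energy values (not project data).
actual = [0.0, 100.0, 300.0, 400.0, 200.0]
predicted = [10.0, 90.0, 310.0, 380.0, 210.0]

r2 = r_squared(actual, predicted)
mae = mean_absolute_error(actual, predicted)
```

<p>
Computing both metrics on held-out test data rather than on the training data is what guards against the overfitting concern mentioned above.
</p>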
<p>
The average R<sup>2</sup> value for Linear regression across all 5 plants was 0.974, whereas the average R<sup>2</sup> value for Random Forest regression across all 5 plants was 0.986.
The mean absolute error for Linear regression was 423.2, whereas the mean absolute error for Random Forest regression was 321.4. The diagram below compares the out-of-the-box performance of the two regression types for a 48-hour prediction using test data. Blue is Linear regression, red is Random Forest regression, and green is the actual energy output.
</p>
<section>
<img src="images/ds4.png" style="margin-left: 20%;width: 60%;">
</section>
<br>
<p>
This suggests that the Random Forest regression algorithm fits the data better and is likely to produce higher-quality forecasts than the Linear regression algorithm. However, before drawing this conclusion from the metrics and the graph above, we plotted the residuals of the Random Forest algorithm, as shown below.
</p>
<section>
<img src="images/ds5.png" style="margin-left: 20%;width: 60%;">
</section>
<br>
<p>
The density plot of the residuals is shown below.
</p>
<section>
<img src="images/ds6.png" style="margin-left: 20%;width: 60%;">
</section>
<br>
<p>
This close-to-normal distribution of the residuals suggests that the Random Forest model has low bias. The Random Forest regression model therefore fits the data very well, with low variance and low bias, and should outperform the Linear regression model.
</p>
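<p>
A residual check of this kind can also be summarised numerically. The sketch below is one possible way to do so, assuming residuals are defined as actual minus predicted; the function name and the sample values are hypothetical. A mean near zero is consistent with low bias, and a skewness near zero is consistent with a symmetric, bell-like shape.
</p>

```python
import statistics


def residual_summary(actual, predicted):
    """Mean, population standard deviation and skewness of the residuals."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    mean = statistics.fmean(residuals)
    stdev = statistics.pstdev(residuals)
    if stdev == 0:
        skew = 0.0
    else:
        # Third standardised moment: 0 for a perfectly symmetric sample.
        skew = sum(((r - mean) / stdev) ** 3 for r in residuals) / len(residuals)
    return {"mean": mean, "stdev": stdev, "skew": skew}


# Hypothetical values (not project data).
actual = [100.0, 200.0, 300.0, 400.0, 500.0]
predicted = [110.0, 190.0, 300.0, 410.0, 490.0]
summary = residual_summary(actual, predicted)
```

<p>
These numbers complement the plots rather than replace them, since a density plot can reveal features such as multiple peaks that a mean and a skewness value hide.
</p>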
<h5>Comparison of Time Series Methods - ARIMA vs Prophet</h5>
<p>
The graphs below show the average forecast given by the ARIMA model, followed by the actual values.
</p>
<section>
<img src="images/ds7.png" style="margin-left: 20%;width: 60%;">
</section>
<br>
<section>
<img src="images/ds8.png" style="margin-left: 20%;width: 60%;">
</section>
<br>
<p>
The parameters used by the ARIMA model were selected automatically by the R forecast package using auto.arima. ARIMA was able to identify the general shape of the data and produce a reasonable forecast. However, its mean absolute error (3534.2), calculated using a rolling window over the dataset, was significantly greater than that of the regression-based methods.
</p>
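<p>
The rolling window evaluation mentioned above can be sketched as follows. This is a minimal illustration, not the procedure we ran: the window sizes, the step and the data are hypothetical, and a simple seasonal-naive forecaster stands in for ARIMA so that the example stays self-contained.
</p>

```python
def seasonal_naive(history, horizon, season=24):
    """Stand-in forecaster: repeat the last full daily cycle of the history."""
    last_cycle = history[-season:]
    return [last_cycle[i % season] for i in range(horizon)]


def rolling_window_mae(series, train_size, horizon, step, forecaster):
    """Average the MAE over successive fixed-size train/forecast windows."""
    window_maes = []
    start = 0
    while start + train_size + horizon <= len(series):
        train = series[start:start + train_size]
        test = series[start + train_size:start + train_size + horizon]
        forecast = forecaster(train, horizon)
        window_maes.append(
            sum(abs(f - t) for f, t in zip(forecast, test)) / horizon
        )
        start += step  # slide the whole window forward
    if not window_maes:
        raise ValueError("series is too short for a single window")
    return sum(window_maes) / len(window_maes)


# Hypothetical series: the same 24-hour profile repeated for 10 days.
daily = [0.0] * 6 + [50.0, 150.0, 300.0, 420.0, 480.0, 500.0,
                     480.0, 420.0, 300.0, 150.0, 50.0] + [0.0] * 7
series = daily * 10
mae = rolling_window_mae(series, train_size=72, horizon=48, step=24,
                         forecaster=seasonal_naive)
```

<p>
Averaging the error over many windows gives a more stable estimate than a single forecast, which is why the averaged figures in this section differ from the single-example figures.
</p>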
<p>
The graph below shows an example forecast generated using Prophet, with the Random Forest regression forecast as a baseline comparison.
</p>
<section>
<img src="images/ds9.png" style="margin-left: 10%;width: 80%;">
</section>
<br>
<p>
In this example:
</p>
<p>
Mean absolute error (Prophet) = 42.6056837032
</p>
<p>
Mean absolute error (Random Forest regression) = 36.1351450921
</p>
<p>
This meant that Prophet performed almost as well as the Random Forest algorithm.
After running a rolling window comparison over the dataset, we found that on average:
</p>
<p>
Mean absolute error (Prophet) = 83.6056837032
</p>
<p>
Mean absolute error (Random Forest regression) = 49.1351450921
</p>
<p>
This meant that Random Forest regression had a lower average error than the Prophet algorithm. However, after plotting the residuals for both algorithms, it was clear that only the Prophet algorithm detected and reacted to outages of solar energy plants, as shown by the graph below.
</p>
<section>
<img src="images/ds10.png" style="margin-left: 10%;width: 80%;">
</section>
<br>
<p>
For this example:
</p>
<p>
Mean absolute error (Prophet) = 8.39729535258
</p>
<p>
Mean absolute error (Linear Regression) = 331.831629189
</p>
<h5>Model Selection</h5>
<p>
After extensive model evaluation (cross validation, residual checking, and baseline comparisons), we selected the Prophet model for four main reasons:
</p>
<ol>
<li>
It takes into account plant outages
</li>
<li>
It provides an easy way to add changepoints that indicate changes in the trend of the data; for example, the trend changes when a new solar panel is installed at a plant
</li>
<li>
It allows (through its add holiday interface) for planned maintenance work dates to be incorporated into the model
</li>
<li>
It does not require complete retraining when new data arrives
</li>
</ol>
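<p>
The changepoint and holiday features listed above could be used roughly as in the sketch below. It is an assumption-laden illustration rather than the project's code: it assumes the <code>pandas</code> and <code>prophet</code> Python packages, passes the maintenance dates through Prophet's <code>holidays</code> argument, and uses hypothetical helper names (<code>maintenance_holidays</code>, <code>fit_and_forecast</code>) and a hypothetical <code>planned_maintenance</code> label.
</p>

```python
def maintenance_holidays(dates, label="planned_maintenance"):
    """Build rows in the column layout Prophet uses for its holidays table."""
    return [
        {"holiday": label, "ds": d, "lower_window": 0, "upper_window": 0}
        for d in dates
    ]


def fit_and_forecast(history_rows, maintenance_dates, changepoint_dates,
                     horizon_hours=48):
    """Fit Prophet on hourly rows of {"ds": timestamp, "y": energy}.

    Returns a data frame with the timestamps and the point forecast.
    """
    # Imported lazily so the helper above works without these packages.
    import pandas as pd
    from prophet import Prophet

    holidays = None
    if maintenance_dates:
        holidays = pd.DataFrame(maintenance_holidays(maintenance_dates))
        holidays["ds"] = pd.to_datetime(holidays["ds"])

    model = Prophet(
        holidays=holidays,
        # Explicit trend changepoints, e.g. the date a new panel was installed.
        changepoints=changepoint_dates or None,
    )
    model.fit(pd.DataFrame(history_rows))
    # Extend the hourly timeline by the forecast horizon.
    future = model.make_future_dataframe(periods=horizon_hours, freq="h")
    return model.predict(future)[["ds", "yhat"]]


# Hypothetical planned maintenance dates (not project data).
rows = maintenance_holidays(["2018-03-01", "2018-03-02"])
```

<p>
Keeping the maintenance dates in a plain list means that the owners could update them without touching the model code.
</p>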
<hr />
<h4>
Future Work
</h4>
<p>
If we had another three months to work on the project, one of our main tasks would be to integrate the Data Science part with the Chatbot System. We could also try to implement the last “could have” requirement: “The chatbot is conversational and user can interact with it in a natural human-like way”. We could implement this feature by focusing on the language understanding aspect and by making the bot keep track of the questions that have already been asked.
</p>
<p>
We believe that one possible future improvement for this project is to extend it to other renewable energy sources such as hydro energy or wind power. That would benefit both Renewables.AI and the business owners.
</p>
<p>
Another improvement would be to develop a more precise Data Science algorithm, since a more precise algorithm means a better service for the users of Renewables.AI. Furthermore, as more data accumulates and new algorithms with better performance become available, failing to update the Data Science algorithm would eventually leave Renewables.AI out of date. The owners of this project should take this issue into account in the future.
</p>
<hr />
</section>
</div>
</section>
</article>
</div>
<!-- Scripts -->
<script src="assets/js/jquery.min.js"></script>
<script src="assets/js/jquery.scrollex.min.js"></script>
<script src="assets/js/jquery.scrolly.min.js"></script>
<script src="assets/js/skel.min.js"></script>
<script src="assets/js/util.js"></script>
<!--[if lte IE 8]><script src="assets/js/ie/respond.min.js"></script><![endif]-->
<script src="assets/js/main.js"></script>
</body>
</html>