<!DOCTYPE html>
<!-- START: inst/pkgdown/templates/layout.html --><!-- Generated by pkgdown: do not edit by hand --><html lang="en" data-bs-theme="auto"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><title>Introduction to deep learning: Classification by a neural network using Keras</title><meta name="viewport" content="width=device-width, initial-scale=1"><script src="assets/themetoggle.js"></script><link rel="stylesheet" type="text/css" href="assets/styles.css"><script src="assets/scripts.js" type="text/javascript"></script><!-- mathjax --><script type="text/x-mathjax-config">
MathJax.Hub.Config({
config: ["MMLorHTML.js"],
jax: ["input/TeX","input/MathML","output/HTML-CSS","output/NativeMML", "output/PreviewHTML"],
extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "fast-preview.js", "AssistiveMML.js", "a11y/accessibility-menu.js"],
TeX: {
extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
},
tex2jax: {
inlineMath: [['\\(', '\\)']],
displayMath: [ ['$$','$$'], ['\\[', '\\]'] ],
processEscapes: true
}
});
</script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><!-- Responsive Favicon for The Carpentries --><link rel="apple-touch-icon" sizes="180x180" href="favicons/incubator/apple-touch-icon.png"><link rel="icon" type="image/png" sizes="32x32" href="favicons/incubator/favicon-32x32.png"><link rel="icon" type="image/png" sizes="16x16" href="favicons/incubator/favicon-16x16.png"><link rel="manifest" href="favicons/incubator/site.webmanifest"><link rel="mask-icon" href="favicons/incubator/safari-pinned-tab.svg" color="#5bbad5"><meta name="msapplication-TileColor" content="#da532c"><meta name="theme-color" media="(prefers-color-scheme: light)" content="white"><meta name="theme-color" media="(prefers-color-scheme: dark)" content="black"></head><body>
<header id="top" class="navbar navbar-expand-md top-nav incubator"><svg xmlns="http://www.w3.org/2000/svg" class="d-none"><symbol id="check2" viewbox="0 0 16 16"><path d="M13.854 3.646a.5.5 0 0 1 0 .708l-7 7a.5.5 0 0 1-.708 0l-3.5-3.5a.5.5 0 1 1 .708-.708L6.5 10.293l6.646-6.647a.5.5 0 0 1 .708 0z"></path></symbol><symbol id="circle-half" viewbox="0 0 16 16"><path d="M8 15A7 7 0 1 0 8 1v14zm0 1A8 8 0 1 1 8 0a8 8 0 0 1 0 16z"></path></symbol><symbol id="moon-stars-fill" viewbox="0 0 16 16"><path d="M6 .278a.768.768 0 0 1 .08.858 7.208 7.208 0 0 0-.878 3.46c0 4.021 3.278 7.277 7.318 7.277.527 0 1.04-.055 1.533-.16a.787.787 0 0 1 .81.316.733.733 0 0 1-.031.893A8.349 8.349 0 0 1 8.344 16C3.734 16 0 12.286 0 7.71 0 4.266 2.114 1.312 5.124.06A.752.752 0 0 1 6 .278z"></path><path d="M10.794 3.148a.217.217 0 0 1 .412 0l.387 1.162c.173.518.579.924 1.097 1.097l1.162.387a.217.217 0 0 1 0 .412l-1.162.387a1.734 1.734 0 0 0-1.097 1.097l-.387 1.162a.217.217 0 0 1-.412 0l-.387-1.162A1.734 1.734 0 0 0 9.31 6.593l-1.162-.387a.217.217 0 0 1 0-.412l1.162-.387a1.734 1.734 0 0 0 1.097-1.097l.387-1.162zM13.863.099a.145.145 0 0 1 .274 0l.258.774c.115.346.386.617.732.732l.774.258a.145.145 0 0 1 0 .274l-.774.258a1.156 1.156 0 0 0-.732.732l-.258.774a.145.145 0 0 1-.274 0l-.258-.774a1.156 1.156 0 0 0-.732-.732l-.774-.258a.145.145 0 0 1 0-.274l.774-.258c.346-.115.617-.386.732-.732L13.863.1z"></path></symbol><symbol id="sun-fill" viewbox="0 0 16 16"><path d="M8 12a4 4 0 1 0 0-8 4 4 0 0 0 0 8zM8 0a.5.5 0 0 1 .5.5v2a.5.5 0 0 1-1 0v-2A.5.5 0 0 1 8 0zm0 13a.5.5 0 0 1 .5.5v2a.5.5 0 0 1-1 0v-2A.5.5 0 0 1 8 13zm8-5a.5.5 0 0 1-.5.5h-2a.5.5 0 0 1 0-1h2a.5.5 0 0 1 .5.5zM3 8a.5.5 0 0 1-.5.5h-2a.5.5 0 0 1 0-1h2A.5.5 0 0 1 3 8zm10.657-5.657a.5.5 0 0 1 0 .707l-1.414 1.415a.5.5 0 1 1-.707-.708l1.414-1.414a.5.5 0 0 1 .707 0zm-9.193 9.193a.5.5 0 0 1 0 .707L3.05 13.657a.5.5 0 0 1-.707-.707l1.414-1.414a.5.5 0 0 1 .707 0zm9.193 2.121a.5.5 0 0 1-.707 0l-1.414-1.414a.5.5 0 0 1 .707-.707l1.414 1.414a.5.5 0 0 1 0 
.707zM4.464 4.465a.5.5 0 0 1-.707 0L2.343 3.05a.5.5 0 1 1 .707-.707l1.414 1.414a.5.5 0 0 1 0 .708z"></path></symbol></svg><a class="visually-hidden-focusable skip-link" href="#main-content">Skip to main content</a>
<div class="container-fluid top-nav-container">
<div class="col-md-8">
<div class="large-logo">
<img id="incubator-logo" alt="Carpentries Incubator" src="assets/images/incubator-logo.svg"></div>
</div>
<div class="selector-container">
<div id="theme-selector">
<li class="nav-item dropdown" id="theme-button-list">
<button class="btn btn-link nav-link px-0 px-lg-2 dropdown-toggle d-flex align-items-center" id="bd-theme" type="button" aria-expanded="false" data-bs-toggle="dropdown" data-bs-display="static" aria-label="Toggle theme (auto)">
<svg class="bi my-1 theme-icon-active"><use href="#circle-half"></use></svg><i data-feather="chevron-down"></i>
</button>
<ul class="dropdown-menu dropdown-menu-end" aria-labelledby="bd-theme-text"><li>
<button type="button" class="btn dropdown-item d-flex align-items-center" data-bs-theme-value="light" aria-pressed="false">
<svg class="bi me-2 theme-icon"><use href="#sun-fill"></use></svg>
Light
<svg class="bi ms-auto d-none"><use href="#check2"></use></svg></button>
</li>
<li>
<button type="button" class="btn dropdown-item d-flex align-items-center" data-bs-theme-value="dark" aria-pressed="false">
<svg class="bi me-2 theme-icon"><use href="#moon-stars-fill"></use></svg>
Dark
<svg class="bi ms-auto d-none"><use href="#check2"></use></svg></button>
</li>
<li>
<button type="button" class="btn dropdown-item d-flex align-items-center active" data-bs-theme-value="auto" aria-pressed="true">
<svg class="bi me-2 theme-icon"><use href="#circle-half"></use></svg>
Auto
<svg class="bi ms-auto d-none"><use href="#check2"></use></svg></button>
</li>
</ul></li>
</div>
<div class="dropdown" id="instructor-dropdown">
<button class="btn btn-secondary dropdown-toggle bordered-button" type="button" id="dropdownMenu1" data-bs-toggle="dropdown" aria-expanded="false">
<i aria-hidden="true" class="icon" data-feather="eye"></i> Learner View <i data-feather="chevron-down"></i>
</button>
<ul class="dropdown-menu" aria-labelledby="dropdownMenu1"><li><button class="dropdown-item" type="button" onclick="window.location.href='instructor/2-keras.html';">Instructor View</button></li>
</ul></div>
</div>
</div>
<hr></header><nav class="navbar navbar-expand-xl bottom-nav incubator" aria-label="Main Navigation"><div class="container-fluid nav-container">
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle Navigation">
<span class="navbar-toggler-icon"></span>
<span class="menu-title">Menu</span>
</button>
<div class="nav-logo">
<img class="small-logo" alt="Carpentries Incubator" src="assets/images/incubator-logo-sm.svg"></div>
<div class="lesson-title-md">
Introduction to deep learning
</div>
<div class="search-icon-sm">
<!-- TODO: do not show until we have search
<i role="img" aria-label="Search the All In One page" data-feather="search"></i>
-->
</div>
<div class="desktop-nav">
<ul class="navbar-nav me-auto mb-2 mb-lg-0"><li class="nav-item">
<span class="lesson-title">
Introduction to deep learning
</span>
</li>
<li class="nav-item">
<a class="nav-link" href="key-points.html">Key Points</a>
</li>
<li class="nav-item">
<a class="nav-link" href="reference.html#glossary">Glossary</a>
</li>
<li class="nav-item">
<a class="nav-link" href="profiles.html">Learner Profiles</a>
</li>
<li class="nav-item dropdown">
<button class="nav-link dropdown-toggle" id="navbarDropdown" data-bs-toggle="dropdown" aria-expanded="false">
More <i data-feather="chevron-down"></i>
</button>
<ul class="dropdown-menu" aria-labelledby="navbarDropdown"><li><a class="dropdown-item" href="reference.html">Reference</a></li>
</ul></li>
</ul></div>
<!--
<form class="d-flex col-md-2 search-form">
<fieldset disabled>
<input class="form-control me-2 searchbox" type="search" placeholder="" aria-label="">
<button class="btn btn-outline-success tablet-search-button" type="submit">
<i class="search-icon" data-feather="search" role="img" aria-label="Search the All In One page"></i>
</button>
</fieldset>
</form>
-->
<a id="search-button" class="btn btn-primary" href="aio.html" role="button" aria-label="Search the All In One page">Search the All In One page</a>
</div><!--/div.container-fluid -->
</nav><div class="col-md-12 mobile-title">
Introduction to deep learning
</div>
<aside class="col-md-12 lesson-progress"><div style="width: 13%" class="percentage">
13%
</div>
<div class="progress incubator">
<div class="progress-bar incubator" role="progressbar" style="width: 13%" aria-valuenow="13" aria-label="Lesson Progress" aria-valuemin="0" aria-valuemax="100">
</div>
</div>
</aside><div class="container">
<div class="row">
<!-- START: inst/pkgdown/templates/navbar.html -->
<div id="sidebar-col" class="col-lg-4">
<div id="sidebar" class="sidebar">
<nav aria-labelledby="flush-headingEleven"><button role="button" aria-label="close menu" alt="close menu" aria-expanded="true" aria-controls="sidebar" class="collapse-toggle" data-collapse="Collapse " data-episodes="Episodes ">
<i class="search-icon" data-feather="x" role="img"></i>
</button>
<div class="sidebar-inner">
<div class="row mobile-row" id="theme-row-mobile">
<div class="col" id="theme-selector">
<li class="nav-item dropdown" id="theme-button-list">
<button class="btn btn-link nav-link px-0 px-lg-2 dropdown-toggle d-flex align-items-center" id="bd-theme" type="button" aria-expanded="false" data-bs-toggle="dropdown" data-bs-display="static" aria-label="Toggle theme (auto)">
<svg class="bi my-1 theme-icon-active"><use href="#circle-half"></use></svg><span class="d-lg-none ms-1" id="bd-theme-text">Toggle Theme</span>
</button>
<ul class="dropdown-menu dropdown-menu-right" aria-labelledby="bd-theme-text"><li>
<button type="button" class="btn dropdown-item d-flex align-items-center" data-bs-theme-value="light" aria-pressed="false">
<svg class="bi me-2 theme-icon"><use href="#sun-fill"></use></svg>
Light
<svg class="bi ms-auto d-none"><use href="#check2"></use></svg></button>
</li>
<li>
<button type="button" class="btn dropdown-item d-flex align-items-center" data-bs-theme-value="dark" aria-pressed="false">
<svg class="bi me-2 theme-icon"><use href="#moon-stars-fill"></use></svg>
Dark
<svg class="bi ms-auto d-none"><use href="#check2"></use></svg></button>
</li>
<li>
<button type="button" class="btn dropdown-item d-flex align-items-center active" data-bs-theme-value="auto" aria-pressed="true">
<svg class="bi me-2 theme-icon"><use href="#circle-half"></use></svg>
Auto
<svg class="bi ms-auto d-none"><use href="#check2"></use></svg></button>
</li>
</ul></li>
</div>
</div>
<div class="row mobile-row">
<div class="col">
<div class="sidenav-view-selector">
<div class="accordion accordion-flush" id="accordionFlush9">
<div class="accordion-item">
<h2 class="accordion-header" id="flush-headingNine">
<button class="accordion-button collapsed" id="instructor" type="button" data-bs-toggle="collapse" data-bs-target="#flush-collapseNine" aria-expanded="false" aria-controls="flush-collapseNine">
<i id="eye" aria-hidden="true" class="icon" data-feather="eye"></i> Learner View
</button>
</h2>
<div id="flush-collapseNine" class="accordion-collapse collapse" aria-labelledby="flush-headingNine" data-bs-parent="#accordionFlush2">
<div class="accordion-body">
<a href="instructor/2-keras.html">Instructor View</a>
</div>
</div>
</div><!--/div.accordion-item-->
</div><!--/div.accordion-flush-->
</div><!--div.sidenav-view-selector -->
</div><!--/div.col -->
<hr></div><!--/div.mobile-row -->
<div class="accordion accordion-flush" id="accordionFlush11">
<div class="accordion-item">
<button id="chapters" class="accordion-button show" type="button" data-bs-toggle="collapse" data-bs-target="#flush-collapseEleven" aria-expanded="false" aria-controls="flush-collapseEleven">
<h2 class="accordion-header chapters" id="flush-headingEleven">
EPISODES
</h2>
</button>
<div id="flush-collapseEleven" class="accordion-collapse show collapse" aria-labelledby="flush-headingEleven" data-bs-parent="#accordionFlush11">
<div class="accordion-body">
<div class="accordion accordion-flush" id="accordionFlush1">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading1">
<a href="index.html">Summary and Setup</a>
</div><!--/div.accordion-header-->
</div><!--/div.accordion-item-->
</div><!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush2">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading2">
<a href="1-introduction.html">1. Introduction</a>
</div><!--/div.accordion-header-->
</div><!--/div.accordion-item-->
</div><!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlushcurrent">
<div class="accordion-item">
<div class="accordion-header" id="flush-headingcurrent">
<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#flush-collapsecurrent" aria-expanded="true" aria-controls="flush-collapsecurrent">
<span class="visually-hidden">Current Chapter</span>
<span class="current-chapter">
2. Classification by a neural network using Keras
</span>
</button>
</div><!--/div.accordion-header-->
<div id="flush-collapsecurrent" class="accordion-collapse collapse show" aria-labelledby="flush-headingcurrent" data-bs-parent="#accordionFlushcurrent">
<div class="accordion-body">
<ul><li><a href="#introduction">Introduction</a></li>
<li><a href="#formulateoutline-the-problem-penguin-classification">1. Formulate/outline the problem: penguin classification</a></li>
<li><a href="#identify-inputs-and-outputs">2. Identify inputs and outputs</a></li>
<li><a href="#prepare-data">3. Prepare data</a></li>
<li><a href="#build-an-architecture-from-scratch-or-choose-a-pretrained-model">4. Build an architecture from scratch or choose a pretrained
model</a></li>
<li><a href="#choose-a-loss-function-and-optimizer">5. Choose a loss function and optimizer</a></li>
<li><a href="#train-model">6. Train model</a></li>
<li><a href="#perform-a-predictionclassification">7. Perform a prediction/classification</a></li>
<li><a href="#measuring-performance">8. Measuring performance</a></li>
<li><a href="#refine-the-model">9. Refine the model</a></li>
<li><a href="#share-model">10. Share model</a></li>
</ul></div><!--/div.accordion-body-->
</div><!--/div.accordion-collapse-->
</div><!--/div.accordion-item-->
</div><!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush4">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading4">
<a href="3-monitor-the-model.html">3. Monitor the training process</a>
</div><!--/div.accordion-header-->
</div><!--/div.accordion-item-->
</div><!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush5">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading5">
<a href="4-advanced-layer-types.html">4. Advanced layer types</a>
</div><!--/div.accordion-header-->
</div><!--/div.accordion-item-->
</div><!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush6">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading6">
<a href="5-transfer-learning.html">5. Transfer learning</a>
</div><!--/div.accordion-header-->
</div><!--/div.accordion-item-->
</div><!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush7">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading7">
<a href="6-outlook.html">6. Outlook</a>
</div><!--/div.accordion-header-->
</div><!--/div.accordion-item-->
</div><!--/div.accordion-flush-->
</div>
</div>
</div>
<hr class="half-width"><div class="accordion accordion-flush lesson-resources" id="accordionFlush12">
<div class="accordion-item">
<h2 class="accordion-header" id="flush-headingTwelve">
<button class="accordion-button collapsed" id="lesson-resources" type="button" data-bs-toggle="collapse" data-bs-target="#flush-collapseTwelve" aria-expanded="false" aria-controls="flush-collapseTwelve">
RESOURCES
</button>
</h2>
<div id="flush-collapseTwelve" class="accordion-collapse collapse" aria-labelledby="flush-headingTwelve" data-bs-parent="#accordionFlush12">
<div class="accordion-body">
<ul><li>
<a href="key-points.html">Key Points</a>
</li>
<li>
<a href="reference.html#glossary">Glossary</a>
</li>
<li>
<a href="profiles.html">Learner Profiles</a>
</li>
<li><a href="reference.html">Reference</a></li>
</ul></div>
</div>
</div>
</div>
<hr class="half-width lesson-resources"><a href="aio.html">See all in one page</a>
<hr class="d-none d-sm-block d-md-none"><div class="d-grid gap-1">
</div>
</div><!-- /div.accordion -->
</div><!-- /div.sidebar-inner -->
</nav></div><!-- /div.sidebar -->
</div><!-- /div.sidebar-col -->
<!-- END: inst/pkgdown/templates/navbar.html-->
<!-- START: inst/pkgdown/templates/content-instructor.html -->
<div class="col-xl-8 col-lg-12 primary-content">
<nav class="lesson-content mx-md-4" aria-label="Previous and Next Chapter"><!-- content for small screens --><div class="d-block d-sm-block d-md-none">
<a class="chapter-link" href="1-introduction.html"><i aria-hidden="true" class="small-arrow" data-feather="arrow-left"></i>Previous</a>
<a class="chapter-link float-end" href="3-monitor-the-model.html">Next<i aria-hidden="true" class="small-arrow" data-feather="arrow-right"></i></a>
</div>
<!-- content for large screens -->
<div class="d-none d-sm-none d-md-block">
<a class="chapter-link" href="1-introduction.html" rel="prev">
<i aria-hidden="true" class="small-arrow" data-feather="arrow-left"></i>
Previous: Introduction
</a>
<a class="chapter-link float-end" href="3-monitor-the-model.html" rel="next">
Next: Monitor the training...
<i aria-hidden="true" class="small-arrow" data-feather="arrow-right"></i>
</a>
</div>
<hr></nav><main id="main-content" class="main-content"><div class="container lesson-content">
<h1>Classification by a neural network using Keras</h1>
<p>Last updated on 2024-12-03 |
<a href="https://github.com/carpentries-incubator/deep-learning-intro/edit/main/episodes/2-keras.Rmd" class="external-link">Edit this page <i aria-hidden="true" data-feather="edit"></i></a></p>
<div class="text-end">
<button role="button" aria-pressed="false" tabindex="0" id="expand-code" class="pull-right" data-expand="Expand All Solutions " data-collapse="Collapse All Solutions "> Expand All Solutions <i aria-hidden="true" data-feather="plus"></i></button>
</div>
<div class="overview card">
<h2 class="card-header">Overview</h2>
<div class="row g-0">
<div class="col-md-4">
<div class="card-body">
<div class="inner">
<h3 class="card-title">Questions</h3>
<ul><li>How do I compose a neural network using Keras?</li>
<li>How do I train this network on a dataset?</li>
<li>How do I get insight into the learning process?</li>
<li>How do I measure the performance of the network?</li>
</ul></div>
</div>
</div>
<div class="col-md-8">
<div class="card-body">
<div class="inner bordered">
<h3 class="card-title">Objectives</h3>
<ul><li>Use the deep learning workflow to structure the notebook</li>
<li>Explore the dataset using pandas and seaborn</li>
<li>Identify the inputs and outputs of a deep neural network</li>
<li>Use one-hot encoding to prepare data for classification in
Keras</li>
<li>Describe a fully connected layer</li>
<li>Implement a fully connected layer with Keras</li>
<li>Use Keras to train a small fully connected network on prepared
data</li>
<li>Interpret the loss curve of the training process</li>
<li>Use a confusion matrix to measure the trained network’s performance
on a test set</li>
</ul></div>
</div>
</div>
</div>
</div>
<section><h2 class="section-heading" id="introduction">Introduction<a class="anchor" aria-label="anchor" href="#introduction"></a></h2>
<hr class="half-width"><p>In this episode we will learn how to create and train a neural
network using Keras to solve a simple classification task.</p>
<p>The goal of this episode is to quickly get your hands dirty in
actually defining and training a neural network, without going into
depth on how neural networks work on a technical or mathematical level.
We want you to go through the full deep learning workflow once before
going into more details.</p>
<p>In fact, this is also what we recommend you do when working
on real-world problems: First quickly build a working pipeline, while
taking shortcuts. Then, slowly make the pipeline more advanced while you
keep on evaluating the approach.</p>
<p>In <a href="3-monitor-the-model.html">episode 3</a> we will expand on
the concepts that are lightly introduced in this episode. Some of these
concepts include: how to monitor the training progress and how
optimization works.</p>
<p>As a reminder, below are the steps of the deep learning workflow:</p>
<ol style="list-style-type: decimal"><li>Formulate / Outline the problem</li>
<li>Identify inputs and outputs</li>
<li>Prepare data</li>
<li>Choose a pretrained model or start building architecture from
scratch</li>
<li>Choose a loss function and optimizer</li>
<li>Train the model</li>
<li>Perform a Prediction/Classification</li>
<li>Measure performance</li>
<li>Refine the model</li>
<li>Save model</li>
</ol><p>In this episode we will focus on a minimal example for each of these
steps. Later episodes will build on this knowledge to go into greater
depth for some or all of these steps.</p>
<div id="gpu-usage" class="callout">
<div class="callout-square">
<i class="callout-icon" data-feather="bell"></i>
</div>
<div id="gpu-usage" class="callout-inner">
<h3 class="callout-title">GPU usage</h3>
<div class="callout-content">
<p>You do not need a <a href="https://glosario.carpentries.org/en/#gpu" class="external-link">GPU (graphics processing
unit)</a> for this lesson. We deliberately use very small toy
problems so that a CPU is enough. However, Keras will use your GPU
automatically when one is available. A GPU becomes important when
tackling larger datasets or complex problems that require a more
complex neural network.</p>
</div>
</div>
</div>
</section><section><h2 class="section-heading" id="formulateoutline-the-problem-penguin-classification">1. Formulate/outline the problem: penguin classification<a class="anchor" aria-label="anchor" href="#formulateoutline-the-problem-penguin-classification"></a></h2>
<hr class="half-width"><p>In this episode we will be using the <a href="https://zenodo.org/record/3960218" class="external-link">penguin dataset</a>. This is a
dataset that was published in 2020 by Allison Horst and contains data on
three different species of penguins.</p>
<p>We will use the penguin dataset to train a neural network which can
classify which species a penguin belongs to, based on its physical
characteristics.</p>
<div id="goal" class="callout">
<div class="callout-square">
<i class="callout-icon" data-feather="bell"></i>
</div>
<div id="goal" class="callout-inner">
<h3 class="callout-title">Goal</h3>
<div class="callout-content">
<p>The goal is to predict a penguin’s species using the attributes
available in this dataset.</p>
</div>
</div>
</div>
<p>The <code>palmerpenguins</code> data contains size measurements for
three penguin species observed on three islands in the Palmer
Archipelago, Antarctica. The physical attributes measured are flipper
length, bill (beak) length, bill depth, body mass, and sex.</p>
<figure><img src="fig/palmer_penguins.png" title="Palmer Penguins" alt="Illustration of the three species of penguins found in the Palmer Archipelago, Antarctica: Chinstrap, Gentoo and Adélie" class="figure mx-auto d-block"><div class="figcaption"><em>Artwork by <span class="citation">@allison_horst</span></em></div>
</figure><figure><img src="fig/culmen_depth.png" title="Culmen Depth" alt='Illustration of how the beak dimensions were measured. In the raw data, bill dimensions are recorded as "culmen length" and "culmen depth". The culmen is the dorsal ridge atop the bill.' class="figure mx-auto d-block"><div class="figcaption"><em>Artwork by <span class="citation">@allison_horst</span></em></div>
</figure><p>These data were collected from 2007 to 2009 by Dr. Kristen Gorman with
the <a href="https://lternet.edu/site/palmer-antarctica-lter/" class="external-link">Palmer
Station Long Term Ecological Research Program</a>, part of the <a href="https://lternet.edu/" class="external-link">US Long Term Ecological Research
Network</a>. The data were imported directly from the <a href="https://edirepository.org/" class="external-link">Environmental Data Initiative</a>
(EDI) Data Portal, and are available for use under a CC0 license (“No Rights
Reserved”) in accordance with the <a href="https://lternet.edu/data-access-policy/" class="external-link">Palmer Station Data
Policy</a>.</p>
</section><section><h2 class="section-heading" id="identify-inputs-and-outputs">2. Identify inputs and outputs<a class="anchor" aria-label="anchor" href="#identify-inputs-and-outputs"></a></h2>
<hr class="half-width"><p>To identify the inputs and outputs that we will use to design the
neural network we need to familiarize ourselves with the dataset. This
step is sometimes also called data exploration.</p>
<p>We will start by importing the <a href="https://seaborn.pydata.org/" class="external-link">Seaborn</a> library that will help us
get the dataset and visualize it. Seaborn is a powerful library with
many visualizations. Keep in mind that it works best with data in a pandas
dataframe. Luckily, the datasets available in seaborn are already
loaded as pandas dataframes.</p>
<div class="codewrapper sourceCode" id="cb1">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="im">import</span> seaborn <span class="im">as</span> sns</span></code></pre>
</div>
<p>We can load the penguin dataset using:</p>
<div class="codewrapper sourceCode" id="cb2">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a>penguins <span class="op">=</span> sns.load_dataset(<span class="st">'penguins'</span>)</span></code></pre>
</div>
<p>This will give you a pandas dataframe which contains the penguin
data.</p>
<div class="section level3">
<h3 id="inspecting-the-data">Inspecting the data<a class="anchor" aria-label="anchor" href="#inspecting-the-data"></a></h3>
<p>Using the pandas <code>head</code> method gives us a quick look at
the data:</p>
<div class="codewrapper sourceCode" id="cb3">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a>penguins.head()</span></code></pre>
</div>
<table style="width:100%;" class="table"><colgroup><col width="6%"><col width="14%"><col width="13%"><col width="17%"><col width="11%"><col width="11%"><col width="11%"><col width="11%"></colgroup><thead><tr class="header"><th align="right"></th>
<th align="right">species</th>
<th align="right">island</th>
<th align="right">bill_length_mm</th>
<th align="right">bill_depth_mm</th>
<th align="right">flipper_length_mm</th>
<th align="right">body_mass_g</th>
<th align="right">sex</th>
</tr></thead><tbody><tr class="odd"><td align="right">0</td>
<td align="right">Adelie</td>
<td align="right">Torgersen</td>
<td align="right">39.1</td>
<td align="right">18.7</td>
<td align="right">181.0</td>
<td align="right">3750.0</td>
<td align="right">Male</td>
</tr><tr class="even"><td align="right">1</td>
<td align="right">Adelie</td>
<td align="right">Torgersen</td>
<td align="right">39.5</td>
<td align="right">17.4</td>
<td align="right">186.0</td>
<td align="right">3800.0</td>
<td align="right">Female</td>
</tr><tr class="odd"><td align="right">2</td>
<td align="right">Adelie</td>
<td align="right">Torgersen</td>
<td align="right">40.3</td>
<td align="right">18.0</td>
<td align="right">195.0</td>
<td align="right">3250.0</td>
<td align="right">Female</td>
</tr><tr class="even"><td align="right">3</td>
<td align="right">Adelie</td>
<td align="right">Torgersen</td>
<td align="right">NaN</td>
<td align="right">NaN</td>
<td align="right">NaN</td>
<td align="right">NaN</td>
<td align="right">NaN</td>
</tr><tr class="odd"><td align="right">4</td>
<td align="right">Adelie</td>
<td align="right">Torgersen</td>
<td align="right">36.7</td>
<td align="right">19.3</td>
<td align="right">193.0</td>
<td align="right">3450.0</td>
<td align="right">Female</td>
</tr></tbody></table><p>We can use all columns as features to predict the species of the
penguin, except for the <code>species</code> column itself.</p>
<p>Let’s look at the shape of the dataset:</p>
<div class="codewrapper sourceCode" id="cb4">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a>penguins.shape</span></code></pre>
</div>
<p>There are 344 samples and 7 columns (plus the index column), so 6
features.</p>
</div>
<div class="section level3">
<h3 id="visualization">Visualization<a class="anchor" aria-label="anchor" href="#visualization"></a></h3>
<p>Looking at numbers like this usually does not give a very good
intuition about the data we are working with, so let us create a
visualization.</p>
<div class="section level4">
<h4 id="pair-plot">Pair Plot<a class="anchor" aria-label="anchor" href="#pair-plot"></a></h4>
<p>One nice visualization for datasets with relatively few attributes is
the Pair Plot. This can be created using <code>sns.pairplot(...)</code>.
It shows a scatterplot of each attribute plotted against each of the
other attributes. By using the <code>hue='species'</code> setting for
the pairplot the graphs on the diagonal are layered kernel density
estimate plots for the different values of the <code>species</code>
column.</p>
<div class="codewrapper sourceCode" id="cb5">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a>sns.pairplot(penguins, hue<span class="op">=</span><span class="st">"species"</span>)</span></code></pre>
</div>
<figure><img src="fig/pairplot.png" title="Pair Plot" alt="Grid of scatter plots and histograms comparing observed values of the four physical attributes (features) measured in the penguins sampled. Scatter plots illustrate the distribution of values observed for each pair of features. On the diagonal, where one feature would be compared with itself, histograms are displayed that show the distribution of values observed for that feature, coloured according to the species of the individual sampled. The pair plot shows distinct but overlapping clusters of data points representing the different species, with no pair of features providing a clean separation of clusters on its own." class="figure mx-auto d-block"></figure><div id="pairplot" class="callout challenge">
<div class="callout-square">
<i class="callout-icon" data-feather="zap"></i>
</div>
<div id="pairplot" class="callout-inner">
<h3 class="callout-title">Pairplot</h3>
<div class="callout-content">
<p>Take a look at the pairplot we created. Consider the following
questions:</p>
<ul><li>Is there any class that is easily distinguishable from the
others?</li>
<li>Which combination of attributes shows the best separation for all 3
class labels at once?</li>
<li>(optional) Create a similar pairplot, but with
<code>hue="sex"</code>. Explain the patterns you see. Which combination
of features distinguishes the two sexes best?</li>
</ul></div>
</div>
</div>
<div id="accordionSolution1" class="accordion challenge-accordion accordion-flush">
<div class="accordion-item">
<button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution1" aria-expanded="false" aria-controls="collapseSolution1">
<h4 class="accordion-header" id="headingSolution1"> Show me the solution </h4>
</button>
<div id="collapseSolution1" class="accordion-collapse collapse" aria-labelledby="headingSolution1" data-bs-parent="#accordionSolution1">
<div class="accordion-body">
<ul><li>The plots show that the green class, Gentoo, is somewhat more easily
distinguishable from the other two.</li>
<li>The other two seem to be separable by a combination of bill length
and bill depth (other combinations are also possible such as bill length
and flipper length).</li>
</ul><p>Answer to optional question:</p>
<div class="codewrapper sourceCode" id="cb6">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a>sns.pairplot(penguins, hue<span class="op">=</span><span class="st">'sex'</span>)</span></code></pre>
</div>
<figure><img src="fig/02_sex_pairplot.png" title="Pair plot grouped by sex" alt="Grid of scatter plots and histograms comparing observed values of the four physical attributes (features) measured in the penguins sampled, with data points coloured according to the sex of the individual sampled. The pair plot shows similarly-shaped distribution of values observed for each feature in male and female penguins, with the distribution of measurements for females skewed towards smaller values." class="figure mx-auto d-block"></figure><p>You see that for each species females have smaller bills and
flippers, as well as a smaller body mass. You would need a combination
of the species and the numerical features to successfully distinguish
males from females. The combination of <code>bill_depth_mm</code> and
<code>body_mass_g</code> gives the best separation.</p>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="section level3">
<h3 id="input-and-output-selection">Input and Output Selection<a class="anchor" aria-label="anchor" href="#input-and-output-selection"></a></h3>
<p>Now that we have familiarized ourselves with the dataset we can
select the data attributes to use as input for the neural network and
the target that we want to predict.</p>
<p>In the rest of this episode we will use the
<code>bill_length_mm</code>, <code>bill_depth_mm</code>,
<code>flipper_length_mm</code>, <code>body_mass_g</code> attributes. The
target for the classification task will be the <code>species</code>.</p>
<div id="data-exploration" class="callout">
<div class="callout-square">
<i class="callout-icon" data-feather="bell"></i>
</div>
<div id="data-exploration" class="callout-inner">
<h3 class="callout-title">Data Exploration</h3>
<div class="callout-content">
<p>Exploring the data is an important step to familiarize yourself with
the problem and to help you determine the relevant inputs and
outputs.</p>
</div>
</div>
</div>
</div>
</section><section><h2 class="section-heading" id="prepare-data">3. Prepare data<a class="anchor" aria-label="anchor" href="#prepare-data"></a></h2>
<hr class="half-width"><p>The input data and target data are not yet in a format that is
suitable to use for training a neural network.</p>
<p>For now we will use only the numerical features
<code>bill_length_mm</code>, <code>bill_depth_mm</code>,
<code>flipper_length_mm</code>, and <code>body_mass_g</code>, so let’s
drop the categorical columns:</p>
<div class="codewrapper sourceCode" id="cb7">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a><span class="co"># Drop categorical columns</span></span>
<span id="cb7-2"><a href="#cb7-2" tabindex="-1"></a>penguins_filtered <span class="op">=</span> penguins.drop(columns<span class="op">=</span>[<span class="st">'island'</span>, <span class="st">'sex'</span>])</span></code></pre>
</div>
<div class="section level3">
<h3 id="clean-missing-values">Clean missing values<a class="anchor" aria-label="anchor" href="#clean-missing-values"></a></h3>
<p>During the exploration phase you may have noticed that some rows in
the dataset have missing (NaN) values. Leaving such values in the input
data will ruin the training, so we need to deal with them. There are
many ways to deal with missing values, but for now we will just remove
the offending rows by adding a call to <code>dropna()</code>:</p>
<div class="codewrapper sourceCode" id="cb8">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a><span class="co"># Drop the rows that have NaN values in them</span></span>
<span id="cb8-2"><a href="#cb8-2" tabindex="-1"></a>penguins_filtered <span class="op">=</span> penguins_filtered.dropna()</span></code></pre>
</div>
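<p>To see what <code>dropna()</code> does, here is a minimal sketch on a
toy dataframe. The name <code>toy</code> and its three columns are
illustrative only; the values are copied from the first rows shown
earlier, not loaded from the full dataset.</p>

```python
import numpy as np
import pandas as pd

# Toy stand-in for penguins_filtered (hypothetical, values copied from the head() table)
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Adelie", "Adelie", "Adelie"],
    "bill_length_mm": [39.1, 39.5, 40.3, np.nan, 36.7],
    "body_mass_g": [3750.0, 3800.0, 3250.0, np.nan, 3450.0],
})

# Count the missing values per column before cleaning
missing_before = toy.isna().sum()

# dropna() returns a new dataframe without the rows that contain any NaN
toy_clean = toy.dropna()

print(missing_before)
print(len(toy), "rows before,", len(toy_clean), "rows after")
```

<p>Running <code>isna().sum()</code> on the real dataframe in the same
way is a quick check of how many rows a call to <code>dropna()</code>
is likely to remove.</p>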
<p>Finally, we select only the features:</p>
<div class="codewrapper sourceCode" id="cb9">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb9-1"><a href="#cb9-1" tabindex="-1"></a><span class="co"># Extract columns corresponding to features</span></span>
<span id="cb9-2"><a href="#cb9-2" tabindex="-1"></a>features <span class="op">=</span> penguins_filtered.drop(columns<span class="op">=</span>[<span class="st">'species'</span>])</span></code></pre>
</div>
</div>
<div class="section level3">
<h3 id="prepare-target-data-for-training">Prepare target data for training<a class="anchor" aria-label="anchor" href="#prepare-target-data-for-training"></a></h3>
<p>Second, the target data is also in a format that cannot be used in
training. A neural network can only take numerical inputs and outputs,
and learns by calculating how “far away” the species predicted by the
neural network is from the true species.</p>
<p>When the target is a string category column as we have here, we need
to transform this column into a numerical format first. Again, there are
many ways to do this. We will be using the one-hot encoding. This
encoding creates multiple columns, as many as there are unique values,
and puts a 1 in the column with the corresponding correct class, and 0’s
in the other columns. For instance, for a penguin of the Adelie species
the one-hot encoding would be 1 0 0.</p>
<p>Fortunately, Pandas is able to generate this encoding for us.</p>
<div class="codewrapper sourceCode" id="cb10">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb10-1"><a href="#cb10-1" tabindex="-1"></a><span class="im">import</span> pandas <span class="im">as</span> pd</span>
<span id="cb10-2"><a href="#cb10-2" tabindex="-1"></a></span>
<span id="cb10-3"><a href="#cb10-3" tabindex="-1"></a>target <span class="op">=</span> pd.get_dummies(penguins_filtered[<span class="st">'species'</span>])</span>
<span id="cb10-4"><a href="#cb10-4" tabindex="-1"></a>target.head() <span class="co"># print out the top 5 to see what it looks like.</span></span></code></pre>
</div>
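<p>As a small illustration of the encoding, the sketch below one-hot
encodes a made-up series of species names and then maps each row back to
its label. The series <code>species</code> and the decoding step with
<code>idxmax</code> are additions for illustration and are not part of
the lesson pipeline.</p>

```python
import pandas as pd

# Made-up example labels, not the real dataset
species = pd.Series(["Adelie", "Gentoo", "Chinstrap", "Adelie"])

# One column per unique value; cast to int so the table shows 0/1
one_hot = pd.get_dummies(species).astype(int)

# idxmax(axis=1) returns, for each row, the column label holding the largest value,
# which recovers the original category from its one-hot row
decoded = one_hot.idxmax(axis=1)

print(one_hot)
print(decoded.tolist())
```

<p>Each row contains exactly one 1, so the row sums are all 1 and the
original label can always be recovered from the encoded row.</p>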
<div id="one-hot-encoding" class="callout challenge">
<div class="callout-square">
<i class="callout-icon" data-feather="zap"></i>
</div>
<div id="one-hot-encoding" class="callout-inner">
<h3 class="callout-title">One-hot encoding</h3>
<div class="callout-content">
<p>How many output neurons will our network have now that we one-hot
encoded the target class?</p>
<ul><li>A: 1</li>
<li>B: 2</li>
<li>C: 3</li>
</ul></div>
</div>
</div>
<div id="accordionSolution2" class="accordion challenge-accordion accordion-flush">
<div class="accordion-item">
<button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution2" aria-expanded="false" aria-controls="collapseSolution2">
<h4 class="accordion-header" id="headingSolution2"> Show me the solution </h4>
</button>
<div id="collapseSolution2" class="accordion-collapse collapse" aria-labelledby="headingSolution2" data-bs-parent="#accordionSolution2">
<div class="accordion-body">
<p>C: 3, one for each output variable class</p>
</div>
</div>
</div>
</div>
</div>
<div class="section level3">
<h3 id="split-data-into-training-and-test-set">Split data into training and test set<a class="anchor" aria-label="anchor" href="#split-data-into-training-and-test-set"></a></h3>
<p>Finally, we will split the dataset into a training set and a test
set. As the names imply we will use the training set to train the neural
network, while the test set is kept separate. We will use the test set
to assess the performance of the trained neural network on unseen
samples. In many cases a validation set is also kept separate from the
training and test sets (i.e. the dataset is split into 3 parts). This
validation set is then used to select the values of the parameters of
the neural network and the training methods. For this episode we will
keep it at just a training and test set however.</p>
<p>To split the cleaned dataset into a training and test set we will use
a very convenient function from sklearn called
<code>train_test_split</code>.</p>
<p>This function takes a number of parameters which are extensively
explained in <a href="https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html" class="external-link">the
scikit-learn documentation</a>:</p>
<ul><li>The first two parameters are the dataset (in our case
<code>features</code>) and the corresponding targets (in our case
<code>target</code>).</li>
<li><code>test_size</code> is the fraction of the dataset that is used
for testing; in this case <code>0.2</code> means 20% of the data will be
used for testing.</li>
<li><code>random_state</code> controls the shuffling of the dataset.
Setting this value to the same integer reproduces the same split every
time the function is called.</li>
<li><code>shuffle</code>, which can be either <code>True</code> or
<code>False</code>, controls whether the order of the rows of the
dataset is shuffled before splitting. It defaults to
<code>True</code>.</li>
<li><code>stratify</code> is a more advanced parameter that controls how
the split is done. By setting it to <code>target</code>, the train and
test sets that the function returns will have roughly the same
proportions (with regard to the number of penguins of each species) as
the full dataset.</li>
</ul>
<div class="codewrapper sourceCode" id="cb11">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb11-1"><a href="#cb11-1" tabindex="-1"></a><span class="im">from</span> sklearn.model_selection <span class="im">import</span> train_test_split</span>
<span id="cb11-2"><a href="#cb11-2" tabindex="-1"></a></span>
<span id="cb11-3"><a href="#cb11-3" tabindex="-1"></a>X_train, X_test, y_train, y_test <span class="op">=</span> train_test_split(features, target, test_size<span class="op">=</span><span class="fl">0.2</span>, random_state<span class="op">=</span><span class="dv">0</span>, shuffle<span class="op">=</span><span class="va">True</span>, stratify<span class="op">=</span>target)</span></code></pre>
</div>
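<p>The effect of <code>stratify</code> is easiest to see on imbalanced
labels. The following sketch uses made-up labels
(<code>"a"</code>, <code>"b"</code>, <code>"c"</code>) and a dummy
feature column rather than the penguin data, and counts how often each
class ends up in the test set.</p>

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up, imbalanced labels: 60 / 30 / 10 samples per class
labels = np.array(["a"] * 60 + ["b"] * 30 + ["c"] * 10)
data = np.arange(len(labels)).reshape(-1, 1)  # dummy single-column features

X_tr, X_te, y_tr, y_te = train_test_split(
    data, labels, test_size=0.2, random_state=0, shuffle=True, stratify=labels
)

# Class proportions in the test set should mirror those of the full label array
classes, counts = np.unique(y_te, return_counts=True)
print(dict(zip(classes, counts / len(y_te))))
```

<p>Without <code>stratify</code>, a rare class could by chance be
under-represented in, or even missing from, the test set.</p>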
</div>
</section><section><h2 class="section-heading" id="build-an-architecture-from-scratch-or-choose-a-pretrained-model">4. Build an architecture from scratch or choose a pretrained
model<a class="anchor" aria-label="anchor" href="#build-an-architecture-from-scratch-or-choose-a-pretrained-model"></a></h2>
<hr class="half-width"><div class="section level3">
<h3 id="keras-for-neural-networks">Keras for neural networks<a class="anchor" aria-label="anchor" href="#keras-for-neural-networks"></a></h3>
<p>Keras is a machine learning framework with ease of use as one of its
main features. It is part of the tensorflow python package and can be
imported using <code>from tensorflow import keras</code>.</p>
<p>Keras includes functions, classes and definitions to define deep
learning models, cost functions and optimizers (optimizers are used to
train a model).</p>
<p>Before we move on to the next section of the workflow we need to make
sure we have Keras imported. We do this as follows:</p>
<div class="codewrapper sourceCode" id="cb12">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb12-1"><a href="#cb12-1" tabindex="-1"></a><span class="im">from</span> tensorflow <span class="im">import</span> keras</span></code></pre>
</div>
<p>For this episode it is useful if everyone gets the same results from
their training. Keras uses a random number generator at certain points
during its execution. Therefore we will need to set two random seeds,
one for numpy and one for tensorflow:</p>
<div class="codewrapper sourceCode" id="cb13">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb13-1"><a href="#cb13-1" tabindex="-1"></a><span class="im">from</span> numpy.random <span class="im">import</span> seed</span>
<span id="cb13-2"><a href="#cb13-2" tabindex="-1"></a>seed(<span class="dv">1</span>)</span>
<span id="cb13-3"><a href="#cb13-3" tabindex="-1"></a>keras.utils.set_random_seed(<span class="dv">2</span>)</span></code></pre>
</div>
</div>
<div class="section level3">
<h3 id="build-a-neural-network-from-scratch">Build a neural network from scratch<a class="anchor" aria-label="anchor" href="#build-a-neural-network-from-scratch"></a></h3>
<p>Now we will build a neural network from scratch, which is
surprisingly straightforward using Keras.</p>
<p>With Keras you compose a neural network by creating layers and
linking them together. For now we will only use one type of layer called
a fully connected or Dense layer. In Keras this is defined by the
<code>keras.layers.Dense</code> class.</p>
<p>A dense layer has a number of neurons, which is a parameter you can
choose when you create the layer. When connecting the layer to its input
and output layers every neuron in the dense layer gets an edge
(i.e. connection) to <strong><em>all</em></strong> of the input neurons
and <strong><em>all</em></strong> of the output neurons. The hidden
layer in the image in the introduction of this episode is a Dense
layer.</p>
<p>The input in Keras also gets special treatment. Keras automatically
calculates the number of inputs and outputs a layer needs and therefore
how many edges need to be created. This means we need to inform Keras
how big our input is going to be. We do this by instantiating a
<code>keras.Input</code> class and telling it how big our input is, that
is, the number of columns it contains.</p>
<div class="codewrapper sourceCode" id="cb14">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb14-1"><a href="#cb14-1" tabindex="-1"></a>inputs <span class="op">=</span> keras.Input(shape<span class="op">=</span>(X_train.shape[<span class="dv">1</span>],))</span></code></pre>
</div>
<p>We store a reference to this input class in a variable so we can pass
it to the creation of our hidden layer. Creating the hidden layer can
then be done as follows:</p>
<div class="codewrapper sourceCode" id="cb15">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb15-1"><a href="#cb15-1" tabindex="-1"></a>hidden_layer <span class="op">=</span> keras.layers.Dense(<span class="dv">10</span>, activation<span class="op">=</span><span class="st">"relu"</span>)(inputs)</span></code></pre>
</div>
<p>The instantiation here has 2 parameters and a seemingly strange
combination of parentheses, so let us take a closer look. The first
parameter <code>10</code> is the number of neurons we want in this
layer. This is one of the hyperparameters of our system and needs to be
chosen carefully. We will get back to this in the section on refining
the model.</p>
<p>The second parameter is the activation function to use. We choose
<code>relu</code> which returns 0 for inputs that are 0 and below and
the identity function (returning the same value) for inputs above 0.
This is a commonly used activation function in deep neural networks that
has been shown to work well in practice.</p>
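<p>The behaviour of <code>relu</code> described above can be sketched in
a few lines of NumPy. This is an illustration of the function itself,
not the Keras implementation; the helper name <code>relu</code> and the
example values are chosen here.</p>

```python
import numpy as np

def relu(x):
    """Return 0 for inputs at or below 0, and the input itself above 0."""
    return np.maximum(0, x)

values = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(values))
```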
<p>Next we see an extra set of parentheses with <code>inputs</code> in them. This
means that after creating an instance of the Dense layer we call it as
if it were a function. This tells the Dense layer to connect to the layer
passed as a parameter, in this case <code>inputs</code>.</p>
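<p>The double set of parentheses is ordinary Python: an object whose
class defines <code>__call__</code> can be called like a function. The
sketch below uses a hypothetical <code>ToyLayer</code> class to show the
pattern; it only mimics the calling convention and does not reproduce
what Keras does internally.</p>

```python
class ToyLayer:
    """Hypothetical stand-in for a layer; only the call pattern matches Keras."""

    def __init__(self, units):
        # First set of parentheses: configure the layer
        self.units = units

    def __call__(self, previous):
        # Second set of parentheses: connect the layer to what came before
        return f"output of {self.units}-unit layer applied to {previous}"

# One-line form, as used in the lesson code
hidden = ToyLayer(10)("inputs")

# Equivalent two-step form
layer = ToyLayer(10)
hidden_two_step = layer("inputs")

print(hidden)
```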
<p>Finally we store a reference in the <code>hidden_layer</code>
variable so we can pass it to the output layer in a minute.</p>
<p>Now we create another layer that will be our output layer. Again we
use a Dense layer and so the call is very similar to the previous
one.</p>
<div class="codewrapper sourceCode" id="cb16">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb16-1"><a href="#cb16-1" tabindex="-1"></a>output_layer <span class="op">=</span> keras.layers.Dense(<span class="dv">3</span>, activation<span class="op">=</span><span class="st">"softmax"</span>)(hidden_layer)</span></code></pre>
</div>
<p>Because we chose the one-hot encoding, we use three neurons for the
output layer.</p>
<p>The <code>softmax</code> activation ensures that the three output
neurons produce values in the range (0, 1) and they sum to 1. We can
interpret this as a kind of ‘probability’ that the sample belongs to a
certain species.</p>
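<p>To make the two properties of <code>softmax</code> concrete, here is
a minimal NumPy sketch. The helper <code>softmax</code> and the example
scores are made up for illustration; subtracting the maximum is a common
numerical-stability trick and does not change the result.</p>

```python
import numpy as np

def softmax(z):
    """Map a vector of scores to positive values that sum to 1."""
    shifted = z - np.max(z)   # avoids overflow in exp for large scores
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([2.0, 1.0, 0.1])  # made-up raw outputs of three neurons
probs = softmax(scores)

print(probs)
print(probs.sum())
```

<p>The largest score receives the largest share, so taking the index of
the maximum output gives the predicted class.</p>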
<p>Now that we have defined the layers of our neural network we can
combine them into a Keras model which facilitates training the
network.</p>
<div class="codewrapper sourceCode" id="cb17">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb17-1"><a href="#cb17-1" tabindex="-1"></a>model <span class="op">=</span> keras.Model(inputs<span class="op">=</span>inputs, outputs<span class="op">=</span>output_layer)</span>
<span id="cb17-2"><a href="#cb17-2" tabindex="-1"></a>model.summary()</span></code></pre>
</div>
<p>The model summary here can show you some information about the neural
network we have defined.</p>
<div id="trainable-and-non-trainable-parameters" class="callout">
<div class="callout-square">
<i class="callout-icon" data-feather="bell"></i>
</div>
<div id="trainable-and-non-trainable-parameters" class="callout-inner">
<h3 class="callout-title">Trainable and non-trainable parameters</h3>
<div class="callout-content">
<p>Keras distinguishes between two types of weights, namely:</p>
<ul><li><p>trainable parameters: these are weights of the neurons that are
modified when we train the model in order to minimize our loss function
(we will learn about loss functions shortly!).</p></li>
<li><p>non-trainable parameters: these are weights of the neurons that
are not changed when we train the model. These could be for many reasons
- using a pre-trained model, choice of a particular filter for a
convolutional neural network, and statistical weights for batch
normalization are some examples.</p></li>
</ul><p>If these reasons are not clear right away, don’t worry! In later
episodes of this course, we will touch upon a couple of these
concepts.</p>
</div>
</div>
</div>
<div id="create-the-neural-network" class="callout challenge">
<div class="callout-square">
<i class="callout-icon" data-feather="zap"></i>
</div>
<div id="create-the-neural-network" class="callout-inner">
<h3 class="callout-title">Create the neural network</h3>
<div class="callout-content">
<p>With the code snippets above, we defined a Keras model with 1 hidden
layer with 10 neurons and an output layer with 3 neurons.</p>
<ol style="list-style-type: decimal"><li>How many parameters does the resulting model have?</li>
<li>What happens to the number of parameters if we increase or decrease
the number of neurons in the hidden layer?</li>
</ol><div class="section level4">
<h4 id="optional-keras-sequential-vs-functional-api">(optional) Keras Sequential vs Functional API<a class="anchor" aria-label="anchor" href="#optional-keras-sequential-vs-functional-api"></a></h4>
<p>So far we have used the <a href="https://keras.io/guides/functional_api/" class="external-link">Functional API</a> of
Keras. You can also implement neural networks using <a href="https://keras.io/guides/sequential_model/" class="external-link">the Sequential
model</a>. As you can read in the documentation, the Sequential model is
appropriate for <strong>a plain stack of layers</strong> where each
layer has <strong>exactly one input tensor and one output
tensor</strong>.</p>
<ol start="3" style="list-style-type: decimal"><li>(optional) Use the Sequential model to implement the same
network</li>
</ol></div>
</div>
</div>
</div>
<div id="accordionSolution3" class="accordion challenge-accordion accordion-flush">
<div class="accordion-item">
<button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution3" aria-expanded="false" aria-controls="collapseSolution3">
<h4 class="accordion-header" id="headingSolution3"> Show me the solution </h4>
</button>
<div id="collapseSolution3" class="accordion-collapse collapse" aria-labelledby="headingSolution3" data-bs-parent="#accordionSolution3">
<div class="accordion-body">
<p>Have a look at the output of <code>model.summary()</code>:</p>
<div class="codewrapper sourceCode" id="cb18">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb18-1"><a href="#cb18-1" tabindex="-1"></a>model.summary()</span></code></pre>
</div>
<div class="codewrapper">
<h3 class="code-label">OUTPUT<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="output" tabindex="0"><code>Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 4)] 0
_________________________________________________________________
dense (Dense) (None, 10) 50
_________________________________________________________________
dense_1 (Dense) (None, 3) 33
=================================================================
Total params: 83
Trainable params: 83
Non-trainable params: 0
_________________________________________________________________</code></pre>
</div>
<p>The model has 83 trainable parameters. Each of the 10 neurons in the
<code>dense</code> hidden layer is connected to each of the 4
inputs in the input layer resulting in 40 weights that can be trained.
The 10 neurons in the hidden layer are also connected to each of the 3
outputs in the <code>dense_1</code> output layer, resulting in a further
30 weights that can be trained. By default <code>Dense</code> layers in
Keras also contain 1 bias term for each neuron, resulting in a further
10 bias values for the hidden layer and 3 bias terms for the output
layer. <code>40+30+10+3=83</code> trainable parameters.</p>
<p>If you increase or decrease the number of neurons in the hidden layer, the number
of trainable parameters in both the hidden and output layer increases or
decreases in accordance with the number of neurons added or removed. Each extra
neuron has 4 weights connected to the input layer, 1 bias term, and 3
weights connected to the output layer. So in total 8 extra
parameters.</p>
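<p>The arithmetic above can be checked with a short sketch. The helper
names <code>dense_params</code> and <code>total_params</code> are
hypothetical; they simply count one weight per connection plus one bias
per neuron, as described in the solution.</p>

```python
def dense_params(n_inputs, n_neurons):
    """Weights (one per input per neuron) plus one bias per neuron."""
    return n_inputs * n_neurons + n_neurons

def total_params(n_features, n_hidden, n_outputs):
    """Parameter count for one hidden Dense layer followed by a Dense output layer."""
    return dense_params(n_features, n_hidden) + dense_params(n_hidden, n_outputs)

print(total_params(4, 10, 3))                           # 50 + 33 = 83
print(total_params(4, 11, 3) - total_params(4, 10, 3))  # 8 extra per hidden neuron
```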
<p><em>The name in quotes within the string
<code>Model: "model_1"</code> may be different in your view; this detail
is not important.</em></p>
<div class="section level4">
<h4 id="optional-keras-sequential-vs-functional-api-1">(optional) Keras Sequential vs Functional API<a class="anchor" aria-label="anchor" href="#optional-keras-sequential-vs-functional-api-1"></a></h4>
<ol start="3" style="list-style-type: decimal"><li>This implements the same model using the Sequential API:</li>
</ol><div class="codewrapper sourceCode" id="cb20">
<h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
</h3>
<pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb20-1"><a href="#cb20-1" tabindex="-1"></a>model <span class="op">=</span> keras.Sequential(</span>
<span id="cb20-2"><a href="#cb20-2" tabindex="-1"></a> [</span>
<span id="cb20-3"><a href="#cb20-3" tabindex="-1"></a> keras.Input(shape<span class="op">=</span>(X_train.shape[<span class="dv">1</span>],)),</span>
<span id="cb20-4"><a href="#cb20-4" tabindex="-1"></a> keras.layers.Dense(<span class="dv">10</span>, activation<span class="op">=</span><span class="st">"relu"</span>),</span>
<span id="cb20-5"><a href="#cb20-5" tabindex="-1"></a> keras.layers.Dense(<span class="dv">3</span>, activation<span class="op">=</span><span class="st">"softmax"</span>),</span>
<span id="cb20-6"><a href="#cb20-6" tabindex="-1"></a> ]</span>
<span id="cb20-7"><a href="#cb20-7" tabindex="-1"></a>)</span></code></pre>
</div>
<p>We will use the Functional API for the remainder of this course,
since it is more flexible and more explicit.</p>
</div>
</div>
</div>
</div>
</div>
<div id="how-to-choose-an-architecture" class="callout">
<div class="callout-square">
<i class="callout-icon" data-feather="bell"></i>
</div>
<div id="how-to-choose-an-architecture" class="callout-inner">
<h3 class="callout-title">How to choose an architecture?</h3>
<div class="callout-content">
<p>Even for this small neural network, we had to make a choice on the
number of hidden neurons. Other choices to be made are the number of
layers and type of layers (as we will see later). You might wonder how
you should make these architectural choices. Unfortunately, there are no
clear rules to follow here, and it often boils down to a lot of trial
and error. However, it is recommended to look at what others have done with
similar datasets and problems. Another best practice is to start with a
relatively simple architecture. Once it is running, start adding layers and
tweaking the network to see if performance increases.</p>