differences for PR #554

carpentries-lab · Feb 11, 2025 · 14f8633 · 14f8633
1 parent 6ec3429
commit 14f8633

Showing 7 changed files with 259 additions and 160 deletions.
diff --git a/2-keras.md b/2-keras.md
@@ -297,7 +297,7 @@ because the accuracy of a model depends on the data used to train and test it.
 This is a good time for switching instructor and/or a break.
 :::
 
-## 4. Build an architecture from scratch or choose a pretrained model
+## 4. Build an architecture from scratch
 
 ### Keras for neural networks
 
@@ -551,6 +551,7 @@ If your data and problem is very similar to what others have done, you can often
 Even if your problem is different, but the data type is common (for example images), you can use a pretrained network and finetune it for your problem.
 A large number of openly available pretrained networks can be found on [Hugging Face](https://huggingface.co/models) (especially LLMs), [MONAI](https://monai.io/) (medical imaging), the [Model Zoo](https://modelzoo.co/), [pytorch hub](https://pytorch.org/hub/) or [tensorflow hub](https://www.tensorflow.org/hub/).
 
+We will cover the concept of Transfer Learning in [episode 5](./5-transfer-learning.html).
 
 ## 5. Choose a loss function and optimizer
 We have now designed a neural network that in theory we should be able to

diff --git a/4-advanced-layer-types.md b/4-advanced-layer-types.md
diff --git a/config.yaml b/config.yaml
@@ -0,0 +1,84 @@
+#------------------------------------------------------------
+# Values for this lesson.
+#------------------------------------------------------------
+
+# Which carpentry is this (swc, dc, lc, or cp)?
+# swc: Software Carpentry
+# dc: Data Carpentry
+# lc: Library Carpentry
+# cp: Carpentries (to use for instructor training for instance)
+# incubator: The Carpentries Incubator
+carpentry: 'incubator'
+
+# Overall title for pages.
+title: 'Introduction to deep learning'
+
+# Date the lesson was created (YYYY-MM-DD, this is empty by default)
+created: '2020-10-17'
+
+# Comma-separated list of keywords for the lesson
+keywords: 'deep learning, keras, lesson, The Carpentries, neural networks'
+
+# Life cycle stage of the lesson
+# possible values: pre-alpha, alpha, beta, stable
+life_cycle: 'stable'
+
+# License of the lesson materials (recommended CC-BY 4.0)
+license: 'CC-BY 4.0'
+
+# Link to the source repository for this lesson
+source: 'https://github.com/carpentries-incubator/deep-learning-intro'
+
+# Default branch of your lesson
+branch: 'main'
+
+# Who to contact if there are any issues
+contact: '[email protected]'
+
+# Navigation ------------------------------------------------
+#
+# Use the following menu items to specify the order of
+# individual pages in each dropdown section. Leave blank to
+# include all pages in the folder.
+#
+# Example -------------
+#
+# episodes:
+# - introduction.md
+# - first-steps.md
+#
+# learners:
+# - setup.md
+#
+# instructors:
+# - instructor-notes.md
+#
+# profiles:
+# - one-learner.md
+# - another-learner.md
+
+# Order of episodes in your lesson
+episodes:
+- 1-introduction.Rmd
+- 2-keras.Rmd
+- 3-monitor-the-model.Rmd
+- 4-advanced-layer-types.Rmd
+- 5-transfer-learning.Rmd
+- 6-outlook.Rmd
+
+
+# Information for Learners
+learners:
+
+# Information for Instructors
+instructors:
+
+# Learner Profiles
+profiles:
+
+# Customisation ---------------------------------------------
+#
+# This space below is where custom yaml items (e.g. pinning
+# sandpaper and varnish versions) should live
+
+
diff --git a/fig/.gitkeep b/fig/.gitkeep
diff --git a/md5sum.txt b/md5sum.txt
@@ -4,12 +4,12 @@
 "config.yaml" "bb28d4820055b928caa1334d348e821c" "site/built/config.yaml" "2025-01-28"
 "index.md" "b326559d728c59d2e77a1db0b8a63ab9" "site/built/index.md" "2025-01-28"
 "links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2025-01-28"
-"paper.md" "b9b02225264924ebca972bb856b576f8" "site/built/paper.md" "2025-01-28"
+"paper.md" "ed839d47b75120d3dc891714a4bf7023" "site/built/paper.md" "2025-02-11"
 "workshops.md" "912f39df323e22bb14340184cdd139f6" "site/built/workshops.md" "2025-01-28"
 "episodes/1-introduction.Rmd" "8dabfa4853b660c8bfcb0aea5f435029" "site/built/1-introduction.md" "2025-01-28"
-"episodes/2-keras.Rmd" "48db1b67a077752535af89e6304628f4" "site/built/2-keras.md" "2025-01-28"
+"episodes/2-keras.Rmd" "3a0dd23ee03f389dd484f2b7a4bea522" "site/built/2-keras.md" "2025-02-11"
 "episodes/3-monitor-the-model.Rmd" "93984f2bd862ddc2f10ba6749950b719" "site/built/3-monitor-the-model.md" "2025-01-28"
-"episodes/4-advanced-layer-types.Rmd" "1496dc333f50201032cdb6229bb73f1f" "site/built/4-advanced-layer-types.md" "2025-01-28"
+"episodes/4-advanced-layer-types.Rmd" "97b49e9dad76479bcfe608f0de2d52a4" "site/built/4-advanced-layer-types.md" "2025-02-11"
 "episodes/5-transfer-learning.Rmd" "5808f2218c3f2d2d400e1ec1ad9f1f3c" "site/built/5-transfer-learning.md" "2025-01-28"
 "episodes/6-outlook.Rmd" "007728216562f3b52b983ff1908af5b7" "site/built/6-outlook.md" "2025-01-28"
 "instructors/bonus-material.md" "382832ea4eb097fc7781cb36992c1955" "site/built/bonus-material.md" "2025-01-28"
@@ -18,6 +18,5 @@
 "instructors/schedule.md" "da1a92ad7102c42c77abb5170ace55fd" "site/built/schedule.md" "2025-01-28"
 "instructors/survey-templates.md" "ea5d46e7b54d335f79e57a7bc31d1c5c" "site/built/survey-templates.md" "2025-01-28"
 "learners/reference.md" "e47218673643f23431c540c2d3b27868" "site/built/reference.md" "2025-01-28"
-"learners/setup.md" "bfa55b568e02eb2657ec08d5bf351db9" "site/built/setup.md" "2025-01-28"
+"learners/setup.md" "024fc5f3c5897d4e1080af6413596f7b" "site/built/setup.md" "2025-02-11"
 "profiles/learner-profiles.md" "ef0f26dd0874387d80ed3fd468b99e23" "site/built/learner-profiles.md" "2025-01-28"
-"renv/profiles/lesson-requirements/renv.lock" "051df60e0199289301fe3776a0611901" "site/built/renv.lock" "2025-01-28"
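
Each record in `md5sum.txt` above pairs a source file with a checksum, the built file, and a build date. As a minimal sketch of how such a record could be checked by hand, the Python below parses one record and compares the recorded checksum with the file's current one. It assumes the checksum is a plain MD5 of the source file's bytes, which this diff does not confirm; the helper names `md5_of_file`, `parse_entry`, and `is_stale` are hypothetical and not part of the lesson tooling.

```python
import hashlib
import shlex
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_entry(line):
    """Split one quoted record into (source, checksum, built, date)."""
    source, checksum, built, date = shlex.split(line)
    return source, checksum, built, date


def is_stale(line, root="."):
    """Return True when the source file's current MD5 differs from the record.

    Assumption: the recorded value is an MD5 of the raw file contents.
    """
    source, checksum, _built, _date = parse_entry(line)
    return md5_of_file(Path(root) / source) != checksum
```

Under that assumption, a record whose checksum no longer matches its source file would be one that needs rebuilding, which is consistent with the entries for `paper.md`, `2-keras.Rmd`, `4-advanced-layer-types.Rmd`, and `setup.md` changing in this commit.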
diff --git a/paper.md b/paper.md
@@ -83,9 +83,9 @@ The lesson starts by explaining the basic concepts of neural networks,
 and then guides learners through the different steps of a deep learning workflow.  
 After following this lesson, 
 learners will be able to prepare data for deep learning, 
-implement a basic deep learning model in Python with Keras, 
-monitor and troubleshoot the training process, and implement different layer types, 
-such as convolutional layers.
+implement a basic deep learning model in Python with Keras,
+and monitor and troubleshoot the training process.
+In addition, they will be able to implement and understand different layer types, such as convolutional layers and dropout layers, and apply transfer learning.
 
 We use data with permissive licenses and designed for real world use cases:
 
@@ -148,16 +148,20 @@ and these can even be included at the level of the lesson content.
 In addition, the Carpentries Workbench prioritises accessibility of the content, for example by having clearly visible figure captions
 and promoting alt-texts for pictures.
 
-The lesson is split into a general introduction, and 3 episodes that cover 3 distinct increasingly more complex deep learning problems.
+The lesson is split into a general introduction and 4 episodes that cover 3 distinct, increasingly complex deep learning problems.
 Each of the deep learning problems is approached using the same 10-step deep learning workflow (https://carpentries-incubator.github.io/deep-learning-intro/1-introduction.html#deep-learning-workflow).
 By going through the deep learning cycle three times with different problems, learners become increasingly confident in applying this deep learning workflow to their own projects.
+We end with an outlook episode. First, the outlook episode discusses a real-world application of deep learning in chemistry [@huber_ms2deepscore_2021]. In addition, it discusses bias in datasets, large language models, and good practices for organising deep learning projects. Finally, we end with ideas for next steps after finishing the lesson.
 
 # Feedback
-This course was taught 12 times over the course of 3 years, both online and in-person, by the Netherlands eScience Center
+This course was taught 13 times over the course of 4 years, both online and in-person, by the Netherlands eScience Center
 (Netherlands, https://www.esciencecenter.nl/) and Helmholtz-Zentrum Dresden-Rossendorf (Germany, https://www.hzdr.de/).
 Apart from the core group of contributors, the workshop was also taught at 3 independent institutes, namely:
 University of Wisconsin-Madison (US, https://www.wisc.edu/), University of Auckland (New Zealand, https://www.auckland.ac.nz/), 
 and EMBL Heidelberg (Germany, https://www.embl.org/sites/heidelberg/).
+
+An up-to-date list of workshops using this lesson can be found in a `workshops.md` file in the GitHub repository (https://github.com/carpentries-incubator/deep-learning-intro/blob/main/workshops.md).
+
 In general, adoption of the lesson material by the instructors not involved in the project went well.
 The feedback gathered from our own and others' teachings was used to polish the lesson further.
 
@@ -193,6 +197,13 @@ The results from these 2 workshops are a good representation of the general feed
 Table 2: Quality of the different episodes of the workshop as rated by students from 2 workshops taught at the Netherlands eScience Center. 
 The results from these 2 workshops are a good representation of the general feedback we get when teaching this workshop.
 
+## Carpentries Lab review process
+Prior to submitting this paper, the lesson went through a substantial review as part of becoming an official Carpentries Lab (https://carpentries-lab.org/) lesson. This led to a number of improvements to the lesson. In general, the accessibility and user-friendliness improved, for example by updating alt-texts and using clearer, more beginner-friendly wording. Additionally, the instructor notes were improved and many missing explanations of important deep learning concepts were added to the lesson.
+
+Most importantly, the reviewers pointed out that the CIFAR-10 [@noauthor_cifar-10_nodate] dataset that we initially used does not have a license. We were surprised to find out that this dataset, which is one of the most widely used datasets in machine learning and deep learning, was actually unethically scraped from the internet without permission from image owners. As an alternative we now use 'Dollar street 10' [@van_der_burg_dollar_2024], a dataset that was adapted for this lesson from The Dollar Street Dataset [@gaviria_rojas_dollar_2022]. The Dollar Street Dataset is representative and contains accurate demographic information to ensure its robustness and fairness, especially for smaller subpopulations. In addition, it is a great entry point for teaching learners about ethical AI and bias in datasets.
+
+You can find all details of the review process on GitHub: https://github.com/carpentries-lab/reviews/issues/25.
+
 # Conclusion
 This lesson can be taught as a stand-alone workshop to students already familiar with machine learning and Python.
 It can also be taught in a broader curriculum after an introduction to Python programming (for example: @azalee_bostroem_software_2016) 
@@ -208,6 +219,7 @@ Nidhi Gowdra (University of Auckland, New Zealand, https://www.auckland.ac.nz/),
 Renato Alves and Lisanna Paladin (EMBL Heidelberg, Germany, https://www.embl.org/sites/heidelberg/),
 that piloted this workshop at their institutes.
 We thank the Carpentries for providing such a great framework for developing this lesson material.
+We thank Sarah Brown, Johanna Bayer, and Mike Laverick for giving us excellent feedback on the lesson during the Carpentries Lab review process.
 We thank all students enrolled in the workshops that were taught using this lesson material for providing us with feedback.
 
 # References
diff --git a/setup.md b/setup.md
@@ -80,21 +80,9 @@ Remember that you need to activate your environment every time you restart your
 python3 -m pip install jupyter seaborn scikit-learn pandas tensorflow pydot
 ```
 
-::: spoiler
-
-### Advanced: TensorFlow with support for Mac M1/M2/M3
-
-Recent Macs have special chips (M1/M2/M3) that can accelerate deep learning processes.
-Apple has developed the [tensorflow-metal](https://developer.apple.com/metal/tensorflow-plugin/) package to support these chips in TensorFlow.
-This is not supported by the standard TensorFlow installation, and not required for this lesson.
-
-Nevertheless, you can install the on top of the standard `tensorflow`:
+Note for macOS users: the `tensorflow-metal` package accelerates the training of machine learning models with TensorFlow on recent Macs with an Apple Silicon chip (M1/M2/M3).
+However, its installation is currently broken in the most recent version (as of January 2025); see the [developer forum](https://developer.apple.com/forums/thread/772147).
 
-```shell
-python -m pip install tensorflow-metal
-```
-
-:::
 :::
 
 ::: spoiler
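
The macOS note in `setup.md` above applies only to Macs with an Apple Silicon chip. As a rough sketch of how a learner might check whether that note concerns their machine, the standard-library snippet below inspects the platform. The function name `is_apple_silicon` is hypothetical, and the check assumes Python runs natively: under Rosetta emulation, `platform.machine()` reports `x86_64` even on an Apple Silicon Mac.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True for a native macOS/arm64 Python, i.e. an Apple Silicon Mac.

    The optional arguments exist only to make the check easy to try out;
    by default the values are read from the running interpreter.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


if is_apple_silicon():
    # tensorflow-metal is optional; the standard tensorflow install is enough here.
    print("Apple Silicon Mac detected: the tensorflow-metal note applies.")
else:
    print("Not a native Apple Silicon Python: the tensorflow-metal note does not apply.")
```

Either way, the standard `tensorflow` installation from the `pip install` command above is what the lesson uses.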