neuromatch · SamueleBolotta · Aug 7, 2024 · Aug 7, 2024 · Aug 7, 2024 · Aug 7, 2024
diff --git a/tutorials/W2D3_Microlearning/W2D3_Tutorial1.ipynb b/tutorials/W2D3_Microlearning/W2D3_Tutorial1.ipynb
@@ -924,7 +924,9 @@
     "execution": {}
    },
    "source": [
-    "Below, we provide an implementation of the node perturbation algorithm, so that you will be able to compare it to the weight perturbation algorithm in subsequent sections. Running this code will take around 9 minutes--you can move on to subsequent sections while you wait!"
+    "Below, we provide an implementation of the node perturbation algorithm, so that you will be able to compare it to the weight perturbation algorithm in subsequent sections. Running this code will take around 9 minutes--you can move on to subsequent sections while you wait!\n",
+    "\n",
+    "One important detail: there are two different notions of efficiency we could consider here: 1) sample efficiency and 2) runtime efficiency. Node perturbation is more sample efficient: in general it brings the loss lower with fewer samples than weight perturbation. However, our particular implementation of node perturbation runs a little slower than weight perturbation, so you could argue that it has worse runtime efficiency. This is just due to the fact that these algorithms were implemented by different people, and the author for node perturbation exploited python parallel computation a little less effectively."
    ]
   },
   {

diff --git a/tutorials/W2D3_Microlearning/instructor/W2D3_Tutorial1.ipynb b/tutorials/W2D3_Microlearning/instructor/W2D3_Tutorial1.ipynb
@@ -926,7 +926,9 @@
     "execution": {}
    },
    "source": [
-    "Below, we provide an implementation of the node perturbation algorithm, so that you will be able to compare it to the weight perturbation algorithm in subsequent sections. Running this code will take around 9 minutes--you can move on to subsequent sections while you wait!"
+    "Below, we provide an implementation of the node perturbation algorithm, so that you will be able to compare it to the weight perturbation algorithm in subsequent sections. Running this code will take around 9 minutes--you can move on to subsequent sections while you wait!\n",
+    "\n",
+    "One important detail: there are two different notions of efficiency we could consider here: 1) sample efficiency and 2) runtime efficiency. Node perturbation is more sample efficient: in general it brings the loss lower with fewer samples than weight perturbation. However, our particular implementation of node perturbation runs a little slower than weight perturbation, so you could argue that it has worse runtime efficiency. This is just due to the fact that these algorithms were implemented by different people, and the author for node perturbation exploited python parallel computation a little less effectively."
    ]
   },
   {

diff --git a/tutorials/W2D3_Microlearning/student/W2D3_Tutorial1.ipynb b/tutorials/W2D3_Microlearning/student/W2D3_Tutorial1.ipynb
@@ -900,7 +900,9 @@
     "execution": {}
    },
    "source": [
-    "Below, we provide an implementation of the node perturbation algorithm, so that you will be able to compare it to the weight perturbation algorithm in subsequent sections. Running this code will take around 9 minutes--you can move on to subsequent sections while you wait!"
+    "Below, we provide an implementation of the node perturbation algorithm, so that you will be able to compare it to the weight perturbation algorithm in subsequent sections. Running this code will take around 9 minutes--you can move on to subsequent sections while you wait!\n",
+    "\n",
+    "One important detail: there are two different notions of efficiency we could consider here: 1) sample efficiency and 2) runtime efficiency. Node perturbation is more sample efficient: in general it brings the loss lower with fewer samples than weight perturbation. However, our particular implementation of node perturbation runs a little slower than weight perturbation, so you could argue that it has worse runtime efficiency. This is just due to the fact that these algorithms were implemented by different people, and the author for node perturbation exploited python parallel computation a little less effectively."
    ]
   },
   {