Update tutorial text.

milvus-io · Oct 26, 2023 · 0716ec4
1 parent 689f73f
Showing 1 changed file with 37 additions and 9 deletions.
diff --git a/solutions/nlp/recommender_system/recommender_system.ipynb b/solutions/nlp/recommender_system/recommender_system.ipynb
@@ -237,24 +237,54 @@
    "source": [
     "So, with the metadata stored in Redis, it's time to calculate the embeddings and add them to Milvus.\n",
     "\n",
-    "First, you need a collection to store them in. Create a simple one that stores the movie ID and embeddings for in the **Movies** field.\n",
+    "First, you need a collection to store them in. Create a simple one that stores the title and embeddings in the **Movies** field, while also allowing dynamic fields. You'll use the dynamic fields for metadata.\n",
     "\n",
-    "Then, you'll index that field to make searches more efficent."
+    "Then, you'll index the embedding field to make searches more efficient."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 70,
    "id": "aa7ab317-a6f9-48bf-9c80-1792537c99ab",
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Collection created.\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "alloc_timestamp unimplemented, ignore it\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Collection indexed!\n"
+     ]
+    }
+   ],
    "source": [
     "COLLECTION_NAME = 'film_vectors'\n",
     "PARTITION_NAME = 'Movie'\n",
     "\n",
+    "# Here's our record schema\n",
+    "\"\"\"\n",
+    "\"title\": Film title,\n",
+    "\"overview\": description,\n",
+    "\"release_date\": film release date,\n",
+    "\"genres\": film genres,\n",
+    "\"embedding\": embedding\n",
+    "\"\"\"\n",
+    "\n",
     "id = FieldSchema(name='title', dtype=DataType.VARCHAR, max_length=500, is_primary=True)\n",
     "field = FieldSchema(name='embedding', dtype=DataType.FLOAT_VECTOR, dim=384)\n",
-    "#meta = FieldSchema(name='Meta', dtype=DataType.JSON)\n",
     "\n",
     "schema = CollectionSchema(fields=[id, field], description=\"movie recommender: film vectors\", enable_dynamic_field=True)\n",
     "\n",
@@ -329,9 +359,7 @@
    "id": "447355dd-b82b-4660-b192-f614918901fa",
    "metadata": {},
    "source": [
-    "Now, you can create the embeddings. This dataset is too large to send to Milvus in a single insert statement, but sending them one at a time would create unnecessary network traffic and add too much time. So, this code uses batches. You can play with the batch size to suit your individual needs and preferences.\n",
-    "\n",
-    "A few movies will fail for ids that cannot be cast to integers. You could fix this above with a schema change or by verifying their format. "
+    "Now, you can create the embeddings. This dataset is too large to send to Milvus in a single insert statement, but sending them one at a time would create unnecessary network traffic and add too much time. So, this code uses batches. You can play with the batch size to suit your individual needs and preferences."
    ]
   },
   {
@@ -374,7 +402,7 @@
     "\n",
     "First, you need a transformer to convert the user's search string to an embedding. For this, **embed_search** takes their criteria and passes it to the same transformer you used to populate Milvus.\n",
     "\n",
-    "Milvus will return a set of movie ids. You need to use them to retrieve data about those ids from Redis. This happens in **collate_results**.\n",
+    "By setting the title and overview fields in the return set, you can simply print the result set for the user.\n",
     "\n",
     "Finally, **search_for_movies** performs the actual vector search, using the other two functions for support."
    ]