Parameter search notebook improvements (#46)
- add results table
- add visualization of results
- small improvements
janvanlooyml6 authored Dec 19, 2023
1 parent 2206a6d commit e191976
Showing 3 changed files with 117 additions and 28 deletions.
4 changes: 2 additions & 2 deletions src/evaluation.ipynb
@@ -140,9 +140,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"> ⚠️ **For Apple M1/M2 chip users**:\n",
"> ⚠️ For Apple M1/M2 chip users:\n",
"> \n",
"> - In Docker Desktop Dashboard `Settings -> Features in development`, make sure to uncheck `Use containerid` for pulling and storing images. More info [here](https://docs.docker.com/desktop/settings/mac/#beta-features)\n",
"> - In Docker Desktop Dashboard `Settings -> Features in development`, make sure to **un**check `Use containerd` for pulling and storing images. More info [here](https://docs.docker.com/desktop/settings/mac/#beta-features)\n",
"> - Make sure that Docker uses linux/amd64 platform and not arm64 (cell below should take care of that)"
]
},
129 changes: 109 additions & 20 deletions src/parameter_search.ipynb
@@ -68,9 +68,7 @@
},
{
"cell_type": "markdown",
"metadata": {
"jp-MarkdownHeadingCollapsed": true
},
"metadata": {},
"source": [
"## Set up environment"
]
@@ -135,18 +133,16 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from utils import get_host_ip, create_directory_if_not_exists, output_results, run_parameters_search"
"from utils import get_host_ip, create_directory_if_not_exists, run_parameter_search, get_results"
]
},
{
"cell_type": "markdown",
"metadata": {
"jp-MarkdownHeadingCollapsed": true
},
"metadata": {},
"source": [
"## Spin up the Weaviate vector store"
]
@@ -157,7 +153,7 @@
"source": [
"> ⚠️ For Apple M1/M2 chip users:\n",
"> \n",
"> - In Docker Desktop Dashboard `Settings -> Features in development`, make sure to uncheck `Use containerid` for pulling and storing images. More info [here](https://docs.docker.com/desktop/settings/mac/#beta-features)\n",
"> - In Docker Desktop Dashboard `Settings -> Features in development`, make sure to **un**check `Use containerd` for pulling and storing images. More info [here](https://docs.docker.com/desktop/settings/mac/#beta-features)\n",
"> - Make sure that Docker uses linux/amd64 platform and not arm64 (cell below should take care of that)"
]
},
@@ -269,7 +265,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"> 💡 This notebook defaults to the first 1000 rows of the [wikitext](https://huggingface.co/datasets/wikitext) dataset for demonstration purposes, but you can load your own dataset using one of the other load components available on the [**Fondant Hub**](https://fondant.ai/en/latest/components/hub/#component-hub) or by creating your own [**custom load component**](https://fondant.ai/en/latest/guides/implement_custom_components/). Keep in mind that changing the dataset implies that you also need to change the evaluation dataset used in the evaluation pipeline."
"> 💡 This notebook defaults to the first 1000 rows of the [wikitext](https://huggingface.co/datasets/wikitext) dataset for demonstration purposes, but you can load your own dataset using one of the other load components available on the [**Fondant Hub**](https://fondant.ai/en/latest/components/hub/#component-hub) or by creating your own [**custom load component**](https://fondant.ai/en/latest/guides/implement_custom_components/). Keep in mind that changing the dataset implies that you also need to change the [evaluation dataset](evaluation_datasets/wikitext_1000_q.csv) used in the evaluation pipeline."
]
},
{
@@ -283,7 +279,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Select parameters to search over\n",
"Select parameters to search over and values to try\n",
"\n",
"- `chunk_sizes`: size of each text chunk, in number of characters ([chunk text](https://github.com/ml6team/fondant/tree/main/components/chunk_text) component)\n",
"- `chunk_overlaps`: overlap between chunks ([chunk text](https://github.com/ml6team/fondant/tree/main/components/chunk_text) component)\n",
@@ -298,10 +294,10 @@
"outputs": [],
"source": [
"# parameter search\n",
"chunk_sizes = [256, 512]\n",
"chunk_overlaps = [10, 50]\n",
"embed_models = [(\"huggingface\",\"all-MiniLM-L6-v2\"), (\"huggingface\", \"BAAI/bge-base-en-v1.5\")]\n",
"top_ks = [2, 5]"
"chunk_sizes = [256]\n",
"chunk_overlaps = [100, 150]\n",
"embed_models = [(\"huggingface\",\"all-MiniLM-L6-v2\")]\n",
"top_ks = [2]"
]
},
{
@@ -358,7 +354,7 @@
"\n",
"> 💡 Use a GPU to speed up the embedding step (when not using an external API)\n",
"\n",
"> 💡 Steps that have been processed before are cached and will be skipped in subsequent runs.\n"
"> 💡 Steps that have been processed before are cached and will be skipped in subsequent runs, which speeds up processing.\n"
]
},
{
@@ -367,7 +363,7 @@
"metadata": {},
"outputs": [],
"source": [
"parameters_search_results = run_parameters_search(\n",
"parameter_search_results = run_parameter_search(\n",
" extra_volumes=extra_volumes,\n",
" fixed_args=fixed_args,\n",
" fixed_index_args=fixed_index_args,\n",
@@ -390,7 +386,96 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Compare the performance of your runs below"
"Compare the performance of your runs below. The default evaluation component uses [Ragas](https://github.com/explodinggradients/ragas) and provides two performance measures: [context precision](https://docs.ragas.io/en/latest/concepts/metrics/context_precision.html) and [context relevancy](https://docs.ragas.io/en/latest/concepts/metrics/context_relevancy.html)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def show_selected_results(df):\n",
" columns_to_show = ['rag_config_name', 'chunk_size', 'chunk_overlap', 'embed_model', 'top_k', 'context_precision', 'context_relevancy']\n",
" results_to_show = df[columns_to_show].sort_values('context_precision', ascending=False).set_index('rag_config_name').head(20)\n",
" print(f'Showing top {len(results_to_show)} results')\n",
" return results_to_show"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# all results\n",
"results_df = get_results(results=parameter_search_results)\n",
"\n",
"# selected columns & rows\n",
"show_selected_results(results_df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualize Results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Make sure plotly is installed"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install -q \"plotly\" --disable-pip-version-check && echo \"Plotly installed successfully\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"def add_embed_model_numerical_column(df):\n",
" df['embed_model_numerical'] = pd.factorize(df['embed_model'])[0] + 1\n",
" return df\n",
"\n",
"def show_legend_embed_models(df):\n",
" columns_to_show = ['embed_model','embed_model_numerical']\n",
" df = df[columns_to_show].drop_duplicates().set_index('embed_model_numerical')\n",
" df.index.name = ''\n",
" return df"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# add column with numerical representation of embedding models\n",
"results_df = add_embed_model_numerical_column(results_df)\n",
"\n",
"# show legend\n",
"show_legend_embed_models(results_df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Plot results**"
]
},
{
@@ -399,8 +484,12 @@
"metadata": {},
"outputs": [],
"source": [
"results_df = output_results(results=parameters_search_results)\n",
"results_df"
"import plotly.express as px\n",
"dimensions = ['chunk_size', 'chunk_overlap', 'embed_model_numerical', 'top_k', 'context_precision']\n",
"fig = px.parallel_coordinates(results_df, color=\"context_precision\",\n",
" dimensions=dimensions,\n",
" color_continuous_scale=px.colors.sequential.Bluered)\n",
"fig.show()"
]
},
{
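The search lists defined in the notebook above expand into one pipeline run per parameter combination, the same way `run_parameter_search` combines them with `itertools.product` in `utils.py`. A minimal sketch of that expansion (variable names mirror the notebook; the `configs` list itself is illustrative, not part of the actual code):

```python
import itertools

# Search space as defined in the notebook cell above.
chunk_sizes = [256]
chunk_overlaps = [100, 150]
embed_models = [("huggingface", "all-MiniLM-L6-v2")]
top_ks = [2]

# Each combination becomes one RAG configuration to index and evaluate.
configs = [
    {
        "chunk_size": size,
        "chunk_overlap": overlap,
        "embed_model": model,
        "top_k": top_k,
    }
    for size, overlap, (_, model), top_k in itertools.product(
        chunk_sizes, chunk_overlaps, embed_models, top_ks
    )
]

print(len(configs))  # 1 * 2 * 1 * 1 = 2 pipeline runs
```

Since the run count is the product of the list lengths, the grid grows multiplicatively: adding a second embedding model to the lists above would double the number of runs.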
12 changes: 6 additions & 6 deletions src/utils.py
@@ -154,8 +154,8 @@ def extract_timestamp(folder_name):
return datetime.strptime(timestamp_str, "%Y%m%d%H%M%S")


# Output pipelines evaluations results dataframe
def output_results(results):
# Collect pipeline evaluations in results dataframe
def get_results(results):
flat_results = []

for entry in results:
@@ -177,7 +177,7 @@ def output_results(results):
return pd.DataFrame(flat_results)


def run_parameters_search( # noqa: PLR0913
def run_parameter_search( # noqa: PLR0913
extra_volumes,
fixed_args,
fixed_index_args,
@@ -233,7 +233,7 @@ def run_parameters_search( # noqa: PLR0913
weaviate_class=index_config_class_name,
)

parameters_search_results = []
parameter_search_results = []
for i, (index_dict, top_k) in enumerate(
itertools.product(indexes, top_ks),
start=1,
@@ -277,9 +277,9 @@ def run_parameters_search( # noqa: PLR0913
results_dict.update(fixed_index_args)
results_dict.update(fixed_eval_args)

parameters_search_results.append(results_dict)
parameter_search_results.append(results_dict)

return parameters_search_results
return parameter_search_results


# index pipeline runner
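The renamed `get_results` helper collects the per-run result dicts appended by `run_parameter_search` into a single dataframe that the notebook then sorts and plots. A minimal sketch of that shape (hypothetical toy values, not the actual implementation):

```python
import pandas as pd

# Hypothetical per-run result dicts, as appended to
# parameter_search_results inside run_parameter_search.
parameter_search_results = [
    {"rag_config_name": "run_1", "chunk_size": 256, "context_precision": 0.81},
    {"rag_config_name": "run_2", "chunk_size": 512, "context_precision": 0.77},
]

# Flatten into one dataframe, one row per RAG configuration.
results_df = pd.DataFrame(parameter_search_results)

# Rank configurations by metric, as show_selected_results does.
best = results_df.sort_values("context_precision", ascending=False).head(1)
```

Keeping one flat row per configuration is what makes the downstream table and the parallel-coordinates plot straightforward, since both consume plain dataframe columns.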
