rasbt · mniyas · Jan 22, 2022
diff --git a/docs/sources/CHANGELOG.md b/docs/sources/CHANGELOG.md
@@ -21,6 +21,7 @@ The CHANGELOG for the current development version is available at
 
 - The `mlxtend.evaluate.bootstrap_point632_score` now supports `fit_params`. ([#861](https://github.com/rasbt/mlxtend/pull/861))
 - The `mlxtend/plotting/decision_regions.py` function now has a `contourf_kwargs` for matplotlib to change the look of the decision boundaries if desired. ([#881](https://github.com/rasbt/mlxtend/pull/881) via [[pbloem](https://github.com/pbloem)])
+- The `mlxtend.frequent_patterns.metrics` module now provides the **Kulczynski measure** and the **imbalance ratio** as `kulczynski_measure` and `imbalance_ratio`. ([#840](https://github.com/rasbt/mlxtend/issues/840))
 
 ##### Changes
 

diff --git a/docs/sources/user_guide/frequent_patterns/metrics.ipynb b/docs/sources/user_guide/frequent_patterns/metrics.ipynb
@@ -0,0 +1,389 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Evaluating the Quality of Association Rules"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Overview"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "A strong association rule, one with high support and confidence, is not necessarily interesting for a specific application. Additional measures have been developed to help evaluate association rules. `mlxtend` implements two such measures: the Kulczynski measure and the imbalance ratio."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Kulczynski Measure:\n",
+    "\n",
+    "The Kulczynski measure $K_{A,B}$ can be interpreted as the average of the confidence of $A \\Rightarrow B$ and the confidence of $B \\Rightarrow A$, written $V_{A \\Rightarrow B}$ and $V_{B \\Rightarrow A}$ below.\n",
+    "\n",
+    "The Kulczynski measure $K_{A,B} \\in [0, 1]$ of the itemsets $A \\subseteq I$ and\n",
+    "$B \\subseteq I$ such that $A \\cap B = \\varnothing$, where $I$ is the set of all items, is given by\n",
+    "\n",
+    "$$K_{A,B} = \\frac{V_{A \\Rightarrow B} + V_{B \\Rightarrow A}}{2}$$\n",
+    "\n",
+    "$$K_{A,B} = \\frac{1}{2} \\Bigg[\\frac{\\mathrm{sup}(A \\cup B)}{\\mathrm{sup}(A)} + \\frac{\\mathrm{sup}(A \\cup B)}{\\mathrm{sup}(B)} \\Bigg]$$\n",
+    "\n",
+    "- If $K_{A,B} = 0$, then $A \\subseteq T$ implies that $B \\nsubseteq T$ for any transaction $T$\n",
+    "- If $K_{A,B} = 1$, then $A \\subseteq T$ implies that $B \\subseteq T$, and vice versa, for any transaction $T$\n",
+    "- Note that the Kulczynski measure is symmetric: $K_{A,B} = K_{B,A}$"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Imbalance Ratio:\n",
+    "\n",
+    "The imbalance ratio $I_{A,B}$ can be interpreted as the absolute difference between the support counts of $A$ and $B$, divided by the number of transactions that contain $A$, $B$, or both. With $N_X$ denoting the support count of an itemset $X$, the imbalance ratio $I_{A,B} \\in [0, 1]$ of the itemsets $A \\subseteq I$ and $B \\subseteq I$ is given by\n",
+    "\n",
+    "$$I_{A,B} = \\frac{|N_A - N_B|}{N_A + N_B - N_{A \\cup B}}$$\n",
+    "- If $I_{A,B} = 0$, then $A$ and $B$ have the same support\n",
+    "- If $I_{A,B} = 1$, then either $A$ or $B$ has zero support\n",
+    "- Note that the imbalance ratio is symmetric: $I_{A,B} = I_{B,A}$"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## References\n",
+    "\n",
+    "[1] Chapter 6 of J. Han, M. Kamber, J. Pei, “Data Mining: Concepts and Techniques”, 3rd edition, Elsevier/Morgan Kaufmann, 2012"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example 1 -- Evaluating the Kulczynski Measure of an Association Rule\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>antecedents</th>\n",
+       "      <th>consequents</th>\n",
+       "      <th>antecedent support</th>\n",
+       "      <th>consequent support</th>\n",
+       "      <th>support</th>\n",
+       "      <th>confidence</th>\n",
+       "      <th>lift</th>\n",
+       "      <th>leverage</th>\n",
+       "      <th>conviction</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>(Eggs)</td>\n",
+       "      <td>(Kidney Beans)</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>0.00</td>\n",
+       "      <td>inf</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>(Kidney Beans)</td>\n",
+       "      <td>(Eggs)</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>0.80</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>0.00</td>\n",
+       "      <td>1.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>(Eggs)</td>\n",
+       "      <td>(Onion)</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>0.75</td>\n",
+       "      <td>1.25</td>\n",
+       "      <td>0.12</td>\n",
+       "      <td>1.6</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>(Onion)</td>\n",
+       "      <td>(Eggs)</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>1.25</td>\n",
+       "      <td>0.12</td>\n",
+       "      <td>inf</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>(Milk)</td>\n",
+       "      <td>(Kidney Beans)</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>0.00</td>\n",
+       "      <td>inf</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>(Onion)</td>\n",
+       "      <td>(Kidney Beans)</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>0.00</td>\n",
+       "      <td>inf</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>(Yogurt)</td>\n",
+       "      <td>(Kidney Beans)</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>0.00</td>\n",
+       "      <td>inf</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>7</th>\n",
+       "      <td>(Eggs, Onion)</td>\n",
+       "      <td>(Kidney Beans)</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>0.00</td>\n",
+       "      <td>inf</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>8</th>\n",
+       "      <td>(Eggs, Kidney Beans)</td>\n",
+       "      <td>(Onion)</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>0.75</td>\n",
+       "      <td>1.25</td>\n",
+       "      <td>0.12</td>\n",
+       "      <td>1.6</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>9</th>\n",
+       "      <td>(Onion, Kidney Beans)</td>\n",
+       "      <td>(Eggs)</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>1.25</td>\n",
+       "      <td>0.12</td>\n",
+       "      <td>inf</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>10</th>\n",
+       "      <td>(Eggs)</td>\n",
+       "      <td>(Onion, Kidney Beans)</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>0.75</td>\n",
+       "      <td>1.25</td>\n",
+       "      <td>0.12</td>\n",
+       "      <td>1.6</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>11</th>\n",
+       "      <td>(Onion)</td>\n",
+       "      <td>(Eggs, Kidney Beans)</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>0.8</td>\n",
+       "      <td>0.6</td>\n",
+       "      <td>1.00</td>\n",
+       "      <td>1.25</td>\n",
+       "      <td>0.12</td>\n",
+       "      <td>inf</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "              antecedents            consequents  antecedent support  \\\n",
+       "0                  (Eggs)         (Kidney Beans)                 0.8   \n",
+       "1          (Kidney Beans)                 (Eggs)                 1.0   \n",
+       "2                  (Eggs)                (Onion)                 0.8   \n",
+       "3                 (Onion)                 (Eggs)                 0.6   \n",
+       "4                  (Milk)         (Kidney Beans)                 0.6   \n",
+       "5                 (Onion)         (Kidney Beans)                 0.6   \n",
+       "6                (Yogurt)         (Kidney Beans)                 0.6   \n",
+       "7           (Eggs, Onion)         (Kidney Beans)                 0.6   \n",
+       "8    (Eggs, Kidney Beans)                (Onion)                 0.8   \n",
+       "9   (Onion, Kidney Beans)                 (Eggs)                 0.6   \n",
+       "10                 (Eggs)  (Onion, Kidney Beans)                 0.8   \n",
+       "11                (Onion)   (Eggs, Kidney Beans)                 0.6   \n",
+       "\n",
+       "    consequent support  support  confidence  lift  leverage  conviction  \n",
+       "0                  1.0      0.8        1.00  1.00      0.00         inf  \n",
+       "1                  0.8      0.8        0.80  1.00      0.00         1.0  \n",
+       "2                  0.6      0.6        0.75  1.25      0.12         1.6  \n",
+       "3                  0.8      0.6        1.00  1.25      0.12         inf  \n",
+       "4                  1.0      0.6        1.00  1.00      0.00         inf  \n",
+       "5                  1.0      0.6        1.00  1.00      0.00         inf  \n",
+       "6                  1.0      0.6        1.00  1.00      0.00         inf  \n",
+       "7                  1.0      0.6        1.00  1.00      0.00         inf  \n",
+       "8                  0.6      0.6        0.75  1.25      0.12         1.6  \n",
+       "9                  0.8      0.6        1.00  1.25      0.12         inf  \n",
+       "10                 0.6      0.6        0.75  1.25      0.12         1.6  \n",
+       "11                 0.8      0.6        1.00  1.25      0.12         inf  "
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "from mlxtend.preprocessing import TransactionEncoder\n",
+    "from mlxtend.frequent_patterns import apriori, association_rules\n",
+    "from mlxtend.frequent_patterns import metrics\n",
+    "\n",
+    "dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],\n",
+    "           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],\n",
+    "           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],\n",
+    "           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],\n",
+    "           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]\n",
+    "\n",
+    "te = TransactionEncoder()  # one-hot encodes the transactions\n",
+    "te_ary = te.fit_transform(dataset)\n",
+    "df = pd.DataFrame(te_ary, columns=te.columns_)\n",
+    "freq_items = apriori(df, min_support=0.6, use_colnames=True)  # frequent itemsets\n",
+    "rules = association_rules(freq_items, metric=\"confidence\", min_threshold=0.7)\n",
+    "rules"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.875"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "a = frozenset(['Onion'])\n",
+    "b = frozenset(['Kidney Beans', 'Eggs'])\n",
+    "metrics.kulczynski_measure(rules, a, b)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example 2 -- Evaluating the Imbalance Ratio of an Association Rule"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.2500000000000001"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "a = frozenset(['Onion'])\n",
+    "b = frozenset(['Kidney Beans', 'Eggs'])\n",
+    "metrics.imbalance_ratio(freq_items, a, b)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.2"
+  },
+  "orig_nbformat": 4
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
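
A quick sanity check of the two example outputs in the notebook above: both values can be reproduced by hand from the supports in the rules table. The sketch below is only a cross-check under stated assumptions. The helper names `kulczynski` and `imbalance` are hypothetical and are not the `mlxtend` API, and relative supports are used in place of support counts, which leaves the imbalance ratio unchanged because the number of transactions cancels.

```python
# Hypothetical helpers for cross-checking the notebook; NOT the mlxtend API.

def kulczynski(sup_a, sup_b, sup_ab):
    """Average of confidence(A => B) and confidence(B => A)."""
    return 0.5 * (sup_ab / sup_a + sup_ab / sup_b)


def imbalance(sup_a, sup_b, sup_ab):
    """|sup(A) - sup(B)| divided by the support of 'A, B, or both'."""
    return abs(sup_a - sup_b) / (sup_a + sup_b - sup_ab)


# Supports read from the rules table for
# A = {Onion}, B = {Eggs, Kidney Beans}, A ∪ B = {Onion, Eggs, Kidney Beans}
sup_a, sup_b, sup_ab = 0.6, 0.8, 0.6

k = kulczynski(sup_a, sup_b, sup_ab)
ir = imbalance(sup_a, sup_b, sup_ab)
print(round(k, 3), round(ir, 3))  # prints: 0.875 0.25
```

These agree with the `0.875` and `0.2500000000000001` shown in the notebook outputs, up to floating-point rounding.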