another minor update to feature_binarizer_from_trees.ipynb

Be-Secure · Mar 11, 2020 · 0be6808
1 parent ca927ee
Showing 1 changed file with 4 additions and 4 deletions.
diff --git a/examples/rbm/feature_binarizer_from_trees.ipynb b/examples/rbm/feature_binarizer_from_trees.ipynb
@@ -6,7 +6,7 @@
    "source": [
     "# `FeatureBinarizerFromTrees`\n",
     "\n",
-    "The `FeatureBinarizerFromTrees` transformer binarizes features for BooleanRuleCG (BRCG), LogisticRuleRegression (LogRR), and LinearRuleRegression (LinearRR) models. It generates binary features based on the splits in fitted decision trees. This approach naturally creates optimal thresholds and returns only important features. Compared to `FeatureBinarizer`, the `FeatureBinarizerFromTrees` transformer reduces the number of features required to produce an accurate model. Not only does this shorten training times, but more importantly, it often results in simpler rule sets.\n",
+    "The `FeatureBinarizerFromTrees` transformer binarizes features for BooleanRuleCG (BRCG), LogisticRuleRegression (LogRR), and LinearRuleRegression (LinearRR) models. It generates binary features (i.e. rules) based on the splits in fitted decision trees. This approach naturally creates optimal thresholds and returns only important features. Compared to `FeatureBinarizer`, the `FeatureBinarizerFromTrees` transformer reduces the number of features required to produce an accurate model. Not only does this shorten training times, but more importantly, it often results in simpler rule sets.\n",
     "\n",
     "This notebook demonstrates basic `FeatureBinarizerFromTrees`, compares `FeatureBinarizer`, and concludes with a formal performance comparison."
    ]
@@ -360,7 +360,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The model trains in around 10 seconds and appears to improve accuracy significantly. Though more features improved the fit in this case, it is important to point out that more features is not always better. For both explainability and accuracy, we suggest starting with a small number of features. From there, increase the number of features incrementally until accuracy plateaus or the explanation is sufficient."
+    "The model trains in around 10 seconds and appears to improve accuracy significantly. Though more features improved the fit in this case, it is important to point out that more features are not always better. For both explainability and accuracy, we suggest starting with a small number of features. From there, increase the number of features incrementally until accuracy plateaus or the explanation is sufficient."
    ]
   },
   {
@@ -398,7 +398,7 @@
    "source": [
     "## Using `FeatureBinarizerFromTrees` with Linear Models\n",
     "\n",
-    "To use `FeatureBinarizerFromTrees` with LogRR and LinearRR, set `returnOrd=True`. The transformer will return a standardized data frame of ordinal features in addition to the binarized features. The standardized features can then be passed to the linear model to improve accuracy. (Make sure to set `useOrd=True` for the linear model.)"
+    "To use `FeatureBinarizerFromTrees` with LogRR and LinearRR, set `returnOrd=True`. Like the standard `FeatureBinarizer`, the transformer will return a standardized data frame of ordinal features in addition to the binarized features. The standardized features can then be passed to the linear model to improve accuracy. (Make sure to set `useOrd=True` for the linear model.)"
    ]
   },
   {
@@ -604,7 +604,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The standard `FeatureBinarizer` creates thresholds by binning the data in a user-specified number of quantiles. The default setting of 9 thresholds creates 1,528 features for these data when negations are enabled. This is a very large feature space."
+    "The standard `FeatureBinarizer` creates thresholds by binning the data into a user-specified number of quantiles. The default setting of 9 thresholds creates 1,528 features for these data when negations are enabled. This is a very large feature space."
    ]
   },
   {