Skip to content

Commit c95a34e

Browse files
author
Joshua Gornall
authored
Update batch-to-online.ipynb (online-ml#597)
#Simple spelling error on line 20 ```diff - logisitc +logistic ```
1 parent ce811be commit c95a34e

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

docs/examples/batch-to-online.ipynb

+1-1
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@
1717
"\n",
1818
"The whole point of machine learning is to *learn from data*. In *supervised learning* you want to learn how to predict a target $y$ given a set of features $X$. Meanwhile in an unsupervised learning there is no target, and the goal is rather to identify patterns and trends in the features $X$. At this point most people tend to imagine $X$ as a somewhat big table where each row is an observation and each column is a feature, and they would be quite right. Learning from tabular data is part of what's called *batch learning*, which basically that all of the data is available to our learning algorithm at once. Multiple libraries have been created to handle the batch learning regime, with one of the most prominent being Python's [scikit-learn](https://scikit-learn.org/stable/).\n",
1919
"\n",
20-
"As a simple example of batch learning let's say we want to learn to predict if a women has breast cancer or not. We'll use the [breast cancer dataset available with scikit-learn](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html). We'll learn to map a set of features to a binary decision using a [logistic regression](https://www.wikiwand.com/en/Logistic_regression). Like many other models based on numerical weights, logisitc regression is sensitive to the scale of the features. Rescaling the data so that each feature has mean 0 and variance 1 is generally considered good practice. We can apply the rescaling and fit the logistic regression sequentially in an elegant manner using a [Pipeline](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html). To measure the performance of the model we'll evaluate the average [ROC AUC score](https://www.wikiwand.com/en/Receiver_operating_characteristic) using a 5 fold [cross-validation](https://www.wikiwand.com/en/Cross-validation_(statistics)). "
20+
"As a simple example of batch learning let's say we want to learn to predict if a women has breast cancer or not. We'll use the [breast cancer dataset available with scikit-learn](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html). We'll learn to map a set of features to a binary decision using a [logistic regression](https://www.wikiwand.com/en/Logistic_regression). Like many other models based on numerical weights, logistic regression is sensitive to the scale of the features. Rescaling the data so that each feature has mean 0 and variance 1 is generally considered good practice. We can apply the rescaling and fit the logistic regression sequentially in an elegant manner using a [Pipeline](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html). To measure the performance of the model we'll evaluate the average [ROC AUC score](https://www.wikiwand.com/en/Receiver_operating_characteristic) using a 5 fold [cross-validation](https://www.wikiwand.com/en/Cross-validation_(statistics)). "
2121
]
2222
},
2323
{

0 commit comments

Comments
 (0)