Sebastian Raschka, 2015

Python Machine Learning - Code Examples

Chapter 4 - Building Good Training Sets – Data Preprocessing

  • Dealing with missing data
    • Eliminating samples or features with missing values
    • Imputing missing values
    • Understanding the scikit-learn estimator API
  • Handling categorical data
    • Mapping ordinal features
    • Encoding class labels
    • Performing one-hot encoding on nominal features
  • Partitioning a dataset into training and test sets
  • Bringing features onto the same scale
  • Selecting meaningful features
    • Sparse solutions with L1 regularization
    • Sequential feature selection algorithms
  • Assessing feature importance with random forests
  • Summary