Multicollinearity:

Multicollinearity has been an issue for some machine learning models. A few algorithms are not affected by it, but at the very least, reducing redundant features makes the model less expensive in terms of computational power.

This function reduces the feature space based on the correlation between features, while also looking at the correlation between each feature and the target variable.

A simple rule of thumb: if features A and B are highly correlated, we need to drop one of them. We will drop the feature (say, feature B) that has:

A higher average correlation with all the other variables in the data set
A lower correlation with the target than the other feature
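
The rule of thumb above can be sketched roughly as follows. This is only an illustration, not the code in this repository: the helper name `pick_feature_to_drop` is hypothetical, and so is the way the two criteria are combined (correlation with the target is compared first, and average correlation with the other features is used only as a tie-breaker). It also assumes every column is numeric and uses the default Pearson correlation from `DataFrame.corr`.

```python
import pandas as pd


def pick_feature_to_drop(data: pd.DataFrame, feat_a: str, feat_b: str, target: str) -> str:
    """Hypothetical helper: choose which of two correlated features to drop."""
    # Absolute correlations between every pair of columns (target included).
    corr = data.corr().abs()
    features = [c for c in data.columns if c != target]

    def avg_corr_with_others(feat: str) -> float:
        # Mean absolute correlation with all other features (target excluded).
        others = [c for c in features if c != feat]
        return float(corr.loc[feat, others].mean())

    target_a = corr.loc[feat_a, target]
    target_b = corr.loc[feat_b, target]

    # Assumption: drop the feature that is less correlated with the target.
    if target_a != target_b:
        return feat_a if target_a < target_b else feat_b

    # Assumption: on a tie, drop the feature that is more correlated
    # with the rest of the data set on average.
    if avg_corr_with_others(feat_a) >= avg_corr_with_others(feat_b):
        return feat_a
    return feat_b
```

Calling `pick_feature_to_drop(df, "A", "B", "y")` on a DataFrame `df` with columns `A`, `B` and target `y` returns the name of the column this sketch would drop.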

Although I found it useful, simply running this code on its own may not be enough, because it does not consider the impact of feature interactions. I am developing another, more comprehensive preprocessing function that will take care of this issue, and I will post it later on. For now, it is better to use this code once you have done your feature engineering / feature interactions.

Also, this is supposed to work for regression and two-class classification problems.

Finally, please let me know if you find any glitches or room for improvement (I am pretty sure there is plenty). After all, we all learn from each other's mistakes :-)

Thanks, Fahad

Instructions:

The function takes three arguments:

Data: a pandas DataFrame is required
Threshold: the minimum level of correlation that you want to see between variables, between 0 and 1, in absolute values
Target: specify the target column (y)
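
To make the Threshold argument more concrete, here is a minimal sketch that lists the feature pairs whose absolute correlation is at or above a given threshold. The function name `correlated_pairs` is hypothetical and is not this repository's API; the sketch assumes all columns are numeric and relies on the default Pearson correlation from `DataFrame.corr`.

```python
import pandas as pd


def correlated_pairs(data: pd.DataFrame, threshold: float, target: str):
    """Hypothetical helper: feature pairs with |correlation| >= threshold."""
    # Correlate the features only; the target column is left out here.
    corr = data.drop(columns=[target]).corr().abs()
    cols = list(corr.columns)

    pairs = []
    for i, a in enumerate(cols):
        # Visit each unordered pair once (upper triangle of the matrix).
        for b in cols[i + 1:]:
            if corr.loc[a, b] >= threshold:
                pairs.append((a, b, float(corr.loc[a, b])))
    return pairs
```

With a threshold such as 0.9, only strongly related pairs are returned; lowering the threshold flags more pairs as candidates for dropping one of the two features.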

mfahadakbar/Feature_selection_with_Corr

Folders and files

Latest commit

History

Repository files navigation

Multicollinearity:

Instructions:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages