Feature Selection using Bayesian Optimization

Instructions

Install Python dependencies

Required Python version: 3.6.9

Install virtualenv

pip3 install virtualenv==20.4.2

Create new virtualenv environment

virtualenv -p 3.6.9 virtualenv-feature-selection

Activate environment

source virtualenv-feature-selection/bin/activate

Install dependencies

pip3 install -r requirements.txt

A custom version of scikit-optimize has to be installed. Replace the following files in the folder "virtualenv-feature-selection/lib/python3.6/site-packages/skopt" with the respective files in https://github.com/patrickehrler/scikit-optimize/tree/master/skopt

space/space.py
optimizer/base.py
optimizer/optimizer.py
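
The replacement can also be scripted. The following is a minimal sketch, assuming the fork has already been cloned locally; the helper name replace_skopt_files and the example clone path are hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path

# The three files named above, relative to the "skopt" package folder.
PATCHED_FILES = ["space/space.py", "optimizer/base.py", "optimizer/optimizer.py"]


def replace_skopt_files(fork_skopt_dir, installed_skopt_dir):
    """Copy the patched files from a local clone of the fork over the installed ones.

    Both arguments are paths to a "skopt" folder. Returns the written paths.
    """
    src_root = Path(fork_skopt_dir)
    dst_root = Path(installed_skopt_dir)
    written = []
    for rel in PATCHED_FILES:
        src = src_root / rel
        dst = dst_root / rel
        if not src.is_file():
            raise FileNotFoundError(f"missing patched file: {src}")
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)
        written.append(dst)
    return written


# Example call (the first path is an assumed location of the cloned fork):
# replace_skopt_files(
#     "scikit-optimize/skopt",
#     "virtualenv-feature-selection/lib/python3.6/site-packages/skopt",
# )
```

The function raises an error before copying a file that is missing from the clone, so a wrong clone path is noticed instead of leaving the installed package partially replaced.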

Run experiment

Experiment settings (datasets, number of features, cross-validation, ...) can be specified in "config.py".
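
As an illustration of the kind of settings involved, a settings module might look like the sketch below. All variable names and values are assumptions made for this example; the actual names and defaults in "config.py" may differ.

```python
# Hypothetical layout of a settings module. The names and values below are
# illustrative assumptions, not the contents of the real "config.py".
N_ITERATIONS = 100        # maximum number of Bayesian optimization iterations
N_CV_SPLITS = 5           # number of cross-validation splits
N_FEATURES = [5, 10, 20]  # sizes of the feature subsets to select
DATASET_IDS = [31, 1464]  # placeholder OpenML dataset ids, not the thesis datasets
```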

Iteration experiment

This experiment runs only Bayesian optimization. No convergence criterion is applied; a run stops only when the maximum number of iterations is reached. The result consists of the training and testing scores of each iteration, which can be used to examine the convergence of a particular Bayesian approach.

python3 iter_experiment.py
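
One common way to examine convergence from per-iteration scores is to look at the best score found up to each iteration. The following is a minimal sketch, assuming the scores are available as a plain list in which higher is better; the function name running_best and the sample values are illustrative and do not reflect the output format of this repository.

```python
from itertools import accumulate


def running_best(scores):
    """Return the best (highest) score seen up to and including each iteration."""
    return list(accumulate(scores, max))


# Illustrative per-iteration test scores, not real results.
example_scores = [0.61, 0.58, 0.70, 0.66, 0.73, 0.71]
print(running_best(example_scores))
# prints [0.61, 0.61, 0.7, 0.7, 0.73, 0.73]
```

A curve that flattens early suggests that additional iterations add little for that approach.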

Comparison experiment

This experiment runs Bayesian optimization (including the convergence criterion) and all comparison approaches. The result consists of the final training and testing scores.

python3 comparison_experiment.py
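
scikit-optimize calls each callback with the intermediate result after every iteration and stops the run when a callback returns True. The following is a minimal sketch of a convergence criterion in that style, assuming "no improvement for a fixed number of iterations" as the rule. The class name, the patience and min_delta parameters, and the rule itself are illustrative assumptions; the criterion implemented in callback.py may differ.

```python
class NoImprovementStopper:
    """Stop once the best objective value has not improved for `patience` iterations.

    Follows the scikit-optimize callback convention: the object is called with
    the intermediate result and returns True to request early stopping.
    Assumes minimization, so lower values in `res.func_vals` are better.
    """

    def __init__(self, patience=10, min_delta=0.0):
        if patience < 1:
            raise ValueError("patience must be at least 1")
        self.patience = patience
        self.min_delta = min_delta

    def __call__(self, res):
        values = list(res.func_vals)
        if len(values) <= self.patience:
            return False  # not enough history yet
        best_before = min(values[:-self.patience])
        best_recent = min(values[-self.patience:])
        # Converged if the recent window did not beat the earlier best
        # by more than min_delta.
        return best_recent >= best_before - self.min_delta
```

An instance of such a class could be passed through the callback argument of a scikit-optimize minimizer, for example callback=[NoImprovementStopper(patience=10)].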

Results are stored in the folder "results/".

To leave the virtualenv environment enter "deactivate".

Create plots

All plots of the thesis can be created using the Jupyter notebook: jupyter-notebook/evaluation.ipynb. They are then saved as PDF in the folders "jupyter-notebook/iter_experiment_plots" and "jupyter-notebook/comparison_experiment_plots".

Repository Structure

jupyter-notebook/ Folder that contains the experiment from the proposal presentation and the visualizations for the final work.
results/ Folder where the results of the experiments are saved.
approaches.py Contains dictionaries that list all Bayesian and comparison approaches (including all adjustments).
bayesian_algorithms.py Implements the actual Bayesian optimization based on two different libraries.
callback.py Custom convergence callback function for scikit-optimize Bayesian optimization.
comparison_algorithms.py Implements different filter, wrapper and embedded feature selection techniques based on various libraries.
comparison_experiment.py Runs all possible Bayesian and comparison approaches (including CV), then stores results in "results/comparison_bayesian_experiment/".
config.py Configuration file where number of iterations, number of cross-validation splits, number of features and the used datasets can be set.
get_dataset_list_script.py Script that filters the datasets on openml.org based on the criteria set. Stores result in "/results/datasets/".
iter_experiment.py Runs all Bayesian approaches and saves score for each iteration. Output can be used to visualize convergence.
utils.py Utility functions concerning scores, conversion of vectors and estimators.
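
A feature subset is commonly represented either as a binary mask over all features or as a list of selected feature indices. The following is a minimal sketch of converting between the two, assuming this is the kind of vector conversion meant above; the function names are hypothetical and not taken from utils.py.

```python
def mask_to_indices(mask):
    """Return the positions of the selected (truthy) entries of a binary mask."""
    return [i for i, selected in enumerate(mask) if selected]


def indices_to_mask(indices, n_features):
    """Return a 0/1 mask of length n_features with the given positions set to 1."""
    selected = set(indices)
    for i in selected:
        if not 0 <= i < n_features:
            raise ValueError(f"feature index {i} out of range for {n_features} features")
    return [1 if i in selected else 0 for i in range(n_features)]


print(mask_to_indices([1, 0, 1, 1, 0]))
# prints [0, 2, 3]
print(indices_to_mask([0, 2, 3], 5))
# prints [1, 0, 1, 1, 0]
```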

patrickehrler/meta-modelling-feature-selection
