Group Members:
Edward Chen (ec1221)
Mubaarak Khan (mmk120)
Omar Ben-Gacem (ob420)
`main.py` has been set up to take in 3 arguments: `<dataset path>`, `<depth>`, and `<operating_mode>`.

Use `python main.py <dataset path> <depth> <operating_mode>` to run the project in one of the modes described below, e.g.:

`python main.py "WIFI_db/clean_dataset.txt" 20 "metrics"`
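For reference, the sketch below shows one way the three command-line arguments could be consumed. This is a minimal illustration assuming plain `sys.argv` parsing; it is not the actual argument handling in `main.py`.

```python
import sys


def main():
    # Expect exactly three positional arguments: dataset path, depth, operating mode.
    if len(sys.argv) != 4:
        print("Usage: python main.py <dataset path> <depth> <operating_mode>")
        sys.exit(1)

    dataset_path = sys.argv[1]
    depth_arg = sys.argv[2]       # either an integer depth or the string "tune"
    operating_mode = sys.argv[3]  # "show_tree", "metrics", "depth_benchmark" or "normal"

    # "tune" triggers hyperparameter tuning instead of a fixed depth (see below).
    depth = None if depth_arg == "tune" else int(depth_arg)

    print(f"Running mode {operating_mode!r} on {dataset_path} with depth {depth_arg}")


if __name__ == "__main__":
    main()
```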
| operating_mode | Function |
|---|---|
| `"show_tree"` | Generates an interactive Matplotlib figure to view the tree. All nodes appear as circles; clicking a node opens a popup with information about that node. |
| `"metrics"` | Updates the Confusion Matrix and Performance Metrics tables and places them in the figures file (see the sketch after this table for how such metrics can be computed). This may take a few minutes to run. |
| `"depth_benchmark"` | Updates the figures showing the performance of the decision tree at various depths. In this mode the depth argument is ignored, and all depths from 4 to 70 are plotted. NOTE: this computation takes a long time to run; see Recommended Depth Parameters for the output of this test when run in advance. |
| `"normal"` | Plots the normal distribution of the accuracy for all three classification methods over 10 folds. |
Instead of providing a specific integer `<depth>` argument to build the tree with, passing `tune` as the argument will first run hyperparameter tuning on the dataset to find the optimal depth, and will then use that value (a sketch of such a tuning loop is given below). NOTE: this is a very resource-intensive task.
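One way depth tuning of this kind can work is to cross-validate the tree over a range of candidate depths and keep the depth with the best mean validation accuracy. The sketch below is illustrative only: `train_tree` and `evaluate` are hypothetical helpers standing in for the project's own training and evaluation functions, and the depth range is assumed.

```python
import numpy as np


def tune_depth(data, labels, train_tree, evaluate, depths=range(4, 71), n_folds=10):
    """Return the depth with the highest mean cross-validated accuracy.

    `train_tree(x, y, depth)` and `evaluate(tree, x, y)` are assumed helpers,
    not the actual functions in this repository.
    """
    indices = np.random.permutation(len(data))
    folds = np.array_split(indices, n_folds)

    best_depth, best_score = None, -np.inf
    for depth in depths:
        scores = []
        for i in range(n_folds):
            val_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
            tree = train_tree(data[train_idx], labels[train_idx], depth)
            scores.append(evaluate(tree, data[val_idx], labels[val_idx]))
        mean_score = np.mean(scores)
        if mean_score > best_score:
            best_depth, best_score = depth, mean_score
    return best_depth
```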
Hyperparameter tuning runs multiple trials to find the optimal depth parameter for both the clean and noisy datasets; the resulting plots are shown below.
| Clean Dataset Depth Optimisation | Noisy Dataset Depth Optimisation |
|---|---|
| ![]() | ![]() |

Note the differing scales in the figures.
Both plots plateau at a certain depth. When running the clean dataset, it is recommended to use a depth of 20; for the noisy dataset, a depth of 50 is recommended. This gives the best performance of the model on each dataset.