Authors: Artyom Abramovich, Aviv Kotek and Omer Jacobi
At the bottom of main.py, uncomment the relevant function call and run the file.
- run_similarity - follow the on-screen instructions to get the players most similar to a player of your choice.
- plot_similarity - follow the on-screen instructions to get a word cloud of the players most similar to a player of your choice.
- plot_pca - plot the [Midfielders, Defenders, Attackers, Goalkeepers] clusters of our FIFA19 data.
- plot_clustering - plot our data clustered into 4 clusters.
- run_clustering - print the main PCA components for PC-1 and PC-2.
- clusters_distribution - print the distribution of clusters for every player position.
- determine_num_of_clusters - show the elbow plot for k-means.
- make_predictions - run our Transfer Benefit Prediction model.
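To give a feel for what a "most similar players" lookup involves, here is a minimal sketch in plain Python. It is not the code in similarity.py: the cosine similarity metric, the function names cosine_similarity and most_similar, and the player names and attribute values are all assumptions made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(players, name, top_n=3):
    """Return up to top_n (player, score) pairs, most similar to `name` first."""
    target = players[name]
    scores = [
        (other, cosine_similarity(target, vector))
        for other, vector in players.items()
        if other != name
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Hypothetical data: three invented attributes per player
# (for example pace, passing, defending). Not taken from the FIFA19 dataset.
players = {
    "Player A": [90, 85, 40],
    "Player B": [88, 86, 42],
    "Player C": [30, 35, 90],
}

if __name__ == "__main__":
    for other, score in most_similar(players, "Player A", top_n=2):
        print(other, round(score, 3))
```

Cosine similarity compares the shape of two attribute profiles rather than their absolute size, which is one common choice for this kind of task. The actual project may use a different metric or additional preprocessing.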
Description of project files:
Our project has 2 main directories:
- data_parsing directory:
  This directory holds all the files relevant to the data analysis, clustering and Player Similarity parts of the project.
  - csv directory - Contains our FIFA19 dataset. One of the CSVs is the original dataset; the other was processed and modified by us for use in the project.
  - clustering.py - Code responsible for data clustering tasks.
  - constants.py - Constants (e.g. column names) used across the .py files in our project.
  - processed.csv - Processed FIFA19 CSV used for Player Similarity.
  - similarity.py - Code responsible for Player Similarity tasks.
  - utils.py - Helper functions used across the .py files in our project.
  - visualization.py - Code responsible for some of our project visualizations.
- predictor directory:
  - predictor.py - The file containing our Transfer Benefit Prediction model.
  - csv directory:
    - prediction_data.csv - The input data for our Transfer Benefit Prediction model.
    - transfers.csv - All of the transfers we managed to find using the available data.
    - The rest of the CSV files are intermediate steps between the raw data and the creation of transfers.csv.
  - The rest of the .py files were used only to process the FIFA08-FIFA16 dataset and create prediction_data.csv.
- If you wish to run the project, first install the package dependencies listed in requirements.txt.
- The CSVs for the Transfer Benefit Prediction part were created using a database.sql file, which is not included in the repository. If you wish to get it, please download it from here.