This repository contains research and findings that demonstrate a data processing cycle consisting of:
- Reading, cleaning, and pre-processing raw data,
- Exploratory Data Analysis (EDA) with statistical and graphical techniques,
- Feature engineering for numeric and categorical features,
- Model training and evaluation.
The contents of the repo are:
- Code: Jupyter notebooks are stored in `cyberdata_mlai`,
- Data: included in the directory `cyberdata_mlai/data`,
- Packages: the file `requirements.txt` lists all required packages. These can be installed using `pip install -r requirements.txt`.
To set up a development environment with VS Code and Jupyter, follow the instructions in vs-code-ml. The Jupyter notebooks can also be run in Google Colab, but the required packages will need to be installed from scratch there.
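For a fresh environment such as Google Colab, the install step could look roughly like the sketch below. It assumes the repository has already been cloned or uploaded so that `requirements.txt` is in the working directory. The helper names `pip_install_command` and `install_requirements` are hypothetical and not part of this repo.

```python
import subprocess
import sys
from pathlib import Path


def pip_install_command(requirements="requirements.txt"):
    """Build the pip command for the interpreter that runs the notebook."""
    # Using sys.executable keeps the install in the same environment as the kernel.
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


def install_requirements(requirements="requirements.txt"):
    """Install the packages listed in the requirements file (hypothetical helper)."""
    path = Path(requirements)
    if not path.is_file():
        # Assumption: the file is expected in the current working directory.
        raise FileNotFoundError(
            f"{path} not found; clone or upload the repository first"
        )
    subprocess.check_call(pip_install_command(path))
```

Calling `install_requirements()` in the first notebook cell should then install the packages before the other cells run. It is equivalent to running `pip install -r requirements.txt` from a terminal.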