https://www.kaggle.com/datasets/spscientist/students-performance-in-exams/discussion

The dataset contains fictional data about high school students' performance on a multi-subject exam, in relation to environmental and economic status factors.

TO DO:
Define end-result goals, in terms of functionality as well as statistics and visuals (graphs/charts).
Import data CSV into Python using Pandas
Understand data structure; iterate through data, assign columns to variables, etc.
Potentially organize/categorize data (e.g., by gender, parental college).
Use Scikit-learn to implement algorithms.
    Decision Trees, Random Forests, Neural Networks
Split the data into training and testing sets.
Train each model using the training data.
Evaluate the models
    accuracy, precision, recall, etc.
Create data visualizations
    e.g., use matplotlib to create graphs, charts, etc.
Analyze the results
    Relationships between variables: correlation, causation, similarities, etc.
Document the code, and keep track of results/statistics
Create Report
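
The data-handling, analysis, and plotting steps in the list above could be sketched roughly as follows. This is only a sketch under assumptions: the column names (`math score`, `reading score`, `writing score`) are a guess at the Kaggle CSV header and should be checked against the actual file, and the helper function names are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Assumed column names; verify them against the real CSV header.
SCORE_COLUMNS = ["math score", "reading score", "writing score"]


def load_scores(csv_source):
    """Read the exam CSV (a path or file-like object) into a DataFrame."""
    return pd.read_csv(csv_source)


def mean_scores_by(df, category):
    """Average each score column within the groups of one categorical column."""
    return df.groupby(category)[SCORE_COLUMNS].mean()


def score_correlations(df):
    """Pearson correlation matrix of the score columns."""
    return df[SCORE_COLUMNS].corr()


def plot_mean_scores(df, category):
    """Bar chart of the mean score per group; returns the Figure."""
    means = mean_scores_by(df, category)
    fig, ax = plt.subplots()
    means.plot(kind="bar", ax=ax)
    ax.set_xlabel(category)
    ax.set_ylabel("mean score")
    ax.set_title(f"Mean scores by {category}")
    fig.tight_layout()
    return fig
```

Assuming the CSV is saved locally as `StudentsPerformance.csv` (an assumed file name), usage might look like `df = load_scores("StudentsPerformance.csv")`, followed by `df.info()` to inspect the structure, `mean_scores_by(df, "gender")` for a grouped summary, and `plot_mean_scores(df, "gender").savefig("scores_by_gender.png")` to save a chart. A high value from `score_correlations(df)` shows that two scores move together; it does not by itself show that one causes the other.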
TO INSTALL REQUIREMENTS: pip install -r requirements.txt
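
Assuming pandas and scikit-learn are among the installed requirements, the split, train, and evaluate steps from the to-do list might be sketched as below. The function name `compare_models`, the hyperparameters, and the idea of a binary pass/fail target are all illustrative assumptions; the dataset does not define a target label, so the project would need to choose one.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier


def compare_models(features, target, test_size=0.25, seed=0):
    """Train three classifiers on one split and return their test metrics.

    `target` is expected to be a binary 0/1 Series (an assumption of this sketch).
    """
    # One-hot encode categorical columns so every model sees numeric inputs.
    X = pd.get_dummies(features, dtype=float)
    X_train, X_test, y_train, y_test = train_test_split(
        X, target, test_size=test_size, random_state=seed, stratify=target
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "neural_network": MLPClassifier(
            hidden_layer_sizes=(16,), max_iter=1000, random_state=seed
        ),
    }
    rows = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predicted = model.predict(X_test)
        rows[name] = {
            "accuracy": accuracy_score(y_test, predicted),
            "precision": precision_score(y_test, predicted, zero_division=0),
            "recall": recall_score(y_test, predicted, zero_division=0),
        }
    # One row per model, one column per metric.
    return pd.DataFrame.from_dict(rows, orient="index")


# Hypothetical usage, with a made-up pass mark of 60 on an assumed column:
# target = (df["math score"] >= 60).astype(int)
# features = df.drop(columns=["math score"])
# print(compare_models(features, target))
```

Using one shared split for all three models keeps the comparison like-for-like, and `stratify=target` keeps the class balance similar in the training and testing sets. The neural network will usually benefit from feature scaling, which this sketch leaves out for brevity.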