Welcome to my collection of data science and machine learning tasks. This repository contains a series of single-notebook solutions to various data-related problems.
These tasks represent a range of data analysis and machine learning exercises, typically completed within a single Jupyter notebook. Some of these may be derived from interview challenges I've encountered, while others are practice exercises or small personal projects.
Key points about this collection:
- Each notebook focuses on a specific data analysis or machine learning task
- Tasks vary in complexity but are generally designed to be concise and focused
- The solutions demonstrate my approach to data problems and coding style
- The data is synthetic, drawn from public datasets, or modified to preserve privacy
Feel free to explore the notebooks. If you have any questions about the approaches used or would like to discuss any of the tasks, please don't hesitate to reach out!
- Clone the repository:
git clone https://github.com/bab-git/data-science-interviews.git
- Python Environment Setup: Ensure you have Python 3.12.0 installed. You can install it using pyenv:
pyenv install 3.12.0
pyenv local 3.12.0
- Use `make` to set up your environment (strict setup recommended):
- Option A — Strict (preferred for reproducibility)
make setup # for strict requirements
- Option B — Flexible (use with caution)
make setup-flex # for flexible requirements
⚠️ This is more forgiving with package versions but may require minor code fixes, depending on your Python version.
- Run a specific task notebook:
make notebook TASK=task_folder_name
This will:
- Check that your Python version is 3.12.0
- Install any task-specific `requirements.txt` if available
- Launch Jupyter Notebook in the given task directory
- To reset the environment:
make clean
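The Makefile itself is not reproduced in this README, so the following is only a rough Python sketch of the three steps that `make notebook` is described as performing. The function names `python_version_ok` and `plan_notebook_commands` are hypothetical, and launching Jupyter with `python -m notebook` is an assumption rather than the repository's actual command. The sketch only builds the commands and does not run them.

```python
import sys
from pathlib import Path

# Version stated in the setup steps above.
REQUIRED_PYTHON = (3, 12, 0)


def python_version_ok(version=None):
    """Return True when the given (or current) interpreter version is 3.12.0."""
    if version is None:
        version = sys.version_info[:3]
    return tuple(version)[:3] == REQUIRED_PYTHON


def plan_notebook_commands(task, repo_root="."):
    """Build the commands for one task folder without executing them.

    Hypothetical helper: the real Makefile target may differ.
    """
    task_dir = Path(repo_root) / task
    if not task_dir.is_dir():
        raise FileNotFoundError(f"Task folder not found: {task_dir}")

    commands = []
    requirements = task_dir / "requirements.txt"
    if requirements.is_file():
        # Task-specific dependencies are optional, so install only if present.
        commands.append(
            [sys.executable, "-m", "pip", "install", "-r", str(requirements)]
        )
    # Assumed launch command; intended to be run with the task folder as cwd.
    commands.append([sys.executable, "-m", "notebook"])
    return commands
```

To actually run the plan, each command could be passed to `subprocess.run(cmd, cwd=task_dir, check=True)` after `python_version_ok()` returns `True`. Keeping the planning step separate from execution makes it easy to inspect what would run before anything is installed.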
- Predict Advertisement Response: A supervised model that predicts how residents respond to direct-mail advertisements.
- Predictive Modeling for Material Strength: A regression model that estimates material strength.
- Recipe Recommender for Grocery Apps: Presentation slides detailing the business proposal, system architecture, and implementation strategy for a Recipe Recommender System.
- Customer Satisfaction Prediction: A classification model to predict customer satisfaction of an online store based on demographic, transactional, and behavioral data.
- Hotel Staff Size Estimation via Regression: A regression model to estimate the optimal number of hotel staff for prospective buyers based on various hotel characteristics.
- [Other tasks will be added soon]
- Data Cleaning and Preprocessing
- Exploratory Data Analysis (EDA)
- Feature Engineering
- Machine Learning (Supervised and Unsupervised)
- Deep Learning
- Time Series Analysis
- Data Visualization
- Python
- Pandas, NumPy
- Scikit-learn
- TensorFlow, PyTorch
- Matplotlib, Seaborn
- Jupyter Notebooks
While this repository is primarily for showcasing my work, I welcome discussions and suggestions. Feel free to open an issue if you have any questions or ideas for improvement.
This project is licensed under the MIT License - see the LICENSE file for details.
LinkedIn: https://www.linkedin.com/in/bhosseini/
GitHub: https://github.com/bab-git/