Using a publicly available list of wines and taster reviews, we applied a series of Natural Language Processing steps to enrich the data set with wine type and flavor profiles. A terminal app then takes your answers to a series of questions and returns wine recommendations. We also scraped a local grocery store's inventory to see which of these wines are available locally.
See recommender.gif for a screen capture of the terminal app, ProjectPresentation.pdf for the presentation, and Write_Up.pdf for the full report.
- Download Repo
- Run Application Script: filters.py
- Follow CLI prompts
- Sample Output: recommendations.txt
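
As a rough illustration of how answers to the CLI prompts could be turned into recommendations, here is a minimal sketch. It is not the code in filters.py: the `Wine` class, the `recommend` function, and the `wine_type` and `flavors` fields are hypothetical names chosen for this example, and ranking by review points is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wine:
    """One enriched wine record (field names are assumptions for this sketch)."""
    title: str
    wine_type: str        # e.g. "red" or "white"
    flavors: frozenset    # flavor tags attached during enrichment
    points: int           # reviewer score
    price: float


def recommend(wines, wine_type, wanted_flavors, max_price, top_n=3):
    """Keep wines that match every answer, then rank by score (ties: cheaper first)."""
    wanted = set(wanted_flavors)
    matches = [
        w for w in wines
        if w.wine_type == wine_type
        and w.price <= max_price
        and wanted <= w.flavors   # every requested flavor must be present
    ]
    return sorted(matches, key=lambda w: (-w.points, w.price))[:top_n]
```

Each prompt answer narrows the candidate set, so a stricter combination of answers can return fewer than `top_n` wines, or none at all.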
Original dataset: winemag-data-130k-v2.csv Source: https://www.kaggle.com/christopheiv/winemagdata130k
Original grape variety list: grape_list.csv Source: https://en.wikipedia.org/wiki/List_of_grape_varieties
- Parser script: grape_list_parser.py
- Output: grape_list_parsed.csv
- Unit test: grape_list_parser_test.py
- Matching script: grape_id.py
- Unit test: grape_id_test.py
- Output: wine_with_type.csv
- Definitions: 3. NLP Definitions.pdf
- NLP Test on Subset: Desc_Test_1_Box.ipynb
- Chunked Categorizer Script: Add_Categories.py
- Output: wine_with_flavors.csv
- Scraper: grocer_scraper.py (modify for your grocer's website)
- Output: grocery_list.csv
- Comparer Script: grocer_availability.py
- Output: wine_with_availability.csv
- Notebook: reviewer_analysis.ipynb
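
The chunked categorizer step listed above (Add_Categories.py) can be pictured with the sketch below. It assumes a simple keyword lookup over the review text, which may differ from the project's actual NLP approach. The `FLAVOR_KEYWORDS` lexicon, the function names, and the `flavors` output column are hypothetical; only the `description` column name comes from the original data set.

```python
import csv
import re
from itertools import islice

# Hypothetical lexicon; the project's real definitions are in the NLP Definitions PDF.
FLAVOR_KEYWORDS = {
    "fruity": {"cherry", "berry", "plum", "apple"},
    "oaky": {"oak", "vanilla", "toast"},
    "earthy": {"earth", "mushroom", "leather"},
}


def tag_flavors(description):
    """Return the sorted flavor labels whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(label for label, kws in FLAVOR_KEYWORDS.items() if words & kws)


def iter_chunks(rows, size):
    """Yield lists of up to `size` rows so the whole file never sits in memory."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def add_flavor_column(src, dst, chunk_size=1000):
    """Copy CSV rows from src to dst, appending a semicolon-joined 'flavors' column."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=list(reader.fieldnames) + ["flavors"])
    writer.writeheader()
    for chunk in iter_chunks(reader, chunk_size):
        for row in chunk:
            row["flavors"] = ";".join(tag_flavors(row["description"]))
        writer.writerows(chunk)
```

Processing in chunks keeps memory use bounded on the 130k-row review file; `src` and `dst` are any open text file objects, for example the results of `open(..., newline="")`.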
This is a continuation of a group final project for CS 5010: Python for Data Analysis, with contributors Nikki Aaron, Bev Dobrenz, Joseph Wysocki, and Amanda West.