This is the third project of the Data Analyst Nanodegree.
I've selected the 'No-Show' dataset. It collects information from 100k medical appointments in Brazil and focuses on whether or not patients show up for their appointments.
I was interested in looking for correlations and trying to identify the reasons behind missed appointments.
I see one dependent variable, 'No-Show', which indicates whether an appointment was missed or kept, and 8 independent variables:
Gender, Age, SMS, Diabetes, Scholarship, Alcoholism, Day of the week, Hospital
The goal of the analysis is to check if any of these independent variables could help to predict missed appointments.
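
One simple way to approach this check is to compare the share of missed appointments across the values of each independent variable. The snippet below is only a minimal sketch of that idea, not the analysis itself: the helper `no_show_rate_by` is hypothetical, the column names and the "Yes" = missed encoding are assumptions about how the data is stored, and the sample rows are made up purely to show the input shape.

```python
from collections import defaultdict


def no_show_rate_by(rows, column, target="No-show"):
    """Return {group value: share of missed appointments} for one column.

    Hypothetical helper; the column names and the "Yes" encoding
    are assumptions, not taken from the dataset documentation.
    """
    missed = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        key = row[column]
        total[key] += 1
        if row[target] == "Yes":  # assumption: "Yes" marks a missed appointment
            missed[key] += 1
    return {key: missed[key] / total[key] for key in total}


# Made-up toy rows, only to illustrate the input shape; not real data.
sample = [
    {"Gender": "F", "SMS": 1, "No-show": "Yes"},
    {"Gender": "F", "SMS": 0, "No-show": "No"},
    {"Gender": "M", "SMS": 1, "No-show": "No"},
    {"Gender": "M", "SMS": 1, "No-show": "No"},
]

rates_by_gender = no_show_rate_by(sample, "Gender")
rates_by_sms = no_show_rate_by(sample, "SMS")
```

A large gap between groups for a given variable would suggest it is worth a closer look, although a difference in rates alone does not show that the variable causes missed appointments.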