For this analysis, I used two datasets from the Stack Overflow survey (2019 and 2020) to compare trends across the two years. The link to download the Stack Overflow dataset is here.
The libraries required to run this project are:
- pandas
- numpy
- matplotlib
- seaborn
- pandas_profiling
I chose this dataset for my project because I was interested in answering the following questions:
- Which languages saw the largest rise in usage from 2019 to 2020?
- Has 'Remote Work / Work from home' become a more important deciding factor over the years, and because of the pandemic?
- When did most people write their first line of code?
- Which group of people uses Stack Overflow the most for their work?
- Which factors are associated with a high salary?
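As an illustration of how the first question could be approached, here is a minimal sketch rather than the notebook's actual code. It assumes each survey CSV has a `LanguageWorkedWith` column holding semicolon-separated language names; that column name, the helper functions `language_share` and `share_change`, and the tiny made-up data frames are all assumptions for demonstration.

```python
import pandas as pd


def language_share(df: pd.DataFrame, column: str = "LanguageWorkedWith") -> pd.Series:
    """Return the fraction of answering respondents who listed each language."""
    answered = df[column].dropna()
    # Assumption: each answer is a semicolon-separated string, e.g. "Python;SQL".
    languages = answered.str.split(";").explode().str.strip()
    return languages.value_counts() / len(answered)


def share_change(df_old: pd.DataFrame, df_new: pd.DataFrame,
                 column: str = "LanguageWorkedWith") -> pd.Series:
    """Return the change in usage share per language, largest rise first."""
    old = language_share(df_old, column)
    new = language_share(df_new, column)
    # A language that is missing in one year is treated as having a share of 0.
    return new.sub(old, fill_value=0).sort_values(ascending=False)


# Made-up stand-ins for the 2019 and 2020 survey files (not real survey data).
survey_2019 = pd.DataFrame({"LanguageWorkedWith": ["Python;SQL", "Java", "Java;SQL", None]})
survey_2020 = pd.DataFrame({"LanguageWorkedWith": ["Python;SQL", "Python", "Java;Rust", "SQL"]})

change = share_change(survey_2019, survey_2020)
print(change)
```

With the real data, the two frames would instead come from `pd.read_csv` on the downloaded survey files. Comparing shares rather than raw counts keeps the result meaningful when the number of respondents differs between years.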
stackoverflowsurvey-analysis.ipynb: This is the notebook where I explore and wrangle the data and try to answer the questions above.
I have added comments to explain the thought process behind each step.
The main findings of the analysis can be found in the Medium post here.
Thanks to Stack Overflow for providing the dataset.