Please note that the resources here are the ones our team at Dotlas found helpful while learning these topics ourselves. This does not mean you must follow the same guides to understand these topics. Often, what you need to learn is just a web search away.
Treat the helper materials here as just recommendations / reference.
Python is a programming language that is commonly used for data science and data engineering.
If you are familiar with Python, you can skip this section.
If you're familiar with Git & GitHub, you can skip this section.
Git is a version control tool that helps you track changes to your code / project. GitHub stores your Git repositories online.
- Git & GitHub Cheatsheet
- Traversy Media's Git & GitHub Course
- Corey Schafer's Git & GitHub Course
- ACM's Coding Bootcamp for Git
If you're familiar with Pandas, you can skip this section.
Pandas can be used for storing scraped data and saving it to a file (such as CSV, JSON or Parquet).
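
As a rough illustration of that workflow, here is a minimal sketch that loads a few scraped records into a DataFrame and serializes them. The records and field names (`name`, `rating`, `city`) are made up for the example, and the file names in the comments are placeholders.

```python
import io

import pandas as pd

# Hypothetical scraped records; the field names are illustrative only.
scraped_rows = [
    {"name": "Cafe A", "rating": 4.5, "city": "Dubai"},
    {"name": "Cafe B", "rating": 3.9, "city": "Abu Dhabi"},
]

# Store the scraped data in a DataFrame (one row per record).
df = pd.DataFrame(scraped_rows)

# With no path argument, to_csv and to_json return the serialized text.
csv_text = df.to_csv(index=False)
json_text = df.to_json(orient="records")

# To write to disk instead, pass a file path (placeholder names):
# df.to_csv("scraped.csv", index=False)
# df.to_json("scraped.json", orient="records")
# df.to_parquet("scraped.parquet")  # requires pyarrow or fastparquet

# Read the CSV text back to confirm the round trip.
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```

Passing `index=False` keeps the DataFrame's row index out of the CSV, which is usually what you want for scraped data that has no meaningful index.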