Data science and analysis play a significant role today across nearly every industry, for example finance, e-commerce, business, education, and government. Organizations now take a 360-degree view of their customers, analyzing behavior and interests in order to make decisions in the customers' favor. Data is commonly analyzed with a programming language such as Python, one of the most versatile languages available. Netflix is a well-known example of a data-driven business that reached the top by closely analyzing its customers' interests. Keywords: Data Visualization, Anaconda, Jupyter Notebook, Exploratory Data Analysis, Machine Learning.
Day | Topic Name | Sub Topics | Duration |
---|---|---|---|
1 | Introduction to Data and Data Analysis Using Python | Introduction to Data; Types of Data in Statistics (Numerical & Categorical); Types of data in real world; Python Introduction | 2.5 hrs. |
2 | Introduction to Python & Conditional Statements | Literate Programming; Jupyter Notebook Environment; Markdown format for documentation; Python basics; Operators in Python; Conditional Statements in Python | 2.5 hrs. |
3 | Loops and Data Structures in Python | Iterations; Strings; String Functions, String Slicing; Python Data Structures; Lists; List Methods; Tuples; Tuple Methods | 2.5 hrs. |
4 | Files, Packages and Functional Programming | Dictionaries; Dictionary Methods; File Handling; Packages and Modules; List & Dictionary Comprehension | 2.5 hrs. |
5 | Data Manipulation with NumPy | Introduction; NumPy Arrays; NumPy Basics; Math; Random; Indexing | 2.5 hrs. |
6 | Introduction to Pandas and Pandas Series | Filtering; Statistics; Aggregation; Saving Data; Introduction; Series | 2.5 hrs. |
7 | Data Analysis with pandas | DataFrame; Combining; Indexing; File I/O; Grouping; Features; Filtering; Sorting; Statistics; Plotting | 2.5 hrs. |
8 | Data Preprocessing with Scikit-Learn | Introduction; Standardizing Data; Data Range; Robust Scaling; Normalizing Data; Data Imputation | 2.5 hrs. |
9 | Cleaning Data in Python | Working with Duplicates and Missing Values; Which values should be replaced with missing values based on data; Identifying and Eliminating Outliers; Dropping duplicate data; Filling missing data; Applying on raw dataset and introduction to Kaggle and other data sources | 2.5 hrs. |
10 | Introduction to Data Visualization and Matplotlib | Introduction to Visualization and Python packages; Matplotlib history; Introduction to plotting; Line Plot; Scatter Plot; Bar Graph; Histogram; Pie Chart; Box Plot; Tasks | 2.5 hrs. |
11 | Data Visualization using Seaborn | Using Seaborn Styles; Setting the default style; Color Palettes; Creating Custom Palettes; stripplot() and swarmplot(); boxplots, violinplots and lvplots; barplots, pointplots and countplots | 2.5 hrs. |
12 | Data Visualization using Seaborn | Using Seaborn Styles; Setting the default style; Color Palettes; Regression Plots; Binning data; Matrix plots; Creating heatmaps | 2.5 hrs. |
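
To give a flavor of the Day 8 and Day 9 topics, the following is a minimal sketch and not course material. It assumes only Python's standard library rather than pandas or scikit-learn, and the sample values and function names are made up for illustration. It fills a missing value with the median, flags outliers with the common 1.5 × IQR rule, and standardizes the remaining values to zero mean and unit variance.

```python
import statistics


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def standardize(values):
    """Scale values to zero mean and unit population standard deviation.

    Assumes the values are not all identical (otherwise the deviation is 0).
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical raw column with one missing entry and one extreme value.
raw = [10.0, 12.0, None, 11.0, 13.0, 95.0, 12.0]

filled = fill_missing(raw)                       # None becomes the median
outliers = iqr_outliers(filled)                  # extreme values to drop
cleaned = [v for v in filled if v not in outliers]
scaled = standardize(cleaned)                    # mean 0, std 1

print("filled:  ", filled)
print("outliers:", outliers)
print("scaled:  ", [round(v, 3) for v in scaled])
```

In practice the same steps would more likely be written with the libraries listed in the schedule; the standard-library version is used here only to keep the sketch self-contained.
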
The main goal of this course is to help students and faculty learn, understand, and practice data analysis and machine learning approaches. This includes the study of modern data computing technologies and the scaling up of machine learning techniques, with a focus on industry applications. The course objectives are mainly to conceptualize and summarize data analysis and machine learning computing technologies, machine learning techniques, and approaches to scaling up machine learning.
Students must have knowledge of Python programming and statistics.
- An i3 or better processor (laptop or desktop) is required
- 4 GB or more of RAM is recommended
- Good internet connectivity
- OS: Windows 10 is preferable