Data science and data analysis now play a significant role across industries such as finance, e-commerce, business, education, and government. Organizations take a 360-degree view of their customers, analyzing behavior and interests to make decisions that serve them. Much of this analysis is done in a programming language such as Python, one of the most versatile languages available. Netflix is a well-known example of a company that built its success on analyzing its customers' interests.
Keywords: Data Visualization, Anaconda, Jupyter Notebook, Exploratory Data Analysis, Machine Learning.
The main goal of this course is to help students and faculty learn, understand, and practice data analysis and machine learning approaches, including the study of modern data computing technologies and the scaling up of machine learning techniques, with a focus on industry applications. The course objectives are the conceptualization and summarization of data analysis and machine learning computing technologies, machine learning techniques, and approaches to scaling up machine learning.
Day | Topic Name | Sub Topics | Duration |
---|---|---|---|
1 | Introduction to Data and Data Analysis Using Python | Introduction to Data; Types of Data in Statistics (Numerical & Categorical); Types of data in real world; Python Introduction | 2.5 hrs. |
2 | Data Manipulation with NumPy | Introduction; NumPy Arrays; NumPy Basics; Math; Random; Indexing | 2.5 hrs. |
3 | Introduction to Pandas and Pandas Series | Filtering; Statistics; Aggregation; Saving Data; Introduction; Series | 2.5 hrs. |
4 | Data Analysis with pandas | DataFrame; Combining; Indexing; File I/O; Grouping; Features; Filtering; Sorting; Statistics; Plotting | 2.5 hrs. |
5 | Data Preprocessing with Scikit-Learn | Introduction; Standardizing Data; Data Range; Robust Scaling; Normalizing Data; Data Imputation | 2.5 hrs. |
6 | Cleaning Data in Python | Working with Duplicates and Missing Values; Which values should be replaced with missing values based on data; Identifying and Eliminating Outliers; Dropping duplicate data; Filling missing data; Applying on raw dataset and introduction to Kaggle and other data sources | 2.5 hrs. |
7 | Introduction to Data Visualization and Matplotlib | Introduction to Visualization and Python packages; Matplotlib history; Introduction to plotting; Line Plot; Scatter Plot; Bar Graph; Histogram; Pie Chart; Box Plot; Tasks | 2.5 hrs. |
8 | Data Visualization using Seaborn | Using Seaborn Styles; Setting the default style; Color Palettes; Creating Custom Palettes; stripplot() and swarmplot(); boxplots, violinplots; barplots, pointplots and countplots | 2.5 hrs. |
9 | Data Visualization using Seaborn | Barplots, Pointplots and countplots; Regression Plots; Pair Plots; Creating heatmaps; Overview of the course | 2.5 hrs. |
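
To give a flavor of the hands-on work in the schedule above, here is a minimal sketch of the kind of exercise Days 5 and 6 point toward: dropping duplicate rows, filling missing data, and standardizing a column. The dataset, column names, and the `clean` and `standardize` helpers are hypothetical and are not taken from the course material. The standardization is written by hand in pandas rather than with Scikit-Learn, so treat it as an illustration of the idea, not the course's own code.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: one exact duplicate row and one missing age.
raw = pd.DataFrame(
    {
        "customer": ["a", "b", "b", "c", "d"],
        "age": [25.0, 32.0, 32.0, np.nan, 41.0],
        "spend": [120.0, 80.0, 80.0, 200.0, 150.0],
    }
)


def clean(df):
    """Drop exact duplicate rows, then fill missing ages with the median age."""
    out = df.drop_duplicates().reset_index(drop=True)
    out["age"] = out["age"].fillna(out["age"].median())
    return out


def standardize(series):
    """Rescale a numeric column to zero mean and unit (population) standard deviation."""
    return (series - series.mean()) / series.std(ddof=0)


cleaned = clean(raw)
cleaned["spend_z"] = standardize(cleaned["spend"])
print(cleaned)
```

Filling with the median is only one possible imputation choice. Which strategy is appropriate depends on the data, which is the question the Day 6 topics raise.
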
Students must have knowledge of Python programming and statistics.
- A laptop or desktop with an i3 or better processor is required
- 4 GB or more of RAM is recommended
- Good internet connectivity
- OS: Windows 10 is preferable