Data Analysis Using Python

Introduction

Data science and data analysis now play a significant role in nearly every industry, including finance, e-commerce, business, education, and government. Organizations take a 360-degree view of their customers, analyzing their behavior and interests in order to make decisions that serve them. Much of this analysis is done in a programming language such as Python, one of the most versatile languages available. Netflix is a frequently cited example of a company that grew by analyzing the interests of its customers.

Keywords: Data Visualization, Anaconda, Jupyter Notebook, Exploratory Data Analysis, Machine Learning.

Course Objectives

The main goal of this course is to help students and faculty learn, understand, and practice data analysis and machine learning approaches. This includes the study of modern data computing technologies and of scaling up machine learning techniques, with a focus on industry applications. In short, the course objectives are to conceptualize and summarize data analysis and machine learning computing technologies, machine learning techniques, and approaches to scaling up machine learning.

Duration - 22.5 Hrs.

Content

| Day | Topic Name | Sub Topics | Duration |
|---|---|---|---|
| 1 | Introduction to Data and Data Analysis Using Python | Introduction to Data; Types of Data in Statistics (Numerical & Categorical); Types of data in the real world; Python Introduction | 2.5 hrs. |
| 2 | Data Manipulation with NumPy | Introduction; NumPy Arrays; NumPy Basics; Math; Random; Indexing | 2.5 hrs. |
| 3 | Introduction to Pandas and Pandas Series | Filtering; Statistics; Aggregation; Saving Data; Introduction; Series | 2.5 hrs. |
| 4 | Data Analysis with pandas | DataFrame; Combining; Indexing; File I/O; Grouping; Features; Filtering; Sorting; Statistics; Plotting | 2.5 hrs. |
| 5 | Data Preprocessing with Scikit-Learn | Introduction; Standardizing Data; Data Range; Robust Scaling; Normalizing Data; Data Imputation | 2.5 hrs. |
| 6 | Cleaning Data in Python | Working with Duplicates and Missing Values; Deciding which values should be replaced with missing values, based on the data; Identifying and Eliminating Outliers; Dropping duplicate data; Filling missing data; Applying the techniques to a raw dataset, and an introduction to Kaggle and other data sources | 2.5 hrs. |
| 7 | Introduction to Data Visualization and Matplotlib | Introduction to Visualization and Python packages; Matplotlib history; Introduction to plotting; Line Plot; Scatter Plot; Bar Graph; Histogram; Pie Chart; Box Plot; Tasks | 2.5 hrs. |
| 8 | Data Visualization using Seaborn | Using Seaborn Styles; Setting the default style; Color Palettes; Creating Custom Palettes; stripplot() and swarmplot(); Boxplots, violinplots; Barplots, pointplots and countplots | 2.5 hrs. |
| 9 | Data Visualization using Seaborn | Barplots, pointplots and countplots; Regression Plots; Pair Plots; Creating heatmaps; Overview of the course | 2.5 hrs. |
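
As a rough illustration of the kind of preprocessing covered on Day 5 (Data Imputation, Data Range, Standardizing Data), here is a minimal sketch written with only the Python standard library rather than Scikit-Learn. The function names and the sample `ages` column are hypothetical and are not taken from the course materials.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def min_max_scale(values, low=0.0, high=1.0):
    """Linearly map values onto the range [low, high].

    Assumes the values are not all identical (otherwise the span is zero).
    """
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    return [low + (v - v_min) / span * (high - low) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit variance (z-scores).

    Uses the population standard deviation; assumes it is non-zero.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical column with one missing entry
ages = [20, 30, None, 40, 50]

filled = impute_median(ages)     # the missing entry becomes the median
scaled = min_max_scale(filled)   # smallest value maps to 0, largest to 1
z_scores = standardize(filled)   # centered on 0 with unit variance
```

In the course itself these steps are done with Scikit-Learn, which provides equivalent transformers such as `SimpleImputer`, `MinMaxScaler`, and `StandardScaler`.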

Entry Requirements (Pre-requisites)

Students must have knowledge of Python programming and statistics.

Hardware Requirements

  • A laptop or desktop with an i3 or better processor is required
  • 4 GB or more of RAM is recommended
  • Good internet connectivity
  • OS: Windows 10 is preferred