APSSDC-LOGO

Data Science Using Python

Objectives:

  • To introduce students to the basic concepts and techniques of Data Science and Machine Learning.
  • To develop skills in using recent machine learning software to solve practical problems.
  • To gain experience in doing independent study and research.

Course Content

| Chapter No. | Topic Name | Sub Topics |
|---|---|---|
| 1 | Overview, content and Introduction to Data Science | Introduction to Data Science; What is Data Science; Programming Languages used for Data Science; Applications of Data Science |
| 2 | Introduction to Python | Python Introduction; Literate Programming; Jupyter Notebook Environment; Markdown format for documentation; Python basics; Input and output statements in Python |
| 3 | Introduction to Version Control System | Purpose of Version Control System; Types of VCS; Introduction to Git and History; Git Terminology; Git Bash Installation and Unix bash commands |
| 4 | Git Basics | Initializing repositories; Accessing Existing Repositories; Adding/Removing files from Staging area; Committing the changes to repository; Undoing the commits that are made |
| 5 | Remote Repository | Introduction to GitHub; Creating an Account on GitHub; Creating a remote repository; Adding the remotes; Push, pull and fetch commands |
| 6 | GitHub Pages | Creation of personal portfolio site; Creating a GitHub Page using Markdown and Jekyll themes for repositories |
| 7 | Identifiers and Operators in Python | Identifiers in Python; Properties for Declaring Identifiers; Type Conversions; Operators in Python; Examples |
| 8 | Data Types in Python | Numbers: int, float, complex; bool, None; Strings; Accessing the characters from strings; String Methods |
| 9 | Conditional Statements & Loops in Python | Conditional Statements; For and While Loops; Break and continue keywords |
| 10 | Data Structures in Python | Lists; List Methods; Tuples; Tuple Methods; Dictionaries; Dictionary Methods |
| 11 | Functions | Different types of Functions: Built-in Functions, User Defined; Lambda function; Call by Value; Call by Reference |
| 12 | File Handling in Python | Opening and Closing Files in Python; Writing to Files in Python; Reading Files in Python; File Methods |
| 13 | Modules and Packages in Python | Types of Modules and Packages: Built-in Packages and Modules (math, random, os, sys modules), User Defined; Examples |
| 14 | Comprehensions and Functional Programming | List, Dictionary & Set Comprehensions; map(), filter(), reduce() |
| 15 | Object-Oriented Programming - 1 | Object orientation: Classes, Objects, Methods, Encapsulation; Inheritance: Single, Multiple, Multilevel, Hierarchical, Hybrid |
| 16 | Object-Oriented Programming - 2 | Polymorphism; Method Overriding; Variable Overriding |
| 17 | Introduction to Data Analysis | Introduction to Data; Types of Data in Statistics (Numerical & Categorical); Overview of Python Concepts |
| 18 | Data Manipulation with NumPy | Introduction; NumPy Arrays; NumPy Basics; Math; Random; Indexing; Filtering; Statistics; Aggregation; Saving Data |
| 19 | Data Analysis with pandas | Introduction; Series; DataFrame; Combining; Indexing; File I/O; Grouping; Features; Filtering; Sorting; Statistics; Plotting |
| 20 | Data Cleaning with pandas | Working with Duplicates and Missing Values; Which values should be replaced with missing values, based on the data; Identifying and Eliminating Outliers; Applying these techniques to a raw dataset, and introduction to Kaggle and other data sources |
| 21 | Data Preprocessing with Scikit-Learn | Introduction; Standardizing Data; Data Range; Robust Scaling; Normalizing Data; Data Imputation |
| 22 | Introduction to Data Visualization and Matplotlib | Introduction to Visualization and Python packages; Matplotlib history; Introduction to plotting; Line Plot; Scatter Plot; Bar Graph; Histogram; Pie Chart; Box Plot; Tasks |
| 23 | Data Visualization using Seaborn | Using Seaborn Styles; Setting the default style; Color Palettes; Creating Custom Palettes; stripplot() and swarmplot(); boxplots, violinplots and lvplots; barplots, pointplots and countplots |
| 24 | Data Visualization using Seaborn | Using Seaborn Styles; Setting the default style; Color Palettes; Regression Plots; Binning data; Pairplots; Creating heatmaps |
| 25 | Introduction to Machine Learning | What is Machine Learning; Machine Learning Classification; Types of Algorithms |
| 26 | Regression Models | Linear Regression with One Variable; Evaluation Metrics in Regression Models; Train/Test splitting of data & Cross Validation; Linear Regression with Multiple Variables; Polynomial Features; Non-Linear Regression with One Variable; Non-Linear Regression with Multiple Variables |
| 27 | Regularization Models | Underfitting; Overfitting; Best fit; Applying Ridge Regression; Lasso Regression Algorithms |
| 28 | Classification models - 1 | Introduction to categorical types of data; Types of classification; K-Nearest Neighbors Classifier; Evaluation Metrics for Classification Models; Logistic Regression; Support Vector Machines |
| 29 | Unsupervised Machine Learning | Introduction to Unsupervised Learning; Types of Unsupervised Learning |
| 30 | Clustering | Introduction to clustering; Types of Clustering Methods; K-Means Clustering; Hierarchical Clustering; Applications |
| 31 | Dimensionality Reduction | Dimensionality Reduction: Principal Component Analysis (PCA) |

Hardware Requirements:

  • An i3 or better processor is required
  • 4 GB or more of RAM is recommended
  • Good Internet connectivity
  • OS: Windows 10 is preferable

Duration:

45 Days (2 hours each day)

Entry Requirements:

  • Students must have basic computer knowledge.
  • Students must have knowledge of Statistics, Algebra, and Probability.