Data Science
Data science is in high demand today because huge amounts of data are generated every day in fields such as BFSI (banking, financial services and insurance), healthcare and telecom. This training builds a conceptual understanding of statistics, machine learning and deep learning using the Python and R programming languages.
Introduction to Data Science
- What is Data Science?
- Data science lifecycle
- Use Cases/applications/examples
- DS tools and technology
Python Programming
- Installation
- Python 2.7 vs 3.4
- Python programming fundamentals
- Data types and structures, variables, control flow, and functions
- Python libraries
- NumPy, pandas, scikit-learn, Matplotlib
R Programming
- Introduction to R
- Vectors
- Matrices
- Factors
- Data Frames
- Lists
Data Extraction, Wrangling and Exploration
- Data Analysis Pipeline
- What is Data Extraction?
- Types of Data
- Raw and Processed Data
- Data Wrangling
- Exploratory Data Analysis (EDA)
- Data Structures in Pandas - Series and Data Frames
Probability
- Basic Probability
- Conditional Probability
- Properties of Random Variables
- Expectations
- Variance
- Entropy and cross-entropy
- Covariance and correlation
- Estimating probabilities of random variables
- Understanding standard random processes
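As an illustration of the entropy and cross-entropy topic listed above, here is a minimal sketch using only the Python standard library. The function names and the two example distributions (`fair`, `biased`) are assumptions chosen for illustration; they are not taken from the course material.

```python
import math

def entropy(p):
    """Shannon entropy, in bits, of a discrete distribution p."""
    # Terms with zero probability contribute nothing, so skip them.
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy, in bits, of q measured against the true distribution p."""
    # Assumes q assigns non-zero probability wherever p does.
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical distributions: a fair coin and a biased model of it.
fair = [0.5, 0.5]
biased = [0.9, 0.1]

h_fair = entropy(fair)              # entropy of the fair coin
ce = cross_entropy(fair, biased)    # cost of modelling the fair coin as biased

print(h_fair, ce)
```

Cross-entropy is never smaller than the entropy of the true distribution, and the two are equal when the model matches it exactly.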
Inferential Statistics
- Estimating parameters of a population using sample statistics
- Hypothesis testing and confidence intervals
- T-tests and ANOVA
- Correlation and regression
- Chi-squared test
Descriptive Statistics
- Compute and interpret values such as mean, median, mode, and sample and population standard deviation.
- Compute simple probabilities.
- Explore data through the use of bar graphs, histograms and other common visualizations.
- Investigate distributions and understand a distribution's properties.
- Manipulate distributions to make probabilistic predictions on data.
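The summary values named in this module can be computed with Python's built-in `statistics` module. The sketch below is only an illustration: the `scores` data set is hypothetical, and the course may well use NumPy or pandas for the same task.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)        # arithmetic mean
median = statistics.median(scores)    # middle value of the sorted data
mode = statistics.mode(scores)        # most frequent value
pop_sd = statistics.pstdev(scores)    # population standard deviation (divides by n)
sample_sd = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)

print(mean, median, mode, pop_sd, sample_sd)
```

The sample standard deviation is slightly larger than the population value because it divides by n - 1 rather than n.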
Data Visualization
- Bar Graph, Histogram, Pie Chart, Line Chart, Box-and-Whisker Plot, Scatter Plot, Heat Map
Basic Machine Learning Algorithms
- Linear Regression
- Logistic Regression
- Decision Trees
- KNN (K-Nearest Neighbours)
- K-Means Clustering
- Naïve Bayes
- Dimensionality Reduction
Advanced Algorithms
- Random Forests
- Dimensionality Reduction Techniques
- Support Vector Machines
- Gradient Boosting
Introduction to Deep Learning
- TensorFlow
- Neural Networks
- Biological Neural Networks
- Understand Artificial Neural Networks
- Building an Artificial Neural Network
- How ANN works
- Image recognition
- Image classification
Sentiment Analysis
Text Mining
Natural Language Processing (NLP)
Time Series
- What is Time Series data?
- Time Series variables
- Different components of Time Series data
- Visualize the data to identify Time Series Components
- Implement an ARIMA model for forecasting
- Exponential smoothing models
- Identifying the time series scenarios to which each exponential smoothing model applies
- Implement the respective ETS model for forecasting
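To give a flavour of the exponential smoothing topic, here is a minimal sketch of simple exponential smoothing, the most basic member of that family, suited to series with no trend or seasonality. The function name, the `demand` series and the choice `alpha=0.5` are assumptions for illustration. A full ETS model would normally be fitted with a statistics library rather than by hand.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    Each level is a weighted average of the newest observation and the
    previous level; alpha controls how quickly old data is forgotten.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    smoothed = [series[0]]  # initialise the level with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical demand series, chosen only for illustration.
demand = [10, 12, 14, 16]
levels = simple_exponential_smoothing(demand, alpha=0.5)

# The one-step-ahead forecast is the last smoothed level.
forecast = levels[-1]
print(levels, forecast)
```

Note that the forecast lags behind this steadily rising series, which is why trended data calls for a model with an explicit trend component.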