Course Code:
dsmlpython
Duration:
28 hours
Prerequisites:
- Basic programming experience in any language (Python preferred)
Audience:
- Anyone who wants to learn Data Science
Course Outline:
Python
- Intro to Python
- Jupyter Notebooks
- NumPy
- Pandas
- Matplotlib, Seaborn, Plotly, Visdom
Discovery?
- Data Preparation
- Model Planning
- Model Building
- Operationalization
Machine Learning
- Inferential and Descriptive Statistics
- Regression
- Classification
- Using scikit-learn library
- Supervised and Unsupervised Learning Algorithms
  - Naive Bayes
  - K-Means
  - Logistic Regression
  - Support Vector Machines
  - Neural Networks
  - Decision Trees
  - Random Forest
- Ensemble methods
- Build, Train and Deploy models
- Inference
KNIME
- Installation
- Starting and customizing KNIME Analytics Platform
- Nodes, Data and Workflows
- The Data Science Cycle
- Hands-on Examples
  - Disease tagging
  - Risk Information Extraction
Introduction to AWS and Hadoop
Examples:
The course includes examples for every machine learning model covered, plus practice questions on NumPy and pandas.
- Titanic Survival Exploration - Covers NumPy, pandas, Matplotlib, scikit-learn
- Spam Email Classifier - Naive Bayes Algorithm
- Bike Share Analysis - ML-based project
- K-Means Clustering Project
There will be two case studies, one on day 2 and one on day 3.
Case studies:
- Drug property prediction using machine learning
- One study specific to what the client wants