Data Science for Business Part 2
Welcome to Data Science For Business Part 2, Predicting Employee Turnover with H2O & LIME!
Course Overview: What You're Getting! (2:30)
Course Certificate - Instructions
Join Our Private Slack Channel
Video Subtitles (Captions)
Your Instructor: Meet Matt! (1:32)
Getting The Most Out Of This Course
BONUS: Market Basket Analysis & Product Recommender Algorithm
Would You Like To Become An Affiliate (And Earn 20% On Your Sales)?
Prerequisites
Prerequisite Courses
Test Your Baseline: Is This Course Right For You?
Module 0: Getting Started
🔽 Overview [File Download]
0.1 READ THIS FIRST
READ THIS FIRST!
0.2 The True Cost Of Employee Attrition
Employee Turnover: A $15M Per Year Problem
What Happens When Good Employees Leave? (6:25)
Calculating The Cost Of Turnover (8:57)
Excel Calculator (3:29)
0.3 What Tools Are In Our Toolbox?
Tools In Our Toolbox
Integrated Data Science Frameworks: BSPF & CRISP-DM (2:12)
Modeling: H2O And LIME For Binary Classification (2:17)
0.4 Frameworks
CRISP-DM (8:42)
Business Science Problem Framework (13:42)
0.5 Data Science Project Setup
Setting Up Your Data Science Project
R Project Setup (3:54)
Project Directory Structure (9:36)
Install Required Packages (5:53)
Collect Our Data Files (5:28)
Module 1: Business Understanding: BSPF & Code Workflows
🔽 Module Overview [File Download]
Getting Code Help
1.1 Problem Understanding With BSPF
Business Understanding (1:31)
Library & Data Setup (4:42)
View The Business As A Machine (5:03)
Understand The Drivers, Part 1: By Dept (5:17)
Understand The Drivers, Part 2: By Job Role (5:55)
Measure The Drivers, Part 1: Collect Data (5:57)
Measure The Drivers, Part 2: Develop KPIs (5:46)
Uncover Problems & Opportunities, Part 1: calculate_attrition_cost() (6:55)
Uncover Problems & Opportunities, Part 2: Calculating Cost By Job Level (4:23)
Knowledge Check
Aside: Intro To Tidy Eval
Tidy Eval Primer
1.2 Streamlining The Attrition Code Workflow
Attrition Code Workflow (1:55)
Streamlining The Counts (2:06)
Streamlining The Count To Percentage Calculation (8:01)
Streamlining The Attrition Assessment (7:54)
Attrition Workflow Recap (1:33)
Knowledge Check
1.3 Visualizing Attrition With ggplot2
Visualizing Attrition Cost (0:47)
Data Manipulation For Visualization, Part 1 (2:38)
Data Manipulation For Visualization, Part 2 (5:41)
Visualization With ggplot2 (10:15)
Knowledge Check
1.4 Making A Custom Plotting Function: plot_attrition()
Making A Custom Plotting Function (4:57)
Developing plot_attrition(), Part 1: Function Setup (3:37)
Developing plot_attrition(), Part 2: Handling The Inputs (10:15)
Developing plot_attrition(), Part 3: Data Manipulation (9:38)
Developing plot_attrition(), Part 4: Visualization (8:43)
1.5 Challenge #1: Cost Of Attrition
Challenge #1: Updating The Organization's Cost Of Attrition (1:18)
Knowledge Check
Solution (14:09)
1.6 Module Code Checkpoint
🔽 Module 1 Business Understanding Code
Module 2, Data Understanding: By Data Type & Feature-Target Interactions
🔽 Module 2 Overview [File Download]
2.1 Setting Up For Data Understanding
Data Understanding (0:50)
Setting Up (4:10)
Reviewing The Data (5:42)
2.2 EDA Part 1: Exploring Data By Data Type
EDA Part 1: Data Summarization (skimr) (4:38)
Exploring Character Data (8:30)
Exploring Numeric Data (4:57)
Knowledge Check
2.3 EDA Part 2: Visualizing The Feature-Target Interactions
EDA Part 2: Feature Visualization (GGally) (3:49)
Using & Customizing ggpairs() (4:43)
Custom Function: plot_ggpairs() (6:27)
Visual Feature Exploration (7:46)
2.4 Challenge #2: Assessing Feature Pairs
Challenge #2: Exploratory Data Analysis (0:29)
Knowledge Check
2.5 Module 2 Code Checkpoint
🔽 Module 2 Data Understanding Code
Course Survey #1: Your Feedback Is Important!
Quick Course Survey
Module 3, Data Preparation: Getting Data Ready For People & Machines
🔽 Module 3 Overview [File Download]
UPDATES: Fixes for new features
3.1 Data Preparation Setup
Data Preparation (0:57)
Setup For Data Preparation (1:15)
3.2 Data Preparation For People (Humans)
Processing Pipeline (For People Readability) (3:24)
Human Readable Script Setup (2:57)
FIX 1 (Human Readable): X__1 naming scheme has changed to `...1`
Merging Data Part 1: Tidying The Data (8:09)
Merging Data Part 2: Mapping Over Lists (6:35)
Merging Data Part 3: Iterative Merge With Reduce (7:42)
Factoring The Character Data (8:38)
Making The Processing Pipeline (9:18)
Knowledge Check
3.3 Data Preparation For Machines With Recipes!
Data Preparation With Recipes (0:57)
Machine Readable Script Setup (2:29)
Custom Function: plot_hist_facet(), Part 1 (2:57)
Custom Function: plot_hist_facet(), Part 2 (5:48)
recipes: Preprocessing Data For Machines (7:51)
Data Preprocessing Plan (6:37)
recipes: Zero Variance Features (6:13)
FIX 2 (Machine Readable): step_num2factor() no longer supports multiple column names. Use step_mutate_at() instead.
recipes: Transformations (13:47)
FIX 3 (Machine Readable): bake(newdata) has changed to bake(new_data)
recipes: center & scale (10:02)
recipes: dummy variables (7:33)
recipes: Baking The Train & Test Data (4:55)
Knowledge Check
3.4 Correlation Analysis
Pre-Modelling Correlation Analysis (1:10)
Correlation Analysis, Step 1: get_cor() (3:47)
Custom Function: Creating get_cor() (12:43)
Correlation Analysis, Step 2: plot_cor() (6:50)
Custom Function: Creating plot_cor() (15:09)
Reading The Correlation Analysis Plot (5:26)
Correlation Analysis Recap (2:33)
3.5 Challenge #3: Correlation Analysis
Challenge #3: Correlation Analysis (0:41)
Knowledge Check
3.6 Module 3 Code Checkpoint
🔽 Module 3 Data Preparation Code
Module 4, Modeling Churn: Using Automated Machine Learning With H2O
🔽 Module 4 Overview [File Download]
UPDATES: Fixes for new features
4.1 Modeling Setup
Modeling With H2O AutoML (0:56)
Modeling Directory Setup (2:42)
H2O Script Setup, Part 1: Libraries, Data, & Preprocessing Pipeline (3:33)
H2O Script Setup, Part 2: Recipes (6:45)
4.2 H2O Automated Machine Learning
H2O Documentation (5:22)
H2O Modeling, Part 1 (9:55)
H2O Modeling, Part 2 (5:32)
Inspecting The Leaderboard (5:51)
Extracting Models From The Leaderboard (2:51)
Custom Function: extract_h2o_model_by_position() (6:38)
Saving & Loading H2O Models (4:54)
Making Predictions (5:48)
Knowledge Check
4.3 Advanced Concepts
Train, Validation, & Leaderboard Frames (3:07)
H2O AutoML Model Parameters (4:39)
Cross Validation (K-Fold CV) (4:16)
Grid Search (Hyperparameter Search) (1:53)
Knowledge Check
4.4 Visualizing The Leaderboard
Leaderboard Visualization (3:42)
ggplot2 Data Transformation (6:13)
ggplot2 Visualization (4:19)
Custom Function: plot_h2o_leaderboard() (17:08)
4.5 Bonus! Grid Search In H2O
H2O Grid Search With h2o.grid(), Part 1 (11:52)
H2O Grid Search With h2o.grid(), Part 2 (11:11)
Bonus Lecture Code
4.6 Code Checkpoint
🔽 Module 4 H2O Modeling Code Checkpoint
Module 5, Modeling Churn: Assessing H2O Performance
🔽 Module 5 Overview [File Download]
5.1 Performance Overview & Setup
Module Overview (1:20)
Module Setup (1:35)
5.2 H2O Performance For Binary Classification
H2O Performance: h2o.performance() (7:39)
H2O Summary Metrics: h2o.auc(), h2o.giniCoef(), h2o.logloss() (6:15)
H2O Metrics: h2o.metric() (4:11)
Precision, Recall, F1 & Effect Of Threshold (11:11)
5.3 Performance Charts For Data Scientists
Performance Of Multiple Models: fs + purrr (11:19)
ROC Plot (9:37)
Precision vs Recall Plot (4:40)
5.4 Performance Charts For Business People
Gain & Lift 101 (5:02)
Gain & Lift Calculations, Part 1 (6:53)
Gain & Lift Calculations, Part 2 (8:14)
H2O Gain & Lift: h2o.gainsLift() (6:02)
Gain Plot (7:22)
Lift Plot (7:24)
5.5 Ultimate Model Performance Comparison Dashboard
Model Diagnostic Dashboard: plot_h2o_performance() (4:01)
plot_h2o_performance(): Overview & Inputs (7:33)
plot_h2o_performance(): Model Metrics (12:35)
plot_h2o_performance(): Gain & Lift (8:08)
plot_h2o_performance(): Combining Plots With cowplot (8:19)
5.6 Modules 4 & 5 Code Checkpoint
🔽 Modules 4 & 5 H2O Performance Code
Module 6, Modeling Churn: Explaining Black-Box Models With LIME
🔽 Module 6 Overview [File Download]
UPDATES: Fixes for new features
6.1 Module 6 Overview & Setup
Module Overview (1:27)
Module Setup (2:32)
H2O Model Setup (2:51)
6.2 Feature Explanation With LIME
LIME Documentation & Resources (5:31)
Investigating Predictions & The Case For LIME (5:25)
LIME For Single Explanation, Part 1: Making An Explainer With lime() (7:13)
LIME For Single Explanation, Part 2: Making An Explanation With explain() (10:13)
Visualizing Feature Importance For A Single Explanation: plot_features() (6:31)
Visualizing Feature Importance For Multiple Explanations: plot_explanations() (11:07)
Knowledge Check
6.3 Challenge #4: Recreating plot_features() & plot_explanations()
Challenge #4: Recreating plot_features() & plot_explanations() (2:04)
Solution Part 1: plot_features_tq() (15:26)
Solution Part 2: plot_explanations_tq() (19:22)
6.4 Module 6 Code Checkpoint
🔽 Module 6 LIME Code Checkpoint
Module 7, Evaluation: Calculating The Expected ROI (Savings) Of A Policy Change
🔽 Module 7 Overview [File Download]
UPDATES: Code Checkpoint Revisions
7.1 Overview & Setup
BSPF Update (0:54)
Expected Value Framework (18:16)
Module Setup (2:18)
7.2 Calculating Expected ROI: No Overtime Policy
Policy Change: No Overtime For Anyone (0:39)
Setup: No OT Policy (3:31)
Expected Cost Of Baseline (With OT): Part 1 (5:56)
Expected Cost Of Baseline (With OT): Part 2 (9:51)
Expected Cost Of New State (Without OT): Part 1 (6:50)
Expected Cost Of New State (Without OT): Part 2 (8:41)
Expected Savings: No OT Policy (3:29)
Save Point: No OT Policy (0:57)
7.3 Targeting By Threshold Primer
Policy Change: Targeted Overtime Reduction (1:02)
Setup: Targeted Overtime Policy (2:36)
Threshold Primer, Part 1: Confusion Matrix (4:00)
Threshold Primer, Part 2: Expected Rates (7:00)
Threshold Primer, Part 3: Visualizing Rates (6:50)
Threshold Primer, Part 4: Explaining Expected Rates (3:17)
7.4 Calculating Expected ROI: Targeted Overtime Policy
Expected Cost Of Baseline (With OT) (4:06)
Expected Cost Of New State (Targeted OT): Part 1 (11:12)
Expected Cost Of New State (Targeted OT): Part 2 (4:03)
Expected Cost Of New State (Targeted OT): Part 3 (8:36)
Expected Cost Of New State (Targeted OT), Part 4 (7:39)
Expected Savings: Targeted OT Policy (3:05)
7.5 Module 7 Code Checkpoint
🔽 Module 7 Expected Value Of A Policy Change Code
Module 8: Evaluation, Maximizing ROI (Savings) With Threshold Optimization & Sensitivity Analysis
🔽 Module 8 Overview [File Download]
8.1 Setup
Module Setup (1:51)
8.2 Threshold Optimization: Maximizing Expected ROI
Optimizing By Threshold Overview (1:06)
calculate_savings_by_threshold(), Part 1 (3:21)
calculate_savings_by_threshold(), Part 2 (5:09)
calculate_savings_by_threshold(), Part 3 (9:09)
Testing calculate_savings_by_threshold() (5:40)
Threshold Optimization With purrr (11:19)
8.3 Threshold Optimization: Visualizing The Expected Savings At Various Thresholds
Visualizing Maximized Savings With ggplot2: Part 1 (7:56)
Visualizing Maximized Savings With ggplot2: Part 2 (4:35)
Visualizing Maximized Savings With ggplot2: Part 3 (7:13)
Visualizing Maximized Savings With ggplot2: Part 4 (6:03)
IMPORTANT: Explaining The Optimization Results (9:19)
8.4 Sensitivity Analysis: Adjusting Parameters To Test Assumptions
Sensitivity Analysis Overview (1:48)
calculate_savings_by_threshold_2(), Part 1 (5:34)
calculate_savings_by_threshold_2(), Part 2 (5:46)
calculate_savings_by_threshold_2(), Part 3 (7:22)
Sensitivity Analysis, Part 1: Preloading Functions With partial() (9:07)
Sensitivity Analysis, Part 2: Parameter Combinations With cross_df() (5:09)
Sensitivity Analysis, Part 3: Iterating With pmap() (4:37)
8.5 Sensitivity Analysis: Visualizing The Effect Of Scenarios & Breakeven
Visualizing The Sensitivity Analysis With ggplot2: Part 1 (5:28)
Visualizing The Sensitivity Analysis With ggplot2: Part 2 (6:22)
IMPORTANT: Explaining The Sensitivity Analysis Results (5:47)
8.6 Challenge #5: Threshold Optimization For Stock Options
Challenge #5: Threshold Optimization For Stock Options (3:31)
Challenge #5: Solution, Part 1 - With Downloadable Solution Code (11:45)
Challenge #5: Solution - Part 2 (11:26)
Challenge #5: Solution - Part 3 (9:16)
8.7 Challenge #6: Sensitivity Analysis For Stock Options
Challenge #6: Sensitivity Analysis For Stock Options (1:57)
Challenge #6: Solution, Part 1 - With Downloadable Solution Code (7:59)
Challenge #6: Solution, Part 2 (6:36)
8.8 Module 8 Code Checkpoint
🔽 Module 8 Threshold Optimization & Sensitivity Analysis Code
Module 9, Evaluation: Creating A Recommendation Algorithm
🔽 Module 9 Overview [File Download]
9.1 Overview & Setup
Recommendation Algorithm Overview (1:24)
BSPF Update (1:00)
Setup (3:52)
9.2 Recipes For Feature Discretization
Recipes For Discretization Overview (1:52)
Creating A Recipe (Module 3 Recap) (6:09)
Binning With step_discretize() (4:01)
Dummy Variables & One Hot Encoding (2:39)
bake() the Recipe! (2:08)
Retrieving The Binning Strategy With tidy() (2:48)
9.3 Discretized Correlation Visualization
Discretized Correlation Visualization (0:32)
Data Manipulation, Part 1: get_cor() (5:41)
Data Manipulation, Part 2: separate() Groups (7:01)
Visualize Discretized Correlation With ggplot2 (9:53)
Explaining The Discretized Correlation Visualization (1:57)
9.4 Recommendation Strategy Worksheet
Strategy Development Worksheet (2:47)
Filling Out The Strategy Worksheet, Part 1 (8:10)
Filling Out The Strategy Worksheet, Part 2 (7:49)
Filling Out The Strategy Worksheet, Part 3 (7:23)
Filling Out The Strategy Worksheet, Part 4 (5:46)
Filling Out The Strategy Worksheet, Part 5 (5:55)
9.5 Personal Development Recommendations
Recommendation Algorithm Process (1:46)
Setting Up: From Worksheet To Code (3:21)
How To Develop Recommendation Strategies, Part 1: Strategy Search (6:46)
How To Develop Recommendation Strategies, Part 2: Add Features (5:09)
Building The Recommendation Algorithm, Part 1: Code Framework (6:59)
Building The Recommendation Algorithm, Part 2: Create Personal Development Plan (4:32)
Building The Recommendation Algorithm, Part 3: Training And Formation (3:59)
Building The Recommendation Algorithm, Part 4: Mentorship (3:35)
Building The Recommendation Algorithm, Part 5: Leadership (2:11)
Personal Development Strategy: Algorithm Recap (3:29)
9.6 Professional Development Recommendations
Professional Development Strategy Overview (with .R File) (5:55)
Strategy Development (3:20)
Code Framework (4:02)
Strategy Logic, Part 1 (5:39)
Strategy Logic, Part 2 (3:32)
Reviewing Results (2:08)
Challenge #8: Work Environment Recommendations
Challenge: Creating A Work Environment Strategy (1:55)
Solution, Part 1: Developing The Strategy (6:28)
Solution, Part 2: Implementing The Strategy Into Code (9:29)
9.7 Deployable Recommendation Function
Recommendation Function Overview (2:12)
Building The Recommendation Function, Part 1 (3:23)
Building The Recommendation Function, Part 2 (5:51)
Testing Our Recommendation Function (2:54)
9.8 Module 9 Code Checkpoint
🔽 Module 9 Recommendation Algorithm Code & Worksheet
Course Conclusion & Next Steps
CONGRATULATIONS!!! (2:37)
Send-Off Gifts!
BSU Student Loyalty Program - Special Offer!
Appendixes
Appendix 1: Frameworks
Appendix 2: Calculators
Appendix 3: Coding References
Appendix 4: DS4B References
Module Setup