Machine Learning Bootcamp

Empower your team with state-of-the-art skills to discover hidden patterns in your data

  • An innovative curriculum provides your team with state-of-the-art machine learning skills actually used in practice

  • Build hands-on machine learning skills via an onsite classroom or live virtual experience

  • Jumpstart your team's advanced analytics journey using Python - no previous Python experience required

Looking for individual live training? Check out the ML bootcamp at TDWI Orlando.

Answering the call for advanced analytics

Is machine learning shaping the future of your organization?

While data has always been used in business, things have changed. Functions such as HR, Product Management, and Customer Service are embracing advanced analytics to drive better business outcomes.

Do you want your team to be a part of this data-driven future?

It’s hard to avoid the social media posts, magazine articles, and news clips trumpeting how machine learning is permanently changing the way organizations operate, and what is expected of them.

Machine learning for ANY team - regardless of role/background

Imagine a team of Product Managers that could answer the following question with data, "What feature usage(s) are highly predictive of a sticky customer?" How much value would they bring to their organization?

Training from TDWI’s top-rated instructor

Machine Learning Bootcamp

The Machine Learning Bootcamp empowers your team with skills like random forests and k-means clustering to discover new insights.

In partnership with TDWI, Dave on Data delivers 3 days of hands-on training using R or Python - choose whichever is best for your team’s needs.

If your team is new to Python, free access to a 4-hour Python quick start online tutorial will be provided before the bootcamp.

This training focuses on a practical subset of machine learning skills, so your team can hit the ground running and deliver insights ASAP.

The bootcamp is often bundled with additional courses (see below) to increase your team’s capabilities.

The outcome?

Your team will have the knowledge and hands-on skills to use machine learning to find hidden patterns in your data, including crafting predictive models and performing cluster analyses.

  • A well-defined set of skills for real-world machine learning insights

  • Your team will build real-world skills via 11 hands-on labs

  • Courses offered in R or Python - choose what works best for your team

  • Certified by TDWI – the globally recognized industry leader in data training

  • Delivered by David Langer, a globally recognized data analytics practitioner

  • Bundle additional courses to expand your team’s capabilities

What professionals have to say

Bring the same high-quality training experiences of TDWI national conferences to your team.

Training delivered onsite or live virtual - whichever works best for your team.

“Very good course as an intro to machine learning. I feel that with what I learned today I can put these skills into practice at work.”

— David Green, EMWD

“Fantastic intro ML course that’s presented in an engaging way. The content was easy to understand and the labs were easy to follow along. I’ve left the course wanting to dive deeper into the topic.”

— Jessica Liu, O-I Glass

“MIND BLOWN…not by the difficulty of the class, but by how EASY Dave makes machine learning within the reach of aspiring Data Scientists.

Easily the highlight of this year’s conference for me. I feel empowered to bring this material back to the job, put it to use, and teach it to others.”

— Chet Phelps, Health Solutions

“Best training and instructor I’ve had. Organized, clear, good pace, helpful examples, and an engaging and fun instructor.”

— Alex Kurtz, Sourceability

“I am so glad to have started the conference in Dave’s class. He set a wonderful tone for what is yet to come. I hope my other courses measure up!”

— Christina Mitchell, Naphcare

“Great class! Engaging instructor. Wish I would have had more time this week to attend his other sessions.”

— Matthew Royalt, Southern Star Central Gas Pipeline

Machine Learning Bootcamp Outline

Bootcamp can be taught with R or Python.

The following is the 3-day curriculum. The curriculum can be expanded by bundling additional courses (see below).

Teams new to Python will be provided free access to a 4-hour Python quick start online tutorial before the bootcamp.

Introduction to Machine Learning - Days 1 & 2

  • 01 - Attendee Introductions

    02 - Course Expectations

  • 01 - Data Analyst, Teacher

    02 - Why Decision Trees?

  • 01 - Course Datasets

    02 - Exploratory Data Analysis (EDA)

    03 - Data Profiling

    04 - Data Visualization

  • Data Profiling & Data Visualization

  • 01 - Classification Tree Intuition

    02 - Overfitting Intuition

    03 - Gini Impurity

    04 - Gini Change

    05 - Many Categories Impurity

    06 - Numeric Feature Impurity

  • Decision Trees

  • 01 - Under/Overfitting

    02 - The Bias-Variance Tradeoff

    03 - Supervising the Data

    04 - Model Tuning

    05 - Classification Tree Pruning

    06 - Measuring Awesomeness

  • Tuning Classification Trees

  • 01 - Feature Engineering Intuition

    02 - Data Leakage

    03 - Decision Tree Feature Engineering

    04 - Missing Data

  • Feature Engineering

  • 01 - Regression Tree Basics

    02 - Numeric Feature SSE

    03 - Many Categories SSE

  • Regression Trees

  • 01 - The Problem with Decision Trees

    02 - Ensembles

    03 - Bagging

    04 - Feature Randomization

    05 - Tuning Random Forests

    06 - Feature Importance

  • Random Forests
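
As a rough illustration of the Gini Impurity and Gini Change topics listed above, the following is a minimal Python sketch. The function names and the toy "churn"/"stay" labels are assumptions made for illustration only; they are not taken from the course materials.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def gini_change(parent, left, right):
    """Drop in impurity when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Hypothetical labels: two customers churned, two stayed.
parent = ["churn", "churn", "stay", "stay"]

# A split that separates the classes perfectly.
left, right = ["churn", "churn"], ["stay", "stay"]

parent_impurity = gini_impurity(parent)
split_gain = gini_change(parent, left, right)
```

A classification tree would typically compare this kind of impurity drop across candidate splits and keep the largest one. A split that leaves both children as mixed as the parent yields a change of zero.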

Cluster Analysis - Day 3

  • 01 - Course Expectations

    02 - What is Cluster Analysis?

    03 - Cluster Analysis Use Cases

    04 - The Challenge of Clustering Data

  • 01 - The Iris Dataset

    02 - The Hand-Written Digits Dataset

    03 - The Heart Dataset

  • 01 - Hierarchical, Partitional, and Overlapping

    02 - Prototype Clusters

    03 - Density-Based Clusters

  • 01 - Introducing K-Means

    02 - The K-Means Algorithm

    03 - Euclidean Distance

    04 - The Problem with Outliers

    05 - Data Standardization

    06 - K-Means Caveats

  • K-Means Clustering

  • 01 - Evaluating Clusters

    02 - Cluster Cohesion

    03 - Evaluating Cohesion with the Elbow Method

    04 - The Silhouette Coefficient

    05 - Evaluating Clusters using the Silhouette Score

  • Optimizing K-Means

  • 01 - Introducing DBSCAN

    02 - The DBSCAN Algorithm

    03 - DBSCAN Caveats

  • 01 - Considerations for Optimizing DBSCAN

    02 - Calculating min_samples

    03 - Choosing the eps Value

    04 - Introducing Nearest Neighbors

    05 - Evaluating eps with the Elbow Method

    06 - DBSCAN vs K-Means

  • Optimizing DBSCAN

  • 01 - Introducing Dimensionality Reduction

    02 - Principal Component Analysis (PCA)

    03 - PCA Concepts

  • Dimensionality Reduction

  • 01 - The Problem with Categories

    02 - Encoding Categorical Data

    03 - Factor Analysis of Mixed Data (FAMD)

  • 01 - Supervised Learning Resources

    02 - Cluster Analysis Resources

  • Categorical Data
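
To give a flavor of why the outline above pairs Euclidean distance with data standardization, here is a small Python sketch using only the standard library. The customer values, column meanings, and function names are hypothetical and chosen for illustration; the bootcamp labs may approach this differently.

```python
import math
from statistics import mean, pstdev


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def standardize(rows):
    """Z-score each column: (value - column mean) / column standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 when a column has no spread, to avoid dividing by zero.
    spreads = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, spreads)]
        for row in rows
    ]


# Hypothetical customers: (annual income in dollars, age in years).
rows = [[30000, 25], [32000, 60], [90000, 27]]

# On the raw data, income dominates the distance because of its larger scale.
raw_01 = euclidean(rows[0], rows[1])  # similar income, very different age
raw_02 = euclidean(rows[0], rows[2])  # very different income, similar age

# After standardization, both features contribute on a comparable scale.
scaled = standardize(rows)
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])
```

On the raw values the 35-year age gap barely registers next to the income gap, so a distance-based method such as k-means would effectively cluster on income alone. After standardization the two distances are of similar size.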

Course Add-Ons

Expand your team’s capabilities by bundling additional courses into your bootcamp.

All courses can be taught in R or Python.

  • Visual Data Analysis

    This 1-day hands-on course teaches how to use data visualizations the way Data Analysts/Scientists do - to get to the “why” of what’s happening. This course focuses on topics useful to any team, including Distribution Analysis, Correlation Analysis, Multivariate Analysis, and Time Series Analysis.

  • Data Wrangling for Machine Learning

    This 1-day hands-on course focuses on techniques for producing the best quality data for use in crafting valuable machine learning models. Topics include data profiling, wrangling string data, and engineering date-time features. This course expands upon the topics covered in the Introduction to Machine Learning course.

  • Text Analytics

    This 1-day hands-on course is an introduction to the tools and techniques for transforming text data into a form suitable for analytics. Examples include clustering documents and sentiment analysis. Topics include tokenization, stemming, lemmatization, TF-IDF, and cosine similarity.

FAQs

  • Yes. Certificates can be issued by TDWI. Contact us for more details.

  • Yes! The bootcamp can be delivered virtually or onsite with your team.

  • While the courses do include mathematics, it is at a level accessible to a broad audience. For example, no knowledge of calculus or statistics is required.

  • The Python version of the bootcamp comes with free access to a 4-hour Python quick start online tutorial. No prior experience with Python is required.

    The R version of the bootcamp assumes knowledge of R programming (e.g., using the tidyverse). A 1-day R programming course is available to provide the required knowledge. Contact us for details.

  • At this time, these courses are offered as live training experiences only.