Course Outline

Introduction

  • Artificial neural networks vs decision tree-based algorithms

Overview of XGBoost Features

  • Elements of a Gradient Boosting algorithm
  • Focus on computational speed and model performance
  • XGBoost vs Logistic Regression, Random Forest, and standard Gradient Boosting
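
To make the comparison concrete, here is a minimal benchmarking sketch, assuming a synthetic dataset, accuracy scoring, and xgboost >= 1.6 (none of which are prescribed by the course):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=2000, n_features=20, random_state=7)

    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(n_estimators=200, random_state=7),
        "Gradient Boosting": GradientBoostingClassifier(random_state=7),
        "XGBoost": XGBClassifier(eval_metric="logloss", random_state=7),
    }

    # Five-fold cross-validated accuracy for each model on the same data
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
        print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")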

The Evolution of Tree-Based Algorithms

  • Decision Trees, Bagging, Random Forest, Boosting, Gradient Boosting
  • System optimization
  • Algorithmic enhancements

Preparing the Environment

  • Installing SciPy and scikit-learn
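
A minimal environment check (the pip command assumes the standard PyPI package names; xgboost itself is also needed for the rest of the course):

    # Install from a terminal first:
    #   pip install scipy scikit-learn xgboost
    # Then confirm the imports and versions:
    import scipy
    import sklearn
    import xgboost

    print("SciPy:", scipy.__version__)
    print("scikit-learn:", sklearn.__version__)
    print("XGBoost:", xgboost.__version__)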

Creating an XGBoost Model

  • Downloading a dataset
  • Solving a common classification problem
  • Training the XGBoost model for classification (see the sketch below)
  • Solving a common regression task
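
A minimal sketch of the classification steps, assuming the Pima Indians Diabetes CSV (a common tutorial dataset; the file path here is hypothetical). A regression task follows the same pattern with XGBRegressor:

    import pandas as pd
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    # Hypothetical local file; substitute whatever dataset the course provides
    data = pd.read_csv("pima-indians-diabetes.csv", header=None)
    X, y = data.iloc[:, :-1], data.iloc[:, -1]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=7
    )

    model = XGBClassifier(eval_metric="logloss")  # XGBRegressor for regression
    model.fit(X_train, y_train)

    accuracy = accuracy_score(y_test, model.predict(X_test))
    print("Accuracy: %.1f%%" % (accuracy * 100))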

Monitoring Performance

  • Evaluating and reporting performance
  • Early Stopping
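
A minimal sketch of both steps, assuming xgboost >= 1.6 (where early_stopping_rounds is a constructor argument):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=2000, random_state=7)
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=7
    )

    model = XGBClassifier(
        n_estimators=500,
        eval_metric="logloss",
        early_stopping_rounds=10,  # stop once validation logloss stalls for 10 rounds
    )
    model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)

    print("Best iteration:", model.best_iteration)
    results = model.evals_result()  # per-round metrics for reporting
    print("Final validation logloss:", results["validation_0"]["logloss"][-1])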

Plotting Features by Importance

  • Calculating feature importance
  • Deciding which input variables to keep or discard
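
A minimal sketch using xgboost's built-in plot_importance helper on synthetic data; the resulting scores can feed scikit-learn's SelectFromModel when deciding which variables to keep:

    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from xgboost import XGBClassifier, plot_importance

    X, y = make_classification(n_samples=1000, n_features=10, random_state=7)
    model = XGBClassifier(eval_metric="logloss").fit(X, y)

    print(model.feature_importances_)  # one importance score per input column
    plot_importance(model)             # bar chart ranked by importance
    plt.show()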

Configuring Gradient Boosting

  • Reviewing the learning curves on training and validation datasets (see the sketch below)
  • Adjusting the learning rate
  • Adjusting the number of trees
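
A minimal sketch of plotting the learning curves on synthetic data; lowering learning_rate while raising n_estimators is the usual trade-off to experiment with here:

    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=2000, random_state=7)
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=7
    )

    model = XGBClassifier(n_estimators=200, learning_rate=0.1, eval_metric="logloss")
    # Track the metric on both the training and the validation set
    model.fit(X_train, y_train,
              eval_set=[(X_train, y_train), (X_val, y_val)], verbose=False)

    results = model.evals_result()
    plt.plot(results["validation_0"]["logloss"], label="train")
    plt.plot(results["validation_1"]["logloss"], label="validation")
    plt.xlabel("boosting round")
    plt.ylabel("logloss")
    plt.legend()
    plt.show()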

Hyperparameter Tuning

  • Improving the performance of an XGBoost model
  • Designing a controlled experiment to tune hyperparameters
  • Searching combinations of parameters
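
A minimal sketch of a controlled search with scikit-learn's GridSearchCV; the grid values below are illustrative, not course-prescribed:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV, StratifiedKFold
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=2000, random_state=7)

    param_grid = {
        "max_depth": [3, 5],
        "learning_rate": [0.05, 0.1],
        "subsample": [0.8, 1.0],
    }
    search = GridSearchCV(
        XGBClassifier(n_estimators=200, eval_metric="logloss"),
        param_grid,
        cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=7),
        scoring="neg_log_loss",  # a fixed metric keeps the experiment controlled
    )
    search.fit(X, y)

    print("Best parameters:", search.best_params_)
    print("Best CV score:", search.best_score_)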

Creating a Pipeline

  • Incorporating an XGBoost model into an end-to-end machine learning pipeline
  • Tuning hyperparameters within the pipeline
  • Advanced preprocessing techniques
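
A minimal sketch, with StandardScaler standing in for whatever preprocessing the course covers; the "model__" prefix is how scikit-learn routes search parameters to a named pipeline step:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=2000, random_state=7)

    pipeline = Pipeline([
        ("scale", StandardScaler()),               # placeholder preprocessing step
        ("model", XGBClassifier(eval_metric="logloss")),
    ])

    # Tuning happens inside the pipeline, so preprocessing is refit per CV fold
    param_grid = {
        "model__max_depth": [3, 5],
        "model__learning_rate": [0.05, 0.1],
    }
    search = GridSearchCV(pipeline, param_grid, cv=5)
    search.fit(X, y)

    print("Best parameters:", search.best_params_)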

Troubleshooting

Summary and Conclusion

Requirements

  • Experience building machine learning models

Audience

  • Data scientists
  • Machine learning engineers

Duration

  • 14 hours
