Gradient Boosting from Theory to Practice (Part 2)

Use the gradient boosting classes in Scikit-Learn to solve different classification and regression problems
In the first part of this article, we presented the gradient boosting algorithm and showed its implementation in pseudocode.
In this part of the article, we will explore the classes in Scikit-Learn that implement this algorithm, discuss their various parameters, and demonstrate how to use them to solve several classification and regression problems.
Although the XGBoost library (which will be covered in a future article) provides a more optimized and highly scalable implementation of gradient boosting, for small to medium-sized data sets it is often easier to use the gradient boosting classes in Scikit-Learn, which have a simpler interface and significantly fewer hyperparameters to tune.
Scikit-Learn provides the following classes that implement the gradient-boosted decision trees (GBDT) model (a minimal usage sketch follows the list):
- GradientBoostingClassifier is used for classification problems.
- GradientBoostingRegressor is used for regression problems.
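
For orientation, here is a minimal sketch of how these two classes are imported and instantiated; the toy data and parameter values are illustrative assumptions, not part of the original article:

```python
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

# A GBDT classifier with default settings (100 trees of depth 3, learning rate 0.1)
clf = GradientBoostingClassifier()
clf.fit([[0, 1], [1, 0], [1, 1], [0, 0]], [0, 1, 1, 0])  # tiny illustrative dataset

# A GBDT regressor, here with an explicitly chosen number of boosting iterations
reg = GradientBoostingRegressor(n_estimators=100)
reg.fit([[0.0], [1.0], [2.0], [3.0]], [0.1, 1.1, 1.9, 3.2])
```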
In addition to the standard parameters of decision trees, such as criterion, max_depth (set by default to 3) and min_samples_split, these classes provide the following parameters (a combined example follows the list):
- loss — the loss function to be optimized. In GradientBoostingClassifier, this function can be ‘log_loss’ (the default) or ‘exponential’ (which makes gradient boosting behave like the AdaBoost algorithm). In GradientBoostingRegressor, this function can be ‘squared_error’ (the default), ‘absolute_error’, ‘huber’, or ‘quantile’.
- n_estimators — the number of boosting iterations (defaults to 100).
- learning_rate — a factor that shrinks the contribution of each tree (defaults to 0.1).
- subsample — the fraction of samples to use for training each tree (defaults to 1.0).
- max_features — the number of features to consider when searching for the best split in each node. The options are to specify an integer for the…
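
To make these parameters concrete, below is a minimal sketch that sets several of them explicitly and fits a classifier on a toy dataset. The dataset choice and the specific values are assumptions for illustration, not recommendations:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative toy dataset; any tabular classification data would work
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = GradientBoostingClassifier(
    loss='log_loss',       # the default classification loss
    n_estimators=200,      # number of boosting iterations (assumed value)
    learning_rate=0.05,    # shrinks the contribution of each tree
    subsample=0.8,         # train each tree on a random 80% of the samples
    max_features='sqrt',   # consider sqrt(n_features) candidates at each split
    max_depth=3,           # the default tree depth
    random_state=42,
)
clf.fit(X_train, y_train)
print(f'Test accuracy: {clf.score(X_test, y_test):.4f}')
```

Note that setting subsample below 1.0 turns the method into stochastic gradient boosting, which often improves generalization at the cost of slightly higher training variance.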