Classifier Boosting with Python

Introduction

Remember when we talked about random forests and how they were used to improve on the performance of a single Decision Tree classifier? The underlying idea, fitting a number of decision tree classifiers on various sub-samples of the dataset and combining their predictions to improve accuracy, can be applied to other algorithms as well; this family of techniques is called ensemble learning. There are several ensemble techniques we can use to improve an algorithm; we'll cover two of the most widely used: AdaBoost, a boosting method, and Bagging. A short scikit-learn sketch of each follows below.

AdaBoost

An AdaBoost classifier begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset, but with the weights of incorrectly classified instances adjusted so that subsequent classifiers focus more on the difficult cases.

Bagging

A Bagging classifier fits base classifiers, each on a random subset of the original dataset, and then aggregates their individual predictions (for example by voting) to form a final prediction.
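
Here is a minimal sketch of the AdaBoost idea using scikit-learn's AdaBoostClassifier; the Iris dataset, the train/test split, and the hyperparameter values are illustrative assumptions, not taken from the original post.

```python
# Minimal AdaBoost sketch; dataset and hyperparameters are illustrative.
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Each boosting round re-weights the training samples so the next weak
# learner (here a depth-1 "decision stump") focuses on the examples the
# previous rounds misclassified.
ada = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # weak learner, passed positionally
    n_estimators=50,
    random_state=42,
)
ada.fit(X_train, y_train)
print("AdaBoost accuracy: %.3f" % ada.score(X_test, y_test))
```

Passing the weak learner positionally sidesteps the rename of this keyword (base_estimator in older scikit-learn releases, estimator from version 1.2 on).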
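
And a corresponding sketch of Bagging with scikit-learn's BaggingClassifier, under the same illustrative assumptions as above.

```python
# Minimal Bagging sketch; dataset and hyperparameters are illustrative.
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Each tree is trained on a bootstrap sample of the training set (drawn
# with replacement); the ensemble combines the trees by majority vote.
bag = BaggingClassifier(
    DecisionTreeClassifier(),  # base learner, passed positionally
    n_estimators=50,
    max_samples=0.8,  # fraction of the training set drawn for each tree
    bootstrap=True,
    random_state=42,
)
bag.fit(X_train, y_train)
print("Bagging accuracy: %.3f" % bag.score(X_test, y_test))
```

Note the contrast with AdaBoost: the base learners here are full decision trees trained independently on random subsets, rather than weak learners trained sequentially on re-weighted data.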