This repository provides an unofficial implementation of the paper "Optimal Parallelization of Boosting".
- Built on top of scikit-learn 1.3.2.
- Implements ParallelBoostingClassifier, a parallelized version of scikit-learn's AdaBoostClassifier.
- Parallel computation is performed using multi-threading.
- Includes two test cases for the algorithm: the noisy spiral dataset and MNIST.
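As a rough illustration of the multi-threading point above, the sketch below fits several weak learners concurrently on a thread pool. It is not taken from this repository: `fit_stump` and `fit_round_in_parallel` are hypothetical helpers, the weak learner is a toy 1-D threshold rule, and the name `n_queries` is borrowed from the usage example with an assumed meaning (the number of weak learners fit concurrently in one round).

```python
from concurrent.futures import ThreadPoolExecutor
import random


def fit_stump(sample):
    """Fit a 1-D threshold rule (hypothetical weak learner) on (x, y) pairs, y in {-1, +1}."""
    best_threshold, best_sign, best_errors = None, 1, len(sample) + 1
    for threshold, _ in sample:
        for sign in (1, -1):
            # Predict `sign` when x >= threshold and `-sign` otherwise.
            errors = sum(
                1 for x, y in sample if (sign if x >= threshold else -sign) != y
            )
            if errors < best_errors:
                best_threshold, best_sign, best_errors = threshold, sign, errors
    return best_threshold, best_sign, best_errors


def fit_round_in_parallel(data, n_queries, sample_size, seed, max_workers=4):
    """Draw n_queries subsamples and fit one stump per subsample on a thread pool."""
    rng = random.Random(seed)
    # Subsamples are drawn in the main thread so the result is reproducible.
    subsamples = [rng.sample(data, sample_size) for _ in range(n_queries)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(fit_stump, subsamples))


# Toy 1-D data: the label is +1 exactly when x >= 0.5.
toy_rng = random.Random(0)
data = [(x, 1 if x >= 0.5 else -1) for x in (toy_rng.random() for _ in range(200))]
stumps = fit_round_in_parallel(data, n_queries=8, sample_size=50, seed=1)
```

Threads tend to pay off when the weak learner spends most of its time in native code that releases the GIL, which is largely the case for scikit-learn's tree builders. The pure-Python stump here only demonstrates the structure, not a speedup.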
Here is a simple example demonstrating the usage of ParallelBoostingClassifier:

    from parallel_boosting import ParallelBoostingClassifier
    from sklearn.tree import DecisionTreeClassifier

    # seed, X_train, y_train, and X_test are assumed to be defined by the caller.
    boost = ParallelBoostingClassifier(
        estimator=DecisionTreeClassifier(max_depth=3),  # weak learner
        n_rounds=200,
        n_queries=20,
        n_boosting_steps=10,
        learning_rate=0.5,
        random_state=seed,
        algorithm='SAMME',
        gamma=0.05,
        subsample_constant=1.0,
    )
    boost.fit(X_train, y_train)   # train on the training split
    y_pred = boost.predict(X_test)  # predict labels for the test split
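The example above leaves `seed`, `X_train`, `y_train`, and `X_test` undefined. One way to obtain them is a small two-class noisy spiral dataset, sketched below with only the standard library. The helpers `make_noisy_spirals` and `split_train_test` and all of their parameters are hypothetical, and the spiral test case shipped with this repository may be generated differently.

```python
import math
import random


def make_noisy_spirals(n_per_class=200, noise=0.05, turns=1.5, seed=0):
    """Return (X, y): two interleaved 2-D spiral arms with Gaussian noise, labels 0 and 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for i in range(n_per_class):
            t = (i + 1) / n_per_class  # position along the arm, in (0, 1]
            # The second arm is the first one rotated by 180 degrees.
            angle = 2 * math.pi * turns * t + label * math.pi
            X.append([
                t * math.cos(angle) + rng.gauss(0, noise),
                t * math.sin(angle) + rng.gauss(0, noise),
            ])
            y.append(label)
    return X, y


def split_train_test(X, y, test_fraction=0.25, seed=0):
    """Shuffle the indices and split them into train and test parts."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [X[i] for i in train_idx], [X[i] for i in test_idx],
        [y[i] for i in train_idx], [y[i] for i in test_idx],
    )


seed = 0
X, y = make_noisy_spirals(seed=seed)
X_train, X_test, y_train, y_test = split_train_test(X, y, seed=seed)
```

The nested lists produced here are array-like, so scikit-learn style `fit` and `predict` methods should accept them directly. Converting them with `numpy.asarray` first is an equally reasonable choice.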