Published on Fri Feb 22 2019

Diversity of Ensembles for Data Stream Classification

Mohamed Souhayel Abassi

When constructing a classifier ensemble, diversity among the base classifiers is one of the important characteristics. The analysis provides a deeper understanding of the concept of diversity and its impact on online ensemble learning in the presence of concept drift.

Abstract

When constructing a classifier ensemble, diversity among the base classifiers is one of the important characteristics. Several studies have addressed it in the context of standard static data, in particular by analyzing the relationship between high ensemble predictive performance and the diversity of the ensemble's components. Ensembles of learning machines have also been used to learn in the presence of concept drift and to adapt to it. However, diversity measures have not received much research interest in evolving data streams. Only a few researchers directly consider promoting diversity while constructing an ensemble or rebuilding it when a drift is detected. In this paper, we present a theoretical analysis of different diversity measures and relate them to the success of ensemble learning algorithms for streaming data. The analysis provides a deeper understanding of the concept of diversity and its impact on online ensemble learning in the presence of concept drift. More precisely, we are interested in answering the following research question: which diversity measures are commonly used in the context of static-data ensembles, and how far are they applicable in the context of streaming-data ensembles?
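The abstract does not name specific measures. As an illustration only, the sketch below computes two pairwise diversity measures that are widely used for static-data ensembles, the disagreement measure and Yule's Q-statistic, from per-instance correctness flags of each base classifier. The function names, the convention of returning 0.0 when Q is undefined, and the idea of applying the measures to a recent window of stream instances are assumptions made for this example, not the paper's method.

```python
from itertools import combinations


def pairwise_counts(correct_a, correct_b):
    """Count joint outcomes for two classifiers on the same instances.

    Each input is a sequence of truthy/falsy flags: truthy means the
    classifier labelled that instance correctly.
    Returns (n11, n00, n10, n01): both correct, both wrong,
    only A correct, only B correct.
    """
    if len(correct_a) != len(correct_b) or len(correct_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1
        elif a:
            n10 += 1
        elif b:
            n01 += 1
        else:
            n00 += 1
    return n11, n00, n10, n01


def disagreement(correct_a, correct_b):
    """Fraction of instances on which exactly one classifier is correct."""
    n11, n00, n10, n01 = pairwise_counts(correct_a, correct_b)
    return (n10 + n01) / (n11 + n00 + n10 + n01)


def q_statistic(correct_a, correct_b):
    """Yule's Q in [-1, 1]; lower values indicate more diverse errors."""
    n11, n00, n10, n01 = pairwise_counts(correct_a, correct_b)
    denom = n11 * n00 + n01 * n10
    if denom == 0:
        return 0.0  # Q is undefined here; 0.0 is an assumed convention
    return (n11 * n00 - n01 * n10) / denom


def mean_pairwise(measure, oracle_outputs):
    """Average a pairwise measure over all pairs of base classifiers."""
    pairs = list(combinations(oracle_outputs, 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    return sum(measure(a, b) for a, b in pairs) / len(pairs)
```

In a streaming setting, one possible use is to keep the correctness flags of the most recent instances for each base classifier and call `mean_pairwise(disagreement, flags)` on that window, so the diversity estimate follows the current concept rather than the whole history.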

Fri Aug 11 2017
Machine Learning
An Ensemble Classification Algorithm Based on Information Entropy for Data Streams
The data stream mining problem has attracted wide attention in the areas of machine learning and data mining. In some recent studies, ensemble classification has been widely used in concept drift detection. The new algorithm uses information entropy to evaluate a classification result.
Fri Sep 08 2017
Machine Learning
GOOWE: Geometrically Optimum and Online-Weighted Ensemble Classifier for Evolving Data Streams
Geometrically Optimum and Online-Weighted Ensemble (GOOWE) assigns optimum weights to the component classifiers. GOOWE provides improved reactions to different types of concept drift compared to our baselines. The statistical tests indicate a significant improvement in accuracy.
Sat Aug 24 2013
Machine Learning
Ensemble of Distributed Learners for Online Classification of Dynamic Data Streams
We present an efficient distributed online learning scheme to classify data. We propose a novel online ensemble learning algorithm to update the aggregation rule in order to adapt to the underlying data dynamics. We rigorously determine a bound for the worst-case misclassification probability of our algorithm.
Tue May 21 2019
Machine Learning
Recurring Concept Meta-learning for Evolving Data Streams
Enhanced Concept Profiling Framework (ECPF) aims to recognise recurring concepts and reuse a classifier trained previously. ECPF classifies significantly more accurately than a state-of-the-art classifier reuse framework. It classifies real-world datasets five times faster than Diversity Pool.
Wed Apr 26 2017
Machine Learning
An ensemble-based online learning algorithm for streaming data
The ensemble of base classifiers in our approach is obtained by learning Naive Bayes classifiers on different training sets, which are generated by projecting the original training set to a lower-dimensional space. We propose a mechanism to learn sequences of data using data chunks.
Fri May 15 2020
Machine Learning
Adaptive XGBoost for Evolving Data Streams