Ensemble Teaching for Hybrid Label Propagation

IEEE Trans Cybern. 2019 Feb;49(2):388-402. doi: 10.1109/TCYB.2017.2773562. Epub 2017 Dec 1.

Abstract

Label propagation aims to iteratively diffuse label information from labeled examples to unlabeled examples over a similarity graph. Current label propagation algorithms cannot consistently yield satisfactory performance for two reasons: the instability of any single propagation method across diverse practical data, and an improper propagation sequence that ignores the labeling difficulties of different examples. To remedy these defects, this paper proposes a novel propagation algorithm called hybrid diffusion under ensemble teaching (HyDEnT). Specifically, HyDEnT integrates multiple propagation methods as base "learners" to fully exploit their individual wisdom, which makes HyDEnT stable and allows it to obtain consistently encouraging results. More importantly, HyDEnT conducts propagation under the guidance of an ensemble of "teachers": in every propagation round, the simplest curriculum examples are designated by a teaching algorithm so that their labels can be reliably and accurately decided by the learners. To choose these simplest examples optimally, every teacher in the ensemble considers the examples' difficulties both from its own viewpoint and in light of the common knowledge shared by all teachers. This is accomplished by formulating an optimization problem, which can be solved efficiently via block coordinate descent. Thanks to the teachers' efforts, all unlabeled examples are propagated in a simple-to-difficult order, leading to better propagation quality than existing methods. Experiments on six popular datasets reveal that HyDEnT achieves the highest classification accuracy compared with six state-of-the-art propagation methodologies: harmonic functions, Fick's law assisted propagation, linear neighborhood propagation, semisupervised ensemble learning, bipartite graph-based consensus maximization, and teaching-to-learn and learning-to-teach.
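
As context for the opening sentence, the sketch below illustrates generic graph-based label propagation, i.e., the iterative diffusion F <- alpha*S*F + (1 - alpha)*Y0 over a symmetrically normalized similarity graph S. It is not the HyDEnT algorithm described in the abstract (no hybrid learners, no ensemble teaching); the similarity matrix W, the convention that -1 marks an unlabeled example, and the parameter names are assumptions made for illustration only.

    import numpy as np

    def propagate_labels(W, y, n_classes, alpha=0.99, n_iter=100):
        """Generic graph-based label propagation sketch (illustrative only,
        not HyDEnT). W: (n, n) similarity matrix; y: length-n label array
        with -1 for unlabeled examples."""
        n = W.shape[0]
        # Symmetrically normalize the similarity graph: S = D^{-1/2} W D^{-1/2}
        d = W.sum(axis=1)
        d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
        S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
        # Initial label matrix: one-hot rows for labeled examples, zeros otherwise
        Y0 = np.zeros((n, n_classes))
        for i, label in enumerate(y):
            if label >= 0:
                Y0[i, label] = 1.0
        # Iteratively diffuse label information from labeled to unlabeled nodes
        F = Y0.copy()
        for _ in range(n_iter):
            F = alpha * (S @ F) + (1 - alpha) * Y0
        # Each example receives the class with the largest accumulated score
        return F.argmax(axis=1)

With, say, a k-nearest-neighbor similarity graph W and only a handful of labeled points, the labels diffuse outward until each unlabeled point inherits the class dominating its region of the graph; the propagation sequence here is fixed by the iteration, which is precisely the aspect the abstract's simple-to-difficult teaching is meant to improve.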