Modeling multiple time series annotations as noisy distortions of the ground truth: An Expectation-Maximization approach

IEEE Trans Affect Comput. 2018 Jan-Mar;9(1):76-89. doi: 10.1109/TAFFC.2016.2592918. Epub 2016 Jul 19.

Abstract

Studies of time-continuous human behavioral phenomena often rely on ratings from multiple annotators. Since the ground truth of the target construct is often latent, the standard practice is to use ad-hoc metrics (such as averaging annotator ratings). Despite being easy to compute, such metrics may not provide accurate representations of the underlying construct. In this paper, we present a novel method for modeling multiple time series annotations over a continuous variable that computes the ground truth by modeling annotator-specific distortions. We condition the ground truth on a set of features extracted from the data and further assume that the annotators provide their ratings as modifications of the ground truth, with each annotator having specific distortion tendencies. We train the model using an Expectation-Maximization-based algorithm and evaluate it on a study involving natural interaction between a child and a psychologist, to predict confidence ratings of the children's smiles. We compare and analyze the model against two baselines in which (i) the ground truth is considered to be the framewise mean of ratings from the various annotators, and (ii) each annotator is assumed to bear a distinct time delay in annotation, and their annotations are aligned before computing the framewise mean.
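The two baselines named above can be illustrated with a minimal Python sketch. The abstract does not say how each annotator's delay is estimated, so the criterion used here (choosing the lag that maximizes Pearson correlation with a reference signal) is an assumption made only for illustration, and all function names, the `reference` signal, and the `max_delay` parameter are hypothetical rather than taken from the paper.

```python
import numpy as np


def framewise_mean(ratings):
    """Baseline (i): average the annotators' ratings at every frame.

    ratings: array-like of shape (num_annotators, num_frames).
    """
    return np.asarray(ratings, dtype=float).mean(axis=0)


def shift_left(series, delay):
    """Advance a series by `delay` frames, padding the tail with its last value."""
    if delay == 0:
        return series.copy()
    return np.concatenate([series[delay:], np.full(delay, series[-1])])


def estimate_delay(rating, reference, max_delay):
    """Assumed criterion (not from the paper): pick the lag in [0, max_delay]
    that maximizes Pearson correlation between the rating and the reference."""
    best_delay, best_corr = 0, -np.inf
    for d in range(max_delay + 1):
        n = len(rating) - d
        corr = np.corrcoef(rating[d:], reference[:n])[0, 1]
        if corr > best_corr:
            best_delay, best_corr = d, corr
    return best_delay


def delay_aligned_mean(ratings, reference, max_delay):
    """Baseline (ii): undo each annotator's estimated delay, then average framewise."""
    ratings = np.asarray(ratings, dtype=float)
    delays = [estimate_delay(r, reference, max_delay) for r in ratings]
    aligned = np.stack([shift_left(r, d) for r, d in zip(ratings, delays)])
    return aligned.mean(axis=0), delays
```

In this sketch `reference` stands in for whatever signal the delays are measured against, for example a feature stream extracted from the data. Neither baseline models annotator-specific distortions beyond a pure time shift, which is the gap the proposed EM-based model is meant to address.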

Keywords: Behavioral signal processing; Expectation Maximization (EM) algorithm; Multiple annotators; Time series modeling.