Assisted Labeling and Noise Correction for Time-Series Data
Developing semi-automatic tools to efficiently identify and correct label noise in time-series data, combining deep learning with human-in-the-loop visualizations.
The quality of a supervised machine learning model depends fundamentally on the quality of its training labels. However, real-world datasets are often plagued by “label noise” (incorrectly assigned labels), which can significantly degrade model performance. This problem is especially challenging for time-series data. Unlike images, raw sensor signals are not easily interpretable by humans, making manual verification a tedious, expensive, and error-prone process.
This project focuses on developing intelligent, human-in-the-loop systems to streamline the process of labeling and cleaning time-series data. By combining deep feature extraction, model-driven uncertainty sampling, and interactive visualizations, our tools empower researchers to efficiently produce high-quality datasets for applications like Human Activity Recognition (HAR).
TSAR: A Time Series Assisted Relabeling Tool
We began by surveying the literature on label noise in time-series data (Atkinson & Metsis, 2021). Our first major contribution in this area is TSAR (Time Series Assisted Relabeling) (Atkinson & Metsis, 2021), a tool designed to tackle label noise in existing datasets. Instead of requiring a full manual review, TSAR flags a small, user-defined percentage of the instances that are most likely to be mislabeled.
The core process involves training a Convolutional Neural Network (CNN) on the noisy dataset to learn discriminative features. The system then identifies the instances where the model’s prediction has the greatest distance from the provided label. These flagged candidates are presented to a human reviewer through a specialized interface. The visualization includes a t-SNE plot to provide context within the feature space, along with the raw waveforms of the suspicious instance and its nearest neighbors. Our experiments showed that cleaning just 2% of a dataset using TSAR resulted in an average accuracy improvement of 1.9% across six different classifiers (Atkinson & Metsis, 2021).
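The flagging step described above can be illustrated with a minimal sketch. It is not the TSAR implementation: it assumes the trained CNN has already produced class probabilities and feature vectors for every instance, and it assumes Euclidean distance between the predicted distribution and the one-hot label as the "distance" measure, which the description above does not specify. The function names (label_distance, flag_suspects, nearest_neighbors) and the toy data are hypothetical.

```python
import math

def label_distance(probs, label):
    """Euclidean distance between predicted probabilities and the one-hot label.
    (Assumed metric; any prediction-vs-label distance could be substituted.)"""
    return math.sqrt(sum((p - (1.0 if k == label else 0.0)) ** 2
                         for k, p in enumerate(probs)))

def flag_suspects(probabilities, labels, fraction):
    """Indices of the `fraction` of instances whose prediction is farthest
    from the provided label, most suspicious first."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n_flag = max(1, math.ceil(fraction * len(labels)))
    scores = [label_distance(p, y) for p, y in zip(probabilities, labels)]
    ranked = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_flag]

def nearest_neighbors(features, query_index, k):
    """Indices of the k instances closest to the query in feature space,
    used to give the reviewer context for a flagged instance."""
    query = features[query_index]
    others = [i for i in range(len(features)) if i != query_index]
    others.sort(key=lambda i: math.dist(features[i], query))
    return others[:k]

# Toy stand-ins for CNN outputs: two classes, five instances.
probabilities = [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05], [0.1, 0.9], [0.6, 0.4]]
labels = [0, 1, 1, 1, 0]          # instance 2 disagrees strongly with the model
features = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.1], [0.9, 1.1], [0.3, 0.0]]

suspects = flag_suspects(probabilities, labels, fraction=0.2)
print(suspects)                                      # [2]
print(nearest_neighbors(features, suspects[0], k=2)) # [0, 4]
```

In this toy example the reviewer would see instance 2 next to neighbors 0 and 4. Both carry label 0, which suggests that the label 1 assigned to instance 2 is the likely error.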


ALVI: A Semi-Automatic Labeling System
Building on the principles of TSAR, we developed the Assisted Labeling Visualizer (ALVI) (Hinkle et al., 2023), a comprehensive framework for semi-automatic labeling of large, unlabeled time-series datasets. While TSAR focuses on cleaning existing labels, ALVI is designed to accelerate the initial labeling process from scratch.
ALVI operates on a “human-in-the-loop” principle. A user begins by manually labeling a small “seed” portion of the data, assisted by an interactive interface that can synchronize raw sensor signals with video recordings. A deep learning model is then trained on this seed data and used to automatically predict labels for the rest of the dataset. Finally, ALVI uses the label correction techniques pioneered in TSAR to help the user efficiently review and correct the automatically generated labels. In our evaluations, ALVI reduced the labeling time for an 11-minute HAR session from 34 minutes (manual) to just 9 minutes, while simultaneously increasing the final accuracy from 91% to 96%.
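The seed, train, predict, review loop can be sketched as follows. This is an illustration under stated assumptions, not ALVI's code: a nearest-centroid classifier stands in for the deep learning model so the example stays self-contained, and the review queue is ordered by a simple confidence margin rather than by the TSAR-style correction step. All names (train_centroids, predict, auto_label) and the toy feature vectors are hypothetical.

```python
import math

def train_centroids(features, labels):
    """Stand-in 'model': the mean feature vector of each seed class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for d, value in enumerate(x):
            acc[d] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return (label, margin). The margin is the gap between the two closest
    centroid distances; a small margin means a less confident prediction."""
    ranked = sorted((math.dist(x, c), y) for y, c in centroids.items())
    best_dist, best_label = ranked[0]
    margin = ranked[1][0] - best_dist if len(ranked) > 1 else float("inf")
    return best_label, margin

def auto_label(seed_x, seed_y, unlabeled_x, review_fraction):
    """Train on the seed, label the rest, and queue the least confident
    predictions for human review."""
    centroids = train_centroids(seed_x, seed_y)
    predictions = [predict(centroids, x) for x in unlabeled_x]
    auto_labels = [label for label, _ in predictions]
    n_review = max(1, math.ceil(review_fraction * len(unlabeled_x)))
    by_margin = sorted(range(len(unlabeled_x)), key=lambda i: predictions[i][1])
    return auto_labels, by_margin[:n_review]

# Toy data: a manually labeled seed and four unlabeled windows.
seed_x = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
seed_y = ["sit", "sit", "walk", "walk"]
unlabeled_x = [[0.1, 0.1], [0.9, 0.8], [0.5, 0.6], [0.0, 0.2]]

auto_labels, review_queue = auto_label(seed_x, seed_y, unlabeled_x, 0.25)
print(auto_labels)    # ['sit', 'walk', 'walk', 'sit']
print(review_queue)   # [2]
```

Window 2 lies almost midway between the two class centroids, so it is the one routed to the reviewer. The confident predictions are accepted without manual inspection, which is how the workflow saves labeling time.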


This suite of tools represents a significant step towards making the creation of large-scale, high-quality time-series datasets a more manageable and reliable process.