Automatic Generation of High-Speed Reliable Lossy and Lossless Data Compressors
Highlights
LC, our tool for automatically generating high-performance lossless and guaranteed-error-bounded lossy data compressors, is now publicly available on GitHub. Follow the tutorial to try it out!
LC Overview
LC is a framework for automatically generating customized lossless and guaranteed-error-bounded lossy data-compression algorithms for individual files or groups of files. The resulting compressors and decompressors are parallelized and produce bit-for-bit identical results on CPUs and GPUs.
A step-by-step tutorial and the open-sourced framework are freely available at https://github.com/burtscher/LC-framework/.
LC consists of the following three parts:
- Component library
- Preprocessor library
- Framework
Both libraries contain data transformations (encoders) and their inverses (decoders) for CPU and GPU execution. The user can extend these libraries as explained in the tutorial. The framework takes preprocessors and components from these libraries and chains them into a pipeline to build a compression algorithm. It similarly chains the corresponding decoders in the opposite order to build the matching decompression algorithm. Figure 1 illustrates this process. Importantly, LC can automatically search for effective compression algorithms by testing all combinations of user-selected sets of components in each pipeline stage.
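The following minimal C++ sketch illustrates the chaining idea; the `Stage` and `Pipeline` types are hypothetical and not LC's actual API. Compression applies the encoder stages front to back, and decompression applies the inverse stages in reverse order:

```cpp
// Illustrative sketch of the pipeline idea (not LC's actual API):
// each stage transforms a byte buffer, and decompression applies the
// inverse stages in reverse order.
#include <cstdint>
#include <functional>
#include <vector>

using Buffer = std::vector<uint8_t>;
using Stage  = std::function<Buffer(const Buffer&)>;

struct Pipeline {
  std::vector<Stage> encoders;  // applied front to back when compressing
  std::vector<Stage> decoders;  // inverses, applied back to front

  Buffer compress(Buffer data) const {
    for (const Stage& enc : encoders) data = enc(data);
    return data;
  }

  Buffer decompress(Buffer data) const {
    for (auto it = decoders.rbegin(); it != decoders.rend(); ++it)
      data = (*it)(data);
    return data;
  }
};
```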
Figure 1: LC's process of chaining (a.k.a. pipelining) n data transformations to form a custom compression algorithm and the inverses of those transformations to form the matching decompression algorithm (the components are lossless whereas the preprocessors include guaranteed-error-bounded lossy quantizers).
General Features
LC supports both an exhaustive search for the best algorithm in the search space and a genetic-algorithm-based search for cases where the exhaustive search would take too long. In addition, the user can optionally supply a regular expression to reduce the size of the search space. LC can search for the best algorithm based solely on compression ratio or based on both compression ratio and throughput. In the latter case, it outputs the Pareto front, that is, the set of algorithms that represent different compression-ratio versus speed tradeoffs.
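To illustrate the Pareto-front concept, here is a generic C++ sketch (not LC's implementation) that extracts the non-dominated candidates from a set of measured (compression ratio, throughput) pairs:

```cpp
// Minimal sketch (not LC's code): extract the Pareto front from measured
// (compression ratio, throughput) pairs. A candidate is kept only if no
// other candidate is at least as good in both metrics and better in one.
#include <algorithm>
#include <vector>

struct Candidate {
  double ratio;       // compression ratio (higher is better)
  double throughput;  // e.g., GB/s (higher is better)
};

std::vector<Candidate> paretoFront(std::vector<Candidate> cands) {
  // sort by ratio descending, breaking ties by throughput descending
  std::sort(cands.begin(), cands.end(),
            [](const Candidate& a, const Candidate& b) {
    return (a.ratio != b.ratio) ? (a.ratio > b.ratio)
                                : (a.throughput > b.throughput);
  });
  std::vector<Candidate> front;
  double bestThroughput = -1.0;
  for (const Candidate& c : cands) {
    // every earlier candidate has a ratio at least as high, so c is on the
    // front exactly when it beats all of them in throughput
    if (c.throughput > bestThroughput) {
      front.push_back(c);
      bestThroughput = c.throughput;
    }
  }
  return front;
}
```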
LC can run on and generate algorithms for CPUs and GPUs. The algorithms are deterministic and fully compatible, meaning the user may compress a file on either the CPU or the GPU and decompress the resulting file on either the CPU or the GPU. The CPU code is written in C++ and parallelized using OpenMP. The GPU code is written in CUDA. Once a suitable algorithm has been found, the user can employ LC's code generator to produce a standalone compressor and decompressor for that algorithm that do not require the framework.
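As an illustration of why block-based processing yields deterministic parallel results (a hypothetical example, not LC's generated code), the following OpenMP-parallelized delta encoder operates on independent fixed-size blocks, so its output is bit-for-bit identical regardless of how many threads run it:

```cpp
// Hypothetical illustration (not LC's generated code): processing the input
// in independent fixed-size blocks makes the output identical regardless of
// the OpenMP thread count, because no block's result depends on scheduling.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

void deltaEncodeBlocks(const std::vector<int32_t>& in, std::vector<int32_t>& out,
                       std::size_t blockSize = 4096) {
  out.resize(in.size());
  const std::size_t nBlocks = (in.size() + blockSize - 1) / blockSize;
  #pragma omp parallel for schedule(static)
  for (long long b = 0; b < (long long)nBlocks; b++) {
    const std::size_t beg = b * blockSize;
    const std::size_t end = std::min(beg + blockSize, in.size());
    int32_t prev = 0;  // each block starts fresh, so blocks are independent
    for (std::size_t i = beg; i < end; i++) {
      out[i] = in[i] - prev;
      prev = in[i];
    }
  }
}
```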
LC includes an extensive library of components and preprocessors. Most of them support 1-, 2-, 4-, and 8-byte word sizes. Both libraries are user customizable and extensible, meaning users are able to add their own data transformations by following the API outlined in the tutorial. LC then includes the new transformations in its search for a good compression algorithm and can use them in the code generator.
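To give a flavor of what such a transformation pair might look like (the actual component API is specified in the tutorial; this example is purely illustrative), here is a word-size-generic encoder/decoder pair in which the decoder exactly inverts the encoder:

```cpp
// Hypothetical example of a word-size-generic transformation pair; the real
// component API is defined in the LC tutorial. An XOR-with-previous encoder
// supports 1-, 2-, 4-, and 8-byte words via a template parameter.
#include <cstddef>
#include <cstdint>
#include <vector>

template <typename Word>  // uint8_t, uint16_t, uint32_t, or uint64_t
std::vector<Word> xorEncode(const std::vector<Word>& in) {
  std::vector<Word> out(in.size());
  Word prev = 0;
  for (std::size_t i = 0; i < in.size(); i++) {
    out[i] = in[i] ^ prev;  // store a difference-like residual
    prev = in[i];
  }
  return out;
}

template <typename Word>
std::vector<Word> xorDecode(const std::vector<Word>& in) {
  std::vector<Word> out(in.size());
  Word prev = 0;
  for (std::size_t i = 0; i < in.size(); i++) {
    out[i] = in[i] ^ prev;  // invert: reconstruct the original value
    prev = out[i];
  }
  return out;
}
```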
Lossy-Mode Features
In addition to lossless algorithms, LC can also generate lossy algorithms for 32-bit single-precision and 64-bit double-precision floating-point data. It supports absolute, relative, normalized absolute, and combined absolute & relative error bounds. Moreover, it guarantees that these pointwise error bounds are never violated: any value that cannot be quantized within the provided error bound is encoded losslessly. It supports all floating-point values, including infinities, not-a-numbers (NaNs), and denormals. Each quantizer provides two modes, one that replaces the lost bits with zeros and another that replaces them with random bits to minimize autocorrelation between the errors.
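The following simplified C++ sketch shows how such a guarantee can be enforced for an absolute error bound (LC's actual quantizers are more sophisticated; all names here are illustrative): each value is uniformly quantized, and any value whose reconstruction would violate the bound, including infinities and NaNs, is instead stored losslessly via its exact bit pattern.

```cpp
// Simplified sketch of the error-bound guarantee (not LC's actual quantizer):
// each value is uniformly quantized with bin width 2*errBound; if
// dequantization would violate the absolute error bound (e.g., for huge
// values, infinities, or NaNs), the original bits are kept losslessly.
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct QuantResult {
  std::vector<int64_t> codes;     // bin numbers, in order, where !lossless[i]
  std::vector<uint32_t> rawBits;  // exact IEEE bits of unquantizable values
  std::vector<bool> lossless;     // per-value flag: stored exactly?
};

QuantResult quantizeAbs(const std::vector<float>& in, float errBound) {
  QuantResult r;  // errBound must be > 0
  r.lossless.resize(in.size());
  for (std::size_t i = 0; i < in.size(); i++) {
    const double bin = std::round(in[i] / (2.0 * errBound));
    const double reconstructed = bin * 2.0 * errBound;
    if (std::isfinite(reconstructed) &&
        std::fabs(reconstructed - in[i]) <= errBound) {
      r.codes.push_back((int64_t)bin);  // within bound: keep the bin number
    } else {
      uint32_t bits;                    // out of bound, inf, or NaN:
      std::memcpy(&bits, &in[i], 4);    // fall back to the exact bit pattern
      r.lossless[i] = true;
      r.rawBits.push_back(bits);
    }
  }
  return r;
}
```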
Project Summary
Fast reliable data compression is urgently needed for many leading-edge scientific instruments and for exascale high-performance computing applications because they produce vast amounts of data at extremely high rates. The goal of this project is to develop a high-speed reliable lossy-compression framework named LC that meets three critical needs: (i) improving the trustworthiness of lossy compression methods and the data reduction quality, (ii) increasing the compression/decompression speed to match the high data generation/acquisition rates, and (iii) supporting progressive compression and decompression with multiple levels of resolution to meet the demands of today's leading scientific applications and instruments.
The project comprises the following three research thrusts.

(1) To address the trustworthiness and data-reduction-quality issues, the LC framework will allow users to synthesize customized algorithms for the coding stage of the lossy compression pipeline, optimizing quality-of-interest preservation and the compression ratio. To this end, LC will provide a very large tradeoff space with numerous coding algorithms to choose from and will automatically emit the code of the optimal configuration with reliable execution-time bounds.

(2) To address the speed challenge, we will develop lightweight error-bounded decorrelation strategies, high-speed data predictors, efficient quantization methods, and a new class of encoders called 'essentially lossless' that will compress faster and better than the current state of the art. We will also parallelize the LC framework as well as the generated compression/decompression codes for both CPUs and GPUs and will create algorithms that are portable across heterogeneous architectures.

(3) To enable users to build their own multi-resolution progressive compressors, we will extend LC to support the generation of progressive algorithms that adaptively meet user requirements by employing a hierarchical block-wise tree-based structure that can suppress subtrees on demand.

The resulting fast and reliable lossy-compression framework will greatly benefit the many scientific applications that need not only high trustworthiness but also high performance.
DOE Press release
Texas State press release
Project summary slide
LC framework overview slide
Publications
Alex Fallin and Martin Burtscher.
Lessons Learned on the Path to Guaranteeing the Error Bound in Lossy Quantizers.
Workshop on Correct Data Compression. July 2024.
[doi] [paper] [slides]
Andrew Rodriguez, Noushin Azami, and Martin Burtscher.
Adaptive Per-File Lossless Compression of Floating-Point Data.
Proceedings of the 5th Workshop on Extreme-Scale Storage and Analysis. May 2024.
[paper] [slides]
Noushin Azami, Rain Lawson, and Martin Burtscher.
LICO: An Effective, High-Speed, Lossless Compressor for Images.
Proceedings of the 2024 Data Compression Conference. March 2024.
[doi] [paper] [slides]
Brandon A. Burtchell and Martin Burtscher.
Using Machine Learning to Predict Effective Compression Algorithms for Heterogeneous Datasets.
Proceedings of the 2024 Data Compression Conference. March 2024.
[doi] [paper] [slides]
Noushin Azami and Martin Burtscher.
Compressed In-memory Graphs for Accelerating GPU-based Analytics.
Proceedings of the 12th SC Workshop on Irregular Applications: Architectures and Algorithms. November 2022.
[doi] [paper]
Code Releases
The LC Framework for Generating Efficient Data-Compression Algorithms: LC framework code
High-Speed Lossless Image Compression: LICO code
Compressed In-memory Graphs for Accelerating GPU-based Analytics: MPLG code
Team
Martin Burtscher (PI)
Sheng Di (Co-PI)
Franck Cappello (senior advisor)
Noushin Azami (Ph.D. student)
Brandon Burtchell (Ph.D. student)
Alex Fallin (Ph.D. student)
Benila Jerald (Ph.D. student)
Yiqian Liu (Ph.D. student)
Andrew Rodriguez (Ph.D. student)
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research (ASCR), under contracts DE-SC0022223 and DE-AC02-06CH11357.