J. Coplin and M. Burtscher. Energy and Power Considerations of GPUs. Chapter 19 in Advances in GPU Research and Practice. Elsevier, 2017. [link]


A. Yang, J. Coplin, H. Mukka, F. Hesaaraki, and M. Burtscher. MPC: An Effective Floating-Point Compression Algorithm for GPUs. Chapter 13 in Advances in GPU Research and Practice. Elsevier, 2017. [link]


J. Coplin, A. Yang, A.R. Poppe, and M. Burtscher. Increasing Telemetry Throughput Using Customized and Adaptive Data Compression. The AIAA Space and Astronautics Forum and Exposition (SPACE'16). September 2016. [pdf]

Space-based instruments generate ever larger amounts of data, and it has become a significant challenge to transmit even a fraction of a typical spacecraft's data volume back to Earth in a feasible amount of time. Improving the ability to losslessly compress data on-board before transmission is therefore an important way to increase overall data return rates. We describe a custom methodology for compressing spacecraft data on-board that provides significant improvements in both compression ratio and speed. We have used data returned by the five-probe THEMIS/ARTEMIS constellation to quantify the compression ratio and compression speed improvements over a wide variety of data types (e.g., time-series and particle data). Our approach results in a 30% improvement in compression ratio and a two-fold improvement in compression speed. We argue that such methods should be adopted by future space missions to maximize the data return to Earth, thus enabling greater insight and scientific discovery.


J. Coplin and M. Burtscher. Energy, Power, and Performance Characterization of GPGPU Benchmark Programs. The 12th Workshop on High-Performance, Power-Aware Computing (HPPAC'16). May 2016. [pdf]

This paper studies the effects on energy consumption, power draw, and runtime of a modern compute GPU when changing the core and memory clock frequencies, enabling or disabling ECC, using alternate implementations, and varying the program inputs. We evaluate 34 applications from 5 benchmark suites and measure their power draw over time on a K20c GPU. Our results show that changing the frequency or the program implementation can alter the energy, power, and performance by a factor of two or more. Interestingly, some changes affect these three aspects very unevenly. ECC can greatly increase the runtime and energy consumption, but only on memory-bound codes. Compute-bound codes tend to behave quite differently from memory-bound codes, in particular regarding their power draw. On irregular programs, a small change in frequency can result in a large change in runtime and energy consumption.
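The three quantities in this abstract are linked: energy is the integral of power draw over the runtime. The following minimal sketch shows that relation with trapezoidal integration. The sample values and the function name `energy_joules` are illustrative assumptions, not measurements or code from the paper.

```python
def energy_joules(times_s, power_w):
    """Integrate power samples (watts) over time (seconds) with the trapezoidal rule."""
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        # Area of the trapezoid between two consecutive samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Hypothetical power samples, made up for illustration only.
times = [0.0, 1.0, 2.0, 3.0]        # seconds
power = [50.0, 100.0, 100.0, 50.0]  # watts

energy = energy_joules(times, power)  # joules
runtime = times[-1] - times[0]        # seconds
average_power = energy / runtime      # watts

print(f"runtime={runtime:.1f} s  energy={energy:.1f} J  avg power={average_power:.1f} W")
```

Because energy is the product of average power and runtime, a change that lowers power but lengthens runtime can leave the energy almost unchanged, which is one way the three aspects can be affected unevenly.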


J. Coplin and M. Burtscher. Effects of Source-Code Optimizations on GPU Performance and Energy Consumption. 8th Workshop on General Purpose Processing Using GPUs. February 2015. [pdf]

This paper studies the effects of source-code optimizations on the performance, power draw, and energy consumption of a modern compute GPU. We evaluate 128 versions of two n-body codes: a compute-bound regular implementation and a memory-bound irregular implementation. Both programs include six optimizations that can be individually enabled or disabled. We measure the active runtime and the power consumption of each code version on three inputs, at various GPU clock frequencies, with two arithmetic precisions, and with and without ECC. This paper investigates which optimizations primarily improve energy efficiency, which ones mainly boost performance, and which ones help both aspects. Some optimizations also have the added benefit of reducing the power draw. Our analysis shows that individual optimizations and combinations of optimizations can alter the performance and energy consumption of a GPU kernel by up to a factor of five.
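One reading consistent with the counts in this abstract is that the 128 versions are the 2 programs times the 2^6 on/off combinations of the six optimizations. The sketch below enumerates such a configuration space. The program labels and the tuple layout are assumptions made for illustration, not the paper's actual build setup.

```python
from itertools import product

# Hypothetical labels for the two n-body implementations.
PROGRAMS = ("regular", "irregular")
NUM_OPTIMIZATIONS = 6  # each optimization is either disabled (False) or enabled (True)

# Every program paired with every on/off combination of the six optimizations.
versions = [
    (program, flags)
    for program in PROGRAMS
    for flags in product((False, True), repeat=NUM_OPTIMIZATIONS)
]

print(len(versions))  # 2 * 2**6 combinations
print(versions[0])    # first entry: the regular code with all optimizations disabled
```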


J. Coplin and M. Burtscher. Power Characteristics of Irregular GPGPU Programs. 2014 International Workshop on Green Programming, Computing, and Data Processing. November 2014. [pdf]

This paper investigates the power profiles of irregular programs running on a K20 compute GPU and contrasts them with the profiles of regular programs. The paper further studies the effects on the power profile when changing the GPU's core and memory frequencies, using alternate implementations of the same algorithm, and varying the program input. Our results show that the power behavior of irregular applications often cannot be accurately captured by a single average. Rather, the entire profile, i.e., the power as a function of time, needs to be considered. In addition, lowering the frequency, employing alternate implementations, or using different inputs can drastically alter the power profile of irregular codes, meaning that measurements using one setting may not be representative of that program's power characteristics under a different setting.
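The point about a single average can be illustrated with two made-up power traces that share the same mean but have very different shapes. The sample values below are hypothetical and assume equally spaced samples; they are not measurements from the paper.

```python
from statistics import mean

# Hypothetical, equally spaced power samples in watts.
steady_profile = [100.0] * 8
bursty_profile = [40.0, 160.0, 40.0, 160.0, 40.0, 160.0, 40.0, 160.0]

for name, profile in (("steady", steady_profile), ("bursty", bursty_profile)):
    # The mean hides the swings; peak and range expose them.
    print(
        f"{name}: mean={mean(profile):.1f} W, "
        f"peak={max(profile):.1f} W, "
        f"range={max(profile) - min(profile):.1f} W"
    )
```

Both traces average the same power, yet only the full trace reveals the peaks of the bursty one.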