Research


Research Summary

Parallel Processing

The high computational requirements and real-time constraints of image processing and graphics rendering applications have generated considerable interest in the development of parallel processing techniques.

  • Software Environments for Parallel Image Processing

Parallel algebraic logic
One of his research areas extends his dissertation work on image algebra and parallel environments for image processing. As a member of the university and industry consortium led by the Lockheed Martin Corporation, he studied architectural support for image algebra and the design of parallel algorithms for image processing under the image algebra framework. He has also been pursuing a new approach that combines image algebra research with computational intelligence. Existing work on image algebra focuses on its theoretical aspects and on direct implementation of image algebra on computer systems. His new approach is to develop an intelligent system that optimizes signal and image processing algorithms specified in image algebra at the application level and maps them onto various hardware architectures in an optimal way on the fly, with the ultimate aim of mapping algorithms onto adaptive, reconfigurable computer systems such as those based on FPGAs. In this approach, the application-dependent features of signal and image processing algorithms are specified using image algebra, which thus serves as a powerful specification language and allows algorithm optimization at the application level. The approach also provides mechanisms for specifying features of the underlying architecture. Computational intelligence is used to reason about optimal strategies for mapping an algorithm (expressed as image algebra operations) onto the various architectures according to user-specified criteria. This is accomplished using a performance knowledge base that stores information about the performance of image algebra operations (and complete algorithms) on various architectures.
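
As a rough illustration of why image algebra works well as a specification language, the sketch below (Python with numpy; the function name, API, and operator choices are illustrative and not taken from the consortium's framework) expresses a generalized template product whose combine and reduce operators are parameters, so one specification covers linear filtering as well as morphological operations and leaves the mapping to a particular architecture open.

    # Sketch of an image-algebra-style generalized template product.
    # The names and this API are illustrative, not the actual framework.
    # Assumes numpy is available.
    import numpy as np

    def template_product(image, template, combine=np.multiply, reduce_op=np.add):
        """Apply a template at every pixel, combining neighborhood values with
        `combine` and folding them with `reduce_op`.  With multiply/add this is
        ordinary linear filtering; with add/maximum it behaves like a
        morphological dilation, so one specification covers many operations."""
        h, w = image.shape
        th, tw = template.shape
        padded = np.pad(image, ((th // 2, th // 2), (tw // 2, tw // 2)), mode="edge")
        out = np.empty((h, w), dtype=float)
        for i in range(h):
            for j in range(w):
                window = padded[i:i + th, j:j + tw]
                out[i, j] = reduce_op.reduce(combine(window, template), axis=None)
        return out

    img = np.random.rand(8, 8)
    smoothed = template_product(img, np.ones((3, 3)) / 9.0)            # linear filter
    dilated = template_product(img, np.zeros((3, 3)),
                               combine=np.add, reduce_op=np.maximum)   # dilation-like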

Compilation techniques for programming the VGI parallel computer
Another research effort in this direction is the investigation of compilation techniques for programming the VGI parallel computer for video, graphics, and image processing. The VGI computer, developed by UC-Berkeley and DataFlow Systems under the sponsorship of the Air Force Research Laboratory, is a multi-threaded static dataflow computer intended for processing the stream data present in video, graphics, and image processing applications. He spent the summer of 1998 at the Air Force Research Laboratory at Eglin AFB as a summer faculty research fellow, and subsequently obtained a research contract from the Air Force Office of Scientific Research to study programming models and compilation techniques for the VGI computer. The VGI computer consists of a large number of processors connected through reconfigurable communication networks. Two connected processors use a simple handshake protocol for communication and synchronization. A core problem in developing a user-friendly software environment for the VGI computer is how to use the reconfigurable networks effectively to route connections among the processors for any given signal processing algorithm. Drawing on graph theory, he developed an efficient algorithm for routing a large number of processors of the VGI parallel computer when executing signal processing programs.
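
As a rough, hypothetical illustration of one ingredient of such a routing problem, the sketch below (Python; the switch graph, link bookkeeping, and all names are invented for illustration and do not reproduce the VGI networks or his routing algorithm) searches for a path of unused links between two processors.

    # Hypothetical illustration: breadth-first search for a free path between
    # two processors in a small switch graph.  Not the VGI routing algorithm.
    from collections import deque

    def route(links, src, dst, used):
        """Find a path from src to dst that avoids links already in `used`."""
        parent = {src: None}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in links[node]:
                if nxt not in parent and (node, nxt) not in used:
                    parent[nxt] = node
                    queue.append(nxt)
        return None          # blocked; a real router would rip up and retry

    mesh = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}   # a tiny switch graph
    used = set()
    for s, d in [(0, 3), (2, 0)]:
        path = route(mesh, s, d, used)
        if path:
            used.update({(a, b) for a, b in zip(path, path[1:])})
            used.update({(b, a) for a, b in zip(path, path[1:])})
        print(s, "->", d, "via", path)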

  • Parallel Algorithms for Image Processing and Graphics Rendering

Image component labeling and analysis
He developed an SIMD mesh algorithm for labeling image components, a fundamental image processing problem. This algorithm gives a positive answer to an open question posed by several researchers: whether there exists an O(n)-time, O(log n)-space local labeling algorithm for labeling an n x n image on an n x n SIMD mesh-connected computer. The algorithm uses a pipeline mechanism with stack-like data structures to achieve the lower bounds of O(n) time complexity and O(log n) space complexity, and it has very small multiplicative constants in its complexities because it uses only local parallel-shrink and label-propagate operations. He also developed fast parallel algorithms for analyzing image components, an important step toward understanding images.
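
The label-propagate idea can be illustrated with the serial sketch below (Python with numpy); it shows only the propagation step, not the O(n)-time, O(log n)-space mesh algorithm or its parallel-shrink operation.

    # Serial sketch of the label-propagate step for 4-connected component
    # labeling of a binary image.  Illustration only, not the SIMD mesh
    # algorithm described above.  Assumes numpy.
    import numpy as np

    def label_components(binary):
        h, w = binary.shape
        # Every foreground pixel starts with a unique label; background is 0.
        labels = np.where(binary, np.arange(1, h * w + 1).reshape(h, w), 0)
        while True:
            neighbor_max = np.zeros_like(labels)
            neighbor_max[1:, :] = np.maximum(neighbor_max[1:, :], labels[:-1, :])
            neighbor_max[:-1, :] = np.maximum(neighbor_max[:-1, :], labels[1:, :])
            neighbor_max[:, 1:] = np.maximum(neighbor_max[:, 1:], labels[:, :-1])
            neighbor_max[:, :-1] = np.maximum(neighbor_max[:, :-1], labels[:, 1:])
            # Each foreground pixel adopts the largest label in its neighborhood.
            new = np.where(binary, np.maximum(labels, neighbor_max), 0)
            if np.array_equal(new, labels):
                return labels
            labels = new

    img = np.array([[1, 1, 0, 0],
                    [0, 1, 0, 1],
                    [0, 0, 0, 1],
                    [1, 0, 0, 1]], dtype=bool)
    print(label_components(img))      # three components, three distinct labels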

Parallel algorithms for chromosome image segmentation
He reduced the chromosome image segmentation problem to the problem of finding shortest paths in a grid graph. He developed practical mesh algorithms, based on a local cost-reducing operation, for various forms of the grid-graph shortest path problem on mesh-connected computers. The algorithms are very simple and can easily mark the vertices on the shortest paths between any two vertices. The time complexity is proportional to the maximum length of the shortest paths, with a very small multiplicative constant. Applying the algorithms with a suitable cost measure to the chromosome image segmentation problem makes it possible to split touching chromosomes more intelligently and in parallel.
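
The local cost-reducing idea can be sketched as a simple relaxation on a grid graph (Python with numpy, serial, with illustrative node costs); this is not his mesh algorithm, but it shows the kind of neighbor-only update that each mesh cell would perform in parallel.

    # Serial sketch of a local cost-reducing relaxation for grid-graph shortest
    # paths: every cell repeatedly lowers its distance estimate from its four
    # neighbors.  The node costs below are illustrative.  Assumes numpy.
    import numpy as np

    def grid_shortest_path_costs(cost, source):
        """cost[i, j] is the cost of entering cell (i, j); returns the minimum
        path cost from `source` to every cell."""
        h, w = cost.shape
        dist = np.full((h, w), np.inf)
        dist[source] = cost[source]
        while True:
            shifted = [np.full((h, w), np.inf) for _ in range(4)]
            shifted[0][1:, :] = dist[:-1, :]     # from the cell above
            shifted[1][:-1, :] = dist[1:, :]     # from the cell below
            shifted[2][:, 1:] = dist[:, :-1]     # from the cell to the left
            shifted[3][:, :-1] = dist[:, 1:]     # from the cell to the right
            new = np.minimum(dist, np.minimum.reduce(shifted) + cost)
            if np.array_equal(new, dist):
                return dist
            dist = new

    cost = np.ones((5, 5))
    cost[2, 1:4] = 10.0                          # an expensive band to route around
    print(grid_shortest_path_costs(cost, (0, 0)))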

Parallel techniques for volume graphics rendering
Volume rendering is essentially a process of presenting 3D information in 2D form. Because of the huge size of volume data, it is desirable to employ parallel techniques for real-time rendering. The core step of a ray casting algorithm is the projection of 3D volume data onto the 2D image plane. Rays cast into the volume may access voxels in different volume slices, resulting in irregular data access patterns that cannot be handled directly by SIMD processor arrays. He developed two parallel algorithms, one for orthographic projections and the other for perspective projections, based on a projection decomposition idea: the projection is decomposed into a sequence of transformations that can be implemented efficiently on SIMD processor arrays. Each slice is transformed into an intermediate coordinate system in which each new 2D slice maps to the same vertical position while each voxel shifts to its new position within the same slice. Each new slice is then scaled down according to its distance from the viewpoint. Finally, a parallel perpendicular projection and a 2D restoration generate the 2D image. The algorithms were also implemented on the PAL computer.
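
A rough serial sketch of the decomposition for perspective projection follows (Python with numpy; nearest-neighbor resampling and opacity-only compositing are simplifications, and this is not the SIMD/PAL implementation).

    # Serial sketch of the projection-decomposition idea: each z-slice is
    # scaled toward the viewpoint, then the scaled slices are composited with a
    # parallel projection.  Illustration only.  Assumes numpy.
    import numpy as np

    def render_volume(volume, eye_distance):
        """volume: (depth, h, w) opacities in [0, 1]; the viewer looks down the
        z-axis from `eye_distance` in front of slice 0."""
        depth, h, w = volume.shape
        image = np.zeros((h, w))
        transparency = np.ones((h, w))
        cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
        ys, xs = np.mgrid[0:h, 0:w]
        for z in range(depth):
            scale = eye_distance / (eye_distance + z)    # farther slices shrink
            sy = np.clip(np.round(cy + (ys - cy) / scale).astype(int), 0, h - 1)
            sx = np.clip(np.round(cx + (xs - cx) / scale).astype(int), 0, w - 1)
            slice_scaled = volume[z][sy, sx]             # shift/scale within the slice
            image += transparency * slice_scaled         # front-to-back compositing
            transparency *= 1.0 - slice_scaled
        return image

    vol = np.random.rand(16, 32, 32) * 0.1               # a mostly transparent volume
    print(render_volume(vol, eye_distance=64.0).shape)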

High-speed target classification and detection
He has developed a high-speed implementation of the morphological shared-weight neural network (MSNN) on the PAL computer for automatic target classification and detection. Although there is significant parallelism inherent in the MSNN, implementing it on an SIMD parallel computer consisting of a large array of simple processing elements is a challenge. The implementation partitions images into blocks the size of the processor array and processes them with a block-first strategy: because of the long input/output latency, each block is processed completely on the processor array using neighborhood operations before the next block is brought in.
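
The block-first strategy can be sketched as follows (Python with numpy; the 64 x 64 array size and the stand-in neighborhood operation are assumptions, not the actual PAL parameters or MSNN computation).

    # Sketch of the block-first strategy: the image is partitioned into tiles no
    # larger than the processor array, and each tile is processed completely
    # before the next tile is loaded, amortizing the input/output latency.
    # The array size and the 3x3 local-maximum stand-in are assumptions.
    # Assumes numpy.
    import numpy as np

    ARRAY_H, ARRAY_W = 64, 64                    # assumed processor-array size

    def neighborhood_op(block):
        """Stand-in for the per-block computation: a 3x3 local maximum."""
        padded = np.pad(block, 1, mode="edge")
        views = [padded[i:i + block.shape[0], j:j + block.shape[1]]
                 for i in range(3) for j in range(3)]
        return np.maximum.reduce(views)

    def process_block_first(image):
        out = np.empty_like(image)
        for top in range(0, image.shape[0], ARRAY_H):
            for left in range(0, image.shape[1], ARRAY_W):
                block = image[top:top + ARRAY_H, left:left + ARRAY_W]
                # "Load" the block once, finish all work on it, then move on.
                out[top:top + ARRAY_H, left:left + ARRAY_W] = neighborhood_op(block)
        return out

    print(process_block_first(np.random.rand(200, 200)).shape)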

Distributed Multimedia and E-Learning

Distributed processing is a key supporting technology in the newly emerging information society. When end users access intranet resources or browse the Internet from their desktops, distributed processing is behind virtually every mouse click. The success of distributed processing rests on our ability to make it happen efficiently, reliably, and seamlessly.

Development of the Distributed Systems Laboratory
He has established the Distributed Systems Laboratory (DSL). DSL was initially funded by NSF as a laboratory to support the CS courses in the systems area. The laboratory also provides an excellent infrastructure for distributed processing research.

Fast algorithm for reconstructing motion compensated blocks in compressed domain
Many distributed multimedia applications require processing compressed video signals. For compression systems that use the Discrete Cosine Transform (DCT) with motion compensation, he developed a fast algorithm with Adaptive Low-Pass Filtering (ALPF) to reconstruct blocks with motion vectors directly in the compressed domain. Compared with previous work, the algorithm is faster and reduces the blocky artifacts caused by quantization.
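
For context, the sketch below (Python with numpy) shows the standard compressed-domain setting such algorithms work in: a motion-compensated 8 x 8 block that straddles up to four reference blocks can be assembled from their DCT coefficients using precomputed shift matrices, because the DCT is orthonormal. The ALPF algorithm itself is not reproduced here.

    # Sketch of DCT-domain motion compensation via shift matrices.
    # Not the ALPF algorithm.  Assumes numpy.
    import numpy as np

    N = 8
    k = np.arange(N)
    T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
    T[0, :] = np.sqrt(1.0 / N)                   # orthonormal DCT-II matrix
    dct = lambda X: T @ X @ T.T
    idct = lambda X: T.T @ X @ T

    def compensate_dct(ref_dct, dy, dx):
        """DCT of the 8x8 block at offset (dy, dx) inside the 2x2 group of
        reference blocks whose DCTs are in ref_dct[(row, col)]."""
        Hu, Hl = np.eye(N, k=dy), np.eye(N, k=dy - N)    # row-selection matrices
        Gl, Gr = np.eye(N, k=-dx), np.eye(N, k=N - dx)   # column-selection matrices
        out = np.zeros((N, N))
        for (r, c), H, G in [((0, 0), Hu, Gl), ((0, 1), Hu, Gr),
                             ((1, 0), Hl, Gl), ((1, 1), Hl, Gr)]:
            out += dct(H) @ ref_dct[(r, c)] @ dct(G)
        return out

    frame = np.random.rand(16, 16)
    refs = {(r, c): dct(frame[8 * r:8 * r + 8, 8 * c:8 * c + 8])
            for r in (0, 1) for c in (0, 1)}
    block = idct(compensate_dct(refs, dy=3, dx=5))
    assert np.allclose(block, frame[3:11, 5:13])         # matches direct extraction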

Distributed e-learning environment
Web-based e-learning offers many benefits over traditional learning environments and has become one of the fastest growing markets in the education industry. He has applied distributed processing techniques and artificial intelligence methods to e-learning to better support a large population of students with different learning styles and diverse learning goals. He has investigated secure management of XML-based learning objects and student profiles and has developed better ways of organizing learning materials, modeling students, and delivering learning objects in a distributed environment to support student-centered, self-paced, and highly interactive learning.

Neural Networks

Neural computing is an alternative parallel and distributed computing paradigm that models the computation found in biological systems. A neural network is a network composed of interconnected units called artificial neurons. Each unit has an input/output characteristic and implements a local computation or function. The output of any unit is determined by its input/output characteristic, its interconnections to other units, and possibly external inputs. Computation in neural networks is massively distributed and parallel, and the major mechanism for program development is learning. Neural networks offer potential for approaching many currently unsolved problems. Recent advances in computer architecture, especially the availability of relatively inexpensive, massively parallel systems, have made the implementation of neural networks more practical. He has done work on morphological shared-weight neural networks and on associative memory models and algorithms.
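
A minimal sketch of such a unit follows (Python with numpy; the sigmoid characteristic and all values are illustrative).

    # A single artificial neuron: a weighted sum of its inputs passed through an
    # input/output characteristic (a sigmoid here; values are illustrative).
    # Assumes numpy.
    import numpy as np

    def neuron(inputs, weights, bias):
        return 1.0 / (1.0 + np.exp(-(np.dot(weights, inputs) + bias)))

    x = np.array([0.2, 0.7, 0.1])
    print(neuron(x, weights=np.array([0.5, -1.0, 2.0]), bias=0.1))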

Morphological shared-weight neural networks
In target classification and detection problems, feature extraction and decision making are important components, so a heterogeneous neural network that can learn feature extraction and classification simultaneously is very appealing. The morphological shared-weight neural network (MSNN) proposed by Gader is such a network. He has improved the MSNN by incorporating entropy into the objective function used for learning and has applied it to target classification and detection in synthetic aperture radar (SAR) images. He has also developed a high-speed implementation of the MSNN on the PAL computer for automatic target classification and detection.
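
A rough sketch of the morphological feature-extraction stage of an MSNN-style network follows (Python with numpy; the structuring elements here are fixed rather than learned, and the entropy-based objective and the classifier stage are not shown).

    # Grayscale erosion and dilation with shared structuring elements slide over
    # the image and would feed a conventional classifier stage.  Illustration
    # only; training is not shown.  Assumes numpy.
    import numpy as np

    def gray_erosion(image, se):
        """min over the window of (pixel - structuring-element value)."""
        sh, sw = se.shape
        out_h, out_w = image.shape[0] - sh + 1, image.shape[1] - sw + 1
        out = np.full((out_h, out_w), np.inf)
        for i in range(sh):
            for j in range(sw):
                out = np.minimum(out, image[i:i + out_h, j:j + out_w] - se[i, j])
        return out

    def gray_dilation(image, se):
        """max over the window of (pixel + structuring-element value)."""
        sh, sw = se.shape
        out_h, out_w = image.shape[0] - sh + 1, image.shape[1] - sw + 1
        out = np.full((out_h, out_w), -np.inf)
        for i in range(sh):
            for j in range(sw):
                out = np.maximum(out, image[i:i + out_h, j:j + out_w] + se[i, j])
        return out

    img = np.random.rand(16, 16)
    se = np.zeros((3, 3))                          # a flat structuring element
    features = np.stack([gray_erosion(img, se), gray_dilation(img, se)])
    print(features.shape)                          # two 14x14 feature maps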

Associative memory models and algorithms
Many researchers have studied associative memories based on the concept of an energy function. The use of an energy function relies on the assumption of symmetric connection weights, which has been recognized as a main drawback of such associative memories. He defined a new concept, the "supporting function," to replace the energy function. Based on the supporting function, he proposed general models for auto-associative and bidirectional associative memories, formulated the information retrieval process as a dynamical system, and explored its stability and attraction conditions. He has also developed better learning algorithms based on Rosenblatt's perceptron rule and demonstrated their effectiveness through experiments. Furthermore, the learning algorithms can adaptively control the shape of the basin of attraction around each attractor to better learn the variants of each attractor.
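
A minimal sketch of an auto-associative memory trained with a perceptron-style rule follows (Python with numpy); it illustrates only the asymmetric-weight setting and the retrieval dynamics, not his supporting-function models or the basin-shaping mechanism.

    # Each unit's incoming weights are adjusted until every stored bipolar
    # pattern is a fixed point of x <- sign(Wx).  Weights need not be symmetric.
    # Illustration only.  Assumes numpy.
    import numpy as np

    def train(patterns, epochs=100, lr=0.1):
        """patterns: (p, n) array of +/-1 vectors."""
        p, n = patterns.shape
        W = np.zeros((n, n))
        for _ in range(epochs):
            errors = 0
            for x in patterns:
                y = np.sign(W @ x)
                for i in range(n):
                    if y[i] != x[i]:               # perceptron-style correction
                        W[i] += lr * x[i] * x
                        errors += 1
            if errors == 0:
                break
        return W

    def recall(W, probe, steps=20):
        x = probe.copy()
        for _ in range(steps):
            x = np.where(W @ x >= 0, 1, -1)
        return x

    patterns = np.array([[1, -1, 1, -1, 1, -1],
                         [1, 1, 1, -1, -1, -1]])
    W = train(patterns)
    noisy = patterns[0].copy()
    noisy[0] = -1                                  # flip one bit
    print(recall(W, noisy))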

Wireless Sensor Networks

Remarkable advances in computing, wireless communication, and sensing/actuation, together with the miniaturization of low-power computing, communication, and sensing/actuating devices, are enabling a significant change in how systems can sense and manipulate their physical environment. Research on such distributed systems of wireless sensors, referred to as wireless sensor networks, is emerging. He has started research on several important issues, including localization of sensor nodes, topology control to reduce energy consumption, and data routing in wireless sensor networks.

Development of Distributed Computing and Sensor Networks Laboratory
He has co-developed the Distributed Computing and Sensor Networks Laboratory to support research in the emerging area of wireless sensor networks. The laboratory was set up with equipment, including Berkeley Motes purchased through an NSF grant, that enables faculty, post-docs, and graduate students to conduct research on the key technical issues in utilizing and deploying wireless sensor networks in real-world applications.

Sensor network localization
Localizing sensor nodes is an essential process for self-organizing wireless sensor networks, yet the behavior of localization has not been thoroughly studied. He formulated quantized received signal strength indicator (RSSI) based localization as a parameter estimation problem and derived the Cramer-Rao lower bound for this localization problem. He has studied the effects of the quantization level and of network configuration parameters on the lower bound of the localization error variance to better understand the relationship between network configuration and localization accuracy. He has also designed several sensor network localization algorithms that achieve better localization accuracy than previous algorithms.
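
For the simpler unquantized case, the Cramer-Rao bound has a well-known closed form under a log-distance path-loss model with Gaussian shadowing; the sketch below (Python with numpy, with illustrative parameter values) computes that bound and does not reproduce the quantized-RSSI derivation described above.

    # CRLB for unquantized RSSI localization under a log-distance path-loss
    # model with Gaussian shadowing.  Parameter values are illustrative.
    # Assumes numpy.
    import numpy as np

    def rssi_crlb(node, anchors, path_loss_exp=3.0, sigma_db=4.0):
        """Lower bound on E[||p_hat - p||^2] for a node heard by `anchors`."""
        diffs = node - anchors                     # (m, 2) offsets to each anchor
        d2 = np.sum(diffs ** 2, axis=1)            # squared distances
        b = (10.0 * path_loss_exp / (sigma_db * np.log(10.0))) ** 2
        fim = b * (diffs.T / d2 ** 2) @ diffs      # 2x2 Fisher information matrix
        return np.trace(np.linalg.inv(fim))

    anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
    print(rssi_crlb(np.array([3.0, 4.0]), anchors))   # bound in squared meters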

Topology control in sensor networks
The topology control problem in wireless sensor networks is to control the wireless transmission device of each sensor node so that total network energy consumption is reduced while network connectivity is maintained at some level. The topology control approaches used in most existing work fall into two categories, active-subnetwork and short-hops, and each performs best under different network conditions. He has developed a topology control algorithm that integrates the two approaches; the new algorithm saves more energy than both the active-subnetwork and short-hops algorithms for networks across a wide range of node densities.
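
As a point of reference only, the sketch below (Python with numpy) shows a textbook topology-control baseline that keeps just the links of a minimum spanning tree over a power-cost graph; it preserves connectivity while reducing per-link transmission power, but it is not the integrated active-subnetwork/short-hops algorithm described above.

    # Textbook baseline: keep only minimum-spanning-tree links over a power-cost
    # graph (cost ~ distance^alpha).  Illustration only.  Assumes numpy.
    import numpy as np

    def mst_topology(positions, alpha=2.0):
        """Prim's algorithm over all node pairs; returns the kept links."""
        n = len(positions)
        d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
        cost = d ** alpha
        in_tree, links = {0}, []
        while len(in_tree) < n:
            best = None
            for u in in_tree:
                for v in range(n):
                    if v not in in_tree and (best is None or cost[u, v] < cost[best]):
                        best = (u, v)
            links.append(best)
            in_tree.add(best[1])
        return links

    nodes = np.random.rand(8, 2) * 100.0            # sensor positions in a 100x100 field
    for u, v in mst_topology(nodes):
        print(f"keep link {u} <-> {v}")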