Date of Award

Spring 2006

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Electrical & Computer Engineering

Program/Concentration

Computer Engineering

Committee Director

Vijayan Asari

Committee Member

James F. Leathrum, Jr.

Committee Member

Lee A. Belfore II

Call Number for Print

Special Collections LD4331.E55 L57 2006

Abstract

Video stream enhancement is a key component in a wide variety of applications, from video surveillance, automatic navigation, and medical imagery to facial/object recognition systems. When a video stream contains non-uniform lighting, it can be difficult to recover what is in the darker regions without over-enhancing brighter regions. The Adaptive and Integrated Neighborhood Dependent Approach for Nonlinear Enhancement (AINDANE) algorithm combines a tunable nonlinear transfer function, convolution by a multi-scale Gaussian kernel, and tunable contrast enhancement to address this problem for a single image. Luminance values are tuned based on the global cumulative distribution function (CDF) of an image. Contrast enhancements are tuned by applying an exponential weighting based on the global standard deviation. Due to the data-driven nature of this algorithm, standard processors lack the computational power to process 30 frames per second.

A highly parallel and pipelined architecture is designed for real-time performance of the nonlinear enhancement system. First, the AINDANE algorithm is modified to make it more suitable for a hardware architecture. These modifications include a shift into the logarithmic domain to eliminate exponentiation, multiplications, and divisions. Simple denominators are normalized to powers of two, allowing division to be performed through bus shifting and addition where applicable. The AINDANE algorithm is partitioned into independent functional modules, each performing one specific function. Each of these functions is further broken down to expose as much data-level parallelism as possible. Finally, all blocks are assembled in parallel where possible. Delays are added to ensure that all inputs to each block are properly aligned to yield a correct result.
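
The two arithmetic simplifications mentioned above can be illustrated with a minimal software sketch. It is not the thesis's hardware design: the function names are hypothetical, floating-point `math.log2` stands in for whatever fixed-point logarithm approximation the architecture uses, and inputs are assumed to be positive. It only shows why moving to the logarithmic domain turns multiplication into addition, division into subtraction, and exponentiation into a single multiplication, and why a power-of-two denominator reduces division to a shift.

```python
import math


def log_domain_mul(a, b):
    """a * b computed as an addition of base-2 logarithms (a, b > 0)."""
    return 2.0 ** (math.log2(a) + math.log2(b))


def log_domain_div(a, b):
    """a / b computed as a subtraction of base-2 logarithms (a, b > 0)."""
    return 2.0 ** (math.log2(a) - math.log2(b))


def log_domain_pow(a, p):
    """a ** p computed as one multiplication in the log domain (a > 0)."""
    return 2.0 ** (p * math.log2(a))


def shift_div(x, k):
    """Integer division of x by 2**k using a right shift (x >= 0)."""
    return x >> k
```

In hardware, the right shift in `shift_div` costs no logic at all, since it amounts to wiring the bus with an offset, which is presumably what makes normalizing denominators to powers of two attractive.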

The resulting architecture is implemented on the Virtex2 2v3000bf957-4 Xilinx FPGA, consuming 57% of the available slices and 37% of the available flip-flops. A clock rate of 60.761 MHz is achieved. The overall delay of the pipelined architecture is 7W+63 cycles, where W is the width of the image. This latency is due to the delay associated with the neighborhood dependence of the convolution component. Since the design is pipelined, a new frame is produced every T cycles, where T is the frame size. This achieves a frame rate of around 60 frames per second, or 60 Mpixels per second, for a 1024 x 1024 pixel frame. This result is almost double the goal frame rate for real-time applications. Further research is focused on the design of a completely multiplier-less architecture employing log-domain computations.
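
The reported figures can be checked with a short calculation. This sketch uses only the numbers quoted in the abstract (the 60.761 MHz clock, the 7W+63 latency, and one frame every T cycles) and assumes T equals the pixel count W x H, that is, one pixel per clock cycle. The function names are illustrative.

```python
CLOCK_HZ = 60.761e6  # clock rate reported in the abstract


def pipeline_latency_cycles(width):
    """Initial pipeline delay in clock cycles: 7W + 63."""
    return 7 * width + 63


def frames_per_second(clock_hz, width, height):
    """Throughput, assuming one pixel per cycle so a frame takes W*H cycles."""
    return clock_hz / (width * height)


latency = pipeline_latency_cycles(1024)           # cycles before first output
latency_us = latency / CLOCK_HZ * 1e6             # same delay in microseconds
fps = frames_per_second(CLOCK_HZ, 1024, 1024)     # sustained frame rate
```

Under these assumptions, a 1024 x 1024 frame gives a latency of 7231 cycles (about 119 microseconds) and a sustained rate of roughly 58 frames per second, consistent with the "around 60" stated above.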

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

DOI

10.25777/r5br-2255
