Wednesday, May 7, 2014

by Brijender Bharti

Download Article and Source Code

Download IIR Gaussian Blur Filter Implementation using Intel Advanced Vector Extensions [PDF 513KB]
Download source: gaussian_blur.cpp [36KB]

Introduction

This white paper proposes an implementation of the Infinite Impulse Response (IIR) Gaussian blur filter [1] [2] [3] using Intel Advanced Vector Extensions (Intel AVX) instructions.

IIR Gaussian filter

The Gaussian filter is widely used in image processing for noise reduction, blurring, and edge detection. It is a low-pass filter that attenuates high-frequency noise in the image. The one-dimensional Gaussian function is defined as:

$$g(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-x^2 / (2\sigma^2)}$$

where $\sigma$ is the standard deviation of the Gaussian distribution. The Fourier transform of the Gaussian function is also a Gaussian:

$$G(\omega) = e^{-\sigma^2 \omega^2 / 2}$$
The two-dimensional Gaussian function is defined as:

$$g(x, y) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2) / (2\sigma^2)}$$

The Gaussian filter is very compute-intensive, because the number of operations per output pixel grows proportionally with $\sigma$. However, the IIR Gaussian filter and its derivatives [1][2][3] recursively solve a difference equation that is independent of $\sigma$, so the number of operations per output pixel is fixed and does not depend on $\sigma$. The equation used in this white paper is:

[Equation (1) is not reproduced in this copy.]

The IIR Gaussian filter processes each pixel horizontally and vertically. It is a separable filter, which means the two passes can be applied in either order, i.e., horizontally first or vertically first.
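To make the fixed cost per sample concrete, here is a minimal sketch of a recursive smoother on a one-dimensional signal. It is not equation (1) from the white paper: it uses a first-order stand-in whose impulse response is k * a^|n| rather than a true Gaussian approximation. The function name `recursive_smooth`, the parameter `a`, and the constant-extension boundary handling are all assumptions made for illustration. What it does share with the filter described above is the structure: a left-to-right pass, a right-to-left pass, and a sum of the two.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from the white paper: a first-order
// symmetric recursive smoother with impulse response k * a^|n|, 0 <= a < 1.
// A larger `a` blurs more, yet the work per sample stays the same.
std::vector<float> recursive_smooth(const std::vector<float>& x, float a) {
    const std::size_t n = x.size();
    std::vector<float> causal(n), anti(n), y(n);
    if (n == 0) return y;
    const float k = (1.0f - a) / (1.0f + a);  // normalizes the DC gain to 1

    // Left-to-right (causal) pass: c[i] = x[i] + a * c[i-1].
    // The initial state assumes the signal continues to the left as x[0].
    float prev = x[0] / (1.0f - a);
    for (std::size_t i = 0; i < n; ++i) {
        prev = x[i] + a * prev;
        causal[i] = prev;
    }

    // Right-to-left (anticausal) pass: d[i] = a * (x[i+1] + d[i+1]).
    // The initial state assumes the signal continues to the right as x[n-1].
    float next_x = x[n - 1];
    float next = a * x[n - 1] / (1.0f - a);
    for (std::size_t i = n; i-- > 0;) {
        next = a * (next_x + next);
        anti[i] = next;
        next_x = x[i];
    }

    // The two directional outputs are added, then scaled.
    for (std::size_t i = 0; i < n; ++i) y[i] = k * (causal[i] + anti[i]);
    return y;
}
```

Calling `recursive_smooth(row, 0.5f)` and `recursive_smooth(row, 0.9f)` performs the same number of multiplications and additions per sample, whereas a direct convolution would need a wider kernel for the stronger blur.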
Intel Advanced Vector Extensions

Intel Advanced Vector Extensions (Intel AVX) extends the capabilities of the Intel Streaming SIMD Extensions (Intel SSE) instruction set and can substantially increase the performance of software applications through its new functionality. Intel AVX supports the following:

- 256-bit wide SIMD registers. These allow processing eight single-precision or four double-precision floating-point (FP) elements in parallel, compared to four single-precision or two double-precision FP elements in Intel SSE. This means increased compute performance and greater energy efficiency.
- 256-bit wide AVX loads, which allow loading and processing 256 bits of data at a time.
- An efficient instruction encoding scheme that supports three- and four-operand instruction syntax. Most legacy 128-bit SIMD instructions are also enhanced to support the new instruction encoding.
- Efficient, compact, and better code generation with the three- and four-operand instruction syntax.
- Numerous new instructions, e.g., broadcast and permute, to manage and rearrange data.

Intel AVX is best suited for FP-intensive computation in image processing, video processing, audio processing, scientific applications, and financial applications. The IIR Gaussian blur filter is a compute-intensive filter. The floating-point implementation of this filter produces a high-quality blurred image, which makes Intel AVX a good candidate for implementing this filter with the best quality and performance.

IIR Gaussian Blur Implementation using Intel AVX instructions

The IIR Gaussian blur filter applies equation (1) to each pixel through two sequential passes:

- The horizontal pass: This pass processes the input image left-to-right (row-wise), then right-to-left. The output of the left-to-right pass is added to the output of the right-to-left pass.
- The vertical pass: Usually, the vertical pass processes the output from the horizontal pass top-to-bottom (column-wise), and then bottom-to-top.
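The eight-lane single-precision processing described above can be sketched as one step of a recursion advanced for eight independent rows at once. This is an illustration under stated assumptions, not code from gaussian_blur.cpp: the helper name `recursion_step8`, the first-order update `y = a0 * x + b1 * y_prev`, and the mapping of one row per lane are all hypothetical. The intrinsic path is only compiled when AVX is enabled (for example with `-mavx` on GCC or Clang); otherwise a scalar loop computes the same values.

```cpp
#include <cassert>
#include <cmath>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper: advance y = a0 * x + b1 * y_prev by one pixel for
// eight independent rows. Each array holds one float per row (per lane).
void recursion_step8(const float* x, const float* y_prev, float* y,
                     float a0, float b1) {
#if defined(__AVX__)
    const __m256 va0 = _mm256_set1_ps(a0);       // broadcast a0 to all lanes
    const __m256 vb1 = _mm256_set1_ps(b1);       // broadcast b1 to all lanes
    const __m256 vx = _mm256_loadu_ps(x);        // current pixel of 8 rows
    const __m256 vy = _mm256_loadu_ps(y_prev);   // previous output of 8 rows
    const __m256 r = _mm256_add_ps(_mm256_mul_ps(va0, vx),
                                   _mm256_mul_ps(vb1, vy));
    _mm256_storeu_ps(y, r);
#else
    // Scalar fallback with the same arithmetic, one lane at a time.
    for (int lane = 0; lane < 8; ++lane)
        y[lane] = a0 * x[lane] + b1 * y_prev[lane];
#endif
}
```

Because the recursion along a row is inherently sequential, the parallelism in a sketch like this comes from handling several rows per instruction rather than several pixels of the same row.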
Accessing the input column-wise leads to a lot of cache misses and hurts the performance of the filter. To avoid this, the horizontal pass transposes its output before writing it to the output buffer. This makes the vertical pass similar to the horizontal pass: it processes the intermediate output left-to-right, then right-to-left. The vertical pass again transposes the final output before writing the blurred image.

The proposed implementation assumes that the input is 24-bit RGB packed (each color channel is represented by an 8-bit integer). For simplicity, the filter takes a symmetric (square) image as input (height == width, e.g., 1024x1024). Because the filter processes multiple rows together and the input image is square, the chance of bank conflicts increases. To avoid these bank conflicts, the filter pads each row with two cache lines. However, the filter does not process these extra
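The transpose-on-write idea and the row padding can be sketched as follows. This is a simplified illustration, not the white paper's code: it uses a single float channel instead of packed 24-bit RGB, the names `PaddedImage`, `pass_transposed`, and `kPadFloats` are hypothetical, and the padding of 32 floats assumes 64-byte cache lines. The padding is part of the row stride only and is never read or written by the pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Padding of two cache lines per row, assuming 64-byte lines (32 floats).
constexpr std::size_t kPadFloats = 32;

// Hypothetical layout: single-channel float image with padded rows.
struct PaddedImage {
    std::size_t width, height, stride;
    std::vector<float> data;
    PaddedImage(std::size_t w, std::size_t h, std::size_t pad)
        : width(w), height(h), stride(w + pad), data((w + pad) * h, 0.0f) {}
    float& at(std::size_t row, std::size_t col) { return data[row * stride + col]; }
    float at(std::size_t row, std::size_t col) const { return data[row * stride + col]; }
};

// Apply a row-wise operation to every row and write the result transposed,
// so that the next pass can again walk memory row by row.
template <typename RowOp>
PaddedImage pass_transposed(const PaddedImage& in, RowOp op) {
    PaddedImage out(in.height, in.width, kPadFloats);  // dimensions swapped
    std::vector<float> row(in.width);
    for (std::size_t r = 0; r < in.height; ++r) {
        for (std::size_t c = 0; c < in.width; ++c) row[c] = in.at(r, c);
        op(row);  // e.g. a left-to-right plus right-to-left recursive filter
        for (std::size_t c = 0; c < in.width; ++c) out.at(c, r) = row[c];
    }
    return out;
}
```

Calling `pass_transposed` twice with the same row operation filters rows, then columns, and returns the image in its original orientation. A tuned implementation would more likely transpose small blocks in registers than write one element per column as this sketch does.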
