Sobel Edge Detector Using an FPGA

Relevance of the Project

Edge detection is a complex process whose results deteriorate with the level of noise in the image. A number of operators have been defined to address it; each is effective for certain classes of images but unsuitable for others. Edge detection is a crucial step in digital image processing, with applications in artificial intelligence systems, forensic science and digital multimedia, where it is used to create striking image effects. Image processing algorithms have so far been implemented mostly in software, which is slow because of limited processor speed. Dedicated edge detection hardware is therefore desirable, and it became practical only with advances in VLSI technology.
Edge detection becomes more complicated when more elaborate detection masks are used, and it takes longer on very high resolution images. Most hardware implementations are faster than their software counterparts, so implementing edge detection in hardware is more efficient. Because an FPGA offers parallelism, edge detection can be implemented on it effectively.
In recent years, field programmable gate arrays (FPGAs) have become the dominant form of programmable logic. Compared with earlier programmable devices such as programmable array logic (PAL) and complex programmable logic devices (CPLDs), FPGAs can implement far larger logic functions, enough for complete systems and sub-systems. They give designers reconfigurable logic that can be reprogrammed for each application, which greatly increases flexibility in the design process.
Applying general edge detection masks to a group of 9 pixels takes 12 multiplications. A 3 x 3 window fits at (n-2)^2 positions in an image of size n x n, so processing the whole image takes 12 x (n-2)^2 multiplications. With the Sobel edge detection operator this drops to 2 x (n-2)^2, about 6 times fewer, because most of its non-zero mask values have magnitude 1 and need no multiplication at all. The remaining multiplication by 2 can be implemented with a shift operation instead of a multiplier.
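The saving described above can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the 512 x 512 image size are assumptions made for the example, not part of the project code.

```python
def multiplications(n, per_window):
    """Multiplications for an n x n image when each 3x3 window costs per_window."""
    return per_window * (n - 2) ** 2


def weighted_sum_with_shift(outer_a, centre, outer_b):
    """1-2-1 weighted sum where the multiply-by-2 is done as a left shift."""
    return outer_a + (centre << 1) + outer_b


n = 512  # assumed image size, for illustration only
general = multiplications(n, 12)  # general masks: 12 multiplications per window
sobel = multiplications(n, 2)     # Sobel: 2 multiplications (by 2) per window
print(general, sobel, general // sobel)
```

In hardware the left shift costs no logic beyond wiring, which is why the multiplier can be dropped altogether.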
The Sobel edge detection operator was also chosen because it combines edge detection with smoothing, which gives it good edge detection capability in noisy conditions. Its output degrades less at high noise levels.

System Overview

The project consists of manipulating images in order to apply the Sobel edge detection algorithm. Images in selected formats such as JPEG, PNG and BMP are converted to raw image data by a C program. Each pixel is represented by 8 bits (values from 0 to 255). Once the computer has extracted the image data, it sends the data to the FPGA, which performs the mathematical computations. The computer and the FPGA communicate via the parallel port, operating in bidirectional mode, and appropriate handshake signals are used to eliminate errors in data transfer. Because the parallel port operates at TTL logic levels while the FPGA uses LVCMOS (3.3 V), level translation is required; a 74LS641, an octal bidirectional buffer with open-collector I/O, is used for this purpose. The Sobel operator is applied to the block of pixels received in the memory, and the processed data is transferred back to the computer over the parallel port. The computer then uses the received data to reconstruct and display the edge-detected image.

Sobel Instance Architecture

P0, P1, P2, P3, P4, P6, P7 and P8 represent the eight 8-bit pixel inputs to the Sobel module. The module consists of signed subtractors, shift registers and modulus (absolute value) operators. The output of the final adder block is 11 bits wide: 10 bits for the data, since the maximum value of the adder output is 4*255, and an 11th bit for the sign. The output data is compared against a limit and clamped to a maximum of 255, because the output image is also composed of 8-bit pixels. 32 Sobel modules are used in parallel. The number of parallel Sobel operators that can be implemented is limited by the logic resources available in the target device. The Sobel output for one group of pixels is calculated as |Gx| + |Gy|, where Gx and Gy are calculated from the Sobel formula. The summary of the Sobel module, showing its input and output buses, is shown above.
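A software model of one Sobel module could look like the sketch below. It assumes the standard Sobel mask orientation, since the formula referenced in the text is not reproduced here. The function name and the row-by-row window layout are also assumptions and do not follow the P0 to P8 port naming of the hardware module.

```python
def sobel_window(window):
    """Return |Gx| + |Gy| for a 3x3 window of 8-bit pixels, clamped to 255.

    window is three rows of three pixel values; the centre pixel is unused.
    """
    (a, b, c), (d, _, f), (g, h, i) = window
    # Subtract opposite pixels first, then double the middle difference by shifting.
    gx = (c - a) + ((f - d) << 1) + (i - g)
    gy = (g - a) + ((h - b) << 1) + (i - c)
    # The output image uses 8-bit pixels, so the sum is limited to 255.
    return min(abs(gx) + abs(gy), 255)


flat = [[10, 10, 10]] * 3
weak_edge = [[10, 10, 12]] * 3
strong_edge = [[0, 0, 255]] * 3
print(sobel_window(flat), sobel_window(weak_edge), sobel_window(strong_edge))
```

For the strong edge, Gx reaches 4*255 = 1020 before clamping, which matches the 10 data bits quoted for the adder output.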

FPGA Statistics

A single Sobel operator consumes 149 four-input lookup tables (LUTs), which is 2% of the available FPGA resources.

SOBEL Project Status

Project File:     sobel.ise
Module Name:      sobel
Target Device:    xc3s400-4tq144
Product Version:  ISE 8.1i
Current State:    Synthesized
Errors:           No Errors
Warnings:         4 Warnings (0 new, 0 filtered)
Updated:          Sat Feb 28 16:13:45 2009

Device Utilization Summary (estimated values)

Logic Utilization            Used   Available   Utilization
Number of Slices               83        3584            2%
Number of Slice Flip Flops     30        7168            0%
Number of 4 input LUTs        149        7168            2%
Number of bonded IOBs          74          97           76%
Number of GCLKs                 1           8           12%

Detailed Reports

Report Name              Status    Generated                  Errors   Warnings                         Infos
Synthesis Report         Current   Fri Feb 27 20:56:01 2009   0        4 Warnings (0 new, 0 filtered)   0
Translation Report
Map Report
Place and Route Report
Static Timing Report
Bitgen Report

Sources: sobelcode.v

Edge Detection Hardware Architecture

The block diagram shows the overall I/O ports of the FPGA-based edge detection system.

The bus_rw signal controls the direction of data transfer, while the data_strobe and mode_strobe signals control the entire data transfer operation. The clk signal is the internal clock of the FPGA. The bus(7:0) is an 8-bit bidirectional data bus. Internally, the system consists of RAM modules wired to 32 Sobel modules. There are 3 sets of 34-byte RAM arrays, which can be loaded serially and shifted in parallel. The 3 RAM arrays are wired to the 32 Sobel instances.

Device Utilization Summary

Device Utilization Summary (estimated values)

Logic Utilization            Used   Available   Utilization
Number of Slices             2890        3584           80%
Number of Slice Flip Flops    845        7168           11%
Number of 4 input LUTs       4989        7168           69%
Number of bonded IOBs          22          97           22%
Number of GCLKs                 1           8           12%

Detailed Reports

Report Name              Status    Generated                  Errors   Warnings                         Infos
Synthesis Report         Current   Wed Apr 29 10:49:48 2009   0        3 Warnings (3 new, 0 filtered)   5 Infos (5 new, 0 filtered)
Translation Report
Map Report
Place and Route Report
Static Timing Report
Bitgen Report
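
As a rough cross-check of the two synthesis reports, the LUT counts can be compared in a few lines of Python. This is only back-of-the-envelope arithmetic on the reported figures. Synthesis estimates do not necessarily add up linearly across instances, so the split between Sobel logic and the rest of the design is an approximation, and the variable names are chosen for the example.

```python
LUTS_PER_SOBEL = 149     # from the single-operator synthesis report
LUTS_FULL_DESIGN = 4989  # from the full-system synthesis report
LUTS_AVAILABLE = 7168    # 4-input LUTs available in the XC3S400
INSTANCES = 32

sobel_luts = INSTANCES * LUTS_PER_SOBEL      # LUTs attributable to the 32 operators
other_luts = LUTS_FULL_DESIGN - sobel_luts   # roughly what is left for RAM and interface logic
sobel_share = 100 * sobel_luts / LUTS_AVAILABLE

print(sobel_luts, other_luts, round(sobel_share, 1))
```

Under this estimate the 32 operators account for most of the LUTs used, which is consistent with logic resources being the limit on the number of parallel operators.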

System Architecture

The data received from the computer is stored in three 8 x 34 RAM arrays (34 bytes each). To exploit the parallel processing capability of the FPGA, 32 Sobel operators are implemented in parallel on the FPGA. The outputs of the Sobel operators are multiplexed and transferred to the computer.
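
The relation between the 34-byte rows and the 32 operators can be illustrated with a short Python sketch: a 3 x 3 window fits at 34 - 2 = 32 column positions, so each operator can be given one window. The function name and the test data are assumptions made for the example, and the Sobel computation itself is left out.

```python
ROW_WIDTH = 34
NUM_OPERATORS = ROW_WIDTH - 2  # 32 window positions across a 34-pixel row


def windows(top, middle, bottom):
    """Split three 34-pixel rows into the 32 overlapping 3x3 windows."""
    for row in (top, middle, bottom):
        if len(row) != ROW_WIDTH:
            raise ValueError("each row must hold exactly 34 pixels")
    return [
        (top[k:k + 3], middle[k:k + 3], bottom[k:k + 3])
        for k in range(NUM_OPERATORS)
    ]


# Illustrative data: each pixel value encodes its row and column.
rows = [list(range(r * 100, r * 100 + ROW_WIDTH)) for r in range(3)]
all_windows = windows(*rows)
print(len(all_windows))       # number of windows, one per operator
print(all_windows[0][0])      # top row of the first window
print(all_windows[31][0])     # top row of the last window
```

In the FPGA all 32 windows are evaluated at the same time; the list here only shows which pixels each operator would see.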

RAM Modules

Three RAM modules, each holding 34 bytes and together forming a 3 x 34 array, store the data from the computer. The contents of the RAM modules are reset before computation starts. When the control signals ‘ready’ and ‘read_write’ are both asserted LOW, the RAM module in the third row is first loaded with 34 bytes of pixel values transferred from the computer via the parallel port. At the end of the first data transfer, the contents of the RAM modules are as follows:

00 00 00 00 00 …………… 00 00
00 00 00 00 00 …………… 00 00
A1 A2 A3 A4 A5 …………… A33 A34

Next, ‘mode_strobe’ is asserted LOW after a value of 0110_0110 has been placed on the data bus. All rows are then shifted up by one, and the third row is filled with the next set of data values. The RAM modules are then as shown in figure 5.3:

00 00 00 00 00 …………… 00 00
A1 A2 A3 A4 A5 …………… A33 A34
B1 B2 B3 B4 B5 …………… B33 B34

Similarly, after the third data transfer the RAM is as shown in figure 5.4:

A1 A2 A3 A4 A5 …………… A33 A34
B1 B2 B3 B4 B5 …………… B33 B34
C1 C2 C3 C4 C5 …………… C33 C34

Edge detection starts once the entire RAM is ready. Three consecutive elements of each row form the input to each of the 32 Sobel modules. Once the entire RAM holds valid data, only a single shift is needed to obtain the next set of valid data.
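
The loading sequence above can be modelled in software as a three-row buffer. The sketch below is a behavioural illustration only: the class and method names are assumptions, and the parallel-port handshake is not modelled. It shifts before every load, including the first; because the RAM starts out cleared, this gives the same contents as loading the third row directly.

```python
ROW_WIDTH = 34


class RowBuffer:
    """Behavioural model of three 34-byte rows that shift upwards."""

    def __init__(self):
        # The RAM contents are reset before computation starts.
        self.rows = [[0] * ROW_WIDTH for _ in range(3)]
        self.loaded = 0

    def load(self, new_row):
        """Shift all rows up by one and place new_row in the third row."""
        if len(new_row) != ROW_WIDTH:
            raise ValueError("a row must hold exactly 34 pixels")
        self.rows = [self.rows[1], self.rows[2], list(new_row)]
        self.loaded += 1

    def ready(self):
        """True once all three rows hold valid data."""
        return self.loaded >= 3


buf = RowBuffer()
for value in (1, 2, 3):  # stand-ins for image rows A, B and C
    buf.load([value] * ROW_WIDTH)
    print(buf.ready(), [row[0] for row in buf.rows])
```

After the third load every further call to load yields a new valid set of three rows, which corresponds to the single shift per step described above.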

Hardware Target Specifications: Xilinx Spartan 3 XC3S400

The target device is the Spartan-3 XC3S400 (TQ144 package, speed grade -4).

Summary of Spartan-3 FPGA Attributes

Device    System  Equivalent   Rows  Columns  Total  Total   Distributed  Block     Dedicated    DCMs  Maximum   Maximum
          Gates   Logic Cells                 CLBs   Slices  RAM Bits     RAM Bits  Multipliers        User I/O  Differential I/O Pairs
XC3S50    50K     1,728        16    12       192    768     12K          72K       4            2     124       56
XC3S200   200K    4,320        24    20       480    1,920   30K          216K      12           4     173       76
XC3S400   400K    8,064        32    28       896    3,584   56K          288K      16           4     264       116
XC3S1000  1000K   17,280       48    40       1,920  7,680   120K         432K      24           4     391       175
XC3S1500  1500K   29,952       64    52       3,328  13,312  208K         576K      32           4     487       221
XC3S2000  2000K   46,080       80    64       5,120  20,480  320K         720K      40           4     565       270
XC3S4000  4000K   62,208       96    72       6,912  27,648  432K         1,728K    96           4     633       300
XC3S5000  5000K   74,880       104   80       8,320  33,280  520K         1,872K    104          4     633       300
