

http://dx.doi.org/10.12785/ijcds/060205

# Performance Analysis of a High-Throughput LDPC Decoder Using Sum Product and Min Sum Algorithm

Sandeep Kakde<sup>1</sup> and Atish Khobragade<sup>2</sup>

<sup>1</sup> Electronics Engineering Department, Y C College of Engineering, Nagpur, India <sup>2</sup>Electronics Engineering Department, Rajiv Gandhi College of Engineering, Nagpur, India

Received 23 Jan. 2017, Revised 19 Feb. 2017, Accepted 26 Feb. 2017, Published 1 Mar. 2017

**Abstract:** This paper presents Verilog implementation of Low-Density Parity-Check (LDPC) decoders using Sum-Product and Min-Sum algorithms which will take more area as compared to other decoding algorithms. In this paper, area efficient LDPC decoder depending upon abridged complexity Minsum algorithm is presented. It reduces the permutational complexity with limiting the extrinsic information bit length to 4 bits and it modifies the check and variable node processing operation. Compilation at an algorithmic level explains that the proposed decoder attain good error performance as compared to a Sum Product Algorithm based decoder, and consequently handles the problem of immense error performance deprivation of a LDPC decoder. A Min Sum Based LDPC decoder with a matrix length (1000, 500) has been implemented in a MATLAB with a 10<sup>-1</sup> BER and the design is implemented in HDL Verilog. The complete top level module was done by structural modeling style and simulated with SPARTAN FPGA Family. The percentage saving in area is about 33% of slices and provides a throughput of 1.46Gbps.

Keywords: Message Passing Algorithm, Area, Slices, LDPC Decoder, Throughput.

# 1. INTRODUCTION

Standard communication systems such as wireless communication, memory require very fast forward error correcting schemes. The close to Shannon limit performance of LDPC codes [1] algorithms namely sum product [2], min-sum, and modified min-sum [3], has helped LDPC codes to be useful in various systems such as Wireless applications, Wi-Fi, 10Gbit Ethernet and others. However, due to exponential functions implicated in their algorithms, LDPC decoders are frequently ignored due to bulky storage requirement. Parallel architectures frequently result in large area overhead, whereas serial architectures are very sluggish for many applications. As far as memory requirements are concerned, rate of the code and block length is also significant. With the enhancements in deep ULSI procedures, different block length and rate flexible LDPC decoders have been presented [7-8].

This paper is organized as follows. Section II introduces the previous work done related to sum product and min sum algorithm. In section III, messagepassing algorithm is described. Section IV presents numerical results and analysis of this LDPC Decoder and summarizes the conclusion for area resources and compares the results with other decoder architectures. Conclusions are outlined in Section V.

# 2. PREVIOUS WORK

In new era, Low-Density Parity-Check has engrossed significant attention because of its performance closed to Shannon boundary. The lowdensity parity-check codes [1] can be decoded using various message passing methods namely soft and harddecision methods. The soft-decision based decoding algorithms, such as sum product (SP) algorithm and min-sum (MS) algorithm [2], can attain exceptional recital but frequently with intense computational complexity. As the speed of electronic systems increases, it became practical to adopt the computationally expensive but powerful LDPC codes in the fibre-optic communication systems [3-4].

# A. Decisions: Hard Versus Soft

The standard communication system consists of Source encoder, Channel and source decoder. The noise will appear because the channel is wired or wireless. The same system is as shown in figure 1. There are various iterative decoding algorithms for LDPC decoder.





Figure 1. The Standard Communication System

They are based on the decisions to be taken for decoding the techniques used. They are categorized into hard and soft decision decoding respectively. They are again categorized with respect to their numerical complexities to be decoded for diverse channel codes. In hard decision scheme, first the odd or even parity will be decided by checking errors in bits and then information bits are sending from variable to check nodes.And then If number of 1's received at check nodes satisfies the required parity, then it sends the same data back to message node, else it adjusts each bit in the received data stream to satisfy the required parity and then transmits the new message back to message nodes.Therefore, in every case, check node received the bits from variable node and done the calculation for the information bits and also checked whether the parity check equations are satisfied or not. The decoding method will continue till all the parity check equations are satisfied. The hard decision and soft decision is as shown in Figure 2. The message-passing algorithms are also known as iterative decoding algorithms as the messages pass forth and backward direction between the bit nodes and check nodes iteratively until a correct result is obtained. The other criterion is to give maximum iteration limit in an algorithm. If the algorithm execution reaches the given iteration limit then the operation will stop and shows the decoded codeword's. Depending on the frames given in the MATLAB program, execution will take more time.

For functional checking of the code, initially fewer frames can be given on order to reduce the execution time.



Figure 2. Soft Decision Vs Hard Decision

## 3. LDPC DECODING ALGORITHM

#### 3.1 Sum Product Algorithm in Log Domain

Message passing algorithms finds the heart in LDPC Decoder. These algorithms are using Tanner graphs to represent the cyclic graphs. The bits or information passing decoding algorithms used to decode LDPC codes are collectively termed message-passing algorithms since their operation can be solved giving a detailed description by the passing of messages in terms of bits all along the edges of a Tanner graph. Each Tanner graph node works in segregation, only having access to the information contained in the messages on the edges connected to it.

$$H = \begin{bmatrix} 110100\\ 101010\\ 011001\\ 000111 \end{bmatrix}$$

Figure 3 shows the Tanner Graph for H Matrix. Various message-passing algorithms are named for the type of information's conceded at the nodes. Each algorithm has its own speciality.



Figure 3.Tanner Graph for H Matrix

It is very expedient to signify probability values as maximum log likelihood ratios and when this is done through BP decoding is often called sum-product decoding algorithm since the use of log likelihood ratios allows the calculations at the bit and heck nodes to be computed using sum and product equations.

Each variable node updates its decision depending upon the output and the extrinsic information received from all the neighbouring check nodes. In order to get the correct codeword, finally an intrinsic and extrinsic information codeword is added. The whole mathematical analysis is explained as shown in next section. Flowchart of the Algorithm is as shown in Figure 4.



Figure 4. Flow chart of Algorithm

91

# **3.2.** Mathematical Analysis of Sum Product Algorithm in Log Domain:

Let the Transmitted Codeword is  $C = [0 \ 0 \ 1 \ 0 \ 1 \ 1]$ .

This codeword is transmitted over noisy channel and the received codeword is  $Y = [1 \ 0 \ 1 \ 0 \ 1 \ 1]$ .

Step 1: Prior Probabilities for Binary Symmetric Channel are as follows.

For Binary Symmetric Channel, Let us consider p= 0.4, we have,

$$\log(\frac{p}{1-p}) = \log(\frac{0.4}{1-0.4}) = \log(\frac{0.4}{0.6}) = -1.0986 \text{ and}$$
$$\log(\frac{1-p}{p}) = \log(\frac{1-0.2}{0.4}) = \log(\frac{0.8}{0.4}) = +1.0986$$

Therefore, the Priori Log Likelihood Ratios are R=[-1.0986, 1.09863, -1.09863, 1.09863, -1.0986]

These are the intrinsic information probability values. These values are added to extrinsic information probability in order to get the corrected codeword.

Step 2: Horizontal Step: Calculation of M Matrix (Check Node Processing) From the below Table I, one can calculate the M Matrix that will be useful for calculation of Error Pattern Matrix.

TABLE I.CALCULATIONS OF M MATRIX

|        | М                       | From<br>R<br>Matrix |         | М                | From<br>R<br>Matrix |         |
|--------|-------------------------|---------------------|---------|------------------|---------------------|---------|
| Column | J                       |                     | Values  | j                |                     | Values  |
| Row    | Mj,i                    | Ri                  |         | Mj,i             | Ri                  |         |
| i      |                         |                     |         |                  |                     |         |
| 1      | M <sub>1,1</sub>        | R1                  | -1.0986 | M <sub>3,1</sub> | R1                  | -1.0986 |
| 2      | M <sub>1,2</sub>        | R2                  | +1.0986 | M <sub>2,2</sub> | R2                  | +1.0986 |
| 3      | M <sub>2,3</sub>        | R3                  | -1.0986 | M <sub>4,3</sub> | R3                  | -1.0986 |
| 4      | <b>M</b> <sub>1,4</sub> | R4                  | +1.0986 | $M_{4,4}$        | R4                  | +1.0986 |
| 5      | M <sub>2,5</sub>        | R5                  | -1.0986 | M <sub>3,5</sub> | R5                  | -1.0986 |
| 6      | M <sub>3,6</sub>        | R6                  | -1.0986 | M <sub>4,6</sub> | R6                  | -1.0986 |

$$M = \begin{bmatrix} M_{1,1} & M_{1,2} & M_{1,3} & M_{1,4} & M_{1,5} & M_{1,6} \\ M_{2,1} & M_{2,2} & M_{2,3} & M_{2,4} & M_{2,5} & M_{2,6} \\ M_{3,1} & M_{3,2} & M_{3,3} & M_{3,4} & M_{3,5} & M_{3,6} \\ M_{4,1} & M_{4,2} & M_{4,3} & M_{4,4} & M_{4,5} & M_{4,6} \end{bmatrix}$$
$$M = \begin{bmatrix} -1.0986 & +1.0986 & -- & +1.0986 & -- & -- \\ -- & +1.0986 & -1.0986 & -- & -1.0986 & -- \\ -1.0986 & -- & -- & -- & -1.0986 & -- \\ -- & -1.0986 & -- & -- & -- & -1.0986 & -- \\ -- & -- & -1.0986 & +1.0986 & -- & -1.0986 \end{bmatrix}$$

Step 3: Vertical Step: Calculation of Extrinsic Information Matrix- (Posteriori Information)

The prime logic for calculation of E matrix is that one has to exclude the error coefficient in that particular row. Calculation of error pattern matrix is as shown in table II.

TABLE II.LOGIC FOR CALCULATION OF ERROR PATTERN MATRIX

| <b>Calculation of Error Pattern Matrix</b> |   |   |   |   |  |  |
|--------------------------------------------|---|---|---|---|--|--|
| Columns J j j                              |   |   |   |   |  |  |
| Rows                                       | i |   |   |   |  |  |
| Check Task-1                               | 1 | 1 | 2 | 4 |  |  |
| Check Task-2                               | 2 | 2 | 3 | 5 |  |  |
| Check Task-3                               | 3 | 1 | 5 | 6 |  |  |
| Check Task-4                               | 4 | 3 | 4 | 6 |  |  |

$$E_{j,i} = \log(\frac{1 + \prod_{i' \in \beta_{j,i' \neq i}} \tanh(M_{j,i'} / 2)}{1 - \prod_{i' \in \beta_{j,i' \neq i}} \tanh(M_{j,i'} / 2)})$$
------ (2)  
$$E = \begin{bmatrix} 9.2098 & -9.2098 & --- & -9.2098 & --- & --- \\ 9.2098 & -9.2098 & --- & -9.2098 & --- \\ 9.2098 & --- & --- & 9.2098 & 9.2098 \\ --- & --- & -9.2098 & 9.2098 & --- & -9.2098 \end{bmatrix}$$

Step 4: Bit Node Processing

L = [Intrinsic Information + Extrinsic Information]

$$L_i = R_i + E_{j,i} \tag{3}$$

92

This is the last step of decoding process. In order to get the final corrected codeword, one has to add probabilities of Intrinsic Information and Extrinsic Information.

Calculation of L Matrix is as shown in Table III.

TABLE III.CALCULATION OF L MATRIX

|     | Li                | Ri+Ej,i                   |       | Values      | 5        | Zi      |
|-----|-------------------|---------------------------|-------|-------------|----------|---------|
| Ι   |                   |                           |       |             |          |         |
| 1   | L <sub>1</sub>    | $R_1 + E_{1,1} + E_{3,1}$ | -1.0  | 0986+9.2098 | 8+9.2098 | 17.321  |
| 2   | $L_2$             | $R_2 + E_{1,2} + E_{2,2}$ | +1.   | 0986-9.2098 | 8+9.2098 | 1.0986  |
| 3   | L <sub>3</sub>    | $R_3 + E_{2,3} + E_{4,3}$ | -1.   | 0986-9.2098 | 8-9.2098 | -19.518 |
| 4   | $L_4$             | $R_4 + E_{1,4} + E_{4,4}$ | +1.   | 0986-9.2098 | 8+9.2098 | 1.0986  |
| 5   | L <sub>5</sub>    | $R_5+E_{2,5}+E_{3,5}$     | -1.   | 0986-9.2098 | 8+9.2098 | -1.0986 |
| 6   | L <sub>6</sub>    | $R_{6}+E_{3,6}+E_{4,6}$   | -1.   | 0986+9.209  | 8-9.2098 | -1.0986 |
|     |                   |                           |       |             |          |         |
| Z = | : [L <sub>1</sub> | $L_2$                     | $L_3$ | $L_4$       | $L_5$    | $L_6$ ] |

Z = [17.321, 1.0986, -19.518, 1.0986, -1.0986, -1.0986]

 $Z = [0 \quad 0 \quad 1 \quad 0 \quad 1 \quad 1]$ 

#### 4. RESULTS AND ANALYSIS

This section presents Sum Product and Min Sum Algorithm based LDPC Decoder design results to demonstrate the effectiveness of the proposed decoder design. The coding of the Sum Product and Min Sum based LDPC Decoder design is done in MATLAB Version 14.1. The plot of Bit Error rate versus Eb/No is as shown in figure5 and 6.We demonstrate the performance of two algorithms with a block length of n = 724. The code rate is R = 1/2. The main objective of the paper is to increase the throughput of the decoder without degrading the error performance. With the use of Min Sum algorithm in partial-parallel configuration which accomplishes much savings in hardware utilization as compared to existing works.



Figure 5. Bit Error Rate Performance of the Algorithm for Code Lengths (1000, 500)



| Figure 6. | Bit Error Rate | Performance   | of the | Algorithm for Code | ; |
|-----------|----------------|---------------|--------|--------------------|---|
| -         | L              | engths (724,3 | 362)   | -                  |   |

| TABLE IV. FF | PGA DEVICE | UTILIZATION | SUMMARY |
|--------------|------------|-------------|---------|
|              |            |             |         |

| LDPC Decoder using Sum Product Algorithm     |                                     |      |           |             |  |
|----------------------------------------------|-------------------------------------|------|-----------|-------------|--|
| Family:- SPARTAN 3<br>Target Device:- XC3S50 |                                     |      |           |             |  |
| SN                                           | Designed Unit Sum-product Algorithm |      |           |             |  |
|                                              | Features                            | Used | Available | Utilization |  |
| 1                                            | No. of Slices                       | 42   | 768       | 5%          |  |
| 2                                            | 4 Input LUT's                       | 77   | 1536      | 5%          |  |
| 3                                            | No. of IOB's                        | 29   | 124       | 23%         |  |

#### TABLE V.FPGA DEVICE UTILIZATION SUMMARY

|                                              | LDPC Decoder using Min Sum Algorithm |                   |           |             |  |  |
|----------------------------------------------|--------------------------------------|-------------------|-----------|-------------|--|--|
| Family:- SPARTAN 3<br>Target Device:- XC3S50 |                                      |                   |           |             |  |  |
| SN                                           | Designed<br>Algorithm                | Min-Sum Algorithm |           |             |  |  |
|                                              | Features                             | Used              | Available | Utilization |  |  |
| 1                                            | No. of Slices                        | 28                | 768       | 3%          |  |  |
| 2                                            | 4 Input LUT's                        | 52                | 1536      | 3%          |  |  |
| 3                                            | No. of IOB's                         | 21                | 124       | 16%         |  |  |



# TABLE VI. COMPARISON OF SUM PRODUCT AND MIN SUM ALGORITHM BASED DECODER

|    | LDPC Decoder<br>Family:- SPARTAN 3<br>Target Device:- XC3S50 |          |          |  |  |  |
|----|--------------------------------------------------------------|----------|----------|--|--|--|
| SN | SN Features Sum Product Min Sum                              |          |          |  |  |  |
| 1  | Number of Slices                                             | 42       | 28       |  |  |  |
| 2  | 4 Input LUT's                                                | 77       | 52       |  |  |  |
| 3  | No. of IOB's                                                 | 29       | 21       |  |  |  |
| 4  | Throughput                                                   | 1.46Gbps | 1.14Gbps |  |  |  |

# TABLE VII. THROUGHPUT COMPARISON

| Throughput Comparison |          |          |           |  |
|-----------------------|----------|----------|-----------|--|
| Feature               | [11]     | [8]      | This Work |  |
| Throughput            | 410 Mbps | 522 Mbps | 1.14Gbps  |  |

It also provides adequate level of bit error performance. Figure 5 shows the Bit Error Rate Plot versus Signal to Noise Ratio for the code length (1000, 500). The code rate is 1/2. Figure 6 shows the Bit Error Rate Plot versus Signal to Noise Ratio for the code length (724, 362). Table IV and V shows the FPGA Device Utilization statistics for LDPC Decoder using Product and Min Sum Sum Algorithm respectively. Table VI shows the Comparison of Sum Product and Min Sum Algorithm based Decoder and Table VII shows the comparison of throughput with existing works.

## 5. CONCLUSION

In this paper, we have presented detailed mathematical analysis of LDPC decoder and studied different types of message passing algorithms. The LDPC are coded in BPSK system and is implemented in HDL Verilog. The performance of Min-Sum Algorithm (MSA) is near to Sum-Product algorithm (SPA) and easy to implement in Verilog. The Verilog implementation of decoder was premeditated on SPARTAN 3 and device XC3S50. Throughput performance of Sum Product Algorithm based Decoder and Min-Sum Algorithm based Decoder is analyzed.

## ACKNOWLEDGMENT

This work is carried out in the digital system design Lab of the Y C College of Engineering. Authors would like to thank all the concerned authorities and staff of the Y C College of Engineering for providing support and laboratory facilities.

#### References

- [1] R. G. Gallager, "Low density parity check codes," IRE Trans. Inform. Theory, vol. IT-8, pp. 21-28, Jan. 1962.
- [2] R. Tanner, A recursive approach to low complexity codes, Information Theory, IEEE Transactions on 27 (5) (1981) 533-547.
- [3] D. J. C. MacKay and R. M. Neal, "Near Shannon limit performance of low density parity check codes," Electron. Lett., vol. 32, no. 18, pp.16451646, 1996.
- [4] Jingyu Kang, Qin Huang, Shu Lin, and Khaled Abdel-Ghaffar, "An Iterative Decoding Algorithm with Backtracking to Lower the Error-Floors of LDPC Codes", IEEE Transactions On Communications, VOL. 59, NO. 1, 0090-6778/11© 2011 IEEE, JANUARY 2011.
- [5] Y. Han and W. E. Ryan, "Low-floor decoder for LDPC codes," IEEE Trans. Commun., vol. 57, no. 6, pp. 1663-1673, June 2009.
- [6] L. Liu and C. J. R. Shi, "Sliced message passing: High throughput overlapped decoding of high-rate low-density paritycheck codes," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 11, pp. 3697–3710, Dec. 2008.
- [7] Zhengya Zhang, Lara Dolecek, Borivoje Nikolić, Venkat Anantharam, and Martin Wainwright, "Investigation of Error Floors of Structured Low-Density Parity-Check Codes by Hardware Emulation", 1-4244-0357-X/06 © 2006 IEEE.
- [8] Sachin Singh Khati, Pooja Bisht, Subhash Chandra Pujari," Improved Decoder Design for LDPC Codes based on selective node processing", 2012 World Congress on Information and Communication Technologies,pp 413-419, 978-1-4673-4805-8/12 © 2012 IEEE.
- [9] Lara Dolecek, Zhengya Zhang, Venkat Anantharam, Martin J. Wainwright, Borivoje Nikolic', "Analysis of Absorbing Sets and Fully Absorbing Sets of Array-Based LDPC Codes", IEEE Transactions On Information Theory, Vol. 56, NO. 1, January 2010.
- [10] Kakde, Sandeep, and Atish Khobragade. "VLSI Implementation of a Rate Decoder for Structural LDPC Channel Codes." *Proceedia Computer Science* 79 (2016): 765-771.
- [11] M. Rovini, G. Gentile, F. Rossi, and L. Fanucci, "A scalable decoder architecture for IEEE 802.11n LDPC codes," in *Proc. IEEE GLOBECOM*, Nov. 2007, pp. 3270–3274.
- [12] Wanjari, Ashwini, Sandeep Kakde, and Atish Khobragade. "Error performance of LDPC decoder using bit flip algorithm." In *Communication and Signal Processing (ICCSP)*, 2016 International Conference on, pp. 0280-0283. IEEE, 2016.
- [13] Abdallah, A. B., Vaidyanathan, R., Ariwa, E., & El-Medany, W. (2015). Introduction to Special Issue: Design and Performance of Networks on Chip. *Int. J. Com. Dig. Sys*, 5(2).
- [14] Bourichi, A. E. (2013). A Low Power Parallel Sequential Decoder for Convolutional Codes. Int. J. Com. Dig. Sys, 2(2), 95-101.
- [15] Kakde, Sandeep, Atish Khobragade, and MD Ekbal Husain. "FPGA Implementation of Decoder Architectures for High Throughput Irregular LDPC Codes." Indian Journal of Science and Technology 9, no. 48 (2016).

- [16] Chandrasetty, V. A., & Aziz, S. M. (2010, May). FPGA implementation of high performance LDPC decoder using modified 2-bit min-sum algorithm. In Computer Research and Development, 2010 Second International Conference on (pp. 881-885). IEEE.
- [17] El-Medany, W. M., et al. "VHDL implmentation of a BCH minimum weight decoder for double error." Radio Science Conference, 2001. NRSC 2001. Proceedings of the Eighteenth National. Vol. 2. IEEE, 2001.
- [18] W. M. El-Medany, C. G. Harrison, and P. G. Farrell, "VLSI Chip Design Implementation for Minimum Weight Decoder"CSDSP'98, April 1998, Sheffield, UK
- [19] W. M. El-Medany, C. G. Harrison, P. G. Farrell, and C. J. Hardy, Modified Soft-Decision Minimum Weight Decoder: A VLSI Chip Design Implementation, DSPCS'99, Perth, Australia, , February 1999



Sandeep Kakde(S'14) received the B.E. degree in Electronics from the Visvesvaraya National Institute of Technology (formerly VRCE), Nagpur, India and M. Tech degree in VLSI Design discipline from Electronics Engineering Department, Yeshawantrao Chavan College of Engineering, Nagpur, India.Since January 2015 he has been working towards the Ph.D.

degree in Electronics Engineering Department at University of Nagpur. He is currently working as an Assistant Professor inYeshawantrao Chavan College of Engineering, Nagpur. He has published over thirty papers in International Conferences and fifteen papers in International Journals. His areas of work include Full Custom Layout Design, Data Path Layout Memory Layout Design and Semicustom Layout Designs, Analog and Mixed Signal Circuit Design, VLSI testing, Low-Power design, and Verification of IC's.He served as a reviewer for International Conferences and Journals. He is a member ofBritish Council, Life Member of (ISTE) India, Society of Technical Education, Graduate Student Member of IEEE and also a Member of Institute of Engineers Association.



AtishKhobragade(S'14-M'17)received theB.E. degree in Electronicsfrom theWalchandCollege ofEngineering,Sangli, India in 1992 andM.Techdegree inElectronicsEngineeringdisciplinefromElectronicsEngineeringDepartment,VisvesvarayaRegionalCollege ofEngineering,Nagpur,India, in 2001

and the Ph.D. degree in Electronics Engineering at University of Nagpur in 2013. He is currently working as Professor in Rajiv Gandhi College of Engineering, Nagpur. He has over twenty three years of teaching experience. He has published more than twenty papers in International Conferences and ten papers in International Journals. He is a Life Member of (ISTE) India, Society of Technical Education, Member of IEEE and also a Member of Institute of Engineers Association.