# Dynamic Reconfiguration Methods in FPGA-based Software Defined Radio System for Wireless Mobile Standards

A DISSERTATION SUBMITTED TO THE INSTITUTE FOR COMMUNICATIONS AND SIGNAL PROCESSING, DEPARTMENT OF ELECTRONIC AND ELECTRICAL ENGINEERING, AND THE COMMITTEE FOR POSTGRADUATE STUDIES OF THE UNIVERSITY OF STRATHCLYDE IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

By

Ke He

November 2012

The copyright of this thesis belongs to the author under the terms of the United Kingdom Copyright Acts as qualified by University of Strathclyde Regulation 3.51. Due acknowledgement must always be made of the use of any material contained in, or derived from, this thesis.

Copyright 2012

# Declaration

I declare that this thesis embodies my own research work and that it is composed by myself. Where appropriate, I have made acknowledgements to the work of others.

Ke He

# Abstract

As wireless communication develops and evolves, the number of communication standards continues to increase. Therefore, the design of a Software Defined Radio (SDR) platform, which is aimed at supporting multiple standards and services for consumer applications with a high degree of flexibility, is of growing interest. SDR has been primarily associated with military applications to date.

Field Programmable Gate Arrays (FPGAs) are programmable hardware devices which can perform complex calculations with high performance, and therefore, are well suited to wireless communication applications. FPGAs are selected as the reconfigurable devices used to perform various functionalities in the SDR system. However, although FPGAs are reconfigurable and thus can support standards and services switching in the SDR system, conventional SDR systems based on FPGAs can suffer from long reconfiguration overhead, high resource utilisation requirements, high power consumption, and inflexible standards switching.

With the conversion from analogue to digital television, a large amount of licensed spectrum is being released, and this is often referred to as "TV white space". Compared with other operating frequencies in the Gigahertz range, these bands have propagation characteristics that provide longer range and better indoor penetration for consumers, and are thus well suited to wireless communication. Therefore, it is attractive to integrate TV white space functionality into SDR systems.

In this thesis, an efficient design method for Digital Up Converter (DUC) architectures is proposed based on the existing DUC design methods in the Digital Front End (DFE) area. Furthermore, the proposed method can also be applied to TV white space DUC designs to enable the proposed SDR system to support more standards and modes. The novel physical layer architecture for SDR combines two dynamic reconfiguration technologies, Partial Reconfiguration (PR) and the Dynamic Reconfiguration Port (DRP), to support multiple standards, including 3rd Generation Partnership Project (3GPP) LTE, IEEE 802.16, IEEE 802.11 and WCDMA, together with extensions that make them compatible with TV white space. In addition, a study of power consumption relating to PR is undertaken based on implementation with the latest PR design flow. The proposed architecture is demonstrated to reduce FPGA reconfiguration overhead, resource utilisation and power consumption significantly, and to increase the degree of design flexibility, expansibility and reusability.

# Acknowledgements

First of all, I would like to give my greatest and deepest thanks from the bottom of my heart to Prof. Robert W Stewart for offering me a great opportunity to start my PhD education, which has been one of the biggest turning points in my life to date. He has often provided me with encouragement, extensive suggestions and support in academic activities, and helped me to obtain a scholarship sponsored by the Glasgow Research Partnership in Engineering (GRPE) for my PhD education, which reduced my family's economic burden significantly.

Also, I would like to express endless thanks to Dr. Louise H Crockett for her great patience and serious work attitude. She spent a large amount of valuable time reviewing and correcting my papers and thesis. We often had discussions about academic activities, which built a great friendship between us. I have learnt a number of valuable things from her which will be useful and important to my future career.

I also would like to thank my colleagues, Dr. Qiang Gao, Dr. Faisal Darbari, Ousman Sadiq, Yousif Awad, Ross Elliot, Martin Enderwitz, and other past and present members of the DSP Enabled Communication Group at the University of Strathclyde.

Here, I would like to provide my special thanks to Prof. Kwok L Lo. He was greatly helpful and gave me useful suggestions when I applied for and later began to study at the University of Strathclyde.

I am also grateful to my father and mother. They always give me endless support and encouragement even in difficult times, and this helps me to succeed.

Last but not least, I would like to thank my fiancée for her great understanding and support during my research period. Without her, I could not have done this.

# Acronyms

| Acronym | Definition |
|-------|--------------------------------|
| 3GPP  | 3rd Generation Partnership Project |
| AC    | Area Constraint                |
| ACLR  | Adjacent Channel Leakage Ratio |
| ADC   | Analogue to Digital Converter  |
| AFE   | Analogue Front End             |
| ARM   | Advanced RISC Machine          |
| AWGN  | Additive White Gaussian Noise  |
| AXI   | Advanced eXtensible Interface  |
| BRAM  | Block Random Access Memory     |
| BUFG  | Global Clock Buffer            |
| BUFR  | Regional Clock Buffer          |
| BWA   | Broadband Wireless Access      |
| CapEx | Capital Expenditure            |
| CDMA  | Code Division Multiple Access  |
| CE    | Clock Enable                   |
| CF   | Compact Flash                     |
| CFR  | Crest Factor Reduction            |
| CIP  | Combat Infrastructure Platform    |
| CLB  | Configurable Logic Block          |
| CP   | Cyclic Prefix                     |
| CRC  | Cyclic Redundancy Check           |
| DAC  | Digital to Analogue Converter     |
| DCM  | Digital Clock Manager             |
| DDC  | Digital Down Converter            |
| DDS  | Direct Digital Synthesiser        |
| DFE  | Digital Front End                 |
| DPD  | Digital Pre-Distortion            |
| DRP  | Dynamic Reconfiguration Port      |
| DSL  | Digital Subscriber Line           |
| DSP  | Digital Signal Processor          |
| DSSS | Direct-Sequence Spread Spectrum   |
| DUC  | Digital Up Converter              |
| EDK  | Embedded Development Kit          |
| EVM  | Error Vector Magnitude            |
| FCC  | Federal Communications Commission |
| FF   | Flip-Flop                         |
| FFT  | Fast Fourier Transform            |
| FHSS | Frequency-Hopping Spread Spectrum |
| FIR  | Finite Impulse Response           |
| FPGA | Field Programmable Gate Array     |
| FUSC | Full Usage of Subchannels         |
| GPP  | General Purpose Processor         |
| GSM   | Global System for Mobile Communications          |
| HB    | Half Band                                        |
| HOR   | Hardware Over-sampling Rate                      |
| ICAP  | Internal Configuration Access Port               |
| IEEE  | Institute of Electrical and Electronics Engineers |
| IF    | Intermediate Frequency                           |
| IFFT  | Inverse Fast Fourier Transform                   |
| IOB   | Input Output Block                               |
| IP    | Intellectual Property                            |
| IPIF  | Intellectual Property Interface                  |
| ISI   | Inter-Symbol Interference                        |
| JTAG  | Joint Test Action Group                          |
| JTRS  | Joint Tactical Radio System                      |
| LCM   | Least Common Multiple                            |
| LDPC  | Low-Density Parity-Check                         |
| LTE   | Long Term Evolution                              |
| LUT   | Look Up Table                                    |
| MAC   | Media Access Control                             |
| MAC   | Multiply-Accumulate                              |
| MIMO  | Multiple Input Multiple Output                   |
| Mcps  | Million Chips Per Second                         |
| Msps  | Million Samples Per Second                       |
| NCD   | Native Circuit Description                       |
| Ofcom | Office of Communications                         |
| OFDM  | Orthogonal Frequency Division Multiplexing       |
| OFDMA | Orthogonal Frequency Division Multiple Access    |
| OpEx  | Operating Expenditure                            |
| OVSF    | Orthogonal Variable Spreading Factor              |
| PAPR    | Peak to Average Power Ratio                       |
| PC      | Personal Computer                                 |
| PLB     | Processor Local Bus                               |
| PR      | Partial Reconfiguration                           |
| PSD     | Power Spectral Density                            |
| PUSC    | Partial Usage of Subchannels                      |
| QAM     | Quadrature Amplitude Modulation                   |
| QPSK    | Quadrature Phase Shift Keying                     |
| RCE     | Relative Constellation Error                      |
| RF      | Radio Frequency                                   |
| RISC    | Reduced Instruction Set Computing                 |
| RM      | Reconfigurable Module                             |
| RMS     | Root Mean Square                                  |
| RP      | Reconfigurable Partition                          |
| SC-FDMA | Single-Carrier Frequency Division Multiple Access |
| SDK     | Software Development Kit                          |
| SDR     | Software Defined Radio                            |
| SEM     | Spectral Emission Mask                            |
| SF      | Spreading Factor                                  |
| SFDR    | Spurious-Free Dynamic Range                       |
| SoC     | System on Chip                                    |
| SoPC    | System on Programmable Chip                       |
| SR      | Software Radio                                    |
| SRC     | Sample Rate Conversion                            |
| SRRC    | Square Root Raised Cosine                         |
| SYSACE  | System Advanced Configuration Environment         |
| TV    | Television                                      |
| UHF   | Ultra High Frequency                            |
| UMTS  | Universal Mobile Telecommunications System      |
| VCD   | Value Change Dump                               |
| VHF   | Very High Frequency                             |
| WCDMA | Wideband Code Division Multiple Access          |
| WiMAX | Worldwide Interoperability for Microwave Access |
| WLAN  | Wireless Local Area Network                     |
| XPA   | Xilinx Power Analyzer                           |
| XPE   | Xilinx Power Estimator                          |
| XPS   | Xilinx Platform Studio                          |

# Contents

- Declaration
- Abstract
- Acknowledgements
- Acronyms
- List of Figures
- List of Tables
- 1 Introduction
  - 1.1 Motivation
  - 1.2 Related Work
    - 1.2.1 SDR System based on FPGA
    - 1.2.2 SDR System with Partial Reconfiguration
    - 1.2.3 Digital Front End
    - 1.2.4 TV White Space
    - 1.2.5 Power Consumption Analysis of Partial Reconfiguration
  - 1.3 Development of a Flexible DUC for FPGA Implementation
  - 1.4 Contributions
  - 1.5 Thesis Structure
  - 1.6 List of Publications and Talks
- 2 Technology and Communication Background
  - 2.1 Software Defined Radio
    - 2.1.1 Definition
    - 2.1.2 Architecture
    - 2.1.3 Software Defined Radio for Military Use
  - 2.2 Field-Programmable Gate Array
    - 2.2.1 Overview of FPGAs and Related Technologies
    - 2.2.2 Architecture
    - 2.2.3 Embedded FPGA
  - 2.3 Wireless Communication Standards
  - 2.4 TV White Space
- 3 Dynamic Reconfiguration Technologies in FPGA
  - 3.1 Partial Reconfiguration
    - 3.1.1 Overview
    - 3.1.2 PR Evolution
      - 3.1.2.1 Difference-based PR Flow
      - 3.1.2.2 Module-based PR Flow
      - 3.1.2.3 Partition-based PR Flow
  - 3.2 PR Principle
  - 3.3 PR Design Considerations
  - 3.4 PR Reconfiguration Methods
    - 3.4.1 Overview
    - 3.4.2 PR Configuration Control Example
      - 3.4.2.1 External PR Control
      - 3.4.2.2 Internal PR Control
    - 3.4.3 PR Analysis
  - 3.5 Dynamic Reconfiguration Port
    - 3.5.1 Overview
    - 3.5.2 DRP Example
  - 3.6 Concluding Remarks
- 4 Digital Front End Design
  - 4.1 Overview
  - 4.2 Digital Front End Principle
  - 4.3 Digital Up Converter Architecture
  - 4.4 Performance Metrics of Digital Up Converter
    - 4.4.1 Spectral Emission Mask
    - 4.4.2 Error Vector Magnitude
    - 4.4.3 Adjacent Channel Leakage Ratio
  - 4.5 DUC Design Considerations
  - 4.6 DUC Designs for LTE (10 MHz) and (5 MHz)
    - 4.6.1 Filter Design Considerations
    - 4.6.2 DUC Architecture
    - 4.6.3 Performance
      - 4.6.3.1 Power Spectral Density of Transmitted Signals
      - 4.6.3.2 Error Vector Magnitude
      - 4.6.3.3 Adjacent Channel Leakage Ratio
    - 4.6.4 Implementation Results
  - 4.7 DUC Designs for IEEE 802.16e
    - 4.7.1 DUC Designs for IEEE 802.16e (10 MHz) and (5 MHz)
    - 4.7.2 DUC Performance for IEEE 802.16e (10 MHz) and (5 MHz)
      - 4.7.2.1 Power Spectral Density of Transmitted Signals
      - 4.7.2.2 Error Vector Magnitude
    - 4.7.3 DUC Designs for IEEE 802.16e (7 MHz) and (3.5 MHz)
    - 4.7.4 DUC Performance for IEEE 802.16e (7 MHz) and (3.5 MHz)
      - 4.7.4.1 Power Spectral Density of Transmitted Signals
      - 4.7.4.2 Error Vector Magnitude
    - 4.7.5 Implementation Results
  - 4.8 DUC Design for IEEE 802.11n
    - 4.8.1 DUC Design for IEEE 802.11n (20 MHz)
    - 4.8.2 DUC Performance for IEEE 802.11n (20 MHz)
      - 4.8.2.1 Power Spectral Density of Transmitted Signals
      - 4.8.2.2 Error Vector Magnitude
  - 4.9 DUC Design for WCDMA
    - 4.9.1 Filter Design Considerations
    - 4.9.2 DUC Performance for WCDMA
      - 4.9.2.1 Power Spectral Density of Transmitted Signals
      - 4.9.2.2 Error Vector Magnitude
      - 4.9.2.3 Adjacent Channel Leakage Ratio
    - 4.9.3 Implementation Results
  - 4.10 Concluding Remarks
- 5 TV White Space Application
  - 5.1 Overview
  - 5.2 TV White Space Spectrum Resource
  - 5.3 TV White Space Technical Challenges
  - 5.4 IEEE 802.11n DUC Designs for TV White Space
    - 5.4.1 DUC Design Considerations
    - 5.4.2 White Space DUC Design for IEEE 802.11n (5 MHz)
    - 5.4.3 White Space DUC Designs for IEEE 802.11n (10 MHz) and (20 MHz)
    - 5.4.4 DUC Performance for White Space IEEE 802.11n
      - 5.4.4.1 Power Spectral Density of Transmitted Signals
      - 5.4.4.2 Error Vector Magnitude
    - 5.4.5 Implementation Results
  - 5.5 IEEE 802.16e DUC Designs for TV White Space
    - 5.5.1 White Space DUC Designs for IEEE 802.16e (5 MHz) and (10 MHz)
    - 5.5.2 DUC Performance for White Space IEEE 802.16e (5 MHz) and (10 MHz)
      - 5.5.2.1 Power Spectral Density of Transmitted Signals
      - 5.5.2.2 Error Vector Magnitude
    - 5.5.3 White Space DUC Designs for IEEE 802.16e (3.5 MHz) and (7 MHz)
    - 5.5.4 DUC Performance for White Space IEEE 802.16e (3.5 MHz) and (7 MHz)
      - 5.5.4.1 Power Spectral Density of Transmitted Signals
      - 5.5.4.2 Error Vector Magnitude
    - 5.5.5 Implementation Results
  - 5.6 LTE DUC Designs for TV White Space
    - 5.6.1 White Space DUC Designs for LTE (5 MHz) and (10 MHz)
    - 5.6.2 DUC Performance for White Space LTE (5 MHz) and (10 MHz)
      - 5.6.2.1 Power Spectral Density of Transmitted Signals
      - 5.6.2.2 Error Vector Magnitude
    - 5.6.3 Implementation Results
  - 5.7 Concluding Remarks
- 6 Hierarchical Design
    - 6.7.1 Floorplanning of the Proposed Architecture on Virtex-5 LX 110T Device
    - 6.7.2 Scenario Implementations
    - 6.7.3 Implementation Results
  - 6.8 Implementation Comparisons
    - 6.8.1 Comparison of Proposed DRP-PR Design with Fixed Multiple Standards Design
    - 6.8.2 Comparison of Proposed DRP-PR Design with Programmable Multiple Standards Design
    - 6.8.3 Comparison of the Proposed DRP-PR Design with an Architecture based on PR only
  - 6.9 Summary
  - 6.10 Concluding Remarks
- 7 Power Consumption
  - 7.1 Overview of FPGA Power Consumption
    - 7.1.1 Static Power Consumption
    - 7.1.2 Dynamic Power Consumption
  - 7.2 Power Analysis Approach
    - 7.2.1 Tools for Analysing Power Consumption
    - 7.2.2 XPA Power Analysis Methodology
    - 7.2.3 Elements of Dynamic Power Consumption
  - 7.3 Key Factors in the Process of PR
    - 7.3.1 Area Constraints
    - 7.3.2 Clock Distribution Mechanisms
  - 7.4 Implementation Results and Analysis
    - 7.4.1 Conventional Implementation Results
    - 7.4.2 PR Implementation Results
    - 7.4.3 Further Analysis of Dynamic Power Consumption
  - 7.5 Static Power Consumption
  - 7.6 Power Optimisation of DRP-PR Architecture
  - 7.7 Concluding Remarks
- 8 Conclusions and Future Work
  - 8.1 Conclusions
  - 8.2 Future Work
  - 8.3 Final Remarks
- Reference

# List of Figures

| Figure 1: Comparison Between SDR and SR in Frequency Spectrum and             |      |
|-------------------------------------------------------------------------------|------|
| Architecture                                                                  | . 11 |
| Figure 2: Architecture of SDR                                                 | . 12 |
| Figure 3: SDR Architecture Defined by JTRS [155]                              | . 13 |
| Figure 4: Xilinx FPGA Architecture                                            | . 16 |
| Figure 5: Architecture of CLB                                                 | . 17 |
| Figure 6: Architecture of Advanced DCM Primitive                              | . 19 |
| Figure 7: Xilinx Embedded Design Flow                                         | . 21 |
| Figure 8: Evolution of Wireless Communication Standards [22]                  | . 22 |
| Figure 9: Wireless Communication Spectrum in the USA                          | . 26 |
| Figure 10: Partial Reconfiguration Illustrative Example                       | . 29 |
| Figure 11: Configuration Process Comparison: (a) Conventional Configuration   |      |
| Mode; (b) PR Mode                                                             | . 33 |
| Figure 12: Virtex-5 LX 110T Configuration Architecture                        | . 34 |
| Figure 13: PR Implementation Methods: (a) External PR; (b) Internal PR        | . 35 |
| Figure 14: External PR Architecture                                           | . 36 |
| Figure 15: Implementation of the External PR                                  | . 37 |
| Figure 16: External PR Results: (a) Lowpass Filter Coefficients; (b) Highpass |      |
| Filter Coefficients; (c) Bandpass Filter Coefficients                         | . 38 |
| Figure 17: Internal PR Architecture                                           | . 39 |
| Figure 18: Implementation for the Internal PR                                 | 40   |
| Figure 19: Internal PR Results: (a) Lowpass Filter Coefficients; (b) Highpass |      |
| Filter Coefficients; (c) Bandpass Filter Coefficients; (d) Blanking           | 41   |
| Figure 20: Architecture of DRP                                                | . 44 |
| Figure 21: DRP Clock Frequency Results                                        | . 46 |
| Figure 22: The Front End Architecture in the Transmitter                      | 50   |
| Figure 23: IF and Baseband Sampling Rate                                      | . 50 |
| Figure 24: Architecture of DUC                                                | 51   |
| Figure 25: EVM Definition                                                     | . 53 |
| Figure 26: DUC EVM Measurement Flow for OFDM-based Design                     | . 53 |
| Figure 27: 1st and 2nd Adjacent Carrier for LTE-LTE Coexistence               | . 55 |
| Figure 28: Sample Rates for Various Standards and Modes                       | . 56 |
| Figure 29: DSP48E Architecture for Complex Multiplication                     | 58   |
| Figure 30: Channel Filter Design using FDATool                                | 61   |
| Figure 31: HB Filter Design using FDATool                                     | . 62 |
| Figure 32: Magnitude Response of Each Filter: (a) Channel Filter; (b) 1st HB  |      |
| Filter; (c) 2nd HB Filter                                                     | . 64 |
| Figure 33: Overall LTE (10 MHz) DUC Filter Response                           | 64   |
| Figure 34: Architecture of LTE (10 MHz) DUC                                   | 65   |
| Figure 35: DDS Output with 15 MHz ( $F_s = 61.44$ MHz)                        | 66   |

| Figure 36: Architecture of LTE (5 MHz) DUC                                                                                                          | 67  |
|-----------------------------------------------------------------------------------------------------------------------------------------------------|-----|
| Figure 37: DUC Transmission Spectrum of LTE (10 MHz)                                                                                                | 68  |
| Figure 38: DUC Transmission Spectrum of LTE (5 MHz)                                                                                                 | 68  |
| Figure 39: Received Constellation for LTE (10 MHz) with 64-QAM, without                                                                             |     |
| AWGN                                                                                                                                                | 70  |
| Figure 40: SEM Requirements with Safety Margin for IEEE 802.16e (10 MHz)                                                                            |     |
| and (5 MHz)                                                                                                                                         | 74  |
| Figure 41: Architecture of IEEE 802.16e (10 MHz) DUC                                                                                                | 75  |
| Figure 42: Architecture of IEEE 802.16e (5 MHz) DUC                                                                                                 | 76  |
| Figure 43: DUC Transmission Spectrum of IEEE 802.16e (10 MHz)                                                                                       | 77  |
| Figure 44: DUC Transmission Spectrum of IEEE 802.16e (5 MHz)                                                                                        | 78  |
| Figure 45: Received Constellation for IEEE 802 16e (10 MHz) with 16-OAM                                                                             | , 0 |
| without AWGN                                                                                                                                        | 79  |
| Figure 46: SEM Requirements with Safety Margin of Type G for IEEE 802 16e                                                                           |     |
| (7 MHz) and (3 5 MHz)                                                                                                                               | 80  |
| Figure 47 <sup>•</sup> Architecture of IEEE 802 16e (7 MHz) DUC                                                                                     | 81  |
| Figure 48: Architecture of IEEE 802 16e (3.5 MHz) DUC                                                                                               | 82  |
| Figure 49: DUC Transmission Spectrum of IEEE 802 16e (7 MHz)                                                                                        | 83  |
| Figure 50: DUC Transmission Spectrum of IEEE 802.16e (3.5 MHz)                                                                                      | 84  |
| Figure 51: Received Constellation for IEEE 802 16e (7 MHz) with 16-OAM                                                                              | 01  |
| without AWGN                                                                                                                                        | 85  |
| Figure 52: Architecture of IEEE 802 11n (20 MHz) DUC                                                                                                | 88  |
| Figure 53: DUC Transmission Spectrum of IEEE 802.11n (20 MHz) | |
| Figure 54: Received Constellation for IEEE 802.11n (20 MHz) with 64-QAM without AWGN | |
| Figure 55: Architecture of WCDMA DUC | |
| Figure 56: DUC Transmission Spectrum of WCDMA | |
| Figure 57: Received Constellation for WCDMA with QPSK without AWGN | |
| Figure 58: SEM Requirements of IEEE 802.11n (5 MHz) and White Space | |
| Figure 59: Combined SEM Requirements of White Space and IEEE 802.11n (5 MHz) | |
| Figure 60: Comparison of Different IF Sample Rates in TV White Space | |
| Figure 61: Architecture of IEEE 802.11n (5 MHz) DUC for TV White Space | |
| … Output | |
| Figure 62: Magnitude Response of Each Filter: (a) Channel Filter; (b) 1st HB Filter; (c) 2nd HB Filter; (d) 3rd HB Filter; (e) 4th HB Filter | |
| … Response | |
| Figure 65: Architecture of IEEE 802.11n (10 MHz) DUC for TV White Space | |
| Figure 66: Architecture of IEEE 802.11n (20 MHz) DUC for TV White Space | |
| Figure 67: DUC Transmission Spectrum of IEEE 802.11n (5 MHz) for TV White Space | |

| Figure | Caption |
|--------|---------|
| Figure 68 | DUC Transmission Spectrum of IEEE 802.11n (10 MHz) for TV White Space |
| Figure 69 | DUC Transmission Spectrum of IEEE 802.11n (20 MHz) for TV White Space |
| Figure 70 | Received Constellation for IEEE 802.11n (5 MHz) with 64-QAM, without AWGN |
| Figure 71 | Architecture of IEEE 802.16e (5 MHz) DUC for TV White Space |
| Figure 72 | Architecture of IEEE 802.16e (10 MHz) DUC for TV White Space |
| Figure 73 | DUC Transmission Spectrum of IEEE 802.16e (5 MHz) for TV White Space |
| Figure 74 | DUC Transmission Spectrum of IEEE 802.16e (10 MHz) for TV White Space |
| Figure 75 | Received Constellation for IEEE 802.16e (5 MHz) with 16-QAM, without AWGN |
| Figure 76 | Architecture of IEEE 802.16e (3.5 MHz) DUC for TV White Space |
| Figure 77 | Architecture of IEEE 802.16e (7 MHz) DUC for TV White Space |
| Figure 78 | DUC Transmission Spectrum of IEEE 802.16e (3.5 MHz) for TV White Space |
| Figure 79 | DUC Transmission Spectrum of IEEE 802.16e (7 MHz) for TV White Space |
| Figure 80 | Received Constellation for IEEE 802.11n (3.5 MHz) with 16-QAM, without AWGN |
| Figure 81 | Architecture of LTE (5 MHz) DUC for TV White Space |
| Figure 82 | Architecture of LTE (10 MHz) DUC for TV White Space |
| Figure 83 | DUC Transmission Spectrum of LTE (5 MHz) for TV White Space |
| Figure 84 | DUC Transmission Spectrum of LTE (10 MHz) for TV White Space |
| Figure 85 | Received Constellation for White Space LTE (5 MHz) with 64-QAM without AWGN |
| Figure 86 | Conventional Transmitter Chain in SDR System |
| Figure 87 | Architecture of Fixed Multiple Standards Design in SDR System |
| Figure 88 | Architecture of Programmable Multiple Standards Design in SDR System |
| Figure 89 | Architecture of Multiple Clock Oscillators with PR in the SDR System |
| Figure 90 | Architecture of Normalised Clock Oscillator with PR in the SDR System |
| Figure 91 | Architecture of DRP-PR Design Method in SDR System |
| Figure 92 | Hierarchical Design Methodology for SDR Transmitter Architecture |
| Figure 93 | An Implementation of SDR Architecture with PR-DRP to Support Multiple Standards |
| Figure 94 | Floorplanning of the SDR Architecture in PlanAhead |
| Figure 95 | Mapper DCM Clock Frequency Results for the Proposed Three Scenarios |
| Figure 96 | Implementation Results for Scenario 1 |
| Figure 97 | Implementation Results for Scenario 2 |
| Figure 98 | Implementation Results for Scenario 3 |
| Figure 99 | Filter Designs of LTE (10 MHz) and (5 MHz) |
| Figure 100 | Fixed Multiple Standards Design Architecture |
| Figure 101 | DRP-PR Architecture to Support Multiple Standards and Modes |
| Figure 102 | Hardware Utilisation Comparison of Two Methods |
| Figure 103 | Block Scheme for the Transmitter based on Proposed DRP-PR Architecture |
| Figure 104 | Transistor Leakage Current [184] |
| Figure 105 | Power Analysis with XPower Analyzer for Conventional FPGA Design |
| Figure 106 | Power Analysis with XPower Analyzer for PR Designs |
| Figure 107 | XPower Environment |
| Figure 108 | Dynamic Power Consumption of TV White Space IEEE 802.11n (20 MHz) |
| Figure 109 | Defined Area Constraints 1–6 |
| Figure 110 | Clock Resources of Virtex-5 LX 110T Device |
| Figure 111 | Clock Routings of LTE (10 MHz): (a) Global Clock Routing, (b) Regional Clock Routing, (c) Local Clock Routing |
| Figure 112 | Actual Clock Routing of PR Design on Virtex-5 Device |
| Figure 113 | Range of TV White Space IEEE 802.11n (5 MHz) with Area Constraint 1 |
| Figure 114 | Clock Routings Comparison: (a) Conventional FPGA Implementation without Area Constraints; (b) Global Clock Routing PR Implementation with Area Constraint 3 |
| Figure 115 | Static Power Consumption for Virtex-5 LX Devices |
| Figure 116 | Floorplanning of DRP-PR Architecture on XC5VLX50T, showing overlapping of PR modules in one clock region |
| Figure 117 | Floorplanning Optimisation |
| Figure 118 | Benefits Diagram of the Proposed DRP-PR Architecture |
| Figure 119 | Future Hierarchical Design Methodology for SDR Transmitter Architecture |

## List of Tables

| Table | Caption |
|-------|---------|
| Table 1 | PR Performance Comparison |
| Table 2 | DUC Design Parameters |
| Table 3 | SEM Requirements for LTE (10 MHz) and (5 MHz) Bandwidth |
| Table 4 | LTE OFDM Symbol Properties |
| Table 5 | Two Channel Filter Designs for LTE (10 MHz) |
| Table 6 | HB Filter Designs for LTE (10 MHz) |
| Table 7 | Implementation Results Comparison for LTE (10 MHz) |
| Table 8 | EVM Requirements of LTE |
| Table 9 | DUCs of LTE (10 MHz) and (5 MHz) Performance Metrics |
| Table 10 | Hardware Utilization of DUCs for LTE (10 MHz) and (5 MHz) |
| Table 11 | SEM Requirements of IEEE 802.16e (10 MHz) and (5 MHz) Bandwidth |
| Table 12 | A Subset of Subcarrier Parameters for Different OFDMA Zone Type |
| Table 13 | IEEE 802.16e (10 MHz) and (5 MHz) OFDM Symbol Properties |
| Table 14 | Channel Filter Design Parameters for IEEE 802.16e (10 MHz) |
| Table 15 | HB Filter Designs for IEEE 802.16e (10 MHz) and (5 MHz) |
| Table 16 | SEM Requirements of Type G for IEEE 802.16e (7 MHz) and (3.5 MHz) Bandwidth |
| Table 17 | IEEE 802.16e (7 MHz) and (3.5 MHz) OFDM Symbol Properties |
| Table 18 | Channel Filter Designs for IEEE 802.16e (7 MHz) |
| Table 19 | HB Filter Designs for IEEE 802.16e (7 MHz) and (3.5 MHz) |
| Table 20 | SEM Requirements for IEEE 802.11n (20 MHz) Bandwidth |
| Table 21 | Hardware Utilisation of DUCs for IEEE 802.16e |
| Table 22 | IEEE 802.11n (20 MHz) OFDM Symbol Properties |
| Table 23 | Filter Designs for IEEE 802.11n (20 MHz) |
| Table 24 | SEM Requirements for WCDMA |
| Table 25 | Filter Designs for WCDMA |
| Table 26 | DUCs for WCDMA Performance Metrics |
| Table 27 | Hardware Utilization of DUCs for IEEE 802.11n and WCDMA |
| Table 28 | TV Channels for fixed and portable devices in the USA |
| Table 29 | SEM Requirements for white space devices Defined by FCC |
| Table 30 | DUC Design Parameters for TV White Space |
| Table 31 | Operation Modes for I and Q Channel Outputs |
| Table 32 | DUC Filters for IEEE 802.11n (5 MHz) |
| Table 33 | Hardware Utilisation of white space DUCs for IEEE 802.11n |
| Table 34 | DUC Filters for IEEE 802.16e (5 MHz) |
| Table 35 | DUC Filters for IEEE 802.16e (3.5 MHz) |
| Table 36 | Hardware utilisation of white space DUCs for IEEE 802.16e |
| Table 37 | DUC Filters for LTE (5 MHz) |
| Table 38 | Hardware Utilisation of white space DUCs for LTE |
| Table 39 | Modulation Parameters |
| Table 40 | Scenario Implementations based on DRP-PR Architectures |
| Table 41 | Hardware Resource Utilisation without PR |
| Table 42 | Hardware Resource Utilisation with PR |
| Table 43 | Hardware Usage for HB Filters |
| Table 44 | Partial Bitstream Size on Virtex-5 LX 110T Device |
| Table 45 | Comparison of Reconfiguration Design Methods based on FPGA |
| Table 46 | Conventional Implementation Results without Area Constraint |
| Table 47 | Implementation Results with Global Clock Routings and Area Constraints 1–6 |
| Table 48 | Analysis of Clock and Dynamic Power Reduction |
| Table 49 | Information of Virtex-5 LX FPGAs |
| Table 50 | Comparison Results between Original and Optimised Floorplannings |

# **Chapter 1**

## Introduction

## 1.1 Motivation

With the proliferation of wireless communication standards, both commercial equipment manufacturers and customers will benefit from a new kind of device which can support many communication standards on a single hardware platform, and be updated with ease. The Software Defined Radio (SDR) is a platform that is capable of implementing a variety of communication standards and can be widely applied, e.g. in civilian areas, military sectors and in space applications.

Ultimately customers obtain benefits from SDR: they are able to receive waveform expansion for emerging standards or service updates by downloading the relevant software, rather than acquiring new hardware. Equipment manufacturers can focus on software development to provide better services and updates to customers [20]. Therefore, the production life of the SDR system can be prolonged and a large proportion of the costs associated with maintenance and updating can be saved. The SDR system can even be applied in space applications to allow the satellite communication system to be upgraded through supplying the relevant software remotely [10] [11].

In summary, multiple standards and services can converge on the SDR system, and standards can be readily switched or updated according to users' requirements, without the requirement to purchase new hardware. Therefore, the programmable SDR platform will be increasingly attractive to the consumer in the future [9] [10] [51].

## **1.2 Related Work**

### 1.2.1 SDR System based on FPGA

Over the past decade, the design of SDR platforms has been widely investigated [140]–[182]. In terms of SDR implementation in the physical layer, the Joint Tactical Radio System (JTRS) proposed an SDR architecture for military use, involving a combination of Field Programmable Gate Arrays (FPGAs), Digital Signal Processors (DSPs) and General Purpose Processors (GPPs). This architecture is able to switch waveform functionalities and meet the requirements of SDR, but is less well suited to the cost-sensitive consumer market [153] [155].

Since the FPGA is reprogrammable and can perform complex calculations in parallel, it is often used in SDR systems. However, the reconfiguration overhead (the time required to reprogram the whole device) is significant. Conventional FPGA reconfiguration therefore cannot meet the requirements of a real-time SDR device, which places strict limits on the time taken to switch modules, according to [140], [142], [149], [157] and [159]–[182]. In addition, conventional FPGA reconfiguration must halt operation of the entire device, so the FPGA is out of service during this period. Standard or mode switching based on conventional reprogramming therefore suffers from a time-consuming reconfiguration overhead and disruption to the operation of the FPGA, even if only small modifications need to be made to the design.

### **1.2.2 SDR System with Partial Reconfiguration**

Xilinx, a leading FPGA vendor, provides a dynamic reconfiguration technology, referred to as *Partial Reconfiguration* (PR), which allows one or more parts of the FPGA to be reconfigured on the fly while the rest continue to operate unaffected [29]. This enables the end user to dynamically change functionalities by downloading different partial bitstream files, resulting in a higher degree of operational flexibility [30] [32]. In addition, PR has the potential to solve the problem of unacceptably long reconfiguration overhead because only the necessary sections of the FPGA are reconfigured. Furthermore, PR shares its core concept with SDR: to share the hardware resource and support multiple radio functionalities to the maximum extent. As a result, a number of studies concerning PR enabled SDR architectures have been published in recent years [140]–[152] and [154]–[156].
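The overhead argument can be illustrated with a rough estimate: if configuration time is dominated by moving the bitstream through the configuration port, it scales with bitstream size divided by port throughput. The sketch below is a minimal illustration under that assumption. The bitstream sizes and the port throughput are hypothetical placeholder values, not figures measured in this thesis.

```python
def reconfig_time_ms(bitstream_bytes: int, port_bytes_per_s: float) -> float:
    """Transfer-time estimate (ms) for loading a bitstream; ignores setup overheads."""
    if bitstream_bytes < 0 or port_bytes_per_s <= 0:
        raise ValueError("sizes must be non-negative and throughput positive")
    return 1e3 * bitstream_bytes / port_bytes_per_s


# Hypothetical values chosen only to show the scaling (not from the thesis).
FULL_BITSTREAM_BYTES = 4_000_000     # assumed full-device bitstream size
PARTIAL_BITSTREAM_BYTES = 200_000    # assumed partial bitstream size
PORT_BYTES_PER_S = 400e6             # assumed port: 32 bits wide at 100 MHz

full_ms = reconfig_time_ms(FULL_BITSTREAM_BYTES, PORT_BYTES_PER_S)
partial_ms = reconfig_time_ms(PARTIAL_BITSTREAM_BYTES, PORT_BYTES_PER_S)
speedup = full_ms / partial_ms  # equals the ratio of the two bitstream sizes
```

Under this simple model the saving is the ratio of the two bitstream sizes, which is why reconfiguring only the necessary region reduces the overhead.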

The studies concerning PR enabled SDR architectures focus on baseband processing components to support multiple standards. In these studies, standard switching is treated purely as swapping baseband functionalities, for instance coders, mappers and modulators, and thus some design parameters, such as the clock frequencies for these functionalities, do not need to change. However, when the Digital Front End (DFE) component is considered, standard or mode switching means that not only the processing logic, but also the clock frequency, has to be reconfigured. Normally, a PR design does not directly change the clock frequency for a given input oscillator at run time, because the Digital Clock Manager (DCM) used to synthesise the clock is not implemented in reconfigurable logic. Consequently, PR in isolation is insufficient to implement all aspects of switching radio functionalities.

### **1.2.3 Digital Front End**

With the development of wireless communication standards and FPGA hardware, the digital processing component can be extended from baseband processing to the Intermediate Frequency (IF) processing, which can provide better performance and a more flexible carrier frequency at lower cost when compared to the conventional transmitter architecture, according to [45], [46] and [49]–[51]. Therefore, SDR is required to support multiple standards and swap functionalities not only in the baseband but also in the DFE processing components.

The DFE architecture in the transmitter chain is referred to as the Digital Up Converter (DUC). Design methods of DUC architecture for various communication standards, such as Long Term Evolution (LTE), Institute of Electrical and Electronics Engineers (IEEE) 802.16e and Wideband Code Division Multiple Access (WCDMA), have been investigated by a number of researchers and companies, e.g. [61] [70]–[81].
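As a minimal sketch of what a DUC has to achieve, the overall interpolation factor is the ratio of the output (IF) sample rate to the baseband sample rate, and powers of two in that factor can be served by cascaded halfband (x2) stages. The function name and the 61.44 MHz output rate below are assumptions made for illustration. The baseband rates are the commonly used values for each standard (15.36 MHz for LTE 10 MHz, 7.68 MHz for LTE 5 MHz, and the 3.84 Mcps WCDMA chip rate).

```python
def plan_interpolation(f_in_hz: int, f_out_hz: int):
    """Return (total_factor, halfband_stages, residual_factor) for an integer rate change."""
    if f_in_hz <= 0 or f_out_hz <= 0:
        raise ValueError("sample rates must be positive")
    if f_out_hz % f_in_hz != 0:
        raise ValueError("only integer rate changes are handled in this sketch")
    total = f_out_hz // f_in_hz
    halfband_stages = 0
    residual = total
    # Peel off factors of two; each one could map to a x2 halfband stage.
    while residual % 2 == 0:
        residual //= 2
        halfband_stages += 1
    return total, halfband_stages, residual


F_OUT_HZ = 61_440_000  # assumed common output sample rate (hypothetical)

lte10_plan = plan_interpolation(15_360_000, F_OUT_HZ)
lte5_plan = plan_interpolation(7_680_000, F_OUT_HZ)
wcdma_plan = plan_interpolation(3_840_000, F_OUT_HZ)
```

A residual factor other than 1 indicates that a non-halfband interpolating stage would be needed. How the factor is actually split between channel and halfband filters is a design decision not captured here.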

### 1.2.4 TV White Space

The frequency spectrum available for wireless communication is a limited resource all over the world, and the usage of spectrum is usually regulated locally by regulatory institutions in specific countries. For example, broadcast Television (TV) services occupy spectrum which is licensed by regulatory institutions. With the conversion from analogue to digital television, a large amount of licensed spectrum is being released for other uses, and this is often referred to as "TV white space". White space spectrum resources are at low frequencies (e.g. 54 MHz–698 MHz in the USA), and provide significant advantages for wireless communication, including long range and good indoor penetration.

The technical challenges of TV white space were presented in [110]. The authors of [117] stated that the IEEE 802.11, IEEE 802.16 and LTE standards have the potential to be deployed in the white space spectrum. A number of TV white space devices which can detect and utilise white space spectrum for communication were introduced in [123]–[126]. However, DUC designs for TV white space applications have not been considered before, to the best of the author's knowledge.

### 1.2.5 Power Consumption Analysis of Partial Reconfiguration

The power consumption of FPGA designs has been investigated extensively in the research community. However, only [161], [175], [176], [190]–[194], and [200] have focused on the power consumption relating to PR implementation. The authors of [161], [175], [176], [192], [185], and [200] reported that PR techniques enable functionalities to be time division multiplexed, so that the entire system design can be reduced in size and implemented on a smaller device. This is the main reason that PR can reduce power consumption significantly; specifically, the saving relates to static power consumption. The potential impact of PR on dynamic power consumption has received comparatively little attention. It has also been shown by measurement that power consumption during the reconfiguration process can be reduced by downloading partial bitstream files as opposed to fully reconfiguring the device [176], [190]–[193].

## **1.3 Development of a Flexible DUC for FPGA Implementation**

In this thesis, an efficient design method for DUC architectures is proposed based on the existing DUC design methods in the DFE area. Furthermore, the proposed method can also be applied to TV white space DUC designs to enable the proposed SDR system to support more standards and modes. The novel physical layer architecture for SDR combines two dynamic reconfiguration technologies: PR and Dynamic Reconfiguration Port (DRP). The DRP technique can reconfigure the DCM output frequency while the system is operating, and thus can be combined with PR to address the difficulties of communication standard or mode switching in terms of clock frequency and dependent functionalities.
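To make the clock side of this concrete, a DCM frequency synthesiser is commonly described by the relation f_out = f_in × M / D, where M is a multiplier and D a divider, so retuning the clock amounts to choosing a new (M, D) pair. The sketch below searches for the pair closest to a target frequency. The function name, the value ranges for M and D, and the example frequencies are assumptions for illustration and should be checked against the target device's documentation. Limits on the permitted output frequency range are not modelled.

```python
from fractions import Fraction

# Assumed multiplier/divider ranges (hypothetical; verify for the actual device).
M_RANGE = range(2, 34)
D_RANGE = range(1, 33)


def best_dcm_ratio(f_in_hz: int, f_target_hz: int):
    """Return (M, D, f_out_hz) with f_out = f_in * M / D closest to the target."""
    best = None
    for m in M_RANGE:
        for d in D_RANGE:
            f_out = Fraction(f_in_hz * m, d)  # exact arithmetic, no rounding
            err = abs(f_out - f_target_hz)
            # Strict comparison keeps the first (smallest M) pair among ties.
            if best is None or err < best[0]:
                best = (err, m, d, f_out)
    _, m, d, f_out = best
    return m, d, float(f_out)


# Hypothetical example: a 100 MHz oscillator retuned to an 80 MHz processing clock.
m, d, f_out_hz = best_dcm_ratio(100_000_000, 80_000_000)
```

When the target is not exactly reachable, the function returns the nearest achievable frequency, which makes the residual clock error visible before committing to an oscillator choice.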

In addition, a study of dynamic power consumption is undertaken based on implementation with the latest PR design flow. The dynamic power consumption of PR is mainly analysed in terms of reconfigurable region size. Analysis is undertaken using the proposed DRP-PR architecture with a large number of functional modules.

The results obtained demonstrate that the proposed DRP-PR architecture can provide significant improvements in terms of supporting multiple standards, hardware resource usage, degree of design and implementation flexibility, expansibility and power consumption compared to other SDR architectures based on FPGAs.

## **1.4 Contributions**

The main contributions of this work are as follows:

• By considering both reconfigurable modulation and DFE components, this study constitutes an extension to conventional SDR architectures, which often consider only baseband processing.

• An efficient DUC design method is proposed based on a review of existing FPGA-based DUC designs. Moreover, the proposed design approach can be applied to TV white space DUC designs. TV white space DUC designs targeting IEEE 802.11n, IEEE 802.16e and LTE standards have not been considered before to the best of the author's knowledge.

• This study is the first to propose an SDR architecture based on a single FPGA which employs the combination of DRP and PR technologies to address the difficulties of communication standard or mode switching in terms of clock frequency and dependent functionalities in the SDR system.

• A hierarchical design methodology for the SDR system based on the DRP-PR approach is proposed in accordance with the functions in the transmitter chain. In addition, this methodology can be applied to any wireless communication standard and allows emerging or new standards to be integrated into the SDR system conveniently.

• The proposed SDR architecture is capable of supporting 4 standards and 17 modes in total: LTE (5 MHz and 10 MHz), IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz), IEEE 802.11n (20 MHz), WCDMA, TV white space IEEE 802.11n (5 MHz, 10 MHz and 20 MHz), TV white space IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz) and TV white space LTE (5 MHz and 10 MHz). Both LTE and IEEE 802.16e are viewed as beyond third generation (3G) standards. TV white space applications for wireless communication open a new area where SDR can be applied, and are currently an active research topic.

• The power consumption of PR is analysed, primarily in terms of reconfigurable region size, for a large number of functional modules. Minimisation rules for dynamic power consumption are formed, and applied to the proposed SDR architecture to achieve a further reduction in terms of dynamic power consumption and partial bitstream size.
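The mode count quoted in the contributions above can be tallied directly from the list of supported configurations. The snippet below is only a bookkeeping sketch. The dictionary layout and the label used for the single WCDMA mode are illustrative choices, not structures taken from the thesis.

```python
# Supported configurations as listed in the contributions; the layout is illustrative.
SUPPORTED_MODES = {
    "LTE": ["5 MHz", "10 MHz"],
    "IEEE 802.16e": ["3.5 MHz", "5 MHz", "7 MHz", "10 MHz"],
    "IEEE 802.11n": ["20 MHz"],
    "WCDMA": ["single mode"],  # placeholder label: one mode, bandwidth not listed
    "TV white space IEEE 802.11n": ["5 MHz", "10 MHz", "20 MHz"],
    "TV white space IEEE 802.16e": ["3.5 MHz", "5 MHz", "7 MHz", "10 MHz"],
    "TV white space LTE": ["5 MHz", "10 MHz"],
}

# Each TV white space variant belongs to the same underlying standard.
standards = {name.replace("TV white space ", "") for name in SUPPORTED_MODES}
total_modes = sum(len(modes) for modes in SUPPORTED_MODES.values())
```

Counting this way gives 4 underlying standards and 17 modes, matching the totals stated above.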

## **1.5 Thesis Structure**

The rest of this thesis is organised as follows:

**Chapter 2** gives a general background and provides overviews of SDR, FPGA architecture, wireless communication standards (LTE, IEEE 802.16e, IEEE 802.11n and WCDMA) and TV white space, which are referred to throughout the rest of the thesis.

**Chapter 3** describes in detail the two dynamic reconfiguration technologies of interest, PR and DRP. Principles and examples for both PR and DRP are presented.

**Chapter 4** starts by introducing the principles of the DFE section of the radio. Following this, DUC architectures, design considerations and performance metrics for LTE (5 MHz and 10 MHz), IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz), IEEE 802.11n (20 MHz) and WCDMA are presented in detail.

In **Chapter 5**, the general concepts and the benefits of TV white space are introduced, and the analysis of DUC designs is extended to TV white space applications. The DUC architectures and design considerations discussed in **Chapter 4** are modified to cater for TV white space applications. TV white space DUCs are designed based on IEEE 802.11n (5 MHz, 10 MHz and 20 MHz), IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz) and LTE (5 MHz and 10 MHz).

**Chapter 6** proposes an SDR system architecture that combines DRP and PR to support 4 standards and 17 modes in total on a single FPGA device, which is the core component of this thesis.

**Chapter 7** analyses PR techniques in terms of power consumption. Power analysis methods for conventional and PR implementations are presented respectively, and PR implementations are analysed under a defined set of constraint rules. Minimisation rules of power consumption for PR are obtained, which are applied to optimise the floorplanning of the proposed DRP-PR architecture discussed in **Chapter 6** to obtain improved results in terms of power consumption and bitstream size.

Finally, **Chapter 8** concludes the thesis and presents ideas for future work.

## **1.6 List of Publications and Talks**

The following papers have been published based on the research reported in this thesis:

• K. He, L. Crockett and R. Stewart, "Dynamic Reconfiguration Technologies Based on FPGA in Software Defined Radio System". In *Proceedings of Wireless Innovation Forum European Conference on Communications Technologies and Software Defined Radio (WInnComm–Europe)*, Brussels, Belgium, June 2011, pp. 88-95.

• K. He, L. Crockett and R. Stewart, "Dynamic Reconfiguration Technologies Based on FPGA in Software Defined Radio System", *Journal of Signal Processing Systems for Signal, Image, and Video Technology*, Vol. 69, Issue 1, June 2012, pp. 75-85. (This paper is an extended version of the conference paper above, and includes analysis of IEEE 802.11n 20 MHz.)

• R.A. Elliot, M.A. Enderwitz, K. He, F. Darbari, L.H. Crockett, S. Weiss and R.W. Stewart, "Reconfigurable TVWS Transceiver for use in UK and US Markets". In *Proceedings of Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC'2012)*, York, UK, July 9-11, 2012.

### **Invited Talk**

Dynamic Reconfiguration Technologies based on FPGA in Software Defined Radio System, *12th International Software Radio Symposium*, London, UK, June 20-21, 2012.

# **Chapter 2**

## Technology and Communication Background

In this chapter, background information fundamental to the work presented in this thesis, including SDR, FPGA device architectures, wireless communication standards and TV white space, is briefly introduced. The definition and architecture of SDR are presented first, followed by the features and architectures of FPGAs. Then modern wireless communication standards, specifically 3rd Generation Partnership Project (3GPP) LTE, IEEE 802.16, IEEE 802.11 and WCDMA, are briefly reviewed. The chapter is concluded with a short introduction to TV white space.

## 2.1 Software Defined Radio

### 2.1.1 Definition

As the number of wireless communications standards and services continues to grow, both commercial equipment manufacturers and customers will benefit from a kind of device which can support many communication standards on a single hardware platform, and be updated with ease. This type of device is referred to as the SDR platform. The term SDR derives from the concept of *Software Radio* (SR), which was proposed by Joseph Mitola in 1991 [1] [2]. There are several definitions of SR, among which those of Mitola and the SDR Forum are viewed as two of the most notable. Joseph Mitola asserted that the ideal SR would allow the Analogue to Digital Converter (ADC) and Digital to Analogue Converter (DAC) to be placed as close as possible to the antennas, with most of the processing performed digitally, and would be capable of supporting a variety of communication standards and services through reconfiguration. In addition, he stated that radio functions, e.g. modulation and demodulation, coding and decoding, would be implemented in software on programmable devices [3].

The SDR Forum is an international non-profit organisation focused on the development and deployment of SDR technologies. The SDR Forum provided a detailed definition of the ultimate SR: all of the radio and control functions can be implemented in software and are fully programmable. Programmability likewise extends to Radio Frequency (RF), as the ADC and DAC are required to be placed close to the antennas. In addition, standard or function switching can be implemented in milliseconds [4].

However, the ideal SR architecture is still not practical because of its demanding requirements, such as an RF sampling rate at the ADC and DAC. Consequently, analogue components such as RF filters are still required, and the ADC and DAC have to be placed adjacent to the analogue processing components [5]–[7]. The programmability of the practical SR is therefore restricted to IF processing, and this practical form is referred to as the SDR. In other words, the SDR is a presently practical version of an ideal SR [8]. The difference between SDR and SR in terms of programmability is illustrated in Figure 2.1.
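The sampling-rate burden can be illustrated with the Nyquist criterion, which requires a converter to sample at no less than twice the highest frequency present in the signal. The sketch below compares direct RF conversion with IF conversion. The two band-edge frequencies are hypothetical values chosen only for illustration, and bandpass (under)sampling techniques are deliberately ignored.

```python
def min_sample_rate_hz(f_max_hz: float) -> float:
    """Nyquist lower bound on the sample rate for a signal extending up to f_max_hz."""
    if f_max_hz <= 0:
        raise ValueError("f_max_hz must be positive")
    return 2.0 * f_max_hz


# Hypothetical band edges, chosen only for illustration.
RF_BAND_EDGE_HZ = 2.4e9   # converting directly at RF (ideal SR)
IF_BAND_EDGE_HZ = 80e6    # converting at IF (practical SDR)

rf_rate_hz = min_sample_rate_hz(RF_BAND_EDGE_HZ)
if_rate_hz = min_sample_rate_hz(IF_BAND_EDGE_HZ)
rate_ratio = rf_rate_hz / if_rate_hz
```

With these example numbers the converter at RF would need to run tens of times faster than one at IF, which is the practical obstacle described above.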

Considering the factors discussed above, the SDR is a platform that supports multiple communication standards and standard switching. The radio functions of SDR are implemented or controlled in software on programmable devices, and thus the physical layer behaviour can be reconfigured in software rather than by acquiring and integrating new hardware. In addition, the ADC and DAC cannot be placed close to the antennas, and thus the programmability of SDR is limited to the IF [1] [4].



Figure 2.1: Comparison Between SDR and SR in Frequency Spectrum and Architecture.

### 2.1.2 Architecture

The architecture of the SDR is shown in Figure 2.2. It employs a set of programmable hardware devices to perform a variety of radio functionalities, for example coding and decoding, modulation and demodulation, and channel filtering in the baseband and IF sections. All of the radio functionalities are implemented or controlled in software and thus they can be reconfigured by downloading the corresponding software from a communication library to support multiple standards and services without modifying the hardware devices. Since the ADC and DAC are placed after the IF processing components, some analogue components such as analogue filters are still required to bridge the gap from IF to RF. The IF processing component will be discussed in detail in Chapters 4 and 5, and the baseband processing component will be presented in Chapter 6 of this thesis.



Figure 2.2: Architecture of SDR.

#### 2.1.3 Software Defined Radio for Military Use

As discussed in Chapter 1, SDR systems are capable of supporting a variety of communication standards and standard switching can be achieved rapidly. Thus they can be widely applied in military applications to provide flexible and interoperable communications for various deployments, such as vehicular, airborne and naval platforms. The JTRS and Bowman systems are the two most popular software defined radio systems for military use in the world.

The JTRS program was launched by the US Department of Defence in 1998 and can accommodate over 40 military communication standards and modes, including global, regional and national standards. In addition, JTRS has real-time ability and interoperability, which allows the device to be updated and adapted rapidly between various communication systems [4] [132].

The SDR architecture defined by JTRS is shown in Figure 2.3. It involves a combination of FPGAs, DSPs and GPPs, and can switch standard functionalities. However, the architecture is based on the dedicated resources model, namely the SDR system has to utilise multiple sets of processing hardware where one set is designed for each standard (three sets of hardware are considered in this example). Therefore, the SDR system defined by JTRS has a significantly negative impact on weight, power consumption and cost, and it is not well-suited to the cost-sensitive consumer market [153] [155].



Figure 2.3: SDR Architecture Defined by JTRS [155].

Similarly, the UK Ministry of Defence launched the Bowman tactical communication systems and the associated Combat Infrastructure Platform (CIP) in March 2004. The Bowman systems are intended for deployment across the British Armed Forces to increase interoperability, and thus the Army, Royal Air Force and Royal Navy can increase the efficiency and effectiveness of their joint operations and avoid friendly fire incidents [211] [212].

Taking into account all of these considerations, the SDR system is very applicable to military use and is established in this market. However, SDR has not so far been widely applied to commercial radios or standards. This thesis focuses on exploiting the principles of SDR for commercial applications.

## 2.2 Field-Programmable Gate Array

#### 2.2.1 Overview of FPGAs and Related Technologies

There are a number of competing reconfigurable hardware devices to cater for SDR platforms, among which GPPs, DSPs and FPGAs are three of the most prominent ones. GPPs, including most found in commodity Personal Computers (PCs), are able to perform a variety of radio functions in software to support multiple communication standards. The development cycle is short because radio functions are programmed using high level programming languages. However, SDR platforms based on GPPs cannot meet the timing requirements of modern wireless standards, such as IEEE 802.16 [134].

DSPs are a type of processor with additional optimisations for signal processing applications, such as Multiply–Accumulate (MAC) units and multiple memory blocks [128]. DSPs can execute complex algorithms through efficient software programs and the development cycle for DSPs is short. However, DSPs are based on sequential software execution, and some complex calculations including the Fast Fourier Transform (FFT) and Sample Rate Conversion (SRC) cannot be implemented efficiently. Therefore, they cannot meet the increasing performance requirements associated with the development of wireless communication standards [12].

The FPGA is an array of gates with programmable interconnections and logic resources, and is fully programmable and reprogrammable after manufacture [13]. FPGAs support hardware operating in parallel, which greatly exceeds the sequential performance of DSPs. Since FPGAs can implement complex calculations with high performance, they are often used as accelerators in SDR systems. However, with the development of FPGA technology, especially embedded processors, FPGAs are now able to implement software algorithms and therefore have become powerful devices in both computation and function management. Meanwhile, with the evolution of Intellectual Property (IP) cores and development tools, FPGA design efficiency has improved and time to market can be reduced dramatically.

The reprogrammability of the FPGA has been improved significantly with the development of dynamic reconfiguration technologies in FPGA, most notably PR and DRP. PR allows underlying hardware functions to be switched with ease, like switching software in a processor. DRP enables output clock frequencies to be dynamically reconfigured given a fixed clock oscillator source. These two dynamic reconfiguration technologies will be introduced in detail in Chapter 3. FPGAs are therefore a highly suitable reconfigurable hardware device for SDR systems, and as a result, the FPGA is chosen as the SDR implementation platform for the work documented in this thesis.

#### 2.2.2 Architecture

The target FPGA used throughout this thesis is manufactured by Xilinx and the following review of FPGA architecture and development tools is based on Xilinx products.

The architecture of a generic Xilinx FPGA is illustrated in Figure 2.4. The FPGA consists of Configurable Logic Blocks (CLBs), Input Output Blocks (IOBs), programmable interconnects, Block Random Access Memories (BRAMs), Digital Clock Manager (DCM) and embedded multipliers or DSP48s, which are integrated arithmetic slices primarily including multiply and accumulate functionality. CLBs, IOBs and programmable interconnects are general purpose components in the FPGA, while the rest enhance the performance of the FPGA in memory, clock and digital signal processing aspects, respectively [34]. These components of the FPGA can be further described as follows.

• *IOBs*: The IOBs can be considered as the interfaces between the pins of the FPGA package and internal logic, as defined during the FPGA design process. The pins must meet the voltage standards which are stored in the configuration memory of IOBs. In general, IOBs are grouped into banks so as to manage different voltage standards. Each bank has a dedicated voltage line and thus the IOBs within this bank have the same voltage standard [14].

Figure 2.4: Xilinx FPGA Architecture.

• *CLBs*: The CLB is a group of several slices and is viewed as the basic logic unit in the FPGA architecture. In Xilinx FPGAs, each CLB contains 4 slices in the Virtex-2, Virtex-2 Pro, Virtex-4 and Spartan-3 series. However, the number of slices has been reduced to 2 per CLB for the most recent FPGAs, including the Virtex-5, Virtex-6 and Spartan-6 series. The CLB architecture for the Virtex-5 device is shown in Figure 2.5. The slice is composed of combinational function generators, i.e. Look Up Tables (LUTs), storage elements, i.e. Flip-Flops (FFs), and additional features including wide multiplexers, carry logic and arithmetic gates [15]. Every CLB has an adjacent switch matrix which is used to connect the interconnections across the device. In the case of Virtex-5, each slice contains four 6-input LUTs and four FFs. The LUT is based on RAM and can perform one of several functions according to the user's design. The FF is also programmable to provide data with clock synchronisation. The combination of LUT and FF is also referred to as a *logic cell*, which is capable of performing both combinatorial and sequential logic functions. Wide multiplexers are employed to combine LUTs within one slice, or even a number of slices, to perform complex logic functions. Carry chains and arithmetic gates are utilised for arithmetic and Boolean functions. In summary, all of the components in the CLB can be programmed in the process of FPGA reconfiguration.

Figure 2.5: Architecture of CLB.

• *BRAMs*: The BRAM is a type of embedded resource specifically designed to provide dense, high speed storage. It can provide sufficient memory blocks for applications which require a large amount of on-chip memory. BRAMs are located in columns adjacent to DSP48s [16]. In the case of Virtex-5, the BRAM is 18 Kbits. A 36 Kbit BRAM can be created by combining two 18 Kbit BRAMs. In addition, multiple BRAMs can be cascaded to cater for a large amount of storage. Furthermore, the BRAM can also be used to bridge the clock domain boundary between two processing units, as will be introduced in detail in Chapter 6.

• *DSP48s*: The DSP48 is another specialised embedded resource in the fabric of the FPGA. It can provide high speed arithmetic performance to meet the increasing requirements of digital signal processing. There are several variations on the DSP48: for example the DSP48A in the Spartan-3A, the DSP48E in the Virtex-5, and the DSP48A1 and E1 in the Spartan-6 and Virtex-6, respectively. The DSP48E is able to support 25 × 18 bit multiplication with high performance and can perform multiple operations with time division multiplexing. It consists of a multiplier, accumulator, pre-adder and multiple internal registers, which allows the DSP48E to perform one of several functions, such as multiply, multiply add, multiply accumulate, etc. [69]. Methods for making full use of the DSP48E in FPGA designs will be introduced in Chapters 4 and 5.

• *DCM*: The DCM component plays an important role in FPGA design: depending on the application, its task is to eliminate clock skew, synthesise a desired clock frequency, or shift phase. Of these tasks, frequency synthesis could be considered one of the DCM's most significant functions, given that practical FPGA designs often require a different system clock frequency to the available oscillator. The architecture of the advanced DCM primitive is depicted in Figure 2.6.

The "CLKIN" port is used to connect the source clock (i.e. the input from the clock oscillator) to the DCM. The "CLKFB" port is employed to provide the feedback signal to the DCM, and it can only be connected to the "CLK0" port. The "RST" port is used to reset the DCM circuit. The "CLK0" port generates a clock with the same frequency as the source clock, while the "CLK90", "CLK180" and "CLK270" ports provide clocks with the same frequency as the input clock, but with phases shifted by 90°, 180° and 270° respectively. The "CLK2X" port provides a clock with double the frequency of the clock source, and its phase is also aligned with the clock source. The "CLKDV" port generates a clock whose frequency is the source frequency divided by the supplied divide value. The divide value is set within the range from 1.5 to 16, and is set to 2 by default. Similarly, the "CLKFX" port can provide a flexible frequency range based on the ratio of multiplier value to divisor value.



Figure 2.6: Architecture of Advanced DCM Primitive.

The DCM requires a number of clock cycles to process configuration changes and generate the desired clock signal, and the "LOCKED" signal goes low during this period. When the "LOCKED" signal is asserted high, the clock signal from the DCM component is ready and valid to use. In modern FPGAs, Xilinx provides reconfiguration ports, which are referred to as DRPs. A DRP has the ability to dynamically generate various clock frequencies from a fixed clock source [37]. Details of DRP technology will be discussed in Chapter 3.

• *Programmable Interconnects*: The programmable interconnects are used to connect CLBs, IOBs and other resources on the FPGA to ensure that the FPGA is able to perform complex functions. These interconnects are programmable and thus they can provide highly flexible connections according to the deployed designs.
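Returning to the DCM clock outputs described in the list above, the relationship between the source clock and each output port can be written down as simple arithmetic. The following is a minimal Python sketch, not a model of the primitive itself: the function name `dcm_outputs` and its parameter names are hypothetical, only the CLKDV range of 1.5 to 16 (default 2) is taken from the text, and no limits are enforced on the CLKFX multiplier and divisor because the text does not state them.

```python
def dcm_outputs(f_in_mhz, clkdv_divide=2.0, clkfx_multiply=1, clkfx_divide=1):
    """Return nominal DCM output frequencies in MHz for a given source clock.

    Hypothetical helper for illustration; phase shifts are recorded only
    in the port names, since they do not change the frequency.
    """
    if not 1.5 <= clkdv_divide <= 16:
        raise ValueError("CLKDV divide value must lie in the range 1.5 to 16")
    return {
        "CLK0": f_in_mhz,      # same frequency as the source clock
        "CLK90": f_in_mhz,     # same frequency, phase shifted by 90 degrees
        "CLK180": f_in_mhz,    # same frequency, phase shifted by 180 degrees
        "CLK270": f_in_mhz,    # same frequency, phase shifted by 270 degrees
        "CLK2X": 2 * f_in_mhz,                  # double the source frequency
        "CLKDV": f_in_mhz / clkdv_divide,       # source divided by divide value
        "CLKFX": f_in_mhz * clkfx_multiply / clkfx_divide,  # M/D ratio
    }


if __name__ == "__main__":
    # Illustrative values only: a 100 MHz oscillator with M = 7 and D = 4.
    for port, freq in dcm_outputs(100.0, clkfx_multiply=7, clkfx_divide=4).items():
        print(f"{port}: {freq} MHz")
```

With the illustrative 100 MHz source, the default divide value gives a 50 MHz CLKDV output, and a multiplier of 7 with a divisor of 4 gives a 175 MHz CLKFX output.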

#### 2.2.3 Embedded FPGA

As discussed earlier in this chapter, FPGAs have the ability to perform complex calculations. In addition, they support the combination of modules, IP cores and interfaces to build a System on Chip (SoC). However, with the increasing complexity of SoCs, one of the drawbacks of FPGA hardware becomes obvious: it cannot implement complex algorithms to control the entire system flexibly and efficiently. In order to solve this problem, FPGA vendors have integrated embedded processors into their FPGA devices. Embedded processors allow designers to combine hardware and software designs on one FPGA device, resulting in System on Programmable Chip (SoPC). A number of control instructions can be implemented by embedded processors to connect and control various functional modules or IP cores through special buses. Therefore, FPGA system designs incorporating both hardware and software can increase programmable flexibility and overall system performance, and thus meet various users' requirements in many areas [17].

With regard to the Xilinx devices, four types of embedded processors are provided: PicoBlaze, MicroBlaze, PowerPC and Advanced RISC Machine (ARM), and all of these embedded processors are based on the Reduced Instruction Set Computing (RISC) architecture. The PicoBlaze and MicroBlaze are 8-bit and 32-bit soft cores respectively (soft cores are implemented entirely using existing programmable logic resources on the FPGA). PowerPC is a 32-bit hard core (i.e. a dedicated silicon resource), which can provide higher clock rates, better performance and lower power consumption than equivalent soft cores [17]. Embedded ARM cores are the most recent technology for Xilinx. The ARM core is a 32-bit microprocessor core with high speed and low power. Combining ARM cores and modern FPGA fabric enables the power, cost and size of the overall system to be minimised, and achieves an extensible processor-centric system with high performance [18]. The Zynq family provided by Xilinx is the combination of Xilinx 28 nm programmable logic and an industry-standard dual ARM processor [209].

Xilinx provides two useful tools for embedded development: Xilinx Platform Studio (XPS) and Xilinx Software Development Kit (SDK). The simplified embedded design flow is illustrated in Figure 2.7. Within the XPS environment, users can choose an embedded processor to connect to a variety of hardware modules through designated buses, e.g. the Processor Local Bus (PLB) or, more recently, the Advanced eXtensible Interface (AXI) bus. Before integration into the embedded project, these hardware modules have to go through a strict FPGA development process, e.g. synthesis, implementation and simulation, to ensure that they operate correctly and meet timing requirements. Then all of the hardware information can be exported to SDK for embedded software development.

Once development has been completed, software applications and bitstream files are generated and combined, and can then be downloaded to the FPGA for further on-chip verification or subsequent deployment.



Figure 2.7: Xilinx Embedded Design Flow.

## 2.3 Wireless Communication Standards

Wireless communication has developed rapidly over recent decades, and can be considered a key enabling technology to improve quality of life. The evolution of wireless standards since 1995 is depicted in Figure 2.8. Later in this section, 3GPP LTE, IEEE 802.16, IEEE 802.11 and WCDMA will be introduced in detail.



Figure 2.8: Evolution of Wireless Communication Standards [22].

First generation (1G) mobile systems employed analogue devices for voice service. The second generation (2G) takes advantage of digital devices and thus provides higher data rates and better mobility than 1G. Global System for Mobile communications (GSM) and Code Division Multiple Access one (CDMAone) are two cellular wireless standards widely applied in 2G systems. Supporters of the GSM and CDMA systems proposed third generation (3G) standards based on their own 2G standards: the Universal Mobile Telecommunications System (UMTS) is the 3G standard based on the GSM system, and is also referred to as WCDMA; CDMA 2000 is an updated version of CDMAone and is another 3G standard. 3G standards can facilitate significantly higher data rates than 2G systems, and thus can support additional wireless services, such as multimedia. LTE is considered as a 3.9G wireless cellular standard [53], and its evolution into LTE-Advanced can provide a data rate reaching as much as 1 Gbps, and thus LTE-Advanced is considered as one of the 4G standards [19]. From the discussion above, it can be seen that data rate is an important factor in distinguishing the generations of cellular wireless standards.

• *LTE*: The goals of LTE developed by 3GPP are to achieve higher data rates, lower latency and greater spectral efficiency. LTE defines two radio access mechanisms: Orthogonal Frequency Division Multiple Access (OFDMA) in the downlink and Single-Carrier Frequency Division Multiple Access (SC-FDMA) in the uplink. LTE can support scalable transmission bandwidths ranging from 1.4 MHz to 20 MHz. The peak data rates of downlink and uplink can reach 300 Mbps and 50 Mbps respectively with $4 \times 4$ antennas in the case of a 20 MHz channel [23]. LTE also takes advantage of techniques such as Orthogonal Frequency Division Multiplexing (OFDM) and Multiple Input Multiple Output (MIMO), which is the main reason that the data rates of LTE can be significantly increased compared to previous generations.

• *IEEE 802.16*: IEEE 802.16 is also referred to as Worldwide Interoperability for Microwave Access (WiMAX) and is standardised by the IEEE. Conventional broadband access has to employ physical wires such as Digital Subscriber Line (DSL) or optical cables. However, the cost of deployment increases dramatically with the extension of coverage area, and in addition mobility is limited in wireline broadband access. The IEEE 802.16 standard was initially designed as a Broadband Wireless Access (BWA) standard to provide wireless service to fixed users. However, with the release of IEEE 802.16e, new features were incorporated to support mobile wireless systems, and thus the scope for mobility grew significantly, as shown in Figure 2.8 [24]. There are a number of variations in the IEEE 802.16 standard family, e.g. IEEE 802.16a, IEEE 802.16d, etc., amongst which IEEE 802.16e would be considered the most popular variation. Like LTE, IEEE 802.16e can also support scalable transmission bandwidths: 1.75 MHz, 3.5 MHz, 7 MHz, 14 MHz, 1.25 MHz, 5 MHz, 10 MHz, 15 MHz, 20 MHz and 8.75 MHz. It employs OFDMA in both the downlink and uplink chains. Similarly, IEEE 802.16e also utilises an OFDM symbol structure and MIMO techniques, and thus the peak rate of the downlink and uplink can reach 128 Mbps ($2 \times 2$ MIMO) and 56 Mbps ($1 \times 2$ MIMO) respectively, assuming a 20 MHz bandwidth [25]. Also like LTE, IEEE 802.16e can be viewed as one of the 3.9G wireless standards. The latest member in the IEEE 802.16 family is referred to as IEEE 802.16m, which is considered a 4G standard [53].

• *IEEE 802.11*: The Wireless Local Area Network (WLAN) is a network that can provide an access point to allow several devices to connect to the Internet using wireless technology. In addition, it can extend users' mobility so that their devices can move within a limited geographical area while maintaining access to the Internet [26]. Wi-Fi, based on the IEEE 802.11 standard family, is the most well known WLAN technology. Like the IEEE 802.16 standard family, IEEE 802.11 has a number of variations. IEEE 802.11a was released in 1999, and employs an OFDM scheme operating in 5 GHz; and the data rate can reach up to 54 Mbps. The IEEE 802.11b variation uses spread spectrum techniques, i.e. Frequency-Hopping Spread Spectrum (FHSS) or Direct-Sequence Spread Spectrum (DSSS), and provides a data rate of up to 11 Mbps in the 2.4 GHz frequency band. IEEE 802.11n, which incorporates MIMO techniques, is an amendment to the IEEE 802.11 variations proposed before 2008. The peak data rate IEEE 802.11n can achieve is 600 Mbps using  $4 \times 4$  MIMO in a 40 MHz transmission channel [24] and [120]–[122].

• *WCDMA*: As discussed before, WCDMA is one of the 3G cellular standards, and had been deployed in over 55 countries throughout the world by 2006, for example in Japan, Europe and Asia [208]. Unlike the standards introduced above, WCDMA is not based on an OFDM structure, but instead uses a direct sequence spread spectrum technique to spread user information over a wide bandwidth by multiplying with a high rate chipping sequence. The chip rate is 3.84 Million Chips Per Second (Mcps), resulting in a 5 MHz transmission channel. WCDMA can obtain a peak data rate of 2 Mbps [27].
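The direct sequence spreading used by WCDMA, in which each user symbol is multiplied by a high rate chipping sequence, can be illustrated with a short Python sketch. This is a toy baseband model under stated assumptions: the 8-chip code, the 480 ksps symbol rate and the function names are hypothetical and chosen only for illustration, while the 3.84 Mcps chip rate is the value given in the text. It does not reproduce the channelisation or scrambling codes of the actual standard.

```python
CHIP_RATE_CPS = 3.84e6  # WCDMA chip rate from the text (chips per second)


def spreading_factor(symbol_rate_sps):
    """Number of chips transmitted per symbol at the given symbol rate."""
    return CHIP_RATE_CPS / symbol_rate_sps


def spread(symbols, code):
    """Multiply every +/-1 symbol by the whole chipping sequence."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each block of chips with the code and take the sign."""
    n = len(code)
    recovered = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        correlation = sum(x * c for x, c in zip(block, code))
        recovered.append(1 if correlation > 0 else -1)
    return recovered


# Hypothetical 8-chip code and symbol stream, for illustration only.
code = [1, -1, 1, 1, -1, 1, -1, -1]
symbols = [1, -1, -1, 1]

chips = spread(symbols, code)
assert len(chips) == len(symbols) * len(code)
assert despread(chips, code) == symbols
```

In this sketch an assumed symbol rate of 480 ksps corresponds to 8 chips per symbol, which matches the length of the illustrative code. The occupied bandwidth is set by the chip rate rather than by the symbol rate, which is why the signal fills the wide channel regardless of the user data rate.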

## 2.4 TV White Space

With the development of wireless communication standards, increasing transmission bandwidth is required to achieve high data rate and thus more frequency spectrum is required to ensure these wireless communication standards can operate. The frequency spectrum available for wireless communication is a limited resource all over the world, and the usage of spectrum is usually regulated locally by regulatory institutions in specific countries. For example, broadcast TV services occupy spectrum which is licensed by regulatory institutions. However, this licensed spectrum is under-utilised [126]. As a result, the potential to better utilise this spectrum is very attractive to improve efficiency and provide more capacity for telecommunication services.

The spectrum allocated to broadcast TV and Wi-Fi in the USA is illustrated in Figure 2.9. The range from 54 MHz to 806 MHz is reserved for broadcast TV and only licensed operators can transmit in these bands. With the conversion from analogue to digital television, a large amount of licensed spectrum is being released, and this is often referred to as "TV white space". The TV white space spectrum is fragmented and can vary geographically. The spectrum at 2.4 GHz and 5 GHz can be used for Wi-Fi devices [28], and as shown in Figure 2.9, resides at much higher frequencies. The propagation characteristics of the 2.4 GHz and 5 GHz bands are well suited to short distance communications (up to 100 metres), but less suited to longer distances. The TV white space bands have much longer range (up to 5 miles) and better indoor penetration than Wi-Fi operating in the 2.4 GHz or 5 GHz bands [111] [112].
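The range advantage of the lower frequencies can be made concrete with a rough calculation. The sketch below uses the standard free-space path loss formula, which is an assumption introduced here for illustration and is not a model named in the text; it ignores wall penetration, antenna gains and regulatory power limits, all of which affect real range. The 600 MHz carrier is simply an example frequency inside the broadcast TV range.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB: 20 * log10(4 * pi * d * f / c)."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT_M_S)


distance_m = 100.0                       # illustrative link distance
loss_tv_band = fspl_db(distance_m, 600e6)   # example TV white space carrier
loss_wifi = fspl_db(distance_m, 2.4e9)      # 2.4 GHz Wi-Fi carrier

# The difference depends only on the frequency ratio: 20 * log10(4).
extra_loss_db = loss_wifi - loss_tv_band
print(f"Extra free-space loss at 2.4 GHz: {extra_loss_db:.2f} dB")
```

Under this idealised model the 2.4 GHz link suffers about 12 dB more loss than the 600 MHz link at any given distance, which is equivalent to a fourfold range difference for the same link budget.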



Figure 2.9: Wireless Communication Spectrum in the USA [28].

In summary, TV white space is a portion of broadcast TV spectrum released by the switch from analogue to digital TV, and this spectral resource can be actively used for unlicensed wireless communication, thus improving the spectrum efficiency and providing more capacity for wireless communication. In addition, white space spectrum resources are at low frequencies, and provide significant advantages for wireless communication compared to cellular frequencies in the Gigahertz range and Wi-Fi at 2.4 GHz and 5 GHz. The applications of wireless communication in the TV white space spectrum will be considered in detail in Chapter 5.

# **Chapter 3**

# **Dynamic Reconfiguration Technologies in FPGA**

In this chapter, two dynamic reconfiguration techniques for FPGA designs are introduced: PR and DRP. Both techniques offer attractive benefits in building an SDR system as will be discussed later in this chapter. The definition, evolution, principle and design considerations of PR technique are presented respectively, using reconfigurable filters as an example to demonstrate that PR can be achieved successfully using both external and internal reconfiguration methods. Finally, the principles and architecture of the DRP technique are discussed, and an illustrative example provided.

# 3.1 Partial Reconfiguration

#### 3.1.1 Overview

A property of FPGAs is that they may be completely reconfigured to perform different functionalities by downloading corresponding configuration files. However, disadvantages are the long reconfiguration overhead and low efficiency, as the device has to stop operating during the reconfiguration process. Therefore switching functionality in this way is less well suited to platforms with demanding real-time requirements. Partly to cater for this type of scenario, Xilinx provides an advanced reconfiguration technology referred to as PR, which allows one or several portion(s) of the FPGA to be reconfigured on the fly while the rest continue to operate unaffected [29]. This enables the end user to dynamically change functionalities by downloading different partial bitstream files, resulting in a higher degree of operational flexibility compared to conventional FPGA reconfiguration [30].

The region which is not affected during PR is referred to as the *static region* and the functions implemented within the static region are called *static modules*. The reconfigurable functions which may be swapped in the process of PR are referred to as *Reconfigurable Modules* (RMs). Each physical area of the device on which PR is performed is defined as a *PR region*.

To take an illustrative example, an FPGA device can perform multiple functions, e.g. communication, video and mathematical functions simultaneously, as shown in Figure 3.1. The communication and video functions which will continue to operate during PR are the static modules. In this example, the "mathematical function" is required to perform addition, subtraction, multiplication and division functions and these four functions can be switched according to the user's requirements using the PR technique. The math PR region has four associated RMs, and they share the allocated hardware resources with time multiplexing. The module switching can be implemented by downloading the corresponding partial bitstreams to the device and the partial bitstream files can be generated using PR design tools and flow.



Figure 3.1: Partial Reconfiguration Illustrative Example.

### 3.1.2 PR Evolution

The evolution of PR could be divided into three stages: first difference-based, then module-based and most recently, partition-based PR designs. Xilinx provides the relevant software design tools to cater for each PR design methodology, and the performance and convenience of PR design have increased as the design tools have been developed. These different phases are described in the sections below.

#### 3.1.2.1 Difference-based PR Flow

Only small logic modifications are allowed in the difference-based PR design flow. The design process is as follows: first, the designer generates the bitstream that serves as the initial configuration file. Then, the implemented design files can be modified manually in small ways, by changing BRAM contents and LUT equations. Finally, the designer creates the difference bitstream, which contains only the differences between the modified and the initial files. Users can therefore download the initial configuration file to the device and switch functionalities by downloading the difference bitstream files [31].
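The idea of a difference bitstream can be illustrated with a toy model in Python. Here the configuration memory is modelled as a list of equally sized blocks and the "difference bitstream" is simply a mapping from block address to new contents. This is a conceptual sketch only: the function names and data layout are hypothetical and do not reflect the actual Xilinx bitstream format or tools.

```python
def difference_bitstream(initial_blocks, modified_blocks):
    """Return only the blocks that differ, keyed by their address."""
    if len(initial_blocks) != len(modified_blocks):
        raise ValueError("both configurations must cover the same memory")
    return {
        address: new
        for address, (old, new) in enumerate(zip(initial_blocks, modified_blocks))
        if old != new
    }


def apply_difference(current_blocks, diff):
    """Overwrite only the addressed blocks, leaving the rest untouched."""
    updated = list(current_blocks)
    for address, data in diff.items():
        updated[address] = data
    return updated


# Illustrative contents: only block 1 (e.g. one LUT equation) is changed.
initial = [b"\x00\x00\x00\x00", b"\x11\x11\x11\x11", b"\x22\x22\x22\x22"]
modified = [b"\x00\x00\x00\x00", b"\x1f\x1f\x1f\x1f", b"\x22\x22\x22\x22"]

diff = difference_bitstream(initial, modified)
assert list(diff) == [1]
assert apply_difference(initial, diff) == modified
```

Because the difference contains one block out of three in this example, it is correspondingly smaller than the full configuration, which is the motivation for the flow.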

#### 3.1.2.2 Module-based PR Flow

Following the difference-based flow, PR developed into a more advanced design method, referred to as Module-based PR or Early Access Partial Reconfiguration (EAPR). This introduced the concepts of reconfigurable and static regions in the FPGA device, and allows multiple RMs to be implemented within the reconfigurable regions. As a result, this PR design methodology is based on modules rather than small logic modifications to an existing architecture, resulting in significantly increased design flexibility.

The communication between static and reconfigurable regions, or between two reconfigurable regions, was a big challenge. Xilinx provided a pre-routed hardware component called a *bus macro* to solve that problem. The bus macro has to be placed to straddle the boundaries to ensure communication between static and reconfigurable regions or two reconfigurable regions. Since the width of a bus macro is 4 bits in the case of Virtex-5 devices, designers have to calculate the number of bus macros required and manually place these components in the PR design tool [32].
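The bus macro count that designers had to work out by hand is a ceiling division. The short Python sketch below assumes the 4-bit width quoted above for Virtex-5 and, as a further assumption not stated in the text, counts input and output signals separately on the basis that each bus macro carries signals in one direction. The function name and the example bus widths are hypothetical.

```python
BUS_MACRO_WIDTH_BITS = 4  # width quoted in the text for Virtex-5 devices


def bus_macros_required(signal_bits):
    """Smallest number of bus macros that can carry the given signal bits."""
    if signal_bits < 0:
        raise ValueError("signal_bits must be non-negative")
    # Integer ceiling division, avoiding floating point rounding.
    return -(-signal_bits // BUS_MACRO_WIDTH_BITS)


# Illustrative RM interface: a 16-bit input bus and an 18-bit output bus.
inputs_needed = bus_macros_required(16)
outputs_needed = bus_macros_required(18)
total_needed = inputs_needed + outputs_needed
print(f"Bus macros to place manually: {total_needed}")
```

For this illustrative interface, nine bus macros would have to be instantiated and placed across the region boundary, which shows why the manual step became burdensome for wide interfaces.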

#### 3.1.2.3 Partition-based PR Flow

With the release of their 12.x design suite, Xilinx introduced a new PR design flow based on hierarchical design, which offers improvements in timing results and design reusability compared to previous PR design flows. Two attractive changes were made: the bus macro component was replaced by partition pins, and the tools were improved to permit implementation results of RMs to be preserved and imported to another PR project. Partition pins can be generated automatically by the PR design tool, and users are able to apply timing constraints to the partition pins to ensure PR implementations can be performed successfully with high speed requirements. The preservation of implementation results improves RM reusability and reduces FPGA implementation time, and thus the development cycle of system design can be shortened [33].

With regard to the latest PR flow [33], the reconfigurable region of the FPGA device on which PR is performed is referred to as the *Reconfigurable Partition* (RP). Each RP is mutually independent of the others in physical implementation. In other words, the logic and functionality of an RP may be swapped using the technique of PR, while the rest of the FPGA device (that is, the other RPs and static logic) can continue their operation unaffected. The RM is defined as the swappable functionality within the RP. One RP may have multiple associated RMs, only one of which occupies the RP at any given time, i.e., they share the allocated hardware resources with time multiplexing. In this thesis, all of the PR examples are based on the latest PR design flow.
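The relationship between an RP and its RMs can be captured in a small data model. The Python sketch below is a conceptual illustration only: the class name, method names and partial bitstream file names are hypothetical, and the arithmetic modules are borrowed from the illustrative example in Section 3.1.1. It records that one RP has several associated RMs but holds exactly one of them at any given time.

```python
class ReconfigurablePartition:
    """Toy model of an RP that time-multiplexes several RMs."""

    def __init__(self, name, modules):
        self.name = name
        self.modules = dict(modules)  # RM name -> partial bitstream file
        self.active = None            # no RM is loaded until the first swap

    def swap(self, rm_name):
        """Select the partial bitstream for an RM and mark it as active."""
        if rm_name not in self.modules:
            raise KeyError(f"{rm_name!r} is not an RM of partition {self.name!r}")
        partial_bitstream = self.modules[rm_name]
        # On real hardware the partial bitstream would be downloaded here,
        # while the static logic and other RPs continue to operate.
        self.active = rm_name
        return partial_bitstream


# Hypothetical file names for the four arithmetic RMs.
math_rp = ReconfigurablePartition(
    "math",
    {
        "add": "math_add_partial.bit",
        "sub": "math_sub_partial.bit",
        "mul": "math_mul_partial.bit",
        "div": "math_div_partial.bit",
    },
)

math_rp.swap("add")
math_rp.swap("mul")  # replaces "add"; only one RM occupies the RP at a time
```

Keeping a single `active` field, rather than a set, mirrors the constraint that the RMs share the allocated hardware resources through time multiplexing.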

## **3.2 PR Principle**

A bitstream file containing all of the configuration information for the FPGA device is downloaded to define its functionality. Normally, it is a mix of instructions and data, which allows the data to be written to the configurable logic so that the entire FPGA is configured as desired. In general, it consists of three parts: checksum value, configuration data and header. The header describes the type of operation on configuration logic and the address information [34]. It also includes the command to perform a startup sequence after the bitstream is downloaded to the FPGA to ensure configuration integrity. During the startup sequence, the "DONE" signal (which indicates that the whole configuration process is complete) is released, the I/O pins become active and the internal reset signal is deasserted [35] [206]. The most important part of the bitstream is the configuration data. It contains the necessary configuration content and the memory address to which the data are written in order to reconfigure the FPGA [35]. The checksum part is used to perform a Cyclic Redundancy Check (CRC) in order to confirm the accuracy of configuration data in the downloading process. The checksum is calculated in advance and added into the bitstream file, then checked against the value calculated after the bitstream is downloaded into the configuration logic. If they are not identical, this indicates that errors have occurred, and the configuration process has to be repeated.
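The checksum mechanism described above can be sketched in Python. This is an illustration of the principle only: `zlib.crc32` is used as a stand-in for the device's own CRC, and the byte layout (header, then data, then a 4-byte checksum), the function names and the retry limit are all assumptions made for the example rather than details of the real bitstream format.

```python
import zlib


def build_bitstream(header, config_data):
    """Append a CRC, calculated in advance, to the header and data."""
    checksum = zlib.crc32(header + config_data)
    return header + config_data + checksum.to_bytes(4, "big")


def verify_bitstream(received):
    """Recalculate the CRC after download and compare with the stored one."""
    body, stored = received[:-4], int.from_bytes(received[-4:], "big")
    return zlib.crc32(body) == stored


def download(bitstream, channel, max_attempts=3):
    """Repeat the download until the CRC matches; return the attempt count.

    `channel` is a hypothetical callable that models the download link and
    returns the bytes as seen by the configuration logic.
    """
    for attempt in range(1, max_attempts + 1):
        if verify_bitstream(channel(bitstream)):
            return attempt
    raise RuntimeError("configuration failed: CRC mismatch on every attempt")


bitstream = build_bitstream(b"HDR", b"\x01\x02\x03\x04")
corrupted = bytes([bitstream[0] ^ 0xFF]) + bitstream[1:]

assert verify_bitstream(bitstream)
assert not verify_bitstream(corrupted)
```

A mismatch between the stored and recalculated values is what triggers the repeated configuration attempt in the `download` loop.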

The conventional configuration sequence is shown in Figure 3.2 (a). After the FPGA is powered up and the voltage rises to a stable level, it is ready to accept a bitstream file. After the downloading process, the CRC check and startup processes are performed [35]. When all of these steps are finished, the device takes on user status, which means that the FPGA is performing user designed functions. If the user would like to perform another function, the configuration process described above (apart from the stages of power on and voltage stabilisation), would be repeated.

However, the *partial* bitstream only contains the configuration data and address information of the RM, unlike the conventional bitstream file which contains configuration data for the entire FPGA. Although a conventional configuration file must be downloaded in the first instance to initialise the device, as Figure 3.2 (b) describes, designers can thereafter achieve function switching with ease by downloading partial bitstreams. In general, partial bitstreams are required to reconfigure only one or a few limited regions of the FPGA, and thus the partial bitstream files are much smaller than the conventional configuration file. Another important point to note is that there is no startup process after the partial bitstream is downloaded to the device. These two points are the main reasons that PR enables the reconfiguration overhead of switching functionality to be reduced dramatically. Therefore, PR has the potential to meet the requirements of real-time SDR systems.



Figure 3.2: Configuration Process Comparison: (a) Conventional Configuration Mode; (b) PR Mode.

# **3.3 PR Design Considerations**

Defining the floorplan of the RP could be viewed as one of the most important parts of the PR design process. Judiciously selecting a reasonable size and position for the RP permits performance optimisation in terms of dynamic power and partial bitstream size, thus avoiding unnecessary reconfiguration overhead. Modern FPGAs are partitioned into clock regions to improve their clock distribution, and in the case of the Virtex-5 device considered here, each clock region has a height of 20 CLBs, 8 BRAMs or 8 DSP48Es, and spans half of the die. The position of the RP with respect to clock regions is important.

An important concept in the floorplanning process is the *frame*. The configuration memory is grouped by columns, and each column can be further divided into subcolumns called frames. The frame is the smallest unit of configuration memory and all operations must be based on whole frames. In the Virtex-5 device, frames are organised by clock region, with each frame having a height of 20 CLBs and a width of 1 CLB, as shown in Figure 3.3 [30] [33] [35]. The partial bitstream only includes information for frames required by the RP, so the number of frames used in the design decides the size of the partial bitstream. Therefore, the size of the RP directly impacts the reconfiguration overhead. In addition, the position of the RP is also significant, particularly with respect to the clock regions. For example, one RP of a given size can be placed within one clock region or spread across two clock regions; the latter occupies more frames, so the partial bitstream size increases dramatically and potentially even doubles.
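The effect of RP placement on the number of frames can be sketched with a simplified model. The code below assumes, for illustration only, that one unit of configuration memory is needed per CLB column per clock region touched by the RP; the function names and the RP dimensions are hypothetical and do not correspond to the design in this chapter.

```python
CLOCK_REGION_HEIGHT_CLBS = 20  # Virtex-5 clock region height, as stated in the text


def clock_regions_spanned(bottom_row: int, height_clbs: int) -> int:
    """Number of clock regions touched by an RP whose lowest CLB row is bottom_row."""
    first = bottom_row // CLOCK_REGION_HEIGHT_CLBS
    last = (bottom_row + height_clbs - 1) // CLOCK_REGION_HEIGHT_CLBS
    return last - first + 1


def column_region_units(bottom_row: int, height_clbs: int, width_clbs: int) -> int:
    """Simplified frame count: every touched (column, clock region) pair is rewritten whole."""
    return width_clbs * clock_regions_spanned(bottom_row, height_clbs)


# Same hypothetical RP (20 CLBs high, 10 CLBs wide) at two vertical positions.
aligned = column_region_units(bottom_row=0, height_clbs=20, width_clbs=10)
straddling = column_region_units(bottom_row=10, height_clbs=20, width_clbs=10)
print(aligned, straddling)  # prints: 10 20
```

In this model, moving the RP so that it straddles a clock region boundary doubles the amount of configuration memory that must be rewritten, even though its area is unchanged.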



Figure 3.3: Virtex-5 LX 110T Configuration Architecture.

# 3.4 PR Reconfiguration Methods

#### 3.4.1 Overview

There are two methods of implementing PR: external PR and internal PR, as depicted in Figure 3.4.



Figure 3.4: PR Implementation Methods: (a) External PR; (b) Internal PR.

To achieve external PR, the full and partial bitstreams could be stored in the DSP's or PC's associated memory and these processors could control the downloading of different bitstreams to the FPGA through the Joint Test Action Group (JTAG) port, as shown in Figure 3.4 (a).

In the case of internal reconfiguration, the FPGA is able to achieve dynamic functionality switching according to the user's requirements automatically and in isolation. It employs embedded processors (e.g. MicroBlaze or PowerPC) to run software to control the reconfiguration within the FPGA device. The software application enables bitstream files stored in the Compact Flash (CF) card to be written to the Internal Configuration Access Port (ICAP) so as to perform internal PR.

### 3.4.2 PR Configuration Control Example

Since Finite Impulse Response (FIR) filters play an important role in wireless communication systems, reconfigurable FIR filters are taken as an example of performing PR implementations with the two methods respectively.

### 3.4.2.1 External PR Control

The entire architecture of external PR is shown in Figure 3.5. There is one RP involving three associated RMs: lowpass, highpass and bandpass filters. The signal generator component is used to create a unit impulse signal to verify the coefficients of each filter design, and the ChipScope component provided by Xilinx is employed to monitor the on-chip signals after downloading partial bitstreams to the device.



Figure 3.5: External PR Architecture.

Figure 3.6 shows the implementation of the lowpass filter on the Virtex-5 LX 110T device. The filter RP is highlighted by the yellow box, and the ChipScope component (highlighted by the green box) is placed close to the filter RP to receive the data.



Figure 3.6: Implementation of the External PR.

The results comparison is shown in Figure 3.7. In each sub-figure, (a), (b) and (c), the upper results (black background) were obtained using the WaveScope component in System Generator to verify the coefficients of the different types of filters, while the lower results (white background) were captured by the ChipScope Analyzer after downloading the corresponding partial bitstream files and applying the same input stimulus. Based on the on-chip verifications presented, it is clearly seen that reconfigurable FIR filters with external PR control have been implemented successfully.



Figure 3.7: External PR Results: (a) Lowpass Filter Coefficients; (b) Highpass Filter Coefficients; (c) Bandpass Filter Coefficients.

### 3.4.2.2 Internal PR Control

The entire architecture of internal PR is described by Figure 3.8. The MicroBlaze is the embedded processor used to run software to control FPGA reconfiguration, and the PLB is used to transfer data and instructions between the peripherals and processor. The terminal peripheral is responsible for showing messages on the host PC via a serial port. Since bitstream files have to be stored in the CF card, it is necessary to place System Advanced Configuration Environment (SYSACE) interfaces between the PLB and CF card peripheral, which allows the processor to access files stored on the CF card through the SYSACE interfaces [36]. Similarly, the processor is able to send the configuration data to the ICAP through the HWICAP interfaces. The filter RP involving three types of filters could be connected to the MicroBlaze through the Intellectual Property Interface (IPIF), which is customised according to the filter RP. The placed and routed result is illustrated in Figure 3.9.

Figure 3.8: Internal PR Architecture.

Although the filter RP has the same size and position in both PR designs, the internal PR design is more complex than the external PR design. MicroBlaze is a soft core and occupies a large number of programmable logic resources, e.g. slices and BRAMs. Similarly, the ICAP and its interface account for a number of slices and BRAMs. Internal PR can thus allow an FPGA to control and perform PR in isolation, at the cost of more complex architecture and increased hardware resource utilisation.



Figure 3.9: Implementation for the Internal PR.

The PR test results are shown in Figure 3.10. Partial bitstreams can be controlled in software to perform corresponding reconfiguration. The coefficients of lowpass, highpass and bandpass filters are shown in Figures 3.10 (a), (b) and (c) respectively. Based on the results obtained, it is demonstrated that reconfigurable FIR filters with internal PR can be achieved successfully.

The blanking bitstream does not contain any user logic and is a special case of partial bitstream. No processing is performed, and thus a constant value of -1 is produced at the output when the blanking bitstream is loaded, as shown in Figure 3.10 (d). In general, the blanking bitstream can be swapped in to reduce the power consumption if no function is required to be implemented by the RP during a certain period [197].



Figure 3.10: Internal PR Results: (a) Lowpass Filter Coefficients; (b) Highpass Filter Coefficients; (c) Bandpass Filter Coefficients; (d) Blanking.

#### 3.4.3 PR Analysis

The comparison results of external and internal PR implementations are listed in Table 3.1. Since both designs have the same filter RP size and position, and are implemented on the Virtex-5 LX 110T device, they have the same partial and full bitstream sizes. The largest difference between the two designs is the downloading speed of the configuration file. The external PR design employs the JTAG port to configure the device and the maximum clock rate is 66 MHz with a 1-bit data width. Therefore, the maximum bandwidth is 66 Mbps. However, the ICAP is used in the internal PR example, and its maximum clock rate is 100 MHz with a 32-bit data width. As a result, the data transfer rate could reach 3.2 Gbps.

Therefore, the reconfiguration download time can be estimated using the maximum bandwidth for the external and internal methods, as given in Table 3.1. The internal PR example has a much shorter reconfiguration time as a result of the faster interface, but occupies far more hardware resources, as the MicroBlaze soft core and ICAP peripheral require a large number of slices and BRAMs to perform internal PR control. In both cases, the PR technique reduces the reconfiguration overhead by 96% compared to conventional configuration. It is important to note that this figure is linked to the relationship between the size of the FPGA device and the size of the RP, and therefore is specific to this example.
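The download times in Table 3.1 follow directly from the interface bandwidths quoted above. The following Python sketch reproduces that arithmetic; it assumes that 1 KB is taken as 1000 bytes (the interpretation that matches the tabulated values) and that each interface sustains its maximum rate. The function and constant names are illustrative only.

```python
def download_time_ms(size_kb: float, clock_hz: float, width_bits: int) -> float:
    """Estimated download time in ms for a bitstream of size_kb (1 KB = 1000 bytes assumed)."""
    bits = size_kb * 1000 * 8
    bandwidth_bps = clock_hz * width_bits
    return bits / bandwidth_bps * 1000


JTAG = (66e6, 1)    # 66 MHz, 1-bit data width  -> 66 Mbps
ICAP = (100e6, 32)  # 100 MHz, 32-bit data width -> 3.2 Gbps

FULL_KB, PARTIAL_KB = 3799, 165  # bitstream sizes from Table 3.1

for name, port in (("External (JTAG)", JTAG), ("Internal (ICAP)", ICAP)):
    full = download_time_ms(FULL_KB, *port)
    partial = download_time_ms(PARTIAL_KB, *port)
    print(f"{name}: full {full:.4f} ms, partial {partial:.4f} ms")

size_reduction = 1 - PARTIAL_KB / FULL_KB
print(f"Bitstream size reduction: {size_reduction:.1%}")
```

Because the download time is proportional to the bitstream size for a given interface, the relative saving is the same for both methods and is determined only by the ratio of partial to full bitstream size.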

Section 3.4 has demonstrated two different methods of implementing PR design for an FIR filter example. It was shown that internal PR has faster reconfiguration speed and enables reconfiguration by the FPGA in isolation compared to the external PR method. However, these benefits are at the cost of a more complex architecture and additional hardware resources than the external method.

|                                               | External PR | Internal PR |
|-----------------------------------------------|-------------|-------------|
| Number of Frames                              | 22          | 22          |
| Full Bitstream (KB)                           | 3799        | 3799        |
| Partial Bitstream (KB)                        | 165         | 165         |
| Full Reconfiguration<br>Download Time (ms)    | 460.5       | 9.5         |
| Partial Reconfiguration<br>Download Time (ms) | 20          | 0.4125      |
| Bitstream Size Reduction                      | 96%         | 96%         |
| Slice                                         | 221         | 1500        |
| BRAM                                          | 6           | 20          |

 Table 3.1: PR Performance Comparison.

# 3.5 Dynamic Reconfiguration Port

## 3.5.1 Overview

One of the primary functions of the DCM is to synthesise a desired clock frequency from a fixed frequency input oscillator [37]. The output frequency of a DCM is controlled via the relationship between the input clock, and the supplied multiplier and divisor parameters, as given in (3.1) [35]. Any integer values within a defined range can be supplied as the multiplier and divisor in order to obtain the desired output frequency. In the case of Virtex-5 DCMs, the multiplier range is from 2 to 33, and the divisor range is from 1 to 32.

$$f_{clk_{output}} = f_{clk_{input}} \times \frac{Multiplier+1}{Divisor+1}. \tag{3.1}$$

In Virtex-5 devices, the DCM primitive includes reconfigurable ports called DRPs, which allow the multiplier and divisor values to be supplied at runtime such that the operating clock frequency may be dynamically changed according to the user's requirements, for a single, fixed frequency clock source. The configuration is shown in Figure 3.11.



Figure 3.11: Architecture of DRP.

The essence of the DRP architecture is to integrate a state machine alongside an advanced DCM primitive (labelled as "DCM\_ADV" in Figure 3.11) to make full use of these ports, and to control multiplier and divisor values dynamically. According to [35], several defined steps have to be performed in sequence to achieve reconfiguration. First, the DCM has to go to the reset state. Second, the state machine starts to read from the hex address *50h* if either the multiplier or divisor value is reconfigured. Third, it starts to send the command by writing to the same hex address to instruct the DCM primitive to receive new values. Finally, the DCM reset is released and the output frequency is changed.

The "DRDY" port in the DCM is used to indicate that the read and write cycles are completed, and to instruct the state machine to start the next step. Multiplier and divisor values, both with a wordlength of 8 bits, are concatenated to form a 16-bit control word, where the multiplier value occupies the most significant portion, and the divisor the least significant portion. This combined word is supplied to the "DI" port of the DCM primitive. The "DRP\_start" port determines when to start the DRP cycle, and is controlled by external logic.
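The formation of the 16-bit control word can be sketched as follows. This is a behavioural illustration in Python rather than the hardware description used in the design; the function name is hypothetical, and the values passed in are assumed to be the 8-bit multiplier and divisor values as they appear in (3.1).

```python
def drp_control_word(multiplier: int, divisor: int) -> int:
    """Concatenate two 8-bit values: multiplier in the upper byte, divisor in the lower byte."""
    if not 0 <= multiplier <= 0xFF or not 0 <= divisor <= 0xFF:
        raise ValueError("multiplier and divisor must each fit in 8 bits")
    return (multiplier << 8) | divisor


word = drp_control_word(multiplier=23, divisor=24)
print(f"{word:#06x}")  # prints: 0x1718
```

The resulting word is what would be presented on the "DI" port during the write cycle.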

In view of the DRP behaviour mentioned above, designers can create logic capable of changing multiplier and divisor values dynamically, in order to obtain different output clock frequencies from a single oscillator source, as required by the user application.

## 3.5.2 DRP Example

In this example, a 256 MHz clock oscillator is chosen as the source from which to obtain three further clock frequencies: 245.76, 358.4 and 240 MHz, from (3.2), (3.3) and (3.4) respectively. These clock frequencies will be further utilised in Chapters 4 and 5.

$$245.76 \text{MHz} = 256 \text{MHz} \times \frac{23+1}{24+1}. \tag{3.2}$$

$$358.4 \text{MHz} = 256 \text{MHz} \times \frac{13+1}{9+1}. \tag{3.3}$$

$$240 \text{MHz} = 256 \text{MHz} \times \frac{14+1}{15+1}. \tag{3.4}$$
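The three frequencies can be checked numerically from (3.1). The sketch below uses exact rational arithmetic to avoid rounding effects; the function name is an assumption made for illustration.

```python
from fractions import Fraction


def dcm_output_mhz(f_in_mhz: int, multiplier: int, divisor: int) -> Fraction:
    """Output frequency according to (3.1): f_in * (Multiplier + 1) / (Divisor + 1)."""
    return Fraction(f_in_mhz) * Fraction(multiplier + 1, divisor + 1)


F_IN_MHZ = 256
settings = [(23, 24), (13, 9), (14, 15)]  # (multiplier, divisor) pairs from (3.2)-(3.4)

for multiplier, divisor in settings:
    f_out = dcm_output_mhz(F_IN_MHZ, multiplier, divisor)
    print(f"M={multiplier}, D={divisor}: {float(f_out)} MHz")
```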

Results from a post-route simulation of the DRP clock frequency design are illustrated in Figure 3.12. The top figure shows clock frequency switching through the entire DRP process, while Figures 3.12 (a), (b), (c) and (d) are obtained when zoomed in to the regions indicated in the main figure. In the case of Figure 3.12 (a), since the "DRP\_start" port remains zero, the output clock frequency is the same as the input clock, i.e., 256 MHz. In Figure 3.12 (b), the DRP process starts when "DRP\_start" goes high. Integer multiplier and divisor values of 23 and 24 respectively are supplied to the DRP architecture and are held until they are written to the hex address *50h* in the third step. When the "Done" signal goes high, the output clock frequency has been changed to 245.76 MHz and thereafter keeps operating continuously. Similarly, the 358.4 MHz and 240 MHz frequencies can be obtained when the multiplier values are 13 and 14, and the divisor values are 9 and 15, respectively. According to the post-route simulation, dynamic frequency switching takes 25 clock cycles in total, i.e., between "DRP\_start" changing from high to low and "Done" changing from low to high to indicate that the process is complete. Furthermore, this processing period is identical for all of the frequency switching scenarios discussed above.

Figure 3.12: DRP Clock Frequency Results.

## 3.6 Concluding Remarks

This chapter has introduced two dynamic reconfiguration technologies for FPGA designs: PR and DRP. The principles of PR and DRP have been explained, and examples of both provided.

Reconfiguration overhead is reduced dramatically with the PR technique, while a high degree of design flexibility is retained. Therefore, PR is able to switch functionalities in much less time than a conventional FPGA implementation, which complies with the requirements of real-time SDR systems.

Reconfigurable FIR filters are taken as an example to perform PR implementations using two different control methods. It was shown that internal PR has faster reconfiguration speed and enables reconfiguration by the FPGA in isolation compared to the external PR method. However, these benefits are at the cost of a more complex architecture and additional hardware resources than the external method.

The DRP architecture is able to change the clock frequencies synthesised by a DCM from a fixed input oscillator by switching the values of multiplier and divisor dynamically. As a result, DRP has potential to be applied in SDR systems to reduce the number of clock oscillators required, and thus to reduce the power consumption and cost for the entire system.

Normally, a PR design cannot directly change the clock frequency obtained from a given input clock oscillator, because the DCM used to synthesise the clock is not implemented in reconfigurable logic [30] [32] [33]. Consequently, PR is insufficient to implement all aspects of switching radio functionalities in isolation. Therefore, the technique of DRP could be combined with PR to address the difficulties of communication standard or mode switching in terms of clock frequency and dependent functionalities, and this will be discussed in detail in Chapter 6.

# **Chapter 4**

# **Digital Front End Design**

This chapter starts by introducing the concepts, principles and architecture of the proposed digital front end. Then, measurement methods for evaluating the design against performance metrics, namely the spectral emission mask, error vector magnitude and adjacent channel leakage ratio, are discussed respectively. The design considerations and architectures of digital up converters for multiple standards, e.g. LTE, IEEE 802.16, WCDMA and IEEE 802.11, are presented in detail. The implementation and performance results obtained demonstrate that when designed and evaluated individually, each of the digital up converter designs can provide good performance with low hardware resource utilisation.

The proposed digital up converter design for supporting multiple standards and modes has a similar architecture with the key difference that only one of the modes is required to operate at any given time, in accordance with the needs of an SDR system. Therefore, these designs have the potential to be implemented with the dynamic technologies discussed in Chapter 2, and this will be explored in detail in Chapter 6.

## 4.1 Overview

With the rapid expansion of wireless communication, an increasing number of standards and modes are emerging in the market. As discussed in Section 2.1, the SDR device is a platform that employs a set of programmable hardware devices to support multiple wireless standards and modes. Every wireless communication standard has special design parameters which are different from those of other standards (and modes of the same standard). This design parameter diversity can be divided into two groups according to the processing stage: baseband design diversity and RF design diversity.

Baseband design diversity refers to the differences between a variety of standards in the baseband processing component, which in particular involves various coding schemes, for instance convolutional coding, turbo coding; and various modulation schemes, for example Quadrature Phase Shift Keying (QPSK), or Quadrature Amplitude Modulation (QAM). Baseband modulation diversity will be discussed in Chapter 6.

RF design diversity involves different transmission parameters, e.g. power, out of band emissions and sample rates, which are processed after the baseband processing component. However, it is a huge challenge to build an RF section which is able to cope with the RF design diversity to support a variety of standards or modes, because RF components, for example analogue filters, are not generally reconfigurable. In addition, analogue filters are very expensive and result in higher power consumption [38]–[41]. Therefore, a practical method of meeting the SDR requirements is to replace some analogue components with a digital equivalent, which is capable of providing more accurate functionality and flexible reconfiguration control at reasonable cost. This is referred to as the DFE [38] and [41]–[43].

## 4.2 Digital Front End Principle

The DFE is also referred to as the DUC in the transmitter and Digital Down Converter (DDC) in the receiver. The analysis in this thesis is based on the DUC, and the architecture for the transmitter front end is shown in Figure 4.1.



Figure 4.1: The Front End Architecture in the Transmitter.

In Figure 4.1, the front end in the transmitter chain is divided into two sections: DFE and Analogue Front End (AFE). The DFE performs channelisation digitally and the AFE is required to remove the spectral images after the DAC [46]. The interpolation from baseband to IF frequency is necessary and plays an important role in the DFE, as it relaxes the requirements of the analogue filter and reduces the cost and power consumption [47]. The higher the interpolation factor in the DFE, the more relaxed the requirements for the analogue filter after the DAC. Moreover, a higher IF sampling rate permits a larger IF carrier frequency range, as the IF carrier frequency can be digitally controlled between  $-f_{\rm IF}/2$  and  $f_{\rm IF}/2$ , as shown in Figure 4.2. Here  $f_{\rm IF}$  and  $f_{\rm BB}$  denote IF sampling rate and baseband sampling rate respectively. The selection of IF sampling rate will be discussed in detail in Section 4.5.



Figure 4.2: IF and Baseband Sampling Rates.

### 4.3 Digital Up Converter Architecture

The architecture of the DUC is described by Figure 4.3. The purposes of the DUC are to perform channelisation to control out of band emissions to meet the requirements of the standard specifications, to modulate the signal onto a carrier at the desired frequency, and to increase the sampling rate to the IF sampling rate, which is equivalent to the rate of the DAC [48].



Figure 4.3: Architecture of DUC.

The channel filter is employed for pulse shaping so that the out of band emissions can be controlled to achieve the requirement of the spectral emission mask. In the interpolation section, a set of FIR filters are used to remove the spectrum imaging effect produced when the sample rate is raised. The Direct Digital Synthesiser (DDS) component synthesises sine and cosine waves which modulate the interpolated I and Q data from baseband to IF.

This type of DUC architecture only requires one DAC, and performs quadrature modulation with the generated digital sine and cosine signals. Consequently, the imbalance between I and Q channel amplitude and phase, referred to as the I/Q mismatch, can be solved using a digital implementation so as to improve the modulation quality in the transmission chain [45]–[46] and [49]–[51]. Therefore, this DUC architecture is adopted to support multiple standards and modes in this research work.

## 4.4 Performance Metrics of Digital Up Converter

#### 4.4.1 Spectral Emission Mask

The Spectral Emission Mask (SEM) could be viewed as the most important of all the DUC performance criteria. The SEM is generally defined by the standards development organisation and is applied to shape the transmitter spectral emission in order to meet the out of band spectral emission requirements, and thus to ensure that an acceptably low level of interference is presented to users of adjacent frequency bands [52] [53]. The SEM is often presented graphically in terms of the Power Spectral Density (PSD), and the mask varies according to the different modes and standards.

### 4.4.2 Error Vector Magnitude

The Error Vector Magnitude (EVM) is a significant metric used to measure transmission signal quality. It quantifies the difference, which could be referred to as the error vectors, between the measured constellation points and the ideal reference constellation points. The definition of the EVM is shown in Figure 4.4. The reference signal is  $R_k$  and the measured signal is  $M_k$ . Therefore the error vector is  $E_k = M_k - R_k$ . From a mathematical perspective, the EVM is defined as the ratio of the Root Mean Square (RMS) value of a collection of measured error vectors to the RMS value of the ideal reference, as shown in (4.1) [55]–[60].

$$EVM_{RMS} = \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} ((I_{mi} - I_{ri})^{2} + (Q_{mi} - Q_{ri})^{2})}}{\sqrt{\frac{1}{N} \sum_{i=1}^{N} (I_{ri}^{2} + Q_{ri}^{2})}}. \tag{4.1}$$

Here  $I_m$  and  $Q_m$  are the in-phase and quadrature components of the measured signal, and  $I_r$  and  $Q_r$  are the in-phase and quadrature components of the ideal reference signal. N is the number of points to be calculated. In general, the EVM values are averaged over a large number of symbols and are expressed as a percentage.
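As an illustration of (4.1), the following Python sketch computes the RMS EVM for a set of complex constellation points, where the real and imaginary parts hold the I and Q components. The function name and the test data (a QPSK reference with a fixed 0.1 offset on the I component) are assumptions chosen purely for demonstration and are not measurement results.

```python
import math


def evm_rms(measured, reference):
    """RMS EVM per (4.1); the 1/N factors cancel between numerator and denominator."""
    if len(measured) != len(reference) or not reference:
        raise ValueError("measured and reference must be non-empty and of equal length")
    error_power = sum(abs(m - r) ** 2 for m, r in zip(measured, reference))
    reference_power = sum(abs(r) ** 2 for r in reference)
    return math.sqrt(error_power / reference_power)


# Hypothetical QPSK reference points and a measured set with an I offset of 0.1.
reference = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
measured = [r + 0.1 for r in reference]

print(f"EVM = {100 * evm_rms(measured, reference):.2f} %")
```

For this test data each error vector has magnitude 0.1 and each reference point has magnitude $\sqrt{2}$, so the result is $0.1/\sqrt{2}$, i.e. approximately 7.07%.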



Figure 4.4: EVM Definition.

A performance requirement based on EVM has become an important part of many standards, including WCDMA and LTE. In [61]–[65], the EVM measurement flow for OFDM signals is described, as depicted in Figure 4.5. There are a variety of factors affecting the EVM performance, for example RF analogue distortion, noise, and the Inter-Symbol Interference (ISI) introduced in the digital processing components [58].



Figure 4.5: DUC EVM Measurement Flow for OFDM-based Design.

In this thesis, the analysis of transmission quality impairment focuses on the DUC component rather than the entire transmitter system. In other words, the EVM caused by the DUC is the only factor considered. Hence the DUC component is implemented in fixed point and the DDC is implemented in floating point. The Additive White Gaussian Noise (AWGN) between the transmitter and receiver is removed because it would introduce impairments unrelated to those generated by the DUC. Since the sample rate of the input data is increased in the DUC component, the received signal has to be down converted to baseband. The input data for testing DUC performance is based on the OFDM structure, so after timing synchronisation the received signal is passed through the FFT component. Finally, the constellation plots of the measured signals and the reference signals are compared. The reference signals are obtained by applying the FFT to the input data. The EVM value is calculated according to (4.1).

#### 4.4.3 Adjacent Channel Leakage Ratio

The Adjacent Channel Leakage Ratio (ACLR) is defined for the LTE and WCDMA standards but is not included in the IEEE 802.16 and 802.11 specifications. The ACLR measures the power leakage from the main channel into the adjacent channels and is an important parameter used to establish whether two communication systems are capable of coexisting on adjacent frequencies [53] [66]. In other words, it is used to estimate the interference between the principal transmission channel and the neighbouring channels [67].

The LTE standard defines two coexistence scenarios: LTE–LTE and LTE–WCDMA. For LTE–LTE coexistence, the adjacent channels have the same bandwidth as the LTE bandwidth. Figure 4.6 shows the LTE–LTE coexistence when the LTE (10 MHz) bandwidth is selected. The 1st adjacent channels are at 10 MHz offset frequency with 10 MHz bandwidth. Similarly, the 2nd adjacent channels are at 20 MHz offset frequency with 10 MHz bandwidth. However, the adjacent channel bandwidth is fixed at 5 MHz regardless of LTE bandwidth in the case of the LTE–WCDMA scenario [58] [68]. For simplification, only the LTE–LTE coexistence scenario is studied to measure the ACLR values in the case of the LTE standard.

Figure 4.6: 1st and 2nd Adjacent Carrier for LTE-LTE Coexistence.
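The LTE–LTE adjacent channel positions and the ACLR itself can be sketched as below. The sketch assumes that the ACLR is expressed in dB as the ratio of the power in the main channel to the power in an adjacent channel, and that both powers have already been obtained by integrating the PSD over the respective channel; the function names and the example power values are hypothetical.

```python
import math


def adjacent_channel_edges_mhz(bandwidth_mhz: float, order: int):
    """Lower and upper edge (offset from the carrier) of the order-th upper adjacent channel."""
    centre = order * bandwidth_mhz
    return (centre - bandwidth_mhz / 2, centre + bandwidth_mhz / 2)


def aclr_db(main_channel_power: float, adjacent_channel_power: float) -> float:
    """ACLR in dB, assuming both powers are linear and integrated over their channels."""
    return 10 * math.log10(main_channel_power / adjacent_channel_power)


print(adjacent_channel_edges_mhz(10, 1))  # 1st adjacent channel for LTE (10 MHz)
print(adjacent_channel_edges_mhz(10, 2))  # 2nd adjacent channel for LTE (10 MHz)
print(f"{aclr_db(1.0, 1e-6):.1f} dB")      # illustrative powers only
```

The lower adjacent channels are the mirror images of these ranges about the carrier.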

## 4.5 **DUC Design Considerations**

In Sections 4.2 and 4.3, the concepts and architecture of the DUC were introduced while Section 4.4 defined performance metrics. In this section, some key design factors have to be considered before proceeding to the DUC designs themselves.

Every standard or mode has its own baseband sample rate, which is usually defined by the standard draft specifications, e.g., LTE, WCDMA, and IEEE 802.16. However, IEEE 802.11 is an exception in the sense that its baseband sample rate can be defined by the designer. A rate of 30 Million Samples Per Second (Msps) is selected for IEEE 802.11n (20 MHz) in this study [210]. The diagram of various sample rates for different standards and modes is shown in Figure 4.7. The basis of supporting multiple standards and modes is to process the data at the proper sample rate in the DUC component according to the requirements of the various standards.



Figure 4.7: Sample Rates for Various Standards and Modes.

Since the sample rate of each standard is fixed and cannot be altered, the selection of a reasonable IF sample rate and system clock plays an important role in the process of DUC design. In general, the IF sampling rate should be an integer multiple of the input sample rate, and FPGA system clock frequency should be an integer multiple of IF sampling rate. From the perspective of efficient implementation, a system clock frequency of at least double the IF sampling rate should be chosen to allow the filters to be time division multiplexed. Sharing hardware permits a reduction in FPGA implementation cost to be achieved, most significantly in terms of multipliers which are often implemented using dedicated resources, as reviewed in Chapter 2.

The purpose of the rest of this chapter is to build a number of DUC designs for various standards and modes. In addition, every DUC design needs to be optimised to the maximum extent in order that it can be implemented with the dynamic reconfiguration technologies reviewed in Chapter 3, to build an SDR system with high performance and low hardware resource utilisation. Therefore, the DUC designs should be highly cost-effective and provide good performance with respect to the criteria outlined in Section 4.4.

A system clock frequency of 4 times the IF sampling rate is chosen for reasons of implementation efficiency. It is important to note that the required clock frequencies differ according to the standards and modes: specifically, a 245.76 MHz clock is needed for LTE and WCDMA standards, while IEEE 802.16e requires either a 256 MHz clock (for 3.5 MHz and 7 MHz bandwidths), or a 358.4 MHz clock (for 5 MHz and 10 MHz bandwidths). In addition, IEEE 802.11n requires a 240 MHz clock frequency. Therefore, in total four different clock frequencies are needed to ensure the correct DUC implementations for the standards and modes considered. The DUC design parameters of the four standards are therefore as summarised in Table 4.1.

| Design                 | Input Sample Rate<br>(Msps) | IF Sample Rate<br>(Msps) | System Clock<br>(MHz) |
|------------------------|-----------------------------|--------------------------|-----------------------|
| LTE (10 MHz)           | 15.36                       | 61.44                    | 245.76                |
| LTE (5 MHz)            | 7.68                        | 61.44                    | 245.76                |
| WCDMA                  | 3.84                        | 61.44                    | 245.76                |
| IEEE 802.16e (10 MHz)  | 11.2                        | 89.6                     | 358.4                 |
| IEEE 802.16e (7 MHz)   | 8                           | 64                       | 256                   |
| IEEE 802.16e (5 MHz)   | 5.6                         | 89.6                     | 358.4                 |
| IEEE 802.16e (3.5 MHz) | 4                           | 64                       | 256                   |
| IEEE 802.11n (20 MHz)  | 30                          | 60                       | 240                   |

Table 4.1: DUC Design Parameters.
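The relationships behind Table 4.1 can be checked with a short Python sketch: the IF sampling rate must be an integer multiple of the input sample rate, and the system clock is 4 times the IF sampling rate. The sketch takes the LTE (10 MHz) input rate as 15.36 Msps, as stated in Section 4.6.1; only a subset of the designs is listed, and the names used are illustrative.

```python
from fractions import Fraction

CLOCK_TO_IF_RATIO = 4  # system clock = 4 x IF sampling rate, as chosen in the text

# (input sample rate in Msps, IF sampling rate in Msps), given as decimal strings
DESIGNS = {
    "LTE (10 MHz)": ("15.36", "61.44"),
    "WCDMA": ("3.84", "61.44"),
    "IEEE 802.16e (10 MHz)": ("11.2", "89.6"),
    "IEEE 802.16e (7 MHz)": ("8", "64"),
    "IEEE 802.11n (20 MHz)": ("30", "60"),
}


def derive(input_msps: str, if_msps: str) -> tuple:
    """Return (interpolation factor, system clock in MHz); reject non-integer factors."""
    factor = Fraction(if_msps) / Fraction(input_msps)
    if factor.denominator != 1:
        raise ValueError("IF sampling rate is not an integer multiple of the input rate")
    return int(factor), float(Fraction(if_msps) * CLOCK_TO_IF_RATIO)


for name, (fs_in, fs_if) in DESIGNS.items():
    factor, clock_mhz = derive(fs_in, fs_if)
    print(f"{name}: interpolate by {factor}, system clock {clock_mhz} MHz")
```

Exact rational arithmetic is used so that the integer-multiple check is not affected by floating-point rounding of values such as 61.44.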

The main reason for setting the system clock frequency to 4 times  $f_{IF}$  is to reduce the number of dedicated multipliers required. Based on the analysis of Figure 4.3, the complex outputs from the interpolation section are multiplied with the complex sinusoid by the mixer component, and thus the signal is translated to the IF. This process is expressed by (4.2).

$$(I + jQ) \times (\cos \omega n + j \sin \omega n) = (I \cos \omega n - Q \sin \omega n) + j(I \sin \omega n + Q \cos \omega n). \tag{4.2}$$

In general, 4 multipliers in total are required to perform (4.2). Since the system clock frequency is 4 times the IF sampling rate, four operations are available for every output sample. Therefore, only one dedicated multiplier, i.e. the DSP48E on the Virtex-5 LX 110T device, is required to perform the equation if the interpolated data and the DDS outputs are processed in the proper way, as described by Figure 4.8. The interpolated data (I and Q) and the outputs from the DDS ( $\cos \omega n$  and  $\sin \omega n$ ) are supplied to the DSP48E component through the "a" port and "b" port respectively. It performs a multiplication in the first operation, and a multiplication and subtraction in the second operation. Similarly, it performs a multiplication in the third operation, and a multiplication and accumulation in the fourth operation. Therefore, the results of the second and fourth operations are selected as the outputs of the I and Q channels respectively after the DUC processing.
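The four-operation schedule described above can be modelled behaviourally in Python. The sketch uses a single accumulator to stand in for the shared DSP48E; the pairing of operands in each operation is an assumption inferred from the description (the text specifies only the type of each operation), and the function name is hypothetical.

```python
import math


def mix_time_multiplexed(i: float, q: float, cos_w: float, sin_w: float):
    """Complex mixing with one multiplier reused over four operations per output sample."""
    acc = i * cos_w        # operation 1: multiply
    acc = acc - q * sin_w  # operation 2: multiply and subtract
    i_out = acc            # result of operation 2 is the I output
    acc = i * sin_w        # operation 3: multiply
    acc = acc + q * cos_w  # operation 4: multiply and accumulate
    q_out = acc            # result of operation 4 is the Q output
    return i_out, q_out


# Compare against a direct complex multiplication for an arbitrary sample and phase.
phase = 0.3
i_out, q_out = mix_time_multiplexed(0.5, -0.25, math.cos(phase), math.sin(phase))
direct = complex(0.5, -0.25) * complex(math.cos(phase), math.sin(phase))
print(abs(i_out - direct.real) < 1e-12, abs(q_out - direct.imag) < 1e-12)
```

Each of the four steps uses exactly one multiplication, which is why a single multiplier clocked at 4 times the IF sampling rate is sufficient.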



Figure 4.8: DSP48E Architecture for Complex Multiplication.

# 4.6 DUC Designs for LTE (10 MHz) and (5 MHz)

LTE is a wireless cellular standard with high data rates, low latency and high spectral efficiency, and is considered as 3.9G to replace the widely deployed 3G standards. LTE flexibly supports various bandwidths from 1.4 MHz to 20 MHz. In this thesis, the DUC designs for LTE are required to support 10 MHz and 5 MHz, a subset of all possible modes. The SEM requirements for LTE (10 MHz) and (5 MHz) are summarised in Table 4.2 according to [68] [70]. An attenuation of 20 dB is added as a safety margin for distortion caused by analogue components in the transmitter.  $|\Delta f|$  refers to the range and its value can be calculated by summing the values of IF carrier frequency and half of the bandwidth.

| $\lvert \Delta f \rvert$ in the range | Minimum Attenuation (dB) | with Safety Margin (dB) |
|---------------------------|--------------------------|-------------------------|
| 0 MHz to 1 MHz            | 40                       | 60                      |
| 1 MHz to 10 MHz           | 50                       | 70                      |
| >10 MHz                   | 52                       | 72                      |

Table 4.2: SEM Requirements for LTE (10 MHz) and (5 MHz) Bandwidth.

## 4.6.1 Filter Design Considerations

In the case of LTE (10 MHz), the sample rate should be increased from 15.36 Msps to 61.44 Msps and thus an interpolation factor of 4 is required. As discussed in Section 4.2, the channel filter is used for pulse shaping and the interpolation filters are used to remove the spectrum imaging effect produced when the sample rate is raised. In general, it is efficient to use a cascade of filters which each have a small rate change, instead of one filter (a single stage) with a large rate increment, which would be expensive and difficult to implement in practice [61] and [70]–[81]. Therefore, two options are considered for partitioning the sample rate factor:

• Option 1: a single rate channel filter followed by two interpolation filters, each of which up-samples by a factor of 2.

• Option 2: a channel filter that interpolates by a factor of 2, followed by one further filter that interpolates by a factor of 2.

The LTE signal is based on an OFDM structure, and the LTE OFDM symbol properties are listed in Table 4.3. Taking into account the LTE SEM requirements and the LTE OFDM symbol properties presented in Table 4.2 and Table 4.3 respectively, channel filter designs for two options in the case of LTE (10 MHz) are as summarised in Table 4.4.

| BW<br>(MHz) | F <sub>s</sub><br>(Msps) | FFT<br>Size | Used<br>Subcarriers | Subcarrier<br>Spacing (kHz) | Occupied<br>BW (MHz) |
|-------------|--------------------------|-------------|---------------------|-----------------------------|----------------------|
| 5           | 7.68                     | 512         | 300                 | 15                          | 4.5                  |
| 10          | 15.36                    | 1024        | 600                 | 15                          | 9                    |

Table 4.3: LTE OFDM Symbol Properties.

| Filter<br>Configuration | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | F <sub>stop</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | A <sub>stop</sub><br>(dB) | Taps |
|-------------------------|--------------------------|----------------------------|----------------------------|---------------------------|---------------------------|------|
| Option 1                | 15.36                    | 4.5                        | 5                          | 0.05                      | 60                        | 91   |
| Option 2                | 30.72                    | 4.5                        | 5                          | 0.05                      | 60                        | 181  |

Table 4.4: Two Channel Filter Designs for LTE (10 MHz).

The passband frequency, $F_{pass}$, is defined as half of the occupied bandwidth and the stopband frequency, $F_{stop}$, is defined as half of the total bandwidth due to the stringent SEM requirements (there is a very sharp transition band between 0 and 1 MHz within the $|\Delta f|$ range, with an attenuation of 60 dB in the stopband). The $A_{pass}$ parameter denotes the passband ripple while $A_{stop}$ refers to the stopband attenuation, and these are set to 0.05 dB and 60 dB respectively. These design parameters result in channel filters that require 91 and 181 taps for Options 1 and 2 respectively, both with a symmetric structure. Note that the filters are designed using FDATool in MATLAB (as are all other filter designs in this thesis). The channel filter design for Option 1 using FDATool, together with its design parameters, is illustrated in Figure 4.9.



Figure 4.9: Channel Filter Design using FDATool.
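As a rough cross-check of the tap counts in Table 4.4, the filter length can be estimated without running a design tool. The sketch below uses Kaiser's well-known approximation for equiripple low-pass FIR filters, which is a swapped-in estimate and not the FDATool procedure used in this thesis. It assumes that $A_{pass}$ is a peak-to-peak ripple; the function name is illustrative.

```python
import math

def kaiser_fir_length(fs, f_pass, f_stop, a_pass_db, a_stop_db):
    """Approximate length of an equiripple low-pass FIR (Kaiser's formula)."""
    # Convert the peak-to-peak passband ripple in dB to a linear deviation (assumption).
    r = 10 ** (a_pass_db / 20)
    delta_p = (r - 1) / (r + 1)
    # Convert the stopband attenuation in dB to a linear deviation.
    delta_s = 10 ** (-a_stop_db / 20)
    # Transition width normalised to the sample rate.
    transition = (f_stop - f_pass) / fs
    order = (-20 * math.log10(math.sqrt(delta_p * delta_s)) - 13) / (14.6 * transition)
    return order + 1  # length = order + 1

option1 = kaiser_fir_length(15.36, 4.5, 5.0, 0.05, 60)
option2 = kaiser_fir_length(30.72, 4.5, 5.0, 0.05, 60)
print(f"Option 1: about {option1:.1f} taps, Option 2: about {option2:.1f} taps")
```

The estimates land within a few taps of the 91 and 181 taps reported in Table 4.4. They also show why doubling the sample rate for the same absolute transition band roughly doubles the filter length.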

In the case of the interpolation filter, the Half Band (HB) filter is the natural choice when up-sampling by a factor of 2. The HB filter is a type of FIR filter where the transition region is centred at one quarter of the sampling rate. The advantage of the HB filter is that each odd coefficient is zero except the centre coefficient [82] [83], hence it requires less computation from a hardware implementation perspective. The HB filter designs for LTE (10 MHz) are described in Table 4.5. The passband cutoff frequency is set to 5 MHz, resulting in a modulated bandwidth of 10 MHz. The passband ripple is set to 0.003 dB, resulting in an out-of-band attenuation of around 80 dB. The first HB filter design for Option 1 using FDATool is illustrated in Figure 4.10.

Table 4.5: HB Filter Designs for LTE (10 MHz).

| Option   | HB<br>Position | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | Taps |
|----------|----------------|--------------------------|----------------------------|---------------------------|------|
| Option 1 | 1st HB         | 30.72                    | 5                          | 0.003                     | 27   |
|          | 2nd HB         | 61.44                    | 5                          | 0.003                     | 11   |
| Option 2 | 1st HB         | 61.44                    | 5                          | 0.003                     | 11   |



Figure 4.10: HB Filter Design using FDATool.
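The zero-coefficient property of the HB filter can be seen with a small numerical sketch. The example below builds an 11-tap half-band filter from a Hamming-windowed ideal response. This is an illustrative design method chosen for simplicity; it is not the FDATool equiripple design used for Table 4.5, and its coefficients will differ from those of the thesis filters.

```python
import math

def halfband_windowed_sinc(num_taps):
    """Hamming-windowed ideal half-band low-pass filter (odd length)."""
    centre = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - centre
        # Ideal half-band impulse response: 1/2 at the centre, sin(pi*k/2)/(pi*k) elsewhere.
        ideal = 0.5 if k == 0 else math.sin(math.pi * k / 2) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

taps = halfband_windowed_sinc(11)
# Indices of coefficients that are not numerically zero.
nonzero = [n for n, h in enumerate(taps) if abs(h) > 1e-12]
print(nonzero)
```

With zero-based indexing, every odd-indexed coefficient other than the centre tap vanishes, so those taps need no multiplier. Together with the coefficient symmetry, this is what makes the HB filter inexpensive in hardware.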

Based on the parameters given in Tables 4.4 and 4.5, the details of the filter designs for the two options are summarised in Table 4.6. Both options meet the SEM requirements, but the implementation results differ, in particular in the number of DSP48Es. In Option 1, the channel filter, often the most expensive of the filters, is implemented at single rate to maximise the Hardware Over-sampling Rate (HOR) of the FPGA filter implementation and thus reduce the number of DSP48Es required, because the HOR decreases as the sample rate increases.

| Filter<br>Configuration | F <sub>s</sub><br>(Msps) | Channel<br>Filter Taps | 1st HB<br>Taps | 2nd HB<br>Taps | IF<br>(MHz) | No.<br>DSP48Es |
|-------------------------|--------------------------|------------------------|----------------|----------------|-------------|----------------|
| Option 1                | 15.36                    | 91                     | 27             | 11             | 61.44       | 12             |
| Option 2                | 15.36                    | 181                    | 11             | None           | 61.44       | 16             |

Table 4.6: Implementation Results Comparison for LTE (10 MHz).
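The HOR argument can be made concrete with a few lines of arithmetic. The sketch below assumes that the HOR of a stage is the ratio of the system clock frequency to the sample rate listed for that stage in Tables 4.4 and 4.5, and that the system clock is 4 times the 61.44 Msps IF sample rate as described in Section 4.5. It does not attempt to predict the DSP48E counts, which depend on how the FIR Compiler maps each filter.

```python
SYSTEM_CLOCK_MHZ = 4 * 61.44  # system clock = 4 x IF sample rate (Section 4.5)

def hardware_oversampling_rate(sample_rate_msps, clock_mhz=SYSTEM_CLOCK_MHZ):
    """Clock cycles available per sample at the given rate (assumed HOR definition)."""
    return clock_mhz / sample_rate_msps

# Sample rates (Msps) per stage, taken from Tables 4.4 and 4.5.
option1 = {"channel filter": 15.36, "1st HB": 30.72, "2nd HB": 61.44}
option2 = {"channel filter": 30.72, "1st HB": 61.44}

for name, stages in (("Option 1", option1), ("Option 2", option2)):
    for stage, rate in stages.items():
        hor = hardware_oversampling_rate(rate)
        print(f"{name}, {stage}: HOR = {hor:.0f}")
```

Under these assumptions the Option 1 channel filter has twice as many clock cycles per sample as the Option 2 channel filter, while also having about half as many taps.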

The numbers of DSP48Es used for the channel filter, 1st HB filter and 2nd HB filter in Option 1 are 8, 2 and 2 respectively, giving a total of 12. In Option 2, by contrast, the channel filter interpolates by a factor of 2. The HOR is halved when the sample rate is increased by a factor of 2. Moreover, the number of filter taps doubles when the sample rate is doubled for a given set of channel filter design requirements. Therefore, the number of DSP48Es needed for the channel filter increases substantially, to 14. The 1st HB filter of Option 2 is the same as the 2nd HB filter of Option 1 and thus requires 2 DSP48Es. The total for Option 2 is therefore 16 DSP48Es. In summary, designing the channel filter at the low sampling rate permits more sharing of components and thus lowers the resource cost.

A comparison of results for Options 1 and 2 in terms of DSP48E usage is provided in Table 4.6. It is clear that Option 1 uses fewer DSP48Es while meeting the SEM requirements. Therefore, Option 1 has been selected as the DUC architecture for the LTE (10 MHz) implementation. Furthermore, this architecture can also be applied to other OFDM-based standards, for example LTE (5 MHz), IEEE 802.16e and IEEE 802.11n, all of which are considered in this thesis.

The magnitude response of each filter design for Option 1 is shown in Figure 4.11. The overall magnitude response of the cascade of the channel filter and the two HB filters for LTE (10 MHz) is shown in Figure 4.12. The response shows that the passband occupies 5 MHz, which results in a final modulated bandwidth of 10 MHz.



Figure 4.11: Magnitude Response of Each Filter: (a) Channel Filter; (b) 1st HB Filter; (c) 2nd HB Filter.



Figure 4.12: Overall LTE (10 MHz) DUC Filter Response.

## 4.6.2 DUC Architecture

Based on the analysis in Section 4.6.1, the architecture designed for the LTE (10 MHz) DUC is as shown in Figure 4.13. The entire DUC consists of three components: filtering, DDS and mixer. The filtering section is employed to increase the sample rate by a factor of 4 from 15.36 Msps to 61.44 Msps, and to remove the imaging effect generated in the process of interpolation. The purpose of the mixer is to perform complex multiplication between the interpolated data from the I and Q channels, and the complex sinusoid from the DDS component.

In order to make full use of the potential for hardware sharing discussed in Section 4.5, the data from the I and Q channels and the DDS outputs are up-sampled by a factor of 4 to reach the system clock frequency, which ensures that only one DSP48E is required to perform the complex multiplication. The result of the complex multiplication is then demultiplexed and down-sampled by a factor of 4 to reach the selected IF. In this study, all of the filters are implemented using Xilinx FIR Compiler 5.0; the channel and HB filters are configured to support the I and Q channels with time division multiplexing.



Figure 4.13: Architecture of LTE (10 MHz) DUC.

The DDS component is managed by the frequency control block, and is used to generate a tunable carrier frequency. In general, the clock frequency of the DDS is equivalent to the IF sample rate ($f_{IF}$), and thus the tunable range of the carrier frequency is between $-f_{IF}/2$ and $f_{IF}/2$. The Spurious-Free Dynamic Range (SFDR), defined as the ratio of the spectral peak to the strongest spur, is viewed as the most critical design criterion for the DDS component. In addition, the granularity of the complex sinusoid is set by the frequency resolution, which is also significant in DDS design [84].

In this study, the DDS component is configured using the Xilinx DDS Compiler 4.0. The SFDR is set to 107 dB with 0.25 Hz frequency resolution. The data width required to control the carrier with a desired frequency can be calculated according to (4.3) [85].

$$B = \left\lceil \log_2 \frac{f_{clk}}{\Delta f} \right\rceil. \tag{4.3}$$

In this case, the clock frequency of the DDS, $f_{clk}$, is 61.44 MHz and the frequency resolution is $\Delta f = 0.25$ Hz, and thus 28 bits in total are required for frequency control. Phase dithering is employed, which adds a randomising sequence to the accumulator output to suppress the structure of the phase truncation noise, and thus the DDS reaches an SFDR of 107 dB [84] [85]. The PSD of a DDS output at 15 MHz with phase dithering is illustrated in Figure 4.14.



Figure 4.14: DDS Output with 15 MHz ( $F_s = 61.44$  MHz).
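The word-length calculation in (4.3) can be checked numerically. The sketch below also derives the phase increment for a 15 MHz output using the standard phase accumulator relationship (output frequency = increment × clock frequency / $2^B$). The function names are illustrative and are not taken from the Xilinx DDS Compiler.

```python
import math

def phase_accumulator_bits(f_clk_hz, resolution_hz):
    """Accumulator width B from (4.3): ceil(log2(f_clk / delta_f))."""
    return math.ceil(math.log2(f_clk_hz / resolution_hz))

def phase_increment(f_out_hz, f_clk_hz, bits):
    """Integer phase increment that produces f_out from a B-bit accumulator."""
    return round(f_out_hz / f_clk_hz * 2 ** bits)

bits = phase_accumulator_bits(61.44e6, 0.25)
increment = phase_increment(15e6, 61.44e6, bits)
actual_resolution = 61.44e6 / 2 ** bits  # resolution actually achieved, in Hz

print(bits, increment, actual_resolution)
```

Because the width is rounded up to a whole number of bits, the achieved resolution is slightly finer than the requested 0.25 Hz.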

The DUC design for LTE (10 MHz) can be shared by the LTE (5 MHz) DUC, as their normalised passband and stopband requirements are identical. However, one additional HB filter is employed for the LTE (5 MHz) DUC, as an interpolation factor of 8 is required to move from the input sample rate to the selected IF sampling rate. The architecture of the LTE (5 MHz) DUC is depicted in Figure 4.15.



Figure 4.15: Architecture of LTE (5 MHz) DUC.
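The number of HB stages follows directly from the overall rate change when the channel filter runs at single rate and every HB filter interpolates by 2. The helper below is a minimal sketch of that bookkeeping; the function name is hypothetical, and the rates are those quoted for the two LTE modes.

```python
def half_band_stages(input_rate_msps, if_rate_msps):
    """Return (interpolation factor, number of x2 stages) for a power-of-two rate change."""
    ratio = if_rate_msps / input_rate_msps
    factor = round(ratio)
    # Reject rate changes that are not an exact power of two.
    if abs(ratio - factor) > 1e-9 or factor < 1 or factor & (factor - 1):
        raise ValueError("rate change is not a power of two")
    return factor, factor.bit_length() - 1

for name, rate in (("LTE (10 MHz)", 15.36), ("LTE (5 MHz)", 7.68)):
    factor, stages = half_band_stages(rate, 61.44)
    print(f"{name}: interpolate by {factor} using {stages} HB filters")
```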

## 4.6.3 Performance

### 4.6.3.1 Power Spectral Density of Transmitted Signals

The PSD results of the DUCs for LTE (10 MHz) and (5 MHz) are depicted in Figure 4.16 and Figure 4.17 respectively. The SEMs are represented according to Table 4.2. Since the SEM is symmetric about the centre frequency, only the range from the centre frequency to  $f_{\rm IF}/2$  is presented, and this applies for all of the PSD results in this thesis.

Figure 4.16 shows that the LTE signal with 10 MHz bandwidth has been translated to 15 MHz and meets the SEM (with safety margin). Similarly, Figure 4.17 confirms that the output spectrum of the LTE signal with 5 MHz bandwidth is also centred at 15 MHz and meets the SEM requirements.



Figure 4.16: DUC Transmission Spectrum of LTE (10 MHz).



Figure 4.17: DUC Transmission Spectrum of LTE (5 MHz).

### 4.6.3.2 Error Vector Magnitude

Based on the review in Section 4.4.2, the EVM is an important parameter for measuring the transmission quality, and thus it is necessary to calculate the EVM values of the DUC designs. The maximum EVM allowed by the draft specification is shown in Table 4.7 [53] [58] [67]. Note that these requirements refer to the total system EVM, including both the digital and analogue circuits in the transmitter chain.

| Modulation Scheme | Required EVM |
|-------------------|--------------|
| QPSK              | 17.5%        |
| 16-QAM            | 12.5%        |
| 64-QAM            | 8%           |

Table 4.7: EVM Requirements of LTE.

The input data is based on the OFDM structure with the 64-QAM modulation scheme and complies with the LTE standard. The output data of the DUC is captured and analysed to calculate the EVM value according to the EVM measurement flow discussed in Section 4.4.2. In the LTE signal structure, one sub-frame of 1 ms duration contains 14 OFDM symbols and one radio frame consists of ten sub-frames [86] [87]. The EVM is calculated over 10 LTE radio frames, so 1400 OFDM symbols are analysed using (4.1), where N is equal to $1400 \times 1024$. The received constellation plot for LTE (10 MHz) with 64-QAM is shown in Figure 4.18. The LTE (10 MHz) DUC has an EVM of 1.08%. Similarly, the EVM value for the LTE (5 MHz) DUC is 1.11%, calculated using (4.1) with N equal to $1400 \times 512$. Note that the EVM values obtained above are due only to the DUC component, without any transmission noise. The DUC is one of several components affecting the overall signal quality in the transmitter chain.



Figure 4.18: Received Constellation for LTE (10 MHz) with 64-QAM, without AWGN.
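The EVM calculation can be sketched in a few lines. Since (4.1) is defined earlier in the chapter, the example assumes it is the common RMS form, in which the error power summed over N constellation points is normalised by the reference power. The four-point reference constellation and the 1% gain error are made-up test data, not results from the thesis.

```python
import math

def evm_percent(received, reference):
    """RMS EVM in percent over paired constellation points (assumed form of (4.1))."""
    error_power = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    reference_power = sum(abs(s) ** 2 for s in reference)
    return 100 * math.sqrt(error_power / reference_power)

# Hypothetical test data: four ideal points and a 1% gain error on each.
reference = [complex(1, 1), complex(-1, 1), complex(-1, -1), complex(1, -1)]
received = [s * 1.01 for s in reference]

print(f"EVM = {evm_percent(received, reference):.2f}%")
```

In the measurement described above, `reference` would hold the ideal 64-QAM points and `received` the demodulated DUC output, with N equal to the number of symbols multiplied by the FFT size.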

### 4.6.3.3 Adjacent Channel Leakage Ratio

The ACLR defined by the standard draft is the ratio of the rectangular filtered mean power of the main channel to the rectangular filtered mean power of the adjacent channels. The rectangular filter bandwidths are 9 MHz and 4.5 MHz for LTE (10 MHz) and (5 MHz) respectively, as defined in [68] and [88]. Based on the analysis of Section 4.4.3, the 1st and 2nd adjacent channels are at offsets of 10 MHz and 20 MHz from the main channel. The ACLR values are measured by filtering the adjacent channels with the rectangular filter. For LTE (10 MHz), the calculated values of ACLR1 and ACLR2 are 76.45 dB and 81.29 dB respectively. Similarly, the values of ACLR1 and ACLR2 are 71.14 dB and 77.49 dB respectively in the case of LTE (5 MHz).

Taking into account the calculated EVM and ACLR values, the performance metrics of the DUC designs for LTE (10 MHz) and (5 MHz) are summarised in Table 4.8. Both of the DUC designs achieve good performance in terms of EVM, reaching approximately 1%, which leaves sufficient margin for degradations introduced by other components, e.g. Crest Factor Reduction (CFR) and analogue modules, to meet the overall EVM requirement of 8% for 64-QAM.

| Metric | Requirements | Requirements + Safety Margin | DUC Performance:<br>LTE (10 MHz) | DUC Performance:<br>LTE (5 MHz) |
|--------|--------------|------------------------------|----------------------------------|---------------------------------|
| EVM    | 8%           | None                         | 1.08%                            | 1.11%                           |
| ACLR1  | 45 dB        | 65 dB                        | 76.45 dB                         | 71.14 dB                        |
| ACLR2  | 45 dB        | 65 dB                        | 81.29 dB                         | 77.49 dB                        |

Table 4.8: DUCs of LTE (10 MHz) and (5 MHz) Performance Metrics.

One of the disadvantages of OFDM is its high Peak to Average Power Ratio (PAPR), which results in inefficient operation of the power amplifiers in the transmitter chain [89]–[91]. CFR is a cost-effective technique for reducing the PAPR of the OFDM signal in order to increase power amplifier efficiency. Because the CFR and analogue components may have a negative impact on the SEM and ACLR results, an attenuation of 20 dB is added as a safety margin, and these more stringent requirements are also met by the proposed LTE DUC designs. It may therefore be concluded that the DUC designs for LTE (10 MHz) and (5 MHz) achieve good performance in terms of SEM, EVM and ACLR measurements.

## 4.6.4 Implementation Results

In this thesis, all of the designs are implemented targeting the Virtex-5 LX110T device with the Xilinx ISE 12.4 software suite. Table 4.9 details the hardware resource usage of the DUC architectures for LTE (10 MHz) and (5 MHz) in terms of LUTs, FFs, DSP48Es, BRAMs and slices. The $f_{max}$ column shows that all of the modules meet the timing requirements defined according to the DUC design considerations. The phase dithered method is used in the DDS component to achieve an SFDR of 107 dB; as a result, the DDS component occupies a total of 9 BRAMs in both designs. The DUC for LTE (10 MHz) requires 684 slices, 13 DSP48Es and 9 BRAMs, and the equivalent for LTE (5 MHz) occupies 666 slices, 9 DSP48Es and 9 BRAMs.

| DUC          | LUTs | FFs  | Slices | DSP48Es | BRAMs | $f_{\rm max}$ (MHz) |
|--------------|------|------|--------|---------|-------|---------------------|
| LTE (10 MHz) | 1164 | 1639 | 684    | 13      | 9     | 407.2               |
| LTE (5 MHz)  | 1257 | 1783 | 666    | 9       | 9     | 430.8               |

Table 4.9: Hardware Utilization of DUCs for LTE (10 MHz) and (5 MHz).

# 4.7 DUC Designs for IEEE 802.16e

IEEE 802.16e can be considered one of the most popular variants in the IEEE 802.16 standard family. Its mobility support is significantly improved compared to other variants in the family, such as IEEE 802.16a and IEEE 802.16d, and thus it is also viewed as one of the 3.9G wireless cellular standards [24]. The IEEE 802.16e standard supports the following channel bandwidths: 1.75 MHz, 3.5 MHz, 7 MHz, 14 MHz, 1.25 MHz, 5 MHz, 10 MHz, 15 MHz and 8.75 MHz [92]. A subset of these bandwidths, namely 10 MHz, 5 MHz, 7 MHz and 3.5 MHz, is considered in this thesis as they are commonly used.

The standard also defines several "Zone Types" relating to the data, e.g. Partial Usage of Subchannels (PUSC), Full Usage of Subchannels (FUSC) and Optional FUSC. The subcarrier parameters for different zone types are listed in Table 4.10 when the total subcarriers are set to 1024 and 512 [93]–[95]. Based on Table 4.10, it is obvious that the Optional FUSC type has the highest ratio of used subcarriers to total subcarriers, and thus is the most challenging for meeting the SEM requirements in the process of DUC filter designs. Therefore, Optional FUSC is selected to cater for the DUC designs for IEEE 802.16e (10 MHz, 7 MHz, 5 MHz and 3.5 MHz).

| Zone Type                          | PUSC   | PUSC   | FUSC   | FUSC   | Optional FUSC | Optional FUSC |
|------------------------------------|--------|--------|--------|--------|---------------|---------------|
| Total<br>Subcarriers               | 1024   | 512    | 1024   | 512    | 1024          | 512           |
| Used<br>Subcarriers                | 841    | 421    | 851    | 427    | 865           | 433           |
| Guard<br>Subcarriers (left, right) | 92, 91 | 46, 45 | 87, 86 | 43, 42 | 80, 79        | 40, 39        |
| Ratio (Used/Total)                 | 0.821  | 0.822  | 0.831  | 0.834  | 0.845         | 0.846         |

Table 4.10: A Subset of Subcarrier Parameters for Different OFDMA Zone Type.

## 4.7.1 DUC Designs for IEEE 802.16e (10 MHz) and (5 MHz)

According to [54] and Table 4.1, the SEM requirements of IEEE 802.16e (10 MHz) and (5 MHz) bandwidth are listed in Table 4.11. An attenuation of 20 dB is added as a safety margin to the requirements to achieve good performance for the DUC design. The SEM requirement is illustrated in Figure 4.19.

Table 4.11: SEM Requirements of IEEE 802.16e (10 MHz) and (5 MHz) Bandwidth.

|                       | Zero  | Point A   | Point B   | Point C   | Point D   |
|-----------------------|-------|-----------|-----------|-----------|-----------|
| 10 MHz<br>Bandwidth   | 0 MHz | 4.75 MHz  | 5.45 MHz  | 9.75 MHz  | 14.75 MHz |
| 5 MHz<br>Bandwidth    | 0 MHz | 2.375 MHz | 2.725 MHz | 4.875 MHz | 7.375 MHz |
| Requirements          | 0 dB  | 0 dB      | -25 dB    | -32 dB    | -50 dB    |
| With<br>Safety Margin | 0 dB  | 0 dB      | -45 dB    | -52 dB    | -70 dB    |



Figure 4.19: SEM Requirements with Safety Margin for IEEE 802.16e (10 MHz) and (5 MHz).
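A spectrum can be checked against the mask programmatically. The sketch below encodes the 10 MHz break points with the safety margin from Table 4.11 and assumes that the limit varies linearly in dB between adjacent points and stays flat beyond Point D. That interpolation rule is an assumption made for illustration and should be confirmed against the standard. The function names and the example spectrum are hypothetical.

```python
# (offset from the carrier in MHz, limit in dB) for the 10 MHz mode with safety margin
MASK_10MHZ = [(0.0, 0.0), (4.75, 0.0), (5.45, -45.0), (9.75, -52.0), (14.75, -70.0)]

def mask_limit_db(offset_mhz, points=MASK_10MHZ):
    """Mask limit at a frequency offset, linearly interpolated in dB (assumed rule)."""
    offset = abs(offset_mhz)  # the mask is symmetric about the carrier
    if offset >= points[-1][0]:
        return points[-1][1]
    for (f0, a0), (f1, a1) in zip(points, points[1:]):
        if f0 <= offset <= f1:
            return a0 + (a1 - a0) * (offset - f0) / (f1 - f0)

def meets_mask(psd, points=MASK_10MHZ):
    """True if every (offset in MHz, level in dB) pair lies on or below the mask."""
    return all(level <= mask_limit_db(offset, points) for offset, level in psd)

# Hypothetical spectrum samples relative to the in-band level.
example_psd = [(0.0, 0.0), (4.0, -0.1), (6.0, -60.0), (12.0, -80.0)]
print(meets_mask(example_psd))
```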

As discussed in Section 4.6.1, operating the channel filter at single rate can result in a reduction in terms of DSP48Es. Therefore, the architecture of the designed IEEE 802.16e (10 MHz) DUC is described in Figure 4.20. The input sample rate is 11.2 Msps and the signal is interpolated by a factor of 8 to 89.6 Msps. The DDS part is under the control of the frequency control block and used to generate the desired carrier frequency. As for the LTE design, both the input data and the DDS outputs are upsampled by a factor of 4 to reach the system clock frequency in order to ensure that only one DSP48E is required to perform complex multiplication in the mixer component. The output sample rate of the mixer is then down-sampled by a factor of 4 to reach the selected IF sampling rate.



Figure 4.20: Architecture of IEEE 802.16e (10 MHz) DUC.

Taking into account the choice of Optional FUSC data, the OFDM symbol properties of the IEEE 802.16e (10 MHz) and (5 MHz) are as summarised in Table 4.12. As a result, the channel filter design considerations for IEEE 802.16e (10 MHz) are as detailed in Table 4.13.

| BW<br>(MHz) | F <sub>s</sub><br>(Msps) | FFT<br>Size | Used<br>Subcarriers | Subcarrier<br>Spacing (kHz) | Occupied<br>BW (MHz) |
|-------------|--------------------------|-------------|---------------------|-----------------------------|----------------------|
| 5           | 5.6                      | 512         | 433                 | 10.94                       | 4.737                |
| 10          | 11.2                     | 1024        | 865                 | 10.94                       | 9.463                |

Table 4.12: IEEE 802.16e (10 MHz) and (5 MHz) OFDM Symbol Properties.

 Table 4.13: Channel Filter Design Parameters for IEEE 802.16e (10 MHz).

|                   | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | F <sub>stop</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | A <sub>stop</sub><br>(dB) | Taps |
|-------------------|--------------------------|----------------------------|----------------------------|---------------------------|---------------------------|------|
| Channel<br>Filter | 11.2                     | 4.7315                     | 5.2685                     | 0.05                      | 60                        | 62   |

The passband frequency is defined as half of the occupied bandwidth. Since the PSD of IEEE 802.16e (10 MHz) must be attenuated by 45 dB between 4.75 MHz and 5.45 MHz, the transition is not as sharp as that of LTE (10 MHz) and (5 MHz), which requires an attenuation of 60 dB. The stopband frequency can therefore be defined as $BW_{total} - BW_{occupied}/2$, which reduces the number of filter taps required compared to the LTE channel filter. The passband ripple and stopband attenuation are set to 0.05 dB and 60 dB respectively, and thus a total of 62 taps with a symmetric structure are needed for the channel filter.
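The edge frequencies in Table 4.13 can be reproduced from the OFDM symbol properties in Table 4.12. The sketch below takes the occupied bandwidth as the number of used subcarriers multiplied by the rounded subcarrier spacing of 10.94 kHz quoted in the table; the function name is illustrative.

```python
def channel_filter_edges(total_bw_mhz, used_subcarriers, spacing_khz):
    """Return (occupied bandwidth, F_pass, F_stop) in MHz."""
    occupied_bw_mhz = used_subcarriers * spacing_khz / 1000
    f_pass = occupied_bw_mhz / 2       # half of the occupied bandwidth
    f_stop = total_bw_mhz - f_pass     # BW_total - BW_occupied / 2
    return occupied_bw_mhz, f_pass, f_stop

occupied, f_pass, f_stop = channel_filter_edges(10, 865, 10.94)
print(f"occupied = {occupied:.3f} MHz, F_pass = {f_pass:.4f} MHz, F_stop = {f_stop:.4f} MHz")
```

The transition band obtained this way is slightly wider than the 0.5 MHz used for the LTE channel filter and sits at a lower sample rate, which is consistent with the smaller tap count.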

For the interpolation filtering, it is necessary to split the rate change into multiple stages, each with an interpolation factor of 2, according to Section 4.6.1. The design considerations and parameters of each HB filter are given in Table 4.14. The filter designs for IEEE 802.16e (10 MHz) can be shared by IEEE 802.16e (5 MHz), as they have identical normalised passband and stopband frequencies. In addition, another HB filter is required for IEEE 802.16e (5 MHz) to reach the IF sampling frequency.

| Bandwidth | HB<br>Position | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | Taps |
|--------|----------------|--------------------------|----------------------------|----------------------------|------|
|        | 1st HB         | 22.4                     | 5                          | 0.003                      | 83   |
| 10 MHz | 2nd HB         | 44.8                     | 5                          | 0.003                      | 15   |
|        | 3rd HB         | 89.6                     | 5                          | 0.003                      | 11   |
| 5 MHz  | 4th HB         | 89.6                     | 2.5                        | 0.003                      | 7    |

Table 4.14: HB Filter Designs for IEEE 802.16e (10 MHz) and (5 MHz).

The DUC architecture for IEEE 802.16e (5 MHz) is shown in Figure 4.21. Note that the channel filter and three HB filters can be shared with those of IEEE 802.16e (10 MHz). The additional HB filter is integrated as the final stage of the interpolation filter chain to provide a sample rate change from 5.6 Msps to the selected 89.6 Msps.



Figure 4.21: Architecture of IEEE 802.16e (5 MHz) DUC.

## 4.7.2 DUC Performance for IEEE 802.16e (10 MHz) and (5 MHz)

### 4.7.2.1 Power Spectral Density of Transmitted Signals

The PSD results obtained from simulation of the DUCs for IEEE 802.16e (10 MHz) and (5 MHz) are depicted in Figure 4.22 and Figure 4.23 respectively, where the SEMs are obtained according to Table 4.11. Figure 4.22 shows that the signal has been modulated up to 15 MHz and meets the SEM. Similarly, Figure 4.23 confirms that the output spectrum of the data is centred at 15 MHz and meets the SEM requirements.



Figure 4.22: DUC Transmission Spectrum of IEEE 802.16e (10 MHz).



Figure 4.23: DUC Transmission Spectrum of IEEE 802.16e (5 MHz).

### 4.7.2.2 Error Vector Magnitude

The term Relative Constellation Error (RCE) represented in dB is another concept used to measure the transmission quality in some standards, e.g. IEEE 802.16e and 802.11n. Moreover, the values of EVM and RCE are related [96]. The equation which links them is shown in (4.4) [97]. Therefore, the RCE values can be obtained through the EVM calculation.

$$RCE(dB) = 20\log_{10} \frac{EVM}{100}. \tag{4.4}$$

The input data contains 1000 OFDM symbols modulated with a 16-QAM scheme and is based on the Optional FUSC type with 1024 FFT and 512 FFT for IEEE 802.16e (10 MHz) and (5 MHz) respectively. The EVM values of the DUCs for IEEE 802.16e (10 MHz) and (5 MHz) are 0.99% and 1.01% respectively, according to (4.1), where N is equal to $1000 \times 1024$ and $1000 \times 512$ respectively. As a result, the RCE values are -40.09 dB and -39.91 dB respectively according to (4.4). Both RCE values are below the -25 dB limit defined by the IEEE 802.16e standard for 16-QAM, and hence they meet the RCE requirements [95]. The received constellation plot for the IEEE 802.16e (10 MHz) DUC is shown in Figure 4.24.

Figure 4.24: Received Constellation for IEEE 802.16e (10 MHz) with 16-QAM, without AWGN.
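The conversion in (4.4) is easy to verify numerically. The short sketch below applies it to the two EVM values reported for the IEEE 802.16e (10 MHz) and (5 MHz) DUCs; the function name is illustrative.

```python
import math

def rce_db(evm_percent):
    """Relative Constellation Error in dB from an EVM given in percent, per (4.4)."""
    return 20 * math.log10(evm_percent / 100)

for evm in (0.99, 1.01):
    print(f"EVM {evm:.2f}% -> RCE {rce_db(evm):.2f} dB")
```

Rounded to two decimal places, the results agree with the -40.09 dB and -39.91 dB quoted above.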

## 4.7.3 DUC Designs for IEEE 802.16e (7 MHz) and (3.5 MHz)

There are several system types, e.g. A, B to G, defined by the IEEE 802.16e (7 MHz) and (3.5 MHz) standard draft, among which system type G has the most stringent SEM requirements. The SEM requirements of system type G, with an attenuation of 20 dB added as a safety margin, are listed in Table 4.15 and illustrated in Figure 4.25 [98].

|                       | Zero  | Point A  | Point B  | Point C | Point D | Point E | Point F  |
|-----------------------|-------|----------|----------|---------|---------|---------|----------|
| 7 MHz<br>Bandwidth    | 0 MHz | 3.5 MHz  | 3.5 MHz  | 5.0 MHz | 7.4 MHz | 14 MHz  | 17.5 MHz |
| 3.5MHz<br>Bandwidth   | 0 MHz | 1.75 MHz | 1.75 MHz | 2.5 MHz | 3.7 MHz | 7 MHz   | 8.75 MHz |
| Require-<br>ment      | 0 dB  | 0 dB     | -8 dB    | -32 dB  | -38 dB  | -50 dB  | -50 dB   |
| With Safety<br>Margin | 0 dB  | 0 dB     | -28 dB   | -52 dB  | -58 dB  | -70 dB  | -70 dB   |

Table 4.15: SEM Requirements of Type G for IEEE 802.16e (7 MHz) and (3.5 MHz) Bandwidth.



Figure 4.25: SEM Requirements with Safety Margin of Type G for IEEE 802.16e (7 MHz) and (3.5 MHz).

The architecture of DUC for IEEE 802.16e (7 MHz) is depicted in Figure 4.26. The input data is interpolated by a factor of 8, from 8 Msps to 64 Msps, and thus three HB filters are needed in cascade. Similar to previous designs, the sample rates of the output data from the filter and DDS components have to be increased to the system clock rate so as to make full use of the DSP48E in the complex multiplication.



Figure 4.26: Architecture of IEEE 802.16e (7 MHz) DUC.

For Optional FUSC data, the OFDM symbol properties of IEEE 802.16e (7 MHz) and (3.5 MHz) are as summarised in Table 4.16. Therefore, the channel filter design considerations for IEEE 802.16e (7 MHz) are as stated in Table 4.17.

| BW<br>(MHz) | F <sub>s</sub><br>(Msps) | FFT<br>Size | Used<br>Subcarriers | Subcarrier<br>Spacing (kHz) | Occupied<br>BW (MHz) |
|-------------|--------------------------|-------------|---------------------|-----------------------------|----------------------|
| 3.5         | 4                        | 512         | 433                 | 7.81                        | 3.382                |
| 7           | 8                        | 1024        | 865                 | 7.81                        | 6.756                |

Table 4.16: IEEE 802.16e (7 MHz) and (3.5 MHz) OFDM Symbol Properties.

 Table 4.17: Channel Filter Designs for IEEE 802.16e (7 MHz).

|                   | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | F <sub>stop</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | A <sub>stop</sub><br>(dB) | Taps |
|-------------------|--------------------------|----------------------------|----------------------------|---------------------------|---------------------------|------|
| Channel<br>Filter | 8                        | 3.378                      | 3.622                      | 0.05                      | 60                        | 97   |

Similarly, the passband frequency is defined as half of the occupied bandwidth and the stopband frequency is defined as $BW_{total} - BW_{occupied}/2$ to reduce the number of taps required for the filter. The passband ripple and stopband attenuation are set to 0.05 dB and 60 dB respectively, and thus a total of 97 taps with a symmetric structure are needed for the channel filter.

For the interpolation filters, it is necessary to split the interpolation factor into multiple stages, each with an interpolation factor of 2, according to the analysis in Section 4.6.1. The filter designs for the IEEE 802.16e (7 MHz) DUC can be shared by the IEEE 802.16e (3.5 MHz) design, as they have identical normalised passband and stopband frequency requirements. However, another HB filter is required for IEEE 802.16e (3.5 MHz) to reach the same IF frequency. The design considerations of each HB filter are given in Table 4.18.

| Bandwidth | HB<br>Position | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | Taps |
|-----------|----------------|--------------------------|----------------------------|----------------------------|------|
|           | 1st HB         | 16                       | 3.5                        | 0.003                      | 71   |
| 7 MHz     | 2nd HB         | 32                       | 3.5                        | 0.003                      | 15   |
|           | 3rd HB         | 64                       | 3.5                        | 0.003                      | 11   |
| 3.5 MHz   | 4th HB         | 64                       | 1.75                       | 0.003                      | 7    |

Table 4.18: HB Filter Designs for IEEE 802.16e (7 MHz) and (3.5 MHz).

The architecture of the IEEE 802.16e (3.5 MHz) DUC is shown in Figure 4.27. The channel filter and three HB filters can be shared with those of IEEE 802.16e (7 MHz). The additional HB filter is integrated as the final stage of the interpolation filters to achieve an overall sample rate increase from 4 Msps to 64 Msps.



Figure 4.27: Architecture of IEEE 802.16e (3.5 MHz) DUC.

## 4.7.4 DUC Performance for IEEE 802.16e (7 MHz) and (3.5 MHz)

## 4.7.4.1 Power Spectral Density of Transmitted Signals

The PSD results of the DUCs for IEEE 802.16e (7 MHz) and (3.5 MHz) are depicted in Figure 4.28 and Figure 4.29 respectively. The spectra show that IEEE 802.16e (7 MHz) and (3.5 MHz) signals have been translated to 15 MHz and meet the SEM requirements.



Figure 4.28: DUC Transmission Spectrum of IEEE 802.16e (7 MHz).



Figure 4.29: DUC Transmission Spectrum of IEEE 802.16e (3.5 MHz).

## 4.7.4.2 Error Vector Magnitude

The data used to test the IEEE 802.16e (7 MHz) and (3.5 MHz) DUCs has the same structure as that used for the IEEE 802.16e (10 MHz) and (5 MHz) DUCs, as reviewed in Table 4.10. Using the calculation methods detailed previously, the EVM values of the DUCs for IEEE 802.16e (7 MHz) and (3.5 MHz) with 16-QAM modulation are 1.00% and 1.01% respectively, according to (4.1), where N is equal to $1000 \times 1024$ and $1000 \times 512$ respectively. The corresponding RCE values are -40.00 dB and -39.91 dB. As a result, the DUC designs are demonstrated to meet the requirements defined by the standard draft. The received constellation plot for IEEE 802.16e (7 MHz) is shown in Figure 4.30.


Figure 4.30: Received Constellation for IEEE 802.16e (7 MHz) with 16-QAM, without AWGN.

## 4.7.5 Implementation Results

Table 4.19 provides the hardware resources required for the IEEE 802.16e DUC designs. The DUC of IEEE 802.16e (7 MHz) occupies the most DSP48Es and the DUC of IEEE 802.16e (3.5 MHz) uses the most slices, reaching 12 and 846 respectively. The maximum frequency is analysed to ensure that all of the DUC designs can meet the timing requirements for the system (358.4 MHz for IEEE 802.16e (10 MHz) and (5 MHz), and 256 MHz for IEEE 802.16e (7 MHz) and (3.5 MHz)).

| DUC               | LUTs | FFs  | Slices | DSP48Es | BRAMs | $f_{\rm max}$ (MHz) |
|-------------------|------|------|--------|---------|-------|---------------------|
| 802.16e (10 MHz)  | 1357 | 1818 | 599    | 11      | 9     | 387.1               |
| 802.16e (7 MHz)   | 1411 | 1926 | 804    | 12      | 9     | 433.3               |
| 802.16e (5 MHz)   | 1407 | 2041 | 819    | 6       | 9     | 396.4               |
| 802.16e (3.5 MHz) | 1453 | 2066 | 846    | 8       | 9     | 406.2               |

**Table 4.19:** Hardware Utilisation of DUCs for IEEE 802.16e.

## 4.8 DUC Design for IEEE 802.11n

The IEEE 802.11 standard family is the best-known WLAN standard family and has a number of variants. IEEE 802.11n incorporates MIMO techniques, and is an amendment to the IEEE 802.11 variants proposed before 2008. It has two defined channel bandwidths: 20 MHz and 40 MHz. This thesis focuses only on the 20 MHz bandwidth mode.

## 4.8.1 DUC Design for IEEE 802.11n (20 MHz)

The SEM requirements of the IEEE 802.11n (20 MHz) variant defined by the standard specification are listed in Table 4.20. As for the other standards considered, an attenuation of 20 dB is added as a safety margin [99].

|                    | Zero  | Point A | Point B | Point C | Point D |
|--------------------|-------|---------|---------|---------|---------|
|                    | 0 MHz | 9 MHz   | 11 MHz  | 20 MHz  | 30 MHz  |
| Requirements       | 0 dB  | -20 dB  | -28 dB  | -45 dB  | -45 dB  |
| With Safety Margin | 0 dB  | -40 dB  | -48 dB  | -65 dB  | -65 dB  |

Table 4.20: SEM Requirements for IEEE 802.11n (20 MHz) Bandwidth.

IEEE 802.11n data is also based on an OFDM structure, and the OFDM symbol properties are listed in Table 4.21, according to the standard draft. The FFT size is fixed to 64, and 56 subcarriers are used (52 for data and 4 for pilot carriers). The subcarrier spacing is 312.5 kHz and therefore the occupied bandwidth is 17.5 MHz.

| BW (MHz) | F <sub>s</sub> (Msps) | FFT Size | Used Subcarriers | Subcarrier Spacing (kHz) | Occupied BW (MHz) |
|----------|-----------------------|----------|------------------|--------------------------|-------------------|
| 20       | 30                    | 64       | 56               | 312.5                    | 17.5              |

Table 4.21: IEEE 802.11n (20 MHz) OFDM Symbol Properties.

The sample rate is interpolated by a factor of 2 in the IEEE 802.11n DUC design. As a result, only one HB filter is required. For the channel filter, the passband frequency is defined as half of the occupied bandwidth, and the stopband frequency is defined as $BW_{total} - BW_{occupied}/2$. The passband ripple and stopband attenuation are set to 0.05 dB and 60 dB respectively, resulting in a symmetric filter of 36 taps. For the HB filter, the passband cutoff frequency is defined as half of the total bandwidth, and the passband ripple is set to 0.003 dB, resulting in a filter design of 27 taps. The filter designs are listed in Table 4.22 and the architecture of the IEEE 802.11n (20 MHz) DUC is shown in Figure 4.31.
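
The channel filter band edges described above follow directly from the total and occupied bandwidths. The short Python sketch below reproduces the IEEE 802.11n (20 MHz) values of Table 4.22; the function name `channel_filter_edges` is introduced here purely for illustration and is not part of the original design flow.

```python
def channel_filter_edges(bw_total_mhz, bw_occupied_mhz):
    """Return (f_pass, f_stop) in MHz for the channel filter.

    f_pass is half of the occupied bandwidth and
    f_stop is BW_total - BW_occupied / 2, as defined in the text.
    """
    f_pass = bw_occupied_mhz / 2.0
    f_stop = bw_total_mhz - bw_occupied_mhz / 2.0
    return f_pass, f_stop


# IEEE 802.11n (20 MHz): 56 used subcarriers spaced 312.5 kHz apart.
bw_occupied = 56 * 312.5e-3  # occupied bandwidth in MHz
f_pass, f_stop = channel_filter_edges(20.0, bw_occupied)

print(f"occupied BW = {bw_occupied} MHz")
print(f"F_pass = {f_pass} MHz, F_stop = {f_stop} MHz")
print(f"transition width = {f_stop - f_pass} MHz")
```

Placing the stopband edge at $BW_{total} - BW_{occupied}/2$ rather than at $BW_{total}/2$ widens the transition band, which is what allows the tap count to be reduced.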

Table 4.22: Filter Designs for IEEE 802.11n (20 MHz).

|                   | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | F <sub>stop</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | A <sub>stop</sub><br>(dB) | Taps |
|-------------------|--------------------------|----------------------------|----------------------------|---------------------------|---------------------------|------|
| Channel<br>Filter | 30                       | 8.75                       | 11.25                      | 0.05                      | 60                        | 36   |
| HB Filter         | 60                       | 10                         | None                       | 0.003                     | None                      | 27   |



Figure 4.31: Architecture of IEEE 802.11n (20 MHz) DUC.

## 4.8.2 DUC Performance for IEEE 802.11n (20 MHz)

## 4.8.2.1 Power Spectral Density of Transmitted Signals

The PSD result of the DUC for IEEE 802.11n (20 MHz) is depicted in Figure 4.32, where the SEM is defined according to Table 4.20. Figure 4.32 shows that the signal is centred at 15 MHz and meets the SEM.



Figure 4.32: DUC Transmission Spectrum of IEEE 802.11n (20 MHz).

## 4.8.2.2 Error Vector Magnitude

The maximum RCE value permitted by the standard is -28 dB when 64-QAM modulation is used. The input data, based on 1000 OFDM symbols, complies with the specifications of IEEE 802.11n (20 MHz). As before, the EVM value of the IEEE 802.11n DUC is 1.05% according to (4.1), where N is equal to $1000 \times 64$. The corresponding RCE value is -39.58 dB according to (4.4). Therefore, the DUC design of IEEE 802.11n meets the RCE requirements defined by the standard specifications. The EVM performance of the IEEE 802.11n DUC with 64-QAM modulation is illustrated in Figure 4.33.
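
The relationship between the quoted EVM and RCE figures can be checked numerically. The sketch below assumes that (4.4) takes the usual form $RCE = 20\log_{10}(EVM)$ with EVM expressed as a ratio; this assumption is consistent with the values reported in this chapter, but the equation itself is not reproduced in this section. The helper name `evm_percent_to_rce_db` is illustrative only.

```python
import math


def evm_percent_to_rce_db(evm_percent):
    """Convert an EVM in percent to an RCE in dB (assumed form of (4.4))."""
    return 20.0 * math.log10(evm_percent / 100.0)


# EVM values reported in this chapter.
reported = [
    ("IEEE 802.16e (7 MHz), 16-QAM", 1.00),
    ("IEEE 802.16e (3.5 MHz), 16-QAM", 1.01),
    ("IEEE 802.11n (20 MHz), 64-QAM", 1.05),
]

for label, evm in reported:
    print(f"{label}: EVM = {evm:.2f}%  ->  RCE = {evm_percent_to_rce_db(evm):.2f} dB")

# Margin of the 802.11n result against the -28 dB limit for 64-QAM.
margin_db = -28.0 - evm_percent_to_rce_db(1.05)
print(f"802.11n margin to the -28 dB limit: {margin_db:.2f} dB")
```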



Figure 4.33: Received Constellation for IEEE 802.11n (20 MHz) with 64-QAM, without AWGN.

## 4.9 DUC Design for WCDMA

The signals of the three standards (LTE, IEEE 802.16e, 802.11n) discussed earlier in this chapter are based on the OFDM structure. However, the WCDMA standard is somewhat different, in the sense that the baseband signals are not based on an OFDM structure. The bandwidth of WCDMA is fixed to 5 MHz and the SEM requirements are as described in Table 4.23 with an attenuation of 20 dB as a safety margin [71].

Table 4.23: SEM Requirements for WCDMA.

| $\lvert \Delta f \rvert$ in the range | Minimum Attenuation (dB) | with Safety Margin (dB) |
|---------------------------|--------------------------|-------------------------|
| 2.5 MHz to 2.7 MHz        | 36                       | 56                      |
| 2.7 MHz to 3.5 MHz        | 36-48                    | 56-68                   |
| 3.5 MHz to 12 MHz         | 50                       | 70                      |

## 4.9.1 Filter Design Considerations

Unlike the three OFDM-based standards, the WCDMA standard defines the type and the parameters of the channel filter. The Square Root Raised Cosine (SRRC) filter with roll-off  $\alpha$ =0.22 is specified for pulse shaping in the transmitter chain [100]–[102]. The SRRC filter in the transmitter is usually implemented to match an equivalent SRRC filter in the receiver, and thus to implement the Raised Cosine response over the link. The Raised Cosine response meets the Nyquist criterion and has the property of zero-ISI. The SRRC filter also has the benefit of constraining transmission bandwidth as compared to the rectangular pulse shaping filter [103].

In general, the SRRC filter is implemented with interpolation factors of 2, 4 or 8 [104]–[106]. In this example, an interpolation factor of 2 is chosen. Since the overall sample rate is increased from 3.84 Msps to 61.44 Msps, one SRRC filter and three HB filters are required in cascade to perform interpolation, giving a total interpolation ratio of 16. The architecture of the designed WCDMA DUC is illustrated in Figure 4.34.
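
The stage count above can be verified with a few lines of Python. This is a minimal sketch that assumes every stage interpolates by exactly 2; the function name `interpolation_plan` is hypothetical.

```python
import math


def interpolation_plan(input_rate_msps, output_rate_msps):
    """Return the output rate of each interpolate-by-2 stage in the cascade."""
    ratio = output_rate_msps / input_rate_msps
    stages = round(math.log2(ratio))
    if abs(ratio - 2 ** stages) > 1e-9 * ratio:
        raise ValueError("overall ratio is not a power of 2")
    return [input_rate_msps * 2 ** (k + 1) for k in range(stages)]


# WCDMA: 3.84 Msps chip rate up to the 61.44 Msps IF sample rate.
plan = interpolation_plan(3.84, 61.44)
names = ["SRRC", "1st HB", "2nd HB", "3rd HB"]
for name, rate in zip(names, plan):
    print(f"{name}: output rate {rate} Msps")
print(f"total interpolation ratio = {2 ** len(plan)}")
```

The four output rates correspond to one SRRC stage followed by three HB stages.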



Figure 4.34: Architecture of WCDMA DUC.

The filter design parameters for WCDMA are summarised in Table 4.24. The passband frequency is defined to be half of the occupied bandwidth, i.e.  $3.84 \div 2 = 1.92$  (MHz). A 51-tap symmetric SRRC filter is designed. Since the SEM specifies a sharp transition band between 2.5 MHz and 2.7 MHz with an attenuation of 56 dB, Chebyshev windowing can be applied in the process of SRRC filter design to reduce sidelobe levels, so as to meet the stringent SEM requirements while maintaining the same filter length [107]. In this example, the Chebyshev attenuation is set to 30 dB in FDATool.
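
For illustration, the unwindowed SRRC impulse response with the parameters above (51 taps, $\alpha = 0.22$, two samples per chip) can be generated from the standard closed-form expression. The sketch below is not the FDATool design used in this work: it omits the Chebyshev windowing step, applies no gain normalisation, and the function name `srrc_taps` is hypothetical.

```python
import math


def srrc_taps(num_taps=51, alpha=0.22, sps=2):
    """Closed-form square-root raised-cosine taps (no window applied).

    num_taps: filter length, alpha: roll-off, sps: samples per chip.
    """
    centre = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = (n - centre) / sps  # time in chip periods
        if t == 0.0:
            # Limit of the expression at t = 0.
            h = 1.0 - alpha + 4.0 * alpha / math.pi
        elif math.isclose(abs(t), 1.0 / (4.0 * alpha)):
            # Limit of the expression at t = +/- 1 / (4 * alpha).
            h = (alpha / math.sqrt(2.0)) * (
                (1.0 + 2.0 / math.pi) * math.sin(math.pi / (4.0 * alpha))
                + (1.0 - 2.0 / math.pi) * math.cos(math.pi / (4.0 * alpha))
            )
        else:
            num = (math.sin(math.pi * t * (1.0 - alpha))
                   + 4.0 * alpha * t * math.cos(math.pi * t * (1.0 + alpha)))
            den = math.pi * t * (1.0 - (4.0 * alpha * t) ** 2)
            h = num / den
        taps.append(h)
    return taps


taps = srrc_taps()
print(f"number of taps = {len(taps)}")
print(f"centre tap = {taps[25]:.5f}")
```

The resulting coefficients are symmetric about the centre tap, so the symmetric filter structure mentioned above applies.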

|        | $F_s$ (Msps) | $F_{pass}$ (MHz) | Roll-off α | $A_{pass}$ (dB) | Taps |
|--------|--------------|------------------|------------|-----------------|------|
| SRRC   | 7.68         | 1.92             | 0.22       |                 | 51   |
| 1st HB | 15.36        | 2.34             |            | 0.001           | 27   |
| 2nd HB | 30.72        | 2.34             |            | 0.001           | 15   |
| 3rd HB | 61.44        | 2.34             |            | 0.001           | 11   |

Table 4.24: Filter Designs for WCDMA.

For the HB filters, the passband frequency is defined as  $3.84 \times 1.22 \div 2 = 2.34$  (MHz), and the passband ripple is set to 0.001 dB. Therefore, 27, 15 and 11 taps are required for the three HB filters respectively to meet this design specification [71].

### 4.9.2 DUC Performance for WCDMA

#### 4.9.2.1 Power Spectral Density of Transmitted Signals

The PSD result of the DUC for WCDMA is depicted in Figure 4.35. The SEMs are obtained according to Table 4.23. The result shows that the WCDMA signal has been modulated up to 15 MHz and meets the SEM requirements.



Figure 4.35: DUC Transmission Spectrum of WCDMA.

#### 4.9.2.2 Error Vector Magnitude

There is a slight difference in the EVM definition between WCDMA and the other standards which are based on OFDM symbols. The reference waveform and the error vector waveform have to pass through a matched filter, and then the EVM calculation can be performed according to (4.1), as discussed earlier with regard to the WCDMA standard [100]–[102].

The input data for the WCDMA DUC contains 10000 symbols assuming QPSK modulation with a Spreading Factor (SF) of 128. The maximum EVM value allowed in the standard is 17.5% with QPSK modulation. Here, the matched filter is viewed as a cascade of one SRRC filter and three HB filters, with 543 taps in total according to the designs of the SRRC and HB filters in Table 4.24. The calculated EVM value of the WCDMA DUC is 0.93% according to (4.1), where N is equal to 10000. The received constellation plot of the WCDMA DUC is shown in Figure 4.36.
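
The 543-tap figure is consistent with treating the cascade as a single equivalent FIR filter at the 61.44 Msps output rate. The sketch below shows one way to arrive at this number, assuming each stage interpolates by 2 so that a stage with $L$ taps, followed by a further rate increase of $M$, contributes $(L-1)M + 1$ taps to the equivalent filter. The derivation is not stated explicitly in the text, and the function name `cascade_length` is hypothetical.

```python
def cascade_length(stage_taps, factor=2):
    """Length of the single-rate equivalent of cascaded interpolating FIR stages.

    stage_taps: tap counts in processing order (first stage first).
    factor: interpolation factor of every stage.
    """
    n = len(stage_taps)
    total = 1
    for i, taps in enumerate(stage_taps):
        # Rate increase still to come after this stage's filter.
        up = factor ** (n - 1 - i)
        # Each convolution adds (upsampled length - 1) samples.
        total += (taps - 1) * up
    return total


# SRRC (51 taps) followed by the three HB filters of Table 4.24.
length = cascade_length([51, 27, 15, 11])
print(f"equivalent matched filter length = {length} taps")
```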



Figure 4.36: Received Constellation for WCDMA with QPSK, without AWGN.

#### 4.9.2.3 Adjacent Channel Leakage Ratio

The standard defines the ACLR as the ratio of the in-band power processed by the matched filter to the power in an adjacent channel also processed by the matched filter. The first adjacent channels are at 5 MHz offset frequency and the second adjacent channels are at 10 MHz offset frequency. ACLR1 and ACLR2 are required to be at least 45 dB and 50 dB respectively [100]–[102]. The matched filter used in the ACLR calculation is identical to that used in the EVM calculation. The achieved ACLR1 and ACLR2 values are 70.15 dB and 75.20 dB respectively. The overall performance of the WCDMA DUC is summarised in Table 4.25, and the specifications are clearly met.

|       | Requirements | Requirements +<br>Safety Margin | WCDMA Design |
|-------|--------------|---------------------------------|--------------|
| EVM   | 17.5%        |                                 | 0.93%        |
| ACLR1 | 45 dB        | 65 dB                           | 70.15 dB     |
| ACLR2 | 50 dB        | 70 dB                           | 75.20 dB     |

Table 4.25: Performance Metrics of the DUC for WCDMA.

## 4.9.3 Implementation Results

The implementation results for the DUC designs for IEEE 802.11n (20 MHz) and WCDMA are listed in Table 4.26. The IEEE 802.11n (20 MHz) DUC occupies 488 slices and 12 DSP48Es, and the WCDMA DUC uses 708 slices and 5 DSP48Es. In addition, 9 BRAMs are used for the DDS component in both cases. The timing requirements of the DUC designs are at least 240 MHz for IEEE 802.11n (20 MHz) and 245.76 MHz for WCDMA. With maximum frequencies of 442.1 MHz and 435.2 MHz respectively, both designs meet these timing requirements.

Table 4.26: Hardware Utilisation of DUCs for IEEE 802.11n and WCDMA.

| DUC              | LUTs | FFs  | Slices | DSP48Es | BRAMs | $f_{\rm max}$ (MHz) |
|------------------|------|------|--------|---------|-------|---------------------|
| 802.11n (20 MHz) | 991  | 1438 | 488    | 12      | 9     | 442.1               |
| WCDMA            | 1229 | 1682 | 708    | 5       | 9     | 435.2               |

## 4.10 Concluding Remarks

This chapter has introduced the concept, principle and architecture of the DFE in the transmitter chain. In addition, the measurements of key performance metrics for the DUC architecture have been presented. Filter design considerations for LTE (10 MHz) and (5 MHz), IEEE 802.16e (10 MHz, 7 MHz, 5 MHz and 3.5 MHz), IEEE 802.11n (20 MHz) and WCDMA were discussed in detail. Furthermore, the DUC performance results obtained demonstrated that the developed designs can meet these standard requirements. The implementation results of DUC designs are also provided, and these will be further analysed in Chapter 6.

The DUC designs address the problem of RF diversity across multiple standards or modes, and thus comply with the requirements of the SDR system. However, it is impractical, and less well suited to the SDR philosophy, to integrate all of the DUC designs discussed in this chapter into a single FPGA device. Since only one DUC design is required to operate at any given time, integrating all of them would result in an unnecessarily large device size, high power consumption and low efficiency.

For this reason, the DUC designs could be implemented with dynamic reconfigurable technologies, i.e. PR and DRP, as discussed in Chapter 3, to provide functionality switching and to reduce the device size and power consumption significantly. Moreover, some components of the baseband processing could also be implemented with dynamic reconfigurable technologies to solve the problems of baseband design diversity, and this aspect will be analysed in detail in Chapter 6.

Recently, wireless opportunities in TV White Space have created a new and interesting area where SDR can be applied. The secondary use of TV spectrum is unlikely to be driven by only one standard, and therefore there exists the opportunity to support multiple standards at (low power) community basestations, and to use one FPGA hardware platform to potentially support multiple standards from the myriad of new and emerging candidate TV White Space standards. Therefore, the DUC designs for conventional standards or modes could be extended to the special applications of TV white spaces to enhance the overall SDR system performance, and this will be considered in Chapter 5.

## Chapter 5

# **TV White Space Application**

In the previous chapter, DUC designs for several conventional standards (LTE, IEEE 802.16e, IEEE 802.11n and WCDMA) were discussed in detail. Following this, the current chapter extends the DUC designs from conventional standards to "special editions" for TV White Space (the secondary use of TV spectrum for applications such as rural broadband and machine to machine communication).

The concepts and features of the TV white space are introduced, and design challenges for TV white space are presented, in particular that of meeting the SEM requirements. The DUC designs and performance of IEEE 802.11n, LTE and IEEE 802.16e special editions for TV white space applications are discussed in detail thereafter, along with hardware implementation results. The DUC designs for conventional standards in Chapter 4 and TV white space variants in Chapter 5 will be further compared and analysed in Chapter 6.

## 5.1 Overview

With the conversion from analogue to digital television, a large amount of licensed spectrum will be released, and this is often referred to as *TV white space*. White space spectrum resources are at low frequencies, and provide significant advantages for wireless communication, compared to cellular frequencies in the Gigahertz range:

• Better Performance: The communication can provide better indoor penetration and longer distance due to the enhanced propagation properties of the white space frequencies. Moreover, longer distance implies wider coverage. The transmission distance of the commonly used IEEE 802.11 is less than 100 meters in 2.4 GHz or 5 GHz bands. However, taking advantage of white space spectrum, the transmission range can be extended to 5 miles at maximum, and thus the coverage could reach a 75 square mile area [111] [112].

• Lower Cost: Since the coverage can be increased dramatically, the number of base stations would be reduced significantly, and therefore the deployment cost would drop. In the case of rural environments, it is not cost-effective to deploy telephone copper or cable infrastructure in harsh landscapes such as rugged mountains or sea to enable access to the Internet. Therefore, wireless technology is viewed as a solution very applicable to the rural area broadband access problem [113] [114].

Wireless Internet technology can take advantage of white space spectrum to enable wider coverage and higher throughput compared to DSL broadband. There might be more extensive available channel resources in rural areas compared to the urban environment, due to the reduced number of services operating in the spectrum, thus offering potentially higher bandwidth in these locations, whereas DSL over long distances cannot guarantee transmission with high throughput. As a result, the Capital Expenditure (CapEx) and Operating Expenditure (OpEx) involved in providing high speed rural broadband could be decreased significantly by using wireless as opposed to wired infrastructure, while maintaining good performance and quality of service [115].

## 5.2 TV White Space Spectrum Resource

The frequency spectrum available for wireless communication is a limited resource all over the world, and the usage of spectrum is generally regulated locally by regulatory institutions in specific countries. Broadcast TV services usually occupy the licensed channels to transmit signals and this licensed spectrum is not available for use by unlicensed devices according to the regulatory rules in many countries. In general, the term "licensed transmission" refers to analogue TV, Digital TV and wireless microphone service. In addition, with the release of the Report & Order by the Federal Communications Commission (FCC) in the United States in November 2008, the freed white space spectrum can be used by unlicensed devices (fixed and portable) as long as existing licensed channels are protected from the interference generated by their transmissions [108].

The TV white space spectrum is divided into 6 MHz wide channels in the USA and 8 MHz wide channels in the UK. The conversion from analogue to digital TV has been completed in the USA, and the FCC has approved robust rules for operation of unlicensed devices in white space spectrum. In the UK, the switchover from analogue to digital television is still in progress at the time of writing, and the regulations of the Office of Communications (Ofcom) are yet to be finalised. The analysis and designs for TV white space considered in this thesis are therefore based on the FCC rules.

TV white space spectrum is distributed at Very High Frequency (VHF) and Ultra High Frequency (UHF). There are two types of TV white space devices: fixed devices (e.g. base stations or access points on customer premises) and portable devices (e.g. laptops or mobile phones) [109]. The allocation of the white space spectrum in the USA is as follows: Channels 2-13 are at VHF, but Channels 3 and 4 are not available for fixed devices in order to avoid interference with external devices (e.g. DVD players). Most of the white space channels are at UHF, starting with Channel 14 (470 MHz) and ending at Channel 51 (698 MHz). Channel 37 is reserved for radio astronomy measurements. Channels 14-20 are only permitted for use by fixed devices, and the rest of the channels may be used by either fixed or portable devices. The channel information for fixed and portable devices in the USA is summarised in Table 5.1 [110].

| TV Channel | Band      | Frequency (MHz)  | Devices            |
|------------|-----------|------------------|--------------------|
| 2          | VHF       | 54-60            | Fixed              |
| 5-6        | VHF       | 76-88            | Fixed              |
| 7-13       | VHF       | 174-216          | Fixed              |
| 14-20      | UHF       | 470-512          | Fixed              |
| 21-35      | UHF       | 512-602          | Fixed and Portable |
| 36, 38     | UHF       | 602-608, 614-620 | Portable           |
| 39-51      | UHF       | 620-698          | Fixed and Portable |

Table 5.1: TV Channels for fixed and portable devices in the USA.

## 5.3 TV White Space Technical Challenges

As discussed earlier, TV white space devices are able to offer many attractive features. However, the use of white space presents technical design challenges compared to devices used in other frequency bands. Specifically, as proposed by the authors of [110], a successful TV white space device has to overcome these design challenges: spectrum sensing, geo-location, dynamic frequency access, and meeting the stringent SEM requirements defined by the FCC.

Spectrum sensing is used to detect which channels are occupied by TV signals or wireless microphones, and which channels are unoccupied, before the device starts to operate. The geo-location ability required by the FCC rules for both fixed and some portable mode devices implies that the device can access the Internet and consult incumbent databases. The incumbent databases maintain the usage information of spectrum resources at VHF and UHF for many devices, e.g. digital TV transmissions, microphones and other white space devices. In addition, white space devices have to report their location to the incumbent databases as the usage of spectrum resources varies geographically.

The FCC defined strict limitations on transmission in white space spectrum via the SEM to minimise the interference caused to existing licensed transmissions. Note that the FCC SEM requirement is the only aspect addressed in this thesis, and the other technical challenges discussed above, namely spectrum sensing, geo-location and dynamic frequency access, are excluded from this study.

In order to prevent excessive interference, and to make full use of white space spectrum, the FCC defined three limitations on out-of-band spectral emissions; these limitations relate to the adjacent channel, beyond adjacent channel, and special channel (Channel 37) respectively, as listed in Table 5.2 [110] [116].

| Device   | Adjacent Channel (dB) | Beyond Adjacent (dB) | Channel 37 (dB) |
|----------|-----------------------|----------------------|-----------------|
| Fixed    | 55                    | 69                   | 95              |
| Portable | 55                    | 53                   | 79              |

Table 5.2: SEM Requirements for White Space Devices Defined by the FCC.

The SEM requirements for adjacent channels and beyond adjacent channels are essential rules, and the requirement for the special channel is additional when using channels neighbouring the special channel. According to Table 5.1, the majority of the white space spectrum is distributed at Channels 14-35 and Channels 39-51 in the case of fixed devices. Channels 14-35 have more available spectrum than Channels 39-51 and only the spectrum range at channels 14-35 is considered in this study.

The SEM proposed by the FCC is shown by the red line in Figure 5.1. In this study, Channel 33 has been chosen for TV white space transmission, because its SEM has the most exacting limitations: two essential rules and one additional rule for special Channel 37, as suggested in [110]. Note that the SEM limitations refer to transmission of any communication standard using the white space spectrum. In other words, any deployed standard has to use a modified SEM to comply with the rules defined by the FCC. The IEEE 802.11n scaled from 20 MHz to 5 MHz is taken as an example, shown by the blue line in Figure 5.1. As a result, the modified SEM of IEEE 802.11n (5 MHz) designed to comply with the FCC rules is described in Figure 5.2.
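
The combination of the two masks can be sketched as taking, at each frequency, the more stringent of the standard-defined attenuation and the FCC attenuation of Table 5.2. The Python below is a simplified illustration under several assumptions: the FCC limit is treated as constant across each 6 MHz channel, the 6 MHz channel grid starting at 470 MHz (Channel 14) is used, and the standard mask is passed in as a placeholder function. The names `uhf_channel`, `fcc_required_attenuation` and `combined_attenuation`, as well as the flat 40 dB placeholder mask, are hypothetical.

```python
# Attenuation limits in dB, taken from Table 5.2.
FCC_ATTENUATION_DB = {
    "fixed": {"adjacent": 55, "beyond": 69, "ch37": 95},
    "portable": {"adjacent": 55, "beyond": 53, "ch37": 79},
}


def uhf_channel(freq_mhz):
    """US UHF TV channel containing freq_mhz (Channel 14 starts at 470 MHz)."""
    return 14 + int((freq_mhz - 470.0) // 6)


def fcc_required_attenuation(freq_mhz, tx_channel=33, device="fixed"):
    """Per-channel FCC attenuation (dB) for a transmission in tx_channel."""
    limits = FCC_ATTENUATION_DB[device]
    channel = uhf_channel(freq_mhz)
    if channel == tx_channel:
        return 0
    if channel == 37:
        return limits["ch37"]
    if abs(channel - tx_channel) == 1:
        return limits["adjacent"]
    return limits["beyond"]


def combined_attenuation(freq_mhz, standard_mask_db, **fcc_kwargs):
    """More stringent of the standard mask and the FCC mask at freq_mhz."""
    return max(standard_mask_db(freq_mhz),
               fcc_required_attenuation(freq_mhz, **fcc_kwargs))


def placeholder_mask(freq_mhz):
    """Hypothetical flat 40 dB out-of-channel mask, for illustration only."""
    return 0 if uhf_channel(freq_mhz) == 33 else 40


for f in (587.0, 592.0, 600.0, 610.0):
    print(f"{f} MHz (Channel {uhf_channel(f)}): "
          f"{combined_attenuation(f, placeholder_mask)} dB")
```

A real design would replace `placeholder_mask` with the scaled IEEE 802.11n (5 MHz) mask and would evaluate the limits at the exact frequency offsets given in the FCC rules.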



Figure 5.1: SEM Requirements of IEEE 802.11n (5 MHz) and White Space.



Figure 5.2: Combined SEM Requirements of White Space and IEEE 802.11n (5 MHz).

Based on the analysis of Chapter 4, a DUC could be designed to enable devices to meet the strict SEM requirements defined by each standard and mode. Therefore, the DUC designs could also be adapted for white space devices to cope with the SEM requirements resulting from the combination of the communication standards and FCC rules. The DUC design considerations and methods used for conventional standards and modes may be applied to the TV white space DUC designs. Since the authors of [117] state that IEEE 802.11, IEEE 802.16 and LTE have the potential to be deployed in the white space spectrum, the DUC designs of IEEE 802.11n, IEEE 802.16e and LTE for TV white space applications are considered in this white space aspect of the study.

Moreover, the white space spectrum is fragmented and varies from place to place, even from time to time. Therefore, it is necessary and efficient to enable TV white space DUC designs to support variable bandwidths according to the geographical environment. For example, when two or four neighbouring channels are available, the white space device is able to operate on a wide channel width to increase the data throughput. If the data throughput requirement is not too high, the white space device is capable of scaling to a narrower channel width. This adaptive ability for variable bandwidth is another attractive feature of the TV white space DUC designs developed in this study.

Since the freed TV white space spectrum can be used for any wireless communication without a licence, as long as these unlicensed transmissions do not interfere with protected transmissions, multiple standards or modes could coexist in the TV white space spectrum. In addition, new and emerging candidate TV white space standards such as IEEE 802.11af, 802.22, and 3GPP LTE-TDD have the potential to provide more services. Moreover, the availability of TV white space spectrum and communication services varies geographically, and thus TV white space devices are required to support a number of standards or modes in order to operate successfully in multiple locations. Therefore, there is a motivation for TV white space devices to be capable of supporting more standards in the future. In this thesis, IEEE 802.11n (5 MHz, 10 MHz and 20 MHz), IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz) and LTE (5 MHz and 10 MHz), i.e. three standards and nine modes in total, are supported within the developed system to ensure that it has the potential to be applied widely. To the best of the author's knowledge, this type of flexible DUC design for white space has not been considered before.

## 5.4 IEEE 802.11n DUC Designs for TV White Space

IEEE 802.11 (WiFi) over white space spectrum, referred to as "WhiteFi" or "Super WiFi", has been investigated by a number of researchers and companies [109] [115] [118] [119]. The IEEE 802.11n variant is an amendment to the IEEE 802.11 standards which is widely deployed for WLANs and can provide very high data throughput of up to 600 Mbit/s [120]–[122]. Consequently, IEEE 802.11n is selected for TV white space applications in this study. Since the white space is fragmented and the DUC for white space is intended to support adaptive channel bandwidth, the bandwidth of IEEE 802.11n is divided into three scenarios: 5 MHz, 10 MHz and 20 MHz, among which 5 MHz is the basic format assuming only one TV channel is available, and 10 MHz and 20 MHz are additional options when two or four neighbouring available channels are aggregated.

## 5.4.1 DUC Design Considerations

Based on the analysis of Chapter 4, the selection of a reasonable IF sampling rate and system clock plays an important role in the process of DUC design. The selected IF sampling rate should be equal to or less than half of the system clock for time division multiplexing, resulting in hardware reduction especially in terms of the DSP48Es.

In the case of the IEEE 802.11n (20 MHz) DUC, discussed in Chapter 4, the IF sampling rate is 60 Msps and the system clock is 240 MHz. According to Table 5.1, the TV white space spectrum for fixed devices is distributed from 470 MHz (Channel 14) to 602 MHz (Channel 35) and 620 MHz (Channel 39) to 698 MHz (Channel 51), with ranges of 132 MHz and 78 MHz respectively. The IF sampling rate of 60 Msps implies that the DDS component in the DUC operates at 60 MHz and thus the carrier frequency may be controlled within a range of 60 MHz, ranging from $-f_{\rm IF}/2$ to $f_{\rm IF}/2$. This IF carrier frequency range cannot cover all of the spectrum between 470 MHz and 602 MHz, i.e. 132 MHz, and therefore the IF sample rate has to be increased in the DUC designs for TV white space so as to maximise the carrier frequency range. Taking this factor into account, the DUC designs for IEEE 802.11n for TV white space use an IF sampling rate of 120 Msps and system clock of 240 MHz.

A comparison of carrier frequency ranges using these two IF sample rates is shown in Figure 5.3. Since a sufficiently high IF sampling rate cannot be achieved, analogue components are required to up-convert the signal to the higher RF frequency range after the IF and DAC processing [123]–[126]. Here, the centre frequency is assumed to be located at the lower edge frequency of Channel 29 (i.e. 560 MHz), an assumption which will apply to all DUC performance analysis in this chapter. As a result, the carrier frequency can be programmed within Channels 24-33 when the IF sampling rate is 60 Msps. However, this range can be extended to Channels 19-38 with the 120 Msps IF sampling rate.
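
The channel ranges quoted above can be reproduced from the 6 MHz US channel grid. The sketch below assumes that UHF Channel 14 starts at 470 MHz, that the programmable range extends $\pm f_{\rm IF}/2$ around the 560 MHz centre, and that a channel counts as reachable when it lies entirely within that range. The signal bandwidth itself is ignored, and the function names are hypothetical.

```python
def channel_edges_mhz(channel):
    """Lower and upper edge (MHz) of a US UHF TV channel (14 to 51)."""
    lower = 470.0 + 6.0 * (channel - 14)
    return lower, lower + 6.0


def reachable_channels(centre_mhz, if_rate_msps):
    """Channels lying fully within centre +/- if_rate / 2."""
    lo = centre_mhz - if_rate_msps / 2.0
    hi = centre_mhz + if_rate_msps / 2.0
    return [ch for ch in range(14, 52)
            if channel_edges_mhz(ch)[0] >= lo and channel_edges_mhz(ch)[1] <= hi]


centre = channel_edges_mhz(29)[0]  # lower edge of Channel 29
for if_rate in (60.0, 120.0):
    chans = reachable_channels(centre, if_rate)
    print(f"IF rate {if_rate} Msps: Channels {chans[0]}-{chans[-1]}")
```

Note that the wider range includes Channel 37, which remains reserved and is therefore not usable for transmission.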



Figure 5.3: Comparison of Different IF Sample Rates in TV White Space.
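The channel ranges quoted above can be checked with a short calculation. The following is a minimal sketch, assuming 6 MHz wide UHF channels with Channel 14 starting at 470 MHz (consistent with the band edges quoted from Table 5.1); the function names are illustrative only and do not appear in the thesis.

```python
# Assumptions (not given as formulas in the text): US UHF TV channels are
# 6 MHz wide and Channel 14 starts at 470 MHz, consistent with Table 5.1.
CHANNEL_WIDTH_MHZ = 6.0
CHANNEL_14_LOWER_EDGE_MHZ = 470.0


def channel_edges(channel):
    """Return the (lower, upper) edge in MHz of a UHF TV channel."""
    lower = CHANNEL_14_LOWER_EDGE_MHZ + CHANNEL_WIDTH_MHZ * (channel - 14)
    return lower, lower + CHANNEL_WIDTH_MHZ


def covered_channels(centre_mhz, if_rate_msps, first=14, last=51):
    """Channels lying entirely within centre +/- f_IF/2."""
    low = centre_mhz - if_rate_msps / 2
    high = centre_mhz + if_rate_msps / 2
    return [ch for ch in range(first, last + 1)
            if channel_edges(ch)[0] >= low and channel_edges(ch)[1] <= high]


centre = channel_edges(29)[0]  # lower edge of Channel 29, i.e. 560 MHz
for rate in (60.0, 120.0):
    chans = covered_channels(centre, rate)
    print(f"{rate:.0f} Msps: Channels {chans[0]}-{chans[-1]}")
```

The sketch only checks frequency coverage; it does not account for channels such as Channel 37 that lie inside the range but are unavailable for unlicensed transmission.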

Compared to the designs in Chapter 4, the IF sampling rate of the IEEE 802.11n (20 MHz) DUC has been increased to half of the system clock while the system clock remains the same, and this also applies to the other white space DUC designs in this chapter. As a result, the parameters of the DUC designs for TV white space are as listed in Table 5.3.

| TVWS Design            | Input Sample Rate<br>(Msps) | IF Sampling Rate<br>(Msps) | System Clock<br>(MHz) |
|------------------------|-----------------------------|-------------|-----------------------|
| IEEE 802.11n (5 MHz)   | 7.5                         | 120         | 240                   |
| IEEE 802.11n (10 MHz)  | 15                          | 120         | 240                   |
| IEEE 802.11n (20 MHz)  | 30                          | 120         | 240                   |
| IEEE 802.16e (10 MHz)  | 11.2                        | 179.2       | 358.4                 |
| IEEE 802.16e (7 MHz)   | 8                           | 128         | 256                   |
| IEEE 802.16e (5 MHz)   | 5.6                         | 179.2       | 358.4                 |
| IEEE 802.16e (3.5 MHz) | 4                           | 128         | 256                   |
| LTE (5 MHz)            | 7.68                        | 122.88      | 245.76                |
| LTE (10 MHz)           | 15.36                       | 122.88      | 245.76                |

Table 5.3: DUC Design Parameters for TV White Space.
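The number of HB interpolation stages in each design follows from the ratio of the IF sampling rate to the input sample rate. The sketch below is illustrative only: it assumes every interpolation stage is a factor-of-2 HB filter, and it takes the LTE input rates as 7.68 Msps for 5 MHz and 15.36 Msps for 10 MHz, in line with Table 5.10 and Section 5.6.1.

```python
import math

# (input sample rate in Msps, IF sampling rate in Msps, system clock in MHz)
# LTE input rates follow Table 5.10 and Section 5.6.1.
DESIGNS = {
    "IEEE 802.11n (5 MHz)": (7.5, 120.0, 240.0),
    "IEEE 802.11n (10 MHz)": (15.0, 120.0, 240.0),
    "IEEE 802.11n (20 MHz)": (30.0, 120.0, 240.0),
    "IEEE 802.16e (10 MHz)": (11.2, 179.2, 358.4),
    "IEEE 802.16e (7 MHz)": (8.0, 128.0, 256.0),
    "IEEE 802.16e (5 MHz)": (5.6, 179.2, 358.4),
    "IEEE 802.16e (3.5 MHz)": (4.0, 128.0, 256.0),
    "LTE (5 MHz)": (7.68, 122.88, 245.76),
    "LTE (10 MHz)": (15.36, 122.88, 245.76),
}


def hb_stages(input_rate, if_rate):
    """Number of factor-of-2 HB stages needed to reach the IF sampling rate."""
    ratio = if_rate / input_rate
    stages = round(math.log2(ratio))
    if not math.isclose(2 ** stages, ratio, rel_tol=1e-9):
        raise ValueError("interpolation ratio is not a power of two")
    return stages


for name, (in_rate, if_rate, clk) in DESIGNS.items():
    stages = hb_stages(in_rate, if_rate)
    cycles = round(clk / if_rate)  # clock cycles available per IF sample
    print(f"{name}: x{2 ** stages} interpolation, {stages} HB filters, "
          f"{cycles} clock cycles per IF sample")
```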

## 5.4.2 White Space DUC Design for IEEE 802.11n (5 MHz)

The IEEE 802.11n (5 MHz) DUC for white space can be obtained from the conventional IEEE 802.11n (20 MHz) DUC implementation by reducing the sampling rate by a factor of 4, and thus its sampling rate after baseband processing is 7.5 Msps. According to the analysis in Section 4.6.1, it is efficient to employ the channel filter at single rate to achieve a reduction in terms of DSP48Es. A total interpolation factor of 16 is required and thus four HB filters are needed. The architecture of the IEEE 802.11n (5 MHz) DUC is illustrated in Figure 5.4.



Figure 5.4: Architecture of IEEE 802.11n (5 MHz) DUC for TV White Space.

With regard to the DDS component, the SFDR requirement is increased from 107 dB in Chapter 4 to 112 dB, with 0.25 Hz frequency resolution, so as to meet the out of band attenuation of 95 dB specified in the FCC rules. Instead of the phase dithered method used in Chapter 4, the Taylor series corrected method is employed to ensure that the DDS can reach a SFDR of 112 dB. The Taylor series corrected method uses several DSP48Es to calculate corrections from the discarded fractional bits and so achieves a higher SFDR than the phase dithered method [85]. The clock frequency of the DDS is 120 MHz and the carrier frequency can be programmed within a range of 120 MHz (i.e. -60 MHz to 60 MHz).
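The frequency resolution and clock rate above determine the minimum phase accumulator width through the standard DDS relation $\Delta f = f_{clk}/2^{B}$. The following sketch applies that relation; the function names are hypothetical, and the actual accumulator width configured in the DDS compiler is not stated in this section.

```python
import math


def accumulator_bits(f_clk_hz, resolution_hz):
    """Smallest accumulator width B such that f_clk / 2**B <= resolution."""
    return math.ceil(math.log2(f_clk_hz / resolution_hz))


def tuning_word(f_out_hz, f_clk_hz, bits):
    """Phase increment per clock cycle, wrapped to `bits` bits so that
    negative frequencies map to their two's complement equivalent."""
    return round(f_out_hz / f_clk_hz * 2 ** bits) % (2 ** bits)


F_CLK_HZ = 120e6
bits = accumulator_bits(F_CLK_HZ, 0.25)
print("accumulator width:", bits, "bits")
print("achieved resolution:", F_CLK_HZ / 2 ** bits, "Hz")
print("tuning word for +27 MHz:", tuning_word(27e6, F_CLK_HZ, bits))
print("tuning word for -27 MHz:", tuning_word(-27e6, F_CLK_HZ, bits))
```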

The mixer component has to be amended compared to the conventional IEEE 802.11n (20 MHz) DUC discussed in Section 4.8. The complex multiplication in (4.2) normally requires four real multiplications. Since only two clock cycles are available for each IF output sample, two DSP48Es are needed, each performing two multiplications per output sample. The DSP48E architecture is described in Figure 5.5, and both DSP48Es have a similar architecture. The DSP48E for the I channel output employs P=P-(A\*B) as its second operation, while P=P+(A\*B) is used in the DSP48E for the Q channel output. The operation modes for the I and Q channel outputs are as listed in Table 5.4.



Figure 5.5: DSP48E Architecture with 120 Msps IF Sampling Rate for I Channel Output.

| Outputs | OPMODE 1 | OPMODE 2  |
|---------|----------|-----------|
| I       | P=(A*B)  | P=P-(A*B) |
| Q       | P=(A*B)  | P=P+(A*B) |

Table 5.4: Operation Modes for I and Q Channel Outputs.
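The two operation modes in Table 5.4 can be modelled in software as two multiply-accumulate units, each used for two clock cycles per output sample. This is a behavioural sketch only: the assignment of operands to each cycle is an assumption, since (4.2) and the exact scheduling are not reproduced in this section.

```python
def mix_two_cycle(i_in, q_in, cos_v, sin_v):
    """Complex multiply (i_in + j*q_in) * (cos_v + j*sin_v) using two
    multiply-accumulate units over two clock cycles each."""
    # Cycle 1 (OPMODE 1): P = A*B in both units.
    p_i = i_in * cos_v
    p_q = i_in * sin_v
    # Cycle 2 (OPMODE 2): P = P - A*B for I, P = P + A*B for Q.
    p_i = p_i - q_in * sin_v
    p_q = p_q + q_in * cos_v
    return p_i, p_q


# Compare against Python's built-in complex multiplication.
i_out, q_out = mix_two_cycle(3, 4, 5, 6)
expected = complex(3, 4) * complex(5, 6)
print((i_out, q_out), expected)
```

Two units running for two cycles each provide the four real multiplications that the complex product requires.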

Taking into account SEM requirements of IEEE 802.11n (5 MHz) for white space in Figure 5.2, and the IEEE 802.11n OFDM symbol properties in Table 4.21 on page 87, the white space DUC filter designs for IEEE 802.11n (5 MHz) can be summarised in Table 5.5.

All of the filters are designed using FDATool in MATLAB. The design parameters $F_s$, $F_{pass}$, $F_{stop}$, $A_{pass}$ and $A_{stop}$ refer to the sample frequency, passband frequency, stopband frequency, passband ripple and stopband attenuation respectively.

|                   | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | F <sub>stop</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | A <sub>stop</sub><br>(dB) | Taps |
|-------------------|--------------------------|----------------------------|----------------------------|---------------------------|---------------------------|------|
| Channel<br>Filter | 7.5                      | 2.1875                     | 2.8125                     | 0.01                      | 90                        | 53   |
| 1st HB            | 15                       | 2.5                        |                            | 0.0005                    |                           | 35   |
| 2nd HB            | 30                       | 2.5                        |                            | 0.0005                    |                           | 15   |
| 3rd HB            | 60                       | 2.5                        |                            | 0.0005                    |                           | 11   |
| 4th HB            | 120                      | 2.5                        |                            | 0.0005                    |                           | 11   |

Table 5.5: DUC Filters for IEEE 802.11n (5 MHz).

Based on the analysis in Section 4.8, the passband and stopband frequencies are set to 8.75 MHz and 11.25 MHz respectively for IEEE 802.11n (20 MHz). As a result, the passband and stopband frequencies for IEEE 802.11n (5 MHz) are scaled to 2.1875 MHz and 2.8125 MHz respectively. When the passband ripple and stopband attenuation are set to 0.01 dB and 90 dB respectively, the channel filter requires 53 taps in total with a symmetric structure. Each HB filter interpolates by a factor of 2, and the stopband attenuation of each HB filter has to guarantee that the filter chain as a whole can meet the SEM requirements. Therefore, the passband ripple is set to 0.0005 dB, which ensures that the out of band attenuation is 100 dB for the 1st HB filter. Note that the filter coefficients are quantised to 22 bits with 21 fractional bits, and that the output of the DUC component has to be configured with 21 fractional bits in order to achieve a stopband attenuation for the entire DUC design of approximately 100 dB. The magnitude response for each filter design is shown in Figure 5.6.
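The scaling of the channel filter band edges with bandwidth can be reproduced as follows. This is a small sketch, assuming the 20 MHz channel filter runs at the 30 Msps input rate listed in Table 5.3; the function name is illustrative.

```python
# Band edges of the IEEE 802.11n (20 MHz) channel filter from Section 4.8.
F_PASS_20_MHZ = 8.75
F_STOP_20_MHZ = 11.25
FS_20_MSPS = 30.0  # channel filter rate of the 20 MHz variant (Table 5.3)


def scaled_edges(bandwidth_mhz):
    """Scale the 20 MHz band edges and sample rate to another bandwidth."""
    scale = bandwidth_mhz / 20.0
    return F_PASS_20_MHZ * scale, F_STOP_20_MHZ * scale, FS_20_MSPS * scale


for bw in (5.0, 10.0, 20.0):
    f_pass, f_stop, fs = scaled_edges(bw)
    print(f"{bw:g} MHz: F_pass = {f_pass} MHz, F_stop = {f_stop} MHz, "
          f"normalised edges = {f_pass / fs:.4f}, {f_stop / fs:.4f}")
```

Because the band edges and the sample rate scale together, the normalised edges are the same for all three bandwidths, which is why one set of coefficients can serve every variant.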



Figure 5.6: Magnitude Response of Each Filter: (a) Channel Filter; (b) 1st HB Filter; (c) 2nd HB Filter; (d) 3rd HB Filter; (e) 4th HB Filter.

The overall magnitude response of the cascade of channel and HB filters for the TV white space IEEE 802.11n (5 MHz) DUC is shown in Figure 5.7. With the filter coefficients quantised to 22 bits, the response shows that the stopband attenuation can meet the exacting SEM requirements.



Figure 5.7: Overall TV White Space IEEE 802.11n (5 MHz) DUC Filter Response.

## 5.4.3 White Space DUC Designs for IEEE 802.11n (10 MHz) and (20 MHz)

The IEEE 802.11n (10 MHz) and (20 MHz) variants are additional options for white space applications, and can be employed when two or four available TV channels are aggregated. As for the earlier DUC designs presented in Chapter 4, the filter designs of the white space IEEE 802.11n (5 MHz) DUC can be shared by the white space IEEE 802.11n (10 MHz) and (20 MHz) DUCs, as they have identical normalised passband and stopband frequency requirements. In addition, the fourth HB filter is excluded for the white space IEEE 802.11n (10 MHz) DUC, and the third and fourth HB filters are excluded for the white space IEEE 802.11n (20 MHz) DUC, so that all three designs share the same IF sampling rate. The architectures of the IEEE 802.11n (10 MHz) and (20 MHz) DUCs for white space are shown in Figure 5.8 and Figure 5.9 respectively.



Figure 5.8: Architecture of IEEE 802.11n (10 MHz) DUC for TV White Space.



Figure 5.9: Architecture of IEEE 802.11n (20 MHz) DUC for TV White Space.

## 5.4.4 DUC Performance for White Space IEEE 802.11n

In Section 4.4, performance metrics of DUC designs were discussed. These metrics can also be applied to measure the performance of DUC designs for TV white space. Therefore, the SEM and EVM measurements of the DUC designs for white space are also considered.

## 5.4.4.1 Power Spectral Density of Transmitted Signals

The PSD result of IEEE 802.11n (5 MHz) DUC for TV white space is depicted in Figure 5.10. As for previous DUC designs, only the range from centre frequency to the  $f_{\rm IF}/2$  is considered in this study as the SEM is symmetric about the centre frequency. As discussed in Section 5.4.1 and Section 5.3, the centre frequency is assumed to be at the lower edge of Channel 29, i.e. 560 MHz, and the most demanding SEM requirements apply when Channel 33 is chosen for transmission, according to [110].



Figure 5.10: DUC Transmission Spectrum of IEEE 802.11n (5 MHz) for TV White Space.

For this reason, the signal has been translated to 27 MHz (i.e. to Channel 33) to test whether the SEM requirements are met when only one TV channel is used for transmission. Since Channel 37 is not available for any unlicensed transmission, the PSD in this channel has to be attenuated by at least 95 dB for fixed white space devices according to Table 5.2. The result in Figure 5.10 shows that the DUC of IEEE 802.11n (5 MHz) for TV white space can meet the SEM requirements resulting from combining the IEEE 802.11n standard and the FCC white space rules.

Similarly, the PSD result of IEEE 802.11n (10 MHz) DUC for TV white space is depicted in Figure 5.11. When two neighbouring TV channels are combined, Channels 32 and 33 are used. In this case, a carrier frequency of 24 MHz is required to modulate the interpolated data from I and Q channels up to Channels 32 and 33. The results demonstrate that it can meet all of the SEM requirements and minimise the interference to the licensed Channel 37.

Figure 5.11: DUC Transmission Spectrum of IEEE 802.11n (10 MHz) for TV White Space.

In the case of IEEE 802.11n (20 MHz) DUC for TV white space, the PSD result is illustrated in Figure 5.12. Four channels (30-33) are aggregated to provide sufficient bandwidth to transmit IEEE 802.11n with 20 MHz bandwidth. As a result, the signal has been translated to 18 MHz to occupy Channels 30-33 to verify that the SEM is met. All of the PSD results demonstrate that the DUC designs can meet the SEM requirements to enable unlicensed communication transmission in TV white space spectrum without affecting the transmissions of licensed channels.



Figure 5.12: DUC Transmission Spectrum of IEEE 802.11n (20 MHz) for TV White Space.
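The carrier offsets of 27 MHz, 24 MHz and 18 MHz used in the three PSD tests follow from the channel plan and the assumed centre at the lower edge of Channel 29. A minimal sketch is given below, assuming 6 MHz channels with Channel 14 starting at 470 MHz; the helper names are illustrative.

```python
# Assumed channel plan: 6 MHz channels, Channel 14 starting at 470 MHz.
CHANNEL_WIDTH_MHZ = 6.0
CHANNEL_14_LOWER_EDGE_MHZ = 470.0


def lower_edge(channel):
    """Lower edge in MHz of a UHF TV channel."""
    return CHANNEL_14_LOWER_EDGE_MHZ + CHANNEL_WIDTH_MHZ * (channel - 14)


def carrier_offset(first_channel, last_channel, reference_mhz):
    """DDS carrier needed to centre the signal on an aggregated channel block."""
    band_low = lower_edge(first_channel)
    band_high = lower_edge(last_channel) + CHANNEL_WIDTH_MHZ
    return (band_low + band_high) / 2 - reference_mhz


reference = lower_edge(29)  # 560 MHz, the assumed centre of the IF range
for label, first, last in (("5 MHz", 33, 33),
                           ("10 MHz", 32, 33),
                           ("20 MHz", 30, 33)):
    offset = carrier_offset(first, last, reference)
    print(f"IEEE 802.11n ({label}): Channels {first}-{last}, "
          f"carrier = {offset:g} MHz")
```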

## 5.4.4.2 Error Vector Magnitude

Although the EVM requirements for white space applications are not specified in the standard draft, the EVM of the DUC can be measured as discussed in Section 4.4.2 on page 52 and Section 4.8.2 on page 88. The OFDM-based input data is supplied to the DUC component, and is then decimated by an ideal DDC component. The received data is synchronised with the input data, and an FFT of both sets of data is performed to calculate the EVM value. In the case of IEEE 802.11n (5 MHz) for white space, the DUC has an EVM value of 1.34%, calculated according to (4.1), where N is equal to $1000 \times 64$. The EVM performance with 64-QAM is illustrated as an example in Figure 5.13.



Figure 5.13: Received Constellation for IEEE 802.11n (5 MHz) with 64-QAM, without AWGN.
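For illustration, an RMS EVM calculation over N symbols can be sketched as below. Equation (4.1) is not reproduced in this chapter, so the normalisation by the reference power used here is an assumption, and the toy data is invented purely to exercise the function.

```python
import math


def evm_percent(received, reference):
    """RMS EVM in percent over N symbols (assumed form of (4.1)):
    sqrt(sum |r - s|^2 / sum |s|^2) * 100."""
    if len(received) != len(reference):
        raise ValueError("sequences must have equal length")
    error_power = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    reference_power = sum(abs(s) ** 2 for s in reference)
    return 100.0 * math.sqrt(error_power / reference_power)


# Toy data: four constellation points with a 1% gain error (illustrative only).
reference = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
received = [1.01 * s for s in reference]
print(f"EVM = {evm_percent(received, reference):.2f}%")
```

In the measurement described above, `received` and `reference` would hold the frequency domain symbols after the FFT, with N equal to the number of OFDM symbols multiplied by the FFT size.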

Similarly, the EVM values of the IEEE 802.11n (10 MHz) and (20 MHz) DUCs for TV white space are 1.20% and 1.09% respectively, according to (4.1), where N is equal to $1000 \times 64$. Both designs have better EVM performance because their architectures have fewer interpolation stages than IEEE 802.11n (5 MHz) (three and two stages respectively), as shown in Figure 5.4, Figure 5.8 and Figure 5.9. More HB filters can cause more errors and thus reducing the number of interpolation stages can improve the EVM performance of DUC designs [207].

## 5.4.5 Implementation Results

The implementation results of white space DUCs for IEEE 802.11n are listed in Table 5.6. Compared to the hardware utilisation of IEEE 802.11n (20 MHz) (Table 4.26 on page 94), the occupied slices and DSP48Es for the white space designs are significantly higher than for the equivalent non-white-space designs. Longer filters, increased fractional wordlengths for the coefficients and input data, a higher IF sampling rate and the more exacting spectral purity requirements of the DDS are the main reasons for the increase in slices and DSP48Es.

According to [127], when the signed filter coefficient width is at most 18 bits and the input data width is at most 25 bits, only one DSP48E slice is required per multiplication. If the coefficient width is larger than 18 bits and the input data width is at most 25 bits, two DSP48E slices are required. However, when the coefficient width is larger than 18 bits and the input data width is larger than 25 bits, four DSP48Es are needed to perform the same multiplication. Note that all of these cases target the Virtex-5 device. In the case of the white space DUC designs, the fractional wordlength of the coefficients has to be increased to at least 21 bits, i.e. larger than 18, and thus the output of each FIR compiler must be limited to no more than 25 bits in order to minimise the DSP48E resource utilisation. When the IF sampling rate is half of the system clock, only two operations can be performed for every output sample and thus one more DSP48E is employed to perform the complex multiplication in the mixer component compared to the DUC designs in Chapter 4. In addition, since the Taylor series corrected method is required in the DDS compiler to achieve the higher SFDR figure for white space, the DDS occupies 3 DSP48Es and 1 block RAM to achieve a SFDR of 112 dB. Therefore, meeting the combined PSD requirements of IEEE 802.11n and the FCC rules for white space comes at the cost of increased hardware resource utilisation.
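The wordlength rules quoted from [127] can be summarised as a small lookup. This is a sketch of the three cases stated above only; the combination of a narrow coefficient with wide input data is not covered in the text, so the function returns `None` for it rather than guessing. The function name is hypothetical.

```python
def dsp48e_per_multiply(coeff_bits, data_bits):
    """DSP48E slices per multiplication on Virtex-5, per the cases in [127]."""
    if coeff_bits <= 18 and data_bits <= 25:
        return 1
    if coeff_bits > 18 and data_bits <= 25:
        return 2
    if coeff_bits > 18 and data_bits > 25:
        return 4
    return None  # coeff_bits <= 18 with data_bits > 25 is not covered in the text


# White space designs use 22-bit coefficients, so keeping the data path at
# 25 bits or fewer halves the multiplier cost relative to wider data.
print(dsp48e_per_multiply(22, 25))
print(dsp48e_per_multiply(22, 26))
```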

| DUC                            | LUTs | FFs  | Slices | DSP48Es | Block RAM | $f_{\rm max}$ (MHz) |
|--------------------------------|------|------|--------|---------|-----------|---------------------|
| 5 MHz                          | 1669 | 2528 | 860    | 24      | 1         | 407.3               |
| 10 MHz                         | 1658 | 2525 | 1031   | 29      | 1         | 338.9               |
| 20 MHz                         | 1608 | 2473 | 995    | 30      | 1         | 368.3               |
| DDS for<br>5, 10 and 20<br>MHz | 53   | 213  | 83     | 3       | 1         | 518.7               |

Table 5.6: Hardware Utilisation of White Space DUCs for IEEE 802.11n.

## 5.5 IEEE 802.16e DUC Designs for TV White Space

IEEE 802.16 also has the potential to be applied in the TV white space spectrum. The IEEE 802.16e standard supports various channel bandwidths from 1.75 MHz to 15 MHz. In this study, four IEEE 802.16e bandwidths are selected: 3.5 MHz, 5 MHz, 7 MHz and 10 MHz. The IEEE 802.16e (3.5 MHz) and (5 MHz) variants are basic formats and can be implemented with one available TV channel. IEEE 802.16e (7 MHz) and (10 MHz) provide additional bandwidth when two unoccupied channels are aggregated.

## 5.5.1 White Space DUC Designs for IEEE 802.16e (5 MHz) and (10 MHz)

From the analysis of Section 4.7.1 on page 73 and Table 5.3, the IF sampling rate and the system clock of the white space DUC designs for IEEE 802.16e (5 MHz) and (10 MHz) are 179.2 MHz and 358.4 MHz respectively, resulting in a carrier frequency range of 179.2 MHz. The architectures of the DUC designs are shown in Figures 5.14 and 5.15.



Figure 5.14: Architecture of IEEE 802.16e (5 MHz) DUC for TV White Space.



Figure 5.15: Architecture of IEEE 802.16e (10 MHz) DUC for TV White Space.

In the case of the IEEE 802.16e (5 MHz) DUC for white space, the input data has to be interpolated by a factor of 32 and thus 5 HB filters are required. The selected IF sampling rate is 179.2 MHz thus the DDS is controlled to generate carrier frequencies within the range -89.6 MHz to 89.6 MHz. Similarly, 4 HB filters are needed in total with interpolation by a factor of 16 in the case of IEEE 802.16e (10 MHz) DUC for white space. The filter designs used in the IEEE 802.16e (10 MHz) DUC for white space derive from those in the IEEE 802.16e (5 MHz) DUC for white space, as they have identical normalised passband and stopband requirements.

Taking into account the SEM requirements for IEEE 802.16e (5 MHz) and (10 MHz) (see Table 4.11 on page 73), the OFDM symbol properties (Table 4.12 on page 75) and the white space requirements proposed by the FCC (Table 5.2 on page 101), the filter designs for the TV white space IEEE 802.16e (5 MHz) DUC are listed in Table 5.7. The IEEE 802.16e (10 MHz) DUC for white space can share the channel filter and HB filters 1–4 with the IEEE 802.16e (5 MHz) DUC.

The passband frequency is defined as  $BW_{occupied}/2$  and the stopband frequency is defined as  $BW_{total} - BW_{occupied}/2$ . The passband ripple and stopband attenuation are set to 0.01 dB and 90 dB respectively and thus in total 95 taps with a symmetric structure are needed for the channel filter. The passband ripple of the HB filter is set to 0.0005 dB to ensure that the filter design has sufficient attenuation in the stopband to meet the SEM requirements.

|                   | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | F <sub>stop</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | A <sub>stop</sub><br>(dB) | Taps |
|-------------------|--------------------------|----------------------------|----------------------------|---------------------------|---------------------------|------|
| Channel<br>Filter | 5.6                      | 2.369                      | 2.631                      | 0.01                      | 90                        | 95   |
| 1st HB            | 11.2                     | 2.5                        |                            | 0.0005                    |                           | 103  |
| 2nd HB            | 22.4                     | 2.5                        |                            | 0.0005                    |                           | 19   |
| 3rd HB            | 44.8                     | 2.5                        |                            | 0.0005                    |                           | 11   |
| 4th HB            | 89.6                     | 2.5                        |                            | 0.0005                    |                           | 11   |
| 5th HB            | 179.2                    | 2.5                        |                            | 0.0005                    |                           | 11   |

Table 5.7: DUC Filters for IEEE 802.16e (5 MHz).
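The channel filter band edges in Table 5.7 can be reproduced from the definitions $F_{pass} = BW_{occupied}/2$ and $F_{stop} = BW_{total} - BW_{occupied}/2$. In the sketch below, the occupied bandwidth of 4.738 MHz is inferred as twice the tabulated passband frequency, and the 10 MHz value is assumed to be exactly double, in line with the statement that both variants share the same normalised requirements.

```python
def band_edges(bw_total_mhz, bw_occupied_mhz):
    """Channel filter passband and stopband edges as defined in the text."""
    f_pass = bw_occupied_mhz / 2
    f_stop = bw_total_mhz - bw_occupied_mhz / 2
    return f_pass, f_stop


BW_OCCUPIED_5_MHZ = 4.738  # inferred as 2 x F_pass from Table 5.7 (assumption)

f_pass, f_stop = band_edges(5.0, BW_OCCUPIED_5_MHZ)
f_pass10, f_stop10 = band_edges(10.0, 2 * BW_OCCUPIED_5_MHZ)

print(f"5 MHz:  F_pass = {f_pass:.3f} MHz, F_stop = {f_stop:.3f} MHz")
print(f"10 MHz: F_pass = {f_pass10:.3f} MHz, F_stop = {f_stop10:.3f} MHz")
# Normalised to the channel filter rates of 5.6 Msps and 11.2 Msps:
print(f"normalised F_pass: {f_pass / 5.6:.5f} vs {f_pass10 / 11.2:.5f}")
print(f"normalised F_stop: {f_stop / 5.6:.5f} vs {f_stop10 / 11.2:.5f}")
```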

## 5.5.2 DUC Performance for White Space IEEE 802.16e (5 MHz) and (10 MHz)

## 5.5.2.1 Power Spectral Density of Transmitted Signals

The PSD result of IEEE 802.16e (5 MHz) DUC for TV white space is illustrated in Figure 5.16. The range from the central point in the frequency range to  $f_{\rm IF}/2$  is considered. The central point is located at the lower edge of Channel 29, and thus a carrier of frequency 27 MHz is generated by the DDS component so as to translate the signal into Channel 33.

The PSD result of IEEE 802.16e (10 MHz) DUC for TV white space is depicted in Figure 5.17. A carrier of 24 MHz is required when Channels 32-33 are used. Both sets of PSD results demonstrate that the DUC designs can simultaneously meet the SEM requirements defined by the standard and FCC.


Figure 5.16: DUC Transmission Spectrum of IEEE 802.16e (5 MHz) for TV White Space.



Figure 5.17: DUC Transmission Spectrum of IEEE 802.16e (10 MHz) for TV White Space.

## 5.5.2.2 Error Vector Magnitude

The EVM requirement for IEEE 802.16e for TV white space has not become a part of the standard specifications, but an EVM measurement for the DUC can be performed as discussed before. The input data contains 1000 OFDM symbols based on 16-QAM Optional FUSC type, with 1024 and 512 FFT sizes for white space IEEE 802.16e (10 MHz) and (5 MHz) respectively. EVM values are calculated based on the method in Section 4.7.2 on page 77. The EVM values generated by the DUC for white space IEEE 802.16e (5 MHz) and (10 MHz) are equivalent to 1.05% and 1.03% respectively, according to (4.1), where N is equal to  $1000 \times 512$  and  $1000 \times 1024$  respectively. The received and the input constellation plots for 16-QAM modulation for TV white space IEEE 802.16e (5 MHz) DUC are shown in Figure 5.18.



Figure 5.18: Received Constellation for IEEE 802.16e (5 MHz) with 16-QAM, without AWGN.

## 5.5.3 White Space DUC Designs for IEEE 802.16e (3.5 MHz) and (7 MHz)

Based on the analysis of Section 4.7.3 on page 79 and according to Table 5.3, the IF sampling rate and the system clock rate of the white space DUC designs for IEEE 802.16e (3.5 MHz) and (7 MHz) are 128 MHz and 256 MHz respectively. IEEE 802.16e (3.5 MHz) for white space is a basic format catering for a single channel, while IEEE 802.16e (7 MHz) is an additional format used when two neighbouring channels are available. Similar to IEEE 802.16e (10 MHz) and (5 MHz), the filter designs of the IEEE 802.16e (3.5 MHz) DUC for white space can be shared by the IEEE 802.16e (7 MHz) DUC for white space, and the resulting DUC architectures are illustrated in Figure 5.19 and Figure 5.20.



Figure 5.19: Architecture of IEEE 802.16e (3.5 MHz) DUC for TV White Space.



Figure 5.20: Architecture of IEEE 802.16e (7 MHz) DUC for TV White Space.

With regard to IEEE 802.16e (3.5 MHz) DUC for TV white space, the sampling rate has to be increased from 4 Msps to 128 Msps, giving a total interpolation ratio of 32. As a result, 5 HB filters are required to perform the interpolation. The HB filters 1–4 can be shared by the IEEE 802.16e (7 MHz) DUC for TV white space.

Taking into account the SEM requirements for IEEE 802.16e (3.5 MHz) and (7 MHz) (Table 4.15 on page 80), the OFDM symbol properties (Table 4.16 on page 81) and the SEM requirements for TV white space (Table 5.2), the filter designs for the TV white space IEEE 802.16e (3.5 MHz) DUC are as summarised in Table 5.8.

|                   | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | F <sub>stop</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | A <sub>stop</sub><br>(dB) | Taps |
|-------------------|--------------------------|----------------------------|----------------------------|---------------------------|---------------------------|------|
| Channel<br>Filter | 4                        | 1.691                      | 1.809                      | 0.01                      | 90                        | 150  |
| 1st HB            | 8                        | 1.75                       |                            | 0.0003                    |                           | 91   |
| 2nd HB            | 16                       | 1.75                       |                            | 0.0003                    |                           | 19   |
| 3rd HB            | 32                       | 1.75                       |                            | 0.0003                    |                           | 15   |
| 4th HB            | 64                       | 1.75                       |                            | 0.0003                    |                           | 11   |
| 5th HB            | 128                      | 1.75                       |                            | 0.0003                    |                           | 11   |

Table 5.8: DUC Filters for IEEE 802.16e (3.5 MHz).

The passband frequency is defined as  $BW_{occupied}/2$  and the stopband frequency is defined as  $BW_{total} - BW_{occupied}/2$ . The channel filter of 150 taps is generated when the passband ripple and stopband attenuation are set to 0.01 dB and 90 dB respectively. In the case of the HB filters, the passband cutoff frequency is set to 1.75 MHz with a passband ripple of 0.0003 dB, which provides the 1st HB with a stopband attenuation of around 100 dB and ensures that the filter chain can meet the SEM requirements.

## 5.5.4 DUC Performance for White Space IEEE 802.16e (3.5 MHz) and (7 MHz)

## 5.5.4.1 Power Spectral Density of Transmitted Signals

The PSD result of the IEEE 802.16e (3.5 MHz) DUC for TV white space is illustrated in Figure 5.21. The central point in the frequency range coincides with the lower edge of Channel 29 and thus a carrier of 27 MHz is generated by the DDS component so as to translate the signal into Channel 33.



Figure 5.21: DUC Transmission Spectrum of IEEE 802.16e (3.5 MHz) for TV White Space.

Similarly, the PSD result of the IEEE 802.16e (7 MHz) DUC for TV white space is depicted in Figure 5.22. A complex sinusoid of 24 MHz is generated to modulate the interpolated signal in order that Channels 32-33 are occupied. In Channel 37, the PSD result of the design is lower than 100 dB, which complies with the additional FCC SEM requirements. Both results demonstrate that the designs are capable of meeting the SEM requirements defined by the standard and by the FCC.



Figure 5.22: DUC Transmission Spectrum of IEEE 802.16e (7 MHz) for TV White Space.

## 5.5.4.2 Error Vector Magnitude

The EVM values of the DUC for white space IEEE 802.16e (3.5 MHz) and (7 MHz) are equivalent to 1.11% and 1.09% respectively, according to (4.1), where N is equal to  $1000 \times 512$  and  $1000 \times 1024$  respectively. The received and the input constellation plots with 16-QAM modulation are taken as an example for TV white space IEEE 802.16e (3.5 MHz), and are shown in Figure 5.23.



Figure 5.23: Received Constellation for IEEE 802.16e (3.5 MHz) with 16-QAM, without AWGN.

## 5.5.5 Implementation Results

The implementation results of white space DUCs for IEEE 802.16e are listed in Table 5.9. Notably, the occupied slices and DSP48Es in the white space designs have increased significantly compared to the hardware utilisation of conventional DUCs for IEEE 802.16e (Table 4.19 on page 86). Longer filters, increased fractional wordlengths for the coefficients and input data, a higher IF sampling rate and the more exacting spectral purity requirements of the DDS component are the main reasons for the increased usage of slices and DSP48Es, as discussed previously.

Based on the results presented in Table 5.9, the DUC of IEEE 802.16e (7 MHz) for white space occupies the most DSP48Es and the DUC of IEEE 802.16e (10 MHz) for white space uses the most slices, reaching 33 and 1184 respectively. The maximum frequency is analysed to test whether all of the DUC designs can meet the timing requirements for the system (358.4 MHz for white space IEEE 802.16e (10 MHz) and (5 MHz), 256 MHz for white space IEEE 802.16e (7 MHz) and (3.5 MHz)). The $f_{\rm max}$ values in Table 5.9 exceed these clock rates in every case, so all of the designs are able to meet the critical timing requirements.

| DUC                    | LUTs | FFs  | Slices | DSP48Es | Block RAMs | $f_{\rm max}$ (MHz) |
|------------------------|------|------|--------|---------|------------|---------------------|
| 5 MHz                  | 1982 | 3044 | 1142   | 27      | 1          | 408.2               |
| 10 MHz                 | 1848 | 2923 | 1184   | 30      | 1          | 387.9               |
| DDS for<br>5 & 10 MHz  | 54   | 267  | 102    | 3       | 1          | 495.8               |
| 3.5 MHz                | 2043 | 3129 | 1114   | 28      | 1          | 376.2               |
| 7 MHz                  | 1953 | 3066 | 1173   | 33      | 1          | 411.9               |
| DDS for<br>3.5 & 7 MHz | 52   | 256  | 93     | 3       | 1          | 459.6               |

Table 5.9: Hardware Utilisation of White Space DUCs for IEEE 802.16e.
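The timing check described above amounts to comparing each $f_{\rm max}$ in Table 5.9 with the required system clock from Table 5.3. A minimal sketch of that comparison follows; the dictionary layout and function name are illustrative.

```python
# (f_max from Table 5.9 in MHz, required system clock from Table 5.3 in MHz)
RESULTS = {
    "IEEE 802.16e (5 MHz)": (408.2, 358.4),
    "IEEE 802.16e (10 MHz)": (387.9, 358.4),
    "IEEE 802.16e (3.5 MHz)": (376.2, 256.0),
    "IEEE 802.16e (7 MHz)": (411.9, 256.0),
}


def timing_margin_mhz(f_max_mhz, f_clk_mhz):
    """Positive margin means the design meets the system clock requirement."""
    return f_max_mhz - f_clk_mhz


for name, (f_max, f_clk) in RESULTS.items():
    margin = timing_margin_mhz(f_max, f_clk)
    status = "met" if margin >= 0 else "NOT met"
    print(f"{name}: f_max = {f_max} MHz, clock = {f_clk} MHz, "
          f"margin = {margin:.1f} MHz ({status})")

tightest = min(RESULTS, key=lambda n: timing_margin_mhz(*RESULTS[n]))
print("smallest margin:", tightest)
```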

## 5.6 LTE DUC Designs for TV White Space

The authors of [117] expressed the opinion that LTE has great potential to be applied in TV white space. Since LTE is intended to provide high speed data for mobile phones and terminals and to replace existing 3G wireless mobile standards, e.g. WCDMA, it has great market potential. LTE in white space spectrum may have larger coverage and better indoor penetration, which are attractive features for both commercial operators and customers. Larger coverage means that fewer base stations are required to serve the same area compared to conventional LTE transmission in the Gigahertz range, resulting in lower CapEx and OpEx, so operators could reduce costs and provide better service. Better indoor penetration means that customers could receive better signal quality in buildings. In summary, LTE in TV white space has a potential market in the future, although this conclusion is drawn from a technical perspective only.

LTE supports a range of bandwidths from 1.4 MHz to 20 MHz. Choosing a bandwidth of 1.4 MHz or 3 MHz may require new system clock frequencies and can increase the overall design complexity. In this thesis, the TV white space DUC designs for LTE are therefore only required to support 5 MHz and 10 MHz.

## 5.6.1 White Space DUC Designs for LTE (5 MHz) and (10 MHz)

Based on the analysis of Section 4.6.1 on page 59 and according to Table 5.3, the IF sampling rate and the system clock of white space DUC designs for LTE (5 MHz) and (10 MHz) are 122.88 MHz and 245.76 MHz respectively, resulting in a carrier frequency range of 122.88 MHz. Therefore, the architectures of the DUC designs are shown in Figure 5.24 and Figure 5.25.



Figure 5.24: Architecture of LTE (5 MHz) DUC for TV White Space.



Figure 5.25: Architecture of LTE (10 MHz) DUC for TV White Space.

In the case of the LTE (5 MHz) DUC for white space, the input data has to be interpolated by a factor of 16 and thus 4 HB filters are required. The selected IF sampling rate is 122.88 MHz, and thus the DDS can generate any carrier frequency within the range -61.44 MHz to 61.44 MHz. Similarly, 3 HB filters are needed in total with interpolation by a factor of 8 in the case of the LTE (10 MHz) DUC for white space. The filter designs in the LTE (10 MHz) DUC for white space derive from those in the LTE (5 MHz) DUC for white space as they have identical normalised passband and stopband requirements.

Taking into account the SEM requirements for LTE (5 MHz) and (10 MHz) (Table 4.2 on page 59), the OFDM symbol properties (Table 4.3 on page 60) and the SEM requirements for TV white space (Table 5.2), the filter designs for the TV white space LTE (5 MHz) DUC are summarised in Table 5.10.

|                   | F <sub>s</sub><br>(Msps) | F <sub>pass</sub><br>(MHz) | F <sub>stop</sub><br>(MHz) | A <sub>pass</sub><br>(dB) | A <sub>stop</sub><br>(dB) | Taps |
|-------------------|--------------------------|----------------------------|----------------------------|---------------------------|---------------------------|------|
| Channel<br>Filter | 7.68                     | 2.25                       | 2.5                        | 0.01                      | 90                        | 136  |
| 1st HB            | 15.36                    | 2.5                        |                            | 0.0003                    |                           | 31   |
| 2nd HB            | 30.72                    | 2.5                        |                            | 0.0003                    |                           | 15   |
| 3rd HB            | 61.44                    | 2.5                        |                            | 0.0003                    |                           | 11   |
| 4th HB            | 122.88                   | 2.5                        |                            | 0.0003                    |                           | 11   |

Table 5.10: DUC Filters for LTE (5 MHz).

The passband frequency is defined as $BW_{occupied}/2$ and the stopband frequency is defined as $BW_{total}/2$. The passband ripple and stopband attenuation are set to 0.01 dB and 90 dB respectively and thus in total 136 taps with a symmetric structure are needed for the channel filter. The passband ripple of the HB filter is set to 0.0003 dB to ensure that the filter design has sufficient attenuation in the stopband to meet the SEM requirements.
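The differing channel filter lengths across the standards can be related to the transition width relative to the sample rate. The sketch below uses a common rule of thumb for FIR length, $N \approx (F_s/\Delta f)(A_{stop}/22)$, often attributed to fred harris. This is not the design method used in this thesis (the filters were designed with FDATool), so the estimates are only expected to follow the trend of the tabulated tap counts, not to match them.

```python
# Channel filter specifications from Tables 5.5, 5.7, 5.8 and 5.10:
# (F_s in Msps, F_pass in MHz, F_stop in MHz, taps reported in the table)
CHANNEL_FILTERS = {
    "IEEE 802.11n (5 MHz)": (7.5, 2.1875, 2.8125, 53),
    "IEEE 802.16e (5 MHz)": (5.6, 2.369, 2.631, 95),
    "IEEE 802.16e (3.5 MHz)": (4.0, 1.691, 1.809, 150),
    "LTE (5 MHz)": (7.68, 2.25, 2.5, 136),
}
A_STOP_DB = 90.0


def estimated_taps(fs, f_pass, f_stop, atten_db):
    """Rule-of-thumb FIR length: (fs / transition width) * (atten_db / 22)."""
    return fs / (f_stop - f_pass) * atten_db / 22.0


for name, (fs, f_pass, f_stop, taps) in CHANNEL_FILTERS.items():
    estimate = estimated_taps(fs, f_pass, f_stop, A_STOP_DB)
    print(f"{name}: transition/F_s = {(f_stop - f_pass) / fs:.4f}, "
          f"estimated taps = {estimate:.0f}, reported taps = {taps}")
```

The narrower the transition band relative to the sample rate, the longer the channel filter, which is consistent with the IEEE 802.16e (3.5 MHz) and LTE (5 MHz) channel filters being the longest.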

## 5.6.2 DUC Performance for White Space LTE (5 MHz) and (10 MHz)

## 5.6.2.1 Power Spectral Density of Transmitted Signals

The PSD result of LTE (5 MHz) DUC for TV white space is illustrated in Figure 5.26. The central frequency point coincides with the lower edge of Channel 29 and thus a carrier of 27 MHz is required from the DDS component to translate the signal into Channel 33.



Figure 5.26: DUC Transmission Spectrum of LTE 5 MHz for TV White Space.

Similarly, the PSD result of LTE (10 MHz) DUC for TV white space is depicted in Figure 5.27. A complex sinusoid at 24 MHz is generated to modulate the interpolated signal in order that Channels 32-33 are occupied. In Channel 37, the PSD result of the design is lower than 100 dB, which complies with the additional FCC SEM requirements. Both results demonstrate that the designs are capable of meeting the SEM requirements defined by the standard and by the FCC.
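The two DDS carrier frequencies quoted above can be reproduced with a small calculation. This is a sketch only: it assumes a 6 MHz TV channel width (the US channel raster), and the function and constant names are hypothetical.

```python
CHANNEL_WIDTH_MHZ = 6  # assumption: 6 MHz US TV channels


def dds_carrier_mhz(reference_channel, first_channel, last_channel):
    """Offset (MHz) from the lower edge of reference_channel to the centre
    of the block of channels first_channel..last_channel."""
    lower_edge = (first_channel - reference_channel) * CHANNEL_WIDTH_MHZ
    upper_edge = (last_channel - reference_channel + 1) * CHANNEL_WIDTH_MHZ
    return (lower_edge + upper_edge) / 2


# LTE (5 MHz): signal centred on the lower edge of Channel 29, moved into Channel 33
carrier_5mhz = dds_carrier_mhz(29, 33, 33)
# LTE (10 MHz): signal moved so that it occupies Channels 32-33
carrier_10mhz = dds_carrier_mhz(29, 32, 33)

print(carrier_5mhz, carrier_10mhz)
```

Under these assumptions the offsets are 27 MHz and 24 MHz, both well inside the -61.44 MHz to 61.44 MHz range available from the DDS.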



Figure 5.27: DUC Transmission Spectrum of LTE (10 MHz) for TV White Space.

## 5.6.2.2 Error Vector Magnitude

The OFDM-based input data to the DUC complies with the LTE standard and uses a 64-QAM modulation scheme. 10 radio frames, i.e. a total of 1400 OFDM symbols, are used to calculate the EVM value. According to (4.1), the white space LTE (5 MHz) DUC has an EVM of 1.52% and the white space LTE (10 MHz) DUC has an EVM of 1.27%, where N is equivalent to  $1400 \times 512$  and  $1400 \times 1024$  respectively. The received constellation plot for the white space LTE (5 MHz) DUC with 64-QAM is shown in Figure 5.28.



Figure 5.28: Received Constellation for White Space LTE (5 MHz) with 64-QAM without AWGN.
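As an illustration of how an EVM figure of this kind is obtained, the sketch below computes an RMS EVM over N received samples. It uses a common definition (error power normalised by reference power); the exact normalisation used in (4.1) is not reproduced in this section, so this should be read as an assumption rather than the definition used for the results above.

```python
import math


def evm_percent(received, reference):
    """RMS EVM in percent: sqrt(sum|r - s|^2 / sum|s|^2) * 100 (assumed definition)."""
    if len(received) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    error_power = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    reference_power = sum(abs(s) ** 2 for s in reference)
    return 100.0 * math.sqrt(error_power / reference_power)


# Toy example: four ideal symbols and a received copy with a 1% gain error
ideal = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
measured = [s * 1.01 for s in ideal]
print(round(evm_percent(measured, ideal), 6))
```

For the measurements reported here, the sequences would contain N = 1400 × 512 or N = 1400 × 1024 samples.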

## 5.6.3 Implementation Results

The implementation results of white space DUCs for LTE are listed in Table 5.11. Similarly, the numbers of slices and DSP48Es occupied in the white space designs have increased significantly compared to the hardware utilisation of conventional DUCs for LTE (Table 4.9 on page 72). Based on the analysis of Table 5.11, the DUC of LTE (10 MHz) for white space occupies the most DSP48Es and the DUC of LTE (5 MHz) for white space uses the most slices, reaching 33 and 1170 respectively. The DDS compiler occupies 3 DSP48Es using the Taylor series corrected method to achieve an SFDR of 112 dB. The maximum frequency is analysed to ensure that all of the DUC designs meet the timing requirements for the system (245.76 MHz for white space LTE (5 MHz) and (10 MHz)). It is clear that all of the designs are able to meet the critical timing requirements.

| DUC                   | LUTs | FFs  | Slices | DSP48Es | Block RAMs | $f_{\rm max}$ (MHz) |
|-----------------------|------|------|--------|---------|------------|---------------------|
| 5 MHz                 | 1749 | 2730 | 1170   | 27      | 1          | 300.5               |
| 10 MHz                | 1712 | 2681 | 1026   | 33      | 1          | 310.0               |
| DDS for<br>5 & 10 MHz | 52   | 256  | 93     | 3       | 1          | 459.6               |

Table 5.11: Hardware Utilisation of white space DUCs for LTE.

# 5.7 Concluding Remarks

This chapter has given an overview of TV white space and the benefits it can offer compared to conventional communication standards. It has also introduced specific technical design challenges; this study only focuses on one of the technical challenges, namely the challenging SEM requirements. The SEM requirements can be met using an appropriate DUC design, as discussed in Chapter 4. Therefore, the DUC designs for conventional communication standards and modes require to be updated so as to cater specifically for white space applications. In this thesis, white space DUC design considerations and methods for IEEE 802.11n (5 MHz, 10 MHz and 20 MHz), IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz) and LTE (5 MHz and 10 MHz) are proposed, and DUC architectures are presented.

The performance and implementation results of white space DUC designs are detailed and analysed. The results obtained demonstrate that all of the white space DUC designs can meet the stringent SEM requirements of white space, and provide good performance in terms of EVM. The implementation results of white space DUC designs imply that all of the DUC architectures are capable of meeting their respective critical timing requirements, although hardware resource utilisation is increased significantly compared to the non-white space DUC designs from Chapter 4. Longer filters, increased coefficient and input data widths, a higher IF sampling rate, and a higher spectral purity requirement for the DDS component are the primary reasons for the increased slice and DSP48E count.

The white space DUC designs are an extension of the conventional DUC designs. All of the mentioned DUC designs in Chapter 4 and Chapter 5 will be integrated into a single FPGA device with PR and DRP technologies in Chapter 6. This will enhance the SDR system performance significantly as the device is able to support LTE (5 and 10 MHz), IEEE 802.16e (3.5, 5, 7 and 10 MHz), IEEE 802.11n (20 MHz), WCDMA, white space LTE (5 and 10 MHz), white space IEEE 802.16e (3.5, 5, 7 and 10 MHz) and white space IEEE 802.11n (5, 10 and 20 MHz); 4 communication standards and 17 modes in total. Moreover, in white space operation it supports adaptive bandwidth according to the geographic environment when one, two or even four adjacent channels are available. The design methodology and implementation for a reconfigurable DUC to support all of these standards and modes will be introduced in Chapter 6.

# Chapter 6

# **Hierarchical Design**

In Chapter 3, two FPGA-based dynamic reconfiguration technologies, i.e. PR and DRP, were introduced. In Chapters 4 and 5, DUC designs were presented for the conventional standards (LTE, IEEE 802.16e, IEEE 802.11n and WCDMA) and for the TV white space applications of LTE, IEEE 802.16e and IEEE 802.11n.

This chapter starts by considering design methods based on conventional FPGA reconfiguration to support multiple standards in an SDR system. Then, design methods based on PR technology only to support a variety of standards or modes are introduced. All of the design methods based on conventional FPGA reconfiguration and PR technology are shown to have their drawbacks from the perspectives of hardware resource utilisation or design flexibility and update-ability, and thus a design method based on a combination of PR and DRP is proposed. This is shown to combine the advantages of the design methods discussed previously, so as to support various standards and modes in an SDR system with low hardware resource utilisation and cost, while maintaining a high degree of design flexibility, expansibility, reusability and ease of updating.

Following this, the baseband diversity introduced in Chapter 4 is further analysed. A hierarchical design methodology based on a single FPGA for SDR systems is proposed based on the analysis of multiple standards and the developed architecture. Implementation results including hardware resource utilisation and timing results are presented and evaluated.

# 6.1 Conventional FPGA Designs in an SDR System

#### 6.1.1 Conventional Transmitter Chain in an SDR System

The SDR system considered here is able to support multiple communication standards and services with a single programmable terminal device. As discussed in Chapter 2, FPGAs are reprogrammable and have the flexibility to implement designs for a variety of communication standards. In addition, they also support computation in parallel, specifically for filtering and Fast Fourier Transform (FFT) processing in the physical layer, and thus play an important role in SDR systems [1] [48] [128]–[132]. The conventional transmitter in the SDR system is illustrated in Figure 6.1.



Figure 6.1: Conventional Transmitter Chain in SDR System.

With regard to wireless protocols, there are three major components in the transmitter chain: coding, modulation and DUC [133]. The coding component may involve a variety of encoders, for example a Reed-Solomon encoder, Convolutional encoder, Turbo encoder, Low-Density Parity-Check (LDPC) encoder and different types of punctured convolutional codes and interleavers to support multiple standards. The modulation component consists of various mapper schemes (QPSK, 16-QAM and 64-QAM etc.), and two types of transforms: an Inverse Fast Fourier Transform (IFFT) and a spreader. The IFFT is an essential component in supporting the various standards which are based on OFDM structures, e.g. LTE, IEEE 802.16 and 802.11. Of those considered in this study, the spreader is only used for the WCDMA standard.

The data received from the Media Access Control (MAC) layer is at the bit level and is passed to the coding component to perform coding, puncturing and interleaving. The output of the coding component is also at the bit level, but expanded to include extra coding bits. The data output by the coding component is supplied to the mapper in the modulation component, which generates a signal based on the defined constellation symbols (I and Q channels). The constellation symbols are further processed by the IFFT component to provide output data from the modulation component with the desired FFT size and Cyclic Prefix (CP) length to support all of the standards based on OFDM symbols. Alternatively, the constellation symbols could be supplied to the spreader to cater for the WCDMA standard. Following the modulation stage, the I and Q channel data is interpolated and multiplied with a complex sinusoid to generate the final output waveform.
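The modulation path described above (bits to constellation symbols, IFFT, CP insertion) can be sketched in a few lines. This is a toy model only: it assumes a Gray-mapped QPSK constellation, loads every subcarrier with data, and uses a direct inverse DFT instead of an FFT core; all function names are hypothetical and the sizes are far smaller than those used by the standards.

```python
import cmath


def qpsk_map(bits):
    """Map pairs of bits to unit-energy QPSK symbols (Gray mapping assumed)."""
    scale = 2 ** -0.5
    return [complex(1 - 2 * b0, 1 - 2 * b1) * scale
            for b0, b1 in zip(bits[0::2], bits[1::2])]


def idft(freq_samples):
    """Direct inverse DFT; a hardware design would use an IFFT core instead."""
    n = len(freq_samples)
    return [sum(x * cmath.exp(2j * cmath.pi * k * t / n)
                for k, x in enumerate(freq_samples)) / n
            for t in range(n)]


def add_cyclic_prefix(symbol, cp_len):
    """Copy the last cp_len time-domain samples to the front of the symbol."""
    return symbol[-cp_len:] + symbol


def ofdm_symbol(bits, fft_size, cp_len):
    """Bits -> QPSK symbols -> IFFT -> CP, for one OFDM symbol."""
    symbols = qpsk_map(bits)
    if len(symbols) != fft_size:
        raise ValueError("need exactly 2 * fft_size bits")
    return add_cyclic_prefix(idft(symbols), cp_len)


# Toy sizes: 8-point transform with a 2-sample CP
tx = ofdm_symbol([0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1], 8, 2)
print(len(tx))
```

The resulting I and Q samples are what the DUC then interpolates and mixes with the complex sinusoid.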

Both coding and modulation components operate in the baseband and thus these two components are able to cope with baseband design diversity as discussed in Chapter 4. The DUC component can also solve the problems of RF diversity. The FPGA device is capable of implementing all of the components mentioned above in the SDR system. However, based on the analysis of [134]–[139], the bit level operations could be implemented efficiently in software in DSPs or GPPs. Therefore, this thesis will consider only the modulation and DUC components typically implemented on FPGA. The development of a coding engine capable of supporting the various coding algorithms, puncturing and interleaving using DSPs or GPPs is excluded from this study.

#### 6.1.2 Fixed Multiple Standards Design

In conventional FPGA designs, there are two methods used to support multiple standards on a single device. The first is to implement all of the functional components on the FPGA device and enable the functionalities to be switched to the desired standard or mode. This method is referred to as the *fixed multiple standards design* in this thesis. The second method is to achieve functionality or design switching by downloading the corresponding bitstreams to reprogram the entire device, and this is referred to as the *programmable multiple standards design*. Four standards: LTE, IEEE 802.16e, IEEE 802.11n and WCDMA are taken as examples to analyse these conventional FPGA design methods in an SDR system.

The fixed multiple standards design method is illustrated in Figure 6.2. The data from the coding engine has to pass through a demultiplexer whose purpose is to select different modulation and DUC functionalities according to the user's requirements. Another multiplexer is added in order to create a single output for the various standards or modes, as only one of the standards is required to operate at any given time. Standards switching can be implemented with ease by controlling the parameters of the demultiplexer and multiplexer because all of the designs supported are implemented in parallel on the device.



Figure 6.2: Architecture of Fixed Multiple Standards Design in SDR System.

Although this method is able to support multiple standards and modes in the SDR system, it also has several drawbacks. As the number of standards and services supported in the SDR system increases, more functionalities have to be implemented on the device. As a result, a larger FPGA device may be required to provide sufficient hardware resource to complete the implementation, resulting in higher equipment cost. Since the device is intended to operate only one standard or mode at a time, it is inefficient to implement the superset of functionalities. In addition, the unused designs resident on the FPGA may lead to much higher power consumption than is necessary [140]. In general, larger FPGA devices mean larger SDR system size, higher equipment cost and power consumption. The expansibility and reusability of the design is low, as the entire design has to be redesigned and reimplemented whenever a new standard or mode is integrated. Taking all of these factors into account, the fixed multiple standards method is not well suited to an SDR system designed to support a variety of standards and modes.

#### 6.1.3 Programmable Multiple Standards Design

The programmable multiple standards design method is shown in Figure 6.3. Each standard or mode has different design parameters for the modulation and DUC components, and standard or mode switching is achieved by downloading the corresponding bitstream file to reconfigure the entire device. For example, suppose that the device is operating the LTE standard during a certain period, and then the IEEE 802.16e standard is required according to the user's needs. Consequently, the bitstream file for IEEE 802.16e is downloaded to the FPGA to reconfigure the entire device in order to change standard as the user requires.

Compared to the fixed multiple standards design method, this method overcomes its major shortcoming of being highly inefficient, as discussed in Section 6.1.2. Since it is able to exploit the reprogrammability of the FPGA to switch standards or modes, there is no need to implement all of the designs in parallel, which leads to a much smaller FPGA being required, and thus lower equipment cost and power consumption.

Figure 6.3: Architecture of Programmable Multiple Standards Design in SDR System.

However, the reconfiguration overhead (time required to reprogram the whole device) is significant, and therefore this approach cannot meet the requirements of a real-time SDR device based on the analysis of Chapter 1. The reconfiguration overhead is defined as the period which starts at initiating download of the bitstream file, and ends when the FPGA device starts to operate in the new standard mode, as discussed in Chapter 3. The reconfiguration process involves performing a startup sequence after the bitstream has been downloaded to the FPGA. During the startup sequence, the "DONE" signal (which indicates that the whole configuration process is complete) is released, I/O pins become active and the internal reset signal is deasserted [35] [206]. In addition, the operation of the FPGA is disrupted during this period because conventional FPGA reconfiguration involves halting the entire device. Therefore, standards or modes switching based on this method suffers from a time consuming reconfiguration overhead and FPGA disruption, even when only small modifications in the design need to be made.

# 6.2 Multiple Standards Design based on PR

#### 6.2.1 Practicalities of PR

Based on the analysis of Chapter 3, PR is capable of overcoming all of the drawbacks of fixed and programmable multiple standards designs. PR allows one or several portion(s) of the FPGA to be reconfigured on the fly while the rest continue to operate unaffected [29]. This enables the end users to dynamically change functionalities by downloading different partial bitstream files, resulting in a higher degree of operational flexibility compared to conventional reconfiguration [30].

There is no need to implement all of the reconfigurable functionalities in parallel on a single FPGA device, resulting in lower hardware resource utilisation. Moreover, design implementation with PR enables a reduction in terms of power consumption, as will be explained and analysed in Chapter 7.

The partial bitstream only carries configuration data and address information and this ensures that there is no startup process in the PR overhead. Since the partial bitstreams reconfigure one or some areas of the FPGA rather than the entire FPGA device, the size of the partial bitstream is much smaller than a conventional configuration file, which means that a shorter bitstream downloading time is required. Moreover, the latest PR method flows enable reconfigurable function preservation and importing, which increases design reusability and shortens development cycles.

A number of studies concerning PR-enabled SDR architectures have been published in recent years [140]–[152] and [154]–[156]. In summary, the attractive features of PR can enable lower hardware resource utilisation, shorter reconfiguration overhead and higher design reusability. Related studies show that PR has been successfully applied to FPGA-based SDR systems to replace conventional FPGA design methods.

Despite there having been a number of studies in the area, those mentioned above only focus on baseband processing components to support multiple standards. However, as evident from the reviews of Chapters 4 and 5, full standard or mode switching between LTE, IEEE 802.16, IEEE 802.11 and WCDMA or even emerging standards means that not only the processing logic, but also the clock frequency, has to be reconfigured in order to support the DFE. The DCM used to synthesise the clock from the input clock oscillator is not part of the reconfigurable logic, and therefore its operation cannot be controlled via PR. As a result, changing radio functionality may not be fully achieved by PR alone.

#### 6.2.2 Multiple Clock Oscillators Design

Two methods can be used to enable PR to meet the requirements of standard or mode switching in logic functionalities and clock frequency. The first is referred to as the *multiple clock oscillators design*, which employs several clock oscillators to cater for each standard or mode in this study. The second is the *normalised clock oscillator design*, which employs a single clock oscillator with a normalised frequency according to the standards or modes supported.

The architecture of the multiple clock oscillators design with PR is illustrated in Figure 6.4. From the analysis of Chapters 4 and 5, four clock frequencies are required, i.e. 240 MHz, 245.76 MHz, 256 MHz and 358.4 MHz. A multiplexer selects the correct frequency for the chosen DUC designs. A change in DUC design would be implemented by downloading the corresponding partial bitstreams. Together, the clock multiplexer and PR designs ensure that the standard or mode switching is implemented correctly. For example, the clock frequency can be changed from 245.76 MHz to 358.4 MHz by sending a control signal to the clock selector, and the DUC operation can be switched by downloading the partial bitstream.

Figure 6.4: Architecture of Multiple Clock Oscillators with PR in the SDR System.

This method is capable of solving the problem of changing clock frequency in accordance with standard or mode switching. However, it also has some limitations. The first is that it must employ multiple clock oscillators, resulting in higher equipment cost. In addition, these oscillators have to operate all of the time (whether in active use or not), which would lead to high power consumption. The second drawback is that it does not possess good expansibility. Although the DUC component with the PR design method is capable of integrating the designs for new standards, additional clock frequencies are very likely to be required, and thus more clock oscillators would have to be added to support new designs. Consequently, it would be difficult to update this architecture to support new standards and services.

#### 6.2.3 Normalised Clock Oscillator Design

The alternative method, i.e. the normalised clock oscillator with PR design, is depicted in Figure 6.5. Compared to the multiple clock oscillators method, the biggest difference is that only one clock oscillator is required, specified to oscillate at a frequency normalised across the supported standards.

Figure 6.5: Architecture of Normalised Clock Oscillator with PR in the SDR System.

In Figure 6.5, LTE (5 MHz), WCDMA and IEEE 802.16e (5 MHz) are taken as an example. The input data rates to the DUC for these three standards are 7.68 Msps, 3.84 Msps and 5.6 Msps respectively, as discussed earlier. The normalised clock frequency should be the Least Common Multiple (LCM) of the frequencies supported, and thus the normalised frequency is 268.8 MHz for these three standards. The interpolation factor of LTE (5 MHz) is equivalent to 35. Similarly, the interpolation factors of WCDMA and IEEE 802.16e (5 MHz) are 70 and 48 respectively. Consequently, the DUCs could be designed with interpolation factors of 35, 70 and 48 for LTE (5 MHz), WCDMA and IEEE 802.16e (5 MHz) respectively, permitting these three DUC designs to be implemented with PR. Standard switching could be achieved by downloading the corresponding partial bitstreams while maintaining one single clock oscillator only.

This method could support three standards as shown earlier. However, the number of standards or modes that can be supported using this method is very limited. With an increase in the number of standards or modes, it becomes difficult to find a normalised frequency which is feasible for implementing on the FPGA. For example, the normalised frequency for LTE (10 MHz), WCDMA and IEEE 802.16e (10 MHz) is 537.6 MHz, which is a very high timing requirement that most FPGAs could not achieve. If IEEE 802.16e (3.5 MHz) were additionally integrated, the normalised frequency would become even more complicated and the timing requirement more difficult to achieve.
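The normalised frequencies and interpolation factors quoted above can be checked with a short LCM calculation. The sketch works in kHz so that all rates are integers. The 7.68, 3.84 and 5.6 Msps rates are those stated above; the 15.36 and 11.2 Msps rates used for the 10 MHz modes are assumptions (twice the 5 MHz rates) that happen to reproduce the 537.6 MHz figure. Function names are hypothetical.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def normalised_clock_khz(input_rates_khz):
    """Lowest clock (kHz) that is an integer multiple of every input rate."""
    return reduce(lcm, input_rates_khz)


def interpolation_factors(input_rates_khz):
    """Interpolation factor each DUC needs to reach the normalised clock."""
    clock = normalised_clock_khz(input_rates_khz)
    return [clock // rate for rate in input_rates_khz]


# LTE (5 MHz), WCDMA, IEEE 802.16e (5 MHz)
rates_5mhz = [7680, 3840, 5600]
clock_5mhz = normalised_clock_khz(rates_5mhz)
factors_5mhz = interpolation_factors(rates_5mhz)

# LTE (10 MHz), WCDMA, IEEE 802.16e (10 MHz); 15.36 and 11.2 Msps are assumed
rates_10mhz = [15360, 3840, 11200]
clock_10mhz = normalised_clock_khz(rates_10mhz)

print(clock_5mhz / 1000, factors_5mhz)
print(clock_10mhz / 1000)
```

The factors of 35, 70 and 48 are not powers of two, which is why the HB-based filter cascades of Chapters 4 and 5 cannot be reused directly with this method.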

Another factor to consider is that this method does not have good expansibility and reusability. With any addition of standards or modes, the normalised clock frequency is likely to be changed and thus the clock oscillator has to be changed too. Furthermore, the DUC modules have to be thoroughly redesigned, resulting in a longer development cycle once the normalised frequency has been changed. In addition, since the interpolation factor, e.g. 35 for LTE (5 MHz), is not a power of 2, the channel and interpolation filter designs become more complicated and employ more hardware resources than those discussed in Chapters 4 and 5. Similar situations arise for the DUC designs of WCDMA and IEEE 802.16e (5 MHz). Considering the factors discussed above, this method is less well suited to the SDR system than the multiple clock oscillators design method.

# 6.3 DRP-PR Design Method

As discussed in earlier sections, PR technology is capable of switching standards or modes within the baseband component. However, in isolation PR is insufficient to implement all aspects of switching radio functionalities when the DUC component is involved. DRP technology has the ability to reconfigure the clock frequency, which can significantly counteract the deficiency of PR technology. As a result, the technique of DRP could be combined with PR to address the difficulties of communication standard or mode switching in terms of clock frequency and dependent functionalities. An architecture which utilises both PR and DRP in the SDR system is shown in Figure 6.6.



Figure 6.6: Architecture of DRP-PR Design Method in SDR System.

As reviewed in Section 3.5, DRP is capable of generating various clock frequencies to serve the FPGA design, by supplying the appropriate values to the DCM within the DRP architecture. This scheme uses a single fixed clock oscillator. In this study, an oscillator frequency of 256 MHz is chosen, as from this, 245.76 MHz, 358.4 MHz and 240 MHz can all be generated according to (3.2)-(3.4) on page 45. These four clock frequencies can support LTE (5 MHz and 10 MHz) and their TV white space equivalents, IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz) and their TV white space equivalents, IEEE 802.11n (20 MHz) and TV white space IEEE 802.11n (5 MHz, 10 MHz and 20 MHz) and WCDMA as analysed previously. DUC designs can be implemented with PR to switch functionalities with ease; these reconfigurable DUC designs are based on the DUC designs in Chapters 4 and 5.
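To illustrate why a single 256 MHz oscillator is sufficient, the sketch below searches for integer multiplier and divider values that produce each target frequency. It assumes the usual DCM frequency synthesis relationship, output = input × M / D, with M in 2 to 33 and D in 1 to 32; these ranges and the function name are assumptions for illustration and are not taken from (3.2)-(3.4).

```python
from fractions import Fraction

M_RANGE = range(2, 34)  # assumed multiplier range
D_RANGE = range(1, 33)  # assumed divider range


def dcm_ratio(f_in_khz, f_out_khz):
    """Return the smallest (M, D) with f_out = f_in * M / D, or None if unreachable."""
    ratio = Fraction(f_out_khz, f_in_khz)  # reduced to lowest terms
    k = 1
    while ratio.numerator * k <= M_RANGE[-1] and ratio.denominator * k <= D_RANGE[-1]:
        m, d = ratio.numerator * k, ratio.denominator * k
        if m in M_RANGE and d in D_RANGE:
            return m, d
        k += 1
    return None


OSCILLATOR_KHZ = 256000
settings = {target: dcm_ratio(OSCILLATOR_KHZ, target)
            for target in (245760, 358400, 240000)}
print(settings)
```

Under these assumptions all three frequencies are reachable with small integer settings, so switching clock only requires writing new M and D values through the DRP interface rather than changing the oscillator.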

Therefore, the DRP-PR approach proposed has significant reconfiguration ability, not only in functionalities, but also in terms of clock frequency. This approach can overcome all of the shortcomings of both the multiple clock oscillators and normalised clock oscillator methods described in Sections 6.2.2 and 6.2.3 respectively, and even conventional FPGA design methods. Therefore, this combined DRP-PR architecture is selected for the SDR system in this thesis, and will be analysed further during the remainder of this chapter.

# 6.4 Baseband Diversity

The SDR system aims to build a single architecture to support multiple standards or modes, rather than design different architectures to cater for each. Consequently, it is necessary to identify the commonalities between these standards in order to develop an efficient DRP and PR based design [157]. As discussed in Section 6.1.1 with respect to the transmitter chain, there are three major components: coding, modulation and the DFE. Only the modulation component is considered to reside within the baseband processing system, as the coding component can be implemented in software in DSPs or GPPs and thus can be omitted from this study. In this thesis, the analysis undertaken is based on the example of the downlink transmitter chain for four different standards (LTE, IEEE 802.16e, WCDMA and IEEE 802.11n).

The function of the modulator is to map bits from the data source into symbols for transmission. The modulation architectures and parameters differ significantly between the communication standards considered here.

In the case of the LTE downlink physical layer, QPSK, 16-QAM and 64-QAM modulation schemes are all employed, and the baseband signal has an adaptive OFDM structure, supporting FFT sizes from 128 up to 2048. The CP is added after the IFFT component to counter inter-symbol interference [58]. The normal CP length varies according to the channel bandwidth defined in the specification [158]; for example, the CP length of the 10 MHz bandwidth variant may be 80 or 72 samples, while the 5 MHz variant employs 40 and 36 samples. Similarly, IEEE 802.16e has a scalable OFDM physical layer to support FFT sizes from 128 to 2048. In this case, the CP length can be 1/4, 1/8, 1/16 or 1/32 of the frame duration [54] [94] [95]. For the IEEE 802.11n standard, the FFT size is fixed to 64 and the CP length can be 1/4 or 1/8 of the frame duration [99]. The modulation properties of TV white space LTE are identical to those of conventional LTE. Similarly, the data of TV white space versions of IEEE 802.16e and IEEE 802.11n have identical modulation properties to those of the original standards.

However, the WCDMA standard is somewhat different, in the sense that the baseband signals are not based on an OFDM structure, and the physical layer of the WCDMA standard features a spreader block instead of an IFFT. QPSK and Orthogonal Variable Spreading Factor (OVSF) codes are used to perform modulation. The SF ranges from 4 up to 512 [100].

The parameters of the four considered standards are summarised in Table 6.1. Note that, as only a subset of all possible LTE, IEEE 802.16e and IEEE 802.11n modes is analysed, only three of the FFT sizes mentioned above (64, 512 and 1024) need to be supported.

| Standards                                                   | Mapper                   | OFDM FFT Size | OFDM CP Length                                  | OVSF SF                                  |
|-------------------------------------------------------------|--------------------------|---------------|-------------------------------------------------|------------------------------------------|
| LTE/White Space LTE<br>5/10 MHz                             | QPSK<br>16-QAM<br>64-QAM | 512<br>1024   | 40, 36<br>80, 72<br>samples                     | None                                     |
| IEEE 802.16e/<br>White Space IEEE 802.16e<br>3.5/5/7/10 MHz | QPSK<br>16-QAM<br>64-QAM | 512<br>1024   | 1/4, 1/8,<br>1/16, 1/32<br>of frame<br>duration | None                                     |
| IEEE 802.11n/<br>White Space IEEE 802.11n<br>5/10/20 MHz    | QPSK<br>16-QAM<br>64-QAM | 64            | 1/4, 1/8 of<br>frame<br>duration                | None                                     |
| WCDMA                                                       | QPSK                     | None          | None                                            | 4, 8, 16,<br>32, 64,<br>128, 256,<br>512 |

Table 6.1: Modulation Parameters.

# 6.5 Hierarchical Design Methodology

Based on the analysis of the DRP and PR reconfiguration technologies, and the studied standards and modes, a hierarchical, layer-based design method for an SDR transmitter architecture based on a single FPGA device is proposed. The methodology is illustrated in Figure 6.7.

The first layer is divided into two RPs: modulation and DUC, in accordance with the functions in the transmitter chain in this study. As described in Section 3.1.2.3, an RP is defined as an area of the FPGA device to which PR is applied; it has the ability to dynamically change function. RMs are the swappable functionality associated with an RP. One RP may have multiple associated RMs, only one of which occupies the RP at any given time, such that the RMs are operated on the FPGA with time multiplexing.

As is evident from the requirements in Table 4.1 on page 57 and in Table 5.3 on page 106, in order to support standard or mode switching, not only the RMs, but also the clock frequencies require to be reconfigured. Consequently, DCMs with the DRP architecture are created to generate the various clock frequencies required to serve each RP, e.g. in order to change standards from TV white space LTE (10 MHz bandwidth) to IEEE 802.16e (10 MHz bandwidth), the clock frequency has to be reconfigured from 245.76 MHz to 358.4 MHz. Similarly, the clock frequency has to be changed from 245.76 MHz to 240 MHz when operation switches from LTE (5 MHz) to TV white space IEEE 802.11n.

The DUC RP consists of a variety of associated RMs: white space LTE (5 MHz and 10 MHz), white space IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz), white space IEEE 802.11n (5 MHz, 10 MHz and 20 MHz), LTE (5 MHz and 10 MHz), IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz), IEEE 802.11n (20 MHz) and WCDMA. This represents 4 standards and 17 modes in total.



Figure 6.7: Hierarchical Design Methodology for SDR Transmitter Architecture.

The second layer derives from the further division of the first layer. For example, the modulation RP can be split into a mapper RP and a transform RP. Based on analysis of the modulation parameters of the four standards considered, the mapper RP has three associated RMs: QPSK, 16-QAM and 64-QAM constellation modules. The transform RP provides two RMs to implement the IFFT and spreader functions respectively.
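The RP/RM hierarchy described above can be modelled as a small data structure, which also confirms the count of 17 DUC modes. This is an illustrative software model only, not the FPGA implementation; the class and variable names are hypothetical.

```python
class ReconfigurablePartition:
    """An RP with a set of associated RMs, only one of which is loaded at a time."""

    def __init__(self, name, modules):
        self.name = name
        self.modules = list(modules)
        self.active = None  # no RM loaded initially

    def load(self, module):
        """Model a partial reconfiguration: swap the active RM."""
        if module not in self.modules:
            raise ValueError(f"{module!r} is not an RM of the {self.name} RP")
        self.active = module
        return self.active


# Bandwidths (MHz) per standard, as listed for the DUC RP; WCDMA has a single mode
DUC_MODES = {
    "white space LTE": [5, 10],
    "white space IEEE 802.16e": [3.5, 5, 7, 10],
    "white space IEEE 802.11n": [5, 10, 20],
    "LTE": [5, 10],
    "IEEE 802.16e": [3.5, 5, 7, 10],
    "IEEE 802.11n": [20],
    "WCDMA": [None],
}

duc_rp = ReconfigurablePartition(
    "DUC",
    [f"{std} ({bw} MHz)" if bw is not None else std
     for std, bandwidths in DUC_MODES.items() for bw in bandwidths],
)
# Second layer: the modulation RP is split into mapper and transform RPs
mapper_rp = ReconfigurablePartition("mapper", ["QPSK", "16-QAM", "64-QAM"])
transform_rp = ReconfigurablePartition("transform", ["IFFT", "spreader"])

duc_rp.load("LTE (5 MHz)")
mapper_rp.load("64-QAM")
transform_rp.load("IFFT")
print(len(duc_rp.modules), duc_rp.active, mapper_rp.active, transform_rp.active)
```

Loading a new RM replaces the previous one in the same RP, which mirrors the time-multiplexed use of the FPGA area.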

The benefits of this architecture are primarily that it increases the FPGA device utilisation efficiency dramatically, and that it reduces the number of hardware devices in the physical layer compared with the SDR architecture described in Section 1. Judiciously choosing and floorplanning reasonable RPs is an important factor in making efficient use of FPGA hardware resources, and allows all of the radio functions for the transmitter to be integrated on a single FPGA device. Note that the proposed architecture is implemented in line with the discussion in Section 3.4, i.e. the design uses external PR control. Therefore, one GPP is employed to control when and which partitions of the FPGA are reconfigured according to the user's requirements.

This proposed architecture has several advantages. It permits enhanced device reuse, because via PR, various RMs can time-share the hardware resource in one RP. In addition, the DRP technology provides the ability to reconfigure clock frequency, which reduces the number of clock oscillators required. The combination of PR and DRP reconfiguration technologies strengthens the degree of flexibility in the design, and hence is very relevant to the requirements of SDR. The increasing sophistication of PR is fundamental to SDR applications, as advances in the technology allow RMs to be switched with shorter reconfiguration times. Taking all of these factors into account, the combination of PR and DRP leads to a less complex and lower cost SDR architecture than any of the other options considered.

# 6.6 System Architecture

Following the hierarchical design approach described in Section 6.5, the implementation architecture developed to support the four standards is illustrated in Figure 6.8. Two DCMs with DRP are employed: one is used to control the mapper RP, and the other is for the DUC RP.



Figure 6.8: An Implementation of SDR Architecture with PR-DRP to Support Multiple Standards.

The IFFT module in the transform RP is implemented using a Xilinx FFT core with the Radix-4 burst I/O structure, which uses fewer resources than the pipelined streaming structure. In the case of the LTE standard, the subcarrier spacing ( $\Delta f$ ) is 15 kHz, hence the interval between two points ( $1/\Delta f$ ) is equal to 66.7 µs. In order to accommodate the LTE standard, the processing time required by the FFT core must be shorter than this interval so that the core can process the data correctly and continuously. For the FFT core with the Radix-4 burst structure and supporting FFT sizes up to 1024, as designed in Xilinx ISE 12.4, the latency is 34.26 µs when the clock frequency is set to 100 MHz. In other words, the core can satisfy the needs of the LTE standard with a 100 MHz clock. Using similar logic, the processing ability of the core can also meet the requirements of IEEE 802.16e. In the case of IEEE 802.11n, the subcarrier spacing ( $\Delta f$ ) is 312.5 kHz, hence the interval between two points is equal to 3.2 µs. The latency is 2.38 µs when the FFT size is set to 64 with the same clock, which also meets the requirements of the IEEE 802.11n standard.
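The timing argument above can be checked numerically. The following is a minimal sketch, not part of the original design flow: the function and dictionary names are illustrative, and the latency figures are simply the values quoted in the text for a 100 MHz clock.

```python
# Check that the quoted FFT core latency fits within the interval 1/(subcarrier spacing).
# Names are illustrative; the numbers are the ones quoted in the text.

def interval_us(subcarrier_spacing_hz: float) -> float:
    """Interval 1/delta_f, expressed in microseconds."""
    return 1e6 / subcarrier_spacing_hz

# standard: (subcarrier spacing in Hz, quoted core latency in microseconds at 100 MHz)
CASES = {
    "LTE, FFT size 1024": (15e3, 34.26),
    "IEEE 802.11n, FFT size 64": (312.5e3, 2.38),
}

def latency_fits(subcarrier_spacing_hz: float, latency_us: float) -> bool:
    """True when the core finishes one transform before the next interval starts."""
    return latency_us < interval_us(subcarrier_spacing_hz)

for name, (spacing, latency) in CASES.items():
    print(f"{name}: interval {interval_us(spacing):.1f} us, "
          f"latency {latency} us, fits: {latency_fits(spacing, latency)}")
```

Both cases leave a margin between the core latency and the available interval, which is the condition the paragraph relies on.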

It is also notable that the FFT core is capable of changing the FFT size and CP length during operation, thus enabling various types of OFDM symbols to be created to comply with the requirements of mode switching between the LTE, IEEE 802.16e and 802.11n standards, and modes thereof.

The mapper DCM uses a 100 MHz crystal oscillator as the input clock to generate three clock frequencies: 200, 400 and 600 MHz, which serve the QPSK, 16-QAM and 64-QAM modules respectively. The data output by the mapper RP can then be fed to the transform RP, operating at a clock frequency of 100 MHz. For the WCDMA implementation, the spreader modules contain the OVSF code generator, which supports SFs from 4 to 512.

Two simple dual port BRAMs are used to bridge the clock domain boundary between the modulation processing components and the DUC RP. A 256 MHz clock oscillator is chosen as the input clock for the DUC DCM component, as the output frequencies (245.76 MHz, 358.4 MHz and 240 MHz) may be derived from this frequency by setting different integer multiplier and divisor values, as illustrated in (3.2), (3.3) and (3.4) on page 45. For example, the generation of the 245.76 MHz clock requires a multiplier value of 23 and a divisor value of 24. Similarly, the 358.4 MHz and 240 MHz clocks can be obtained with multiplier values of 13 and 14, and divisor values of 9 and 15, respectively.
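The relationship between the quoted multiplier and divisor values and the resulting DUC clock can be sketched as follows. Equations (3.2) to (3.4) are not reproduced in this chapter, so the sketch assumes that the values written via the DRP hold the multiplier minus one and the divisor minus one; this assumption reproduces all three quoted frequencies exactly from the 256 MHz input. The function name is illustrative.

```python
from fractions import Fraction

def dcm_output_mhz(f_in_mhz: int, multiplier_value: int, divisor_value: int) -> Fraction:
    """Output frequency, ASSUMING the DRP values encode (M - 1) and (D - 1)."""
    return Fraction(f_in_mhz) * (multiplier_value + 1) / (divisor_value + 1)

# target clock (MHz): (multiplier value, divisor value) as quoted in the text
DUC_SETTINGS = {
    "245.76": (23, 24),
    "358.4": (13, 9),
    "240": (14, 15),
}

for target, (m_value, d_value) in DUC_SETTINGS.items():
    f_out = dcm_output_mhz(256, m_value, d_value)
    print(f"target {target} MHz -> computed {float(f_out)} MHz")
```

Exact rational arithmetic is used so that the comparison with the quoted frequencies is not affected by floating-point rounding.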

The clock divider RP involves four RMs (which divide by factors of 8, 16, 32 and 64) and is added to generate appropriate clock frequencies for reading output data from the BRAM. For example, the input sample rate of LTE with 10 MHz bandwidth is 15.36 Msps. Therefore, the factor 16 clock divider is applied to ensure that the data from the BRAM is fed into the DUC at the corresponding rate.
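The choice of divider RM follows directly from the DUC clock and the required input sample rate. A minimal sketch is given below; the helper name and the tolerance are illustrative assumptions, while the divider factors and the LTE (10 MHz) figures are taken from the text.

```python
# Divider factors offered by the four RMs of the clock divider RP
DIVIDER_FACTORS = (8, 16, 32, 64)

def pick_divider(duc_clock_mhz: float, sample_rate_msps: float, tol: float = 1e-9):
    """Return the divider factor whose output matches the sample rate, or None."""
    for factor in DIVIDER_FACTORS:
        if abs(duc_clock_mhz / factor - sample_rate_msps) < tol:
            return factor
    return None

# LTE (10 MHz): 245.76 MHz DUC clock, 15.36 Msps input sample rate
print(pick_divider(245.76, 15.36))  # 16
```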

The DUC RMs are based on the DUC designs in Chapters 4 and 5, i.e. LTE with 5 and 10 MHz bandwidths, WCDMA, IEEE 802.16e with 3.5, 5, 7 and 10 MHz bandwidths, IEEE 802.11n 20 MHz, white space LTE with 5 and 10 MHz bandwidths, white space IEEE 802.16e with 3.5, 5, 7 and 10 MHz bandwidths, and white space IEEE 802.11n with 5, 10 and 20 MHz bandwidths. The channel and interpolation filters are implemented using the Xilinx FIR Compiler to support I and Q channels with time division multiplexing. Finally, the DDS component is configured using the Xilinx DDS Compiler to generate the desired complex sinusoid so that the data can be modulated from baseband to  $f_{\rm IF}$ .

# 6.7 Results and Analysis

This section begins by introducing the floorplanning of the DRP-PR architecture to support multiple standards and modes. Three scenario implementations are considered to demonstrate the standard switching processes using DRP and PR technologies. The implementation results of conventional FPGA design methods, i.e. (i) fixed multiple standards design, (ii) programmable multiple standards design, (iii) multiple clock oscillators design and (iv) normalised clock oscillator design, discussed in Sections 6.1.2–6.1.3 and Sections 6.2.2–6.2.3 respectively, and (v) the proposed DRP-PR design are then listed and compared. The analysis is concluded by summarising all of the design methods discussed.

## 6.7.1 Floorplanning of the Proposed Architecture on Virtex-5 LX 110T Device

Defining the floorplanning of the RPs may be viewed as one of the most important aspects of PR design. Selecting a reasonable size and position for each RP permits performance optimisation in terms of dynamic power and partial bitstream size, thus avoiding unnecessary energy consumption and reconfiguration overhead.

Modern FPGAs are partitioned into clock regions to improve their clock distribution, and in the case of the Virtex-5 device considered here, each clock region is 20 CLBs high, contains 8 BRAMs and 8 DSP48Es, and spans half of the die [37]. As discussed in Chapter 3, frames are also organised by clock region, each being 20 CLBs high and 1 CLB wide. The partial bitstream only contains configuration information for whole frames, so the number of frames used in the design determines the size of the partial bitstream. Therefore, the choice of RP can greatly affect the reconfiguration overhead. The chosen floorplanning for the entire design on a Virtex-5 LX 110T device is illustrated in Figure 6.9, as defined using the Xilinx PlanAhead design tool.

The DSP48Es play an important role in DUC designs, based on the analysis of Chapters 4 and 5, and for this SDR system the maximum number of DSP48Es required is 33 in order to satisfy both the TV white space IEEE 802.16e (7 MHz) and white space LTE (10 MHz) designs. As a result, the DUC RP spans five clock regions and occupies the largest area in the design. With regard to the transform RP, the occupied DSP48Es and BRAMs are 9 and 10, respectively (to cater for the OFDM module) and thus it spans two clock regions. Since the mapper RP and clock divider RP use few slices and no DSP48Es or BRAMs, their areas are much smaller than those of the transform RP and DUC RP. The areas of all of the RPs (especially the DUC RP) are designed to include additional resource in order to achieve critical timing requirements, e.g. 358.4 MHz for IEEE 802.16e (5 MHz and 10 MHz), and TV white space variations on these. Another benefit of defining RP areas with extra resource is to allow easier integration of additional, potentially more complex RMs in the future, for example, to support future standards and modes within the SDR system.
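The number of clock regions spanned by each RP can be estimated from the per-region resource counts quoted above. The sketch below is a simplified lower bound only: it assumes that the RP draws on the 8 DSP48Es and 8 BRAMs quoted per clock region, and it ignores slice columns and the timing margin discussed in the text. The function name is illustrative, and the 9 BRAMs used for the DUC RP is the largest DUC figure reported in Table 6.3.

```python
import math

# Per-clock-region resources on the Virtex-5 device, as quoted in the text
DSP48E_PER_REGION = 8
BRAM_PER_REGION = 8

def clock_regions_needed(dsp48es: int, brams: int) -> int:
    """Lower bound on clock regions, ASSUMING 8 DSP48Es and 8 BRAMs usable per region."""
    by_dsp = math.ceil(dsp48es / DSP48E_PER_REGION)
    by_bram = math.ceil(brams / BRAM_PER_REGION)
    return max(1, by_dsp, by_bram)

duc_regions = clock_regions_needed(dsp48es=33, brams=9)
transform_regions = clock_regions_needed(dsp48es=9, brams=10)
print(duc_regions, transform_regions)  # 5 2
```

Under these assumptions the estimate agrees with the five and two clock regions stated for the DUC RP and transform RP respectively.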


Figure 6.9: Floorplanning of the SDR Architecture in PlanAhead.

## 6.7.2 Scenario Implementations

In this study all of the designs are implemented on the Virtex-5 LX110T device using the Xilinx ISE 12.4 software suite. The various RPs are represented in terms of area constraints on the device. In order to demonstrate the different implementation results obtained when PR and DRP technologies are used to switch standards and services, three scenarios are considered as examples. These scenarios are defined as follows:

• *Scenario 1*: Standard switching from LTE (10 MHz) with 64-QAM modulation to TV white space IEEE 802.16e (7 MHz) with 16-QAM modulation.

• *Scenario 2*: Standard switching from TV white space IEEE 802.16e (7 MHz) with 16-QAM modulation to WCDMA.

• *Scenario 3*: Standard switching from WCDMA to TV white space IEEE 802.11n (20 MHz) with 64-QAM modulation.

Design parameters have to be modified in the process of standards switching, and thus some reconfiguration actions have to be made to the reconfigurable DCMs and RPs, so as to change not only clock frequencies but also functionalities. The reconfiguration actions are summarised in Table 6.2.

| Scenario | Mapper DCM via DRP (MHz) | DUC DCM via DRP (MHz) | Mapper PR bitstream | Clock divider PR bitstream | Transform PR bitstream | DUC PR bitstream |
|----------|--------------------------|-----------------------|---------------------|----------------------------|------------------------|------------------|
| 1        | 600 to 400               | 245.76 to 256         | 16-QAM              | 1/32                       |                        | TVWS IEEE 802.16e (7 MHz) |
| 2        | 400 to 200               | 256 to 245.76         | QPSK                | 1/64                       | Spreader               | WCDMA            |
| 3        | 200 to 600               | 245.76 to 240         | 64-QAM              | 1/8                        | OFDM                   | TVWS IEEE 802.11n (20 MHz) |

Table 6.2: Scenario Implementations based on DRP-PR Architectures.

In the case of standard switching from LTE (10 MHz) with 64-QAM to TV white space IEEE 802.16e (7 MHz), the mapper DCM has to be modified from 600 MHz to 400 MHz so as to ensure the data output by the mapper RP can be fed to the transform RP operating at a clock frequency of 100 MHz. Also, the DUC DCM has to be changed to generate a clock frequency of 256 MHz (from 245.76 MHz) according to Table 6.2. In terms of PR implementation, the three corresponding partial bitstreams have to be downloaded according to the user's requirements. In this scenario, partial bitstreams of 16-QAM, the factor 32 clock divider, and the DUC for TV white space IEEE 802.16e (7 MHz) are required. Since both LTE and IEEE 802.16e employ an FFT core to generate OFDM symbols, the partial bitstream for OFDM does not need to be downloaded. However, it is necessary to change the parameters of the FFT core, e.g. the CP length. Only the mapper DCM results are shown in this chapter, as the results for the DUC DCM presented in Section 3.5.2 on page 45 have already demonstrated that it can be implemented successfully.
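The selection of partial bitstreams for a given switch can be viewed as a comparison of the RM loaded in each RP before and after the switch. The following is an illustrative sketch of that idea only; the `RadioConfig` class and `partial_bitstreams_needed` function are hypothetical names, not the control software running on the GPP, and the configurations are taken from the scenario definitions and Table 6.2.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class RadioConfig:
    """RM currently loaded in each RP (hypothetical representation)."""
    mapper: str
    clock_divider: int
    transform: str
    duc: str

def partial_bitstreams_needed(current: RadioConfig, target: RadioConfig) -> list:
    """Names of the RPs whose RM differs and therefore needs a partial bitstream."""
    return [f.name for f in fields(RadioConfig)
            if getattr(current, f.name) != getattr(target, f.name)]

LTE_10 = RadioConfig("64-QAM", 16, "OFDM", "LTE (10 MHz)")
TVWS_16E_7 = RadioConfig("16-QAM", 32, "OFDM", "TVWS IEEE 802.16e (7 MHz)")
WCDMA = RadioConfig("QPSK", 64, "Spreader", "WCDMA")

# Scenario 1: the transform RP keeps the OFDM RM, so only three bitstreams are needed
print(partial_bitstreams_needed(LTE_10, TVWS_16E_7))
# ['mapper', 'clock_divider', 'duc']
```

The clock changes made through the DRP and the FFT core parameter updates (e.g. the CP length) are separate actions and are not modelled here.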

The post place and route simulation results for the mapper DCM are illustrated in Figure 6.10. The top figure shows the changes of clock frequency over the entire process for the three scenarios. Figures 6.10 (a), (b), (c) and (d) show the clock operating at 600 MHz, 400 MHz, 200 MHz and 600 MHz respectively, and these views are obtained by enlarging the corresponding sections of the upper waveform. Comparing Figures 6.10 (a) and (b), which represent the clock frequency alteration required by Scenario 1, it is clear that the clock frequency has been changed from 600 MHz to 400 MHz successfully, with a new multiplier value of 7 and the same divisor value.



Figure 6.10: Mapper DCM Clock Frequency Results for the Proposed Three Scenarios.

The implementation results obtained from the PR process of swapping functionalities from LTE (10 MHz) to TV white space IEEE 802.16e (7 MHz) are depicted in Figure 6.11. It can be seen that the DUC modules have been swapped within the DUC RP, as the TV white space IEEE 802.16e (7 MHz) DUC occupies more hardware resource. Also, the mapper modules have been changed within the mapper RP. Since the clock divider modules use few slices, it is difficult to perceive the difference between the two standards from the floorplans in Figure 6.11. The OFDM module remains the same, as there is no partial bitstream to download for the transform RP. All of the static logic continues to operate during the PR process.

Figure 6.11: Implementation Results for Scenario 1 (panels: LTE (10 MHz) and TVWS IEEE 802.16e (7 MHz)).

In the case of Scenario 2, i.e. standard switching from TV white space IEEE 802.16e (7 MHz) to WCDMA, the mapper DCM has to be modified from 400 MHz to 200 MHz to cater for the mapper switching from 16-QAM to QPSK. Based on Figures 6.10 (b) and (c), a new multiplier value of 3 is required, and the divisor value remains the same. Also, a multiplier value of 23 and a divisor value of 24 are needed for the DUC DCM to generate 245.76 MHz from 256 MHz.

The implementation results after the PR process are illustrated in Figure 6.12. The OFDM module is replaced with the spreader module by downloading the spreader partial bitstream to cater for WCDMA. The spreader module consists of OVSF generators, and performs spreading of the input data via an Exclusive-OR operation with the generated OVSF codes. Therefore, it employs fewer slices than the OFDM module, as is evident from the floorplanning, and does not occupy any DSP48Es or BRAMs. Similarly, the WCDMA DUC occupies fewer slices and DSP48Es but more BRAMs compared to the DUC for TV white space IEEE 802.16e (7 MHz). Partial bitstreams of QPSK and a factor 64 clock divider are also required.

Figure 6.12: Implementation Results for Scenario 2 (panels: TVWS IEEE 802.16e (7 MHz) and WCDMA).

In the case of Scenario 3, i.e. standard switching from WCDMA to TV white space IEEE 802.11n (20 MHz), the mapper DCM has to be modified from 200 MHz to 600 MHz. As shown in Figures 6.10 (c) and (d), a value of 11 is sent to the multiplier, and the divisor remains the same; this performs the required clock frequency switching. Since the DUC DCM has to change frequency from 245.76 MHz to 240 MHz, a multiplier value of 14 and a divisor value of 15 are required.
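The mapper DCM settings used across the three scenarios can be tabulated in the same way. The sketch below assumes, as an inference rather than a statement from the text, that the values written via the DRP hold the multiplier minus one and the divisor minus one, and that the unchanged divisor value is 1 (i.e. divide by two); with the 100 MHz input these assumptions reproduce the 600, 400 and 200 MHz clocks. The starting value of 11 for the initial 600 MHz clock is likewise inferred from Scenario 3.

```python
MAPPER_INPUT_MHZ = 100
ASSUMED_DIVISOR_VALUE = 1  # ASSUMPTION: not stated in the text, chosen to match the quoted clocks

def mapper_clock_mhz(multiplier_value: int, divisor_value: int = ASSUMED_DIVISOR_VALUE) -> float:
    """Mapper clock, ASSUMING the DRP values encode (M - 1) and (D - 1)."""
    return MAPPER_INPUT_MHZ * (multiplier_value + 1) / (divisor_value + 1)

# (step, multiplier value written to the mapper DCM)
SEQUENCE = [
    ("Initial, 64-QAM", 11),
    ("Scenario 1, 16-QAM", 7),
    ("Scenario 2, QPSK", 3),
    ("Scenario 3, 64-QAM", 11),
]

for step, value in SEQUENCE:
    print(f"{step}: multiplier value {value} -> {mapper_clock_mhz(value)} MHz")
```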

In terms of PR, the spreader module is replaced with the OFDM module by downloading the OFDM partial bitstream, as shown in Figure 6.13. The implementation results of the OFDM module (i.e. placement, routing, and timing results) in TV white space IEEE 802.11n (20 MHz) are identical to those for LTE (10 MHz) and TV white space IEEE 802.16e (7 MHz). In fact, the implementation results of the OFDM module are identical for all of the DUC designs discussed in Chapters 4 and 5 except the WCDMA design, because the module can be preserved after the initial implementation, and subsequently imported as required into any DUC design based on an OFDM symbol structure.

Figure 6.13: Implementation Results for Scenario 3 (panel: TVWS IEEE 802.11n (20 MHz)).

## 6.7.3 Implementation Results

The implementation results for all DUC designs were provided individually in Chapters 4 and 5. Considering Table 4.9 on page 72, Table 4.19 on page 86, Table 4.26 on page 94, Table 5.6 on page 117, Table 5.9 on page 128, and Table 5.11 on page 134, together with the hardware utilisation of OFDM and spreader modules, overall hardware resource usage can be summarised as shown in Table 6.3.

Table 6.3 gives the hardware resources used by each of the modules without PR in terms of LUTs, FFs, DSP48Es, BRAMs and slices. The  $f_{\text{max}}$  column shows that all of the modules can meet the timing requirements according to the demands of the architecture. Since the mapper and clock divider modules occupy few slices without any DSP48Es and BRAMs, the number of resources they consume is insignificant compared to the DUC and transform RPs, and consequently can be omitted from further analysis. As a result, Table 6.3 only covers the hardware utilisation of the DUC and transform RPs.

| RP        | Standards (Modes)    | LUTs | FFs  | Slices | DSP48Es | BRAMs | f<sub>max</sub> (MHz) |
|-----------|----------------------|------|------|--------|---------|-------|-----------------------|
| DUC       | LTE (5 MHz)          | 1257 | 1783 | 666    | 9       | 9     | 430.8                 |
|           | LTE (10 MHz)         | 1164 | 1639 | 684    | 13      | 9     | 407.2                 |
|           | 802.16e (3.5 MHz)    | 1453 | 2066 | 846    | 8       | 9     | 406.2                 |
|           | 802.16e (5 MHz)      | 1407 | 2041 | 819    | 6       | 9     | 396.4                 |
|           | 802.16e (7 MHz)      | 1411 | 1926 | 804    | 12      | 9     | 433.3                 |
|           | 802.16e (10 MHz)     | 1357 | 1818 | 599    | 11      | 9     | 387.1                 |
|           | WCDMA                | 1229 | 1682 | 708    | 5       | 9     | 435.2                 |
|           | 802.11n (20 MHz)     | 991  | 1438 | 488    | 12      | 9     | 442.1                 |
|           | WS 802.11n (5 MHz)   | 1669 | 2528 | 860    | 24      | 1     | 407.3                 |
|           | WS 802.11n (10 MHz)  | 1658 | 2525 | 1031   | 29      | 1     | 338.9                 |
|           | WS 802.11n (20 MHz)  | 1608 | 2473 | 995    | 30      | 1     | 368.3                 |
|           | WS 802.16e (3.5 MHz) | 2043 | 3129 | 1114   | 28      | 1     | 376.2                 |
|           | WS 802.16e (5 MHz)   | 1982 | 3044 | 1142   | 27      | 1     | 408.2                 |
|           | WS 802.16e (7 MHz)   | 1953 | 3066 | 1173   | 33      | 1     | 411.9                 |
|           | WS 802.16e (10 MHz)  | 1848 | 2923 | 1184   | 30      | 1     | 387.9                 |
|           | WS LTE (5 MHz)       | 1749 | 2730 | 1170   | 27      | 1     | 300.5                 |
|           | WS LTE (10 MHz)      | 1712 | 2681 | 1026   | 33      | 1     | 310                   |
| Transform | OFDM                 | 1714 | 2183 | 797    | 9       | 10    | 297.3                 |
|           | Spreader             | 34   | 9    | 13     | 0       | 0     | 375.7                 |

Table 6.3: Hardware Resource Utilisation without PR.

Table 6.4 shows the practical hardware utilisation statistics of the RMs with the proposed architecture. Compared to Table 6.3, the number of LUTs increases for the same modules, and this is primarily due to insertion of the partition pins. Partition pins enable communication between reconfigurable logic and the surrounding static logic, and their implementation is based on LUTs. In other words, some LUTs must be added to every RM as partition pins in order for successful PR implementation. The numbers of FFs and Slices in Table 6.4 are slightly reduced compared to the corresponding modules as given in Table 6.3. This can be attributed to implementation compression, as the RMs have to be placed and routed within the RP regions, rather than the entire device. However, the spreader is an exception. In fact, the spreader module has to be artificially augmented with input and output ports, so that the module has the same interface as the OFDM RM (equivalent interfaces are a requirement for RMs associated with a particular RP). In this case, several partition pins are added, hence the number of slices for the spreader increases.

| RP        | Standards (Modes)    | LUTs | FFs  | Slices | DSP48Es | BRAMs | f<sub>max</sub> (MHz) |
|-----------|----------------------|------|------|--------|---------|-------|-----------------------|
| DUC       | LTE (5 MHz)          | 1288 | 1665 | 582    | 9       | 9     | 352.0                 |
|           | LTE (10 MHz)         | 1183 | 1519 | 516    | 13      | 9     | 297.5                 |
|           | 802.16e (3.5 MHz)    | 1494 | 1909 | 576    | 8       | 9     | 361.0                 |
|           | 802.16e (5 MHz)      | 1446 | 1888 | 671    | 6       | 9     | 395.3                 |
|           | 802.16e (7 MHz)      | 1439 | 1799 | 543    | 12      | 9     | 316.0                 |
|           | 802.16e (10 MHz)     | 1378 | 1684 | 583    | 11      | 9     | 367.9                 |
|           | WCDMA                | 1268 | 1551 | 569    | 5       | 9     | 345.7                 |
|           | 802.11n (20 MHz)     | 1000 | 1351 | 459    | 12      | 9     | 282.4                 |
|           | WS 802.11n (5 MHz)   | 1681 | 2375 | 779    | 24      | 1     | 280.7                 |
|           | WS 802.11n (10 MHz)  | 1661 | 2357 | 771    | 29      | 1     | 275.0                 |
|           | WS 802.11n (20 MHz)  | 1623 | 2282 | 724    | 30      | 1     | 334.6                 |
|           | WS 802.16e (3.5 MHz) | 2117 | 2953 | 861    | 28      | 1     | 328.3                 |
|           | WS 802.16e (5 MHz)   | 2056 | 2889 | 938    | 27      | 1     | 394.5                 |
|           | WS 802.16e (7 MHz)   | 2012 | 2927 | 912    | 33      | 1     | 284.0                 |
|           | WS 802.16e (10 MHz)  | 1908 | 2783 | 804    | 30      | 1     | 389.9                 |
|           | WS LTE (5 MHz)       | 1809 | 2599 | 793    | 27      | 1     | 375.8                 |
|           | WS LTE (10 MHz)      | 1758 | 2562 | 803    | 33      | 1     | 322.6                 |
| Transform | OFDM                 | 1724 | 2183 | 698    | 9       | 10    | 251.6                 |
|           | Spreader             | 45   | 9    | 17     | 0       | 0     | 196.7                 |

Table 6.4: Hardware Resource Utilisation with PR.

With respect to PR design rules, the size of the RP must be specified to accommodate the most complicated module. Therefore, the worst case scenarios in terms of slices, DSP48Es and BRAMs for all RMs associated with a particular RP are selected from Table 6.3 to determine the required size of that RP.

# 6.8 Implementation Comparisons

In this section, the implementation results of the proposed DRP-PR architecture are compared to the SDR systems based on conventional FPGA designs and PR-only designs as discussed at the start of this chapter. All of the result comparisons demonstrate that the proposed architecture is the best solution for the SDR system in terms of hardware resource utilisation, reconfiguration overhead, design flexibility, expansibility, power consumption and cost.

## 6.8.1 Comparison of Proposed DRP-PR Design with Fixed Multiple Standards Design

As discussed in Chapters 4 and 5, different modes within one standard may share filter designs, e.g. LTE (5 MHz and 10 MHz), IEEE 802.16e (3.5 MHz and 7 MHz, 5 MHz and 10 MHz), TV white space LTE (5 MHz and 10 MHz), TV white space IEEE 802.16e (3.5 MHz and 7 MHz, 5 MHz and 10 MHz), and TV white space IEEE 802.11n (5 MHz, 10 MHz and 20 MHz), as their normalised passband and stopband requirements are identical. LTE (5 MHz) and (10 MHz) are taken as an example, as shown in Figure 6.14. The DDS component can also be shared, because both the LTE (5 MHz) and (10 MHz) variants use the same clock frequency.



Figure 6.14: Filter Designs of LTE (10 MHz) and (5 MHz).

As a result, the hardware usage of the LTE (10 MHz) bandwidth DUC, plus the third HB filter of the LTE (5 MHz) DUC, could represent that of both the (5 MHz) and (10 MHz) bandwidth modes combined. Similarly, the hardware utilisation of the IEEE 802.16e (7 MHz) DUC, plus the fourth HB filter required for the (3.5 MHz) mode, and the IEEE 802.16e (10 MHz) DUC plus the fourth HB filter in the (5 MHz) DUC, could represent that of all the IEEE 802.16e modes combined. The same technique can be applied to the other standards in the study, e.g. TV white space IEEE 802.11n, IEEE 802.16e and LTE.

The hardware usage of the HB filters is given in Table 6.5. The 3rd HB filter in LTE (5 MHz), the 4th HB filter in IEEE 802.16e (3.5 MHz) and the 4th HB filter in IEEE 802.16e (5 MHz) each occupy 1 DSP48E due to the symmetry of the HB filter coefficients. Similarly, the 3rd HB filter in white space IEEE 802.11n (5 MHz) uses 3 DSP48Es, and the HB filters for all other white space DUC designs occupy 7 DSP48Es each.

Notably, the wordlength of the TV white space HB filters is increased from 18 to 22 bits compared to those in the conventional standards, resulting in more DSP48Es being required to perform the same calculation. The HB filters in the conventional standards run at a quarter of their respective system clock frequencies, and thus four operations can be performed for every sample. However, the HB filters in the TV white space designs run at half of their respective system clocks, i.e. their rates have doubled relative to the system clock, and only two operations are possible per sample. These are the two main reasons that the number of DSP48Es increases dramatically for the HB filters in the TV white space versions.

| Design                           | Taps | Slices | DSP48Es | BRAMs |
|----------------------------------|------|--------|---------|-------|
| 3rd HB LTE (5 MHz)               | 7    | 124    | 1       | 0     |
| 4th HB IEEE 802.16e (3.5 MHz)    | 7    | 138    | 1       | 0     |
| 4th HB IEEE 802.16e (5 MHz)      | 7    | 143    | 1       | 0     |
| 3rd HB WS IEEE 802.11n (5 MHz)   | 11   | 152    | 3       | 0     |
| 4th HB WS IEEE 802.11n (5 MHz)   | 11   | 259    | 7       | 0     |
| 4th HB WS LTE (5 MHz)            | 11   | 252    | 7       | 0     |
| 5th HB WS IEEE 802.16e (3.5 MHz) | 11   | 251    | 7       | 0     |
| 5th HB WS IEEE 802.16e (5 MHz)   | 11   | 269    | 7       | 0     |
| Total                            |      | 1588   | 34      | 0     |

Table 6.5: Hardware Usage for HB Filters.

Taking into account the identified commonality between modes of the same standard as discussed above, the DUC would need to implement only a subset of modes, with HB filters added to achieve the others. That is to say, if the DUC comprises implementations of the DUCs for LTE (10 MHz), WCDMA, IEEE 802.16e (7 MHz), IEEE 802.16e (10 MHz), IEEE 802.11n (20 MHz), white space LTE (10 MHz), white space IEEE 802.16e (7 MHz), white space IEEE 802.16e (10 MHz), white space IEEE 802.11n (20 MHz) and the additional HB filters listed in Table 6.5, then the full set of considered modes can be supported. The resulting architecture is as shown in Figure 6.15.

According to Table 6.3, the total hardware resource utilisation of the above set of DUCs is 7661 slices, 179 DSP48Es and 49 BRAMs. Similarly, the total hardware usage of the OFDM and spreader modules is 810 slices, 9 DSP48Es and 10 BRAMs.



Figure 6.15: Fixed Multiple Standards Design Architecture.

In addition, dual port BRAMs are required to bridge the clock domain boundary between the IFFT component and the various DUC components, and thus two BRAMs are required for one standard: one for I channel data and the other for Q channel data. As a result, 18 BRAMs are required in total to precede the DUC designs.

Taking into account all of these factors, the hardware resource utilisation of the fixed multiple standards design is the sum of the implementation results discussed earlier, the 18 bridging BRAMs, and the additional HB filters from Table 6.5, i.e. 10059 slices, 222 DSP48Es and 77 BRAMs in total. Therefore, a larger FPGA device would be needed (there are only 64 DSP48Es in the Virtex-5 LX110T device), resulting in a larger terminal, higher power consumption and higher device cost.
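The totals quoted for the fixed multiple standards design can be reproduced by summing the relevant entries. The sketch below is a bookkeeping check only; the variable and function names are illustrative, the per-module figures are copied from Table 6.3, and the HB filter totals are copied from Table 6.5.

```python
# (slices, DSP48Es, BRAMs) for the nine DUCs retained in the fixed design, from Table 6.3
KEPT_DUCS = {
    "LTE (10 MHz)": (684, 13, 9),
    "WCDMA": (708, 5, 9),
    "802.16e (7 MHz)": (804, 12, 9),
    "802.16e (10 MHz)": (599, 11, 9),
    "802.11n (20 MHz)": (488, 12, 9),
    "WS LTE (10 MHz)": (1026, 33, 1),
    "WS 802.16e (7 MHz)": (1173, 33, 1),
    "WS 802.16e (10 MHz)": (1184, 30, 1),
    "WS 802.11n (20 MHz)": (995, 30, 1),
}
TRANSFORM_MODULES = {"OFDM": (797, 9, 10), "Spreader": (13, 0, 0)}
EXTRA_HB_FILTERS = (1588, 34, 0)   # totals row of Table 6.5
BRIDGE_BRAMS_PER_DUC = 2           # one BRAM each for the I and Q channels

def add_rows(*rows):
    """Column-wise sum of (slices, DSP48Es, BRAMs) tuples."""
    return tuple(sum(column) for column in zip(*rows))

duc_total = add_rows(*KEPT_DUCS.values())
transform_total = add_rows(*TRANSFORM_MODULES.values())
bridge_total = (0, 0, BRIDGE_BRAMS_PER_DUC * len(KEPT_DUCS))
fixed_total = add_rows(duc_total, transform_total, EXTRA_HB_FILTERS, bridge_total)

print(duc_total, transform_total, bridge_total, fixed_total)
# (7661, 179, 49) (810, 9, 10) (0, 0, 18) (10059, 222, 77)
```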

By contrast, the new DRP-PR architecture allows multiple RMs to time-share the resources, provided that each RP has been adequately defined. This architecture is illustrated in Figure 6.16. The transform RP accommodates the IFFT and spreader RMs, and the DUC RP is associated with the RMs of 4 standards and 17 modes. As a result, the hardware usage is the sum of the largest DUC and transform RMs from Table 6.4, equivalent to 1636 slices, 42 DSP48Es and 21 BRAMs, as shown in (6.1).

$$\mathrm{Resource} = \max(\mathrm{Transform}(RP)) + \max(\mathrm{DUC}(RP)) \tag{6.1}$$

As shown in Figure 6.17, the proposed architecture can achieve reductions of 83.7%, 81.1% and 72.7% in terms of the number of slices, DSP48Es and BRAMs required, respectively. Therefore, the DRP-PR design method is seen to reduce FPGA resource utilisation significantly, compared to the fixed multiple standards design. Moreover, only two clock oscillators are employed (100 MHz and 256 MHz), while clock oscillators for 245.76 MHz, 358.4 MHz and 240 MHz are not needed, resulting in a further reduction in overall system cost, and a simpler architecture.
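The DRP-PR resource figures and the quoted reductions can be reproduced from Table 6.4 by applying (6.1) column by column. This is a sketch under one stated assumption: the table alone gives 19 BRAMs for the two largest RMs, and the quoted total of 21 is matched here by adding the two dual port BRAMs that bridge the clock domains in Section 6.6, which is an inference rather than something stated explicitly in the text. All names are illustrative.

```python
# (slices, DSP48Es, BRAMs) per RM, copied from Table 6.4
DUC_RMS = {
    "LTE (5 MHz)": (582, 9, 9), "LTE (10 MHz)": (516, 13, 9),
    "802.16e (3.5 MHz)": (576, 8, 9), "802.16e (5 MHz)": (671, 6, 9),
    "802.16e (7 MHz)": (543, 12, 9), "802.16e (10 MHz)": (583, 11, 9),
    "WCDMA": (569, 5, 9), "802.11n (20 MHz)": (459, 12, 9),
    "WS 802.11n (5 MHz)": (779, 24, 1), "WS 802.11n (10 MHz)": (771, 29, 1),
    "WS 802.11n (20 MHz)": (724, 30, 1), "WS 802.16e (3.5 MHz)": (861, 28, 1),
    "WS 802.16e (5 MHz)": (938, 27, 1), "WS 802.16e (7 MHz)": (912, 33, 1),
    "WS 802.16e (10 MHz)": (804, 30, 1), "WS LTE (5 MHz)": (793, 27, 1),
    "WS LTE (10 MHz)": (803, 33, 1),
}
TRANSFORM_RMS = {"OFDM": (698, 9, 10), "Spreader": (17, 0, 0)}
BRIDGE_BRAMS = 2                   # ASSUMPTION: bridging BRAMs included in the quoted total
FIXED_DESIGN = (10059, 222, 77)    # fixed multiple standards design totals

def worst_case(rms):
    """Largest requirement per resource type over all RMs of one RP."""
    return tuple(max(column) for column in zip(*rms.values()))

def drp_pr_resources():
    """Apply (6.1) per resource type, then add the assumed bridging BRAMs."""
    slices, dsps, brams = (d + t for d, t in zip(worst_case(DUC_RMS), worst_case(TRANSFORM_RMS)))
    return slices, dsps, brams + BRIDGE_BRAMS

def reductions_percent(fixed, proposed):
    return tuple(round(100 * (1 - p / f), 1) for f, p in zip(fixed, proposed))

proposed = drp_pr_resources()
print(proposed, reductions_percent(FIXED_DESIGN, proposed))
# (1636, 42, 21) (83.7, 81.1, 72.7)
```

Note that the worst case is taken independently for each resource type, so the slices maximum and the DSP48E maximum come from different DUC RMs.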



Figure 6.16: DRP-PR Architecture to Support Multiple Standards and Modes.



Figure 6.17: Hardware Utilisation Comparison of Two Methods (legend: Fixed Multiple Standards Design; DRP-PR Architecture).

## 6.8.2 Comparison of Proposed DRP-PR Design with Programmable Multiple Standards Design

Standard or mode switching can also be implemented using the programmable multiple standards design method described in Section 6.1.3. In this case, configuration files have to be downloaded which reconfigure the entire device for the desired standard or mode. This method may also require only two clock oscillators: one for the mapper modules, and the other for the DUC design. The DCM for the DUC can be set to a different value to cater for each of the supported DUC designs. The bitstream size is linked to the type and size of the FPGA device. In the case of the Virtex-5 LX 110T device, the full configuration file has a size of 3799 KB. The bitstream size of the programmable multiple standards design is therefore 3799 KB, and this can be compared against the PR bitstream sizes of the proposed DRP-PR architecture, as summarised in Table 6.6.

The size of the RP area is based on the frame in the floorplanning process. As discussed in Chapter 3, the configuration memory is grouped by columns and the column can be further divided into frames. The frame is the smallest unit of configuration memory and hence all operations must be based on whole frames. Since each RM within one RP shares the same hardware resource, they have the same bitstream size. The size of partial bitstream defines the reconfiguration overhead, assuming that the download speed is fixed.

In total, four partial bitstreams must be downloaded for the corresponding RPs to swap standards in the worst case, e.g. standard switching from WCDMA to TV white space IEEE 802.11n (20 MHz) with 64-QAM modulation. As a result, the total size of reconfiguration bitstream required is the sum of the four partial bitstreams, i.e. 411 + 343 + 30 + 19 = 803 KB, which represents a reduction of 78.9% compared to the 3799 KB of the programmable multiple standards design method.
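The bitstream comparison can be expressed as a short calculation using the sizes listed in Table 6.6. The function names are illustrative, and the Scenario 1 case is included as a further example using the three RPs that change in that scenario.

```python
FULL_BITSTREAM_KB = 3799  # full configuration file for the Virtex-5 LX 110T device

# Partial bitstream size per RP in KB, from Table 6.6
PARTIAL_KB = {"DUC": 411, "Transform": 343, "Mapper": 30, "Clock divider": 19}

def reconfiguration_size_kb(rps):
    """Total partial bitstream size for the RPs that must be reconfigured."""
    return sum(PARTIAL_KB[rp] for rp in rps)

def reduction_percent(partial_kb):
    """Saving relative to downloading the full bitstream."""
    return round(100 * (1 - partial_kb / FULL_BITSTREAM_KB), 1)

worst_case_kb = reconfiguration_size_kb(PARTIAL_KB)          # all four RPs
scenario_1_kb = reconfiguration_size_kb(["DUC", "Mapper", "Clock divider"])

print(worst_case_kb, reduction_percent(worst_case_kb))  # 803 78.9
print(scenario_1_kb)                                    # 460
```

When the transform RP does not change, as in Scenario 1, the download is smaller still, since the 343 KB transform bitstream is not required.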

An additional consideration is that the device can operate immediately after downloading the partial bitstreams, whereas the device has to perform a number of startup sequences and thus suffers from a long reconfiguration overhead when the full bitstream is downloaded to reconfigure the entire device. Furthermore, the programmable multiple standards design method involves downloading the different configuration files even to make a small modification to the design, e.g. mapper switching, resulting in an inflexible implementation.

| RP                      | RMs | Number of<br>Frames | Bitstream Size<br>(KB) |
|-------------------------|-----|---------------------|------------------------|
| DUC                     | LTE (5 MHz)<br>LTE (10 MHz)<br>802.16e (3.5 MHz)<br>802.16e (5 MHz)<br>802.16e (7 MHz)<br>802.16e (10 MHz)<br>WCDMA<br>802.11n (20 MHz)<br>WS 802.11n (5 MHz)<br>WS 802.11n (10 MHz)<br>WS 802.11n (20 MHz)<br>WS 802.16e (3.5 MHz)<br>WS 802.16e (5 MHz)<br>WS 802.16e (7 MHz)<br>WS 802.16e (10 MHz)<br>WS LTE (5 MHz)<br>WS LTE (10 MHz) | 60 | 411 |
| Transform               | IFFT<br>Spreader | 50 | 343 |
| Mapper                  | QPSK<br>16-QAM<br>64-QAM | 5 | 30 |
| CLK Divider             | 1/8<br>1/16<br>1/32<br>1/64 | 3 | 19 |
| Full Configuration File |     | Whole FPGA | 3799 |

Table 6.6: Partial Bitstream Size on Virtex-5 LX 110T Device.

Taking all of these factors into account, it may therefore be seen from Table 6.6 that switching functions can be achieved in significantly less time using PR, as compared to the programmable multiple standards design method. The proposed DRP-PR architecture is capable of addressing the problems of reconfiguration latency caused by conventional FPGA reconfiguration in the SDR system.

# 6.8.3 Comparison of the Proposed DRP-PR Design with an Architecture based on PR only

The switching of standards or modes can also be implemented using the multiple clock oscillators design method described in Section 6.2.2. However, five oscillators are required in total: 240 MHz, 245.76 MHz, 256 MHz and 358.4 MHz for the DUC RP, and 100 MHz for the mapper RP.

Clock Enable (CE) circuitry is used to generate the various clock frequencies required for the mapper RP. In order that the data output by the mapper RP can then be fed to the transform RP operating at a clock frequency of 100 MHz, three clock frequencies (200, 400 and 600 MHz) are required to cater for the QPSK, 16-QAM and 64-QAM modules respectively. A DCM is employed to generate a clock frequency of 600 MHz from the 100 MHz oscillator. The frequency of 400 MHz can be generated from the 600 MHz clock using a CE that is enabled on two clock cycles out of every three. Similarly, a frequency of 200 MHz can be generated by enabling CE on one in every three clock cycles.
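The CE scheme can be illustrated with a minimal sketch that models the enable pattern over one three-cycle period and computes the resulting effective rate. Which cycles within the period carry the enable is an assumption made here for illustration; only the ratios are taken from the text.

```python
from fractions import Fraction

BASE_CLOCK_MHZ = 600  # DCM output derived from the 100 MHz oscillator


def ce_pattern(enabled_cycles, period):
    """CE pattern over one period: high on the first `enabled_cycles` cycles (assumed placement)."""
    return [1 if i < enabled_cycles else 0 for i in range(period)]


def effective_rate_mhz(pattern, base_mhz=BASE_CLOCK_MHZ):
    """Effective clock rate of logic gated by the given CE pattern."""
    return base_mhz * Fraction(sum(pattern), len(pattern))


# Rates required by the three mapper modules, as stated in the text.
rates = {
    "QPSK": effective_rate_mhz(ce_pattern(1, 3)),    # one cycle in three
    "16-QAM": effective_rate_mhz(ce_pattern(2, 3)),  # two cycles in three
    "64-QAM": effective_rate_mhz(ce_pattern(3, 3)),  # CE always enabled
}
for name, rate in rates.items():
    print(f"{name}: {rate} MHz")
```

Note that the 600 MHz base clock keeps toggling in all three cases; only the enabled fraction changes.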

However, generating the various clock frequencies via a CE comes at the cost of a high operating clock frequency. In other words, the DCM has to generate the 600 MHz clock continuously, from which the 400 MHz rate for 16-QAM and the 200 MHz rate for QPSK are derived, resulting in high power consumption within the device, as will be discussed in Chapter 7. Similarly, the remaining clock oscillators also have to operate continuously to serve the various standards or modes, which further increases the power consumption in the system.

Compared to the multiple oscillators design method, the proposed DRP-PR architecture provides a significant reduction in terms of power consumption and device cost, as three clock oscillators are saved. The modulation and DUC modules are implemented by PR and thus the hardware usage in terms of slices, DSP48Es and BRAMs is approximately equal for the proposed DRP-PR architecture and the multiple clock oscillators design.

With regard to the normalised clock oscillator design, it is difficult to find a suitable normalised clock frequency to support the 240 MHz, 245.76 MHz, 256 MHz and 358.4 MHz operating frequencies on the FPGA. Consequently, this method is not appropriate for supporting the 4 standards and 17 modes considered in this study.

## 6.9 Summary

Taking into account all aspects of implementation, the block diagram for the DRP-PR architecture for the SDR transmitter is represented in Figure 6.18. The SDR control unit is mainly used to store the partial bitstream files, and to execute download of the corresponding partial bitstreams to the FPGA device, according to the requirements of the SDR library. In this study, a conventional PC is used to store the bitstreams and control the download of bitstreams to the FPGA through a JTAG port. Additionally, parameters of the mapper DCM, DUC DCM and FFT core can be reconfigured directly while the FPGA is operating. Various values of multiplier and divisor supplied to the mapper and DUC DCMs determine the selected mapper and DUC schemes respectively, and similarly, the CP length and FFT size can also be programmed to cater for the standards based on the OFDM structure.

Figure 6.18: Block Scheme for the Transmitter based on Proposed DRP-PR Architecture.

Considering all of the FPGA design methods discussed in this chapter, the properties of each design method are summarised in Table 6.7. Based on the analysis undertaken, it is clear that the proposed DRP-PR architecture is capable of combining the benefits of the other design methods, while overcoming their drawbacks. It is able to support the set of 4 standards and 17 modes discussed in Chapters 4 and 5 with the least hardware resource utilisation. Function module switching can be implemented by downloading the corresponding partial bitstream files, resulting in the lowest reconfiguration overhead and meeting the requirements of a real-time SDR system. An additional advantage is that only two clock oscillators are required, which further reduces the SDR equipment cost and size. Further, the proposed architecture has great expansibility and potential for updates, because it is capable of reconfiguring not only the functionalities but also the clock frequencies synthesised from a fixed clock oscillator, which increases design and update flexibility dramatically. Therefore, it can increase the product life cycle of the SDR system and reduce the cost of ongoing maintenance, thus meeting the technical and commercial requirements of an SDR system.

| Architecture | Strengths | Weaknesses |
|--------------|-----------|------------|
| Fixed Multiple Standards<br>Design | <ul><li>Supports 4 standards and 17 modes</li><li>Switches functionalities with ease</li></ul> | <ul><li>Highest hardware usage and device cost</li><li>Highest power consumption</li><li>Low expansibility and update-ability</li></ul> |
| Programmable Multiple<br>Standards Design | <ul><li>Supports 4 standards and 17 modes</li><li>Reduces hardware usage</li><li>Only two clock oscillators required</li><li>Higher expansibility and update-ability</li></ul> | <ul><li>Large reconfiguration overhead</li><li>Inflexibility in functionalities switching with small modifications</li></ul> |
| Multiple Clock<br>Oscillators Design | <ul><li>Supports 4 standards and 17 modes</li><li>Occupies almost the least FPGA hardware resource</li><li>Switches functionalities with ease</li></ul> | <ul><li>Five clock oscillators are required</li><li>High power consumption</li><li>Low expansibility and update-ability</li></ul> |
| Normalised Clock<br>Oscillator Design | <ul><li>Two clock oscillators are required in theory</li></ul> | <ul><li>Cannot support the 4 standards and 17 modes</li></ul> |
| Proposed DRP-PR<br>Architecture | <ul><li>Supports 4 standards and 17 modes</li><li>Occupies the least FPGA hardware resource</li><li>Switches functionalities with ease</li><li>Only two clock oscillators required</li><li>Higher expansibility and update-ability</li></ul> | <ul><li>Need to be familiar with the target device</li><li>More complex design architecture</li></ul> |

Table 6.7: Comparison of Reconfiguration Design Methods based on FPGA.

# 6.10 Concluding Remarks

In this chapter, a novel physical layer architecture for an SDR has been proposed, using PR and DRP reconfiguration technologies based on a single FPGA device. An example architecture has been developed to support LTE, WCDMA, IEEE 802.16e and IEEE 802.11n standards and a subset of their various modes, together with variations to support TV white space applications, in the transmitter chain.

This work represents an extension compared to many other SDR architectures, as it involves not only baseband but also IF processing. It is demonstrated that, for the target device considered, the proposed architecture could achieve reductions of 83.7%, 81.1% and 72.7% in respect of slices, DSP48Es and BRAMs resources respectively, while three fewer clock oscillator inputs are required compared to the fixed multiple standards design. Moreover, this architecture provides the potential to readily integrate new standards or modes into the design. Therefore, the proposed method could reduce the SDR device size, power consumption and cost significantly, while maintaining a high degree of design and function switching flexibility.

Power consumption is another important aspect which should be considered in the design of an SDR system, as it affects the performance of the entire SDR system in particular with respect to operating lifetime and energy costs. Therefore, an SDR system should have both good performance and low power consumption. In this chapter, a variety of modules were implemented using PR in the proposed DRP-PR architecture which demonstrated that it is capable of meeting the operational requirements of SDR. The use of PR technology is also capable of reducing the power consumption compared to conventional FPGA designs, and this will be further analysed for the proposed SDR architecture in Chapter 7.

# Chapter 7

# **Power Consumption**

In Chapter 6, a novel architecture based on DRP and PR technologies on a single FPGA device to support multiple communication standards and modes was proposed, and a number of modules with PR implementation were presented. The DRP-PR architecture is capable of switching underlying hardware functions and decreasing reconfiguration overhead significantly, resulting in reduced complexity and increased design flexibility in SDR systems. Another attractive feature of PR is that it enables power savings to be made, particularly as a smaller device may be utilised as a result of the implied time-multiplexing.

In this chapter, PR is analysed in terms of power consumption based on a number of modules in the proposed DRP-PR architecture. First, an overview of power consumption in FPGA design, including the concepts of static power and dynamic power, is introduced. Then, the conventional FPGA power analysis methodology is presented, and a modified power analysis methodology is proposed for PR implementations. The size of the RP and the type of clock routing are two factors affecting the dynamic power consumption in the process of PR. In order to investigate the influence of these two factors on power consumption, a number of modules are implemented, and the results obtained demonstrate firstly that only global clock routing resources are suited to PR design on the Virtex-5 device, and secondly that the size of the RP can contribute to the reduction of power consumption for modules in the SDR system. The rules obtained for minimising dynamic power consumption can be applied to the proposed DRP-PR architecture discussed in Chapter 6 so as to optimise the original architecture floorplanning, and to achieve a further reduction in terms of power consumption and partial bitstream size. Thus, the proposed DRP-PR architecture is shown to have an additional advantage in terms of power consumption.

## 7.1 Overview of FPGA Power Consumption

Power consumption is an important factor in the process of FPGA designs because it can affect the system cost, performance, and functional lifetime significantly, in particular for mobile SDR platforms. Low power design can lead to a number of benefits. For example, a less expensive power supply is required, and thus the overall system cost and ongoing cost of operation can be reduced. Since power consumption is related to heat dissipation, low power design can also reduce the system temperatures and thus improve reliability [184]. Therefore, achieving a low power design is an important target when developing an SDR system.

#### 7.1.1 Static Power Consumption

There are two main sources of power consumption in an FPGA design: *static power* and *dynamic power*. Static power is the power consumed by transistors due to leakage when the FPGA device is powered up. It is composed of two forms of transistor leakage: source-to-drain (also referred to as sub-threshold) leakage and gate leakage, as illustrated in Figure 7.1. The gate leakage ($I_{GATE}$) is the current that flows from gate to substrate. The source-to-drain leakage ($I_{S-D}$) is the current that flows in the channel from the source to drain. The FPGA core quiescent current, $I_{CCINTQ}$, is the sum of the gate leakage and source-to-drain leakage [183].



 $I_{CCINTQ} = I_{S-D} + I_{GATE}$ 

Figure 7.1: Transistor Leakage Current [184].

#### 7.1.2 Dynamic Power Consumption

The dynamic power is the power consumed by switching events when the FPGA device is powered up and processing data. As shown in (7.1), the dynamic power can be expressed as a function of voltage, switching frequency and capacitance [185][186].

$$P_{Dynamic} = \sum_{i=0}^{N-1} \left( C_i \times F_i \times V^2 \right) \tag{7.1}$$

where $C_i$ and $F_i$ are the capacitance and toggle rate (i.e. switching activity) of the *i*th net, $V$ is the internal voltage and $N$ is the total number of nets.

It is obvious, based on (7.1), that a reduction in dynamic power consumption can be achieved by reducing the capacitance, toggle rate or internal voltage for any given net. Reducing the voltage or toggle rate can negatively affect the overall system timing performance with the implication that timing requirements may not be met [183]. The analysis in this thesis therefore focuses on the reduction of signal net capacitance with default voltage settings, which can be achieved through the introduction of area constraints. This approach does not imply any modification to the originally designed functionality or timing requirements.
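As a minimal numerical sketch of (7.1), the function below sums the per-net contributions and shows the effect of lowering net capacitance while leaving voltage and toggle rates unchanged. The net values are illustrative placeholders, not measurements from this work.

```python
import math


def dynamic_power(nets, voltage):
    """Evaluate (7.1): sum over all nets of C_i * F_i * V^2.

    nets: iterable of (capacitance in farads, toggle rate in Hz) pairs.
    voltage: internal voltage in volts.
    Returns the dynamic power in watts.
    """
    return sum(c * f * voltage ** 2 for c, f in nets)


# Illustrative placeholder nets: (capacitance, toggle rate).
nets = [(2e-12, 100e6), (5e-12, 50e6), (1e-12, 200e6)]

baseline = dynamic_power(nets, voltage=1.0)
# Halve every net capacitance; voltage and toggle rates are untouched.
reduced = dynamic_power([(c / 2, f) for c, f in nets], voltage=1.0)

print(f"baseline: {baseline * 1e3:.3f} mW, reduced capacitance: {reduced * 1e3:.3f} mW")
```

Because each term is linear in $C_i$, scaling every net capacitance by a factor scales the total by the same factor, with no change to functionality or timing constraints.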

Related work on power consumption was introduced in Section 1.2.5 on page 4. This chapter focuses on a study of dynamic power consumption based on implementation with the latest PR design flow. An important and necessary aspect of PR design is choosing an appropriate size of RP, noting that there are disadvantages to choosing a region which is too large or too small. Distribution of the clock signal is critically important in FPGA designs, and PR provides two options for passing the clock signal into the RP: (i) via the dedicated clock networks, e.g., global or regional clock resources, or (ii) by passing the clock signal through a partition pin to utilise generic interconnects, resulting in local clock routing. Therefore, the dynamic power consumption of PR is analysed in terms of two factors: RP size and the type of clock routing adopted. This analysis is undertaken using the proposed DRP-PR architecture with a large number of functional modules.

# 7.2 Power Analysis Approach

#### 7.2.1 Tools for Analysing Power Consumption

Xilinx provides two convenient tools for power consumption estimation and analysis: the Xilinx Power Estimator (XPE), and the Xilinx Power Analyzer (XPA). The XPE tool is spreadsheet-based, and users can obtain a rough power estimate by setting various parameters, e.g. junction temperature, clock frequency, and hardware resource utilisation in terms of DSP48Es, slices and BRAMs. This method is not based on the implementation of the FPGA design and thus can be used for power estimation at any time in the design process [188] [190].

The benefit of XPA is that it can provide a more accurate power analysis than the XPE tool. It supports specific analysis of the implemented design and calculates the power by summing the power consumed by all nodes in the design, according to the design information generated during implementation. The switching activities of nodes are normally calculated using the XPA default settings, because the implemented design cannot provide switching activity information. However, the XPA tool is also capable of extracting actual switching activity from the Value Change Dump (VCD) files generated during behavioural or post-route simulations. Thus loading both the VCD file and the implemented design permits more accurate power analysis to be undertaken than if based only on the implemented design [189]. Taking these considerations into account, the power consumption analysis presented in this thesis is based on results generated using the XPA tool and using the VCD file.

#### 7.2.2 XPA Power Analysis Methodology

The XPA-based power consumption analysis approach for conventional FPGA design is illustrated in Figure 7.2. First, Verilog or VHDL code is synthesised and then constraints, such as timing constraints, are integrated with the synthesised files (i.e. netlists) prior to implementation. The implementation process consists of three steps: translation, mapping and place & route. The generated netlists and all of the constraint files are merged in the process of translation. Then, all of the merged files are mapped into FPGA elements according to available resources. The final step is to place and route the design to meet the requirements of constraints on the target devices. The output of the implementation process is referred to as the Native Circuit Description (NCD) file, which includes all of the information about the implemented design on the target device. The VCD file can be obtained through post-route simulation. The XPA tool is able to provide an accurate power consumption analysis for the designs by loading the VCD and the generated NCD files.

A modified methodology for PR implementation is proposed in Figure 7.3, based on the power consumption analysis approach for conventional FPGA designs. The key difference is that the size and position of the RPs must be defined as an input to the PR implementation phase, and this is achieved by setting area constraints. An iterative approach can be taken whereby multiple implementations of the RM are undertaken using different area constraints, in order to find the best solution.
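The iterative approach can be sketched as a simple search loop. This is only an outline under stated assumptions: `implement` is a hypothetical callback standing in for one pass of the PR implementation and XPA analysis flow, "best" is assumed here to mean the lowest dynamic power among the implementations that succeed, and the numbers in the stub are placeholders rather than results.

```python
def pick_area_constraint(candidates, implement):
    """Try each candidate area constraint and return (name, power) of the best one.

    implement(name) is a hypothetical callback: it returns the dynamic power
    in watts for that constraint, or None if implementation fails.
    """
    results = {}
    for name in candidates:
        power = implement(name)
        if power is not None:  # skip constraints that fail to implement
            results[name] = power
    if not results:
        raise RuntimeError("no candidate area constraint implemented successfully")
    best = min(results, key=results.get)  # assumed criterion: lowest dynamic power
    return best, results[best]


# Placeholder values for illustration only; None marks a failed implementation.
stub_results = {"AC1": None, "AC2": 0.120, "AC3": 0.115, "AC4": 0.118}
best_name, best_power = pick_area_constraint(stub_results, stub_results.get)
print(best_name, best_power)
```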



Figure 7.2: Power Analysis with XPower Analyzer for Conventional FPGA Design.



Figure 7.3: Power Analysis with XPower Analyzer for PR Designs.

#### 7.2.3 Elements of Dynamic Power Consumption

XPA analyses dynamic power by summing the consumed power of all nodes in terms of clock power, signal power, logic power, I/O power and the power associated with dedicated resources, such as BRAMs and DSP48Es, according to the implemented design, as shown in Figure 7.4. Consequently, in this study the different elements contributing to the total dynamic power consumption can be analysed individually. The TV white space version of IEEE 802.11n (20 MHz) is taken as an example, as described by Figure 7.5. The distribution of dynamic power consumption across these elements depends to a great extent on the design itself, and thus can vary significantly even on the same device. Details of each component contributing to the overall dynamic power consumption are as follows:



Figure 7.4: XPower Environment.

• *Clock Power*: Clock power is the most important factor in the dynamic power consumption, and it can account for 50%–70% of total dynamic power [186] [187]. FPGAs have dedicated clock networks to distribute clock signals to logic and embedded resources with minimum clock skew and latency. A reduction in clock power can be obtained by reducing clock routing capacitance or clock frequency.

• *Logic Power*: Logic power is dissipated by logic resources, e.g. slices, LUTs and FFs in the design. Reducing the number of logic resources results in lower logic power consumption.




Figure 7.5: Dynamic Power Consumption of TV White Space IEEE 802.11n (20 MHz).

• *Signal Power*: Signal power is that consumed by signal nets within the design. Like clock power, signal power can be reduced by reducing routing capacitance.

• *I/O Power*: The I/O blocks can be considered as the interface between the pins of the FPGA package and internal logic. The choice of I/O standards is significant in terms of dynamic power, because the various I/O standards supported by FPGAs require different supply voltages; a low voltage standard can be selected where appropriate to reduce the I/O dynamic power contribution.

• *BRAMs Power*: This is the power dissipated by Block RAMs on the device, and it scales with the number of Block RAMs in use.

• *DSP48Es Power*: Similarly, DSP48Es power is dissipated by the DSP48Es required in the implementation, and scales with the number of DSP48Es in use.

### 7.3 Key Factors in the Process of PR

#### 7.3.1 Area Constraints

Defining area constraints is a necessary step for PR implementation, as the device must be allocated into reconfigurable and static regions. In order to investigate the influence of area constraints on power consumption, a set of area constraint rules are defined:

• *Area Constraint 1*: The allocated RP provides approximately 10% fewer resources than are required by each RM individually.

• *Area Constraint 2*: The allocated RP provides approximately equal resources to those required by each RM individually.

• *Area Constraint 3*: The allocated RP provides approximately 10% more resources than are required by each RM individually.

• *Area Constraint 4*: The allocated RP provides approximately 20% more resources than are required by each RM individually.

• *Area Constraint 5*: The allocated RP provides approximately 50% more resources than are required by each RM individually.

• *Area Constraint 6*: The allocated RP provides approximately double the resources required by each RM individually.
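The six rules can be expressed as scaling factors applied to the resources an RM requires. The sketch below is illustrative: the rounding policy (round up to a whole slice) and the example slice count are assumptions, since the rules are stated only as approximate percentages.

```python
# Allocated RP resources as a percentage of the RM requirement, per Area Constraints 1-6.
AREA_CONSTRAINT_PERCENT = {1: 90, 2: 100, 3: 110, 4: 120, 5: 150, 6: 200}


def rp_slice_budget(required_slices, constraint):
    """Slices to allocate to the RP for a given Area Constraint (rounded up, an assumed policy)."""
    percent = AREA_CONSTRAINT_PERCENT[constraint]
    # Integer ceiling division avoids floating-point rounding surprises.
    return (required_slices * percent + 99) // 100


required = 1000  # hypothetical RM requirement in slices
for constraint in sorted(AREA_CONSTRAINT_PERCENT):
    print(f"Area Constraint {constraint}: {rp_slice_budget(required, constraint)} slices")
```

In practice the achievable RP size is further quantised by the frame and clock-region boundaries of the device, so these budgets are targets rather than exact allocations.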

Specifying the sizes of RPs in a design plays an important role in the PR design process. Choosing a larger size implies that the allocated resources are not utilised efficiently, and that more frames are occupied, resulting in larger partial bitstream files. A smaller RP requires fewer frames, resulting in a smaller partial bitstream size, and hardware resource utilisation efficiency can be improved. On the other hand, PR implementation may not be performed successfully if the RP size is too tightly constrained. In general, it is recommended by Xilinx that the size of the RP is set to provide approximately 20% more resources than the RM requires, so as to avoid PR implementation errors and meet critical timing requirements [197]. However, the applicability of this guideline figure depends on the specifics of individual modules. In this study, individual modules are analysed in terms of PR implementations using the set of area constraints defined above, in order to investigate the influence of RP size on power consumption.

As defined above, Area Constraint 1 gives the smallest RP and Area Constraint 6 the largest. The RP defined by Area Constraint 6 is positioned to cover the major part of the area occupied by the conventional implementation without area constraints, in order to minimise the influence of different clock regions on power consumption (the Virtex-5 LX 110T die consists of 16 clock regions in total, and an RP of a given size can be placed in different clock regions). Note that, for each module, Area Constraints 1–6 occupy the same clock regions in order to obtain more accurate results. The RP with approximately double the required resources refers to slice resources rather than DSP48Es and BRAMs, because the number of DSP48Es and BRAMs on the device is limited. The sizes and positions of the RPs defined by Area Constraints 1–6 are illustrated in Figure 7.6. The diagram represents only part of the entire FPGA device, and clock regions span half of the die horizontally.



Figure 7.6: Defined Area Constraints 1-6.

#### 7.3.2 Clock Distribution Mechanisms

The type of clock routing is another factor which can affect the power consumption of PR designs. Three types of clock routings are available: global clock routing, regional clock routing and local clock routing. Conventional FPGA implementation usually employs global clock routing because development tools automatically target these high performance, dedicated resources. The Global Clock Buffer (BUFG) is a necessary component for global clock routing, which can distribute clock signals into the global dedicated clock networks. These clock networks can span the entire FPGA and are a dedicated resource designed to reduce clock skew and provide high timing performance.

Regional clock routing is another set of dedicated clock resources designed for low clock skew on FPGAs and is independent of global clock resources. Like global clock routing, the regional clock routing is driven by the Regional Clock Buffer (BUFR), which can distribute the clock signals into the regional clock networks to serve the design. Finally, unlike global and regional clock routings utilising dedicated clock networks, local clock routing takes advantage of generic interconnects to connect the required hardware resources.

In the case of the Virtex-5 LX 110T device, the clock resources are illustrated in Figure 7.7. There are 32 BUFGs in total located in the centre of the device, and the global clock network driven by one BUFG can span the entire device. Each clock region contains two BUFRs, giving 32 BUFRs in total, located at the left or right side of the clock regions. The regional clock network driven by one BUFR can cover at most three adjacent clock regions. Therefore, a design that is routed over more than three adjacent clock regions cannot be implemented using the regional clock resources.
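The BUFR coverage limit can be written as a simple feasibility check. In this sketch, clock regions are identified by integer indices along one column of regions, and "adjacent" is taken to mean a contiguous run of indices; both are modelling assumptions made for illustration.

```python
MAX_BUFR_REGIONS = 3  # a BUFR can serve at most three adjacent clock regions


def fits_regional_clock(regions):
    """Return True if the occupied clock regions can be served by a single BUFR.

    regions: iterable of integer clock-region indices within one column (assumed model).
    """
    occupied = sorted(set(regions))
    if not occupied:
        raise ValueError("design must occupy at least one clock region")
    contiguous = occupied[-1] - occupied[0] + 1 == len(occupied)
    return contiguous and len(occupied) <= MAX_BUFR_REGIONS


print(fits_regional_clock([3, 4, 5]))     # three adjacent regions
print(fits_regional_clock([2, 3, 4, 5]))  # four regions exceed the limit
```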

The module of LTE (10 MHz) is taken as an example and is implemented with global clock, regional clock and local clock routings respectively. These implementations are performed with the same area constraints in each case, and the resulting clock routings are illustrated in Figure 7.8.



Figure 7.7: Clock Resources of Virtex-5 LX 110T Device.

In PR design, partition pins must be added for each signal which passes between the static logic and RPs, to ensure that communication is successful. The clock signal is an exception: given the importance attached to clock integrity, it can instead utilise the global or regional clock networks within the RP. In other words, PR design provides two options for passing the clock signal from the static region to the RPs. The first is to utilise the dedicated clock networks without adding partition pins, resulting in global clock routing or regional clock routing. The second is to utilise the generic interconnects by adding a partition pin to the clock signal, which provides local clock routing. PR design tools are capable of selecting the appropriate option according to the source of the clock signal. If the clock signal entering the RP is driven by a BUFG, the PR design tool infers that global clock routing is employed; likewise, regional clock routing is inferred when the clock signal entering the RP is driven by a BUFR. However, if the clock signal enters the RP directly from an I/O port block without passing through a BUFG or BUFR, the PR design tool inserts a partition pin into the clock signal path, as for any other signal, and implements local clock routing for the RM.

Figure 7.8: Clock Routings of LTE (10 MHz): (a) Global Clock Routing, (b) Regional Clock Routing, (c) Local Clock Routing.

The situation of regional clock routing for PR design in the case of Virtex-5 family devices is more complicated. According to the latest PR user guide, the PR design tool has to insert a partition pin to the clock signal driven by the BUFR; this applies only to the Virtex-5 devices [33]. In other words, the regional clock routing of PR design on Virtex-5 devices does not utilise the real regional clock resources even though the clock signal is driven by the BUFR, as illustrated in Figure 7.9. The actual clock routing can be referred to as a "mixed clock routing" of regional clock and local clock routings: the clock routings between the output of BUFR and the input of the RP are real regional clock routings; the clock routings after the partition pin and within the RP utilise local clock routing resources.



Figure 7.9: Actual Clock Routing of PR Design on Virtex-5 Device.

Taking into account the clock routing resources on FPGAs discussed above, three types of clock routing resources can be considered for PR designs. However, it is important to note that local clock routings can cause large clock skews and unpredictable behaviour and errors at runtime. Therefore, utilising local clock routing resources is not recommended for PR designs. For the PR design on Virtex-5 devices, the actual regional clock routing is a mix of regional clock and local clock routings and thus it is also not recommended for PR design. As a result, only global clock resources are suited to the PR design on Virtex-5 devices and thus the power consumption is analysed in terms of Area Constraints 1–6 defined earlier and global clock resources in this study. For the most recent FPGAs, including Virtex-6 and Virtex-7, a partition pin is not required for the clock signal driven by the BUFR, and thus both global and regional clock resources may be used for PR designs on these devices.
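The routing rules discussed in this section can be summarised as a small decision function. The following Python sketch is purely illustrative: the function name `infer_clock_routing`, the `ClockRouting` record and the string labels for drivers and device families are assumptions made for this example and do not correspond to any Xilinx tool interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockRouting:
    kind: str            # "global", "regional", "mixed" or "local"
    partition_pin: bool  # True if a partition pin is inserted on the clock
    recommended: bool    # True if the text considers it suitable for PR


def infer_clock_routing(driver: str, family: str) -> ClockRouting:
    """Summarise the clock routing rules described in the text.

    driver: "BUFG", "BUFR" or "IO" (clock enters the RP directly from
            an I/O port block). These labels are hypothetical.
    family: "virtex5", "virtex6" or "virtex7" (hypothetical labels).
    """
    if driver == "BUFG":
        # Dedicated global clock network, no partition pin
        return ClockRouting("global", partition_pin=False, recommended=True)
    if driver == "BUFR":
        if family == "virtex5":
            # Virtex-5 only: a partition pin follows the BUFR, giving a
            # mix of regional and local routing
            return ClockRouting("mixed", partition_pin=True, recommended=False)
        return ClockRouting("regional", partition_pin=False, recommended=True)
    if driver == "IO":
        # Generic interconnect with a partition pin: local clock routing
        return ClockRouting("local", partition_pin=True, recommended=False)
    raise ValueError(f"unknown clock driver: {driver!r}")
```

For example, `infer_clock_routing("BUFR", "virtex5")` returns the "mixed" case, which is why only global clock routing is analysed for the Virtex-5 device in this study.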

## 7.4 Implementation Results and Analysis

In this section, the results of conventional FPGA implementations are first listed. Here, the term "conventional implementation" is used to refer to synthesis and implementation with default settings and without area constraints. Then, modules are implemented using the set of different area constraints defined previously, allowing the influence of this factor on dynamic power consumption to be analysed.

### 7.4.1 Conventional Implementation Results

The DUCs for LTE (10 MHz), WCDMA, IEEE 802.16e (7 MHz), IEEE 802.16e (10 MHz), IEEE 802.11n (20 MHz), white space LTE (10 MHz), white space IEEE 802.16e (7 MHz), white space IEEE 802.16e (10 MHz) and white space IEEE 802.11n (20 MHz) are taken as examples. Since the spreader, mapper and clock divider modules occupy few slices without any DSP48Es and BRAMs, their resource utilisation and power consumption are insignificant compared to the DUC and transform RPs, and they can be omitted from further analysis in this study. Table 7.1 lists the conventional FPGA implementation results of the remaining modules.

|                     | Clock<br>Power<br>(mW) | Other<br>Power <sup>1</sup><br>(mW) | Dynamic<br>Power<br>(mW) | Slices | DSP48Es | BRAMs | f <sub>max</sub><br>(MHz) |
|---------------------|---------------|----------------------------|---------------|--------|------------------|-------|-------|
| LTE (10 MHz)        | 103.72        | 92.98                      | 196.70        | 684    | 13               | 9     | 407.2 |
| 802.16e (7 MHz)     | 126.67        | 96.64                      | 223.31        | 804    | 12               | 9     | 433.3 |
| 802.16e (10 MHz)    | 108.17        | 123.77                     | 231.94        | 599    | 11               | 9     | 387.1 |
| WCDMA               | 89.78         | 67.87                      | 157.65        | 708    | 5                | 9     | 435.2 |
| 802.11n (20 MHz)    | 69.51         | 70.44                      | 139.95        | 488    | 12               | 9     | 442.1 |
| WS 802.11n (20 MHz) | 139.91        | 97.75                      | 237.66        | 995    | 30               | 1     | 368.3 |
| WS 802.16e (7 MHz)  | 190.69        | 113.10                     | 303.79        | 1173   | 33               | 1     | 411.9 |
| WS 802.16e (10 MHz) | 260.54        | 152.47                     | 413.01        | 1184   | 30               | 1     | 397.9 |
| WS LTE (10 MHz)     | 154.94        | 100.17                     | 255.11        | 1026   | 33               | 1     | 310.0 |
| OFDM                | 44.59         | 73.14                      | 117.73        | 797    | 9                | 10    | 297.3 |

Table 7.1: Conventional Implementation Results without Area Constraint.

1.  $P_{\text{Other}} = P_{\text{Logic}} + P_{\text{Signal}} + P_{\text{BRAMs}} + P_{\text{DSP48s}} + P_{\text{IO}}$ 

Note that all of the modules are implemented using the Xilinx ISE 12.4 software suite and target the Virtex-5 LX 110T device. The *clock power* column states the clock power consumption of the modules and the *other power* column expresses the sum of logic power, signal power, I/O power, and the power consumed by the DSP48Es and BRAMs. The *dynamic power* column refers to the total dynamic power consumption of the design, i.e. the sum of the clock power and other power. The $f_{max}$ column provides the maximum clock frequency supported by each of the modules analysed in the test.
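The relationship between the power columns can be checked directly. The short Python sketch below uses a subset of rows copied from Table 7.1; the names `TABLE_7_1` and `clock_share` are chosen for this illustration only. It confirms that the dynamic power is the sum of the clock power and other power, and reports the fraction of dynamic power attributable to the clock.

```python
# (clock power, other power, dynamic power) in mW, copied from Table 7.1
TABLE_7_1 = {
    "LTE (10 MHz)": (103.72, 92.98, 196.70),
    "WCDMA": (89.78, 67.87, 157.65),
    "WS 802.16e (10 MHz)": (260.54, 152.47, 413.01),
    "OFDM": (44.59, 73.14, 117.73),
}


def clock_share(clock_mw, dynamic_mw):
    """Fraction of the dynamic power that is consumed by the clock."""
    return clock_mw / dynamic_mw


for name, (clock, other, dynamic) in TABLE_7_1.items():
    # Dynamic power should equal clock power plus other power
    # (tolerance covers floating-point rounding of two-decimal values)
    assert abs((clock + other) - dynamic) < 0.005
    print(f"{name}: clock share = {clock_share(clock, dynamic):.0%}")
```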

### 7.4.2 PR Implementation Results

PR implementation results based on the Area Constraints 1–6 with global clock routing resources are presented in Table 7.2.

The table features the *range* column, which describes the size and position of the RP. The range defines the area from the lower left corner to the upper right corner. TV white space IEEE 802.11n (20 MHz) with Area Constraint 6 is taken as an example, and the range of its RP can be expressed as X12Y80-X37Y159, which implies that the lower left corner is located at X12Y80 and the upper right corner is located at X37Y159. The diagram of this range is shown in Figure 7.10.



Figure 7.10: Range of TV White Space IEEE 802.11n (5 MHz) with Area Constraint 1.
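The range notation can be decoded mechanically. The following Python sketch parses a range string into its two corners and reports the width and height of the rectangle; the helper names `parse_range` and `span` are hypothetical, and the code assumes that both corner coordinates are inclusive grid positions.

```python
import re

# Accept a hyphen, en dash or em dash between the two corners
_RANGE = re.compile(r"X(\d+)Y(\d+)\s*[-\u2013\u2014]\s*X(\d+)Y(\d+)")


def parse_range(text):
    """Return ((x0, y0), (x1, y1)): lower left and upper right corners."""
    match = _RANGE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid range: {text!r}")
    x0, y0, x1, y1 = map(int, match.groups())
    if x1 < x0 or y1 < y0:
        raise ValueError("upper right corner lies below or left of lower left")
    return (x0, y0), (x1, y1)


def span(text):
    """Return (width, height) of the range, counting both corners."""
    (x0, y0), (x1, y1) = parse_range(text)
    return x1 - x0 + 1, y1 - y0 + 1


print(parse_range("X12Y80-X37Y159"))  # ((12, 80), (37, 159))
print(span("X12Y80-X37Y159"))         # (26, 80)
```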

Since the number of DSP48Es and BRAMs cannot be reduced during PR implementation and their utilisation remains the same, DSP48E and BRAM utilisation are omitted from these test results. It is important to note that the same I/O port locations have been retained across the PR tests for each individual module with Area Constraints 1–6. This is because choosing I/O ports with different pin locations can cause placements (and hence signal routings) to vary, which may impact on dynamic power consumption. Retaining the same I/O port locations reduces any related influence on the results. Originally designed functionality and timing requirements are unmodified across the PR tests.

| Designs      | Area<br>Constraint | Clock<br>Power<br>(mW) | Other<br>Power<br>(mW) | Dynamic<br>Power<br>(mW) | Slices | f <sub>max</sub><br>(MHz) | Range           |  |
|--------------|--------------------|------------------------|------------------------|--------------------------|--------|---------------------------|-----------------|--|
|              | -10% (AC1)         | 60.35                  | 60.60                  | 120.95                   | 496    | 305.1                     | X28Y100-X39Y149 |  |
|              | 0% (AC2)           | 60.72                  | 61.30                  | 122.02                   | 513    | 390.8                     | X26Y100-X39Y149 |  |
| LTE (10 MHz) | 10% (AC3)          | 61.17                  | 59.60                  | 120.77                   | 472    | 342.2                     | X24Y100-X39Y149 |  |
|              | 20% (AC4)          | 63.07                  | 60.17                  | 123.24                   | 539    | 387.3                     | X22Y100-X39Y149 |  |
|              | 50% (AC5)          | 65.47                  | 57.70                  | 123.17                   | 586    | 385.9                     | X18Y100-X39Y149 |  |
|              | 100% (AC6)         | 68.92                  | 56.64                  | 125.56                   | 574    | 329.6                     | X8Y100-X39Y149  |  |
|              | -10% (AC1)         | 70.64                  | 65.83                  | 136.47                   | 569    | 311.7                     | X28Y80-X39Y139  |  |
|              | 0% (AC2)           | 71.42                  | 64.48                  | 135.90                   | 567    | 331.6                     | X26Y100-X39Y139 |  |
| 802.16e      | 10% (AC3)          | 71.68                  | 63.37                  | 135.05                   | 599    | 275.2                     | X26Y80-X39Y139  |  |
| (7 MHz)      | 20% (AC4)          | 73.75                  | 64.50                  | 138.25                   | 624    | 355.0                     | X24Y80-X39Y139  |  |
|              | 50% (AC5)          | 76.50                  | 62.48                  | 138.98                   | 669    | 382.6                     | X20Y80-X39Y139  |  |
|              | 100% (AC6)         | 79.23                  | 62.32                  | 141.55                   | 702    | 401.4                     | X12Y80-X39Y139  |  |
|              | -10% (AC1)         |                        |                        |                          |        |                           |                 |  |
|              | 0% (AC2)           | 87.98                  | 87.68                  | 175.66                   | 553    | 371.7                     | X28Y100-X37Y159 |  |
| 802.16e      | 10% (AC3)          | 89.01                  | 84.88                  | 173.89                   | 573    | 367.4                     | X26Y130-X37Y159 |  |
| (10 MHz)     | 20% (AC4)          | 89.74                  | 85.71                  | 175.45                   | 594    | 386.0                     | X26Y100-X37Y159 |  |
|              | 50% (AC5)          | 92.55                  | 85.00                  | 177.55                   | 619    | 383.6                     | X22Y100-X37Y159 |  |
|              | 100% (AC6)         | 95.94                  | 86.54                  | 182.48                   | 677    | 383.0                     | X18Y100-X37Y159 |  |
|              | -10% (AC1)         | 60.15                  | 54.62                  | 114.77                   | 503    | 379.1                     | X28Y60-X39Y109  |  |
|              | 0% (AC2)           | 61.44                  | 52.46                  | 113.90                   | 513    | 360.8                     | X26Y40-X39Y109  |  |
| WCDMA        | 10% (AC3)          | 62.61                  | 50.61                  | 113.22                   | 519    | 408.8                     | X24Y60-X39Y109  |  |
|              | 20% (AC4)          | 63.66                  | 52.06                  | 115.72                   | 545    | 426.6                     | X22Y60-X39Y109  |  |
|              | 50% (AC5)          | 65.86                  | 50.36                  | 116.22                   | 555    | 427.9                     | X18Y60-X39Y109  |  |
|              | 100% (AC6)         | 68.22                  | 51.51                  | 119.73                   | 614    | 387.6                     | X12Y60-X39Y109  |  |
|              | -10% (AC1)         | 51.89                  | 58.03                  | 109.92                   | 413    | 357.5                     | X26Y80-X33Y119  |  |
|              | 0% (AC2)           | 51.97                  | 58.21                  | 110.18                   | 398    | 308.5                     | X26Y60-X33Y119  |  |
| 802.11n      | 10% (AC3)          | 52.38                  | 56.64                  | 109.02                   | 441    | 271.0                     | X24Y90-X33Y119  |  |
| (20 MHz)     | 20% (AC4)          | 53.30                  | 57.40                  | 110.70                   | 412    | 314.1                     | X24Y70-X33Y119  |  |
|              | 50% (AC5)          | 55.12                  | 58.18                  | 113.30                   | 470    | 348.7                     | X22Y60-X33Y119  |  |
|              | 100% (AC6)         | 58.46                  | 55.51                  | 113.97                   | 520    | 376.8                     | X16Y60-X33Y119  |  |
|              | -10% (AC1)         | 83.31                  | 85.41                  | 168.72                   | 739    | 299.0                     | X26Y90-X37Y159  |  |
|              | 0% (AC2)           | 83.88                  | 73.28                  | 157.16                   | 736    | 377.1                     | X26Y80-X37Y159  |  |
| WS 802.11n   | 10% (AC3)          | 84.22                  | 69.05                  | 153.27                   | 676    | 300.9                     | X24Y80-X37Y159  |  |
| (20 MHz)     | 20% (AC4)          | 86.08                  | 79.08                  | 165.16                   | 691    | 404.0                     | X22Y80-X37Y159  |  |
|              | 50% (AC5)          | 88.30                  | 80.55                  | 168.85                   | 781    | 355.9                     | X20Y80-X37Y159  |  |
|              | 100% (AC6)         | 92.50                  | 77.60                  | 170.10                   | 813    | 297.1                     | X12Y80-X37Y159  |  |

**Table 7.2:** Implementation Results with Global Clock Routings and Area Constraints 1–6.

| Designs                | Area<br>Constraint | Clock<br>Power<br>(mW) | Other<br>Power<br>(mW) | Dynamic<br>Power<br>(mW) | Slices | f <sub>max</sub><br>(MHz) | Range          |  |
|------------------------|--------------------|------------------------|------------------------|--------------------------|--------|---------------------------|----------------|--|
|                        | -10% (AC1)         | 114.30                 | 93.37                  | 207.67                   | 890    | 284.9                     | X28Y20-X37Y119 |  |
|                        | 0% (AC2)           | 115.09                 | 94.23                  | 209.32                   | 926    | 295.7                     | X26Y20-X37Y119 |  |
| WS 802.16e             | 10% (AC3)          | 115.69                 | 81.23                  | 196.92                   | 870    | 342.5                     | X24Y60-X37Y119 |  |
| (7 MHz)                | 20% (AC4)          | 118.76                 | 89.32                  | 208.08                   | 1004   | 363.8                     | X24Y20-X37Y119 |  |
|                        | 50% (AC5)          | 118.62                 | 90.85                  | 209.47                   | 939    | 340.6                     | X20Y20-X37Y119 |  |
|                        | 100% (AC6)         | 124.71                 | 86.91                  | 211.62                   | 1009   | 379.4                     | X14Y20-X37Y119 |  |
|                        | -10% (AC1)         | 144.24                 | 114.70                 | 258.94                   | 904    | 448.2                     | X28Y20-X41Y99  |  |
|                        | 0% (AC2)           | 146.75                 | 112.85                 | 259.60                   | 935    | 397.1                     | X26Y20-X41Y99  |  |
| WS 802.16e<br>(10 MHz) | 10% (AC3)          | 146.98                 | 107.10                 | 254.08                   | 926    | 395.9                     | X24Y20-X41Y99  |  |
|                        | 20% (AC4)          | 147.54                 | 107.01                 | 254.55                   | 852    | 390.2                     | X22Y20-X41Y99  |  |
|                        | 50% (AC5)          | 152.38                 | 105.70                 | 258.08                   | 973    | 393.4                     | X18Y20-X41Y99  |  |
|                        | 100% (AC6)         | 159.16                 | 104.51                 | 263.67                   | 1054   | 393.1                     | X12Y20-X41Y99  |  |
|                        | -10% (AC1)         | 92.75                  | 84.68                  | 177.43                   | 780    | 256.4                     | X28Y60-X35Y159 |  |
|                        | 0% (AC2)           | 93.62                  | 81.19                  | 174.81                   | 789    | 259.9                     | X26Y60-X35Y159 |  |
| WS LTE                 | 10% (AC3)          | 95.11                  | 71.75                  | 166.86                   | 824    | 334.1                     | X24Y80-X35Y159 |  |
| (10 MHz)               | 20% (AC4)          | 96.47                  | 75.03                  | 171.50                   | 877    | 343.8                     | X24Y60-X35Y159 |  |
|                        | 50% (AC5)          | 99.63                  | 77.91                  | 177.54                   | 862    | 378.8                     | X20Y60-X35Y159 |  |
|                        | 100% (AC6)         | 103.14                 | 81.19                  | 184.33                   | 926    | 251.7                     | X16Y60-X35Y159 |  |
|                        | -10% (AC1)         | 29.03                  | 55.86                  | 84.89                    | 627    | 344.0                     | X28Y60-X45Y99  |  |
| OFDM                   | 0% (AC2)           | 29.62                  | 55.98                  | 85.60                    | 624    | 326.1                     | X26Y60-X45Y99  |  |
|                        | 10% (AC3)          | 30.25                  | 54.13                  | 84.38                    | 640    | 315.6                     | X24Y60-X45Y99  |  |
|                        | 20% (AC4)          | 30.62                  | 55.38                  | 86.00                    | 741    | 303.8                     | X22Y60-X45Y99  |  |
|                        | 50% (AC5)          | 33.07                  | 55.42                  | 88.49                    | 737    | 352.7                     | X16Y60-X45Y99  |  |
|                        | 100% (AC6)         | 36.95                  | 54.11                  | 91.06                    | 712    | 326.2                     | X6Y40-X45Y99   |  |


Based on the results presented in Table 7.2, it is clear that the clock power consumption increases gradually while the other elements of power consumption fluctuate for each module with Area Constraints 1–6 defined. For Area Constraints 1–3, the clock power consumption increases slightly while the best results in terms of other power consumption can be obtained when Area Constraint 3 is applied (i.e. where the allocated area is 10% greater than the resources required), resulting in the lowest total dynamic power consumption. With regard to the IEEE 802.16e (10 MHz) module, Area Constraint 1 (i.e. where the allocated area is 10% smaller than the resources required) is too tightly constrained and thus PR implementation cannot be performed successfully. With respect to Area Constraints 3–6, the clock power consumption increases while the other power consumption fluctuates slightly. The lowest clock power consumption in this range is obtained with Area Constraint 3, and thus Area Constraint 3 results in the lowest total dynamic power consumption.

The choice of RP size fundamentally affects the overall dynamic power consumption: too large an RP size results in high clock and dynamic power consumption; too small an RP size can lead to the lowest clock power consumption, but relatively high other power consumption, and thus overall power consumption is not the lowest in Area Constraints 1–6 for each individual module. Taking into account the considerations discussed above, the results obtained demonstrate that Area Constraint 3 (i.e. where the allocated area is approximately 10% greater than the resources required) leads to the lowest total dynamic power consumption for the considered RMs.
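The selection of the best area constraint can be reproduced from the tabulated data. The Python sketch below uses the dynamic power figures of three modules copied from Table 7.2 and picks, for each module, the area constraint with the lowest dynamic power; the names `DYNAMIC_MW` and `best_constraint` are chosen for this illustration only.

```python
# Dynamic power (mW) per area constraint, copied from Table 7.2
DYNAMIC_MW = {
    "LTE (10 MHz)": {
        "AC1": 120.95, "AC2": 122.02, "AC3": 120.77,
        "AC4": 123.24, "AC5": 123.17, "AC6": 125.56,
    },
    # AC1 is absent: the PR implementation could not be completed
    "802.16e (10 MHz)": {
        "AC2": 175.66, "AC3": 173.89,
        "AC4": 175.45, "AC5": 177.55, "AC6": 182.48,
    },
    "WS 802.11n (20 MHz)": {
        "AC1": 168.72, "AC2": 157.16, "AC3": 153.27,
        "AC4": 165.16, "AC5": 168.85, "AC6": 170.10,
    },
}


def best_constraint(results):
    """Return the area constraint label with the lowest dynamic power."""
    return min(results, key=results.get)


for module, results in DYNAMIC_MW.items():
    print(module, "->", best_constraint(results))
```

For each of the three modules shown, the function returns Area Constraint 3, in agreement with the observation above.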

Comparing the results presented in Tables 7.1 and 7.2, it is clear that a significant reduction in clock and dynamic power consumption can be obtained when an area constraint of 10% greater than the resources required is chosen. All the tests are implemented using global clock routing and the only difference between the conventional and PR cases is the area constraints. The reduction in power consumption may therefore be attributed to the applied area constraints.

Without area constraints, modules can be placed and routed loosely across the entire device, whereas setting appropriate area constraints prompts a tighter placement and routing. Tight placement results in a diminished length of nets, especially the clock net. The implication is a reduction in the capacitance of signals within the design, resulting in a power reduction according to (7.1). In addition, a number of global clock branches have to be utilised for any modules spanning many clock regions without area constraints. However, only the global clock branches of the constrained area have to be used and thus some clock branches are disconnected and freed, resulting in lower clock power consumption as compared to an implementation without area constraints.

TV white space IEEE 802.11n (20 MHz) is taken as an example. Since clock power consumption is significant, and dominates the total dynamic power consumption according to Table 7.1, the clock routings of the modules are of primary interest. The results presented in Tables 7.1 and 7.2 show a reduction in clock power from 139.91 mW to 84.22 mW as a result of Area Constraint 3. The clock routings of TV white space IEEE 802.11n (20 MHz) using a conventional implementation without area constraints, and a PR implementation using global clock routing and Area Constraint 3, are shown in Figure 7.11.



Figure 7.11: Clock Routings Comparison: (a) Conventional FPGA Implementation without Area Constraints; (b) Global Clock Routing PR Implementation with Area Constraint 3.

Without an area constraint, the design is placed loosely, spanning ten clock regions in total, as shown in Figure 7.11 (a). However, the same design can be placed closely and tightly within only four clock regions with Area Constraint 3, as illustrated in Figure 7.11 (b). It is obvious that the length of global clock routing has been reduced significantly and the clock branches spanning the other six clock regions can be disconnected and freed, which are the main reasons for the reduction of clock and dynamic power consumption. In addition, tight placement can increase the slice utilisation efficiency because the basic logic resources, LUTs and FFs, can be placed closely and tightly. That is the main reason that the number of slices required for a given module is seen to reduce as the area is more tightly constrained.

### 7.4.3 Further Analysis of Dynamic Power Consumption

As discussed earlier, the RP sizes defined by Area Constraint 3 (i.e. where the allocated area is approximately 10% greater than the resources required) provide the best implementation results in terms of dynamic power consumption. Consequently, results for conventional and PR implementations with Area Constraint 3 are further analysed to investigate the dynamic and clock power savings due to area constraints, as detailed in Table 7.3. The *AC* columns give the dynamic and clock power savings attributable only to area constraints, obtained by subtracting the results for PR implementations using global clock routing and Area Constraint 3 from those of the corresponding conventional implementations without area constraints.

|                     | Dynamic Power Consumption | Clock Power Consumption |
|---------------------|---------------------------|-------------------------|
|                     | AC (mW)/%                 | AC (mW)/%               |
| LTE (10 MHz)        | 75.93 (39%)               | 42.55 (41%)             |
| 802.16e (7 MHz)     | 88.26 (40%)               | 54.99 (43%)             |
| 802.16e (10 MHz)    | 58.05 (25%)               | 19.16 (18%)             |
| WCDMA               | 44.43 (28%)               | 27.17 (30%)             |
| 802.11n (20 MHz)    | 30.93 (22%)               | 17.13 (25%)             |
| WS 802.11n (20 MHz) | 84.39 (36%)               | 55.69 (40%)             |
| WS 802.16e (7 MHz)  | 106.87 (35%)              | 75.00 (39%)             |
| WS 802.16e (10 MHz) | 158.93 (38%)              | 113.56 (44%)            |
| WS LTE (10 MHz)     | 88.25 (35%)               | 59.83 (39%)             |
| OFDM                | 33.35 (28%)               | 14.34 (32%)             |
| Average             | 33%                       | 35%                     |

**Table 7.3:** Analysis of Clock and Dynamic Power Reduction.

According to Table 7.3, the reduction of overall dynamic power consumption ranges from 22% to 40% and the average is approximately 33%. Similarly, the reduction of clock power consumption ranges from 18% to 44% and the average is approximately 35%. The greatest power savings in terms of overall dynamic and clock power consumption can be obtained for the IEEE 802.16e (7 MHz) and TV white space IEEE 802.16e (10 MHz), reaching 40% and 44% respectively.

Considering the overall dynamic and clock power consumption, the average reductions of approximately 33% and 35% due to area constraints are significant. Therefore, the setting of appropriate area constraints is seen to play an important role in the process of reducing the dynamic power consumption of PR implementations.
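The savings analysed in this section follow from a simple subtraction of the tabulated results. The Python sketch below recomputes them from the conventional results of Table 7.1 and the Area Constraint 3 results of Table 7.2; the names `RESULTS`, `saving` and `average_savings` are introduced for this illustration only, and the averages are taken over unrounded percentages, so they may differ slightly from averages of the rounded table entries.

```python
# module: ((clock, dynamic) conventional, (clock, dynamic) PR with AC3), in mW
RESULTS = {
    "LTE (10 MHz)": ((103.72, 196.70), (61.17, 120.77)),
    "802.16e (7 MHz)": ((126.67, 223.31), (71.68, 135.05)),
    "802.16e (10 MHz)": ((108.17, 231.94), (89.01, 173.89)),
    "WCDMA": ((89.78, 157.65), (62.61, 113.22)),
    "802.11n (20 MHz)": ((69.51, 139.95), (52.38, 109.02)),
    "WS 802.11n (20 MHz)": ((139.91, 237.66), (84.22, 153.27)),
    "WS 802.16e (7 MHz)": ((190.69, 303.79), (115.69, 196.92)),
    "WS 802.16e (10 MHz)": ((260.54, 413.01), (146.98, 254.08)),
    "WS LTE (10 MHz)": ((154.94, 255.11), (95.11, 166.86)),
    "OFDM": ((44.59, 117.73), (30.25, 84.38)),
}


def saving(before_mw, after_mw):
    """Return (absolute saving in mW, saving as a percentage of before_mw)."""
    delta = before_mw - after_mw
    return delta, 100.0 * delta / before_mw


def average_savings(results):
    """Return (average clock saving %, average dynamic saving %)."""
    clock_pct = [saving(conv[0], pr[0])[1] for conv, pr in results.values()]
    dyn_pct = [saving(conv[1], pr[1])[1] for conv, pr in results.values()]
    return sum(clock_pct) / len(clock_pct), sum(dyn_pct) / len(dyn_pct)


avg_clock, avg_dynamic = average_savings(RESULTS)
print(f"average clock saving:   {avg_clock:.1f}%")
print(f"average dynamic saving: {avg_dynamic:.1f}%")
```

The computed averages are close to the approximate 35% and 33% figures quoted above.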

## 7.5 Static Power Consumption

Based on the analysis of [184] [198] [199], static power consumption results from transistor leakage current, and two factors are therefore considered to affect its value: the size of the device and the technology node. A smaller device contains fewer transistors and thus consumes less static power. The leakage current is closely related to the technology node (transistor size): it increases as the technology node shrinks, e.g. from 90 nm to 65 nm [184]. In addition, static power depends strongly on the junction temperature: as the junction temperature rises, the leakage current increases dramatically.

The static power consumption of the Virtex-5 LX series is listed in Table 7.4 and illustrated in Figure 7.12. With the increment of device sizes, the static power consumption increases significantly. Note that all of the static power figures are calculated using the XPE tool with default voltage settings: the core voltage (i.e.  $V_{ccint}$ ) and the auxiliary voltage (i.e.  $V_{ccaux}$ ) are set to 1 V and 2.5 V respectively. According to Table 7.4, the static power can be reduced by 65% from the XC5VLX330T device to the XC5VLX110T device at 25°C.

| Virtex-5 LX<br>Devices | Slices | BRAMs | DSP48Es | Static Power<br>(mW) | Junction<br>Temperature<br>(°C) |
|------------------------|--------|-------|---------|----------------------|----------------------------------|
| XC5VLX20T              | 3120   | 26    | 24      | 237                  |                                  |
| XC5VLX30T              | 4800   | 36    | 32      | 302                  |                                  |
| XC5VLX50T              | 7200   | 60    | 48      | 420                  |                                  |
| XC5VLX85T              | 12960  | 108   | 48      | 674                  | 25                               |
| XC5VLX110T             | 17280  | 148   | 64      | 882                  |                                  |
| XC5VLX155T             | 24320  | 212   | 128     | 1472                 |                                  |
| XC5VLX220T             | 34560  | 212   | 128     | 1684                 |                                  |
| XC5VLX330T             | 51840  | 324   | 192     | 2523                 |                                  |

 Table 7.4: Information of Virtex-5 LX FPGAs.



Figure 7.12: Static Power Consumption for Virtex-5 LX Devices.
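The static power saving obtained by moving to a smaller device can be computed from Table 7.4. The Python sketch below is a minimal illustration using the tabulated static power values at 25°C; the names `STATIC_MW` and `static_saving_percent` are chosen for this example only.

```python
# Static power (mW) at a junction temperature of 25 degrees C, from Table 7.4
STATIC_MW = {
    "XC5VLX20T": 237,
    "XC5VLX30T": 302,
    "XC5VLX50T": 420,
    "XC5VLX85T": 674,
    "XC5VLX110T": 882,
    "XC5VLX155T": 1472,
    "XC5VLX220T": 1684,
    "XC5VLX330T": 2523,
}


def static_saving_percent(from_device, to_device):
    """Percentage reduction in static power when moving between devices."""
    before = STATIC_MW[from_device]
    after = STATIC_MW[to_device]
    return 100.0 * (before - after) / before


print(f"{static_saving_percent('XC5VLX330T', 'XC5VLX110T'):.1f}%")
```

Moving from the XC5VLX330T to the XC5VLX110T gives the reduction of approximately 65% quoted above.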

As reviewed in Chapter 6, multiple functions can be implemented with time division multiplexing via PR so as to reduce the FPGA size. The proposed DRP-PR architecture only requires 1636 slices, 42 DSP48Es and 19 BRAMs while 10059 slices, 222 DSP48Es and 77 BRAMs are needed by the fixed multiple standard design method. According to Table 7.4, the largest device in the Virtex-5 LX series, i.e. the XC5VLX330T, has only 192 DSP48Es and thus cannot support the fixed multiple standard design method. Therefore, a larger FPGA device is required, resulting in a significant growth in static power consumption. However, the DRP-PR architecture can perform all of the functionalities required with much reduced resources, and thus the entire system can be implemented on a smaller device, e.g. the XC5VLX110T device, reducing static power consumption by at least 65%.

The XC5VLX50T and XC5VLX85T devices have sufficient numbers of slices, BRAMs and DSP48Es to implement the proposed DRP-PR architecture. However, their 48 DSP48Es are distributed across 6 clock regions (8 DSP48Es are grouped into each clock region). According to the proposed DRP-PR architecture, the number of DSP48Es is the sum of the largest DUC and transform designs in terms of DSP48Es, and the largest usage of DSP48Es for the DUC and transform RPs is 33 and 9 respectively. Consequently, these two RPs have to cover 5 and 2 clock regions respectively in order to avoid the overlapping of DSP48Es, because the frame column is the smallest unit of configuration memory and can only be allocated to one RP. If the proposed DRP-PR architecture is implemented on either the XC5VLX50T or XC5VLX85T, as illustrated in Figure 7.13, overlapping can occur and PR implementations cannot be performed successfully. Here, the XC5VLX50T is taken as an example and the overlapping between transform and DUC RPs is highlighted. As a result, neither the XC5VLX50T nor the XC5VLX85T device can implement the proposed DRP-PR design.

The DSP48Es of XC5VLX110T are distributed across 8 clock regions, and this device is the smallest in the Virtex-5 LX family on which the proposed DRP-PR architecture can be implemented successfully.
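The clock region argument above reduces to a short calculation. The Python sketch below is a simplified model that assumes each RP claims whole clock regions of the DSP48E column and that no clock region is shared between RPs; the function names `regions_needed` and `fits` are hypothetical.

```python
import math

# 8 DSP48Es are grouped into each clock region, as stated in the text
DSP48E_PER_CLOCK_REGION = 8


def regions_needed(dsp48es):
    """Clock regions an RP must cover to obtain the given DSP48E count."""
    return math.ceil(dsp48es / DSP48E_PER_CLOCK_REGION)


def fits(rp_dsp_usage, dsp_clock_regions):
    """True if all RPs can be given non-overlapping clock regions."""
    return sum(regions_needed(n) for n in rp_dsp_usage) <= dsp_clock_regions


rps = [33, 9]  # largest DSP48E usage of the DUC and transform RPs

print([regions_needed(n) for n in rps])  # [5, 2]
print(fits(rps, 6))  # False: XC5VLX50T and XC5VLX85T (6 clock regions)
print(fits(rps, 8))  # True: XC5VLX110T (8 clock regions)
```

Under these assumptions the two RPs require 7 clock regions in total, which exceeds the 6 available on the XC5VLX50T and XC5VLX85T but not the 8 available on the XC5VLX110T.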



Figure 7.13: Floorplanning of DRP-PR Architecture on XC5VLX50T, showing overlapping of PR modules in one clock region.

## 7.6 Power Optimisation of DRP-PR Architecture

The observations made previously demonstrate that choosing an appropriate size of RP can lead to low dynamic power consumption, and thus the floorplanning of the DRP-PR architecture discussed in Chapter 6 can be optimised to obtain a lower power architecture, as shown in Figure 7.14.

The results in terms of dynamic power consumption and bitstream size for original and optimised floorplanning are listed in Table 7.5. The size of RP depends on the RM with the largest resource requirements and thus the size of TV white space IEEE 802.16e (7 MHz) with Area Constraint 3 is selected to define the DUC RP within which all of the DUC modules are implemented.

The mapper and clock divider RPs can each be sized at half of one entire frame. However, the position of the transform RP defined in Area Constraint 3 is occupied by the DUC. Consequently, the transform RP is moved to the top of the device to ensure that the implementations may be performed successfully.



Figure 7.14: Floorplanning Optimisation.

Note that both DRP-PR implementations are performed using global clock routing resources. Based on the results presented in Table 7.5, the dynamic power consumption of the DUC RP can achieve an average reduction of 3.0% via setting the most appropriate size of RP. The bitstream size of the optimised DUC RP can be reduced by 20.9%. Similarly, the dynamic power consumption of the optimised transform, mapper and clock divider RPs can be decreased on average by 40.9%, 16.3% and 13.8% respectively, and the bitstream size can also be reduced by 48.4%, 76.6% and 63.2% respectively.

Therefore, the optimised floorplanning of the proposed DRP-PR architecture can obtain a further reduction in dynamic power consumption and bitstream size. As discussed earlier, four partial bitstreams in total need to be downloaded (one for each RP) to perform standard switching in the worst case. As a result, the total size of the reconfiguration bitstream required is the sum of these partial bitstreams, i.e. 325+177+7+7=516 KB, which represents a reduction of 35.7% compared to the equivalent obtained in the original floorplanning, and a reduction of 86.4% compared to the programmable multiple standards design method. Note that these figures are linked to the relationship between the size of the FPGA device and the size of the RP, and therefore are specific to this example.
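The worst-case bitstream total can be reproduced with a short calculation. In the Python sketch below, the assignment of the 177 KB, 7 KB and 7 KB figures to the transform, mapper and clock divider RPs is an assumption based on the order in which the RPs are discussed; the names `OPTIMISED_KB` and `reduction_percent` are chosen for this illustration only.

```python
# Optimised partial bitstream sizes in KB. The DUC value is from Table 7.5;
# the mapping of the remaining values to RPs is assumed for illustration.
OPTIMISED_KB = {
    "DUC": 325,
    "transform": 177,
    "mapper": 7,
    "clock divider": 7,
}


def reduction_percent(original, optimised):
    """Percentage reduction from the original to the optimised value."""
    return 100.0 * (original - optimised) / original


# Worst case: one partial bitstream is downloaded for every RP
total_kb = sum(OPTIMISED_KB.values())
print(total_kb)  # 516

# DUC RP bitstream: 411 KB originally, 325 KB after optimisation
print(f"{reduction_percent(411, 325):.1f}%")  # 20.9%
```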

| RP | RM | Dynamic power, original (mW) | Dynamic power, optimised (mW) | Dynamic power reduction | Bitstream size, original (KB) | Bitstream size, optimised (KB) | Bitstream size reduction |
|---------------|----------------------|--------|--------|-------|-----|-----|-------|
| DUC           | LTE (5 MHz)          | 168.28 | 162.82 | 3.2%  | 411 | 325 | 20.9% |
|               | LTE (10 MHz)         | 166.61 | 161.16 | 3.3%  |     |     |       |
|               | 802.16e (3.5 MHz)    | 185.36 | 183.47 | 1.0%  |     |     |       |
|               | 802.16e (5 MHz)      | 235.65 | 226.60 | 3.8%  |     |     |       |
|               | 802.16e (7 MHz)      | 179.16 | 176.68 | 1.4%  |     |     |       |
|               | 802.16e (10 MHz)     | 226.35 | 222.36 | 1.8%  |     |     |       |
|               | WCDMA                | 163.14 | 160.28 | 1.8%  |     |     |       |
|               | 802.11n (20 MHz)     | 162.56 | 157.05 | 3.4%  |     |     |       |
|               | WS 802.11n (5 MHz)   | 198.43 | 194.28 | 2.1%  |     |     |       |
|               | WS 802.11n (10 MHz)  | 203.48 | 189.09 | 7.1%  |     |     |       |
|               | WS 802.11n (20 MHz)  | 201.75 | 195.05 | 3.3%  |     |     |       |
|               | WS 802.16e (3.5 MHz) | 248.68 | 241.02 | 3.1%  |     |     |       |
|               | WS 802.16e (5 MHz)   | 308.13 | 304.73 | 1.1%  |     |     |       |
|               | WS 802.16e (7 MHz)   | 236.51 | 222.01 | 6.1%  |     |     |       |
|               | WS 802.16e (10 MHz)  | 292.24 | 284.11 | 2.8%  |     |     |       |
|               | WS LTE (5 MHz)       | 200.94 | 195.73 | 2.6%  |     |     |       |
|               | WS LTE (10 MHz)      | 218.74 | 213.25 | 2.5%  |     |     |       |
|               | Average              |        |        | 3.0%  |     |     | 20.9% |
| Transform     | OFDM                 | 165.32 | 103.08 | 37.6% | 343 | 177 | 48.4% |
|               | Spreader             | 26.18  | 14.62  | 44.2% |     |     |       |
|               | Average              |        |        | 40.9% |     |     | 48.4% |
| Mapper        | QPSK                 | 27.92  | 21.92  | 21.5% | 30  | 7   | 76.7% |
|               | 16-QAM               | 28.13  | 24.43  | 13.2% |     |     |       |
|               | 64-QAM               | 28.79  | 24.68  | 14.3% |     |     |       |
|               | Average              |        |        | 16.3% |     |     | 76.7% |
| Clock Divider | Clock 1/8            | 24.10  | 20.78  | 13.8% | 19  | 7   | 63.2% |
|               | Clock 1/16           | 24.12  | 20.79  | 13.8% |     |     |       |
|               | Clock 1/32           | 24.13  | 20.78  | 13.9% |     |     |       |
|               | Clock 1/64           | 24.12  | 20.79  | 13.8% |     |     |       |
|               | Average              |        |        | 13.8% |     |     | 63.2% |

Table 7.5: Comparison Results between Original and Optimised Floorplannings.

## 7.7 Concluding Remarks

In this chapter, the importance of power consumption in FPGA designs, and the concepts of static and dynamic power have been introduced. A new power analysis approach (based on the conventional analysis of FPGA designs) has been proposed to analyse clock and dynamic power consumption for PR, and applied to the DRP-PR design developed in this thesis.

Key factors in the PR process, i.e. the specification of area constraints and the clock distribution mechanism, were then presented in order to investigate their influence on power consumption. Six area constraints were defined and three clock routing resources (global clock, regional clock and local clock) were analysed. Local clock routing can cause unpredictable behaviour and errors at runtime, and it was demonstrated that regional clock routing includes local clock routing for the PR design on the Virtex-5 device. Consequently, only global clock routing resources are recommended for PR designs targeting Virtex-5 devices. Regional clock resources can be used for PR designs on more recent FPGAs, such as Virtex-6 and Virtex-7.

It was shown that, for the target device and modules considered, Area Constraint 3 (i.e. where the allocated area is approximately 10% greater than the resources required) can achieve the lowest overall dynamic power consumption of all the area constraints defined. Furthermore, the results obtained demonstrate that overall dynamic and clock power consumption can be reduced by approximately 33% and 35% respectively via setting appropriate area constraints, when comparing the results for conventional and PR implementations with Area Constraint 3.

Based on the observed outcomes, the specification of RPs with 10% additional resources can be applied to the original floorplanning of DRP-PR architecture proposed in Chapter 6, so as to obtain a further reduction of dynamic power consumption. The results obtained demonstrate that the dynamic power consumption of the optimised architecture can be further reduced by 3.0%, 40.9%, 16.3% and 13.8% for the DUC, transform, mapper and clock divider RPs respectively. The bitstream size of each RP can also obtain further savings: the optimised bitstream required for total reconfiguration can be reduced by 35.7% compared to the equivalent obtained in the original floorplanning proposed in Chapter 6, and by 86.4% compared to the programmable multiple standards design method for the worst case scenario.

With regard to the static power consumption, the results demonstrate that there is a significant reduction in static power consumption via the PR technique attributed to time division multiplexing. Compared to the reduction in dynamic power consumption, the reduction of static power is the main contributor to the overall power savings. It is demonstrated that the proposed DRP-PR architecture can achieve a static power consumption reduction of at least 65% compared to the fixed multiple standards design method.

As discussed in Chapter 6, the proposed DRP-PR architecture is very advantageous in terms of hardware resource utilisation, design flexibility, functionality switching and expansibility. It can also reduce the power consumption of the entire system by reducing the number of clock oscillators required. In this chapter, the DRP-PR architecture was also demonstrated to reduce dynamic and static power consumption by exploiting these reconfigurable functionalities. Taking into account the discussion above, the proposed DRP-PR architecture can provide a number of benefits for the SDR system: low hardware utilisation, design flexibility, functionality switching with ease, a high degree of expansibility, and low power consumption.

# **Chapter 8**

# **Conclusions and Future Work**

## 8.1 Conclusions

The introduction of **Chapter 1** provided the motivation for targeting SDR systems to FPGAs, and reviewed related work. Customers obtain benefits from SDR: they are capable of receiving waveform expansion for emerging standards or service updating by downloading the relevant software, instead of purchasing new hardware.

Technology and communication background was introduced in **Chapter 2**. The concepts of SDR, including definition and architecture, were introduced first. A review of FPGA technology, which is viewed as the most promising programmable hardware to enable SDR systems, was included thereafter. The four communication standards (LTE, IEEE 802.16e, IEEE 802.11n and WCDMA) studied in this thesis were briefly introduced, and concepts of TV white space were also provided.

**Chapter 3** presented two dynamic reconfiguration technologies based on FPGA: PR and DRP. The principles and design considerations of PR were discussed at first, along with examples of reconfigurable filters with external PR and internal PR. Based on an analysis of, and performance comparison between, conventional and PR implementations, PR was demonstrated to perform function switching in much less time and thus to be better suited to the requirements of a real-time SDR system. The DRP architecture is able to change the various clock frequencies by switching the values of multiplier and divisor dynamically for a fixed input clock oscillator. The results demonstrate that a single clock oscillator is capable of generating a variety of clock frequencies using DRP technology. Therefore, it can reduce the number of clock oscillators required, and hence also reduce the power consumption and cost in the entire SDR system. It also improves the degree of flexibility in the design and implementation dramatically.
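The frequency synthesis principle summarised above, where a fixed input clock is scaled by a multiplier M and a divisor D, follows $f_{out} = f_{in} \cdot M / D$. A minimal sketch of how a suitable (M, D) pair might be searched for is given below. The function name, the 100 MHz oscillator and the permitted ranges (M from 2 to 33, D from 1 to 32) are assumptions made for illustration; the actual limits, including the permitted output frequency range, should be taken from the user guide of the target device.

```python
from fractions import Fraction


def best_multiplier_divisor(f_in_hz, f_target_hz,
                            m_range=range(2, 34), d_range=range(1, 33)):
    """Return (M, D, f_out) minimising |f_in * M / D - f_target|.

    The default ranges are assumed values, not device-verified limits.
    """
    best = None
    for m in m_range:
        for d in d_range:
            f_out = Fraction(f_in_hz) * m / d  # exact rational arithmetic
            error = abs(f_out - f_target_hz)
            # Strict comparison keeps the first (smallest M) pair on ties.
            if best is None or error < best[0]:
                best = (error, m, d, f_out)
    _, m, d, f_out = best
    return m, d, f_out


# Hypothetical example: derive 150 MHz from a single 100 MHz oscillator.
m, d, f_out = best_multiplier_divisor(100_000_000, 150_000_000)
print(m, d, float(f_out))  # 3 2 150000000.0
```

Switching standards then amounts to writing a new (M, D) pair through the DRP interface rather than providing a separate oscillator for each required clock.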

**Chapters 4** and **5** discussed the DUC designs for conventional standards and TV white space applications. **Chapter 4** started by presenting the principles and demands of the DUC. In addition, the concepts and measurements of the key performance metrics for the DUC architecture, including SEM, EVM and ACLR, were introduced. The filter design considerations for OFDM-based standards, e.g., LTE (5 MHz and 10 MHz), IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz) and IEEE 802.11n (20 MHz), together with a non-OFDM-based standard, WCDMA, were summarised. The DUC results obtained demonstrate that the DUC designs using the proposed filter design approach can achieve the requirements of each corresponding standard and provide good performance in terms of SEM, EVM and ACLR.

Since communication using TV white space spectrum can provide a number of benefits, such as better service availability and lower deployment cost compared to those using other frequency bands, conventional DUC designs were extended to the TV white space applications in **Chapter 5**. It was shown that the method of filter design considerations in **Chapter 4** can also be applied to the filter designs in **Chapter 5** with a small modification to comply with the more stringent SEM requirements for TV white space applications. The results of TV white space DUC designs for IEEE 802.11n (5 MHz, 10 MHz and 20 MHz), IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz) and LTE (5 MHz and 10 MHz) were shown to comply with both the SEMs proposed by the FCC and the original standards, and obtain good performance in terms of the EVM measurement. The hardware utilisation of the white space DUC designs was higher in terms of slices and DSP48Es compared to the equivalents in **Chapter 4**. This is because longer filters, increased coefficient and input data widths, a higher IF sampling rate, and a higher spectral purity requirement for the DDS component are required in the TV white space DUC designs.

In **Chapter 6**, a novel physical layer architecture for the transmitter in an SDR system was proposed based on a single FPGA device. The proposed architecture integrates the DUC designs discussed in **Chapters 4** and **5**, mapping schemes (QPSK, 16-QAM and 64-QAM) and transform modules (OFDM and spreader). These were implemented with two dynamic reconfiguration technologies, i.e. PR and DRP as discussed in **Chapter 3**. Consequently, 4 standards and 17 modes in total are supported, as follows:

- LTE (5 MHz and 10 MHz)
- IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz)
- IEEE 802.11n (20 MHz)
- WCDMA
- TV white space IEEE 802.11n (5 MHz, 10 MHz, and 20 MHz)
- TV white space IEEE 802.16e (3.5 MHz, 5 MHz, 7 MHz and 10 MHz)
- TV white space LTE (5 MHz and 10 MHz)
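The list above can be captured as a small lookup table, which also confirms the totals of 4 standards and 17 modes. The dictionary layout and the `"default"` label for the single WCDMA mode are illustrative choices and do not reflect how the thesis implementation stores its modes.

```python
SUPPORTED_MODES = {
    "LTE": ["5 MHz", "10 MHz"],
    "IEEE 802.16e": ["3.5 MHz", "5 MHz", "7 MHz", "10 MHz"],
    "IEEE 802.11n": ["20 MHz"],
    "WCDMA": ["default"],  # single mode; label chosen for illustration
    "TV white space IEEE 802.11n": ["5 MHz", "10 MHz", "20 MHz"],
    "TV white space IEEE 802.16e": ["3.5 MHz", "5 MHz", "7 MHz", "10 MHz"],
    "TV white space LTE": ["5 MHz", "10 MHz"],
}

# A TV white space variant counts as a mode of its underlying standard.
standards = {name.replace("TV white space ", "") for name in SUPPORTED_MODES}
mode_count = sum(len(modes) for modes in SUPPORTED_MODES.values())

print(len(standards), mode_count)  # 4 17
```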

The proposed DRP-PR architecture is an extension compared to many other SDR architectures because it contains not only baseband but also IF processing. Compared to other SDR design methods based on FPGA (namely *the fixed multiple standards design, programmable multiple standards design* and two design methods relying only on PR technology: *the multiple clock oscillator design* and *the normalised clock oscillator design*), the proposed DRP-PR design is clearly advantageous in terms of hardware utilisation, flexibility of design and function switching, expansibility and update-ability. It was demonstrated that, for the target device considered, the proposed architecture could achieve reductions of 83.7%, 81.1% and 72.7% in respect of slice, DSP48E and BRAM resource utilisation respectively, while requiring three fewer clock oscillator inputs compared to the fixed multiple standards design.

Reconfiguration overhead can be reduced by 78.9% with the proposed architecture compared to the programmable multiple standard design when the worst case is considered and all 4 RPs need to be changed, e.g. standard switching from WCDMA to TV white space IEEE 802.11n (20 MHz) with 64-QAM modulation. Three fewer clock oscillator inputs are required as a result of DRP technology compared to the multiple clock oscillator design. Moreover, this architecture provides the potential to readily integrate new standards or modes into the design.
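The switching behaviour described above can be sketched as a comparison of the current and target RM assignment of each RP: only the RPs whose RM differs are reloaded. The bitstream sizes are the original-floorplanning values reported in Table 7.5, whereas the function name and the RM assignments per standard (in particular the mapper and clock divider settings) are illustrative assumptions.

```python
# Partial bitstream size per RP in KB (original floorplanning, Table 7.5).
RP_BITSTREAM_KB = {"DUC": 411, "transform": 343, "mapper": 30, "clock_divider": 19}


def reconfiguration_plan(current, target):
    """Return the RPs that must be reloaded and the total download size in KB."""
    changed = [rp for rp in RP_BITSTREAM_KB if current[rp] != target[rp]]
    return changed, sum(RP_BITSTREAM_KB[rp] for rp in changed)


# Assumed RM assignments, for illustration only.
wcdma = {"DUC": "WCDMA", "transform": "Spreader",
         "mapper": "QPSK", "clock_divider": "1/8"}
ws_11n_20 = {"DUC": "WS 802.11n (20 MHz)", "transform": "OFDM",
             "mapper": "64-QAM", "clock_divider": "1/16"}

# Worst case: all four RPs differ, so all four partial bitstreams are loaded.
print(reconfiguration_plan(wcdma, ws_11n_20))

# Changing only the modulation scheme reloads the mapper RP alone.
ws_11n_20_qpsk = dict(ws_11n_20, mapper="QPSK")
print(reconfiguration_plan(ws_11n_20, ws_11n_20_qpsk))
```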

**Chapter 7** first reviewed the importance of power consumption in FPGA designs, and then presented power analysis methods for analysing the clock and dynamic power consumption of conventional and PR implementations. It was demonstrated that the PR technique can reduce dynamic power consumption via setting appropriate area constraints.

It was shown that, for the target device and modules considered, Area Constraint 3 (i.e. where the allocated area is 10% greater than the resources required) can achieve the lowest overall dynamic power consumption of all the area constraints defined. Furthermore, the results obtained demonstrate that overall dynamic and clock power consumption can be reduced by approximately 33% and 35% respectively via setting appropriate area constraints, when comparing the results for conventional and PR implementations with Area Constraint 3.

Based on the observed outcomes, the specification of RPs with 10% additional resources can be applied to the original floorplanning of DRP-PR architecture proposed in **Chapter 6**, so as to obtain a further reduction of dynamic power consumption. The results obtained demonstrate that the dynamic power consumption of the optimised architecture can be further reduced by 3.0%, 40.9%, 16.3% and 13.8% for the DUC, transform, mapper and clock divider RPs respectively. The bitstream size of each RP can also obtain further savings: the optimised bitstream required for total reconfiguration can be reduced by 35.7% compared to the equivalent obtained in the original floorplanning proposed in **Chapter 6**, and by 86.4% compared to the programmable multiple standards design method for the worst case scenario.

With regard to the static power consumption, the results demonstrate that there is a significant reduction in static power consumption via the PR technique attributed to time division multiplexing. Compared to the reduction in dynamic power consumption, the reduction of static power is the main contributor to the overall power savings. It is demonstrated that the proposed DRP-PR architecture can achieve a static power consumption reduction of at least 65% compared to the fixed multiple standards design method.

In conclusion, the proposed DRP-PR architecture can provide a number of benefits for the SDR system: low hardware utilisation, high degree of expansibility and updateability, functionality switching with ease, multiple standards and low power consumption, as illustrated in Figure 8.1.



Figure 8.1: Benefits Diagram of the Proposed DRP-PR Architecture.

• *Low Power Consumption*: Fewer clock oscillators are required as a result of the DRP technique, providing a reduction in overall system power consumption. Furthermore, PR technology can reduce static and dynamic power consumption significantly based on the module functionalities. In addition, PR can also reduce the power consumption during the reconfiguration process by downloading the partial bitstream files as opposed to fully reconfiguring the device. Therefore, the proposed architecture can achieve an SDR system with low power consumption.

• *Multiple Standards & Modes*: The proposed architecture can support 4 wireless communication standards and 17 modes in the baseband and IF processing components, as discussed previously in this chapter. The LTE and IEEE 802.16e standards are viewed as beyond 3G (3.9G) standards, while IEEE 802.11n incorporating MIMO techniques can be considered as the next generation WLAN standard. The TV white space application is an extension compared to the conventional wireless communication standards, as it can increase spectrum efficiency and provide more spectrum resource for wireless communication. In addition, wireless communication using TV white space spectrum can provide better performance and reduce deployment cost compared to the equivalents operating in the Gigahertz range. The study of TV white space is becoming increasingly popular, and a device which supports multiple standards and modes, including TV white space applications, can improve the function and competitiveness of SDR systems.

• *Low Hardware Utilisation*: The proposed architecture can achieve a significant reduction in hardware utilisation compared to conventional FPGA-based SDR systems. Furthermore, the proposed architecture can further reduce the hardware usage when additional standards and modes are supported. Low hardware utilisation means that all of the functionalities can be implemented on a small FPGA device and thus the size of entire SDR system can be reduced.

• *Standards Switching with Ease*: Standards switching can be achieved by downloading the corresponding partial bitstream files. A partial bitstream only contains configuration data and address information for the region in which the PR implementation is performed, and thus there is no startup process required after the partial bitstream is downloaded. The size of the partial bitstreams is relatively small. Even for the worst case discussed in Chapter 6, the sum of partial bitstream files from each RP is still much smaller than the size of a full configuration file. These two attractive features of PR can accelerate the reconfiguration process significantly. Therefore, the proposed architecture can meet the requirements of real-time SDR systems and allow the underlying hardware to swap functionalities flexibly and fast.

• *High expansibility and update-ability*: SDR systems are required to support a variety of standards and even emerging standards without acquiring new hardware. Consequently, expansibility and update-ability are further important factors against which to evaluate the SDR system performance. The proposed architecture has significant reprogrammability in both clock frequency and functionalities via the combination of DRP and PR technologies. These two dynamic reconfiguration technologies can cope with the requirements of new clock frequencies and functionalities when multiple standards or even emerging standards are integrated into the SDR system. In addition, defining RP areas with extra resource in the proposed architecture permits easier integration of additional, potentially more complex RMs in the future. Therefore, the proposed DRP-PR architecture has high expansibility and update-ability, in accordance with the core concept of SDR, which can prolong SDR platform lifetime and reduce the cost of maintenance and updating.

### 8.2 Future Work

For the proposed DRP-PR architecture to support multiple standards and modes in the transmitter chain, two factors can be improved so as to enhance the overall system performance of SDR in the future.

• *More complex IF processing modules*: Since an OFDM signal is the sum of many narrowband signals in the time domain, it has high PAPR compared to a single carrier signal. A larger and more expensive linear power amplifier is required to deal with signals with high PAPR. In addition, the power amplifier has to operate inefficiently because signals with high PAPR can introduce irregular and infrequent peaks [201] [202]. In order to reduce the cost of the power amplifier and increase its operational efficiency, CFR algorithm processing components can be integrated into the IF processing section to reduce the PAPR of the signal prior to the DAC [203] [204]. Digital Pre-Distortion (DPD) is another method of increasing power amplifier working efficiency by means of linearising the power amplifier [205]. Therefore, CFR and DPD processing components can be integrated into the IF processing section to improve the functionality and operating efficiency in the overall transmitter chain in the future. As a result, the future hierarchical design methodology for the SDR transmitter architecture is illustrated in Figure 8.2.



Figure 8.2: Future Hierarchical Design Methodology for SDR Transmitter Architecture.

The first layer is divided into two RPs: baseband and IF. The second layer derives from the further division of the first layer. The baseband RP consists of mapper and transform RPs, and the IF RP is composed of DUC, CFR and DPD RPs, in accordance with the functions in the transmitter chain.

CFR and DPD RPs are added after the DUC RP. Since the configurations of CFR and DPD might change according to the modes (as for the DUC modules), the functionalities of DUC, CFR and DPD are divided into three independent RPs to maintain the high degree of design and implementation flexibility. A variety of partial bitstreams for the three RPs can be generated according to different standards and modes, which enables the SDR system to have a powerful IF processing capability.
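The PAPR argument that motivates the CFR RP can be illustrated numerically. The sketch below compares the PAPR of a single carrier QPSK sequence with that of one OFDM symbol built from the same data, and then applies hard clipping as the simplest possible form of crest factor reduction. All names and parameters (64 subcarriers, a clipping ratio of 1.5 times the RMS level, no oversampling) are illustrative assumptions; this is not the CFR algorithm of [203] [204], and practical CFR schemes also filter the clipped signal to control spectral regrowth.

```python
import cmath
import math
import random


def papr_db(samples):
    """Peak-to-average power ratio of a complex sequence, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


def ofdm_symbol(subcarrier_symbols):
    """Time-domain OFDM symbol via a direct inverse DFT (no oversampling)."""
    n = len(subcarrier_symbols)
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * t / n)
            for k, x in enumerate(subcarrier_symbols)) / n
        for t in range(n)
    ]


def hard_clip(samples, ratio):
    """Limit each sample magnitude to ratio * RMS while keeping its phase."""
    rms = math.sqrt(sum(abs(s) ** 2 for s in samples) / len(samples))
    limit = ratio * rms
    clipped = [s if abs(s) <= limit else s * (limit / abs(s)) for s in samples]
    return clipped, limit


random.seed(1)  # fixed seed so that repeated runs use the same data
qpsk = [complex(random.choice((-1, 1)), random.choice((-1, 1))) / math.sqrt(2)
        for _ in range(64)]

single_carrier = qpsk           # constant-envelope symbols
ofdm = ofdm_symbol(qpsk)        # the same data on 64 subcarriers
clipped, limit = hard_clip(ofdm, ratio=1.5)

print(f"single carrier PAPR: {papr_db(single_carrier):.2f} dB")
print(f"OFDM PAPR:           {papr_db(ofdm):.2f} dB")
print(f"clipped OFDM PAPR:   {papr_db(clipped):.2f} dB")
```

Placing such a stage in its own RP, as proposed above, would allow the clipping threshold or the entire CFR algorithm to be exchanged per standard without touching the DUC RP.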

• *Hierarchical Design with Embedded Development Kit (EDK)*: The hierarchical design methodology discussed in Chapter 6 was implemented with the external PR method. This means that management of downloading various partial bitstream files is controlled by a PC or DSP, rather than the FPGA itself. Internal PR controlled by an embedded processor allows modules to be swapped much faster compared to the external PR. In addition, an embedded processor could run complex algorithms to manage downloading of various partial bitstream files automatically according to user requirements and the environment. Therefore, the hierarchical design methodology discussed in this chapter will be implemented with internal PR in the future.

## 8.3 Final Remarks

The programmable SDR system is capable of implementing a variety of communication standards and is attractive to consumers. With the development of FPGA technology, FPGAs will provide more powerful performance at reduced cost, enabling SDR systems to be widely applied in civilian areas in the future.

# References

- [1] J. H. Reed (editor), Software Radio: A Modern Approach to Radio Engineering, Prentice Hall 2002.
- [2] J. Mitola, "The Software Radio Architecture", *IEEE Communications Magazine*, Vol. 33, May 1995, pp 26-38.
- [3] J. Mitola, "Software Radios: Survey, Critical Evaluation and Future Directions", *IEEE Aerospace and Electronic Systems Magazine*, Vol. 8, Issue 8, Apr 1993, pp 25-36.
- [4] SDR Forum, *Software Defined Radio Technology for Public Safety*. [online], SDR Forum, April 2006. Available at: <www.ece.vt.edu/swe/chamrad/psi/SDRF-06-A-0001-V0.00.pdf>.
- [5] M. Palkovic, P. Raghavan, M. Li, A. Dejonghe, L. Van Der Perre and F. Catthoor, "Future Software-Defined Radio Platforms and Mapping Flows", *IEEE Signal Processing Magazine*, Vol. 27, Issue 2, March 2010, pp. 22-33.
- [6] V. Giannini, J. Craninckx, S. D'Amico and A. Baschirotto, "Flexible baseband analog circuits for software-defined radio front-ends", *IEEE Journal of Solid-State Circuits*, Vol. 42, No. 7, July 2007, pp. 1501-1512.
- [7] P. Isomaki, N. Avessta, An Overview of Software Defined Radio Technologies, [online], Turku: Turku Center for Computer Science (TUCS), December 2004. Available at:<http://tucs.fi/publications/attachment.php/?fname=TR652.pdf>.
- [8] F.K. Jondral, "Software-Defined Radio–Basics and Evolution to Cognitive Radio", EURASIP Journal on Wireless Communications and Networking, Vol. 2005, Issue 3, August 2005, pp. 275-283.
- [9] R.H. Hosking, *Software Defined Radio Handbook Ninth Edition*, [online], Pentek, Inc, 2011. Available at:<www.pentek.com/sftradhandbook/SftRadHandbook.cfm>
- [10] T. Ulversoy, "Software Defined Radio: Challenges and Opportunities", *IEEE Communications Surveys & Tutorials*, Vol. 12, Issue 4, Fourth Quarter 2010, pp. 531-550.
- [11] Restructured JTRS Program Reduces Risk, but Significant Challenges Remain, [online], United States Government Accountability Office, September 2006. Available at:<www.gao.gov/assets/260/251421.pdf>
- [12] C. Liang and X.M. Huang, "Mapping Parallel FFT Algorithm onto SmartCell Coarse-Grained Reconfigurable Architecture", *IEICE Transactions on Electronics*, Vol. E93-C, Issue 3, March 2010, pp.407-415.
- [13] M. Cummings and S. Haruyama, "FPGA in the Software Radio", *IEEE Communications Magazine*, Vol.37, Issue 2, February, 1999, pp. 108-112.
- [14] Xilinx, Spartan-3 FPGA Family Data Sheet, [online], Xilinx, Inc, December 2009. Available at:<www.xilinx.com/support/documentation/data\_sheets/ds099.pdf>.

- [15] Xilinx, Spartan-3 FPGA Generation FPGA User Guide, [online], Xilinx, Inc, June 2011. Available at:<www.xilinx.com/support/documentation/user\_guides/ug331.pdf>.
- [16] The Embedded Primer, Steepest Ascent LTD.
- [17] T.S. Hall and J.O. Hamblen, "System-on-a Programmable-Chip Development Platforms in the Classroom", *IEEE Transactions on Education*, Vol.47, Issue 4, November 2004, pp. 502-507.
- [18] K. DeHaven, Extensible Processing Platform Ideal Solution for a Wide Range of Embedded Systems, [online], Xilinx, Inc, April 2010. Available at:<www.xilinx.com/support/documentation/white\_papers/wp369\_Extensible\_Processing\_Platfom\_Overview.pdf>.
- [19] F. Khan, LTE for 4G Mobile Broadband: Air Interface Technologies and Performance, Cambridge University Press, 2009.
- [20] S. Gultchev, K. Moessner, D. Thilakawardana, T. Dodgson and R. Tafazolli, *Evaluation of Software Defined Radio Technology*, [online], OFCOM. Available at:<stakeholders.ofcom.org.uk/ binaries/research/technology-research/eval.pdf>
- [21] Xilinx, Embedded System Tools Reference Manual, UG111(v13.4), [online], Xilinx, Inc, January 2012. Available at:<www.xilinx.com/support/documentation/sw\_manuals/xilinx13\_4/est\_rm.pdf>.

- [22] V. Giannini and J. Craninckx and A. Baschirotto, Baseband Analog Circuits for Software Defined Radio, Springer 2008.
- [23] H. Holma and A. Toskala, LTE for UMTS: Evolution to LTE-Advanced Second Edition, Wiley 2011
- [24] L. Hanzo, Y.J. Akhtman, L. Wang and M. Jiang, *MIMO-OFDM for LTE, Wi-Fi and WiMAX:* Coherent versus Non-coherent and cooperative Turbo-transceivers, Wiley 2011
- [25] S. Ahmadi, Introduction to Mobile WiMAX Radio Access Technology: PHY and MAC Architecture, [online], Intel Inc, December 2006, Available at:<vivonets.ece.ucsb.edu/ ahmadiUCSB\_slides\_Dec7.pdf>.
- [26] Wikipedia-http://en.wikipedia.org/wiki/Wireless\_LAN.
- [27] H. Holma and A. Toskala, WCDMA for UMTS: HSPA Evolution and LTE, Wiley 2010.
- [28] N. Srivastava and S. Hanson, Expanding Wireless Communications with "White Space". [online], Dell Inc, October 2008. Available at:<i.dell.com/sites/content/business/solutions/whitepapers/en/ Documents/communications-white-spaces.pdf>
- [29] David Dye, Partial Reconfiguration of Xilinx FPGAs using ISE Design Suite, WP374 (v1.1), [online], Xilinx Inc, July 2011, Available at:<www.xilinx.com/support/documentation/ white\_papers/wp374\_Partial\_Reconfig\_Xilinx\_FPGAs.pdf>
- [30] Xilinx, *Partial Reconfiguration User Guide*. [online], Xilinx Inc, October 2010. Available at:<www.xilinx.com/support/documentation/sw\_manuals/xilinx12\_3/ug702.pdf>.
- [31] E. Eto, *Difference-Based Partial Reconfiguration*. [online], Xilinx Inc, December 2007. Available at:<www.xilinx.com/support/documentation/application\_notes/xapp290.pdf>
- [32] Xilinx, Early Access Partial Reconfiguration User Guide. [online], Xilinx Inc, September 2008. Available at:<kom.aau.dk/~ylm/Teaching/Spring2011/RC/mm3/ug208\_92.pdf>
- [33] Xilinx, *Partial Reconfiguration User Guide (v14.1)*, [online], Xilinx Inc, April 2012. Available at:<www.xilinx.com/support/documentation/sw\_manuals/xilinx14\_1/ug702.pdf>
- [34] Pao-Ann Hsiung, Marco D. Santambrogio, Chun-Hsian Huang, Reconfigurable System Design and Verification. Boca Raton: CRC Press 2009.
- [35] Xilinx, Virtex-5 FPGA Configuration User Guide, [online], Xilinx Inc, August 2010. Available at:<www.xilinx.com/support/documentation/user\_guide/ug191.pdf>

- [36] Xilinx, XPS SYSACE (System ACE) Interface Controller, [online], Xilinx Inc, December 2009. Available at:<www.xilinx.com/support/documentation/ip\_documentation/xps\_sysace.pdf>
- [37] Xilinx, *Virtex-5 FPGA User Guide*. [online], Xilinx Inc, May 2010. Available at:<www.xilinx.com/support/documentation/user\_guides/ug190.pdf>.
- [38] G. Hueber, R. Stuhlberger and A. Springer, "Concept for an Adaptive Digital Front-End for Multi-Mode Wireless Receivers", In *Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS)*, Seattle, USA, May 2008, pp. 89-92.
- [39] K. Lim, S.H. Lee, S.K. Min, S.M. Ock, M.W. Hwang, C.H. Lee, K.G. Kim and S.W. Han, "A Fully-Integrated Direct Conversion Receiver for CDMA and GPS applications", *IEEE Solid-State Circuits*, Vol. 41, Issue 11, November 2006, pp. 2408-2416.
- [40] A. Tasic, W. Serdijn and J.R. Long, "Adaptive Multi-Standard Circuits and Systems for Wireless Communications", *IEEE Circuits and Systems Magazine*, Vol. 6, Issue 1, First Quarter 2006, pp. 29-37.
- [41] J. Craninckx, M. Liu, D. Hauspie, V. Giannini, T. Kim, J. Lee, M. Libois, D. Debaillie, C. Soens, M. Ingels, A. Baschirotto, J. Van Driessche, L. Van der Perre and P. Vanbekbergen, "A Fully Reconfigurable Software-Defined Radio Transceiver in 0.13 μm CMOS", In *Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC)*, San Francisco, USA, February 2007.
- [42] G. Hueber, L. Maurer, G. Strasser, R. Stuhlberger, K. Chabrak and R. Hagelauer, "The Design of a Multi-Mode/Multi-System Capable Software Radio Receiver", In *Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS)*, Island of Kos, May 2006
- [43]K.S. Yeung and S.C. Chan, "The Design and Multiplier-less Realization of Software Radio Receivers with Reduced System Delay", *IEEE Transactions on Circuits and Systems I*; Vol. 51, Issue 12, December 2004, pp. 2444-2459.
- [44] C. Martelli, R. Reutemann, C. Benkeser and Q.T. Huang, "A 50mW HSDPA Baseband Receiver ASIC with Multimode Digital Front-End", In *Proceedings of IEEE International Solid-State Circuits Conference (ISSCC)*, San Francisco, USA, February 2007.
- [45] I.H. Sohn, E.R. Jeong and Y.H. Lee, "Data-Aided Approach to I/Q Mismatch and DC Offset Compensation in Communication Receivers", *IEEE Communications Letters*, Vol. 6 Issue 12, December 2002, pp. 547-549
- [46] V.W. Leung, L.E. Larson and P.S. Gudem, "Improved Digital-IF Transmitter Architecture for Highly Integrated W-CDMA Mobile Terminals", *IEEE Transactions on Vehicular Technology*, Vol. 54, Issue 1, January 2005, pp. 20-32.
- [47] N. Ghittori, A. Vigna, P. Malcovati, S. D'Amico and A. Baschirotto, "An IEEE 802.11 and 802.16 WLAN Wireless Transmitter Baseband Architecture with a 1.2-V, 600-MS/s, 2.4-mW DAC", *Analog Integrated Circuits and Signal Processing*, Vol. 59, Issue 3, June 2009, pp. 231-242.
- [48] W. Tuttlebee (editor), Software Defined Radio: Enabling Technologies, Chichester, UK, John Wiley & Sons, 2002.
- [49] B. G. Yu, J. U. Kim and S. W. Ra, "Implementation of A Digital IF Transceiver for SDR-based WiMAX Base Station", In *Proceedings of the International Symposium on Consumer Electronics*, Irving, TX, USA, June 2007, pp. 1-6.
- [50] V.W. Leung, L. Larson and P. Gudem, "Digital-IF WCDMA Handset Transmitter IC in 0.25 μm SiGe BiCMOS", *IEEE Journal of Solid-State Circuits*, Vol. 39, Issue 12, December 2004, pp. 2215-2225.
- [51] P. Kenington, RF and Baseband Techniques for Software Defined Radio. Norwood, MA: Artech House, 2005.

- [52] Z.A. Shamsan, L. Faisal, S.K. Syed-Yusof and T.A. Rahman, "Spectrum Emission Mask for Coexistence between Future WiMAX and Existing Fixed Wireless Access Systems", WSEAS Transactions on Communications, Vol. 7, Issue 6, June 2008, pp. 627-636.
- [53] E. Dahlman, S. Parkvall and J. Skold, *4G: LTE/LTE-Advanced for Mobile Broadband*, The Boulevard, Langford Lane, Kidlington, Oxford, UK, Elsevier, 2011.
- [54] IEEE Standard for Local and metropolitan area networks Part 16: Air Interface for Fixed Broadband Wireless Access Systems, 2004.
- [55] C.M. Zhao and R.J. Baxley, "Error Vector Magnitude Analysis for OFDM Systems", In Proceedings of Asilomar Conference on Signals, Systems and Computers (ACSSC), Pacific Grove, CA, November 2006, pp. 1830-1834.
- [56] M.D. McKinley, K.A. Remley, M. Myslinski, J.S. Kenney, D. Schreurs and B. Nauwelaers, "EVM Calculation for Broadband Modulated Signals", In *Proceedings of 64th Automatic RF Techniques Group (ARFTG) Conference*, Orlando, USA, December 2004, pp.45-52.
- [57] J.L. Pinto and I. Darwazeh, "Phase Distortion and Error Vector Magnitude for 8-PSK Systems". In Proceedings of London Communications Symposium (LCS), London, UK, 2000.
- [58] H. Holma and A. Toskala (editor), *LTE for UMTS-OFDMA and SC-FDMA Based Radio Access*, Chichester, UK, John Wiley & Sons, 2009
- [59] R.A. Shafik, M.S. Rahman, A.R. Islam and N.S. Ashraf, "On the Error Vector Magnitude As a Performance Metric and Comparative Analysis". In *Proceedings of 2nd International Conference on Emerging Technologies (ICET)*, Peshawar, Pakistan, November 2006.
- [60] H.A. Mahmoud and H. Arslan, "Error Vector Magnitude to SNR Conversion for Nondata-Aided Receivers", *IEEE Transactions on Wireless Communications*, Vol. 8, No. 5, May 2009, pp. 2694-2704.
- [61] Altera, *Accelerating DUC&DDC System Designs for WiMAX*. [online], Altera Inc, May 2007. Available at:<www.altera.com/literature/an/an421.pdf>
- [62] B. Dusza, K. Daniel and C. Wietfeld, "Error Vector Magnitude Measurement Accuracy and Impact on Spectrum Flatness Behavior for OFDM-based WiMAX and LTE Systems". In Proceedings of the 5th International Conference on Wireless Communications, Networking and Mobile Computing (WiCOM), Beijing, China, September 2009.
- [63] K.M. Gharaibeh, K.G. Gard, and M.B. Steer, "Accurate Estimation of Digital Communication System Metrics-SNR, EVM and ρ in a Nonlinear Amplifier Environment". In Proceedings of 64th Automatic RF Techniques Group (ARFTG) Conference, Orlando, USA, December 2004, pp.41-44.
- [64] E. Acar, S. Ozev and K.B. Redmond, "Enhanced Error Vector Magnitude (EVM) Measurements for Testing WLAN Transceivers". In *Proceedings of International Conference on Computer-Aided Design (ICCAD)*, San Jose, CA, USA, November 2006.
- [65] R. Senguttuvan, S. Bhattacharya and A. Chatterjee, "Efficient EVM Testing of Wireless OFDM Transceivers Using Null Carriers", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 17, No. 6, June 2009, pp. 803-814.
- [66] M. Meurer, P.W. Baier, T. Weber, Y. Lu and A. Papathanassiou, "Joint Transmission: Advantageous Downlink Concept for CDMA Mobile Radio Systems using Time Division Duplexing", *Electronics Letters*, Vol. 36, Issue 10, May 2000, pp. 900-901.
- [67] S. Sesia, I. Toufik and M. Baker (editor), LTE-The UMTS Long Term Evolution From Theory to Practice, Chichester, UK, John Wiley & Sons, 2009.
- [68] 3GPP TR 36.804 (Release 8), October 2007
- [69] Xilinx, Virtex-5 FPGA XtremeDSP Design Considerations User Guide (v3.4), [online], Xilinx, Inc, June 2010. Available at:<www.xilinx.com/support/documentation/user\_guides/ug193.pdf>

- [70] H. Tarn, E. Hemphill and D. Hawke, *3GPP LTE Digital Front End Reference Design (v1.0)*, Xilinx Inc, October 2008.
- [71] H. Tarn, K. Neilson, R. Uribe and D. Hawke, Designing Efficient Wireless Digital Up and Down Converters Leveraging CORE Generator and System Generator, [online], Xilinx, October 2007. Available at:<http://www.xilinx.com/support/documentation/application\_notes/xapp1018.pdf>
- [72] N.M. Zawawi, M.F. Ain, S.I.S. Hassan, M.A. Zakariya, C.Y. Hui and R. Hussin, "Implementing WCDMA Digital Up Converter In FPGA". In *Proceedings of IEEE International RF and Microwave Conference*, Kuala Lumpur, Malaysia, December 2008, pp. 91-95.
- [73] W. Wang, Y.F. Zeng and Y. Yan, "Efficient Wireless Digital Up Converters Design Using System Generator". In Proceedings of IEEE, 9th International Conference on Signal Processing (ICSP), Beijing, China, June, 2008, pp. 443-446.
- [74] F.Y. Lin, W.M. Qiao, X.X. Jiao, L. Jing and Y.H. Ma, "Efficient Design of Digital Up Converter for WCDMA in FPGA Using System Generator". In Proceedings of IEEE International Conference on Information Engineering and Computer Science (ICIECS), Wuhan, China, December 2009, pp. 1-4.
- [75] R. Mehra, "FPGA Based Efficient WCDMA DUC for Software Radios", Cyber Journals: Multidisciplinary Journals in Science and Technology, Journal of Selected Areas in Microelectronics (JSAM), December 2010, pp. 8-13.
- [76] R. Mehra and S. Devi, "Area Efficient Reconfigurable Digital Up Converter for Software Defined Radio Based Wireless Systems", *Journal of Communication and Computer*, Vol. 7, No. 7, July 2010, pp.28-32.
- [77] C. Singh, M.S. Patterh and S. Sharma, "Digital Up Converter for WiMAX System", International Journal of Engineering Science and Technology, Vol.2, No. 9, 2010, pp. 4570-4574
- [78] G.M. Sreerama Reddy and P. Chandrashekara Reddy, "Design and FPGA Implementation of High Speed, Low Power Digital Up Converter for Power Line Communication Systems", *European Journal of Scientific Research*, Vol.25, No. 2, 2009, pp.234-249.
- [79] Lattice, *Digital Up/Down Converter (DDC/DUC) for WiMAX Systems*, [online], Lattice Inc, May 2008. Available at:<www.latticesemi.com/documents/RD1036\_User\_guide.pdf>
- [80] Lattice, *The FPGA as a Flexible and Low-Cost Digital Solution for Wireless Base Station*, [online], Lattice, Inc, March 2007. Available at:<www.latticesemi.com/documents/doc23895x44.pdf>
- [81] Altera, *Digital IF Modem Design with the DSP Builder Advanced Blockset*, [online], Altera Inc. August 2008. Available at:<www.altera.com/literature/an/an544.pdf>
- [82] R. Lyons, Understanding Digital Signal Processing, 2nd ed. New Jersey: Bernard Goodwin, 2004
- [83] J. Vankka, Digital Synthesizers and Transmitters for Software Radio, Springer, 2005
- [84] Numerical Controlled Oscillator, the DSPedia, Steepest Ascent LTD.
- [85] Xilinx, *LogiCORE IP DDS Compiler v4.0 Data Sheet*, [online], Xilinx Inc. March 2011. Available at:<www.xilinx.com/support/documentation/ip\_documentation/dds\_ds558.pdf>
- [86] D. Astely, E. Dahlman, A. Furuskar, Y. Jading, M. Lindstrom and S. Parkvall, *LTE: The Evolution of Mobile Broadband*, [online], Ericsson Inc. Available at:<www.ericsson.com/res/thecompany/docs/journal\_conference\_papers/atsp/lte\_evolution\_of\_mobile\_broadband.pdf>
- [87] Motorola, Inc. Long Term Evolution (LTE): Overview of LTE Air-Interface Technical White Paper, [online] Motorola, Inc. Available at:<business.motorola.com/experiencelte/pdf/LTEAirInterfaceWhitePaper.pdf>

- [88] Agilent, *Measuring ACLR Performance in LTE Transmitters*, [online], Agilent Inc. Available at:<cp.literature.agilent.com/litweb/pdf/5990-5089EN.pdf>
- [89] R. Sperlich, Y. Park, G. Copeland and J.S. Kenney, "Power Amplifier Linearization with Digital Predistortion and Crest Factor". In *Proceedings of IEEE MTT-S International Microwave Symposium Digest*, Fort Worth, TX, USA, June 2004, Vol. 2, pp. 669-672.
- [90] R.J. Baxley, C.M. Zhao and G.T. Zhou, "Constrained Clipping for Crest Factor Reduction in OFDM", *IEEE Transactions on Broadcasting*, Vol.52, Issue 4, December 2006, pp.570-575.
- [91] Altera, Crest Factor Reduction for OFDMA Systems, [online], Altera Inc. Available at:<www.altera.com/literature/an/an475.pdf>
- [92] J.G. Andrews, A. Ghosh and R. Muhamed, Fundamentals of WiMAX Understanding Broadband Wireless Networking, Prentice Hall, 2007.
- [93] E. Hemphill, H. Tarn, D. Hawke and J. Seoane, *WiMAX Digital Front End Reference Design*, Xilinx, Inc, June 2008.
- [94] IEEE Standard for Local and Metropolitan Area Networks: Part 16: Air Interface for Fixed Broadband Wireless Access Systems, 2004.
- [95] IEEE Standard for Local and Metropolitan Area Networks: Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems, Amendment 2: Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands and Corrigendum 1, 2005.
- [96] S. Kassim and F. Malek, "Design and Measurement of 2.5 GHz Driver Amplifier for IEEE 802.16e Mobile WiMAX using a Small-Signal Method". In Proceedings of International Conference on Energy, Environment, Economics, Devices, Systems, Communications and Computers, Iasi, Romania, July 2011, pp. 166-170.
- [97] Keithley, An Introduction to Orthogonal Frequency Division Multiplex Technology, [online], Keithley Inc. Available at:<www.ieee.li/pdf/viewgraphs/introduction\_orthogonal\_frequency\_division\_multiplex.pdf>

- [98] ETSI EN 301 021 final draft (V1.6.1), Fixed Radio Systems, Point-to-multipoint equipment, Time Division Multiple Access (TDMA), Point-to-multipoint digital radio systems in frequency bands in the range 3 GHz to 11 GHz, 2003.
- [99] IEEE Standard for Information technology Telecommunications and information exchange between systems Local and Metropolitan Area Networks Specific requirements Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 5: Enhancements for Higher Throughput, October, 2009.
- [100]3GPP TS 25.104 (V6.1.0), 3rd Generation Partnership Project Technical Specification Group Radio Access Network BS Radio Transmission and Reception (FDD), Release 6, March 2003.
- [101]3GPP TS 25.141 (V4.9.0), 3rd Generation Partnership Project Technical Specification Group Radio Access Network Base Station Conformance Testing (FDD), Release 5, September 2004.
- [102]3GPP TS 25.101 (V6.19.0), 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; User Equipment (UE) radio transmission and reception (FDD), Release 6, March 2009.
- [103]N.S. Alagha and P. Kabal, "Generalized Raised-Cosine Filters", *IEEE Transactions on Communications*, Vol.47, Issue 7, July 1999, pp. 989-997.
- [104]P.V. Rao, C.R. Prasanna and S. Ravi, "Design and ASIC Implementation of Root Raised Cosine Filter", *European Journal of Scientific Research*, Vol.31, No.3, 2009, pp. 319-328.
- [105]E. Hemphill, H. Tarn, M. Pecot, D. Hawke and J. Seoane, *High Density WCDMA Digital Front End Reference Design*, April 2007.

- [106]R. Tanner and J. Woodward, WCDMA: Requirements and Practical Design, Wiley, 2004.
- [107]R.G. Lyons, Understanding Digital Signal Processing, Second Edition, Prentice Hall, March, 2004.
- [108]F. Mlinarsky, "Utilizing White Spaces for Broadband Access", Presentation at the 4GWE Conference, Miami, USA, January 2010. Available at:<octoscope.com/English/Collaterals/Presentations/octoScope\_WhiteSpace\_20100122.pdf>
- [109]M. Nekovee, "A Survey of Cognitive Radio Access to TV White Spaces", International Journal of Digital Multimedia Broadcasting, Vol. 2010, April 2010, pp.1-11
- [110]S.J. Shellhammer, A.K. Sadek and W. Zhang, "Technical Challenges for Cognitive Radio in the TV White Space Spectrum". In *Proceedings of UCSD Information Theory and Applications (ITA) Workshop*, San Diego, CA, USA, January 2009.
- [111]R. Buczkiewicz, *Understanding TV White Spaces*, [online] LS Research Inc. Available at:<www.lsr.com/downloads/news/2011/understanding-TV-White-Spaces.pdf>
- [112]Motorola, Inc. TV White Space Position Paper Fixed TV White Space Solutions for Wireless ISP Network Operators, [online] Motorola, Inc. Available at:<www.techrepublic.com/whitepapers/tvwhite-space-position-paper-fixed-tv-white-space-solutions-for-wireless-isp-network-operators/1115419>
- [113]J. Mosenthal, B. Nleya and N. Manthoko, "Broadband / Future Generation Network Services Deployment in Rural and Remote Areas". In *Proceedings of 2nd International Conference on Adaptive Science Technology (ICAST)*, Accra, Ghana, January 2009, pp. 128-132
- [114]J. Riding, J. Ellershaw, A. Tran, L. Guan and T. Smith, "Economics of Broadband Access Technologies for Rural Areas", In *Proceedings of Conference on Optical Fiber Communication*, San Diego, CA, March 2009, pp. 1-3.
- [115]F. Darbari, M. Brew, S. Weiss and R.W. Stewart, "Practical Aspects of Broadband Access for Rural Communities using a Cost and Power Efficient Multi-Hop / Relay Network". In *Proceedings of GLOBECOM Workshops (GCWkshps)*, Miami, FL, USA, December 2010, pp. 731-735.
- [116]Second Report and Order and Memorandum Opinion and Order In the Matter of Unlicensed Operation in the TV Broadcast Bands. Additional Spectrum for Unlicensed Devices Below 900 MHz and in the 3 GHz Band. Federal Communications Commission, Document 08-260, November 2008.
- [117]A. Gomes, H. Alves, P. Marques, J. Rodriguez, J. Mwangoka, R. Dionisio, F. Alves, G. Kormentzas, G. Mastorakis, E. Pallis, A. Bourdena, T. Forde, L. Doyle, L. Dasilva, S. Roesseau, H. Aiache, D. Lavaux, S. Stavrou, L. Kanaris, E. Charalambos, J. Kubasik, G. Schuberth, J. Outters and R. Neudel, *Cognitive Radio Systems for Efficient Sharing of TV White Spaces in European Context, FP7 ICT-2009.1.1, D2.1 European TV White Spaces Analysis and COGEU use-cases,* [online]. Available at:<www.ict-cogeu.eu/pdf/COGEU D2 1%20(ICT 248560).pdf>
- [118]P. Bahl, R. Chandra, T. Moscibroda, R. Murty and M. Welsh, "White Space Networking with Wi-Fi Like Connectivity". ACM SIGCOMM Computer Communication Review, Vol.39, Issue 4, pp. 27-38, 2009.
- [119]S. Narlanka, R. Chandra, P. Bahl and J.I. Ferrell, "A Hardware Platform for Utilizing TV Bands With a Wi-Fi Radio". In *Proceedings of 15th IEEE LAN/MAN Workshop*, Princeton, NJ, USA, June 2007.
- [120]Y. Xiao, "IEEE 802.11n: Enhancements for Higher Throughput in Wireless LANs", IEEE Wireless Communications, Vol. 12, Issue 6, December 2005, pp.82-91.
- [121]Y.X. Lin and W.S. Wong, "Frame Aggregation and Optimal Frame Size Adaptation for IEEE 802.11n WLANs". In *Proceedings of IEEE Global Telecommunications Conference*, San Francisco, CA, USA, November 2006, pp. 1-6.

- [122]D. Skordoulis, Q. Ni, H.H. Chen, A.P. Stephens, C.W. Liu and A. Jamalipour, "IEEE 802.11n MAC frame aggregation mechanisms for next-generation high-throughput WLANs", *IEEE Wireless Communications*, Vol.15, Issue 1, February 2008, pp.40-47.
- [123]S.W. Oh, A.A.S. Naveen, Y.H. Zeng, V.P. Kumar, T.P.C. Le, K. Kua and W.Q. Zhang, "White-Space Sensing Device for Detecting Vacant Channels in TV Bands". In Proceedings of 3rd International Conference on Cognitive Radio Oriented Wireless Networks and Communication (CROWNCOM), Singapore, May 2008
- [124]S.W. Oh, W.Q. Zhang, T.P.C. Le, Y.H. Zeng, A.A. Phyu and A.A.S. Naveen, "TV White-Space Device Prototype Using Covariance-Based Signal Detection". In *Proceedings of IEEE Dynamic Spectrum Access Networks (DySPAN)*, Chicago, USA, October 2008.
- [125]K. Kim, J. Min, S. Hwang, S. Lee, K.Kim and H. Kim, "A CR Platform for Applications in TV Whitespace Spectrum". In Proceedings of 3rd International Conference on Cognitive Radio Oriented Wireless Networks and Communication (CROWNCOM), Singapore, May 2008
- [126]M.A. Rahman, C.Y. Song and H. Harada, "Development of a TV White Space Cognitive Radio Prototype and its Spectrum Sensing Performance". In Proceedings of 6th International Conference on Cognitive Radio Oriented Wireless Networks and Communication (CROWNCOM), Osaka, Japan, June 2011, pp. 231-235.
- [127]Xilinx, *LogiCORE IP FIR Compiler v5.0 Data Sheet*, [online], Xilinx Inc. Available at:<www.xilinx.com/support/documentation/ip\_documentation/fir\_compiler\_ds534.pdf>
- [128]H. Arslan (editor), Cognitive Radio, Software Defined Radio and Adaptive Wireless System, Springer 2007.
- [129]T.J. Rouphael, *RF and Digital Signal Processing for Software Defined Radio A Multi-Standard Multi-Mode Approach*, Elsevier 2009.
- [130]M. Dillinger, K. Madani and N. Alonistioti, Software Defined Radio Architectures, Systems and Functions, Wiley 2003.
- [131]W.H.W. Tuttlebee (editor), Software Defined Radio Baseband Technologies for 3G Handsets and Basestations, Wiley 2004.
- [132]J. Mitola, Software Radio Architecture Object-Oriented Approaches to Wireless Systems Engineering, Wiley 2000.
- [133]Y. Lin, H. Lee, M. Woh, Y. Harel, S. Mahlke, T. Mudge, C. Chakrabarti and K. Flautner, "SODA: A Low-Power Architecture for Software Radio". In *Proceedings of 33rd International Symposium on Computer Architecture (ISCA)*, Boston, MA, USA, June 17-21, 2006.
- [134]A. Dutta, D. Saha, D. Grunwald and D. Sicker, "An Architecture for Software Defined Cognitive Radio". In Proceedings of ACM/ IEEE Symposium on Architectures for Networking and Communications Systems, La Jolla, CA, USA October 2010.
- [135]GnuRadio: http://www.gnu.org/software/gnuradio/
- [136]A. Niktash, H. Parizi, A.H. Kamalizad and N. Bagherzadeh, "A Reconfigurable FEC Processor for Viterbi, Turbo, Reed-Solomon and LDPC Coding". In *Proceedings of IEEE Wireless Communications & Networking Conference*, Las Vegas, USA, March 2008, pp. 606-610.
- [137]L. Rizzo and L. Vicisano, "Reliable Multicast Data Distribution Protocol Based on Software FEC Techniques". In Proceedings of the Fourth IEEE Workshop on the Architecture and Implementation of High Performance Communication Systems (HPCS'97), Chalkidiki, Greece, June 1997.
- [138]L. Rizzo and L. Vicisano, "Effective Erasure Codes for Reliable Computer Communication Protocols", Computer Communication Review, Vol.27, No.2, April 1997, pp.24-36.

- [139]L. Rizzo, "On the Feasibility of Software FEC", *DEIT Tech Report*, January 1997. [online], Available at:<http://www.iet.unipi.it/~luigi/softfec.ps>.
- [140]P. Sedcole, B. Blodget, T. Becker, J. Anderson and P. Lysaght, "Modular Dynamic Reconfiguration in Virtex FPGAs", *IEE Proceedings Computer and Digital Techniques*, Vol.153, Issue 3, May 2006, pp. 157-164.
- [141]J. Delorme, J. Martin, A. Nafkha, C. Moy, F. Clermidy, P. Leray and J. Palicot, "A FPGA Partial Reconfiguration Design Approach for Cognitive Radio Based on NoC Architecture". In Proceedings of 6th Joint International IEEE Northeast Workshop on Circuits and Systems and Traitement Analogique Information Signal Applications Conference (NEWCAS-TAISA), Montreal, Quebec, Canada, June 2008, pp. 355-358.
- [142]R. Kumar, R.C. Joshi and K.S. Raju, "A FPGA Partial Reconfiguration Design Approach for RASIP SDR". In *Proceedings of Annual IEEE India Conference (INDICON)*, Ahmedabad, India, December 2009.
- [143]A. Nafkha, J. Delorme and R. Seguier, "A Heterogeneous Reconfigurable Platform for Cognitive Radio Systems". In Proceedings of Workshop on Software Radios (WSR), Karlsruhe, Germany, March 2008
- [144]J.P. Delahaye, J. Palicot and P. Leray, "A Hierarchical Modeling Approach in Software Defined Radio System Design". In Proceedings of IEEE Workshop on Signal Processing Systems Design and Implementation (SIPS), Divani Palace, Greece, November 2005, pp. 42-47.
- [145]O. Faust, B. Sputh, D. Nathan, S. Rezgui and A. Weisensee, "A Single-Chip Supervised Partial Self-Reconfigurable Architecture for Software Defined Radio". In *Proceedings of International Parallel and Distributed Processing Symposium (IPDPS)*, Nice, France, April 2003.
- [146]K. Papadimitriou, A. Anyfantis and A. Dollas, "An Effective Framework to Evaluate Dynamic Partial Reconfiguration in FPGA Systems", *IEEE Transactions on Instrumentation and Measurement*, Vol.59, No.6, June 2010, pp.1642-1651.
- [147]P. Coulton and D. Carline, "An SDR Inspired Design for the FPGA Implementation of 802.11a Baseband System". In *Proceedings of IEEE International Symposium on Consumer Electronics*, Reading, UK, September 2004, pp. 470-475.
- [148]M. Uhm and J. Belzile, "Meeting Software Defined Radio Cost and Power Targets: Making SDR Feasible", *Military Embedded Systems*, May 2005, pp. 7-8.
- [149]F. Berthelot, F. Nouvel and D. Houzet, "Partial and Dynamic Reconfiguration of FPGAs: A Top Down Design Methodology for An Automatic Implementation". In *Proceedings of 20th International Parallel and Distributed Processing Symposium (IPDPS)*, Rhodes Island, Greece, April 2006.
- [150]J.P. Delahaye, J. Palicot, C. Moy and P. Leray, "Partial Reconfiguration of FPGAs for Dynamical Reconfiguration of a Software Radio Platform". In *Proceedings of 16th IST Mobile and Wireless Communications Summit*, Budapest, Hungary, May 2007.
- [151]A.A. Kountouris and C. Moy, "Reconfiguration in Software Radio Systems". In Proceedings of Workshop on Software Radios (WSR), Karlsruhe, Germany, March 2002
- [152]E.J. McDonald, "Runtime FPGA Partial Reconfiguration". In Proceedings of IEEE Aerospace Conference, Big Sky, Montana, USA, March 2008, pp. 1-7.
- [153]N. Bagherzadeh and T. Eichenberg, "Mobile Software Defined Radio Solution Using High-Performance, Low-Power Reconfigurable DSP Architecture". In *Proceedings of SDR 2005 Technical Conference and Product Exposition*, Anaheim, CA, USA, November 14-18, 2005.
- [154]A.H. Gholamipour, E. Bozorgzadeh and L. Bao, "Seamless Sequence of Software Defined Radio Designs through Hardware Reconfigurability of FPGAs". In *Proceedings of IEEE International Conference on Computer Design (ICCD)*, Squaw Creek, CA, USA, October 2008, pp. 260-265.

- [155]C. Kao, "Benefits of Partial Reconfiguration", *Xcell Journal*, Fourth Quarter 2005, pp. 65-67.
- [156]K.G. Nezami, P.W. Stephens and S.D. Walker, "Handel-C Implementation of Early-Access Partial Reconfiguration for Software Defined Radio". In *Proceedings of IEEE Wireless Communications & Networking Conference (WCNC)*, Las Vegas, NV, USA, March 31 - April 3, 2008.
- [157]J.P. Delahaye, C. Moy, P. Leray and J. Palicot, "Managing Dynamic Partial Reconfiguration on Heterogeneous SDR Platforms". In Proceedings of SDR 2005 Technical Conference and Product Exposition, Anaheim, CA, USA, November 14-18, 2005.
- [158]3GPP TS 36.211, (Release 9), January 2010
- [159]Y.J. Oh, H.H. Lee and C.H. Lee, "A Reconfigurable FIR Filter Design Using Dynamic Partial Reconfiguration". In *Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS)*, Island of Kos, Greece, May 21-24, 2006, pp. 4851-4854.
- [160]E. El-Araby, I. Gonzalez and T. El-Ghazawi, "Exploiting Partial Runtime Reconfiguration for High-Performance Reconfigurable Computing", ACM Transactions on Reconfigurable Technology and Systems (TRETS), Vol.1 Issue 4, January 2009, pp.21-44.
- [161]G.C. Cardarilli, M. Re, I. Shuli and L. Simone, "Partial Reconfiguration in the Implementation of Autonomous Radio Receivers for Space". In Proceedings of 6th International Workshop on Reconfigurable Communication-Centric Systems-on-Chip (ReCoSoC), Montpellier, France, June 20-22, 2011.
- [162]C. Conger, A. G. Ross and A.D. George, "Design Framework for Partial Run-Time FPGA Reconfiguration". In *Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms*, Las Vegas, USA, July 14-17, 2008.
- [163]S. Yousuf and A.G. Ross, "DAPR: Design Automation for Partially Reconfigurable FPGAs". In Proceedings of International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA), Las Vegas, USA, July 12-15, 2010.
- [164]M. Sarlotte, B. Counil, P. Gelineau, R. Chau and D. Manufroid, "Partial Reconfiguration Concept in a SCA Approach". In *Proceedings of Software Defined Radio Technical Conference and Product Exposition (SDR'07)*, Denver, USA, November 2007.
- [165]F. Berthelot, F. Nouvel and D. Houzet, "Partial and Dynamic Reconfiguration of FPGAs: a Top Down Design Methodology for an Automatic Implementation". In *Proceedings of 20th International Parallel and Distributed Processing Symposium (IPDPS)*, Rhodes Island, Greece, April 25-29, 2006.
- [166]D. Koch and J. Torresen, Advances and Trends in Dynamic Partial Run-Time Reconfiguration [online]. Available at:<http://drops.dagstuhl.de/opus/volltexte/2010/2841/pdf/10281.KochDirk.Paper.2841.pdf>.

- [167]J. Torresen and D. Koch, "Can Run-Time Reconfigurable Hardware be More Accessible?". In Proceedings of International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA), Las Vegas, USA, July 16-19, 2011.
- [168]M. Ullmann, M. Hubner, B. Grimm and J. Becker, "An FPGA Run-Time System for Dynamical On-Demand Reconfiguration". In *Proceedings of 18th International Parallel and Distributed Processing Symposium (IPDPS)*, Santa Fe, USA, April 26-30, 2004.
- [169]F. Berthelot, F. Nouvel and D. Houzet, "Design Methodology for Dynamically Reconfigurable Systems". In Proceedings of the conference Journées Francophones de l'Adéquation Algorithmes-Architectures (JFAAA), Dijon, France, January 2005, pp. 220-224.
- [170]J. Jones and M. Stettler, "Dynamic Reconfiguration and Incremental Firmware Development in the Xilinx Virtex 5". In *Proceedings of Topical Workshop on Electronics for Particle Physics*, Naxos, Greece, September 15-19, 2008, pp. 583-586.

- [171]R. Hymel, A.D. George and H. Lam, "Evaluating Partial Reconfiguration for Embedded FPGA Applications". In *Proceedings of Higher-Performance Embedded Computing Workshop*, Lexington, USA, September 18-20, 2007.
- [172]M. Hubner, L. Braun, D. Gohringer and J. Becker, "Run-Time Reconfigurable Adaptive Multilayer Network-on-Chip for FPGA-Based Systems". In *Proceedings of IEEE International Symposium on Parallel and Distributed Processing, (IPDPS)*, Miami, USA, April 14-18, 2008.
- [173]M. Huebner, T. Becker and J. Becker, "Real-time LUT-based Network Topologies for Dynamic and Partial FPGA Self-Reconfiguration". In *Proceedings of 17th Symposium on Integrated Circuits and Systems Design*, Porto de Galinhas, Brazil, September 11, 2004, pp. 28-32.
- [174]A.V. Brito, M. Kuhnle, M. Hubner, J. Becker and E.U.K. Melcher, "Modelling and Simulation of Dynamic and Partially Reconfigurable Systems using SystemC". In *Proceedings of IEEE Computer Society Annual Symposium on VLSI*, Porto Alegre, Brazil, May 9-11, 2007.
- [175]P. Lysaght, B. Blodget, J. Mason, J. Young and B. Bridgford, "Invited Paper: Enhanced Architectures, Design Methodologies and CAD Tools for Dynamic Reconfiguration of Xilinx FPGAs". In *Proceedings of International Conference on Field Programmable Logic and Applications*, Madrid, Spain, August 28-30, 2006.
- [176]J. Becker, M. Hubner, G. Hettich, R. Constapel, J. Eisenmann and J. Luka, "Dynamic and Partial FPGA Exploitation", *Proceedings of the IEEE*, Vol. 95, Issue 2, February 2007, pp.438-452.
- [177]M. Ullmann, M. Hubner, B. Grimm and J. Becker, "On-Demand FPGA Run-Time System for Dynamical Reconfiguration with Adaptive Priorities". In *Proceedings of the International Conference on Field Programmable Logic and Applications*, Antwerp, Belgium, 2004, pp. 454-463.
- [178]H. Kooti, E. Bozorgzadeh, S.H. Liao and L.C. Bao, "Reconfiguration-aware Spectrum Sharing for FPGA based Software Defined Radio". In *Proceedings of IEEE International Symposium on Parallel and Distributed Processing (IPDPS)*, Atlanta, USA, April 19-23, 2010.
- [179]T. Becker, W. Luk and P.Y.K. Cheung, "Parametric Design for Reconfigurable Software-Defined Radio". In *Proceedings of the 5th International Workshop on Applied Reconfigurable Computing (ARC)*, Karlsruhe, Germany, March 16-18, 2009, pp. 15-26.
- [180]M. Liu, W. Kuehn, Z.H. Lu and A. Jantsch, "Run-Time Partial Reconfiguration Speed Investigation and Architectural Design Space Exploration". In *Proceedings of International Conference on Field Programmable Logic and Applications*, Prague, Czech Republic, August 31 - September 2, 2009, pp. 498-502.
- [181]C. Claus, B. Zhang, W. Stechele, L. Braun, M. Hubner and J. Becker, "A Multi-Platform Controller Allowing for Maximum Dynamic Partial Reconfiguration Throughput". In *Proceedings of International Conference on Field Programmable Logic and Applications*, Heidelberg, Germany, September 8-10, 2008, pp. 535-538.
- [182]P. Manet, D. Maufroid, L. Tosi, G. Gailliard, O. Mulertt, M.D. Ciano, J.D. Legat, D. Aulagnier, C. Gamrat, R. Liberati, V.L. Barba, P. Cuvelier, B. Rousseau and P. Gelineau, "An Evaluation of Dynamic Partial Reconfiguration for Signal and Image Processing in Professional Electronics Applications", *EURASIP Journal on Embedded Systems*, Vol. 2008, January 2008.
- [183]R. Woods, J. McAllister, G. Lightbody and Y. Yi, FPGA-based Implementation of Signal Processing Systems, Wiley 2008.
- [184]D. Curd, *Power Consumption in 65 nm FPGAs*, [online], Xilinx, Inc. February 2007. Available at:<www.xilinx.com/support/documentation/white\_papers/wp246.pdf>

- [185]P. Abusaidi, M. Klein and B. Philofsky, Virtex-5 FPGA System Power Design Considerations, [online], Xilinx, Inc. February 2008. Available at:<www.xilinx.com/support/documentation/white\_papers/wp285.pdf>
- [186]L. Wang, M. French, A. Davoodi and D. Agarwal, "FPGA Dynamic Power Minimization Through Placement and Routing Constraints", *EURASIP Journal on Embedded Systems*, Vol. 2006, Issue 1, January 2006
- [187]L. Shang, A.S. Kaviani and K. Bathala, "Dynamic Power Consumption in Virtex™-II FPGA Family". In Proceedings of 10th ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, USA, February 24-26, 2002, pp. 157-164.
- [188]Xilinx, XPower Estimator User Guide, [online], Xilinx Inc. Available at:<www.xilinx.com/support/documentation/user\_guides/ug440.pdf>
- [189]Xilinx, XPower Analyzer Help, [online], Xilinx Inc. Available at:<www.xilinx.com/support/documentation/sw\_manuals/xilinx13\_1/isehelp\_start.htm#xpa\_c\_overview.htm>
- [190]J. Becker, M. Huebner and M. Ullmann, "Power Estimation and Power Measurement of Xilinx Virtex FPGAs: Trade-offs and Limitations". In *Proceedings of 16th Symposium on Integrated Circuits and Systems Design*, Sao Paulo, Brazil, September 2003, pp. 283-288.
- [191]S.S. Liu, R.N. Pittman and A. Forin, "Energy Reduction with Run-Time Partial Reconfiguration", *Technical Report of Microsoft Research*, MSR-TR-2009-2017, September 2009.
- [192]J. Noguera and I.O. Kennedy, "Power Reduction in Network Equipment Through Adaptive Partial Reconfiguration". In *Proceedings of International Conference on Field Programmable Logic and Applications*, Amsterdam, Netherlands, August, 2007, pp. 240-245.
- [193]K. Paulsson, M. Hubner, S. Bayar and J. Becker, "Exploitation of Run-Time Partial Reconfiguration for Dynamic Power Management in Xilinx Spartan III-based Systems". In *Proceedings of Reconfigurable Communication-Centric SoCs (ReCoSoC)*, Montpellier, France, June 2007.
- [194]K. Paulsson, M. Hubner and J. Becker, "Cost-and Power Optimized FPGA based System Integration: Methodologies and Integration of a Low-Power Capacity-Based Measurement Application on Xilinx FPGAs". In *Proceedings of Design, Automation and Test in Europe (DATE)*, Munich, Germany, March 2008.
- [195]Q. Wang, S. Gupta and J. Anderson, "Clock Power Reduction for Virtex-5 FPGAs". In Proceedings of 18th ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, USA, February 2009, pp. 13-22.
- [196]K. Paulsson, M. Hubner and J. Becker, "On-line Optimization of FPGA Power-Dissipation by Exploiting Run-Time Adaption of Communication Primitives". In *Proceedings of 19th Annual Symposium on Integrated Circuits and Systems Design*, Ouro Preto, Brazil, August 2006.
- [197]Xilinx, Partial Reconfiguration Flow Presentation Manual (V13.2) [online], Xilinx Inc. Available at:<www.xilinx.com/university/workshops/partial-reconfiguration-flow/index.htm>
- [198]M. Klein, Static Power and the Importance of Realistic Junction Temperature Analysis, [online], Xilinx, Inc. March, 2005. Available at:<www.xilinx.com/support/documentation/white\_papers/wp221.pdf>
- [199]H. Belhadj, V. Aggrawal, A. Pradhan and A. Zerrouki, *Power-Aware FPGA Design*, [online], Actel, Inc. Available at:<www.actel.com/documents/Power\_Aware\_WP.pdf>
- [200]M. Klein, *Power Consumption at 40 and 45 nm, WP298 (v1.0)*, [online], Xilinx, Inc. April 2009. Available at:<www.xilinx.com/support/documentation/white\_papers/wp298.pdf>.
- [201]S.H. Han, J.M. Cioffi and J.H. Lee, "On the Use of Hexagonal Constellation for Peak-to-Average Power Ratio Reduction of an OFDM signal", *IEEE Transactions on Wireless Communications*, Vol. 7, Issue 3, March 2008, pp. 781-786.

- [202]S.H. Han and J.H. Lee, "An Overview of Peak-to-Average Power Ratio Reduction Techniques for Multicarrier Transmission", *IEEE Wireless Communications*, Vol. 12, Issue 2, April 2005, pp. 56-65.
- [203]N. Lashkarian, E. Hemphill, H. Tarn, H. Parekh and C. Dick, "Reconfigurable Digital Front-End Hardware for Wireless Base-Station Transmitters: Analysis, Design and FPGA Implementation", *IEEE Transactions on Circuits and Systems I: Regular Papers*, Vol. 54, No. 8, August, 2007.
- [204]N. Lashkarian, H. Tarn and C. Dick, "Peak to Average Power Ratio Reduction in Multi-band Transmitters: Analysis, Design and FPGA Implementation". In *Proceedings of IEEE GLOBECOM*, St. Louis, Missouri, USA, November 2005.
- [205]Xilinx, *LogiCORE IP Digital Pre-Distortion v4.0*, [online], Xilinx Inc. Available at:<www.xilinx.com/support/documentation/ip\_documentation/xmp143\_dpd.pdf>
- [206]M. Alderighi, F. Casini, S. D'Angelo, M. Mancini, A. Marmo, S. Pastore and G.R. Sechi, "A Tool for Injecting SEU-like Faults into the Configuration Control Mechanism of Xilinx Virtex FPGAs". In Proceedings of 18th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT'03), Boston, MA, USA, November 3-5, 2003, pp. 71-78.
- [207]P. Suarez-Casal, A. Carro-Lagoa, J.A. Garcia-Naya and L. Castedo, "A Multicore SDR Architecture for Reconfigurable WiMAX Downlink". In *Proceedings of 13th Euromicro Conference on Digital System Design*, Lille, France, September 1-3, 2010, pp. 801-804.
- [208]GSMWorld Press Release, [online]. Available at:<www.gsma.com/newsroom/operators-with-620-million-subscribers-spearhead-gsmas-3g-campaign>
- [209]Xilinx, The First Generation of Extensible Processing Platforms: A New Level of Performance, Flexibility and Scalability, [online], Xilinx Inc. Available at:<www.xilinx.com/publications/prod\_mktg/zynq7000/Product-Brief.pdf>
- [210]K. He, L. Crockett and R. Stewart, "Dynamic Reconfiguration Technologies Based on FPGA in Software Defined Radio System", *Journal of Signal Processing Systems for Signal, Image, and Video Technology*, Vol. 69, Issue 1, June, 2012, pp. 75-85.
- [211] Ministry of Defence: Delivering Digital Tactical Communications through the Bowman CIP Programme, House of Commons, February 27, 2007.
- [212]Wikipedia, *Bowman (communications system)*, [online]. Available at:<http://en.wikipedia.org/wiki/Bowman\_(communications\_system)>