# A MIXED-SIGNAL MATCHED-FILTER DESIGN AND SIMULATION

M. R. Zahabi, V. Meghdadi, J.P. Cances and A. Saemi

University of Limoges - Ecole Nationale Supérieure d'Ingénieurs de Limoges (ENSIL) XLIM-Dept. C<sup>2</sup>S<sup>2</sup>, UMR CNRS 6172

16, Rue Atlantis - Parc ESTER - BP 6804- 87068 Limoges cedex, France Email {zahabi, cances, meghdadi, saemi}@ensil.unilim.fr

## ABSTRACT

A 0.35 µm CMOS mixed-signal programmable filter suitable for high-rate communication systems is designed and investigated. The proposed filter has an analog input and analog sampled outputs. Filter taps are stored in a digital memory and can be changed on the fly. The filter structure is based on a bank of digitally controlled transconductors along with small capacitors. The employed transconductors are based on simple inverters and can thus be integrated efficiently with the digital parts of a system. An FIR cosine roll-off filter is designed and investigated by simulation in the time and frequency domains. The results show that the proposed structure offers a good trade-off between speed, complexity and power consumption.

## 1. INTRODUCTION

Although digital signal processing is widespread in today's technology, analog parts remain unavoidable, especially in the front-end sections of communication systems. Integration of analog and digital parts is therefore essential in a mixed-signal SoC (system on chip) solution. It may be both economical and power efficient to pre-process the analog signals at the analog/digital interface. Matched filters in communication systems are located at this interface. From a software-radio point of view, these filters are implemented by means of digital VLSI circuits or DSP processors and are preceded by a fast sample-and-hold (S/H) and an analog-to-digital converter (ADC). A mixed-signal implementation, however, is an alternative that can result in simple and power-efficient circuits as well as a wider frequency range.

In addition to this general motivation, another important application of this work lies in the domain of analog decoding. Analog decoding was presented independently and at the same time by J. Hagenauer and H.-A. Loeliger in 1998 [1][2]. Briefly, a nonlinear analog circuit can serve all the functions required of a channel decoder with less power consumption and higher speed than conventional digital decoders. Analog decoders, with their parallel analog inputs and outputs, are not fully compatible with digital receivers. In a typical scenario, the received signal must be converted to digital, then processed, filtered and converted back to analog to drive the inputs of the analog decoder. Several S/H circuits are also needed to provide parallel inputs to the decoder. Such an approach is clearly inefficient, since the power required for the ADC and DAC often exceeds the power saved by the decoder. Our approach resolves this problem and provides an integrated solution from the matched-filtering section to the decoding section. In this approach, analog signals are filtered and fed directly to the decoder. In addition, the analog outputs are parallel sampled data, which suit analog decoders.

The other major approaches to implementing sampled-data filters are switched-capacitor (SC) and switched-current (SI) networks. Very precise monolithic filters can be realized with SC networks. However, it is well known that SC networks have a severe frequency limitation due to their use of operational amplifiers. SI networks are simpler to implement because they do not need linear floating capacitors (two polysilicon layers) and are thus fully compatible with a digital CMOS process. Several attempts have been made to increase the clock rate of SI networks, but they need special active elements such as GaAs MESFETs [3]. Programmable SI filters using an array of switched-current mirrors have also been reported [4].

This paper is an attempt to design a mixed-signal filter suitable for a high-rate digital receiver front-end. The advantages are that the signal path remains in the analog domain (no ADC, low power consumption) while the filter taps are stored digitally (reconfigurability). Moreover, the overall proposed circuit can be implemented in standard CMOS technology, so it can be integrated on the same chip as the digital parts of the circuit.

The rest of this paper is organized as follows. The second section discusses the mathematical convention used to approximate a continuous convolution. The third section describes the design and structure of the mixed-signal filter and of the MAC (multiply and accumulate) unit built from CMOS inverters. A simple modification to the binary number representation of the filter taps, which optimizes the dynamic range, is also discussed in this section. Circuit-level simulation of a typical filter is presented in section four. Finally, conclusions are drawn in the last section.

## 2. APPROXIMATION TO CONTINUOUS CONVOLUTION

Suppose h(t) is a time limited impulse response of a causal filter with duration *D* to be realized, and x(t) is the input signal. The filter output is given by:

$$y(t) = \int_{t-D}^{t} x(\tau) h(t-\tau) d\tau \tag{1}$$

In digital system applications, we are normally interested in the output at the instants nT, where T is the symbol duration. In practice, the duration D is an integer multiple of the symbol period T, i.e. D/T = N, and the above equation can be written as follows:

$$y(nT) = \int_{(n-N)T}^{nT} x(\tau)h(nT-\tau)d\tau \tag{2}$$

This is a continuous weighted integral of x(t). The integral can be approximated by dividing the integration period into *L* subsections. Defining the  $k^{th}$  integration gain as:

$$g_{k} = h\left((L-1-k)\Delta t\right) \tag{3}$$

where  $\Delta t = D/L$  and k=(0,...,L-1). The filter output can then be expressed approximately as:

$$y(nT) \approx \sum_{k=0}^{L-1} g_k \int_{nT - (L-k)\Delta t}^{nT - (L-k-1)\Delta t} x(\tau) d\tau \tag{4}$$

In order to implement equation (4), we need an integrator whose gain during the $k^{th}$ subsection is proportional to $g_k$, i.e. to the impulse response sampled at $(L-1-k)\Delta t$. With respect to an ideal convolution, the above operations introduce three approximations. The first approximation is the truncation of the filter impulse response, which is equivalent to convolving the filter spectrum with $D \operatorname{sinc}(fD) \exp(-j\pi fD)$. The second approximation is due to equation (3), which is equivalent to flat-top sampling of h(t). It affects the overall frequency response through multiplication by $\Delta t \operatorname{sinc}(f \Delta t) \exp(-j\pi f \Delta t)$, and as $\Delta t$ becomes small the sinc effect becomes negligible. The third approximation is the quantization of the filter taps $(g_k)$, which can degrade the frequency response. One can minimize this effect by incorporating the quantization into the design, as opposed to quantizing the coefficients after the filter has been designed [6]. The phase response is always linear provided that the filter taps are symmetric.
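The piecewise approximation of equations (3) and (4) can be illustrated with a small numerical sketch. The impulse response (a raised-cosine-shaped window of duration D), the test input, the midpoint integration rule and all function names are assumptions chosen for illustration only; they are not taken from the design described here.

```python
import numpy as np

def integrate(f, lo, hi, n=400):
    """Midpoint-rule integral of f over [lo, hi]."""
    width = (hi - lo) / n
    mid = lo + (np.arange(n) + 0.5) * width
    return float(np.sum(f(mid)) * width)

def mac_output(x, h, t_end, D, L):
    """Equation (4): L gated integrals of x, each weighted by the gain g_k."""
    dt = D / L
    y = 0.0
    for k in range(L):
        g_k = float(h((L - 1 - k) * dt))        # equation (3)
        lo = t_end - (L - k) * dt               # start of the k-th subsection
        hi = t_end - (L - k - 1) * dt           # end of the k-th subsection
        y += g_k * integrate(x, lo, hi)
    return y

def exact_output(x, h, t_end, D):
    """Equation (2), evaluated numerically on a fine grid as a reference."""
    return integrate(lambda tau: x(tau) * h(t_end - tau),
                     t_end - D, t_end, n=20000)

D = 1.0  # hypothetical impulse-response duration (arbitrary time unit)

def h(t):
    """Hypothetical time-limited impulse response, zero outside [0, D]."""
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t <= D),
                    0.5 * (1.0 - np.cos(2.0 * np.pi * t / D)), 0.0)

def x(t):
    """Hypothetical input signal."""
    return np.cos(2.0 * np.pi * 0.5 * np.asarray(t, dtype=float))

for L in (6, 24, 96):
    err = mac_output(x, h, 1.0, D, L) - exact_output(x, h, 1.0, D)
    print(f"L = {L:3d}  approximation error = {err:+.5f}")
```

Because the gain is held constant over each subsection, the error is bounded by the largest change of h(t) over an interval of length $\Delta t$, which is why a finer subdivision tracks the ideal convolution more closely.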

## 3. FILTER STRUCTURE

According to (4), the filter output at the instant nT can be obtained approximately by piecewise multiplication and integration of the analog input signal. This clearly defines a Multiply-and-Accumulate (MAC) operation. We first concentrate on the structure of the MAC. Typically the input signal is a voltage, and a voltage-to-current (V/I) converter can be used along with a capacitor to perform the integration.



Figure 1: Balanced transconductor and output resistance compensation

Figure 1 shows the preliminary structure for the MAC. The transconductor proposed in [5] is based on two CMOS inverters and is employed for this work. It can be shown that the following relation holds in this circuit:

$$i_{o} = i_{o1} - i_{o2} = (k_{n} - k_{p})(V_{I} - V_{C})v_{i} + g_{m}v_{i} \tag{5}$$

$$g_{m} = \sqrt{k_{n}k_{p}} \left( V_{DD} - V_{THn} - \left| V_{THp} \right| \right) \tag{6}$$

$$V_{C} = \frac{V_{DD} - V_{THn} - |V_{THp}|}{1 + \sqrt{k_{n} / k_{p}}} + V_{THn} \tag{7}$$

where $V_I$ is the common-mode input voltage, and $g_m$ and $V_C$ are defined in (6) and (7) respectively. $i_o$ is the effective current that charges the capacitors and is defined in figure 1. The aspect ratios of the transistors are chosen so that $k_n \approx k_p$, and $V_I$ is close to $V_C$; the output current $i_o$ is therefore almost equal to the second term in equation (5).

In addition, in figure 1, negative resistance compensation is performed by two other similar transconductors, $g_{m2}$ and $g_{m3}$. It is easy to show that the added transconductors realize an equivalent resistance of $1/(g_{m3} - g_{m2})$ at the output nodes. Proper setting of $g_{m2}$ and $g_{m3}$ results in a negative resistance that cancels the output resistances of the transconductors and yields a theoretically infinite output resistance. According to (6), $g_m$ depends on both the supply voltage $V_{DD}$ and the aspect ratio of the transistors (through $k_n$ and $k_p$). A slight reduction of the aspect ratio (W/L) of the transistors in $g_{m3}$ with respect to $g_{m2}$ can produce the required negative resistance.

Two grounded capacitors are also connected in the figure to realize an integrator. The circuit in figure 1 is thus an integrator with a constant gain of $g_{m1}/C$.

A mixed-signal MAC is designed and is depicted in figure 2. The transconductors have binary-weighted gains. Because the output quantity is a current, several transconductor outputs can be tied together. Note that in this figure all transconductors share the output capacitors as well as the negative resistance circuitry ($g_{m2}$ and $g_{m3}$ in figure 1). The input voltage can be connected to or disconnected from each transconductor by means of a switch box, according to the state of the control bit $b_q$. This is equivalent to changing the overall gain of the MAC.
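Equations (5) to (7) and the compensation resistance can be evaluated numerically to get a feel for the orders of magnitude. The sketch below is illustrative only: the device parameters (k_n, k_p and the threshold voltages) are hypothetical values, not those of the process used in this work, and the function names are introduced here for convenience.

```python
import math

def transconductor_params(k_n, k_p, v_dd, v_thn, v_thp_abs):
    """Return (g_m, V_C) according to equations (6) and (7)."""
    headroom = v_dd - v_thn - v_thp_abs
    g_m = math.sqrt(k_n * k_p) * headroom
    v_c = headroom / (1.0 + math.sqrt(k_n / k_p)) + v_thn
    return g_m, v_c

def output_current(v_i, v_cm, k_n, k_p, v_dd, v_thn, v_thp_abs):
    """Differential output current i_o of equation (5)."""
    g_m, v_c = transconductor_params(k_n, k_p, v_dd, v_thn, v_thp_abs)
    return (k_n - k_p) * (v_cm - v_c) * v_i + g_m * v_i

def compensation_resistance(g_m2, g_m3):
    """Equivalent resistance 1/(g_m3 - g_m2) seen at the output nodes."""
    return 1.0 / (g_m3 - g_m2)

# Hypothetical, matched devices (k_n = k_p) on a 3.3 V supply.
K_N = K_P = 200e-6            # A/V^2, assumed
V_DD, V_THN, V_THP = 3.3, 0.5, 0.7   # volts, thresholds assumed

g_m, v_c = transconductor_params(K_N, K_P, V_DD, V_THN, V_THP)
print(f"g_m = {g_m * 1e6:.1f} uS, V_C = {v_c:.3f} V")

# With matched devices the first term of (5) vanishes for any common mode.
i_o = output_current(0.01, 1.2, K_N, K_P, V_DD, V_THN, V_THP)
print(f"i_o for a 10 mV input = {i_o * 1e6:.2f} uA")

# g_m3 slightly smaller than g_m2 yields a negative equivalent resistance.
print(f"R_eq = {compensation_resistance(420e-6, 400e-6):.0f} ohm")
```

With $k_n = k_p$ the first term of (5) is exactly zero, which is the design condition stated in the text; any mismatch reintroduces a dependence on the common-mode voltage.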

Proc. of the 2007 15<sup>th</sup> Intl. Conf. on Digital Signal Processing (DSP 2007)



Suppose that *Q* transconductors of gain  $2^q g_m$ ;  $q=\{0,1,...,Q-1\}$  are used and driven by the input signal  $v_i$  through the switch boxes. The effective integrator gain will be:

$$G_{\text{integrator}} = \left(\sum_{q} b_{q} s_{q} 2^{q}\right) \frac{g_{m}}{C} \tag{8}$$

where the $b_q$ are binary control bits that take the value zero or one, and $s_q$ is either +1 or -1 for a direct or a twisted connection respectively. For example, $s_1 = -1$ in figure 2. The bits $b_q$ are stored in a digital memory of width Q. This configuration constitutes a mixed-signal MAC whose input signal is analog and whose multiplication coefficients are quantized numbers in the range:

$$-\sum_{q;\,s_q=-1} b_q 2^q \quad \text{to} \quad \sum_{q;\,s_q=+1} b_q 2^q \tag{9}$$

In a filter realization, these coefficients correspond to the filter taps. Bipolar (positive and negative) tap values are required to realize practical filters. Moreover, the tap values are not necessarily symmetric about zero. The bits allocated to the digitized taps must therefore cover the full available dynamic range. Fortunately, this requirement can be met in the proposed structure with no extra complexity, as follows: we reverse the inputs (or outputs) of those transconductors that have a negative weight (the second one in the above example) and at the same time invert the corresponding bit in the digital memory ($b_1$ in the figure). For example, the impulse response of a raised cosine filter is not symmetric about zero, and it can be found that the optimal four-bit range is $\{-2, ..., 13\}$. The binary weights $\{1, -2, 4, 8\}$ span this range and are used throughout this paper.

Based on (4), each MAC output is one sample of the filter output, so the MAC must be replicated N times to obtain all the required samples. The output of a MAC is valid after the integration period D. The integration period of the next MAC starts $T_o$ seconds later, where $1/T_o$ is the desired sampling rate at the filter output and $\Delta t \le T_o \le T$.
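The tap coding described above can be sketched in a few lines. The interpretation used here, an offset binary code in which the bits driving the twisted transconductors are inverted, is one reading of the text and not a confirmed description of the memory contents; the function names are introduced for illustration.

```python
from itertools import product

SIGNS = (+1, -1, +1, +1)   # s_q: only the weight-2 transconductor is twisted
WEIGHTS = tuple(s * 2 ** q for q, s in enumerate(SIGNS))   # (1, -2, 4, 8)

def tap_value(bits, signs=SIGNS):
    """Quantized tap realized by control bits b_q: sum of b_q * s_q * 2^q."""
    return sum(b * s * 2 ** q for q, (b, s) in enumerate(zip(bits, signs)))

def encode_tap(value, signs=SIGNS):
    """Control bits for an integer tap value.

    Assumed scheme: shift the value by the most negative representable tap,
    take the plain binary code, then invert the bits with negative weight.
    """
    offset = sum(2 ** q for q, s in enumerate(signs) if s < 0)
    code = value + offset
    if not 0 <= code < 2 ** len(signs):
        raise ValueError(f"tap {value} is outside the representable range")
    bits = [(code >> q) & 1 for q in range(len(signs))]
    return tuple(b ^ 1 if s < 0 else b for b, s in zip(bits, signs))

# Every 4-bit code word maps to a distinct tap between -2 and 13.
all_taps = sorted(tap_value(bits) for bits in product((0, 1), repeat=4))
print(all_taps)
for tap in (-2, 0, 5, 13):
    print(tap, encode_tap(tap))
```

The enumeration confirms that the sixteen code words cover every integer from -2 to 13 exactly once, so no code word is wasted by the asymmetric range.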



Figure 3: The digital part of the filter and its connections

The integration periods of the MACs therefore overlap, but all MACs perform the same operation. This means that the (digital) commands for the MACs are driven by the same code words, delayed from one MAC to the next. The only hardware needed is a set of (digital) latches acting as delay elements (figure 3). The memory and the latches are clocked at the oversampling rate, i.e. the reciprocal of $\Delta t$. Note that each latch block in this figure may be a simple latch (for $T_o = \Delta t$) or a stack of $T/\Delta t$ latches (for $T_o = T$).
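The delayed distribution of code words to the MACs can be modelled as a tapped shift register. The sketch below is a behavioural illustration under assumed values (24 code words, 6 MACs, 4 clock ticks of delay between neighbouring MACs); it does not describe the actual latch circuit, and `mac_schedule` is a name introduced here.

```python
from collections import deque

def mac_schedule(codes, n_macs, delay_ticks, n_ticks):
    """Return rows[m][j]: the code word seen by MAC j at clock tick m.

    The memory is read cyclically, one word per tick (period delta t).
    MAC j taps the latch chain j * delay_ticks stages after the memory,
    so it sees the same sequence delayed by j * delay_ticks ticks.
    None marks a latch stage that has not been filled yet.
    """
    depth = (n_macs - 1) * delay_ticks + 1
    chain = deque([None] * depth, maxlen=depth)
    rows = []
    for m in range(n_ticks):
        chain.appendleft(codes[m % len(codes)])   # newest word at index 0
        rows.append([chain[j * delay_ticks] for j in range(n_macs)])
    return rows

# Hypothetical example: code words are represented by their tap index k.
codes = list(range(24))
rows = mac_schedule(codes, n_macs=6, delay_ticks=4, n_ticks=48)
for m in (0, 4, 30):
    print(m, rows[m])
```

In this model a delay of one tick per block corresponds to the simple-latch case ($T_o = \Delta t$), while a delay of $T/\Delta t$ ticks corresponds to the stacked-latch case ($T_o = T$).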

## 4. RESULTS AND SIMULATIONS

A raised cosine matched filter for a 40 MHz symbol rate is designed and quantized to M=16 levels. Figure 4 shows the frequency response of this filter together with that of the corresponding theoretical filter with the same number of quantization levels. It can be seen that the noise injected through the side lobes does not differ considerably between the practical and the theoretical filter; the overall performance of the receiver will therefore be almost the same. The phase response is also quite linear but is not shown here.

Each MAC unit is built with $9\log_2(M)+10$ transistors and is replicated *N* times, where *N* is equal to *D/T*, to provide all the samples during the integration period. For the assumed filter, the impulse response lasts 6 symbol durations and thus *N*=6.



Figure 4: Theoretical and practical frequency response of a typical raised cosine matched filter


Moreover, the integrating capacitors should be discharged by a switching MOS transistor at the beginning of each new integration period. In our simulation, each 4-bit MAC draws 858 µA from a supply voltage of 3.3 V. For N=6 the total power is about 17 mW, which is quite satisfactory.

Table 1 summarizes the results of several recent works as well as this work. In comparison with commercial and digital implementations (ASIC, TI, FPGA), the proposed filter achieves a large improvement in power. Although the analog and SI realizations also consume less power than the digital realizations, the proposed structure compares favourably with them in terms of the number of taps, supply voltage and operating frequency.

Another simulation compares the time response of the practical filter with the response of an ideal filter with no quantization. Figure 5 shows the two responses obtained from simulation in Matlab and Cadence. An input Barker 13 sequence is shaped at a rate of 40 MHz using a root raised cosine filter and is applied to the theoretical and practical filters. The response of the practical filter follows the theoretical one closely. This indicates that almost no distortion is introduced by the proposed filter. Moreover, the results confirm that allocating 4 bits to the filter taps is quite sufficient in this application.
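The transistor count and total power quoted in this section can be checked with a short calculation. It uses only the values stated in the text (M = 16, N = 6, 858 µA per MAC, 3.3 V supply) and assumes that the MAC supply current is the only contribution, i.e. that the digital memory and latches are not counted:

$$9\log_2(16) + 10 = 46 \ \text{transistors per MAC}, \qquad 6 \times 46 = 276 \ \text{transistors for } N = 6$$

$$P_{\text{MAC}} = 3.3\,\text{V} \times 858\,\mu\text{A} \approx 2.83\,\text{mW}, \qquad P_{\text{total}} = 6 \times 2.83\,\text{mW} \approx 17.0\,\text{mW}$$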

## 5. CONCLUSION

In this paper we have proposed a new structure for the mixed-signal realization of FIR filters with high-frequency characteristics. It is based on CMOS technology and can be integrated along with digital circuitry. A high-bandwidth MAC structure using CMOS inverters is proposed and employed as a building block for the FIR filter. Furthermore, the filter taps are stored in a digital memory and can be modified in real time, a feature that is well suited to adaptive filtering. A simple approach to matching the tap range to the range of the filter's impulse response is discussed; it keeps the quantization error as small as possible. A sample raised cosine filter was designed and simulated. The frequency response of the filter was calculated via simulation in Cadence. The results show its versatility as a matched filter. The proposed structure is simple and straightforward and can be implemented in standard CMOS technology. In addition, simulation results confirm that it is a promising structure for high-speed and low-power applications compared with other sampled-data approaches.



Figure 5: Time response simulation for an ideal root raised cosine filter (top) and a practical filter with 4-bit tap values (bottom).

## 6. REFERENCES

- [1] J. Hagenauer et al., "The analog decoder," in *Proc. Int. Symp. Information Theory*, Cambridge, MA, p. 145, 1998.
- [2] H.-A. Loeliger et al., "Probability propagation and decoding in analog VLSI," in *Proc. Int. Symp. Information Theory*, Cambridge, MA, p. 146, 1998.
- [3] C. Toumazou, N.C. Battersby and M. Punwani, "GaAs switched-current techniques for front-end analogue signal processing applications," *Proc. IEEE Midwest Symp. Circuits Syst.*, Aug. 1992.
- [4] F.A. Farag et al., "Digitally programmable switched-current FIR filter for low-voltage applications," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 4, pp. 637–641, April 2000.
- [5] B. Nauta, "A CMOS transconductance-C filter technique for very high frequencies," *IEEE Journal of Solid-State Circuits*, vol. 27, no. 2, Feb. 1992.
- [6] R. Storn, "Designing nonstandard filters with differential evolution," *IEEE Signal Processing Magazine*, vol. 22, no. 1, Jan. 2005.
- [7] V. Srinivasan et al., "Low-power realization of FIR filters using current-mode analog design techniques," *Conference on Signals, Systems and Computers*, vol. 2, pp. 2223–2227, Nov. 2004.
- [8] A. Erdogan, E. Zwyssig, and T. Arslan, "Architectural trade-offs in the design of low power FIR filtering cores," *IEE Proceedings on Circuits, Devices and Systems*, vol. 151, pp. 10–17, Feb. 2004.
- [9] "GC2011A-3.3V digital chip," *Texas Instruments Datasheet*, 2000.
- [10] G. Cardarilli, A. Re, A. Nannarelli, and M. Re, "Power characterization of digital filters implemented on FPGA," *Proceedings of the International Symposium on Circuits and Systems*, pp. V–801–V–804, 2002.
- [11] G. Liang and D. Allstot, "FIR filtering using switched current techniques," *Proceedings of the International Symposium on Circuits and Systems*, pp. 2291–2294, May 1990.

| Parameter          | SI [7]      | ASIC [8]     | TI [9]      | FPGA [10] | Analog [11] | This work    |
|--------------------|-------------|--------------|-------------|-----------|-------------|--------------|
| No. of taps        | 8           | 24           | 32          | 8         | 11          | 25           |
| Sampling frequency | 20 MHz      | 20 MHz       | 20 MHz      | 20 MHz    | 10 MHz      | 40 MHz       |
| Power consumption  | 13 mW       | 328 mW       | 400 mW      | 210 mW    | 100 mW      | 17 mW        |
| Supply voltage     | 5 V         | 3 V          | 3.3 V       | NA        | 5 V         | 3.3 V        |
| Technology         | 0.5 µm CMOS | 0.35 µm CMOS | 0.5 µm CMOS | NA        | 2.0 µm CMOS | 0.35 µm CMOS |

Table 1: Comparison of the proposed filter with some other approaches
