A Low-power Down-Conversion and Correlation Engine for GPS Receivers

Tina Wen, Wei An

I. Introduction to GPS Applications

Two GPS signals are currently in use: one for industry and the other for military applications. Their null-to-null bandwidths are 2.046MHz and 20.46MHz respectively, and their sampling frequencies before the digital processing step are around 40MHz and 400MHz respectively. We use these numbers throughout our investigation.

The most computationally intensive operations of a GPS receiver are down-converting the IF signal to baseband, removing the Gold code, and despreading the signal from wide-band to narrow-band. These operations are shown in Fig. 1. Two additional blocks, the sine wave generator and the Gold code generator, are not considered in our study, because they consume little power compared to the listed operations and because the low-power design methods discussed later can be extended to them as well.

The input to the hardware engine consists of a 2-bit signal, a 2-bit sine wave and a 1-bit Gold code. The multiplication in down-conversion is performed by a 4-bit to 6-bit mapping. Gold code removal is done by selecting either the input number or its 2's complement, depending on the value of the Gold code. The accumulator performs the despreading operation. Since a large amount of data has to be added, the accumulators have 6-bit inputs and 22-bit outputs.
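The datapath described above can be sketched in software as follows. This is a minimal behavioral model, not the authors' Verilog: the decoding of each 2-bit code to the levels -3, -1, +1, +3, the choice that a Gold code bit of 1 selects the negated value, and all names (`LEVELS`, `MIX_LUT`, `correlate`) are assumptions made for illustration.

```python
# Behavioral sketch of the down-conversion and correlation datapath.
# ASSUMPTION: each 2-bit code maps to one of four signed levels; the
# text does not specify the actual encoding.
LEVELS = (-3, -1, 1, 3)

# 4-bit input (2-bit signal code, 2-bit sine code) -> product.
# Every product lies in [-9, 9], so it fits in a signed 6-bit word.
MIX_LUT = {(s, c): LEVELS[s] * LEVELS[c] for s in range(4) for c in range(4)}

ACC_BITS = 22  # accumulator output width given in the text


def wrap_signed(value, bits):
    """Wrap an integer to a signed two's complement word of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def correlate(signal_codes, sine_codes, gold_bits):
    """Mix, remove the Gold code, and accumulate one block of samples."""
    acc = 0
    for s, c, g in zip(signal_codes, sine_codes, gold_bits):
        mixed = MIX_LUT[(s, c)]              # down-conversion by table lookup
        despread = -mixed if g else mixed    # ASSUMPTION: g == 1 selects the negation
        acc = wrap_signed(acc + despread, ACC_BITS)
    return acc
```

A table lookup is used instead of a multiplier because only 16 input combinations exist.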

II. Low-power Design Methodology

The well-known formula for dynamic power dissipation in a circuit is $CV^2f$, where $C$, $V$ and $f$ are the capacitance, supply voltage, and transition frequency respectively. It is common practice to reduce the voltage in order to achieve a quadratic power reduction. However, as the voltage is lowered, the delay of the circuit increases, which limits its operating frequency. Increasing the transistor size can compensate for some of the lost speed, but this optimization is limited because the added parasitic capacitance itself increases the delay.

Sze and Chandrakasan [1] describe the use of parallelism to lower the operating frequency of each path, which in turn allows a very low supply voltage. This method is suitable for our GPS application because the signal is oversampled. Pipelining is another method of increasing the allowance for propagation delay so that the voltage can be lowered to reduce power. The performance of these methods needs to be confirmed by simulation, and other issues such as overhead, stability and layout area need to be addressed.
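The argument can be summarized with a first-order estimate. It assumes that an $N$-path design switches $N$ times the capacitance at $1/N$ of the clock rate and that the combining overhead is ignored; the symbols $N$, $V_N$, $P_1$ and $P_N$ are introduced here for illustration only.

$$P_1 = C V^2 f, \qquad P_N \approx (NC)\,V_N^2\,\frac{f}{N} = C V_N^2 f, \qquad \frac{P_N}{P_1} \approx \left(\frac{V_N}{V}\right)^2$$

Under these assumptions the saving comes entirely from the lower supply voltage $V_N$ that the relaxed per-path timing permits. Halving the supply voltage, for example, would cut the dynamic power to one quarter.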

Circuit stability is an important issue. When the voltage is so low that the devices are pushed into the sub-threshold region, the delay increases exponentially as supply voltage decreases. A small perturbation of supply voltage or device threshold voltage can cause a significant change in circuit delay, leading to failure. Thus the delay allowance must be large enough for circuits to operate reliably at low voltages.
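The sensitivity can be illustrated with the standard first-order sub-threshold model. This is a textbook approximation rather than a result of our simulations; the slope factor $n$, the threshold voltage $V_T$, the thermal voltage $kT/q$ and the perturbation $\Delta V$ are symbols introduced here, and the linear dependence of the delay on the supply voltage is neglected in the final ratio.

$$I_{sub} \propto \exp\!\left(\frac{V - V_T}{n\,kT/q}\right), \qquad t_d \propto \frac{C\,V}{I_{sub}} \;\Rightarrow\; \frac{t_d(V - \Delta V)}{t_d(V)} \approx \exp\!\left(\frac{\Delta V}{n\,kT/q}\right)$$

Assuming $n = 1.5$ and $kT/q \approx 26$mV, a supply drop of only $\Delta V = 50$mV gives $\exp(50/39) \approx 3.6$, i.e. the delay more than triples. A shift of the same size in $V_T$ has the same effect.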

III. Properties of Sequential Design

The sequential design is shown in Fig. 1. The design flow is as follows: Verilog -> RC compiler -> Virtuoso -> Hspice -> Nanosim.

Fig. 2 shows the relationship between voltage, power and delay for the sequential design. As expected, the voltage and power dissipation drop as the allowed delay increases. At low voltage, the delay is very sensitive to variations in supply voltage and threshold voltage. The influence of production variations is simulated using a Monte Carlo method. The delay distributions for supply voltages of 0.4V, 0.45V and 0.5V are plotted in Fig. 3. A lower voltage causes a larger delay and a wider spread of the distribution. At 0.4V, the mean delay is around 15ns and the spread covers around 20ns. Thus, to ensure the reliability of the circuit, the clock period should be larger than 25ns.
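The 25ns bound follows from the Fig. 3 numbers if the 20ns spread is assumed to be roughly centered on the mean delay. The symbols $\mu$ (mean delay), $\Delta$ (total spread) and $T_{clk}$ (clock period) are introduced here for illustration only.

$$T_{clk} \geq \mu + \frac{\Delta}{2} = 15\,\text{ns} + \frac{20\,\text{ns}}{2} = 25\,\text{ns}$$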

IV. Parallel Implementation

Fig. 4 illustrates the parallel implementation. All parallel paths are simply duplicates of the sequential design. An extra shifter and a 2nd-stage accumulator are needed to combine the results from all paths. The 2nd-stage accumulator is clock-gated so that it operates only when the path outputs need to be combined; with 32 paths, for example, it operates only once every 32 clock cycles. This reduces the overhead of using parallelism.
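The following sketch shows why splitting the accumulation across paths leaves the result unchanged. It is a behavioral model under simplifying assumptions: samples are dealt to the paths in round-robin order, the shifter is not modeled, and the 2nd stage is assumed to run once per path at the end of an integration block. The function name `parallel_correlate` is hypothetical.

```python
def parallel_correlate(samples, n_paths):
    """Accumulate samples on n_paths interleaved paths, then combine.

    Returns the combined total and the number of additions performed
    by the (clock-gated) 2nd-stage accumulator.
    """
    # 1st stage: each path is a copy of the sequential accumulator and
    # sees only every n_paths-th sample, so it can run n_paths times slower.
    path_acc = [0] * n_paths
    for i, x in enumerate(samples):
        path_acc[i % n_paths] += x

    # 2nd stage: enabled only when the path outputs are combined.
    # ASSUMPTION: one addition per path at the end of the block.
    total = 0
    second_stage_adds = 0
    for partial in path_acc:
        total += partial
        second_stage_adds += 1
    return total, second_stage_adds
```

For a block of many samples, the 2nd stage performs only `n_paths` additions while the paths together perform one addition per sample, which is why its overhead stays small.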

If we ignore the overhead of the 2nd accumulator, the parallel case can be modeled as multiple copies of the sequential block, and the power consumption for different numbers of paths can be estimated from the curve in Fig. 2. Take the 400MHz signal as an example. With a 4-path parallel implementation, the clock period of each path is lengthened from 2.5ns to 10ns. From Fig. 2, the corresponding single-path power consumption is around 15uW. Thus a 4-path parallel implementation would consume 60uW, as estimated in Fig. 5. The measured power consumption is 90uW. The extra power is due to the final combining circuit, which adds not only switching and leakage power but also propagation delay.
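The estimate for the 4-path case can be reproduced with a few lines of arithmetic. The function names are hypothetical, and the 15uW single-path figure is simply the value read from Fig. 2 as quoted in the text.

```python
def path_clock_period_ns(sample_rate_mhz, n_paths):
    """Clock period available to each path when samples are shared by n_paths."""
    return n_paths * 1e3 / sample_rate_mhz


def ideal_parallel_power_uw(single_path_power_uw, n_paths):
    """Power of n_paths copies of the sequential block, ignoring the combiner."""
    return n_paths * single_path_power_uw


period_ns = path_clock_period_ns(400, 4)        # 400MHz signal, 4 paths
estimated_uw = ideal_parallel_power_uw(15, 4)   # 15uW per path read from Fig. 2
overhead_uw = 90 - estimated_uw                 # measured 90uW minus the estimate
print(period_ns, estimated_uw, overhead_uw)
```

The difference between the measured and estimated values, 30uW here, is the part attributed to the combining circuit.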

Increasing the number of paths allows more voltage scaling, but each added path also adds its own power dissipation, so there is a balance between the two factors. Fig. 5 shows that the minimum power consumption is 50uW, reached with a 16-path parallel implementation. Compared to the 130uW needed for the sequential implementation, this represents a power reduction of over 50%.
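Written out with the two values quoted from Fig. 5, the reduction is:

$$\frac{130\,\mu\text{W} - 50\,\mu\text{W}}{130\,\mu\text{W}} = \frac{80}{130} \approx 0.615$$

That is, the 16-path implementation saves roughly 62% of the power of the sequential implementation.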

V. Pipeline Implementation

Receivers for the 40MHz signal cannot benefit much from the parallel method. According to Fig. 2, the 25ns clock period of the sequential design is already sufficient to absorb the nominal delay increase caused by voltage reduction. However, Fig. 3 shows that the delay varies considerably at low voltage, so an additional delay allowance is needed for reliable operation. For a 40MHz signal, pipelining is more advantageous than parallelism in terms of both power saving and layout area.

In the sequential design, the 3 components are naturally implemented as 3 pipeline stages. Fig. 6 shows their propagation delays. The accumulator has significantly more delay than the other two components and also suffers the most from delay variation. In Fig. 7, we introduce a method for building a multiple-stage pipelined accumulator. Fig. 8 shows that this significantly improves the delay at low voltages. This is exactly what the 40MHz GPS receivers need for reliable operation.
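Since Fig. 7 is not reproduced here, the following is only a sketch of one common way to pipeline an accumulator, and not necessarily the circuit used in this work. It assumes the 22-bit addition is split into two 11-bit halves, with the carry out of the low half registered and consumed by the high half one cycle later. The class name and the 11/11 split are hypothetical.

```python
WIDTH = 22                    # accumulator width given in the text
HALF = 11                     # ASSUMPTION: split into two equal halves
HALF_MASK = (1 << HALF) - 1


def to_signed(value, width=WIDTH):
    """Interpret the low `width` bits of value as a two's complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


class PipelinedAccumulator:
    """2-stage accumulator: low half adds first, high half one cycle later."""

    def __init__(self):
        self.low = 0           # low 11 bits of the running sum
        self.high = 0          # high 11 bits of the running sum
        self.carry_reg = 0     # pipeline register: carry out of the low half
        self.high_in_reg = 0   # pipeline register: high bits of the input

    def clock(self, sample):
        x = sample & ((1 << WIDTH) - 1)   # sign-extend the input to 22 bits
        # Stage 2: uses only values registered in the previous cycle,
        # so its adder is half as long as the full 22-bit adder.
        self.high = (self.high + self.high_in_reg + self.carry_reg) & HALF_MASK
        # Stage 1: add the low half and register the carry for the next cycle.
        low_sum = self.low + (x & HALF_MASK)
        self.low = low_sum & HALF_MASK
        self.carry_reg = low_sum >> HALF
        self.high_in_reg = x >> HALF

    def flush(self):
        """Clock in a zero so the last carry propagates, then read the sum."""
        self.clock(0)
        return to_signed((self.high << HALF) | self.low)
```

Each stage now contains only a short adder, which shortens the critical path at the cost of one extra cycle of latency before the final sum can be read.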

VI. Conclusion

Our project explores low-power design methods for GPS receivers handling the 40MHz and 400MHz signals. We show that 400MHz receivers should use parallelism, which results in a power reduction of over 50%. For 40MHz receivers, a multi-stage pipelined accumulator is needed for circuit reliability.

Acknowledgments

The authors gratefully acknowledge Joyce Wong, Dennis Daly, and Prof. A. Chandrakasan for helpful discussion.

References:

[1] Vivienne Sze, Anantha P. Chandrakasan, “A 0.4-V UWB Baseband Processor,” ISLPED’07, August 27–29, 2007, Portland, Oregon, USA

Figure 1: GPS Receiver Diagram

Figure 2: Voltage/Power vs. Delay for Sequential Design

Figure 3: Monte Carlo Delay Distribution

Figure 4: Parallel Implementation Diagram

Figure 5: Predicted and Measured Power for 400MHz Signal

Figure 6: Delay of 3 Components of Sequential Design

Figure 7: 2-stage Pipelined Accumulator

Figure 8: Delay vs. Voltage for Multi-stage Accumulator