# Lecture 7

LPset6 is due Wednesday October 3 (not Tuesday)
Lab 3 is Due next Tuesday October 2



## Pong in History:

http://www.pong-story.com/gi.htm



AY-3-8500 "Ball-and-Paddle" chip

https://commons.wikimedia.org/wiki/File:AY-3-8500.jpg

#### Toward FSM Modularity

Consider the following abstract FSM:



- Suppose that each set of states  $a_x...d_x$  is a "sub-FSM" that produces exactly the same outputs.
- Can we simplify the FSM by removing equivalent states?
   No! The outputs may be the same, but the next-state transitions are not.
- This situation closely resembles a procedure call or function call in software...how can we apply this concept to FSMs?

Acknowledgements: Rex Min

## The Major/Minor FSM Abstraction



- Subtasks are encapsulated in minor FSMs with common reset and clock
- Simple communication abstraction:
  - START: tells the minor FSM to begin operation (the call)
  - BUSY: tells the major FSM whether the minor is done (the return)
- The major/minor abstraction is great for...
  - Modular designs (always a good thing)
  - Tasks that occur often but in different contexts
  - Tasks that require a variable/unknown period of time
  - Event-driven systems

#### Inside the Major FSM



#### Variations:

- Usually don't need both Step 1 and Step 3
- One cycle "done" signal instead of multi-cycle "busy"

#### Inside the Minor FSM





#### Optimizing the Minor FSM

Good idea: de-assert BUSY one cycle early



#### Bad idea #1:

T<sub>4</sub> may not immediately return to T<sub>1</sub>



#### Bad idea #2:

**BUSY** never asserts!



#### A Four-FSM Example





# Four-FSM Sample Waveform



9/27/18

## Clocking and Synchronous Communication



#### Ideal world:



M1 and M2 clock edges aligned in time

#### Clock Skew



M2 clock delayed with respect to M1 clock

## Low-skew Clocking in FPGAs



#### Goal: use as few clock domains as possible

Suppose we wanted clocks at f/2, f/4, f/8, etc.:

```
reg clk2,clk4,clk8,clk16;
always @(posedge clk) clk2 <= ~clk2;
always @(posedge clk2) clk4 <= ~clk4;
always @(posedge clk4) clk8 <= ~clk16;
always @(posedge clk8) clk16 <= ~clk16;</pre>
```





#### Solution: 1 clock, many enables

Use one (high speed) clock, but create enable signals to select a subset of the edges to use for a particular piece of sequential logic

```
reg [3:0] count;
always @(posedge clk) count <= count + 1;  // counts 0..15

wire enb2 = (count[0] == 1'b1);
wire enb4 = (count[1:0] == 2'b11);
wire enb8 = (count[2:0] == 3'b111);
wire enb16 = (count[3:0] == 4'b1111);
</pre>

always @(posedge clk)
if (enb2) begin
    // get here every 2nd cycle
end
```



#### Using External Clocks

#### **IOB** stands for Input/Output Block

Sometimes you need to communicate synchronously with circuitry outside of the FPGA (memories, I/O, ...)

Problem: different delays along internal paths for DATA and CLK change timing relationship

#### **Solutions:**

- 1) Bound internal delay from pin to internal reg; add that delay to setup time  $(t_{SU})$  specification
- 2) Make internal clock edge aligned with external clock edge (but what about delay of pad and clock driver)



## 1) Bound Internal Data Delay

Solution: use registers built into the IOB pin interface:



Virtex-II IOB Block

#### 2) Align external and internal clocks



9/27/18

#### Example: Labkit ZBT interface

The upper DCM is used to generate the de-skewed clock for the external ZBT memories. The feedback loop for this DCM includes a 2.0 inch long trace on the labkit PCB and matches in distance all of the PCB traces from the FPGA to the ZBT memories. The propagation delay from the output of the upper DCM back to its CLKFB input should be almost exactly the same as the propagation delay from the DCM output to the ZBT memories.



The lower DCM is used to ensure that the fpga\_clock signal, which clocks all of the FPGA flip-flops, is in phase with the reference clock (clock\_27mhz).

# Generating Other Clock Frequencies (again)

The labkit has a 27MHz crystal (37ns period). Use DCM to generate other frequencies e.g., 65MHz to generate 1024x768 VGA video.



The DCM (ISE only) can also synthesize certain multiples of the CLKIN frequency (eg, multiples of 27MHz):

$$f_{CLKFX} = \left(\frac{M}{D}\right) f_{CLKIN}$$

Where M = 2-32 and D = 2-32 with a output frequency of range of 24MHz to 210MHz.

Vivado uses a Clock Wizard to simplify clock generation.

9/27/18 6.111 Fall 2018 19

#### Verilog to generate 65MHz clock

```
// use FPGA's digital clock manager to produce a
// 65MHz clock (actually 64.8MHz)
wire clock_65mhz_unbuf,clock_65mhz;
DCM vclk1(.CLKIN(clock_27mhz),.CLKFX(clock_65mhz_unbuf));
// synthesis attribute CLKFX_DIVIDE of vclk1 is 10
// synthesis attribute CLKFX_MULTIPLY of vclk1 is 24
// synthesis attribute CLK_FEEDBACK of vclk1 is NONE
// synthesis attribute CLKIN_PERIOD of vclk1 is 37
BUFG vclk2(.0(clock_65mhz),.I(clock_65mhz_unbuf));
```

$$f_{CLKFX} = \left(\frac{24}{10}\right) (27MHz) = 64.8MHz$$

# Where do we get frequencies?



16MHz Crystal

- Most frequencies come from Crystal Oscillators made of quartz
- Equivalent to very High-Q LRC tank circuits
- https://en.wikipedia.org/wiki/Crystal\_oscillator\_frequencies
- Incorporate into circuit like that below and boom, you've got a square wave of some specified frequency dependent largely on

the crystal



http://www.z80.info/uexosc.htm https://en.wikipedia.org/wiki/Crystal oscillator

# High Frequencies

Very hard to get a crystal oscillator to operate above ~200 MHz (7<sup>th</sup> harmonic of resonance of crystal itself, which usually is limited to about 30 MHz due to fabrication limitations)

Where does the 2.33 GHz clock of my iPhone come from then?

Frequency Multipliers!

# Voltage Controlled Oscillator

• It is very easy to make voltage-controlled oscillators that run up to 1GHz or more.

Why don't we just:



• Pick the voltage  $V_i$  that is needed to get the frequency we want  $f_o$ ? That's gotta be specified right?



A simple VCO (not type found in FPGA)

 Same reason we don't see op amps in open loop out in the wild...they are too unstable...gotta place them in negative feedback

# Phase Locked Loop

- Place the unstable, but capable VCO in a feedback loop.
- This type of circuit is a phase-locked loop variant



9/27/18

# Phase Locked Loop

 Circuit that can track an input phase of a system and reproduce it at the output



# Phase, Frequency Detector



9/27/18

#### Phase Detector

Can be a simple XOR gate





• If near the desired frequency already this can work...if it is too far out, it won't and can be very unreliable since phase and frequency are related but not quite the same thing, it will lock onto harmonics, etc...

- For frequency we instead use a PFD:
  - Phase/Frequency Detector:

# Phase-Frequency Detection

- Detects both change and which clock signal is consistently leading the other one
- Using MOSFETs you charge/discharge a capacitor accordingly which also with some resistors low-pass filter's the signal
- The output voltage is then roughly proportional to the frequency error!



http://www.globalspec.com/reference/72819/203279/2-7-phase-detectors-with-charge-pump-output

9/27/18

# Phase Frequency Detection



# PFD, Charge Pump, LP Filter



- So this circuit can make  $f_0 = f_{ref}$  That doesn't help us!
- How can we make a higher frequency?

#### Use Resistors in Voltage Divider in Feedback Path!



 A voltage divider in feedback path gives us voltage gain!

$$K = \frac{1}{1 - p + G} \quad p \approx 0.9999 \text{ means} \qquad K = \frac{1}{G}$$

 $G = \frac{R_1}{R_1 + R_2}$ 

The gain  $A_v$  of this circuit is therefore:

$$A_v = \frac{R_1 + R_2}{R_1}$$

The gain of a "non-inverting amplifier"

# Use a Clock Divider in Feedback Path!

 A clock divider in feedback path gives us clock gain!



# Use a Clock Divider in Feedback Path!



```
reg clk2,clk4,clk8,clk16;
always @(posedge clk) clk2 <= ~clk2;
always @(posedge clk2) clk4 <= ~clk4;
always @(posedge clk4) clk8 <= ~clk16;
always @(posedge clk8) clk16 <= ~clk16;</pre>
```

#### RESETing to a known state

Just after configuration, all the registers/memories are in a known state (eg, default value for regs is 0). But you may need to include a RESET signal to set the initial state to what you want. Note the Verilog initial block only works in simulation and has no effect when synthesizing hardware.

Solution: have your logic take a RESET signal which can be asserted on start up and by an external push button:

#### Debugging: making the state visible

To figure out what your circuit is doing it can be very useful to include logic that makes various pieces of state visible to the outside world. Some suggestions:

- turn the leds on and off to signal events, entry into particular pieces of code, etc.
- use the 16-character fluorescent display to show more complex state information
- drive useful data onto the ANALYZER pins and use the adapters to hook them up to the logic analyzer. Include your master clock signal and the configure the logic analyzer to sample the data on the non-active edge of the clock (to avoid setup and hold problems introduced by I/O pad delays). The logic analyzer can capture thousands of cycles of data and display the results in useful ways (including interpreting multi-bit data as samples of an analog waveform).
- magnivue can get down to 125 ps sample period! Great for finding glitches

## Encoding numbers

It is straightforward to encode positive integers as a sequence of bits. Each bit is assigned a weight. Ordered from right to left, these weights are increasing powers of 2. The value of an n-bit number encoded in this fashion is given by the following formula:

$$v = \sum_{i=0}^{n-1} 2^i b_i$$

Oftentimes we will find it convenient to cluster groups of bits together for a more compact notation. Two popular groupings are clusters of 3 bits and 4 bits.



| <mark>0</mark> 3720            | <mark>0x</mark> 7d0          |
|--------------------------------|------------------------------|
| Octal - base 8                 | Hexadecimal - base 16        |
| 000 - 0                        | 0000 - 0 1000 - 8            |
| 001 - 1                        | 0001 - 1 1001 - 9            |
| 010 - 2                        | 0010 - 2 1010 - a            |
| 011 - 3                        | 0011 - 3 1011 - b            |
| 100 - 4                        | 0100 - 4 1100 - c            |
| 101 - 5                        | 0101 - 5 1101 - d            |
| 110 - 6                        | 0110 - 6 1110 - e            |
| 6.111 Fall 2018 <b>111 - 7</b> | $0111 - 7  1111_{\bar{3}6}f$ |

#### Binary Representation of Numbers

#### How to represent negative numbers?

- Three common schemes:
  - sign-magnitude, ones complement, twos complement
- <u>Sign-magnitude</u>: MSB = 0 for positive, 1 for negative
  - Range:  $-(2^{N-1}-1)$  to  $+(2^{N-1}-1)$
  - Two representations for zero: 0000... & 1000...
  - Simple multiplication but complicated addition/subtraction
- Ones complement: if N is positive then its negative is  $\overline{N}$ 
  - Example: 0111 = 7, 1000 = -7
  - Range:  $-(2^{N-1}-1)$  to  $+(2^{N-1}-1)$
  - Two representations for zero: 0000... & 1111...
  - Subtraction is addition followed by end-around carry (subtraction is different from addition unit)

#### Representing negative integers

To keep our arithmetic circuits simple, we'd like to find a representation for negative numbers so that we can use a single operation (binary addition) when we wish to find the sum of two integers, independent of whether they are positive or negative.

We certainly want A + (-A) = 0. Consider the following 8-bit binary addition where we only keep 8 bits of the result:

which implies that the 8-bit representation of -1 is 11111111. More generally



9/27/18 6.111 Fall 2018 38

## Signed integers: 2's complement



8-bit 2's complement example:

$$11010110 = -2^7 + 2^6 + 2^4 + 2^2 + 2^1 = -128 + 64 + 16 + 4 + 2 = -42$$

If we use a two's complement representation for signed integers, the same binary addition mod 2<sup>n</sup> procedure will work for adding positive and negative numbers (don't need separate subtraction rules). The same procedure will also handle unsigned numbers!

By moving the implicit location of "decimal" point, we can represent fractions too:

$$1101.0110 = -2^3 + 2^2 + 2^0 + 2^{-2} + 2^{-3} = -8 + 4 + 1 + 0.25 + 0.125 = -2.625$$

9/27/18 6.111 Fall 2018 39

#### Sign extension

Consider the 8-bit 2's complement representation of:

$$42 = 00101010$$
  $-5 = \sim 00000101 + 1$   
= 11111010 + 1  
= 11111011

What is their 16-bit 2's complement representation?

9/27/18

#### Using Signed Arithmetic in Verilog

**BE CAUTIOUS** 

```
rea signed [63:0] data;
wire signed [7:0] vector;
input signed [31:0] a;
function signed [128:0] alu;
16'hC501 //an unsigned 16-bit hex value
16'shC501 //a signed 16-bit hex value
                                        16'shC501
Use care with signed arithmetic!
wire signed [7:0] total;
wire [3:0] counter; // max value 15, counting widgets off the mfg line
wire signed [5:0] available;
assign total = available + counter; // does this give the correct answer?
//NO! counter = 4'b1111 is treated as -1. Need to "append" a leading zero
assign total = available + {1'b0, counter}; // or use $unsigned()
assign total = available + $unsigned(counter);
```

## Using Signed Arithmetic in Verilog

"<<<" and ">>>" tokens result in arithmetic (signed) left and right shifts: multiple by 2 and divide by 2.

Right shifts will maintain the sign by filling in with sign bit values during shift

```
wire signed [3:0] value = 4'b1000; // -8
value >> 2 // results in 0010 or 2
value >>> 2 // results in 1110 or -2
```

## Using Signed Arithmetic in Verilog

# ALL OF THE FOLLOWING ARE TREATED AS UNSIGNED IN VERILOG!!!

- Any operation on two operands, unless both operands are signed
- Based numbers (e.g. 12'd10), unless the explicit "s"

modifier is used)

- Bit-select results a[5]
- Part-select results a[4:2]
- Concatenations

```
reg [15:0] a; // Unsigned
reg signed [15:0] b;
wire signed [16:0] signed_a;
wire signed [31:0] a_mult_b;

assign signed_a = a;//Convert to signed
assign a_mult_b = signed_a * b
```

Example of multiplying signed by unsigned

http://billauer.co.il/blog/2012/10/signed-arithmetics-verilog/

#### Memory Controller



#### 1GB RAM





# **FSM**



#### Potentially Glitchy Solution

```
module (
  input req, clk,
 output ras, mux, cas
  );
                                               REQ
reg [3:0] state, next state:
                                               RAS
parameter [3:0] STATE 0 = 0; // 0000
parameter [3:0] STATE 1 = 1; // 0001
parameter [3:0] STATE 2 = 2; // 0010
                                               MUX
parameter [3:0] STATE 3 = 3; // 0011
parameter [3:0] STATE 4 = 4; // 0100
                                               CAS
                                                       State 0 State 1
                                                                               State 0
always @(posedge clk) state <= next state;</pre>
                                                                  State 2 | State 4
always @ * begin
                                                                       State 3
  case (state)
     STATE 0: next state = req ? STATE 1 : STATE 0;
     STATE 1: next state = STATE 2;
     STATE 2: next state = STATE 3;
     STATE 3: next state = STATE 4;
     STATE 4: next state = STATE 0;
     default: next state = state 0;
  endcase
end
assign ras = !((state==STATE 1)||(state==STATE 2)||(state==STATE 3)||(state==STATE 4));
assign mux = (state==STATE 2) | (state==STATE 3) | (state==STATE 4);
assign cas = !((state==STATE 3)||(state==STATE 4));
endmodule
```

## FSM Output Glitching

- FSM state bits may not transition at precisely the same time
- Combinational logic for outputs may contain hazards
- Result: your FSM outputs may glitch!



If the soda dispenser is glitch-sensitive, your customers can get a 20-cent soda!

#### Registered FSM Outputs are Glitch-Free



- Move output generation into the sequential always block
- Calculate outputs based on next state
- Delays outputs by one clock cycle. Problematic in some application.

#### Glitchy Solution

```
module (
  input req, Klk,
  output reg ras, mux, cas
  );
                                               REQ
reg [3:0] state, next state:
                                               RAS
parameter [3:0] STATE 0 = 0; // 0000
                                               MUX
parameter [3:0] STATE 1 = 1; // 0001
parameter [3:0] STATE 2 = 2; // 0010
parameter [3:0] STATE 3 = 3; // 0011
                                               CAS
parameter [3:0] STATE 4 = 4; // 0100
                                                       State 0 State 1
                                                                               State 0
                                                                  State 2 State 4
always @(posedge clk) state <= next state;
                                                                       State 3
always @ * begin
  case (state)
     STATE 0: next state = req ? STATE 1 : STATE 0;
     STATE 1: next state = STATE 2;
     STATE 2: next state = STATE 3;
     STATE 3: next state = STATE 4;
     STATE 4: next state = STATE 0;
                                      always @(posedge clk) begin
     default: next state = state 0;
                                        ras <= !((next state==STATE 1)||(next state2) . .</pre>
  endcase
end
assign ras = \frac{1}{2} (state==STATE 1) | | (state==STATE 2) | | (state==STATE 4));
assign mux = (state==STATE 2) | | (state==STATE 4);
assign cas = !((state==STATE 3)||(state==STATE 4));
endmodule
  9/27/18
                                       6.111 Fall 2018
                                                                                     50
```

#### Another Glitch Free Solution

```
module (
  input req, clk,
  output reg ras, mux, cas
                                ras
  );
                                   mux
                                                 REQ
                                     cas
reg [3:0] state, next state:
                                                 RAS
parameter [3:0] STATE 0 = 4'b1010;
parameter [3:0] STATE 1 = 4'b0010;
                                                 MUX
parameter [3:0] STATE 2 = 4'b0110;
parameter [3:0] STATE 3 = 4'b0100;
                                                 CAS
parameter [3:0] STATE 4 = 4'b0101;
                                                         State 0 State 1
                                                                                  State 0
always @(posedge clk) state <= next state;</pre>
                                                                     State 2
                                                                             State 4
always @ * begin
                                                                         State 3
  case (state)
     STATE 0: next state = req ? STATE 1 : STATE 0;
     STATE 1: next state = STATE 2;
     STATE 2: next state = STATE 3;
     STATE 3: next state = STATE 4;
                                                                 Hint: You will need four bits for
     STATE 4: next state = STATE 0;
                                                                 your state variable.
     default: next state = STATE 0;
  endcase
end
assign {ras, mux, cas} = {state[3], state[2], state[1]};
endmodule
```

### Alternative Verilog

```
module (
  input req, clk,
  output reg ras, mux, cas
);

reg [3:0] state, next_state:

parameter [3:0] STATE_0 = 4'b1010;
parameter [3:0] STATE_1 = 4'b0010;
parameter [3:0] STATE_2 = 4'b0110;
parameter [3:0] STATE_3 = 4'b0100;
parameter [3:0] STATE_4 = 4'b0101;

always @(posedge clk) state <= next_state;</pre>
```

```
always @ * begin
  case (state)

  STATE_0: next_state = req ? STATE_1 : STATE_0;
  STATE_1: next_state = STATE_2;
  STATE_2: next_state = STATE_3;
  STATE_3: next_state = STATE_4;
  STATE_4: next_state = STATE_0;
  default: next_state = STATE_0;
  endcase
end

assign {ras, mux, cas} = {state[3], state[2], state[1]};
```

```
REQ
RAS
MUX
CAS
State 0 State 1 State 0
State 2 State 4
State 3
```

```
// next_state not needed

always @(posedge clk) begin
  case (state)
    STATE_0: state <= req ? STATE_1 : STATE_0;
    STATE_1: state <= STATE_2;
    STATE_2: state <= STATE_3;
    STATE_3: state <= STATE_4;
    STATE_4: state <= STATE_0;
    default: state <= STATE_0;
    endcase
end</pre>
```

endmodule

### Verilog Grading

#### Logistics

- Verilog submission with 2 days after lab checkoff. Lab must be checkoff first.
- Resubmission for regrade permitted for Lab 2 and Lab 3 only (email grader for regrading)

#### Grading

- Proper use of blocking and non-blocking assignments
- Readable Code with comments and consistent indenting
- Use of default in case statement
- Use of parameter statements for symbolic name and constants (state==5 vs state==DATA\_READY)
- Parameterized modules when appropriate
- Readable logical flow, properly formatted (see "Verilog Editors")
- No long nested if statements.
- 20% off for each occurrence.

# Nexys 4 - DDR



#### Low Cost FPGA Boards



- Basys3
  - Artix-7 FPGA
  - 12 bit VGA
  - Switches/LEDs
  - \$89 (9/2017)
  - Vivado Webpack



- PYNQ-Z1
  - 650MHz dual core Cortex A9
  - Artix-7
  - 512MB DDR3
  - \$65 (9/2017)
  - Vivado Webpack