Consider the following circuit:

The heavy lines represent buses, which are many signals grouped together; e.g., an eight-bit bus is eight separate signal wires that are treated as a group. When a bus connects to one or more components, it's just shorthand for drawing the individual signal wires between those components. The components with triangular schematic symbols are tristate drivers, which operate like buffers except that they have an additional control input called an enable. When the enable is high, the buffer is on and the input value is driven onto the output. When the enable is low, the buffer is off and doesn't drive anything onto its output (i.e., the output pin is in a high-impedance state).

What rules should the designer follow when designing the logic that generates DRA, DRB and DRALU in order to ensure that the DBUS signals always have legitimate values?

The designer should have at most one of the signals DRA, DRB, or DRALU asserted during any clock cycle. While these signals are being computed, it is possible that more than one might be asserted simultaneously due to logic glitches or computation paths of different length. The designer should take care to eliminate or at least minimize this occurrence.

Discussed in section.

Draw a schematic showing how a tristate driver might be implemented using MOSFETs. Hint: The following schematic shows one way of implementing a tristate driver.

You just have to fill in the logic inside each of the clouds: think about the values of DATA and ENABLE for which you want the pullup to be on, and replace the upper cloud with one or more logic gates that implement that equation. Ditto for the pulldown and lower cloud.

Discussed in section.

The register-like symbols labeled "Reg A" and "Reg B" also have an additional enable input and are called load-enabled registers. When the enable is high, the register will be loaded from the incoming data. When the enable is low, the register reloads itself with its previous value. Show how to implement a load-enabled register from a regular D-register and a 2-way multiplexer.

Just add the mux before the register to clock in either the old value or the new value.
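The mux-before-register arrangement can be sketched behaviorally. This is a minimal model, not a gate-level design; the class name `LoadEnabledRegister` and its `clock` method are hypothetical names chosen for illustration.

```python
class LoadEnabledRegister:
    """Behavioral model: a plain D-register fed by a 2-way mux (illustrative names)."""

    def __init__(self, initial=0):
        self.q = initial  # value currently held by the D-register

    def clock(self, data, enable):
        # 2-way mux: pick the incoming data when enable is high,
        # otherwise feed the register's own output back to its input.
        d = data if enable else self.q
        # Rising clock edge: the plain D-register always loads its D input.
        self.q = d
        return self.q


reg = LoadEnabledRegister(initial=0x12)
reg.clock(data=0x34, enable=0)  # enable low: the register reloads its old value
held = reg.q
reg.clock(data=0x34, enable=1)  # enable high: the register loads the new data
loaded = reg.q
```

Note that the register sees every clock edge; only the data presented to it changes, which is what distinguishes this approach from gating the clock.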

It's considered bad practice to control the loading of a register by "gating" its clock, i.e., by adding some logic that controls whether or not the register sees a rising clock edge. Briefly explain why "gating the clock" is discouraged. Hint: consider the effects of clock skew and logic hazards.

"Gating" the clock is not good practice for 2 reasons:

(1) The possibility of glitches in the load enable signal. Combinational logic often controls the load enable signal, so there may be periods of time when the load enable momentarily changes value. If load enable is supposed to remain low, but changes to a high value while CLK has a high value, then the register will see a rising clock edge and sample its inputs. This unwanted sampling of its inputs may cause the register to remember incorrect values, or even worse, enter a metastable state if the inputs are changing.

(2) Clock skew. Placing a logic gate in front of the CLK input of a register introduces delay, as the logic gate has intrinsic delay. This delay skews the clock signal, and as we have seen, clock skew can require slower clocking of the circuit or could cause hold time requirements to be violated.

The arithmetic-logic unit (ALU) has two data inputs (A and B) and, in this circuit, can perform only two operations, based on the single control signal FN:

The ALU also generates two condition codes which give us some additional information about the ALU output:

Assuming that we have 8-bit data values and use a two's complement representation for the data values processed by the ALU, draw gate-level schematics for the logic that generates the Z and N signals from the ALU output value.

Discussed in section.

Your job is to build a controller that will cause the circuit above to execute the following algorithm, which computes the greatest common divisor of two inputs:

	while (a != b)
		if (a > b) a = a - b;
	 	else b = b - a;
The controller will be a state machine that takes 2 bits of input (Z and N) and produces control signals for the data paths (DRA, DRB, DRALU, LDA, LDB, FN).
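To see why Z and N are enough to steer the algorithm, here is a small behavioral sketch. It rests on assumptions that are not in the text: that FN selects between A-B and B-A, that Z is 1 when all eight result bits are 0, that N is the sign bit, and that both inputs are positive and below 128 so the 8-bit subtraction never overflows. The function names are hypothetical.

```python
MASK = 0xFF  # 8-bit datapath


def alu(a, b, fn):
    """Assumed ALU: fn=0 computes A-B, fn=1 computes B-A (an assumption)."""
    result = ((a - b) if fn == 0 else (b - a)) & MASK
    z = int(result == 0)   # Z: 1 when every output bit is 0
    n = (result >> 7) & 1  # N: the two's complement sign bit (bit 7)
    return result, z, n


def gcd_datapath(a, b):
    """Run the GCD loop making decisions only from Z and N.

    Assumes 0 < a, b < 128 so that N reliably indicates a < b.
    """
    while True:
        diff, z, n = alu(a, b, fn=0)  # compute A-B and inspect the flags
        if z:                         # Z set: a == b, so the loop ends
            return a
        if n:                         # N set: A-B is negative, so a < b
            b, _, _ = alu(a, b, fn=1) # b = b - a
        else:                         # otherwise a > b
            a = diff                  # a = a - b
```

In a real controller each branch of the loop would become one or more states, with LDA or LDB asserted in the state that writes the result back.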

Draw a state diagram for the controller. Outputs from your FSM should depend only on the current state. Indicate which are the initial and final states of your FSM on the diagram.

Discussed in section.

Supply a truth table for the logic that generates the control signals.

The material in this question will not be covered by any quizzes. It's presented here as an extended example of a programmable datapath.

The following diagram shows the datapath and control circuitry for a nifty little microprogrammed architecture the students used to build in the 6.004 lab:

Some features of the MAYBE:

If we inadvertently switch connections on two of the wires that run from the MAR register to the address inputs of the SRAM, will operation be affected? Is your answer the same if we switch two wires running between ADRHI/ADRLO and the address inputs to the UROM? If the answers are not the same, what constitutes the difference between the SRAM and UROM?

Reordering connections between the MAR and the address inputs of the SRAM won't have any observable effect, since there will still be a unique location for each possible MAR value (the actual location in SRAM will change, but who could tell?).
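The SRAM argument can be checked with a short sketch: swapping two address wires permutes the addresses one-to-one, so anything written through the swapped wires is read back through the same wires. The 8-bit address width and the choice of wires 0 and 3 are assumptions made for illustration, and `swap_bits` is a hypothetical helper.

```python
def swap_bits(addr, i, j):
    """Return addr with bits i and j exchanged (models two swapped address wires)."""
    bit_i, bit_j = (addr >> i) & 1, (addr >> j) & 1
    if bit_i != bit_j:
        addr ^= (1 << i) | (1 << j)  # flipping both bits exchanges them
    return addr


ADDR_BITS = 8  # assumed address width, for illustration only
SIZE = 1 << ADDR_BITS

# Every logical address still lands on its own physical cell.
physical = [swap_bits(a, 0, 3) for a in range(SIZE)]
is_one_to_one = len(set(physical)) == SIZE

# Write through the swapped wires, then read back through the same wires.
ram = [0] * SIZE
for a in range(SIZE):
    ram[swap_bits(a, 0, 3)] = a ^ 0x5A
reads_ok = all(ram[swap_bits(a, 0, 3)] == a ^ 0x5A for a in range(SIZE))
```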

Switching the ADRHI/ADRLO connections could be detected, since after the switch incrementing the registers would not fetch the immediately adjacent location. If we also permuted the contents of the UROM to match the change in address wiring, the switch would not be detectable.

What, if anything, prevents two drivers from putting conflicting data on the data bus of the MAYBE (ignore transients during propagation delays of the control circuitry)? Can such conflicts happen if there are programming errors in the Control ROM?

The data bus drivers are controlled by a 3-to-8 decoder. For any given 3-bit input, this device asserts only one of its outputs, so no conflicts are possible. Programming errors might result in the wrong value being driven onto the bus, but never multiple values at the same time.

Given a big enough Control ROM, could the LDSEL and DRSEL decoders be eliminated (producing the load and drive signals directly as Control ROM outputs)? If so, what advantage might this have?

Yes, simply replace each 3-bit control value that drives the decoder with an 8-bit value that connects directly to the LDxx or DRxx control signals. Now each signal could be asserted independently, perhaps in concert with other signals. This isn't useful for the DRxx signals (see answer to previous question), but would allow several registers to be loaded with the same data bus value simultaneously.

Execution of a nanoprogram can be influenced by information from the datapath. Explain how a nanoprogram can make data-dependent decisions.

The low-order control ROM address bit comes from a shift register that is loaded with condition codes from the ALU. Non-data-dependent nanoinstructions are loaded twice into consecutive even/odd locations of the control ROM, so the address bit from the shift register will select the same instruction regardless of whether it is 0 or 1. However, if different instructions are loaded into the even/odd locations, the nanoprogram will execute differently depending on the output of the shift register.

By shifting the shift register before executing the data-dependent nanoinstruction, it's possible to use any of the latched condition codes.

Given a big enough Control ROM, could the condition shift register be eliminated (using the condition bits directly as Control ROM inputs)? If so, what advantage might this have? How many more (or fewer) outputs and inputs would the Control ROM need to have to implement this? What would be the size (in bits) of the Control ROM?

In theory, if we used the 7 condition code signals as additional address signals, we could test all 7 bits at the same time and execute one of 128 different instructions as a result. We might want to add a latch-enabled register to capture the signals on some specific cycle and save them for testing at some later cycle (the current design does this using the CONDCTL signals).

With 7 condition-code address inputs in place of the single shift-register bit, the Control ROM would grow from 2^13 locations to 2^19 locations. Assuming we need to control a latch-enabled register to capture the signal values, the number of control outputs would decrease by 1 (CONDCTL would go from 2 bits to 1).

The nanoinstruction shown above selects the UROM as the data source and asserts ADR+ during the same clock cycle. How does this work, i.e., is the original or incremented address used when accessing the UROM?

The increment happens at the end of the clock cycle (i.e., at the next rising edge of the clock), so for this current clock cycle the original address is used.

What does the following nanocode program do?

Opcode    Phase  COND  =  ADR+  ALU     CC  DRSEL  LDSEL  Comment
00001010  0000   *     =  1     111111  11  001    101    MAR = uROM; ADR+
00001010  0001   *     =  0     111111  11  100    010    A = SRAM
00001010  0010   *     =  1     111111  11  001    101    MAR = uROM; ADR+
00001010  0011   *     =  0     111111  11  100    011    B = SRAM
00001010  0100   *     =  1     111111  11  001    101    MAR = uROM; ADR+
00001010  0101   *     =  0     100110  00  010    100    SRAM = A + B; latch CCs
00001010  0110   *     =  1     111111  11  001    000    OP = uROM; ADR+

Implements the "ADD(X,Y,Z)" microprogram instruction which stores the sum of SRAM locations X and Y into SRAM location Z. ADD has an opcode of 00001010 and takes 7 cycles to execute.
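Reading these listings is easier once each 15-bit control word is split into its fields. The sketch below assumes field widths of 1, 6, 2, 3 and 3 bits for ADR+, ALU, CC, DRSEL and LDSEL, and takes the meaning of each select code from the comments in this document's listings; both the widths and the code-to-name mapping are inferences, not a specification of the MAYBE. The function and dictionary names are hypothetical.

```python
# Select codes inferred from the listing comments (assumptions, not a data sheet).
DRSEL_NAMES = {0b001: "uROM", 0b010: "ALU", 0b100: "SRAM"}
LDSEL_NAMES = {0b000: "OP", 0b001: "ADR", 0b010: "A",
               0b011: "B", 0b100: "SRAM", 0b101: "MAR"}


def decode(word):
    """Split a 15-character binary string into control fields (assumed widths 1/6/2/3/3)."""
    if len(word) != 15 or set(word) - {"0", "1"}:
        raise ValueError("expected a 15-bit binary string")
    return {
        "adr_inc": int(word[0]),                             # ADR+
        "alu": word[1:7],                                    # ALU function bits, left undecoded
        "cc": word[7:9],                                     # condition-code control
        "drive": DRSEL_NAMES.get(int(word[9:12], 2), "?"),   # DRSEL
        "load": LDSEL_NAMES.get(int(word[12:15], 2), "?"),   # LDSEL
    }


# Control words of the ADD listing, phase 0 through phase 6.
ADD_PROGRAM = [
    "111111111001101", "011111111100010", "111111111001101",
    "011111111100011", "111111111001101", "010011000010100",
    "111111111001000",
]

for phase, word in enumerate(ADD_PROGRAM):
    f = decode(word)
    suffix = "; ADR+" if f["adr_inc"] else ""
    print(f"phase {phase}: {f['load']} = {f['drive']}{suffix}")
```

Unknown select codes decode to "?" rather than a guessed name, since only the codes that appear in the listings are known here.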
What does the following nanocode program do?

Opcode    Phase  COND  =  ADR+  ALU     CC  DRSEL  LDSEL  Comment
00001011  0000   *     =  1     111111  11  001    010    A = uROM; ADR+
00001011  0001   *     =  1     111111  11  001    101    MAR = uROM; ADR+
00001011  0010   *     =  0     111111  11  100    011    B = SRAM
00001011  0011   *     =  1     111111  11  001    101    MAR = uROM; ADR+
00001011  0100   *     =  0     100110  00  010    100    SRAM = A + B; latch CCs
00001011  0101   *     =  1     111111  11  001    000    OP = uROM; ADR+

Implements the "CADD(CX,Y,Z)" microprogram instruction which stores the sum of the constant CX and SRAM location Y into SRAM location Z. CADD has an opcode of 00001011 and takes 6 cycles to execute.
What does the following nanocode program do?

Opcode    Phase  COND  =  ADR+  ALU     CC  DRSEL  LDSEL  Comment
00001100  0000   *     =  1     111111  11  001    010    A = uROM; ADR+
00001100  0001   *     =  0     111111  11  001    001    ADR = uROM
00001100  0010   *     =  0     111111  11  010    001    ADR = A
00001100  0011   *     =  1     111111  11  001    000    OP = uROM; ADR+

Implements the "JMP(addrlo,adrhi)" microprogram instruction which changes the microcode program counter to the specified address. JMP has an opcode of 00001100 and takes 4 cycles to execute.
What does the following nanocode program do?

Opcode    Phase  COND  =  ADR+  ALU     CC  DRSEL  LDSEL  Comment
00010111  0000   *     =  1     111111  11  001    010    A = uROM; ADR+
00010111  0001   *     =  0     111111  01  010    010    Shift CC's
00010111  0010   1     =  0     111111  11  001    001    ADR = uROM
00010111  0011   1     =  0     111111  11  010    001    ADR = A
00010111  0100   1     =  1     111111  11  001    000    OP = uROM; ADR+
00010111  0010   0     =  1     111111  11  001    010    A = uROM; ADR+
00010111  0011   0     =  1     111111  11  001    000    OP = uROM; ADR+

Implements the "JNC(addrlo,adrhi)" microprogram instruction, which changes the microcode program counter to the specified address if the carry bit (captured from the ALU by some previous microinstruction) is not set. JNC has an opcode of 00010111 and, going by the phases in the listing, takes 5 cycles when the jump is taken (COND = 1) and 4 cycles when it is not (COND = 0).
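The way the COND bit picks between two versions of the same phase can be sketched as a lookup in a small model of the Control ROM. The address layout used here (8 opcode bits, then 4 phase bits, then COND as the low-order bit) is inferred from the listing columns, and the sketch assumes that COND stays fixed for the whole trace and that loading OP ends the instruction. The names `rom_address`, `load` and `trace` are hypothetical.

```python
def rom_address(opcode, phase, cond):
    """Assumed 13-bit address: opcode (8 bits), phase (4 bits), COND (low-order bit)."""
    return (opcode << 5) | (phase << 1) | cond


def load(rom, opcode, phase, cond, action):
    """Store a nanoinstruction; cond '*' fills both the even and the odd location."""
    for c in ((0, 1) if cond == "*" else (int(cond),)):
        rom[rom_address(opcode, phase, c)] = action


JNC = 0b00010111
rom = {}
load(rom, JNC, 0, "*", "A = uROM; ADR+")
load(rom, JNC, 1, "*", "Shift CC's")
load(rom, JNC, 2, "1", "ADR = uROM")
load(rom, JNC, 3, "1", "ADR = A")
load(rom, JNC, 4, "1", "OP = uROM; ADR+")
load(rom, JNC, 2, "0", "A = uROM; ADR+")
load(rom, JNC, 3, "0", "OP = uROM; ADR+")


def trace(rom, opcode, cond):
    """Step through phases 0, 1, 2, ... until the nanoinstruction that loads OP.

    Simplification: cond is held constant; before the shift both locations
    hold the same nanoinstruction, so its earlier value does not matter.
    """
    steps, phase = [], 0
    while True:
        action = rom[rom_address(opcode, phase, cond)]
        steps.append(action)
        if action.startswith("OP ="):
            return steps
        phase += 1


taken = trace(rom, JNC, 1)      # COND = 1: loads ADR from the operands
not_taken = trace(rom, JNC, 0)  # COND = 0: skips the second operand
```

With the listing above, `taken` has five steps and `not_taken` has four.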
What does the following nanocode program do?

Opcode    Phase  COND  =  ADR+  ALU     CC  DRSEL  LDSEL  Comment
00000011  0000   *     =  0     110011  11  010    101    MAR = 11111111
00000011  0001   *     =  0     111111  11  100    010    A = SRAM
00000011  0010   *     =  0     111110  11  010    100    SRAM = A - 1
00000011  0011   *     =  1     111111  11  001    101    MAR = uROM; ADR+
00000011  0100   *     =  0     111111  11  100    011    B = SRAM
00000011  0101   *     =  0     111111  11  010    101    MAR = A
00000011  0110   *     =  0     101011  11  010    100    SRAM = B
00000011  0111   *     =  1     111111  11  001    000    OP = uROM; ADR+

Implements the "PUSH(x)" microprogram instruction which decrements the microstack pointer (stored in SRAM location 255 = 0xFF) and then stores the contents of SRAM location x in the SRAM location pointed to by the microstack pointer. PUSH has an opcode of 00000011 and takes 8 cycles to execute.
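The register transfers in the PUSH listing can be followed step by step with a small model. This is a sketch that assumes a 256-location SRAM of 8-bit values and treats the operand x as already fetched from the microcode stream; the function name `push` and the example values are hypothetical.

```python
def push(sram, x):
    """Replay the PUSH listing's comments as register transfers (8-bit values assumed)."""
    mar = 0xFF                   # MAR = 11111111
    a = sram[mar]                # A = SRAM          (A now holds the stack pointer)
    sram[mar] = (a - 1) & 0xFF   # SRAM = A - 1      (store the decremented pointer)
    mar = x                      # MAR = uROM; ADR+  (operand x selects the source)
    b = sram[mar]                # B = SRAM          (the value to push)
    mar = a                      # MAR = A           (A was not changed by the decrement)
    sram[mar] = b                # SRAM = B
    return sram


sram = [0] * 256
sram[0xFF] = 0x80  # stack pointer, arbitrary example value
sram[0x10] = 0x42  # contents of location x
push(sram, 0x10)
```

In this trace the value lands at the address the pointer held when it was read into A (0x80), and location 0xFF ends up holding 0x7F.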