6.111 Lab #3

Goal: Implement a simple Pong-like game with images and sound on a VGA monitor. In place of a puck, an image of the Death Star from Star Wars will be used. (Death Star from F2017 Project by Nicholas Waltman/Mike Wang.)

During checkoff you may be asked to discuss one or more of the following questions:

Most video displays accept the image to be displayed in a serial fashion, usually a sequence of horizontal scan lines to be displayed one under another with a small vertical offset to create a raster image. Typically the raster is transmitted in left-to-right, top-to-bottom order. A complete raster image is called a frame and one can create the appearance of motion by displaying frames in rapid succession (24 frames/sec in movies, 30 frames/sec in broadcast TV, 60+ frames/sec in computer monitors).

To transmit a raster image, one must encode the color image information and provide some control signals that indicate the end of each horizontal scan line (horizontal sync) and frame (vertical sync). The display device creates the image using red, green and blue emitters, so an obvious way to encode the color information is to send separate signals that encode the appropriate intensity of red, green and blue. This is indeed how most analog computer monitors work -- they accept 5 analog signals (red, green, blue, hsync, and vsync) over a standardized HD15 connector. The signals are transmitted as 0.7V peak-to-peak (1V peak-to-peak if the signal also encodes sync). The monitor supplies a 75Ω termination for each signal, which if matched with a driver and cable with a characteristic impedance of 75Ω minimizes the interference due to signal reflections. The labkit incorporates an integrated circuit -- the ADV7125 Triple 8-bit high-speed video DAC -- which produces the correct analog signals given the proper digital inputs: three 8-bit values for R, G and B intensity, hsync, vsync, and blanking.

[Small digression on other video encodings; feel free to skip...]

When encoding a color video image for broadcast or storage, it's important to use the bandwidth/bits as efficiently as possible. And, in the case of broadcast, there was the issue of backwards compatibility with black-and-white transmissions. Since the human eye has less resolution for color than for intensity, the color image signal is separated into luminance (Y, essentially the old black-and-white signal) and chrominance (U/Cb/Pb, V/Cr/Pr). YUV are related to RGB as follows:
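One commonly quoted form of this relation, stated here as an assumption (ITU-R BT.601 luma weights, with R, G and B normalized to the range 0 to 1), is:

```latex
\begin{aligned}
Y &= 0.299\,R + 0.587\,G + 0.114\,B \\
U &= 0.492\,(B - Y) \\
V &= 0.877\,(R - Y)
\end{aligned}
```

Y is a weighted sum in which green dominates because the eye is most sensitive to it. U and V are scaled blue-minus-luminance and red-minus-luminance differences, so a grey pixel (R = G = B) has Y equal to that common value and U = V = 0.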

Luminance and chrominance are encoded separately and transmitted/stored at different bandwidths. In most systems the chrominance bandwidth is a half (4:2:2 format) or a quarter (4:2:0 format) of the luminance bandwidth. There are several common ways of transmitting Y, U and V:

Composite video where Y and the composite sync are combined to form a 1V peak-to-peak signal. U+V and U-V are used to modulate orthogonal phases of a color subcarrier (3.58MHz in NTSC broadcasts) and then mixed with a low-pass-filtered version of the Y/sync signal.
S-Video where Y and the modulated color subcarrier are transmitted on separate signal/ground pairs. This avoids the low-pass filtering of Y used in composite video, resulting in a higher-resolution video image.
Component video where Y, Cb/Pb, and Cr/Pr are transmitted on separate signal/ground pairs (Cb and Cr are just scaled versions of U and V).

Some transmission schemes break a frame into an even field (containing the even numbered scan lines) and an odd field (containing the odd numbered scan lines) and then transmit the fields in alternation. This technique is called interlacing and permits slower frame rates (and hence lower bandwidths) while still avoiding the problem of image flicker. When higher bandwidths are available, non-interlaced transmissions are preferred (often called progressive scan).

The labkit contains interface chips for encoding (ADV7194) and decoding (ADV7185) composite and S-Video signals. The decoder chip is particularly useful if you want to use a video camera signal as part of your project.

To create a video image for our Pong game, it's helpful to think of the image as a rectangular array of picture elements or pixels. There are several common choices for the dimensions (HxV) of the rectangle:

The computer monitors in the lab support resolutions up to 1280x1024 but the required pixel clock doesn't leave much time for the game logic to figure out the pixel to display, so let's go with a 1024x768 display for our game.
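To see why the lower resolution leaves the game logic more time per pixel, here is a rough comparison. It assumes the standard VESA 60Hz timings, which this handout does not state: 1344 x 806 total pixel times per frame (including blanking) for 1024x768, and 1688 x 1066 for 1280x1024.

```latex
\begin{aligned}
f_{1024\times768}  &= 1344 \times 806 \times 60\,\text{Hz} \approx 65\,\text{MHz}
  &&\Rightarrow\quad T_{\text{pixel}} \approx 15.4\,\text{ns} \\
f_{1280\times1024} &= 1688 \times 1066 \times 60\,\text{Hz} \approx 108\,\text{MHz}
  &&\Rightarrow\quad T_{\text{pixel}} \approx 9.3\,\text{ns}
\end{aligned}
```

Under these assumptions the 1024x768 mode gives the pixel logic roughly 1.7 times as long to produce each pixel.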

Please take a moment to read through the "VGA Video" hardware tutorial that's part of the on-line Labkit documentation. You'll see that the timings for the RGB image information relative to the horizontal and vertical syncs are somewhat complicated. For example, the horizontal sync goes active in the interval between the end of one scan line and the beginning of the next -- the exact timings are specified by the XVGA specification. Lab3.v includes an xvga module that generates the necessary signals; it uses two counters:
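The two-counter idea can be sketched as follows. This is not the xvga module from Lab3.v: the module name xvga_sketch and the timing constants (standard VESA 1024x768 at 60Hz, i.e. 1344 clocks per line and 806 lines per frame) are assumptions, so treat Lab3.v and the XVGA specification as the reference.

```verilog
// Illustrative sketch only; xvga_sketch is a hypothetical name, not the Lab3.v module.
// Assumed VESA 1024x768@60Hz timing with a 65MHz pixel clock:
//   horizontal: 1024 visible + 24 front porch + 136 sync + 160 back porch = 1344
//   vertical:    768 visible +  3 front porch +   6 sync +  29 back porch =  806
module xvga_sketch(
    input  wire        vclock,   // 65MHz pixel clock
    output reg  [10:0] hcount,   // pixel number on the current scan line
    output reg  [9:0]  vcount,   // scan line number in the current frame
    output wire        hsync,    // active low
    output wire        vsync,    // active low
    output wire        blank     // 1 when (hcount,vcount) is off screen
);
    // Start both counters at the top-left pixel.
    initial begin
        hcount = 0;
        vcount = 0;
    end

    wire hreset = (hcount == 11'd1343);   // last clock of a scan line
    wire vreset = (vcount == 10'd805);    // last line of a frame

    always @(posedge vclock) begin
        // hcount advances every clock and wraps at the end of each line.
        hcount <= hreset ? 11'd0 : hcount + 11'd1;
        // vcount advances once per line and wraps at the end of each frame.
        if (hreset)
            vcount <= vreset ? 10'd0 : vcount + 10'd1;
    end

    // Each sync pulse is low during the sync interval that follows the front porch.
    assign hsync = ~((hcount >= 11'd1048) && (hcount < 11'd1184));
    assign vsync = ~((vcount >= 10'd771)  && (vcount < 10'd777));

    // Blanking covers everything outside the 1024x768 visible area.
    assign blank = (hcount > 11'd1023) || (vcount > 10'd767);
endmodule
```

The sync and blank outputs here are purely combinational functions of the counters; a module that registers them would shift them by a clock relative to this sketch.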

The xvga module also generates blank, a signal that's 0 when a pixel value will be displayed and 1 when the pixel would be off the screen (hcount > 1023 or vcount > 767). The inversion of this signal is required by the ADV7125 VGA interface chip. You can use (hcount,vcount) as the (x,y) coordinate of the pixel to be displayed: (0,0) is the top-left pixel, (1023,0) is the top-right pixel, (1023,767) is the bottom-right pixel, etc. Given the coordinates and dimensions of a graphic element, your game logic can use (hcount,vcount) to determine the contribution the graphic element makes to the current pixel. If you are storing the pixels in a memory array (called a frame buffer) then the index of the current pixel would be H*vcount + hcount[9:0], where H is the number of displayed pixels in each scan line.

Generally images are stored in a compressed form to save space. Two commonly used formats are PNG (Portable Network Graphics) and JPG (Joint Photographic Experts Group). PNG uses lossless compression while JPG uses lossy compression; at moderate compression levels the human eye generally will not notice the loss in fidelity. Another format is BMP, which is typically uncompressed. With BMP, the image is stored as a two-dimensional array (frame buffer) with coordinates (i, j) corresponding to the pixel in the i^th column and j^th row of the image. Each pixel can be represented by a single bit (black or white) or by up to 24 bits for color. The Death Star image below is 256 x 240 pixels.

However, frame buffer memory in a digital system is generally organized as a flat, one-dimensional (linear) memory, with an index into a single contiguous address space. Converting from two-dimensional to linear addressing is straightforward. For a pixel at location (i, j) in our Death Star example, the linear address of that pixel is i + j*256, where 256 is the width of the image.
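As a worked example of this mapping for the 256 x 240 Death Star image (the specific pixel coordinates are chosen only for illustration):

```latex
\begin{aligned}
\text{index}(i, j)     &= i + 256\,j \\
\text{index}(10, 3)    &= 10 + 256 \times 3 = 778 \\
\text{index}(255, 239) &= 255 + 256 \times 239 = 61439
\end{aligned}
```

The largest index is 61439, so the image occupies 256 x 240 = 61440 locations and needs a 16-bit address. Because the width is a power of two, multiplying by 256 is just a left shift by 8 bits, so in hardware the address can be formed by concatenating the 8-bit row number with the 8-bit column number.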

The video DAC accepts 8 bits each for red, green and blue, for a total of 24 bits per pixel. Using 24 bits is considered true color, since any color from a palette of about 16 million (2**24) can be displayed. [Note: use "**" for exponentiation and not ^. The symbol ^ is the XOR operator in Verilog.] When memory is constrained, and it generally is, a color map is used to reduce memory usage while still producing 24-bit output. This is accomplished by limiting the number of distinct colors an image may use. In our example, using 8 bits for each pixel, we can display 256 different colors. The 8-bit value is used as an index into three color maps (one each for R, G and B), which together produce the 24-bit value sent to the VGA output. This limits the image to just 256 colors chosen from the palette of 16 million.
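A rough count of the memory saved for the 256 x 240 image, assuming one 256-entry, 8-bit-wide color map for each of the three color channels:

```latex
\begin{aligned}
\text{true color:} \quad
  & 256 \times 240 \times 24 = 1\,474\,560 \text{ bits} \\
\text{color mapped:} \quad
  & \underbrace{256 \times 240 \times 8}_{\text{image}}
    + \underbrace{3 \times 256 \times 8}_{\text{color maps}}
    = 491\,520 + 6\,144 = 497\,664 \text{ bits}
\end{aligned}
```

Under that assumption the color-mapped image needs about one third of the memory of the true-color version, and the cost of the maps themselves is small compared with the image.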

Pong was one of the first mass-produced video games, a hit more because of its novelty than because of the gaming experience itself. Our version will be a single-player variation where the player is defending a "goal" by moving a rectangular paddle up and down the left edge of the screen. The puck (Death Star) moves about the screen with a fixed velocity, bouncing off the paddle and the implicit walls at the top, right and bottom edges of the screen. If the puck (Death Star) reaches the left edge of the screen (i.e., it wasn't stopped by bouncing off the paddle), the player loses and the game is over:

A 65MHz clock serves as the system clock; one clock period is the duration of a single pixel. The positions of moving objects (e.g., the paddle and puck) are updated once every frame (1/60 second) on the high-to-low transition of vsync.

To keep the initial implementation easy, let's make the puck a 64-pixel by 64-pixel square and have it move diagonally at a constant velocity. We'll use switch[7:4] to set the puck's velocity in terms of pixels/frame: 4'b0000 means no motion, 4'b0101 would cause the puck (Death Star) to change both its x and y coordinate by 5 every frame (the sign of the change for each coordinate would be determined by which of the 4 possible headings the puck (Death Star) is following at the moment). When the puck (Death Star) collides with an edge or the paddle, its heading changes appropriately, e.g., a collision with the bottom edge changes the sign of the puck's y velocity.
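The per-frame update and wall bounces described above might be sketched like this. The module name puck_motion_sketch, its port names and the choice of starting position are assumptions for illustration. The sketch handles only the screen edges: paddle collisions and the game-over condition are left out, so here the puck simply bounces off the left edge as well.

```verilog
// Illustrative sketch only; names and structure are hypothetical.
module puck_motion_sketch #(
    parameter SIZE = 64,     // puck is SIZE x SIZE pixels
    parameter H    = 1024,   // visible width in pixels
    parameter V    = 768     // visible height in pixels
)(
    input  wire        vclock,   // 65MHz pixel clock
    input  wire        reset,    // 1 to return to the initial state
    input  wire        vsync,    // active-low vertical sync
    input  wire [3:0]  pspeed,   // speed in pixels per frame
    output reg  [10:0] x,        // top-left corner of the puck
    output reg  [9:0]  y
);
    localparam [10:0] X_MAX = H - SIZE;   // largest legal x (960 by default)
    localparam [9:0]  Y_MAX = V - SIZE;   // largest legal y (704 by default)

    reg vsync_prev;       // vsync delayed by one clock, for edge detection
    reg right, down;      // current heading: 1 = increasing coordinate

    // One-clock pulse on the high-to-low transition of vsync (once per frame).
    wire frame_tick = vsync_prev & ~vsync;

    always @(posedge vclock) begin
        vsync_prev <= vsync;
        if (reset) begin
            x     <= X_MAX / 2;   // somewhere in the middle of the screen
            y     <= Y_MAX / 2;
            right <= 1'b1;        // heading southeast
            down  <= 1'b1;
        end else if (frame_tick) begin
            // Horizontal motion: clamp at the wall and reverse the x heading.
            if (right) begin
                if (x + pspeed >= X_MAX) begin x <= X_MAX; right <= 1'b0; end
                else x <= x + pspeed;
            end else begin
                if (pspeed >= x) begin x <= 11'd0; right <= 1'b1; end
                else x <= x - pspeed;
            end
            // Vertical motion: clamp at the wall and reverse the y heading.
            if (down) begin
                if (y + pspeed >= Y_MAX) begin y <= Y_MAX; down <= 1'b0; end
                else y <= y + pspeed;
            end else begin
                if (pspeed >= y) begin y <= 10'd0; down <= 1'b1; end
                else y <= y - pspeed;
            end
        end
    end
endmodule
```

Clamping to the wall before reversing keeps the puck on screen even when the remaining distance is not a multiple of the speed, at the cost of a slightly shorter step on the frame where it bounces.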

Make the paddle 16 pixels wide and 128 pixels high. It should move up and down the left edge of the screen at 4 pixels/frame in response to the user pressing the UP or DOWN buttons on the labkit.
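One way to sketch the paddle movement is shown below. The module name paddle_motion_sketch and the frame_tick input (assumed to be a one-clock pulse per frame, for example derived from the falling edge of vsync) are hypothetical; up and down are assumed to be already debounced.

```verilog
// Illustrative sketch only; names are hypothetical.
module paddle_motion_sketch #(
    parameter HEIGHT = 128,   // paddle height in pixels
    parameter V      = 768,   // visible height in pixels
    parameter STEP   = 4      // movement in pixels per frame
)(
    input  wire       vclock,      // 65MHz pixel clock
    input  wire       reset,       // 1 to re-center the paddle
    input  wire       frame_tick,  // assumed: pulses for one clock each frame
    input  wire       up,          // 1 to move up (debounced)
    input  wire       down,        // 1 to move down (debounced)
    output reg  [9:0] y            // top edge of the paddle
);
    localparam [9:0] Y_MAX   = V - HEIGHT;   // lowest legal top edge (640)
    localparam [9:0] STEP_PX = STEP;

    always @(posedge vclock) begin
        if (reset)
            y <= Y_MAX / 2;                  // centered on the left edge
        else if (frame_tick) begin
            if (up && !down)
                // Moving up decreases y; stop at the top of the screen.
                y <= (y < STEP_PX) ? 10'd0 : y - STEP_PX;
            else if (down && !up)
                // Moving down increases y; stop at the bottom of the screen.
                y <= (y + STEP_PX > Y_MAX) ? Y_MAX : y + STEP_PX;
        end
    end
endmodule
```

Pressing both buttons at once leaves the paddle where it is, which is one reasonable choice among several.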

Pressing the ENTER button should reset the game to its initial state: the paddle centered on the left edge, and the puck (Death Star) somewhere in the middle of the screen, heading southeast. If the puck (Death Star) reaches the left edge, the game should stop (it can be restarted by pressing the ENTER button).

You may find it useful to use the following parameterized module in your implementation of Pong. Given the pixel coordinate (hcount,vcount) it returns a non-black pixel if the coordinate falls within the appropriate rectangular area. The coordinate of the top-left corner of the rectangle is given by the x and y inputs; the width and height of the rectangle, as well as its color, are determined by the module's parameters.

You can instantiate several instances of blob to create different rectangles on the screen, using #(.param(value),...) to specify the instance's parameters:
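A minimal sketch of a rectangle module matching the description above, followed by two instances with different parameters. The parameter names (WIDTH, HEIGHT, COLOR), the port widths, the wrapper module two_blobs_sketch and the colors are assumptions; the version supplied with the lab may differ.

```verilog
// Sketch of a rectangle generator: outputs COLOR inside the rectangle
// whose top-left corner is (x,y), and black (0) everywhere else.
module blob #(
    parameter WIDTH  = 64,             // default width: 64 pixels
    parameter HEIGHT = 64,             // default height: 64 pixels
    parameter COLOR  = 24'hFF_FF_FF    // default color: white
)(
    input  wire [10:0] x, hcount,
    input  wire [9:0]  y, vcount,
    output reg  [23:0] pixel
);
    always @* begin
        if ((hcount >= x && hcount < (x + WIDTH)) &&
            (vcount >= y && vcount < (y + HEIGHT)))
            pixel = COLOR;
        else
            pixel = 24'd0;
    end
endmodule

// Hypothetical wrapper showing two instances with different parameters.
module two_blobs_sketch(
    input  wire [10:0] hcount,
    input  wire [9:0]  vcount,
    input  wire [9:0]  paddle_y,
    input  wire [10:0] puck_x,
    input  wire [9:0]  puck_y,
    output wire [23:0] pixel
);
    wire [23:0] paddle_pixel, puck_pixel;

    // 16x128 yellow paddle on the left edge of the screen.
    blob #(.WIDTH(16), .HEIGHT(128), .COLOR(24'hFF_FF_00))
        paddle(.x(11'd0), .y(paddle_y), .hcount(hcount), .vcount(vcount),
               .pixel(paddle_pixel));

    // 64x64 red puck.
    blob #(.WIDTH(64), .HEIGHT(64), .COLOR(24'hFF_00_00))
        puck(.x(puck_x), .y(puck_y), .hcount(hcount), .vcount(vcount),
             .pixel(puck_pixel));

    // Black is all zeros, so OR-ing the outputs overlays the rectangles.
    assign pixel = paddle_pixel | puck_pixel;
endmodule
```

Combining the outputs with a bitwise OR is the simplest option; where two rectangles overlap, their colors are OR-ed together rather than one hiding the other.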

[From the "more than you wanted to know" department: blob is a very simple example of what game hardware hackers call a sprite: a piece of hardware that generates a pixel-by-pixel image of a game object. A sprite pipeline connects the output (pixel & sync signals) of one sprite to the input of the next. A sprite passes along the incoming pixel if the object the sprite represents is transparent at the current coordinate, otherwise it generates the appropriate pixel of its own. The generated pixel might come from a small image map and/or depend in some way on the sprite's internal state. Images produced by sprites later in the pipeline appear in front of sprites earlier in the pipeline, giving a pseudo 3D look to the scene. This becomes even more realistic if sprites scale the image they produce so that it gets smaller if the object is supposed to be further away. The order of the pipeline becomes unimportant if a "Z" or depth value is passed along the pipeline with each pixel. The current sprite only replaces the incoming pixel/Z-value if its Z-value puts it in front of the Z-value for the incoming pixel. Simple, but sprites produced surprisingly playable games in the era before the invention of 3D graphics pipelines that can render billions of shaded triangles per second.]

Here is a modification of the blob module used to display an image. For simplicity, we use just one color map and display a greyscale image.
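A sketch of how such an image-displaying module could be organized is given below. It is an illustration under stated assumptions, not the module supplied with the lab: the name picture_blob_sketch, the inferred memories and the placeholder contents (a horizontal ramp and an identity map) are all hypothetical. In a real design the memories would hold the image and color map data, loaded for example with $readmemh or a vendor ROM generator.

```verilog
// Illustrative sketch only; names and memory contents are hypothetical.
module picture_blob_sketch #(
    parameter WIDTH  = 256,   // image width in pixels
    parameter HEIGHT = 240    // image height in pixels
)(
    input  wire        vclock,      // 65MHz pixel clock
    input  wire [10:0] x, hcount,   // x: left edge of the image
    input  wire [9:0]  y, vcount,   // y: top edge of the image
    output reg  [23:0] pixel
);
    // Image memory: one 8-bit color-map index per pixel, stored row by row.
    reg [7:0] image_rom [0:WIDTH*HEIGHT-1];
    // Single color map: 8-bit index -> 8-bit grey level.
    reg [7:0] grey_map [0:255];

    // Placeholder contents so the sketch is self-contained.
    integer k;
    initial begin
        for (k = 0; k < WIDTH*HEIGHT; k = k + 1)
            image_rom[k] = k[7:0];   // ramp repeating every 256 locations
        for (k = 0; k < 256; k = k + 1)
            grey_map[k] = k[7:0];    // identity map
    end

    // Position of the current pixel relative to the image's top-left corner.
    wire [10:0] dx = hcount - x;
    wire [9:0]  dy = vcount - y;
    wire inside = (hcount >= x) && (hcount < x + WIDTH) &&
                  (vcount >= y) && (vcount < y + HEIGHT);

    // Linear address: column + row * width (16 bits cover 256 x 240).
    wire [15:0] addr = dy * WIDTH + dx;

    reg [7:0] index, grey;
    reg inside_d1, inside_d2;
    always @(posedge vclock) begin
        index     <= image_rom[addr];    // stage 1: image lookup
        inside_d1 <= inside;
        grey      <= grey_map[index];    // stage 2: color map lookup
        inside_d2 <= inside_d1;
        // stage 3: same grey level on R, G and B; black outside the image
        pixel     <= inside_d2 ? {grey, grey, grey} : 24'd0;
    end
endmodule
```

This sketch takes three clocks from (hcount,vcount) to pixel, so the sync and blanking signals sent to the display would need to be delayed by the same number of clocks to stay aligned with the image.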

vclock	input	65MHz pixel clock
reset	input	1 to reset the module to its initial state, hooked to the ENTER pushbutton via a debouncing circuit
up	input	1 to move paddle up, 0 otherwise. Hooked to the UP pushbutton via a debouncing circuit.
down	input	1 to move paddle down, 0 otherwise. Hooked to the DOWN pushbutton via a debouncing circuit.
pspeed[3:0]	input	Puck (Death Star) horizontal & vertical velocity in pixels per frame. Hooked to switch[7:4]
hcount[10:0]	input	Counts pixels on the current scan line, generated by the xvga module.
vcount[9:0]	input	Counts scan lines in the current frame, generated by the xvga module.
hsync	input	Active-low horizontal sync signal generated by the xvga module.
vsync	input	Active-low vertical sync signal generated by the xvga module.
blank	input	Active-high blanking signal generated by the xvga module.
phsync	output	Active-low horizontal sync signal generated by your Pong game. Often this is just hsync, perhaps delayed by a vclock if your pixel generating circuitry takes an additional vclock.
pvsync	output	Active-low vertical sync signal generated by your Pong game. Often this is just vsync, perhaps delayed by a vclock if your pixel generating circuitry takes an additional vclock.
pblank	output	Active-high blanking signal generated by your Pong game. Often this is just blank, perhaps delayed by a vclock if your pixel generating circuitry takes an additional vclock.
pixel[23:0]	output	The {R,G,B} value for the current pixel, eight bits for each color.
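The port list above can be written as a Verilog module header. The sketch below is only a starting skeleton under assumptions: the module name pong_game_sketch and the placeholder white border are hypothetical, and the game logic itself is omitted.

```verilog
// Illustrative skeleton only; pong_game_sketch is a hypothetical name.
module pong_game_sketch(
    input  wire        vclock,   // 65MHz pixel clock
    input  wire        reset,    // 1 to reset to the initial state
    input  wire        up,       // 1 to move paddle up
    input  wire        down,     // 1 to move paddle down
    input  wire [3:0]  pspeed,   // puck speed in pixels per frame
    input  wire [10:0] hcount,   // pixel number on the current scan line
    input  wire [9:0]  vcount,   // scan line number in the current frame
    input  wire        hsync,    // active-low horizontal sync from xvga
    input  wire        vsync,    // active-low vertical sync from xvga
    input  wire        blank,    // active-high blanking from xvga
    output wire        phsync,   // horizontal sync aligned with pixel
    output wire        pvsync,   // vertical sync aligned with pixel
    output wire        pblank,   // blanking aligned with pixel
    output wire [23:0] pixel     // {R,G,B}, eight bits per color
);
    // The pixel below is generated combinationally, so the sync and
    // blanking signals are passed through without any added delay.
    assign phsync = hsync;
    assign pvsync = vsync;
    assign pblank = blank;

    // Placeholder image: a one-pixel white border around the visible area.
    wire border = (hcount == 11'd0) || (hcount == 11'd1023) ||
                  (vcount == 10'd0) || (vcount == 10'd767);
    assign pixel = border ? 24'hFF_FF_FF : 24'd0;
endmodule
```

If the pixel logic is later registered, phsync, pvsync and pblank would each need the same number of register stages so that they stay aligned with pixel.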