A quick introduction
At first glance the NES appears to be just another 6502 computer, with a sophisticated case and a controller.
And while this is technically true, let me show you why the CPU is not the central part of this system.
The NES’s CPU is a Ricoh 2A03, which is based on the popular 8-bit MOS Technology 6502 and runs at 1.79 MHz (or 1.66 MHz in PAL systems).
A bit of context
The CPU market in the late 70s and early 80s was quite diverse. If a company wanted to build an affordable microcomputer, the following options were available:
- The Intel 8080: a popular CPU featured in the Altair, the first ‘personal’ computer. It has an 8-bit data bus and a 16-bit address bus.
- The Zilog Z80: an ‘unofficial’ version of the 8080 enhanced with more instructions, registers and internal components. It was sold at a cheaper price and could still execute 8080 programs. Amstrad and Sinclair (among others) chose this CPU.
- The Motorola 6800: another 8-bit CPU designed by Motorola, it contains a completely different instruction set. Many do-it-yourself computer kits, synthesisers and all-in-one computers included the 6800.
As if these options weren’t enough, another company named MOS appeared on the market and offered a redesigned version of the 6800, the 6502. While incompatible with the rest, the new chip was far less expensive to produce, and it was only a matter of time before the most famous computer makers (Commodore, Apple, Atari, Acorn and so forth) chose the 6502 to power their machines.
Back in Japan, Nintendo needed something inexpensive but familiar to develop for, so they selected the 6502. Ricoh, their CPU supplier, successfully produced a 6502-compatible CPU.
How Ricoh managed to clone it isn’t clear to this day. One would expect MOS to have licensed the chip design to Ricoh, but there are many contradictions to this:
- Ricoh’s and MOS’s versions feature the same layout, but Ricoh’s contains severed buses (disabling certain functions). I go into more detail later.
- A document explicitly stating that MOS licensed the 6502 to Ricoh is yet to be found.
- An article published in 2008 by Nikkei Trendy states that Ricoh licensed from Rockwell, an authorised chip manufacturer. It’s debatable, however, whether a second source could provide IP to a third party, at least not without MOS’s approval.
- It wouldn’t be the first time Nintendo got away with circumventing IP rights, as Ikegami Tsushinki v. Nintendo ruled in Japan that Nintendo didn’t own the code of the original Donkey Kong.
The system provides 2 KB of Work RAM (WRAM) for storing variables.
The components of the system are memory-mapped, meaning that they are accessed using memory addresses. The memory space is composed of the program ROM, WRAM, the PPU, the APU and 2 controllers. Each component is explained throughout this article.
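To picture what ‘memory-mapped’ means in practice, here is a minimal Python sketch of an address decoder. The address ranges follow the commonly documented NES CPU map and are not given in this article, so treat them as assumptions; the function name `decode` is made up for illustration.

```python
def decode(addr):
    """Map a 16-bit CPU address to (component, offset within that component)."""
    addr &= 0xFFFF
    if addr < 0x2000:
        # 2 KB of WRAM, repeated four times across this range (assumed layout)
        return ("WRAM", addr % 0x0800)
    if addr < 0x4000:
        # eight PPU registers, repeated every 8 bytes (assumed layout)
        return ("PPU", addr % 8)
    if addr < 0x4020:
        # APU registers and the two controller ports (assumed layout)
        return ("APU/IO", addr - 0x4000)
    # everything else is routed to the cartridge (program ROM, extra chips)
    return ("Cartridge", addr - 0x4020)
```

For instance, `decode(0x0801)` lands on the second byte of WRAM, because the 2 KB block repeats.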
The Ricoh 2A03 omits the Binary-Coded Decimal (BCD) mode originally included in the 6502. BCD encodes each decimal digit of a number as a separate 4-bit binary value. The 6502 uses 8-bit ‘words’ – meaning that each word stores two decimal digits.
As an example for the curious, the decimal number 42 is represented as 0010 1010 in binary, but 0100 0010 in BCD.
We could go on and on talking about it, but to give an outline: BCD is useful for applications that require treating each decimal place separately (for instance, a digital clock). However, it requires more storage since each word can only encode up to the decimal number 99 (whereas traditional binary can encode up to 255 with an eight-bit word).
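The difference between the two encodings can be reproduced with a few lines of Python. This is only an illustrative sketch; the helper name `to_packed_bcd` is hypothetical and has nothing to do with how the 6502 implements its BCD mode internally.

```python
def to_packed_bcd(n):
    """Encode 0..99 as one byte with a decimal digit in each 4-bit half."""
    if not 0 <= n <= 99:
        raise ValueError("one byte of packed BCD holds 0..99")
    tens, ones = divmod(n, 10)
    return (tens << 4) | ones

print(format(42, "08b"))                 # plain binary: 00101010
print(format(to_packed_bcd(42), "08b"))  # packed BCD:   01000010
```

Notice that the BCD byte reads as 42 when written in hexadecimal, which is what makes it convenient for driving decimal displays.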
In any case, Ricoh deliberately broke BCD mode in its chip by severing the control lines that activate it. This was presumably done in an effort to avoid paying royalties to MOS, since BCD was patented by them (and the legislation that made it possible to copyright integrated circuit layouts in the United States wasn’t enacted until 1984).
Graphics are generated by a proprietary chip called the Picture Processing Unit (PPU). This chip renders sprites and background graphics, outputting the result to the video signal.
Constructing the frame
As with its contemporaries this chip is designed for the behaviour of a CRT display. There is no frame buffer as such: the PPU will render in step with the CRT’s beam, building the image on-the-fly.
Additionally, the frame that the PPU outputs is built using two different layers. For demonstration purposes, let’s use Super Mario Bros. to show how this works:
The PPU uses tiles as a basic ingredient for producing sprites and backgrounds.
The NES defines tiles as basic 8x8 maps stored in Character memory (found in the cartridge). Each pixel of the tile uses one of four colours (their palettes are defined later).
Groups of four tiles are combined in 16x16 maps called blocks, within which all tiles must share a colour palette.
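To make the tile format more tangible, here is a minimal Python sketch that decodes one tile into colour indices. It assumes the commonly documented layout of 16 bytes per tile (two 8-byte bit planes, each contributing one bit per pixel), which this article doesn’t spell out; the name `decode_tile` is made up for illustration.

```python
def decode_tile(data):
    """Turn 16 bytes of tile data into 8 rows of 8 colour indices (0..3)."""
    if len(data) != 16:
        raise ValueError("a tile is assumed to be 16 bytes: two 8-byte bit planes")
    rows = []
    for y in range(8):
        low, high = data[y], data[y + 8]  # same row taken from each plane
        row = []
        for x in range(8):
            bit = 7 - x  # the leftmost pixel is the most significant bit
            row.append(((low >> bit) & 1) | (((high >> bit) & 1) << 1))
        rows.append(row)
    return rows
```

Each resulting index (0 to 3) is not a colour by itself; it selects one of the four entries of whichever palette is assigned to the tile.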
To start drawing the picture, the PPU first looks for tile references from a set of tables previously populated by the game. Each table is used to build one layer of the frame.
The background layer is a 512x480 map containing static tiles. However, only 256x240 is viewable on the screen, so the game decides which part is selected for display. Games can also move the viewable area during gameplay; that’s how the scrolling effect is accomplished.
Nametables specify which tiles to display as background. The PPU looks for four 1024-byte nametables, each one corresponding to a quadrant of the layer. However, there’s only 2 KB of VRAM available! As a consequence, only two nametables can be stored. The remaining two still have to be addressed somewhere: most games just point the remaining two where the first two are (mirroring).
While this architecture may seem flawed at first, it was actually designed to keep cost down while providing simple expandability: if games needed a wider background, extra VRAM could be included in the cartridge.
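Mirroring is easier to follow as address arithmetic. The sketch below assumes the four nametables sit at PPU addresses 0x2000, 0x2400, 0x2800 and 0x2C00 (top-left, top-right, bottom-left, bottom-right), as commonly documented but not stated in this article; the function name `vram_offset` is hypothetical.

```python
def vram_offset(ppu_addr, mirroring):
    """Map a nametable address (0x2000-0x2FFF) to an offset in 2 KB of VRAM."""
    rel = (ppu_addr - 0x2000) & 0x0FFF
    table, offset = divmod(rel, 0x400)  # which quadrant, and where inside it
    if mirroring == "vertical":
        physical = table % 2   # left and right differ; bottom row repeats the top
    elif mirroring == "horizontal":
        physical = table // 2  # top and bottom differ; right column repeats the left
    else:
        raise ValueError("unknown mirroring: " + str(mirroring))
    return physical * 0x400 + offset
```

With vertical mirroring, `vram_offset(0x2000, "vertical")` and `vram_offset(0x2800, "vertical")` both resolve to the same byte, so the bottom quadrants are copies of the top ones.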
The final 64 bytes of each 1024-byte nametable form an Attribute table that specifies which colour palette is assigned to each block.
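As a rough illustration of how 64 bytes can cover a whole nametable, the sketch below looks up the palette of a given tile. It assumes the commonly documented packing (each byte covers a 4x4-tile area and stores two bits for each of its four blocks), which goes beyond what this article states; `palette_for_tile` is a hypothetical name.

```python
def palette_for_tile(attr_table, tile_x, tile_y):
    """Return the 2-bit palette number for the tile at (tile_x, tile_y)."""
    # each attribute byte covers 4x4 tiles, i.e. four blocks of 2x2 tiles
    byte = attr_table[(tile_y // 4) * 8 + (tile_x // 4)]
    # bit pairs, lowest first: top-left, top-right, bottom-left, bottom-right
    shift = ((tile_y & 2) << 1) | (tile_x & 2)
    return (byte >> shift) & 0b11
```

Since the palette is chosen per block and not per tile, all four tiles of a block necessarily share it.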
Sprites are tiles that can move around the screen. They can also overlap each other, or appear behind the background. The viewable graphic will be decided based on its priority value. It’s the same concept as ‘layers’ in many graphic design programs.
The Object Attribute Memory (OAM) table specifies which tiles will be used as sprites. In addition to the tile index, each entry contains an (x,y) position and multiple attributes (colour palette, a priority and flip flags). This table is stored in a 256-byte DRAM found in the PPU chip.
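An OAM entry can be modelled as a tiny record. The sketch below assumes four bytes per entry in the order Y, tile index, attributes, X, with the attribute bits laid out as commented; this follows commonly documented behaviour, not anything stated in this article, and the names are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Sprite:
    x: int
    y: int
    tile: int
    palette: int
    behind_background: bool
    flip_h: bool
    flip_v: bool

def parse_oam_entry(entry):
    """Decode one 4-byte OAM entry (assumed order: Y, tile, attributes, X)."""
    y, tile, attrs, x = entry
    return Sprite(
        x=x, y=y, tile=tile,
        palette=attrs & 0b11,                  # bits 0-1: colour palette
        behind_background=bool(attrs & 0x20),  # bit 5: priority
        flip_h=bool(attrs & 0x40),             # bit 6: horizontal flip
        flip_v=bool(attrs & 0x80),             # bit 7: vertical flip
    )
```

At four bytes per entry, the 256-byte table holds 64 sprites.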
The OAM table can be filled by the CPU. However, this can be pretty slow in practice (and risks corrupting the frame if not done at the right time), so the system includes a small component called Direct Memory Access or ‘DMA’, housed in the CPU chip, which can be triggered (by writing to a register) to copy the table from WRAM. With DMA, it’s guaranteed that the table will be uploaded when the next frame is drawn, but bear in mind that the CPU will be halted during the transfer!
The PPU is limited to eight sprites per scanline and up to 64 per frame. The scanline limit can be exceeded thanks to hardware multiplexing: the PPU will alternate sprites between scans; however, they will appear to flicker on-screen.
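The scanline limit can be modelled with a simplified Python sketch: walk the sprite list in order and keep only the first eight that cover the current line. This is a toy model and not a description of the PPU’s actual evaluation circuitry; the function name and the 8-pixel sprite height are assumptions.

```python
def sprites_on_scanline(sprite_ys, scanline, height=8, limit=8):
    """Return indices of the first `limit` sprites that cover `scanline`."""
    visible = []
    for index, y in enumerate(sprite_ys):
        if y <= scanline < y + height:
            visible.append(index)
            if len(visible) == limit:
                break  # any later sprite on this line is not drawn
    return visible
```

In this model, changing the order of the list from one frame to the next changes which sprites get dropped, which is what shows up as flicker.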
Once the frame is finished, it’s time to move on to the next one!
However, the CPU can’t modify any table that’s being used by the PPU, so when all scanlines are completed the vertical blank interrupt is triggered. This allows the game to update the tables without tearing the picture currently displayed. At that moment the CRT’s beam is pointing below the visible area of the screen, into the overscan (or bottom border area).
Secrets and limitations
If you’re thinking that a frame-buffer system with memory allocated to store the full frame would have been preferable, bear in mind that RAM costs were very high and the console’s goal was to be affordable. This design proved to be very efficient and flexible too!
Some games require the main character to move vertically – thus the nametable will be set up with horizontal mirroring. Other games need their character to move left and right, and so use vertical mirroring instead.
Either type of mirroring will allow the PPU to update background tiles without the user noticing: there is plenty of space to scroll while new tiles are being rendered at a distance.
But what if the character wants to move diagonally? The PPU can scroll in any direction, but without extra VRAM, the edges may end up having to share the same colour palette (remember that tiles are grouped in blocks).
This is why some games like Super Mario Bros. 3 show strange graphics at the right edge of the screen while Mario moves (the game is set up for vertical scrolling). It’s possible that they needed to minimise the hardware cost per cartridge (this game already has a powerful mapper installed).
As an interesting fix: the PPU allowed developers to apply a vertical mask on top of tiles, effectively hiding part of the glitchy area.
Another speciality of Super Mario Bros. 3 is the amount of graphics it could display.
This game displays more background tiles than is strictly permitted. So how is it doing that? If we take two screen captures at different times while the display is generated, we can see that the final frame is actually composed of two different frames.
The wizardry here is that the game cartridge actually uses an extra semi-custom chip, the MMC3, to map in one of two different character chips. By checking which part of the screen the PPU is requesting, the mapper will redirect to one chip or the other – thus allowing more unique tiles on-screen than was originally supported.
Sound is handled by a dedicated component called the Audio Processing Unit (APU). Ricoh embedded it inside the CPU chip to avoid unlicensed cloning of both CPU and APU.
This audio chip is a Programmable Sound Generator (PSG), which means that it can only produce pre-defined waveforms.
The APU has five channels of audio – each one reserved for a specific waveform. The music data is found in the program ROM.
Each waveform contains different properties that can be altered to produce a specific note, sound or volume.
These five channels are continuously mixed and sent through the audio signal.
Let’s now discuss the types of waveforms synthesised by the APU:
Pulse waves have a very distinct beep sound that is mainly used for melody or sound effects.
The APU reserves two channels for pulse waves. Each can use one of three different voices, produced by varying pulse widths.
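To see how a pulse width shapes the wave, here is a small Python sketch that builds one period for a given duty cycle. The duty values mentioned afterwards (12.5%, 25% and 50%) are the commonly documented ones and an assumption on my part; the function does not reproduce the exact bit sequences stored in the APU.

```python
def pulse_wave(duty, steps=8):
    """One period of a pulse wave: 1 for the first `duty` fraction, then 0."""
    high = round(duty * steps)
    return [1] * high + [0] * (steps - high)

for duty in (0.125, 0.25, 0.5):
    print(duty, pulse_wave(duty))
```

The narrower the pulse, the thinner and more nasal the resulting voice tends to sound.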
Most games used one pulse channel for melody and the other for accompaniment.
When the game needs to play a sound effect, the accompaniment channel is switched to play the effect and then returns to accompanying. This avoids interrupting the melody during gameplay.
The triangle waveform serves as a bassline for the melody. Modifying its pitch dramatically can also produce percussion.
The APU has one channel reserved for this type of wave.
The volume of this channel can’t be controlled, possibly because the volume control is used to construct the triangle.
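The idea of building a triangle out of volume steps can be sketched as follows. The 32-step sequence of 4-bit levels is the commonly documented shape and is an assumption here, as is the function name.

```python
def triangle_wave():
    """One period of a 32-step triangle built from 4-bit levels (0..15)."""
    down = list(range(15, -1, -1))  # 15, 14, ..., 0
    return down + down[::-1]        # followed by 0, 1, ..., 15
```

Because every available level is already spent on drawing the shape, there is nothing left over to scale the volume with, which fits the explanation suggested above.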
Noise is basically a set of random waveforms that sound like white static. One channel is allocated for it.
Games use the noise channel for percussion or ambient effects.
This channel has only 32 presets available. Half (16) of these presets produce clean static, and the other half produce robotic static.
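Pseudo-random noise of this kind is typically produced with a shift register. The sketch below assumes a 15-bit register whose feedback combines bit 0 with either bit 1 (long sequence) or bit 6 (short sequence), as commonly documented for this chip; relating the two modes to the ‘clean’ and ‘robotic’ halves of the presets is my own reading, not something this article states. The function name is hypothetical.

```python
def noise_bits(count, short_mode=False, state=1):
    """Generate `count` pseudo-random bits from a 15-bit shift register."""
    tap = 6 if short_mode else 1
    out = []
    for _ in range(count):
        feedback = (state ^ (state >> tap)) & 1  # bit 0 XOR the tapped bit
        state = (state >> 1) | (feedback << 14)  # shift right, refill the top bit
        out.append(state & 1)
    return out
```

The short sequence repeats much sooner than the long one, which is why it is heard as a buzzing tone and not as static.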
Samples are recorded pieces of music that can be replayed. As you can see, this doesn’t have to be a single waveform.
The APU has one channel dedicated to samples. It has the following capabilities: 7-bit depth, 4.2-33.5 kHz sampling rate, and no volume control.
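One way to keep samples small is to store only the changes between consecutive values. The sketch below assumes the commonly documented 1-bit delta scheme, where each bit nudges a 7-bit output level up or down by two; this article only mentions the 7-bit depth, so treat the rest (bit order, step size, starting level, function name) as assumptions.

```python
def decode_delta(data, level=64):
    """Decode 1-bit delta samples into 7-bit output levels (0..127)."""
    out = []
    for byte in data:
        for bit in range(8):  # least significant bit first (assumed)
            if (byte >> bit) & 1:
                if level <= 125:
                    level += 2  # a 1 raises the level, unless it would overflow
            elif level >= 2:
                level -= 2      # a 0 lowers the level, unless it would underflow
            out.append(level)
    return out
```

Under this scheme every byte of ROM yields eight output samples.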
Samples are significantly bigger than single waveforms, so games normally store small pieces (like drum samples) that can be played repeatedly.
Secrets and limitations
While the APU was not comparable to the quality of a vinyl, cassette or CD, programmers did find ways of expanding its capability, thanks to the modular architecture of the NES.
The Japanese model of the NES, the Famicom, provided exclusive cartridge pins available for sound expansion. Games like Castlevania 3 included the Konami VRC6 chip, which allowed two extra pulse waves and a sawtooth wave.
Check out the difference between the Japanese and American versions (the latter didn’t have sound expansion).
Some games used tremolo effects to simulate more channels.
Games are mainly written in the 6502 assembly language and reside in the program ROM, while the game’s graphics (tiles) are stored in Character memory.
The 16-bit address space limits the system to 64 KB of addressable memory. The system I/O is memory-mapped – that only leaves around 32 KB of available storage for the program. If a game required extra space, extra chips (mappers) would be included in the cartridge, with an attendant increase in production costs.
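The trick behind a mapper is bank switching: the console keeps seeing a fixed-size window, while the cartridge decides which slice of a larger ROM appears in it. Here is a toy Python model; the class name, the 16 KB bank size and the `select` method are all hypothetical and don’t correspond to any particular mapper chip.

```python
class BankedRom:
    """Toy mapper: a fixed-size window showing one bank of a larger ROM."""

    def __init__(self, rom, bank_size=0x4000):
        if len(rom) == 0 or len(rom) % bank_size:
            raise ValueError("ROM size must be a non-zero multiple of the bank size")
        self.rom = rom
        self.bank_size = bank_size
        self.bank = 0

    def select(self, bank):
        # on real hardware, the game would write to a cartridge register
        self.bank = bank % (len(self.rom) // self.bank_size)

    def read(self, offset):
        # the console always reads through the same window
        return self.rom[self.bank * self.bank_size + offset % self.bank_size]
```

A game would call something like `select(2)` before reading data stored in the third bank, then switch back when it’s done.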
Some cartridges included an additional battery-backed WRAM to store saves.
Anti-piracy and region lock
Nintendo was able to block unauthorised publishing thanks to the inclusion of a proprietary lockout chip called the Checking Integrated Circuit (CIC). It is located in the console and is connected to the reset signal (and thus not easily removed).
This chip runs 10NES, an internal program that checks for the existence of another lockout chip in the game cartridge. If that check fails then the console is sent into an infinite reset.
Both lockout chips are in constant communication during the console’s uptime. The system can be defeated by cutting one of the pins on the console’s lockout, which leaves the chip in an idle state. Alternatively, sending it a -5V signal can freeze it.
The CIC exists as a result of the fear caused by the video game crash of 1983. Nintendo’s then president Hiroshi Yamauchi decided that in order to enforce good quality games they would be in charge of approving every single one of them.
You’ll notice that the Japanese model of the console, the Famicom, was released before 1983’s crash. That’s why the CIC circuitry is used for sound expansions instead.