This article welcomes anyone that wants to help with translations or contributions.
At first glance, the NES can be considered just another 6502 computer with a sophisticated case and a controller.
And the fact is, this is technically right, but let me show you why the CPU will not actually be the central part of this system.
The NES’s CPU is a Ricoh 2A03, which is based on the popular 8-bit MOS Technology 6502 and runs at 1.79 MHz (or 1.66 MHz in PAL systems).
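Those two clock speeds are not arbitrary: each one falls out of dividing the console's master crystal. The sketch below reproduces the arithmetic. The crystal frequencies and dividers are commonly documented values that this article does not state, so treat them as assumptions.

```python
# Master crystal frequencies in Hz.
# Assumption: commonly documented values, not taken from this article.
NTSC_MASTER_HZ = 21_477_272
PAL_MASTER_HZ = 26_601_712


def cpu_clock_mhz(master_hz: int, divider: int) -> float:
    """Return the CPU clock in MHz for a given master clock and divider."""
    return master_hz / divider / 1_000_000


ntsc = cpu_clock_mhz(NTSC_MASTER_HZ, 12)  # assumed NTSC divider
pal = cpu_clock_mhz(PAL_MASTER_HZ, 16)    # assumed PAL divider
print(f"NTSC: {ntsc:.2f} MHz, PAL: {pal:.2f} MHz")
```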
The CPU market in the late 70s and early 80s was quite diverse. If any company wanted to build an affordable microcomputer, the following options were available:
As if that wasn’t enough, another company with the name MOS appeared on the market and offered a redesigned version of the 6800 called the 6502. While incompatible with the rest, the new chip was much less expensive to produce, and it was only a matter of time before most of the famous computer makers (Commodore, Apple, Atari, Acorn and so forth) chose the 6502 to power their machines.
Back in Japan, Nintendo needed something inexpensive but familiar to develop for, so they selected the 6502. Ricoh, their CPU supplier, successfully produced a 6502-compatible CPU by licensing the chip designs from MOS and subsequently making some modifications here and there (we’ll go over the details later on).
The system provides 2 KB of Work RAM (WRAM) for storing variables.
The components of the system are memory-mapped, meaning that they are accessed using memory addresses. The memory space is composed of the Program ROM, WRAM, the PPU, the APU and 2 controllers. Each component is explained throughout this article.
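To picture what memory-mapped means in practice, here is a minimal sketch of an address decoder. The address ranges are the commonly documented NES layout, not something this article spells out, and the region names and the `decode_cpu_address` helper are made up for illustration.

```python
def decode_cpu_address(addr: int) -> tuple[str, int]:
    """Map a 16-bit CPU address to (component, offset inside that component).

    Simplified: ranges follow the commonly documented layout (an assumption),
    and everything between the I/O area and the Program ROM is lumped together.
    """
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("the CPU address bus is 16 bits wide")
    if addr < 0x2000:
        return "WRAM", addr % 0x0800        # 2 KB, repeated four times
    if addr < 0x4000:
        return "PPU registers", addr % 8    # 8 registers, repeated
    if addr < 0x4020:
        return "APU and controllers", addr - 0x4000
    if addr < 0x8000:
        return "cartridge (other)", addr - 0x4020
    return "Program ROM", addr - 0x8000


print(decode_cpu_address(0x0800))  # same WRAM cell as address 0x0000
print(decode_cpu_address(0xFFFC))
```

Note how a plain read or write is all the CPU needs to talk to any component; the address alone decides who answers.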
The Ricoh 2A03 happens to omit the Binary-coded Decimal or ‘BCD’ mode originally included in the 6502. BCD encodes each decimal digit of a number as a separate 4-bit binary value, and since the 6502 uses 8-bit words, each word stores two decimal digits.
As an example, the decimal number ‘42’ is 0100 0010 in BCD, whereas in traditional binary it is 0010 1010.
This mode is useful for applications that require treating each decimal place separately (for instance, a digital clock). However, it requires more storage since each word can only encode up to the decimal number ‘99’ (whereas traditional binary can encode up to ‘255’).
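The difference between the two encodings can be sketched in a few lines. The `to_packed_bcd` helper is hypothetical and only packs a value into a byte; it does not model how the 6502 performs BCD arithmetic.

```python
def to_packed_bcd(value: int) -> int:
    """Pack a number from 0 to 99 into one byte, one decimal digit per nibble."""
    if not 0 <= value <= 99:
        raise ValueError("one byte of packed BCD holds two decimal digits")
    tens, ones = divmod(value, 10)
    return (tens << 4) | ones


print(f"BCD:    {to_packed_bcd(42):08b}")  # 01000010 (nibbles '4' and '2')
print(f"Binary: {42:08b}")                 # 00101010
```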
In any case, Ricoh deliberately broke BCD mode in its chip by severing the control lines that activate it. This was presumably done in an effort to avoid paying royalties to MOS, since BCD was patented by them (and the necessary legislation to copyright integrated circuit layout in the United States wasn’t enacted until 1984).
Graphics are generated by a proprietary chip called the Picture Processing Unit or ‘PPU’ for short. This chip renders sprites and background graphics, outputting the result to the video signal.
As with its contemporaries, this chip is designed for the behaviour of a CRT display. There is no frame-buffer as such: the PPU will render in step with the CRT’s beam, building the image on-the-fly.
Additionally, the frame that the PPU outputs is built using two different layers. For demonstration purposes, Super Mario Bros will be used as an example to show how this works:
The PPU uses tiles as a basic ingredient for producing sprites and backgrounds.
The NES defines tiles as basic 8x8 maps stored in Character Memory (found in the cartridge). Each pixel of the tile uses one of four colours (their palettes are defined later).
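Four colours per pixel means two bits per pixel. The sketch below decodes one tile, assuming the commonly documented storage layout (16 bytes per tile: eight bytes holding the low bit of every row, followed by eight bytes holding the high bit). That layout is not described in this article, so take it as an assumption.

```python
def decode_tile(tile_bytes: bytes) -> list[list[int]]:
    """Turn a 16-byte tile into 8 rows of 8 colour indices (0 to 3)."""
    if len(tile_bytes) != 16:
        raise ValueError("a tile is assumed to take 16 bytes")
    rows = []
    for y in range(8):
        low, high = tile_bytes[y], tile_bytes[y + 8]
        row = []
        for x in range(8):
            bit = 7 - x  # the leftmost pixel lives in the top bit
            row.append((((high >> bit) & 1) << 1) | ((low >> bit) & 1))
        rows.append(row)
    return rows


# A tile whose top-left pixel uses colour 3 and everything else colour 0.
tile = bytes([0x80] + [0] * 7 + [0x80] + [0] * 7)
for row in decode_tile(tile):
    print(row)
```

The colour index is not a final colour yet: it only selects an entry within whichever palette gets assigned to the tile.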
Four tiles are combined into 16x16 maps called blocks, and all the tiles in a block have to share the same colour palette.
In order to start drawing the picture, the PPU first looks for tile references from a different set of tables previously populated by the game. Each table is used to build a layer of the frame.
The Background layer is a 512x480 map containing static tiles. However, only 256x240 pixels are viewable on the screen, so the game decides which part is selected for display. Games can also move the viewable area during gameplay, which is how the Scrolling Effect is accomplished.
Nametables specify which tiles to display as background. The PPU looks for four 1024-byte Nametables, each one corresponding to a quadrant of the layer. However, there’s only 2 KB of VRAM available! As a consequence, only two Nametables can be stored, yet the remaining two still have to be addressed somewhere: most games just point the remaining two where the first two are (Mirroring).
While this architecture may seem flawed at first, it was actually designed to keep cost down while providing simple expandability: If games needed a wider background, extra VRAM in the cartridge could be included.
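Mirroring boils down to folding four logical Nametables onto two physical ones. Here is a minimal sketch, assuming the four tables sit back to back starting at PPU address 0x2000 (the commonly documented location, not stated in this article). The `vram_offset` helper is hypothetical.

```python
NAMETABLE_SIZE = 0x400  # 1024 bytes per Nametable
BASE = 0x2000           # assumed start of the four logical Nametables


def vram_offset(ppu_addr: int, mirroring: str) -> int:
    """Return the offset into the 2 KB of VRAM for a Nametable address."""
    if not BASE <= ppu_addr < BASE + 4 * NAMETABLE_SIZE:
        raise ValueError("address is outside the four Nametables")
    table, offset = divmod(ppu_addr - BASE, NAMETABLE_SIZE)
    if mirroring == "vertical":
        physical = table & 1    # left/right pair is unique, top/bottom repeats
    elif mirroring == "horizontal":
        physical = table >> 1   # top/bottom pair is unique, left/right repeats
    else:
        raise ValueError("mirroring must be 'vertical' or 'horizontal'")
    return physical * NAMETABLE_SIZE + offset


# With vertical mirroring, the third table is the first one again.
print(hex(vram_offset(0x2800, "vertical")))    # 0x0
print(hex(vram_offset(0x2400, "horizontal")))  # 0x0
```

A cartridge that ships extra VRAM would skip the folding altogether and give each quadrant its own storage.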
The last 64 bytes of each Nametable form an Attribute Table that specifies which colour palette is assigned to each block.
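To see how 64 bytes can cover a whole screen of blocks, consider this sketch. It assumes the commonly documented packing, where each attribute byte covers a 4x4-tile area and holds four 2-bit palette numbers, one per block. Neither the packing nor the `palette_for_tile` helper comes from this article.

```python
def palette_for_tile(attribute_table: bytes, col: int, row: int) -> int:
    """Return the palette number (0 to 3) for the tile at (col, row).

    col ranges over 0..31 and row over 0..29, counted in tiles.
    """
    if len(attribute_table) != 64:
        raise ValueError("an Attribute Table is 64 bytes long")
    byte = attribute_table[(row // 4) * 8 + (col // 4)]
    # Pick the 2-bit field of the block this tile belongs to.
    shift = ((row % 4) // 2) * 4 + ((col % 4) // 2) * 2
    return (byte >> shift) & 0b11


table = bytearray(64)
table[0] = 0b11100100  # four blocks using palettes 0, 1, 2 and 3
print(palette_for_tile(table, 0, 0), palette_for_tile(table, 1, 1))  # same block
print(palette_for_tile(table, 2, 0), palette_for_tile(table, 2, 2))
```

Tiles (0, 0) and (1, 1) land in the same block, so they cannot be given different palettes.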
Sprites are tiles that can move around the screen. They can also overlap each other and appear behind the background; which graphic ends up visible is decided by each sprite’s priority value.
The Object Attribute Memory (OAM) table specifies which tiles will be used as sprites. In addition to the tile index, each entry contains an (x,y) position and multiple attributes (colour palette, a priority and flip flags). This table is stored in a 256-byte DRAM memory found in the PPU chip.
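An OAM entry can be pictured as a small record. The sketch below assumes the commonly documented 4-byte layout (Y, tile index, attributes, X) and attribute bit positions. Those details and the `Sprite` class are illustrative assumptions, not taken from this article.

```python
from dataclasses import dataclass


@dataclass
class Sprite:
    x: int
    y: int
    tile: int
    palette: int
    behind_background: bool
    flip_h: bool
    flip_v: bool


def parse_oam(oam: bytes) -> list[Sprite]:
    """Split the 256-byte OAM table into 64 sprite entries."""
    if len(oam) != 256:
        raise ValueError("OAM is 256 bytes long")
    sprites = []
    for i in range(0, 256, 4):
        y, tile, attr, x = oam[i:i + 4]  # assumed byte order
        sprites.append(Sprite(
            x=x,
            y=y,
            tile=tile,
            palette=attr & 0b11,                 # assumed: bits 0-1
            behind_background=bool(attr & 0x20),  # assumed: bit 5
            flip_h=bool(attr & 0x40),             # assumed: bit 6
            flip_v=bool(attr & 0x80),             # assumed: bit 7
        ))
    return sprites


oam = bytearray(256)
oam[0:4] = bytes([0x10, 0x2A, 0b01000001, 0x20])
print(parse_oam(bytes(oam))[0])
```

Since 256 bytes divided by 4 bytes per entry gives 64, this is also where the limit of 64 sprites per frame comes from.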
The OAM table can be filled by the CPU. However, this can be pretty slow in practice (and risks corrupting the frame if not done at the right time), so the CPU chip contains a small component called Direct Memory Access or ‘DMA’, which can be programmed (by writing to a dedicated register) to copy the whole table from WRAM into the PPU. With DMA, it’s guaranteed that the table will be uploaded when the next frame is drawn, but bear in mind that the CPU will be halted during the transfer!
The PPU is limited to eight sprites per scanline and up to 64 per frame. The scanline limit can be exceeded thanks to hardware multiplexing: The PPU will alternate sprites between scans, however they will appear to flicker on-screen.
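The per-scanline limit can be sketched as a simple selection pass. This is a simplified model that assumes 8-pixel-tall sprites and first-come-first-served order within OAM; it ignores the quirks of the real circuitry. The `sprites_on_scanline` helper is hypothetical.

```python
def sprites_on_scanline(sprite_tops: list[int], scanline: int,
                        height: int = 8, limit: int = 8) -> list[int]:
    """Return the OAM indices of the sprites drawn on a given scanline."""
    visible = []
    for index, top in enumerate(sprite_tops):
        if top <= scanline < top + height:
            visible.append(index)
            if len(visible) == limit:
                break  # any further sprites on this line are dropped
    return visible


# Ten sprites lined up on the same row: only the first eight make it.
tops = [10] * 10 + [100] * 54
print(sprites_on_scanline(tops, 12))
```

Changing which sprites come first from one frame to the next spreads the loss among all of them, which is what shows up as flicker.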
Once the frame is finished, it’s time to move on to the next one!
However, the CPU can’t modify any table while the PPU is using them, so when all scanlines are completed the Vertical Blank interrupt is called. This allows the game to update them without tearing the picture currently displayed. At that moment the CRT’s beam is pointing below the visible area of the screen into the overscan, or bottom border area.
If you’re thinking that a frame-buffer system with memory allocated to store the full frame would have been preferable: RAM costs were very high and the console’s goal was to be affordable. This design proved to be very efficient and flexible too!
Some games require the main character to move vertically, so the Nametables are set up with Horizontal Mirroring. Other games want their character to move left and right, so Vertical Mirroring is used instead.
The specific type of mirroring will allow the PPU to update background tiles without the user noticing: There is plenty of space to scroll while new tiles are being rendered at distance.
But what if the character wants to move diagonally? The PPU can scroll in any direction, but without extra VRAM, the edges may end up having to share the same colour palette (remember that tiles are grouped in blocks).
This is why some games like Super Mario Bros 3 show strange graphics at the right edge of the screen while Mario moves (the game is set up for vertical scrolling). It’s possible that they needed to keep the costs down regarding the amount of hardware needed in the cartridge (this game already has a powerful mapper installed).
As an interesting fix, the PPU allows games to apply a vertical mask on top of the tiles, effectively hiding part of the glitches.
Another specialty about Super Mario Bros 3 was the amount of graphics it could display.
This game happens to display more background tiles than strictly permitted. So how is it doing that? If we take screen captures at two different points while the display is being generated, we can see that the final frame is actually composed of two different-looking frames.
The wizardry here is that the game cartridge actually has an extra semi-custom chip, the MMC3, to map in one of two different character chips. By checking which part of the screen the PPU is requesting, the mapper will redirect to one chip or the other, thus allowing more unique tiles on the screen than originally believed to be supported.
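The idea can be captured with a toy model. To be clear, this is not the MMC3’s real register interface: the `visible_tiles` function, the bank contents and the split point are all made up to show how the same tile index can resolve to different graphics within a single frame.

```python
SCANLINES = 240


def visible_tiles(banks: list[list[str]], tile_index: int,
                  split_scanline: int) -> list[str]:
    """Return, for every scanline, what `tile_index` resolves to.

    Toy model: bank 0 is mapped in at the top of the frame and bank 1
    takes over once the beam reaches `split_scanline`.
    """
    result = []
    selected = 0
    for scanline in range(SCANLINES):
        if scanline == split_scanline:
            selected = 1  # the mapper swaps the character bank mid-frame
        result.append(banks[selected][tile_index])
    return result


banks = [["cloud"], ["status bar digit"]]  # hypothetical bank contents
rows = visible_tiles(banks, tile_index=0, split_scanline=200)
print(rows[0], "|", rows[239])
```

The PPU keeps asking for the same index throughout; only the mapper knows that the answer changed halfway down the screen.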
A dedicated component called Audio Processing Unit or ‘APU’ for short provides this functionality. Ricoh embedded it inside the CPU chip to avoid unlicensed cloning of both CPU and APU.
This audio chip is a Programmable Sound Generator (PSG), which means that it can only produce pre-defined wave-forms. The APU has five channels of audio, each one reserved for a specific wave-form. The music data is found in the Program ROM.
Each wave-form contains different properties that can be altered to produce a specific note, sound or volume. These five channels are continuously mixed and sent through the audio signal.
Let’s now discuss the type of wave-forms synthesised by the APU:
Pulse waves have a very distinct beep sound that is mainly used for melody or sound effects.
The APU reserves two channels for one pulse-wave each. Each channel uses one of three different voices, selected by varying its pulse-width.
Most of the games use one pulse channel for melody and the other for accompaniment.
When the game needs to play a sound effect, the accompaniment channel is switched to play the effect and then returns to accompanying. This avoids interrupting the melody during gameplay.
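Here is a minimal sketch of how a pulse channel’s voice and pitch could be modelled. The duty patterns and the pitch formula are the commonly documented ones, not taken from this article. Note that the hardware exposes four duty settings, but the last one is an inverted copy of the second and sounds the same, which matches the three voices mentioned above.

```python
CPU_HZ = 1_789_773  # NTSC CPU clock (assumption: commonly documented value)

# One period of each duty setting, as 8 steps (assumed patterns).
DUTY_SEQUENCES = {
    0: [0, 1, 0, 0, 0, 0, 0, 0],  # 12.5%
    1: [0, 1, 1, 0, 0, 0, 0, 0],  # 25%
    2: [0, 1, 1, 1, 1, 0, 0, 0],  # 50%
    3: [1, 0, 0, 1, 1, 1, 1, 1],  # 75% (the 25% pattern inverted)
}


def pulse_frequency(timer: int) -> float:
    """Return the pitch in Hz for a given timer value (assumed formula)."""
    return CPU_HZ / (16 * (timer + 1))


for setting, steps in DUTY_SEQUENCES.items():
    print(setting, f"{sum(steps) / len(steps):.1%}")
print(f"{pulse_frequency(253):.1f} Hz")  # close to the note A4
```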
The triangle wave-form serves as a bass-line for the melody. Modifying its pitch dramatically can produce percussion as well.
The APU has one channel available only for this type of wave.
The volume of this channel can’t be controlled, possibly because the volume control is used to construct the triangle.
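That guess about the volume control is easier to picture with numbers. The sketch below assumes the commonly documented behaviour: the channel steps through 32 fixed 4-bit levels, so the staircase that would otherwise act as a volume setting is the wave itself. The sequence and the pitch formula are assumptions not found in this article.

```python
CPU_HZ = 1_789_773  # NTSC CPU clock (assumption: commonly documented value)

# 32 steps: down from 15 to 0, then back up from 0 to 15.
TRIANGLE_SEQUENCE = list(range(15, -1, -1)) + list(range(16))


def triangle_frequency(timer: int) -> float:
    """Return the pitch in Hz for a given timer value (assumed formula)."""
    return CPU_HZ / (32 * (timer + 1))


print(TRIANGLE_SEQUENCE)
print(f"{triangle_frequency(253):.1f} Hz")
```

With the same timer value, the triangle plays an octave below a pulse channel under these formulas, which suits its role as a bass-line.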
Noise is basically a set of random wave-forms that sound like white static. One channel is allocated for it.
Games use it for percussion or ambient effects.
This channel has only 32 presets available: one half produces clean static and the other produces robotic static.
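The two flavours of static can be reproduced with a shift register. The sketch below assumes the commonly documented design: a 15-bit linear-feedback shift register with two feedback modes. My reading is that the 32 presets are 16 playback speeds for each mode, but that breakdown is an assumption too.

```python
def lfsr_step(state: int, short_mode: bool) -> int:
    """Advance the 15-bit shift register by one step."""
    tap = 6 if short_mode else 1  # assumed tap positions
    feedback = (state ^ (state >> tap)) & 1
    return (state >> 1) | (feedback << 14)


def period(short_mode: bool, start: int = 1) -> int:
    """Count how many steps it takes for the register to repeat."""
    state = lfsr_step(start, short_mode)
    steps = 1
    while state != start:
        state = lfsr_step(state, short_mode)
        steps += 1
    return steps


print("long mode period:", period(short_mode=False))
print("short mode period:", period(short_mode=True))
```

The long mode takes tens of thousands of steps to repeat, so it comes across as clean static. The short mode loops after only a few dozen steps, and that audible repetition is what gives it a metallic, robotic tone.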
Samples are recorded pieces of music that can be reproduced. As you can see this doesn’t have to be a single wave-form.
The APU has one channel dedicated for it with the following capabilities: 7-bit depth, 4.2-33.5 kHz sampling rate and no volume control.
The size of a sample is significantly bigger than the space required to program a single waveform, so games normally store small pieces (like drum sets) that can be played repeatedly.
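To keep samples small, the channel is commonly documented as reading them in a delta-encoded form: each stored bit nudges the 7-bit output level up or down by a fixed amount. The sketch below assumes that scheme, including the step size of 2 and the least-significant-bit-first order, none of which is stated in this article.

```python
def decode_dmc(sample: bytes, start_level: int = 64) -> list[int]:
    """Expand a delta-encoded sample into a list of 7-bit output levels."""
    level = start_level
    output = []
    for byte in sample:
        for bit in range(8):  # assumed: least significant bit first
            if (byte >> bit) & 1:
                if level <= 125:
                    level += 2  # a 1 bit raises the level
            elif level >= 2:
                level -= 2      # a 0 bit lowers it
            output.append(level)
    return output


# Alternating bits hover around the starting level; a run of 1s ramps up.
print(decode_dmc(bytes([0b01010101, 0xFF])))
```

Under this scheme, one byte of Program ROM yields eight output values, which helps explain how drum hits fit next to the game code.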
While the APU was not comparable to the quality of a Vinyl, Cassette or CD, programmers did find different ways of expanding the capability thanks to the modular architecture of this console.
The Japanese model of the NES, the Famicom, provided exclusive cartridge pins for sound expansion. Games like Castlevania 3 included the Konami VRC6 chip, which added two extra pulse waves and a sawtooth wave.
Compare the result with the American version, which didn’t have capabilities for sound expansion.
Some games used tremolo effects to simulate more channels.
Games are mainly written in 6502 assembly language and reside in the Program ROM, while their graphics (tiles) are stored in the Character Memory.
The 16-bit address space limits the system to 64 KB of addressable memory. Since the system I/O is memory-mapped, only around 32 KB of that space is left for the program. If a game required extra space, extra chips (mappers) would be included in the cartridge, with an attendant increase in production costs.
Some cartridges included an additional battery-backed WRAM to store saves.
Nintendo was able to block unauthorised publishing thanks to the inclusion of a proprietary Lockout chip called Checking Integrated Circuit or CIC. It’s located in the console and is connected to the reset signals (and is not easily removed).
This chip runs 10NES, an internal program that checks for the existence of another Lockout chip in the game cartridge. If that check fails, the console is sent into an infinite reset loop.
Both lockout chips are in constant communication during the console’s uptime. This system can be defeated by cutting one of the pins on the console’s Lockout chip, which leaves the chip in an idle state. Alternatively, sending it a -5V signal can freeze it.
The CIC exists as a result of the fear caused by the video game crash of 1983. Nintendo’s then president Hiroshi Yamauchi decided that, in order to enforce good quality games, Nintendo would be in charge of approving every single one of them. You’ll notice that the Japanese model of the console, the Famicom, was in fact released before this event happened, which is why the CIC circuitry is used for sound expansions instead.
This article is part of the Architecture of Consoles series. If you found it interesting, please consider donating. Your contribution will be used to get more tools and resources that will help to improve the quality of current articles and upcoming ones.
A list of desirable tools and latest acquisitions for this article is tracked here:

## Interesting hardware to get (ordered by priority)

- NTSC NES or JAP Famicom (only if found at a reasonable price)
- NESRGB kit (still very expensive, may be better to wait for that)
- Any development cart out there (only if found at a reasonable price)

Alternatively, you can help out by suggesting changes and/or adding translations.
Always nice to keep a record of changes.

## 2020-08-23

- Added some historical context to the CPU section
- Corrected assumptions about the lack of BCD, thanks @danweiss and @konrad
- (Main diagram) Removed CPU connection to Character RAM, thanks @danweiss

## 2020-06-13

- Added mention to OAM DMA

## 2020-06-06

- Expanded BCD mode
- Redesigned main diagram (the NES diagram was the first one for this site, since then the style evolved a lot!)

## 2019-09-17

- Added a quick introduction

## 2019-04-06

- Corrected wrong explanation about tile glitches

## 2019-02-17

- Fixed Grammar
- Replaced images and videos with better quality ones.

## 2019-01-25

- Improved first draft with the help of @dpt
- Ready for publication