The Playstation 2 was not one of the most powerful consoles of its generation, yet it managed to achieve a level of popularity unthinkable for other companies.
This machine is nowhere near as simple as the original Playstation was, but we will see why it didn’t share the same fate as previous complicated consoles.
At the heart of this console we find a powerful package called Emotion Engine or ‘EE’, designed by Sony and running at ~294.91 MHz. This chipset contains multiple components, one of them being the main CPU, while the rest are at the CPU’s disposal to speed up certain tasks.
The main core is a MIPS R5900-compatible CPU with lots of enhancements. This is the first chip that starts executing instructions after the console is turned on. This processor provides the following features:
The core is complemented with a dedicated floating point unit (identified as ‘COP1’) that accelerates operations with 32-bit floating point numbers (also known as floats in C).
Next to the Emotion Engine are two blocks of 16 MB of RAM, giving a total of 32 MB of main memory. The type of memory used is RDRAM (déjà vu!) which is accessed through a 16-bit bus.
At first, this can be a little disappointing to hear, considering the internal bus of the Emotion Engine is as wide as 128 bits. However, both RAM chips are strategically placed following a dual-channel architecture, which consists of connecting both chips using two independent 16-bit buses (one bus per chip) to improve data throughput. The resulting setup provides a theoretical 3.2 GB/sec, so rest assured that memory bandwidth is not an issue in this console!
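As a back-of-the-envelope check of that figure, here is a small sketch of the arithmetic in C. The transfer rate is not stated above, so I am assuming PC800-class RDRAM (800 million transfers per second on each channel); the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Theoretical peak bandwidth in bytes per second.
 * The transfer rate is passed in as a parameter because the value used in
 * the usage note (800 MT/s) is an assumption, not a figure from the text. */
static uint64_t peak_bandwidth(uint32_t channels,
                               uint32_t bus_width_bits,
                               uint64_t transfers_per_second)
{
    assert(bus_width_bits % 8 == 0);
    return (uint64_t)channels * (bus_width_bits / 8) * transfers_per_second;
}
```

Under that assumption, `peak_bandwidth(2, 16, 800000000ULL)` gives 3,200,000,000 bytes per second, which matches the 3.2 GB/sec quoted above, while a single 16-bit channel would only reach half of that.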
At the heart of the Emotion Engine there is a powerful DMA Controller or ‘DMAC’ that transfers data between main memory and Scratchpad, or between main memory and any component inside the EE. Data transfers are done in batches of 128 bits and here is the interesting part: every eight batches, the main bus is temporarily unlocked. This leaves a small window to perform other DMA transfers in parallel (up to ten) or let the CPU use the main bus. This modus operandi is called slice mode and is one of the many modes available on this DMA unit. Bear in mind that while slice mode reduces stalls on the main bus, it does so at the cost of slowing down the overall DMA transfer.
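To make the idea of slice mode more tangible, here is a toy model in C. It is only a software sketch of the behaviour described above (copy eight 128-bit batches, release the bus, repeat); the types, the callback and the function name are hypothetical and do not correspond to real DMAC registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QWORD_BYTES  16  /* one 128-bit batch */
#define SLICE_QWORDS 8   /* batches moved before the bus is released */

typedef struct { uint8_t bytes[QWORD_BYTES]; } qword_t;

/* Hypothetical hook called every time the bus is released between slices. */
typedef void (*bus_release_fn)(void *ctx);

/* Copies 'qwords' batches in slices of eight and returns how many times
 * the bus was released in the middle of the transfer. */
static size_t slice_copy(qword_t *dst, const qword_t *src, size_t qwords,
                         bus_release_fn on_release, void *ctx)
{
    size_t done = 0;
    size_t releases = 0;

    assert(dst != NULL && src != NULL);
    while (done < qwords) {
        size_t chunk = qwords - done;
        if (chunk > SLICE_QWORDS)
            chunk = SLICE_QWORDS;
        memcpy(&dst[done], &src[done], chunk * sizeof(qword_t));
        done += chunk;
        if (done < qwords) {
            /* More data pending: give others a window to use the bus. */
            releases++;
            if (on_release != NULL)
                on_release(ctx);
        }
    }
    return releases;
}
```

A transfer of 20 batches is split into slices of 8, 8 and 4, so the bus is released twice along the way. Each release is an opportunity for someone else, but also a pause for this transfer, which is the trade-off mentioned above.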
Like it or not, with the amount of traffic happening inside the Emotion Engine, this design will start to suffer the consequences of the Unified Memory Architecture or ‘UMA’, and that is… multiple independent components trying to access main memory at the same time, causing congestion. Well, to correct these issues, Sony alleviated the need for constant memory usage by:
This sounds very convenient for applications that can benefit from cache, but what about those tasks, such as manipulating Display Lists, which shouldn’t use cache at all? Luckily, the CPU provides a different memory access mode called UnCached, which only uses the Write Back Buffer without wasting cycles correcting the cache (product of cache misses). Furthermore, an UnCached accelerated mode is also available, which adds a buffer to speed up reads of contiguous addresses in memory.
Inside the same Emotion Engine package, there is another processor called Image Processing Unit or ‘IPU’ designed for image decompression. This can be useful when a game needs to decode an MPEG2 movie without jamming the main CPU. To make a long story short, the game sends the compressed image streams to the IPU (hopefully using DMA), which then decodes them into a format that the GPU can display. The PS2’s operating system also relies on the IPU to provide DVD playback.
In addition to this, the IPU also allows manipulating compressed High-resolution textures which saves CPU usage and large transfers.
It’s been two years since the rivals presented their latest offering. If you read the former article and just started reading this one, I presume you are still waiting for ‘the thing’ that makes the PS2 as powerful as it seemed back then. Now, let me introduce a very important set of components Sony fitted in the Emotion Engine, the Vector Processing Units or ‘VPU’.
A Vector Processing Unit is a small independent processor designed to operate on vectors, in particular, vectors made of four floats. These processors are so fast that they spend only one cycle per operation, which can be extremely convenient for geometry processing.
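To picture what ‘operating on vectors of four floats’ means for geometry, here is a plain C sketch of a multiply-add applied to all four lanes, and how four of them transform a vertex by a 4x4 matrix. This only illustrates the data layout and the kind of operation involved; it is not VPU microcode, and the type and function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z, w; } vec4;

/* Broadcasts one scalar to all four lanes. */
static vec4 vec4_splat(float s)
{
    vec4 v = { s, s, s, s };
    return v;
}

/* acc + a * b on each of the four lanes; in this sketch it stands for
 * one vector operation. */
static vec4 vec4_madd(vec4 acc, vec4 a, vec4 b)
{
    vec4 r;
    r.x = acc.x + a.x * b.x;
    r.y = acc.y + a.y * b.y;
    r.z = acc.z + a.z * b.z;
    r.w = acc.w + a.w * b.w;
    return r;
}

/* Transforms a vertex by a 4x4 matrix stored as four column vectors. */
static vec4 mat4_transform(const vec4 columns[4], vec4 v)
{
    vec4 r = { 0.0f, 0.0f, 0.0f, 0.0f };

    assert(columns != NULL);
    r = vec4_madd(r, columns[0], vec4_splat(v.x));
    r = vec4_madd(r, columns[1], vec4_splat(v.y));
    r = vec4_madd(r, columns[2], vec4_splat(v.z));
    r = vec4_madd(r, columns[3], vec4_splat(v.w));
    return r;
}
```

With this layout, a full vertex transformation boils down to four multiply-adds, which is why a unit that handles four floats at once is such a good fit for geometry work.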
VPUs are made of the following components:
The vector unit needs to be ‘kickstarted’ to start working; for this, the main CPU is in charge of supplying the microcode. There are two VPUs fitted in the Emotion Engine, but they are arranged differently, giving way to different uses and optimisations.
The first VPU, the VPU0, is positioned between the CPU and the other vector unit (VPU1). It provides an ‘assisting’ role to the main CPU.
The VPU0 has two modes of operation:
The memory map of the VPU0 also has access to some of the other VPU’s registers and flags, presumably to check its state or quickly read the results of some operations done by the other VPU.
The second VPU found, the VPU1, is an enhanced version of the VPU0 with double the amount of micro memory and VU memory. Moreover, this unit includes an additional component called Elementary function unit or ‘EFU’ which speeds up the execution of exponential and trigonometric functions.
The VPU1 is located between the VPU0 and the Graphics Interface (the ‘gate’ to the GPU), so it includes additional buses to feed the geometry to the GPU as quickly as possible and without using the main bus.
On the other hand, due to its location, the VPU1 only operates in micromode.
It’s obvious that this VPU has been optimised for trigonometric operations, and may serve as a preprocessor for the GPU, being in charge of delivering the famous Display Lists.
A useful approach that can be exploited with these units is procedural generation. In other words, instead of building the scene using hard-coded geometry, let the VPUs generate it using algorithms. In this case, the VPU computes mathematical functions to produce the geometry that can be interpreted by the GPU (i.e. triangles, lines, quadrangles, etc) and ultimately used to draw the scene.
Compared to using explicit data, procedural content is ideal for parallelised tasks, it frees up bandwidth, requires very little storage and it’s dynamic (uses parameters to achieve different results). There are certain elements that can highly benefit from this technique:
On the other hand, procedural content may struggle with animations and, if the algorithm is too complex, the VPU might not generate the geometry in time.
To sum up, procedural rendering is not a new technique, but thanks to the VPUs, it opens the doors to further optimisations and richer graphics. Nonetheless, it is not a simple technique to implement, and Sony R&D published some papers describing different approaches to use on their console.
With these new additions, programmers now have a lot of flexibility to design their graphics engines. In fact, there are multiple papers published that analyse and benchmark popular pipeline designs.
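As a rough illustration of the procedural idea, the following C sketch builds a small patch of geometry from a formula and two parameters instead of reading stored vertices. The surface function, the structure and the names are all invented for this example; real VPU programs would be written in microcode and would output data the GPU understands.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } vertex;

/* The 'algorithm': a saddle-shaped surface. Any other formula would do. */
static float height_at(float x, float z, float amplitude)
{
    return amplitude * (x * x - z * z);
}

/* Fills 'out' with an n-by-n grid of vertices covering [-1, 1] on X and Z.
 * Returns the number of vertices written, or 0 if 'out' is too small. */
static size_t generate_patch(vertex *out, size_t capacity,
                             unsigned n, float amplitude)
{
    unsigned row, col;
    size_t count = 0;

    assert(out != NULL && n >= 2);
    if ((size_t)n * n > capacity)
        return 0;

    for (row = 0; row < n; row++) {
        for (col = 0; col < n; col++) {
            float x = -1.0f + 2.0f * (float)col / (float)(n - 1);
            float z = -1.0f + 2.0f * (float)row / (float)(n - 1);
            out[count].x = x;
            out[count].y = height_at(x, z, amplitude);
            out[count].z = z;
            count++;
        }
    }
    return count;
}
```

Notice that the only inputs are the grid size and the amplitude: changing either of them produces different geometry without storing or transferring a single extra vertex.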
Here are some examples of graphics pipelines set up with different optimisations:
In the first example, the Parallel design, the CPU is combined with the VPU0 in macromode to produce geometry in parallel with the VPU1. The CPU/VPU0 group makes full utilisation of scratchpad and cache to avoid using the main bus, which the VPU1 relies on to fetch data from main memory. At the end, both rendering groups concurrently send their respective Display Lists to the GPU.
The second example, the Serial design, proposes a different approach where the CPU/VPU0 group works as a preprocessor for the VPU1. The first stage will fetch and process all the geometry that the VPU1 will subsequently turn into Display Lists.
There are many more examples out there; it is now up to the programmer to find the optimal setup, and that is a good thing.
In this example, Jon Burton (the former director of Traveller’s Tales) explained how his team implemented a particle system fully encapsulated in the VPU1. The VPU1 received a pre-populated database from memory, which it then used to calculate the coordinates of particles at any given time. The result could be transformed into Display Lists and sent right away.
With this method, the CPU was significantly offloaded, allowing it to carry out other tasks like AI and physics.
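The key property of this kind of system is that a particle’s position depends only on its database entry and the current time, so nothing needs to be updated frame by frame on the CPU. Below is a minimal C sketch of that idea. The layout of the entry and the motion formula are my own assumptions for illustration; the actual format used by Traveller’s Tales is not described here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical database entry: everything needed to place the particle. */
typedef struct {
    float px, py, pz;   /* starting position */
    float vx, vy, vz;   /* starting velocity */
    float birth;        /* time at which the particle appears */
    float lifetime;     /* how long it stays alive */
} particle_seed;

typedef struct { float x, y, z; } point3;

/* Computes where the particle is at time 't' using a closed-form formula.
 * Returns 1 and fills 'out' if the particle is alive, 0 otherwise. */
static int particle_at(const particle_seed *seed, float t,
                       float gravity, point3 *out)
{
    float age;

    assert(seed != NULL && out != NULL);
    age = t - seed->birth;
    if (age < 0.0f || age > seed->lifetime)
        return 0;

    out->x = seed->px + seed->vx * age;
    out->y = seed->py + seed->vy * age - 0.5f * gravity * age * age;
    out->z = seed->pz + seed->vz * age;
    return 1;
}
```

Since every particle can be evaluated independently for any value of `t`, the whole database can be processed in one pass and turned straight into drawing commands.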
Considering all the work done by the Emotion Engine, is there something left? The last step actually: Display!
There’s a simple but speedy chip specialised in that: The Graphics Synthesizer or ‘GS’, running at ~147.46 MHz. It contains 4 MB of embedded DRAM to do all processing in-house, thus removing the need to access main memory. The embedded RAM is accessed using different buses depending on the type of data needed.
The GS has fewer features than other graphics systems previously reviewed on this site. Nonetheless, it is very fast at what it does.
This GPU only does rasterisation and that is… Generating pixels, mapping textures, applying lighting and some other effects. This means there are no vertex transformations (these are covered by the VPUs). Also, this is a fixed-function pipeline, so no fancy tweaking or shaders; you are stuck with a fixed shading model (e.g. Gouraud).
Looks pretty simple right? Well, let’s dive deeper to see what happens at each stage.
The Emotion Engine kickstarts the Graphics Synthesizer by filling its embedded DRAM with the required materials (Texture bitmaps and Colour Lookup tables, the latter are also known as ‘CLUT’), assigning the GS’s registers to configure it, and finally, issuing the drawing commands (Display Lists) which instruct the GS to draw primitives (points, lines, triangles, sprites, etc) at certain locations of the screen.
Afterwards, the GS will preprocess some values that will be needed for later calculations. Most notably, the initial value for the Digital Differential Analyzer, which will be used for interpolations during drawing.
Using the previous values calculated, the renderer generates the pixels of the primitives. This unit can generate 8 pixels (with textures) or 16 pixels (without textures) concurrently. Each pixel entry will contain the following values calculated:
It also performs Scissoring Tests to discard polygons outside the frame area (based on their X/Y values). Some pixel properties are forwarded to the Pixel testing stage for further checks.
The pack is then delivered to the Texture mapping engine, but each property is delivered to a specialised ‘sub-engine’, which allows different properties to be processed in parallel.
Lighting is also provided by selecting one of the two choices available, Gouraud and Flat.
This stage is powered by a large Pixel Unit that can compute up to 16 pixels at a time. Here, textures will be mapped onto the polygons. Furthermore, fog and anti-aliasing effects can be applied.
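For those unfamiliar with this kind of interpolation, the trick is to compute an increment once and then reach every intermediate value with a single addition per step. Here is a small C sketch that interpolates a colour channel this way. The 16.16 fixed-point format and the function name are assumptions made for the example; the precision the GS uses internally is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes steps + 1 values into 'out', going from 'start' to 'end'.
 * The per-step increment is computed once (the setup); after that,
 * each new value only costs one addition. */
static void dda_interpolate(uint8_t start, uint8_t end,
                            unsigned steps, uint8_t *out)
{
    int32_t value;
    int32_t delta;
    unsigned i;

    assert(out != NULL && steps > 0);
    /* 16.16 fixed point; adding 0.5 up front rounds to nearest. */
    value = (int32_t)start * 65536 + 32768;
    delta = (((int32_t)end - (int32_t)start) * 65536) / (int32_t)steps;

    for (i = 0; i <= steps; i++) {
        out[i] = (uint8_t)(value >> 16);
        value += delta;
    }
}
```

For instance, going from 0 to 100 in four steps yields 0, 25, 50, 75 and 100 without any multiplication or division inside the loop.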
Texture maps are fetched from DRAM in an area defined as Texture buffer, although this is interfaced by a separate area called Texture Page Buffer which seems to serve as a caching mechanism for textures. CLUTs are also mapped using this page system. Both elements are retrieved using a 512-bit bus.
The pixel unit performs perspective correction to map textures onto the primitives (a great improvement considering the previous affine mapping approach). Moreover, it also provides bilinear and trilinear filtering, the latter is used alongside mipmapped textures.
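The difference between affine and perspective-correct mapping can be shown with a few lines of C. This sketch follows the textbook method (interpolate u/w and 1/w linearly, then divide) for a single texture coordinate along one edge; it is not a description of the GS’s internal circuitry, and the names are made up for this example.

```c
#include <assert.h>

/* A texture coordinate 'u' attached to a vertex at depth 'w'. */
typedef struct { float u; float w; } tex_vertex;

/* Affine mapping: interpolates 'u' directly, ignoring depth. */
static float affine_u(tex_vertex a, tex_vertex b, float t)
{
    return a.u + (b.u - a.u) * t;
}

/* Perspective-correct mapping: interpolates u/w and 1/w linearly in
 * screen space and recovers 'u' with a division at each pixel. */
static float perspective_u(tex_vertex a, tex_vertex b, float t)
{
    float u_over_w;
    float one_over_w;

    assert(a.w > 0.0f && b.w > 0.0f);
    u_over_w   = a.u / a.w + (b.u / b.w - a.u / a.w) * t;
    one_over_w = 1.0f / a.w + (1.0f / b.w - 1.0f / a.w) * t;
    return u_over_w / one_over_w;
}
```

For an edge that goes from depth 1 to depth 4, the affine version says the middle of the screen-space edge shows the middle of the texture (0.5), whereas the corrected version returns roughly 0.2, since the far half of the edge is squeezed by perspective. That mismatch is the source of the ‘wobbly’ textures seen with affine mapping.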
Here certain pixels will be discarded if they don’t meet certain conditions. Having said that, the following tests are carried out:
The last stage can apply some effects using the previous frame-buffer found in DRAM:
Finally, the new frame-buffer, along with the updated Z-buffer, are written to memory using a 1024-bit bus.
There’s a dedicated component inside the GS called Programmable CRT Controller or ‘PCRTC’ which sends the frame-buffer in memory to the Video output, so you can see the frame on TV. But that’s not all: It also contains a special block called Merge Circuit that can alpha-blend two separate frame-buffers (useful if games want to reuse the previous frame to form the new one). The resulting frame can be outputted through the video signal and/or written back to memory.
With all that being said, this surely brought better designs to refresh already-famous characters. Take a look at this ‘Before & After’:
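In software terms, alpha-blending two frame-buffers is a weighted average applied to every colour channel. The C sketch below shows the generic formula with 8-bit channels; the exact equation and precision used by the Merge Circuit are not stated here, so treat the function names and the rounding as assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Weighted average of two 8-bit channels: alpha = 255 keeps 'a',
 * alpha = 0 keeps 'b'. The + 127 rounds to the nearest value. */
static uint8_t blend_channel(uint8_t a, uint8_t b, uint8_t alpha)
{
    unsigned mixed = (unsigned)a * alpha + (unsigned)b * (255u - alpha);
    return (uint8_t)((mixed + 127u) / 255u);
}

/* Blends two frame-buffers of 'size' bytes each into 'out'. */
static void merge_frames(const uint8_t *frame_a, const uint8_t *frame_b,
                         uint8_t *out, size_t size, uint8_t alpha)
{
    size_t i;

    assert(frame_a != NULL && frame_b != NULL && out != NULL);
    for (i = 0; i < size; i++)
        out[i] = blend_channel(frame_a[i], frame_b[i], alpha);
}
```

Blending the previous frame with the new one using a mid-range alpha is, for instance, a cheap way to get a motion-blur-like trail.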
I think it’s also worth including new game series whose characters were modelled with high levels of detail from the ground up:
It’s worth mentioning that games like Dragon Quest implemented a custom lighting model called Cel Shading (a term I have mentioned before). However, in my previous articles I explained that the GPU was mainly responsible for this. In the PS2’s case, the required colour calculations are presumably done by the Emotion Engine, since the GS isn’t as flexible as other GPUs.
As stated before, the PCRTC sends the frame-buffer through the video signal. It can broadcast the video signal using the following modes to work with TVs from any geographical region:
There are quite a lot of modes to choose from, but it all comes down to the format adoption during the early 2000s, which narrowed down to PAL and NTSC. Also, even though PAL provided a higher resolution than NTSC, some European versions of NTSC games resorted to letterboxing to mask the unused horizontal lines and slowed down the refresh rate to fit the 50Hz limit. I call these ‘Bad ports’!
The video out port (Multi A/V) is very convenient. It outputs RGB, Component, S-Video and composite. So, all the important signals are there without requiring proprietary adapters.
The new audio chip is an incremental update of the old SPU called… SPU2! Improvements include 2 MB of internal memory and 48 available channels (two times the original amount).
The SPU2 is made of two sound processors (referred to as CORE0 and CORE1) running at ~36.86 MHz; each one processes 24 channels. Curiously enough, these are still two independent processors configured by altering their registers. However, Sony warned developers that both sets of registers have to be set with a gap of 1/48000 seconds between them. If you hurry too much, the behaviour of the SPU2 becomes unpredictable!
It still includes the same effects as the original SPU. The memory provided is used as a ‘work area’: You can store raw waveform data and reserve some space to process it and apply some effects on it. Finally, the chip can mix all channels to give stereo output. Now, here is the interesting part: The SPU2 can feed itself the mixed stereo sample as new input; this enables the EE to access it (to mix it with even more audio, for instance), or keep adding more effects.
Digital effects such as reverb, echo and delay can be achieved by cycling through the output of CORE0, memory and samples processed on CORE1. This requires reserving a good portion of memory.
The signal is outputted through either Digital audio (referred to as the Sony/Philips Digital Interface or ‘S/PDIF’) or Analog Audio (which goes through the digital-to-analogue converter and ends at the Multi A/V port).
The I/O of the PS2 is not complicated, yet multiple revisions of this console completely changed various internal and external interfaces.
To start with, there’s a dedicated processor that arbitrates the communication between different components. This CPU is none other than the original MIPS R3000-based core found in the Playstation 1; this time it’s called IOP and runs at 37.5 MHz using a 32-bit bus.
The IOP communicates with the Emotion Engine using a specialised I/O interface called System Interface or ‘SIF’; both endpoints use their DMA units to transfer data between each other. The IOP also contains its own memory, used as a buffer. Looking from the outside, the IOP gives access to the front ports, DVD controller, SPU2, the BIOS ROM and the PC card slot.
By including the original CPU, we can suspect PS1 compatibility would eventually happen somehow. Conveniently enough, the IOP happens to include the rest of the components that formed the CPU subsystem of the PS1 and the core can be under-clocked to run at PS1 speed. Unfortunately, the SPU2 has changed too much for the PS1, but for that, the Emotion Engine is ‘repurposed’ to emulate the old SPU.
In later revisions of this console, the IOP was replaced with a PowerPC 401 ‘Deckard’ and 4 MB of SDRAM (2 MB more than before). Backwards compatibility persisted, but through software instead.
This console kept the previous front ports that were included in the original Playstation, yet it also featured a couple of ‘experimental’ interfaces that looked very promising at first.
The most popular addition: Two USB 1.1 ports, widely adopted by accessories. These persisted on all future revisions.
Now, what about the ‘volatile’ ones? To start with, there used to be a front i.Link port (also known as IEEE 1394, or ‘FireWire’ in the Apple world). This port was used to connect two PS2s to enable local multiplayer. The port was removed after the third revision (presumably replaced by the ‘Network card’, more details below).
On the back of the console we had a slot for PC cards. You could buy the ‘Network Adaptor card’ from Sony that provided two extra ports: one for connecting an ethernet cable, and another one for plugging in a proprietary and external ‘Hard Disk Drive Unit’, also sold by Sony. Having a drive allowed games to store temporary data (or permanently install themselves there) for faster load times. Just a few games used this feature, though.
In later revisions, the PCMCIA port was replaced by an Expansion Bay where a 3.5” hard drive could be fitted inside the console. You first had to buy a Network adaptor, which not only provided Modem and/or ethernet ports (depending on the model), but also included the required connections for an ATA-66 hard disk. ‘Slim’ revisions completely removed this feature, but left an ethernet port permanently installed on the back. In addition to that, the new revision added a new front port, the infrared sensor.
The new version of their controller, the DualShock 2, is a slightly improved version of the DualShock. During the days of the original Playstation, lots of revisions of the original controller were released, each with different features, and with this came fragmentation. Now, for the benefit of developers, there was a single controller that standardised all the previous features.
Compared to the DualShock, the new version featured a slight redesign and included two analogue sticks and two vibration motors for a richer input.
Next to the controller slot is the Memory Card slot, which is compatible with PS1 and PS2 cards. The new cards embed extra circuitry for security purposes referred to as MagicGate, which enables games to block data transfers between different memory cards.
There’s a 4 MB ROM chip fitted on the motherboard that stores a great amount of code used to load a shell menu that users can interact with. It also provides system calls to simplify I/O access, which games rely on.
Upon boot, the CPU will execute instructions in ROM which in turn will:
OSDSYS module, which will display the splash animation and the shell menu.
The functionality of the shell of this console is very similar to the other consoles of the same generation.
The shell features some practical sections which allow the user to perform day-to-day operations, like manipulating the saves of the memory card. It also provides some exceptional options, like changing the current video mode.
The level of popularity this system achieved during the noughties is unprecedented; so much so that, at the end of its lifespan (2013, after 13 years!), the game library was filled with 1850 titles.
What happened here is really impressive. The PS2 doesn’t have a ‘programmer-friendly’ architecture (as seen from the perspective of a PC programmer), yet with such a large number of games developed, I too wonder if there were more factors involved (such as licensing relief, low distribution costs, cost of development, small size of case and so on).
Sony provided the hardware and software to assist the development of games.
On the software side, there was the Playstation 2 SDK which included:
On the hardware side, Sony provided studios with dedicated hardware to run and debug games in-house. The initial devkits were bare boards stacked together to replicate the un-released hardware of the PS2. Later kits named Development Tool had a more presentable appearance, enhanced I/O and combined workstation hardware (running Red Hat 5.2) with PS2 hardware to build and deploy the game in the same case.
Combining the Devkit, the official SDK and CodeWarrior (a famous IDE) was a very popular setup.
The disc drive can read both DVDs and CDs, so games could be distributed using either format, but for obvious reasons you will find most titles in DVD format.
DVDs can hold up to 4.7 GB of data in the case of DVD-5 (the most common ‘sub-format’) or 8.5 GB in the case of DVD-9 (dual-layer version, less common). There’s actually a third format, DVD-10, which is double-sided, but no games used it.
Due to the type of medium used, not only games could be played, but also movies. Now, this requires a decoder to be able to read the DVD movie format, and for that, the PS2 initially included the required bits installed in the memory card (after all, the card is just a storage medium) but later models came with the DVD software pre-installed in the BIOS ROM.
In terms of speed, CD-ROMs are read at 24x speed (so 3.6 MB/s) and DVD-ROMs are read at 4x speed (5.28 MB/s).
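In case you wonder where those two numbers come from, the ‘Nx’ notation simply multiplies a base rate. The base rates are not given above, so the sketch below assumes the conventional figures of 150 KB/s for a 1x CD-ROM and about 1.32 MB/s for a 1x DVD-ROM, which reproduce the values quoted; the constant and function names are made up for this example.

```c
#include <assert.h>

#define CD_1X_KB_PER_S   150u  /* assumed base rate of a 1x CD-ROM drive */
#define DVD_1X_KB_PER_S 1320u  /* assumed base rate of a 1x DVD-ROM drive */

/* Transfer rate in KB/s of a drive rated at 'multiplier' times its base. */
static unsigned drive_rate_kb_per_s(unsigned base_kb_per_s,
                                    unsigned multiplier)
{
    assert(multiplier > 0);
    return base_kb_per_s * multiplier;
}
```

With these assumptions, `drive_rate_kb_per_s(CD_1X_KB_PER_S, 24)` gives 3600 KB/s (3.6 MB/s) and `drive_rate_kb_per_s(DVD_1X_KB_PER_S, 4)` gives 5280 KB/s (5.28 MB/s), so the ‘slower’ 4x rating of DVDs is actually the faster of the two.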
As you have seen, the networking features of this console weren’t standardised until later revisions, which arrived four years after the first release. Similarly, game studios were in charge of providing the necessary infrastructure if they decided to provide online services (like multiplayer). In later years, Sony deployed the Dynamic Network Authentication System or ‘DNAS’. It wasn’t an online server, but an authentication system to prevent pirated games from connecting online.
Apart from all these games with their fancy graphics, Sony released a Linux distribution based on ‘Kondara’ (which is in turn based on Red Hat 6) available in two DVDs (first disc called ‘Runtime Environment’ and the second one called ‘Software Packages’) along with a VGA adapter, USB Keyboard and Mouse; plus some developer manuals. The pack was known as Linux Kit and you could run the OS by booting the first DVD and then proceed like any old school Linux environment. You obviously needed a hard drive fitted in the console and, even once the OS was installed on the drive, the DVD was always required to boot it.
Linux Kit included the compilers targeting the EE (gcc 2.95.2 with glibc 2.2.2) and assemblers targeting the vector units, along with a window system (XFree86 3.3.6) ‘accelerated’ in the Graphics Synthesizer. Overall, this sounds like an interesting environment. In fact, one of the papers I read to write this article was done using this setup.
There’s quite a lot to talk about here so let’s start with the DVD reader, shall we?
This section was particularly concerning for game studios, since this console used a very affordable disc format to store games and ran an extreme risk of being pirated.
When the OS loads a game, it does so by sending specific commands to the DVD reader. The commands specifically used to read the content of a game behave very differently from the rest of the commands (which can be used to read a DVD movie, for instance). It turns out authorised games contain an out-of-reach ‘map file’ in the inner section of the disc that indexes the filesystem by name, position and size. When the DVD reader is asked to read a game disc, it will always navigate through the disc using the map file, meaning that a pirated copy of a game, which could not include the map file, will be impossible to read. This was complemented by a region lock system that prevented imported games from working in a console of a different region.
Having explained the most critical part of this console, let’s take a look at multiple methods discovered during the life of this console that could bypass the different protection mechanisms.
As with any other console of its generation (and previous ones) using disc-based systems, it was a matter of time before third-party companies reverse-engineered the DVD subsystem in order to find a usable exploit which could force the driver to read the file system without needing an out-of-reach map file.
This eventually happened in the form of modchips which lifted the region locking restrictions as well.
Along with the modchips, which required soldering skills to install, unauthorised but ‘genuine’ discs appeared in the market, allowing users to defeat the region protection and use in-game cheats by patching the OS. These had the advantage of not requiring modifications to the console. I guess the best example to mention was CodeBreaker.
In the middle of the latest advancements, yet another trick appeared. This time, exploiting the reader’s handling of faulty sectors. Swap Magic looks like another ‘genuine’ disc, but its ‘game’ tells the DVD reader to read a non-existent executable found on a deliberately faulty sector, causing the driver to halt. This window of opportunity allowed users to swap the disc for a non-genuine one. Then Swap Magic, still loaded in memory, bootstrapped the main executable of the new disc, loading a real game at the end. All of this, with the driver still thinking a genuine disc is inserted.
This doesn’t necessarily require altering the console. However, depending on the model, the external case of the PS2 will have to be tampered with to block the eject sensors of the drive (in some cases, placing cotton in certain places will do the trick).
The PS2 stores a database file called TITLE.DB in the Memory Card which contains some parameters to optimise the emulation of PS1 games. When a PS1 game is inserted, the OS fetches the database file and loads the whole file in memory at a fixed address (strike one). The parameter parser is implemented using strncpy(), a function in C that copies strings (chains of characters) from one place to another.
For people familiar with C, you probably guessed where I’m going. The thing is that a C string doesn’t carry its own length, so a plain copy keeps going until it finds the terminator (\0 at the end of the string), wherever that happens to be (with unpredictable results!). Luckily, strncpy() takes a parameter that specifies the maximum number of bytes to be copied, which protects the copy from buffer overflows as long as that limit matches the size of the destination. As ludicrous as it may seem, Sony didn’t use this parameter to cap the copy at the size of the destination, even though each parameter entry is set to be 256 bytes long (strike two).
Upon closer inspection in RAM, TITLE.DB happens to be copied next to a saved register, $ra, which states the address to return to after the function currently being executed finishes (strike three), making way for the independence exploit: Craft a TITLE.DB with a large string, embed an executable in it and set that string so $ra will point to the executable. If you manage to upload that file to your Memory Card (through another exploit or a PC USB adapter), you’ve got yourself a simple Homebrew launcher.
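For reference, this is roughly what a copy bounded by the size of the destination looks like in C. To be clear, this is not Sony’s code (which is not reproduced here) but a minimal sketch of the defensive pattern, using the 256-byte entry size mentioned above; the function name is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENTRY_SIZE 256  /* size of one parameter entry, as stated above */

/* Copies at most ENTRY_SIZE - 1 bytes of 'src' into 'dst' and always
 * terminates the result. The limit comes from the destination, never
 * from the (untrusted) input. Returns 1 if everything fitted, 0 if the
 * input had to be truncated. */
static int copy_entry(char dst[ENTRY_SIZE], const char *src, size_t src_len)
{
    size_t n;

    assert(dst != NULL && src != NULL);
    n = (src_len < ENTRY_SIZE - 1) ? src_len : (size_t)(ENTRY_SIZE - 1);
    strncpy(dst, src, n);
    dst[n] = '\0';  /* strncpy() does not terminate when it hits the limit */
    return src_len <= ENTRY_SIZE - 1;
}
```

With a copy like this, an oversized string in the file would simply be cut at 255 characters instead of spilling over whatever sits next to the buffer (such as a saved $ra).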
After the slim revision was released, the exploit got patched (I wonder how). Curiously enough, it wasn’t the last blunder that exposed clumsy code.
Some time ago, it was discovered that the BIOS of this console could be upgraded using the Memory Card. This function was never used in practice, but never removed either (at least for most of the console’s lifespan). With this, hackers discovered that if software could be installed on the Memory Card, then the BIOS would always load it at boot. This discovery led to Free MCBoot, a program presented as ‘upgrade data’ which replaced the original shell with one that could execute Homebrew. Bear in mind these changes are not permanent, but only applied if a Memory Card with Free MCBoot installed is inserted during the console’s startup.
Additionally, this software needs to be installed somehow, so another exploit (e.g. disc swapping) is required to run the installer.
In the same year, after the release of Free MCBoot, another trick was discovered: Disguising games as DVD movies, effectively allowing unauthorised game copies to be read without requiring a modchip. This only required patching the game image by adding dummy metadata and partitions only used by DVD movies. Then, when the burned copy is inserted in the console, the drive won’t reject it, but it won’t execute the game either. However, with the help of a Homebrew program called ESR, the game could be kickstarted.
Congratulations and thank you for reaching the end of the article! To be honest, there was so much to talk about that I wondered if readers would eventually get tired of Playstation-related stuff after finishing this.
Anyway, in all seriousness, I do hope you discovered new things after reading this article and if you have any comments, don’t hesitate to drop me a mail.
Until next time!
This article is part of the Architecture of Consoles series.
A list of desirable tools and latest acquisitions for this article are tracked in here:
## Interesting hardware to get (ordered by priority)
- Hard drive/Network card for the SCPH-10000.
- PS2 SCPH-3x00x model to check out Homebrew using the hard drive.
## Acquired tools used
- PS2 SCPH-10000 (£60) to do a proper motherboard analysis.
- Old PS2 Slim grey (£40?) from back then.
Always nice to keep a record of changes.
## 2020-05-03
- Replaced wikipedia motherboard photo with better one
## 2020-04-13
- Added procedural rendering
## 2020-04-08
- Added the independence exploit
- Overall corrections
- I think this can be considered as the second draft
- Released to the public
- Corrected memory diagram, thanks @Turrican3_IT for pointing out
## 2020-04-07
- Private draft finished
- Carlos, did you know the guy from dragon quest has a mouse in his pocket?
## 2020-04-05
- First *rough* draft done