Chromatic Mpact review

A different kind of hardware

Nowadays we have only one way of making 3D accelerators, but in the mid-nineties all options were open. Personal computers were just becoming multimedia machines, and a strong wave of digital signal processors was coming to handle the new workloads. The biggest names in the industry, such as IBM, Motorola, Analog Devices, TI and AT&T, were developing their own DSPs. And then there were also fresh startups like Xenon Microsystems. Founded with Kubota (yes, those bulldozer makers had a graphics division, which also spawned Gary Tarolli and ATi engineers), Convex (a vector supercomputer maker sold to HP in 1995), Showgraphics and Rambus, Xenon was to develop a flexible DSP for a multi-function multimedia solution. Before release the company was renamed Chromatic Research. In parallel they worked on a media-oriented x86-compatible CPU, which did not see the light of day. What made Chromatic special was support not only for the usual audio and video processing but also for 3D and other functions. 3D graphics relies heavily on vector computing, which is a strength of media DSPs. However, the computational power of Mpact! comes from the various logic units in it, and all the higher-level functionality has to be programmed. More specialized designs with minimal programmability are of course faster. Chromatic hoped to overcome this by exploiting the flexibility of Mpact! and supporting many more features with their software-driven chip. Parallel hardware and software development should also have sped up time to market.

First Mpact!

The first chip was ready before the end of 1995 and was named Mpact! /3000. It was rated at an incredible peak of 3 billion operations per second at 62.5 MHz. While Mpacts had lower clocks than contemporary CPUs, their SIMD instructions made such awesome numbers possible. However, two more things were needed to bring Mpact to life: volume manufacturing partners and software. LG and Toshiba took advantage of Mpact! and built their derivatives, but software development took longer and delayed the launch until September 1996. Meanwhile, a 135 MHz 24-bit RAMDAC and a clock synthesizer were integrated into the chip. These cost-efficient chips carry an R in the name, e.g. R/3000. Also unveiled at launch was a new higher-frequency design, the R/3600, ticking at 75 MHz thanks to a transition from a 0.5 to a 0.35 micron process. The card was priced at the level of a midrange 3D accelerator, but Mpact! also handled high-quality playback of MPEG-2 and other video codecs, a rich spectrum of audio codecs, 3D sound and later also AC3, and a 33k DSVD modem. Together with telephony and videophone, Chromatic was boasting support for seven demanding multimedia techniques. But board manufacturers were not very interested, and it took a while until STB, Gateway and Micron saved Mpact and used the /3600 largely as a DVD decoder. This is a testament to the flexibility of the design: DVD and 56k modem standards were implemented after the chip was done. Similarly, Sound Retrieval System extensions were implemented after Chromatic Research entered into an agreement with SRS Labs in 1996. Fixed-function hardware rarely gains new features over time, but Chromatic's solution was flexible enough to emulate legacy hardware just as well as to adopt new multimedia standards.
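The /3000 and /3600 names line up neatly with the peak ratings. Here is a back-of-the-envelope check, assuming both chips issue the same number of operations per clock; the per-clock figure is derived from the numbers quoted above, not taken from a Chromatic datasheet.

```python
# Figures quoted in the text above.
PEAK_OPS_3000 = 3_000_000_000   # peak operations per second of the /3000
CLOCK_3000_HZ = 62_500_000      # 62.5 MHz
CLOCK_3600_HZ = 75_000_000      # 75 MHz


def ops_per_clock(peak_ops: int, clock_hz: int) -> float:
    """Peak operations completed in one clock cycle."""
    return peak_ops / clock_hz


def scaled_peak(per_clock: float, clock_hz: int) -> float:
    """Peak rating at another clock, ASSUMING unchanged work per clock."""
    return per_clock * clock_hz


per_clock = ops_per_clock(PEAK_OPS_3000, CLOCK_3000_HZ)
print("operations per clock:", per_clock)
print("estimated /3600 peak:", scaled_peak(per_clock, CLOCK_3600_HZ))
print("clock ratio /3600 vs /3000:", CLOCK_3600_HZ / CLOCK_3000_HZ)
```

Under that assumption the faster part comes out at 3.6 billion operations per second, which would explain the model number.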

Jabil Mpact! PCI

Thanks to Vogons member hard1k, I had a chance to put this R/3000 4 MB beauty through its paces. Not many vendors would pack a guide with benchmark numbers and instructions on how to reproduce them with their card. Chromatic was very proud of their 2D performance.

Mpact has its own real-time kernel, the MRK, separating it from system interrupts and the latencies of memory and the PCI bus. The MRK can choose between real-time scheduling and preemptive multitasking. The dirty part was programming the core itself, as VLIW requires a lot of assembly magic to extract performance. SIMD techniques are hard to program as well. Chromatic put so much effort into Mediaware that they were very protective of it and never published it on the web, much to the dismay of users, as the only chance for an update was getting a CD from a computer vendor. Since at least version 1.4, the Mpact has had Direct3D capability. There is a catch, however: only cards with their own video output can serve as actual 3D accelerators.

And depending on setup some features may be disabled until you authorize your precious Mpact!

Wait, cards without video output? Yes, stripped of all the I/O.

Here is STB's add-on board used for ultimate DVD playback on a PC. The kind which cannot be your 3D accelerator. Whenever you see that fancy letter R on a graphics chip, you know there is a Rambus controller inside. Rambus is a high-frequency, low-pin-count memory solution. While only one byte of data is transmitted per clock, RDRAM uses an extreme frequency of 500 or 600 MHz for the Mpact /3000 or /3600 respectively. The memory has higher latencies than other synchronous RAM, but graphics chips are good at hiding them. The first released card had two megabytes of memory, this one comes with four, and some carried even six.
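To put those Rambus figures into perspective, here is a small sketch of the peak transfer rate. It assumes a single channel moving one 9-bit byte per transfer and ignores protocol overhead and latency, so real sustained throughput would be lower; the function name is made up for illustration.

```python
def rdram_peak(transfers_mhz: int, bits_per_byte: int = 9) -> tuple[int, int]:
    """Return (bytes per second, bits per second) for one byte per transfer.

    bits_per_byte defaults to 9 because the Mpact uses the ninth
    (parity) bit of each RDRAM byte as data.
    """
    bytes_per_second = transfers_mhz * 1_000_000
    return bytes_per_second, bytes_per_second * bits_per_byte


for chip, mhz in (("/3000", 500), ("/3600", 600)):
    bytes_s, bits_s = rdram_peak(mhz)
    print(f"Mpact {chip}: {bytes_s / 1e6:.0f} MB/s peak, {bits_s / 1e9:.1f} Gbit/s raw")
```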

Architecture


Grey parts are programmable. The datapath contains all the ALUs.

The design philosophy was derived from a vision of an unpredictably developing, multimedia-heavy future. Dedicated hardware for each function would be inefficient and underutilized. Forget about a graphics pipeline; this is a special-purpose CPU. Mpact does not have a data cache, since it is pretty much useless for streaming data. The architecture revolves around multiport addressable SRAM instead of registers, but also has four special-purpose registers for indirect addressing of its 72-bit SRAM entries. All ALU I/O traffic goes through the relatively large SRAM with more than 4 GB/s of bandwidth. Since graphics is computed in these ALUs and not in a pipeline of specialized stages, it takes Mpact four clocks to render one 3D pixel. Unlike in fixed-function graphics hardware, program jumps are supported, and so are hardware loop counters without branch overhead. The key unit for DVD playback, as in other media processors, is an MPEG-2 system bit-stream variable-length decoder. The Mpact can feed contiguous SRAM entries with DMA transfers asynchronously with the ALUs. Multimedia workloads love such high parallelism. Four read and write ports to the SRAM are available to five ALU groups. Data crossbar bandwidth is an awesome 11 GB/s for the Mpact /3000.

The design is based on 9-bit bytes instead of 8-bit bytes, which is equivalent to adding a sign bit to each byte and is very useful for MPEG decoding. One of the founders was also a founder of Rambus, so not surprisingly Chromatic used RDRAM as its memory technology. Rambus specifies 9-bit memory devices for parity purposes, but the Mpact uses the ninth bit for additional precision, giving data sizes in multiples of 9. This is why Mpact has some unique precisions, such as 18-bit Z-buffering and audio sampling. 24-bit 2D colors are supported. 3D acceleration is nominally limited to 16-bit colors. I say nominally because the Mpact uses six bits for every color of the RGB trio, again reaching 18-bit precision.
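A minimal sketch of what those multiples of nine mean in practice. The bit layout of the 18-bit color word (red in the top six bits) is my assumption for illustration; Chromatic's actual format is not documented here.

```python
SRAM_ENTRY_BITS = 72  # width of one SRAM entry, as stated above


def signed_range(bits: int) -> tuple[int, int]:
    """Smallest and largest two's-complement value for a given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def pack_rgb666(r: int, g: int, b: int) -> int:
    """Pack three 6-bit channels into one 18-bit word (assumed R-G-B order)."""
    for channel in (r, g, b):
        if not 0 <= channel <= 63:
            raise ValueError("each channel must fit in 6 bits")
    return (r << 12) | (g << 6) | b


def unpack_rgb666(word: int) -> tuple[int, int, int]:
    """Inverse of pack_rgb666."""
    return (word >> 12) & 0x3F, (word >> 6) & 0x3F, word & 0x3F


print("9-bit signed range:", signed_range(9))
print("18-bit values per SRAM entry:", SRAM_ENTRY_BITS // 18)
print("white as an 18-bit word:", hex(pack_rgb666(63, 63, 63)))
```

The signed 9-bit range is where the MPEG benefit comes from: the difference between two 8-bit samples lies between -255 and 255, which fits in nine bits but not in eight.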

792 bit internal bus for massive sustainable throughput.

Very Long Instruction Word (VLIW)
Early DSPs had two or three functional units and reduced code size thanks to CISC instruction sets, which supported commonly used parallel issues as single instructions with a tag determining sequential or concurrent execution. Newer DSPs have more ALUs, and the first Mpact features five of them. Extracting high performance from such a processor requires stronger instruction feeding. Mpact issues two instructions at once, and each can control multiple ALUs. The alternative superscalar approach with run-time instruction parallelization has higher overhead. Defining software compatibility at the source code level is not a problem for media processors, whose software life cycle is shorter than that of a mainstream CPU. The bigger code size of a CISC architecture can be tamed with instruction stream compression. The code density of Mpact can do without it, as its instructions are of a RISC SIMD nature. VLIW is highly flexible and utilizes resources efficiently when given a good compiler, but only up to a certain amount of parallelism. Beyond that, compiler complexity increases and benefits diminish due to less adaptable algorithms, wiring delays, and a widening gap between on-chip processing speed and available bandwidth to external memory. Demand for higher processing power continues, and modern GPUs often combine VLIW parallel processing with superscalar designs / multithreading.
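To illustrate why VLIW pushes the work onto the compiler, here is a toy static scheduler that packs operations into two-wide bundles. It is not Chromatic's toolchain or instruction format: the `Op` type and register names are hypothetical, it keeps program order, and it assumes every operation in a bundle reads its sources before any result is written.

```python
from typing import NamedTuple


class Op(NamedTuple):
    name: str               # mnemonic, for display only
    dest: str               # register written
    srcs: tuple[str, ...]   # registers read


def independent(op: Op, bundle: list[Op]) -> bool:
    """True if op neither reads nor overwrites a result produced in bundle."""
    return all(o.dest not in op.srcs and o.dest != op.dest for o in bundle)


def pack(ops: list[Op], width: int = 2) -> list[list[Op]]:
    """Greedy in-order packing: fill a bundle until it is full or a
    dependency forces the next operation into a new bundle."""
    bundles: list[list[Op]] = []
    current: list[Op] = []
    for op in ops:
        if len(current) < width and independent(op, current):
            current.append(op)
        else:
            bundles.append(current)
            current = [op]
    if current:
        bundles.append(current)
    return bundles


program = [
    Op("mul", "t0", ("a", "b")),
    Op("mul", "t1", ("c", "d")),
    Op("add", "t2", ("t0", "t1")),  # needs both products, so it starts a new bundle
    Op("sub", "t3", ("a", "c")),    # independent of the add, shares its bundle
]
for cycle, bundle in enumerate(pack(program)):
    print(cycle, [op.name for op in bundle])
```

A superscalar CPU makes the same independence checks in hardware on every cycle; here they happen once, before the program ever runs, which is where the lower run-time overhead comes from.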

Experience


Considering the complexity required to make the first Mpact! do 3D graphics, I am pleasantly surprised that the Jabil card was extremely stable. However, here the good news ends. First, the quality slider has to be addressed. Yes, it is one of those products offering such compromises. For my tests the highest settings were used. Knocking the slider down one notch sacrificed some perspective correction, most of the time barely noticeable. Moving the slider further, you get only a very approximate bilinear texture filter that tries to mash the edges of texels together rather than interpolating between them. Given how demanding true interpolation is, even this setting is quite legitimate. In the lower half of the quality slider, however, the Mpact also gives up resolution: it renders internally at a quarter of it and only upsamples the image to your target resolution. A crude help to achieve higher framerates on a card that struggles with all games. My gripe with the slider is that Chromatic did not describe any of it, and an inexperienced user thus has no way of knowing what effect the setting has.
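The low-quality trick can be sketched as follows. I am assuming "quarter resolution" means half the width and half the height, and I use plain pixel replication; which filter the driver really applies when upsampling is not documented.

```python
def upsample_2x(image: list[list[int]]) -> list[list[int]]:
    """Double width and height by repeating every pixel (nearest neighbour)."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # second copy so the rows are not aliased
    return out


# A tiny 2x2 "internal" frame blown up to 4x4 for display.
small = [[1, 2],
         [3, 4]]
for row in upsample_2x(small):
    print(row)

# Only a quarter of the displayed pixels are actually rendered.
print((320 * 240) / (640 * 480))
```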


At the highest quality the bilinear filter does what it should, yet the colors do not necessarily give off an 18-bit vibe.

There are more reasons to call out Chromatic for not being all that honest. The card's CD came with a sampler highlighting Mpact's abilities. The 3D section would make you think it is about Direct3D rendering, even though the demo used for the demonstration employs DirectDraw at best.


The Cancum demo, which Chromatic tried to use as a demonstration of 3D abilities.

The thing is, Mpact! under Direct3D can only achieve a fraction of such visuals. And in terms of compatibility it blows as well. Most games would not work, and getting a properly rendered 3D image out of it is a rare occasion, mostly due to the lack of texture blending support. But most disappointing to me is the memory management. The card hardly ever finds free space to render at 640x480, not to mention higher supported resolutions such as 1024x768. Applications that check free video memory will refuse the card outright, as it may report a capacity below 1 MB for Direct3D. I tried removing the memory expansion, bringing video memory down to 2 MB, and it made Mpact! almost completely unusable for 3D. Perhaps a card with 6 MB would be the only way to get more tests done.
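A rough budget shows how little room there is even before any driver overhead. This sketch assumes a front buffer, a back buffer and a Z-buffer at two memory bytes per pixel each; how Mediaware really lays out video memory, and how much it reserves for its own code, I do not know.

```python
MIB = 1024 * 1024


def frame_budget(width: int, height: int,
                 bytes_per_pixel: int = 2, buffers: int = 3) -> int:
    """Bytes needed for the assumed front, back and Z buffers."""
    return width * height * bytes_per_pixel * buffers


for width, height in ((320, 240), (640, 480), (1024, 768)):
    need = frame_budget(width, height)
    for card_mib in (2, 4, 6):
        left = card_mib * MIB - need
        status = f"{left / 1024:.0f} KB left for textures" if left >= 0 else "does not fit"
        print(f"{width}x{height} on a {card_mib} MB card: {status}")
```

Under these assumptions 640x480 leaves only about a quarter of a megabyte on a 2 MB card, and 1024x768 with Z-buffer does not fit into 4 MB at all, which is at least consistent with what I observed.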


The Fog City benchmark was one of the nicest things the Mpact drew.
At lower resolution you get the ship textured as well.

Now to the screenshots themselves. They really have to be considered before looking at the performance numbers. Out of desperation for more results I was perhaps more lenient than usual. Carmageddon omitted quite a few textures. Flight Simulator 98, for example, had to use low-resolution mode to output a recognizable screen. Mortal Kombat 4 made me doubt whether video textures are really supported. Mpact gave up on textures in Motoracer almost completely. The most surprising incompatibility is with games based on the MechWarrior 2 engine. These staples of early Direct3D gaming were never drawn well enough to be considered, if they ran at all.

Turning off the Z-buffer in Tomb Raider 2 helps render the game faster and with all the resources.

Performance

Finally, the proof that we have the slowest 3D accelerator of the whole bunch. Even if everything were rendered correctly, the framerates are awfully low. There are quirks to it as well. One of the most mysterious is the drop in performance should you update DirectX above version 3. The Mpact! lacks fillrate like no other. Without any feature enabled the chip can rise to one pixel per clock here and there, but turning on the Z-buffer, specular highlights, and of course texturing all causes significant hits to the framerate. Trying to play games on the first Mpact, you might even appreciate that quality slider earning you an extra 10% by skimping on perspective correction. But to get playable framerates, that quarter internal resolution is more the rule than the exception. Let it be known there is a "3D accelerator" that really struggles even at 320x240. Here are the average framerates; click on the image to see the minimums.
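For a sense of scale, here is the theoretical ceiling implied by the four clocks per 3D pixel quoted in the Architecture section. It assumes every pixel of the screen is drawn exactly once unless an overdraw factor is given, and it ignores geometry, memory bandwidth and driver overhead, so real games land far below these numbers.

```python
def fill_rate(clock_hz: int, clocks_per_pixel: int) -> float:
    """Pixels per second when each pixel costs a fixed number of clocks."""
    return clock_hz / clocks_per_pixel


def fps_ceiling(pixels_per_second: float, width: int, height: int,
                overdraw: float = 1.0) -> float:
    """Upper bound on framerate; overdraw is a hypothetical average
    number of times each screen pixel gets drawn per frame."""
    return pixels_per_second / (width * height * overdraw)


rate = fill_rate(62_500_000, 4)  # Mpact /3000 doing full 3D pixels
print(f"fill rate: {rate / 1e6:.3f} Mpixels/s")
for width, height in ((640, 480), (320, 240)):
    for overdraw in (1.0, 2.0, 3.0):
        fps = fps_ceiling(rate, width, height, overdraw)
        print(f"{width}x{height}, overdraw {overdraw}: at most {fps:.1f} fps")
```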

Outside of games that suffered huge image quality degradation, Homeworld is quite the outlier. The black background its renderer can default to makes the game very light fillrate-wise. The same goes for MDK, which relies heavily on 2D backgrounds. But it is exceptional for games not to fill the screen with large textured objects.

Summary

The data is finally in, and it is crystal clear that the first Mpact! is the slowest 3D accelerator of this generation. It is conceivable that this card could do simpler solid modeling in a small window rather nicely. But as an entertainment product, it needs to play full-screen games with textures. And this chip just does not pack the fillrate or proper memory management for that. Even the 20% clock boost of the /3600 would not change the overall picture. Chromatic had yet to prove themselves in 3D rendering, and in their second attempt they focused on that a lot more.
