Amstrad CPC Game Prototype

Recently, I’ve become somewhat intrigued by the Amstrad CPC.

It’s a machine I never owned, and my only exposure to it came when I was about 7 or 8: the only other kid I knew who had any interest in computers showed me a few games on his green screen 464 before remarking that my Spectrum was more colourful.

From a technical point of view, the Amstrad CPC 464 is pretty remarkable. 64K of RAM as standard with two 16K ROMs built in - and you can page the ROMs out to get access to all the RAM underneath. Three screen resolutions and 27 colours to choose from! It had a “proper” keyboard, built-in RGB out, tape drive, joystick ports, an AY-3 sound chip as standard and more.

An interesting thing about the machine is that it was built from mostly common, off-the-shelf components - with the “Gate Array” chip being the major exception. The display driver chip was the same one used in the BBC Micro and the CPU the same as the Spectrum’s. The CPC is built like a tank, too. Even the mistreated machines usually boot up without much of a problem - unlike the Spectrum, which more than likely needs bits replacing even when stored well.

It was also very big. Like, seriously big.

One thing that I always found strange was that on paper, the machine should have “won” the UK market over the cheaper Spectrum. Yet the majority of games were rushed ports of Spectrum games, often running pretty slowly and without sound. I never understood why, so I started coding for it to see what I could learn.

Making a game prototype

I had a week off work so I figured that I’d start a new mini project to explore the system a bit more. Something less complex than SMEG and something that I would not normally do.

I always enjoyed playing Team 17’s Alien Breed on the Amiga, so I figured “let’s see what it takes to make a top down shooter like Alien Breed”.

The first choice you’re faced with when making a CPC game is which screen mode to run it in. Mode 1 gives a nice 320x200 screen with square pixels, but only 4 colours (although you can get more if you are good with timings and use palette swapping). Mode 0 gives you access to 16 colours but drops your screen resolution down to 160x200 - the pixels have a 2:1 aspect ratio. The big takeaway here is that both screen modes have exactly the same memory footprint; they just differ in how many pixels are packed within a byte (and how that packing is done). Regardless of the resolution, the memory footprint for the screen is 80 bytes x 200 lines - or 16,000 bytes. A touch shy of 16K. As a Spectrum person, that’s an eye-watering amount of memory to use.
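To make the packing a bit more concrete, here’s a rough Python sketch (a model on a modern machine, not CPC code) of how two Mode 0 pixels share a byte. The bit layout is the commonly documented Gate Array arrangement, and the helper name `pack_mode0` is made up for illustration.

```python
# Sketch: pack two Mode 0 pens (0-15) into one screen byte.
# Bit layout assumed from the commonly documented Gate Array scheme:
#   byte bit:   7   6   5   4   3   2   1   0
#   pen bit:   L0  R0  L2  R2  L1  R1  L3  R3   (L = left pixel, R = right pixel)

def pack_mode0(left: int, right: int) -> int:
    """Return the screen byte holding pens `left` and `right` (hypothetical helper)."""
    if not (0 <= left <= 15 and 0 <= right <= 15):
        raise ValueError("Mode 0 pens are 0-15")
    byte = 0
    # (pen bit, byte bit used by the left pixel); the right pixel sits one bit lower.
    for pen_bit, byte_bit in ((0, 7), (2, 5), (1, 3), (3, 1)):
        byte |= ((left >> pen_bit) & 1) << byte_bit
        byte |= ((right >> pen_bit) & 1) << (byte_bit - 1)
    return byte

# The footprint is the same in every mode: 80 bytes per line, 200 lines.
BYTES_PER_LINE = 80
LINES = 200
SCREEN_BYTES = BYTES_PER_LINE * LINES

print(hex(pack_mode0(1, 1)))  # both pixels set to pen 1
print(SCREEN_BYTES)
```

The interleaved bits are why you can’t just shift a nibble to move a pixel; in practice the packing is usually done ahead of time when converting graphics.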

Knowing this, and wanting to make a “proper” CPC game, I opted for Mode 0 in all its 16 colour, 2:1 pixel aspect glory.

After kicking out the two ROMs and losing 16K to the screen, you’re still left with 48K - should be good enough for a game, right?

The Spectrum’s screen RAM was relatively efficient to shift around, even with the weird line format. Being a monochrome bitmap display, a single 8x8 character cell fit into 8 bytes of RAM. One byte per pixel line, and 8 lines. This meant that drawing an unmasked 16x16 software sprite needed to shift 32 bytes from one location to another. The CPC’s 16 colour Mode 0 has a 2 pixels per byte encoding, which means that for the same 16x16 sprite you’re moving 128 bytes around. In reality you’d probably make that an 8x16 sprite if you wanted to try to keep the original square aspect, but at 64 bytes it’s still 2x more than the Spectrum.
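The sprite sizes above fall out of a one-line calculation. A quick sketch in Python (the function name is just for illustration, and Spectrum colour attributes are ignored since the comparison is about bitmap data only):

```python
def sprite_bytes(width_px: int, height_px: int, pixels_per_byte: int) -> int:
    """Bytes of bitmap data for an unmasked sprite (hypothetical helper)."""
    bytes_per_row = -(-width_px // pixels_per_byte)  # ceiling division
    return bytes_per_row * height_px

spectrum_16x16 = sprite_bytes(16, 16, 8)  # monochrome, 8 pixels per byte
cpc_16x16 = sprite_bytes(16, 16, 2)       # Mode 0, 2 pixels per byte
cpc_8x16 = sprite_bytes(8, 16, 2)         # half the width to keep a square look

print(spectrum_16x16, cpc_16x16, cpc_8x16)
```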

On top of this extra memory use, the CPC’s screen address format is even weirder than the Spectrum’s - each pixel row is $0800 bytes away from the last, but each char row is only $0050 bytes apart. In practical terms you end up having to do a little more bookkeeping and can’t traverse the horizontal pixels with a straight inc l, instead needing to use inc hl or handle the 256-byte page change accordingly.
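Here’s a small Python model of that address layout, assuming a 16K screen based at $C000 and the $0800/$0050 strides described above. The function names are invented for the example; on the real machine this would be a handful of Z80 instructions.

```python
SCREEN_BASE = 0xC000  # assumption: screen lives in the top 16K bank

def screen_addr(x_byte: int, y: int, base: int = SCREEN_BASE) -> int:
    """Address of byte column `x_byte` (0-79) on pixel line `y` (0-199)."""
    char_row, pixel_row = divmod(y, 8)
    return base + char_row * 0x50 + pixel_row * 0x800 + x_byte

def next_line(addr: int, base: int = SCREEN_BASE) -> int:
    """Address of the byte directly below `addr`."""
    addr += 0x800                      # next pixel row in the same char row
    if addr >= base + 0x4000:          # fell off the 8th pixel row...
        addr += 0x50 - 0x4000          # ...so wrap back to the next char row
    return addr

def row_crosses_page(y: int, base: int = SCREEN_BASE) -> bool:
    """True if the 80 bytes of line `y` straddle a 256-byte page (so `inc l` alone is not enough)."""
    start = screen_addr(0, y, base)
    return (start & 0xFF) + 79 > 0xFF

print(hex(screen_addr(0, 1)), hex(screen_addr(0, 8)), row_crosses_page(24))
```

The `next_line` step is the bookkeeping in question: most of the time it’s a simple add, but every eighth line needs the wrap-around correction.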

The thing is, this stuff mounts up. There’s more data to shift and shifting it is a little bit slower/more awkward than the Spectrum.

Redrawing

To avoid flickering or tearing you need to get pretty good at understanding how the CPC’s screen timing works. Luckily for us, there are 6 interrupts per frame (the first 2 lines after the vsync, then 52 lines apart) and there’s a vsync signal that we can read from the hardware to know when the frame starts. All of this lets us time our draws to stay behind the raster beam, as long as we can get them all done in the frame.
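As a rough picture of where those interrupts land, here’s a tiny Python sketch. It assumes a standard 312-line 50Hz frame and the commonly documented figure of 52 scanlines between Gate Array interrupts; both constants are assumptions of the example rather than something measured here.

```python
LINES_PER_FRAME = 312     # assumption: standard 50Hz frame
INTERRUPT_SPACING = 52    # assumption: Gate Array interrupt every 52 scanlines
FIRST_AFTER_VSYNC = 2     # first interrupt lands 2 lines after vsync starts

INTERRUPTS_PER_FRAME = LINES_PER_FRAME // INTERRUPT_SPACING

# Scanline (counted from the start of vsync) at which each interrupt fires.
interrupt_lines = [
    FIRST_AFTER_VSYNC + i * INTERRUPT_SPACING
    for i in range(INTERRUPTS_PER_FRAME)
]

print(INTERRUPTS_PER_FRAME, interrupt_lines)
```

Counting interrupts after the vsync gives a cheap way to know roughly where the beam is without any cycle-exact timing.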

I’ve found that (for me, anyway) this leads to a few techniques. The first would be to try and squeeze it all into a frame; in my prototyping I was able to get a stable 50Hz framerate by starting my drawing at interrupt 4. The challenge was that it really doesn’t leave much room for drawing a lot of things. I found that the flicker did creep in if I had to redraw a lot of tiles or move a lot of sprites.

This led me to experiment with back buffers: can I render offscreen and copy it (quickly) to the screen? You’d do all the complex masking calcs into a smaller linear buffer and then copy the bits you needed onto the main screen. The pain that creeps in here is that you’re really drawing twice in a frame, albeit the first time without the pain of the screen RAM layout.

Minimizing the amount of stuff you need to redraw in a frame seems to be the key to a stable framerate here.

Double Buffering

I tried an approach where I redrew the whole screen into a back buffer and then copied it over in the next frame (25Hz refresh). I couldn’t get a level of performance that was anywhere close to reasonable; shifting around that much data in a single frame is a big ask. So can the machine help us out?

A nice feature of the CPC is that it supports hardware double buffering (and scrolling) through use of the CRTC chip. This is the chip that basically sets the address lines for the Gate Array to read when performing its screen refresh. There’s a lot of really cool stuff that CPC experts can pull off with the CRTC chip, much of which is beyond my current level of understanding and experience. But for a n00b like me, the hardware double buffering is simple enough to grasp quickly. You tell the CRTC which address to use as the base for its screen and it’ll trot out a sequence of address lines to the Gate Array that end up on the monitor. This means that one frame you can have the screen at $8000, and the next frame you can have it at $C000.
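For the curious, here’s a hedged Python sketch of the values involved in a flip. It assumes the commonly documented CPC use of CRTC registers 12 and 13 (bits 5-4 of R12 pick the 16K bank, the low 10 bits across R12/R13 are a word offset) and a 16K screen size; the function name is hypothetical, and the actual port writes on the CPC are not shown.

```python
from typing import Tuple

def crtc_screen_start(base_addr: int, byte_offset: int = 0) -> Tuple[int, int]:
    """Return (R12, R13) values for a 16K screen at `base_addr` (hypothetical helper).

    Assumes the commonly documented layout: R12 bits 5-4 select the 16K bank,
    R12 bits 1-0 plus R13 hold the start offset in 2-byte words.
    """
    if base_addr % 0x4000:
        raise ValueError("screen base must sit on a 16K boundary")
    bank = (base_addr >> 14) & 0x03
    word_offset = (byte_offset // 2) & 0x3FF
    r12 = (bank << 4) | (word_offset >> 8)
    r13 = word_offset & 0xFF
    return r12, r13

# Two screens; draw into one while the CRTC displays the other.
BUFFERS = (0x8000, 0xC000)

def flip(visible_index: int) -> int:
    """Swap which buffer is shown; the other becomes the back buffer."""
    return visible_index ^ 1

visible = 0
visible = flip(visible)
print(hex(BUFFERS[visible]), crtc_screen_start(BUFFERS[visible]))
```

The flip itself is just a couple of register writes, which is what makes it so much cheaper than copying 16,000 bytes.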

The big kick in the teeth here is that you’re now running two screens at 16,000 bytes each. That’s half your RAM gone to the display. Ouch. Our 64K machine is now a 32K machine.

I’m currently playing with a solution that uses the CRTC double buffering combined with dirty tile tracking. This allows me to redraw only the parts of the screen I need to (i.e. shift less data around) and still have a large chunk of the frame available.
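One way to combine the two ideas, sketched in Python rather than Z80: with two screens, a tile that changes has to be redrawn once in each buffer, so each tile gets a small countdown instead of a single dirty flag. This is an illustrative approach with made-up names and grid size, not necessarily how the prototype does it.

```python
class DirtyTiles:
    """Per-tile redraw countdown for a double-buffered screen (illustrative sketch)."""

    def __init__(self, cols: int, rows: int, buffers: int = 2):
        self.cols = cols
        self.rows = rows
        self.buffers = buffers
        # Number of upcoming frames in which each tile still needs redrawing.
        self.pending = [[0] * cols for _ in range(rows)]

    def mark(self, col: int, row: int) -> None:
        """Flag a tile as changed; it must be redrawn once per buffer."""
        self.pending[row][col] = self.buffers

    def take(self) -> list:
        """Return the tiles to redraw in the current back buffer and count them down."""
        todo = []
        for row in range(self.rows):
            for col in range(self.cols):
                if self.pending[row][col] > 0:
                    todo.append((col, row))
                    self.pending[row][col] -= 1
        return todo

tiles = DirtyTiles(cols=20, rows=12)   # hypothetical tile grid
tiles.mark(3, 4)                       # e.g. a sprite moved off this tile
frame1 = tiles.take()                  # redraw into buffer A
frame2 = tiles.take()                  # redraw into buffer B after the flip
frame3 = tiles.take()                  # nothing left to do
print(frame1, frame2, frame3)
```

The countdown costs a byte per tile, but it stops stale tiles from reappearing when the buffers swap.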

Started adding wall collision in - got to tweak those hitboxes a little pic.twitter.com/k47Oh6a2DC
— @evolutional (@evolutional) November 29, 2021

In the example here, the frame times are:

Red/White: Tile redraw
Green: Player redraw
Magenta: Aliens redraw
Yellow: Wait for swap (arbitrarily interrupt 4)
Black: Wait for vsync

I still have more than half the frame for more stuff and it’s running at 50Hz. Gives me plenty of options, including falling back to 25Hz if I need to.

Thoughts so far

The result of the week playing around is promising. I’ve learned a fair bit about pushing pixels around on the CPC; importantly I’ve learned that there’s a lot I don’t know (there’s still a lot to learn to get good). When you see what experienced CPC devs can achieve, you realise how far this machine can be pushed.

Something that took me by surprise was that I’m already worried about memory - even with nothing happening on screen. In the 32K I have left I still need to store the masked sprite data, tile data and room data. There’s code to write, and maybe even sound effects to figure out. The graphics data adds up fast, so compression is likely going to be needed (or more creativity around the graphics).

A hidden benefit of the Mode 0 screen is that at 2 pixels per byte I can get reasonably smooth player movement at 2px steps without the need to use pre-shifted sprites. On the Spectrum you’d be needing to handle this shift to fit the character cell-alignment and avoid any jerky movement - doing this stuff can eat up memory though!
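To put a rough number on the pre-shifting cost, here’s a back-of-the-envelope Python sketch. It assumes one stored copy per shift position, each one byte wider than the original to hold the pixels pushed across the byte boundary, with no masks; real games vary, so treat the figures as illustrative.

```python
def preshifted_bytes(width_px: int, height_px: int, step_px: int) -> int:
    """Bitmap bytes for pre-shifted copies of a 1bpp sprite (illustrative, no masks)."""
    shifts = 8 // step_px               # distinct positions within a byte
    bytes_per_row = width_px // 8 + 1   # one spare byte for the shifted-out pixels
    return shifts * bytes_per_row * height_px

spectrum_1px = preshifted_bytes(16, 16, 1)  # smooth 1px movement
spectrum_2px = preshifted_bytes(16, 16, 2)  # 2px steps
cpc_mode0 = (16 // 2) * 16                  # one byte-aligned copy, 2px per byte

print(spectrum_1px, spectrum_2px, cpc_mode0)
```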

I can see why the CPC suffered as a games machine in the 1980s. The “easy” route of porting a Speccy game over didn’t take advantage of the CPC’s enhanced colour palettes but at the same time suffered from having more data to move around in the same time window. The CPC provides some nice tools in the hardware but using them can result in a hefty memory penalty, forcing you to fall back to multi-load or even removal of content to fit into RAM.

Anyway - I had fun exploring this. I’m contemplating taking this exploration and doing a little more with it in the same spirit of “do stuff quickly”.
