Oli's old stuff

Tinkering with retro and electronics

Mar 24, 2023 - 8 minute read - retro electronics hardware pio minicube64 rp2040 raspberry pi pico

Making the Minicube64 on the Raspberry Pi Pico

The Minicube64 is a retro-inspired fantasy console created by MonstersGoBoom as a “joke” lowrez jam entry. It is now maintained by aeriform who is actively writing games for it.

The Minicube64 is powered by Mike Chambers’ fake6502 emulator running at 6.4MHz. It has 64KB of RAM, outputs a 64x64 pixel screen at 60Hz, and originally had a palette of 64 colours (the palette has since been extended to 256 colours). Games are loaded as ROM images, copied directly into the RAM and executed.

It emulates a 4 button controller with Start, A, B, C and the normal directional buttons mapped to the keyboard. Audio is handled by a NES APU emulation created by Matthew Conte.

What’s interesting about the Minicube64 is just how simple it is.

Not so fantasy?

One day I was wondering, would it be possible to move the Minicube64 from the “fantasy” console realm and make it as a standalone device that you can plug into a TV/monitor and play?

For this to work we’d need:

  • Video out via VGA, SCART or Composite Video
  • Controller input via a Mega Drive or SNES joypad
  • SD Card support for loading the ROM images
  • Audio output
  • A microcontroller to run the machine

For this experiment I will be using a Raspberry Pi Pico; I’ve grown to really like the device and the PIOs are really powerful for dealing with I/O.

Minicube64 Breakdown

Let’s break down how the Minicube64 actually works.

The main loop runs at 60fps; each frame we perform the following actions:

  • Sample the inputs, write the encoded byte to 6502 address $0102
  • Run 6400000/60 emulated 6502 cycles (~106,666 cycles)
  • Read the location of the screen from 6502 address $0100; this points to a 4K page that contains 64x64 palette indices for the current screen
  • Read the location of the palette from 6502 address $0101; this points to a 768 byte block of 256 palette entries of 3 bytes each (R, G, B)
  • Update the framebuffer by looping 64 rows by 64 columns, mapping the palette index to the RGB colour in the palette
  • Raise an interrupt on the 6502
  • Wait for vsync, and repeat
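
As a rough sketch, the steps above could be arranged like this in C. Every function here is a hypothetical stand-in rather than the Minicube64's real code, and the decoding of the values at $0100 and $0101 into addresses is deliberately left out because the details depend on the emulator source.

```c
#include <stdint.h>

enum { CPU_HZ = 6400000, FRAME_HZ = 60 };
enum { CYCLES_PER_FRAME = CPU_HZ / FRAME_HZ };  /* 106666; the remainder is dropped */
enum { REG_VIDEO = 0x0100, REG_PALETTE = 0x0101, REG_INPUT = 0x0102 };

static uint8_t memory[65536];   /* the 6502's 64KB address space */
static uint32_t cycles_run;     /* total cycles handed to the stub CPU */

/* Hypothetical stand-ins for the real subsystems. */
static uint8_t sample_inputs(void) { return 0; }                /* no buttons held */
static void cpu_run(uint32_t cycles) { cycles_run += cycles; }  /* would step the 6502 core */
static void cpu_irq(void) {}                                    /* would raise the 6502 interrupt */
static void wait_vsync(void) {}                                 /* would block until vsync */
static void draw_screen(uint8_t video_reg, uint8_t palette_reg)
{
    /* Would decode both registers and convert 64x64 indices to RGB. */
    (void)video_reg;
    (void)palette_reg;
}

/* One iteration of the main loop, in the order listed above. */
static void run_frame(void)
{
    memory[REG_INPUT] = sample_inputs();
    cpu_run(CYCLES_PER_FRAME);
    draw_screen(memory[REG_VIDEO], memory[REG_PALETTE]);
    cpu_irq();
    wait_vsync();
}
```

Calling run_frame() in a loop gives the 60fps structure; the integer division means each frame runs 106,666 cycles rather than the exact 106,666.67.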

So far, so simple, right?

Hacking on the Pico

The first thing to do when trying to get this to run on the Pico is to remove a lot of code.

To start with I removed:

  • Sokol, used for audio out, as it isn’t supported on the Pico
  • Minifb, used for framebuffer and input

I also commented out various other bits of code to do with audio, input and other things, as these won’t be used as-is and will require different solutions.

To have something for the 6502 emulation to run that would also generate a video signal, I converted boot.bin into a C include file using xxd. The boot.bin is the ROM that gets loaded if you don’t specify a game; it shows a nice animated logo and will serve as a smoke test for the system.
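
For illustration, running `xxd -i boot.bin` emits an array named boot_bin and a length named boot_bin_len. The sketch below shows that shape with placeholder bytes (not the real ROM contents) and a hypothetical load_rom helper; the load address is a parameter because the real one isn't stated here.

```c
#include <stdint.h>
#include <string.h>

/* Shape of the output of `xxd -i boot.bin`; these bytes are placeholders. */
unsigned char boot_bin[] = { 0xea, 0xea, 0xea };
unsigned int boot_bin_len = 3;

static uint8_t memory[65536];   /* the emulated 64KB of RAM */

/* Hypothetical helper: copy a ROM image into emulated RAM.
   Returns 0 on success, -1 if the image would run past the end of RAM. */
static int load_rom(uint16_t load_addr, const unsigned char *rom, unsigned int len)
{
    if ((uint32_t)load_addr + len > sizeof memory) {
        return -1;
    }
    memcpy(memory + load_addr, rom, len);
    return 0;
}
```

The bounds check matters once ROMs come from an SD card rather than from a header that is known to fit.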

In order to build all this I generated a CMakeLists.txt with the bits I needed, which included bringing in and configuring the Pico’s SDK.

Eventually I ended up with something that would compile for the Pico. Time to move onto the next step, video out.

Video Out

There are various solutions for outputting video from the Pico, everything from composite video to VGA and even DVI.

I decided to use VGA out as it should also be compatible with SCART RGB with csync (in theory).

Rather than figure it out myself or try to adapt the pico-extras scanvideo code, I decided to explore Miroslav Nemecek’s PicoQVGA as it seemed to be pretty much what I needed. QVGA stands for “quarter VGA” which is a 320x240 screen at 60Hz. This is more than enough for the Minicube and probably most retro systems in general.

The PicoQVGA library was really easy to integrate, but it does require a dedicated PIO and CPU core. It makes no assumptions about your colour format; it only needs to know if the pixel data is 8bpp, 16bpp or 32bpp and which pins you’re using for colour and HSYNC/VSYNC. The colour pins have to be a contiguous range, and so do the sync pins (although they can be a different range to the colour pins).

I decided to use an 8bpp colour system of RGB332 (3 bits red and green, 2 bits blue), consuming 10 GPIO in total.

The GPIOs were run through a rudimentary DAC made of resistors to create 3 analog colour signals in the range 0V-0.7V as needed by VGA.

    RED0    --|  2K  |---\
    RED1    --|  1K  |------ VGA_R
    RED2    --| 470R |---/

    GRN0    --|  2K  |---\
    GRN1    --|  1K  |------ VGA_G
    GRN2    --| 470R |---/

    BLU0    --|  1K  |------ VGA_B
    BLU1    --| 390R |---/

The lower resistor value for BLU1 means that when highest blue is set it contributes a little more in the DAC. This is to account for it being only a two bit value compared to the 3 bit values for red and green.
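
To sanity-check the resistor values, the output voltage of each colour line can be estimated by treating the resistors and the monitor input as a simple divider. The sketch below makes several assumptions that are not from the build itself: GPIO high is exactly 3.3V, GPIO low is 0V, the outputs are ideal, and the monitor terminates each colour line with 75 ohms. The function and array names are made up for this example.

```c
#include <stddef.h>

/* Assumptions: 3.3V GPIO high, 0V GPIO low, 75 ohm load in the monitor. */
#define GPIO_HIGH_VOLTS 3.3
#define VGA_LOAD_OHMS   75.0

/* Resistor per bit, least significant bit first, as in the diagram. */
static const double RED_GREEN_OHMS[3] = { 2000.0, 1000.0, 470.0 };
static const double BLUE_OHMS[2]      = { 1000.0, 390.0 };

/* Voltage on one colour line for a given bit pattern.
   Pins driven high source current through their resistor; pins driven low
   and the monitor's termination all pull the node towards ground. */
static double dac_volts(const double *resistors, size_t count, unsigned bits)
{
    double driven = 0.0;                  /* conductance of the pins that are high */
    double total  = 1.0 / VGA_LOAD_OHMS;  /* conductance of everything on the node */
    for (size_t i = 0; i < count; ++i) {
        double g = 1.0 / resistors[i];
        total += g;
        if (bits & (1u << i)) {
            driven += g;
        }
    }
    return GPIO_HIGH_VOLTS * driven / total;
}
```

Under those assumptions, dac_volts(RED_GREEN_OHMS, 3, 0x7) comes out at roughly 0.71V and dac_volts(BLUE_OHMS, 2, 0x3) at roughly 0.70V, so both full-scale values land close to the 0.7V that VGA expects.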

The next step in this process is to update the Minicube64’s rendering code to output to the framebuffer that is used by the PicoQVGA library. For this I decided to also scale up the screen by 3, so we’d need to write 9 QVGA pixels for every Minicube pixel. This involved updating the mfb_setpix code to output directly to my framebuffer. The scaling was already handled by the mfb_rect_fill code, I just needed to change the scale factor to 3 (the original uses 4).

The final change was to deal with the RGB332 encoding; the original Minicube uses RGB888 packed into a 32 bit integer (as is needed by Minifb). I needed to pack this into an 8 bit value.

For this I changed the MFB_RGB macro to use the following algorithm:

    r = r & 0b11100000
    g = (g & 0b11100000) >> 3
    b = (b & 0b11000000) >> 6
    rgb332 = r | g | b

Basically, take the most significant bits of each colour channel (3 for red and green, 2 for blue) and pack them into a byte.
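
Written as a C function, the packing might look like the sketch below. The name pack_rgb332 is made up for this example; in the port the same logic lives in a macro.

```c
#include <stdint.h>

/* Pack 8-bit R, G, B into one RGB332 byte: RRRGGGBB.
   Only the top 3 bits of red and green and the top 2 bits of blue survive. */
static inline uint8_t pack_rgb332(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((r & 0xE0) | ((g & 0xE0) >> 3) | ((b & 0xC0) >> 6));
}
```

For example, pure red (255, 0, 0) packs to 0xE0, pure green to 0x1C, pure blue to 0x03, and white to 0xFF.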

Testing VGA out

Time to upload to the Pico and try it out:

It worked! Well, mostly - the video lines were mixed up on the first attempt so the colours came out all wrong, but that was easily fixed. I’d simply mixed up the high and low bits in the RGB DAC, which meant that it had the wrong colour proportions.

The main thing to notice here is that the framerate is much, much slower than the original. I didn’t measure it, but it looks to be in the order of at least 10x slower.

Minicube Boot

The first optimization I applied was to remove the use of the mfb_rect_fill routine and inline the pixel output. I also decided to reduce the scale to 2px at this point, as 3px looked a little blocky on an actual screen.

    const uint8_t* vram = memory + vram_loc;
    for(int y = 0; y < 64; ++y) {
        // first of the two vga lines covered by this Minicube row
        uint8_t* fb = framebuffer + (((y*2)+yoffset)*320) + xoffset;
        for(int x = 0; x < 64; ++x) {
            int lookup = *(vram++)*3;
            uint8_t rgb332 = VGA_RGB(palette[lookup], palette[lookup+1], palette[lookup+2]);
            fb[0]   = rgb332;   // pixel (x0)
            fb[1]   = rgb332;   // pixel (x1)
            fb[320] = rgb332;   // pixel (x0), next vga line (+y)
            fb[321] = rgb332;   // pixel (x1), next vga line (+y)
            fb += 2;            // move on to the next 2x2 block
        }
    }

This resulted in a doubling of the framerate; it had a huge impact. But it was still slow.
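
The loop relies on xoffset and yoffset, whose values aren't given. One plausible choice, assumed here rather than taken from the port, is to centre the scaled image on the 320x240 QVGA screen. The helper name and constants are made up for this example.

```c
enum { QVGA_WIDTH = 320, QVGA_HEIGHT = 240, MINICUBE_SIZE = 64 };

/* Offset that centres `source` pixels, each drawn `scale` pixels wide,
   along a screen axis that is `screen` pixels long. */
static int centre_offset(int screen, int source, int scale)
{
    return (screen - source * scale) / 2;
}
```

At 2x scale this gives an xoffset of 96 and a yoffset of 56; at the earlier 3x scale it would be 64 and 24.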

Controller Input

I’ve posted about the Mega Drive Controller circuit and PIO code before, so it was largely a case of wiring it up. Go and read that post, and the first one, for how it all works.

The controller reading PIO runs continuously and sends the state of the controller to the PIO’s RX buffer as a 16 bit value containing the results of both of the SELECT line strobes. It’s our job to read that value and do something with it.

The 16 bits I get back are:

      SELECT HIGH          SELECT LOW

    7 6 5 4 3 2 1 0      7 6 5 4 3 2 1 0
    - - C B R L D U      - - S A - - - -

But the Minicube wants the bits in the form:

    7 6 5 4 3 2 1 0
    R L D U S C B A 

So we need to do some shifting about:

    enum mc_controller_buttons {
        MC_CONTROLLER_NONE = 0,
        MC_CONTROLLER_UP = (1 << 4),
        MC_CONTROLLER_DOWN = (1 << 5),
        MC_CONTROLLER_LEFT = (1 << 6),
        MC_CONTROLLER_RIGHT = (1 << 7),
        MC_CONTROLLER_B = (1 << 1),
        MC_CONTROLLER_C = (1 << 2),
        MC_CONTROLLER_A = (1 << 0),
        MC_CONTROLLER_START = (1 << 3),
    };

    // a = SELECT HIGH sample, b = SELECT LOW sample
    // each ?: needs its own parentheses, as | binds tighter than ?:
    uint8_t r = ((a & MD_CONTROLLER_BIT_UP) ? MC_CONTROLLER_UP : 0)
              | ((a & MD_CONTROLLER_BIT_DOWN) ? MC_CONTROLLER_DOWN : 0)
              | ((a & MD_CONTROLLER_BIT_LEFT) ? MC_CONTROLLER_LEFT : 0)
              | ((a & MD_CONTROLLER_BIT_RIGHT) ? MC_CONTROLLER_RIGHT : 0)
              | ((a & MD_CONTROLLER_BIT_C_ST) ? MC_CONTROLLER_C : 0)
              | ((a & MD_CONTROLLER_BIT_B_A) ? MC_CONTROLLER_B : 0)
              | ((b & MD_CONTROLLER_BIT_C_ST) ? MC_CONTROLLER_START : 0)
              | ((b & MD_CONTROLLER_BIT_B_A) ? MC_CONTROLLER_A : 0);
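
For reference, here is a self-contained sketch of the same remapping that can be compiled and tested on a desktop. The names are made up for this example, and it assumes two things that aren't stated above: a set bit means "pressed", and the SELECT HIGH sample sits in the upper byte of the 16 bit value.

```c
#include <stdint.h>

/* Bit positions within each 8-bit SELECT sample, per the table above. */
enum {
    MD_BIT_B_A  = 1 << 4,   /* B when SELECT is high, A when SELECT is low */
    MD_BIT_C_ST = 1 << 5    /* C when SELECT is high, Start when SELECT is low */
};

/* Minicube64 layout: R L D U S C B A (bit 7 down to bit 0). */
enum { MC_A = 1 << 0, MC_B = 1 << 1, MC_C = 1 << 2, MC_START = 1 << 3 };

static uint8_t md_to_minicube(uint16_t raw)
{
    uint8_t hi = (uint8_t)(raw >> 8);    /* assumed: SELECT HIGH sample */
    uint8_t lo = (uint8_t)(raw & 0xFF);  /* assumed: SELECT LOW sample */

    /* U, D, L, R sit in bits 0-3 of the source and bits 4-7 of the target,
       in the same order, so a single shift moves all four. */
    uint8_t out = (uint8_t)((hi & 0x0F) << 4);

    if (hi & MD_BIT_B_A)  out |= MC_B;
    if (hi & MD_BIT_C_ST) out |= MC_C;
    if (lo & MD_BIT_B_A)  out |= MC_A;
    if (lo & MD_BIT_C_ST) out |= MC_START;
    return out;
}
```

Shifting the four direction bits in one go avoids four separate tests, which is a small saving in a routine that runs every frame.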

Once that’s done we can poke the value into $0102:

    uint8_t md_pad = md_controller_read();
    write6502(IO_INPUT, md_pad);

Next Steps

We have video output and controller input, so the next steps are to get the system loading from an SD Card and then start looking at optimizations to get the framerate up.

I’ll talk about this in my next post.