There’s a lovely device called a pistorm, an adapter board that glues a Raspberry Pi GPIO bus to a Motorola 68000 bus. The intended use case is that you plug it into a 68000 device and then run an emulator that reads instructions from hardware (ROM or RAM) and emulates them. You’re still limited by the ~7MHz bus that the hardware is running at, but you can run the instructions as fast as you want.
These days you’re supposed to run a custom built OS on the Pi that just does 68000 emulation, but initially it ran Linux on the Pi and a userland 68000 emulator process. And, well, that got me thinking. The emulator takes 68000 instructions, emulates them, and then talks to the hardware to implement the effects of those instructions. What if we, well, just don’t? What if we just run all of our code in Linux on an ARM core and then talk to the Amiga hardware?
↫ Matthew Garrett
This is so cursed. I love it.
I’ve seen a lot of low-level graphics programming on CGA/EGA/VGA/etc., but that seems weird. Every platform has its eccentricities, haha.
Yeah, bare metal software always has to deal with these kinds of quirks 🙂 Many of us are probably familiar with the palette glitches (when registers get written while the video system is rendering them) on IBMs.
This would be insane… but given that the RPi may be fast enough to do this, in theory you could use a precise phase-locked loop to update the registers at the right moment, just before they are needed. That would let you obtain higher color density than the hardware can otherwise render in a static frame. To set this up, you use completely static video memory and then bit-bang the actual colors by updating the palette registers for blocks of pixels in real time. I don’t know that the bus is fast enough, but could it work on the Amiga hardware with perfect timing? You wouldn’t need any bandwidth for updating screen pixels, as they would never change, say a repeating sequence of 0x0 – 0xf. You might choose to write the high palette values while the video system is outputting the low palette values (and vice versa). By the time the video system gets to the high values, the RPi can write the low values.
If you could get this working reliably, then you could emulate a true color graphics buffer on the Linux side of things and render arbitrary software on the Amiga, haha. You wouldn’t be limited to Doom. Minecraft, Cyberpunk, whatever.
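Roughly what I have in mind, as a sketch in C (the VHPOSR/COLOR00 offsets are the documented Amiga chip register ones, but map_chip_regs() and the timing loop are pure hand-waving on my part, not anything the PiStorm hack actually exposes):

```c
/* Hypothetical sketch of "racing the beam" palette updates from the Linux
 * side. map_chip_regs() is an invented stand-in for however the chip
 * register space ends up mapped into the process; only the register
 * offsets (VHPOSR, COLOR00..COLOR15) are real Amiga ones. */
#include <stdint.h>

#define VHPOSR   0x006   /* beam position: V7-V0 in bits 15-8, H8-H1 in bits 7-0 */
#define COLOR00  0x180   /* first of 32 palette registers, 2 bytes apart */

extern volatile uint16_t *map_chip_regs(void);   /* hypothetical mapping helper */

static inline uint16_t reg_read(volatile uint16_t *chip, unsigned off)
{
    return chip[off / 2];
}

static inline void reg_write(volatile uint16_t *chip, unsigned off, uint16_t val)
{
    chip[off / 2] = val;
}

/* Spin until the beam reaches a given scanline (only the low 8 bits of the
 * line number are visible here, which is fine for a sketch). */
static void wait_for_line(volatile uint16_t *chip, unsigned line)
{
    while (((reg_read(chip, VHPOSR) >> 8) & 0xff) != (line & 0xff))
        ;   /* busy-wait; in practice you'd pin this to an isolated core */
}

/* Screen pixels stay a static 0x0..0xf pattern; all the "drawing" is done
 * by rewriting the 16 low palette registers once per scanline. */
void race_the_beam(const uint16_t palette[][16], unsigned lines)
{
    volatile uint16_t *chip = map_chip_regs();

    for (unsigned line = 0; line < lines; line++) {
        wait_for_line(chip, line);
        for (unsigned i = 0; i < 16; i++)
            reg_write(chip, COLOR00 + 2 * i, palette[line][i]);
    }
}
```

Whether the chip bus and the Pi’s jitter would actually let you hit every scanline reliably is the open question.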
I have used a Linux SBC for real-time servo control (I believe it was an Orange Pi). While the standard Linux kernel is almost useless for scheduling real-time IO, I did succeed at building my own purely userspace scheduler on top of a standard kernel, using a realtime thread that never yielded the CPU(*). While a CPU’s gigahertz frequencies are more than enough for accurate real-time IO, if you don’t force the CPU to ramp up, either programmatically or heuristically, you can end up with erratic timing glitches.
(*) This is a simplification of my implementation: I actually did yield the CPU back to the OS, but scheduled events early enough that 1) the real-time window would not be missed, and 2) the CPU had time to ramp up its frequency before a real-time IO event.
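For anyone curious, the skeleton of that approach looks something like this (the priority and the 200us lead time are made-up numbers; the real values depend on the hardware):

```c
/* Sketch of the "wake early, then spin" trick described above: sleep
 * (yielding the CPU) until shortly before the deadline, then busy-wait the
 * last stretch so the core is already at full clock when the IO happens. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <sys/mman.h>
#include <time.h>
#include <stdint.h>

#define WAKE_EARLY_NS 200000ull   /* illustrative 200us lead time */

static void make_realtime(void)
{
    struct sched_param sp = { .sched_priority = 80 };

    mlockall(MCL_CURRENT | MCL_FUTURE);                  /* no page faults mid-window */
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
}

static inline uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

static void wait_until(uint64_t deadline_ns)
{
    uint64_t wake_ns = deadline_ns - WAKE_EARLY_NS;
    struct timespec ts = {
        .tv_sec  = wake_ns / 1000000000ull,
        .tv_nsec = wake_ns % 1000000000ull,
    };

    /* Yield to the OS until just before the deadline... */
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);

    /* ...then spin for the last stretch; the spinning itself gives the
     * frequency governor time to ramp the core up before the event. */
    while (now_ns() < deadline_ns)
        ;
}
```

The real-time loop then calls make_realtime() once and wait_until() for each IO deadline, right before touching the GPIO.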
Alfman,
https://www.youtube.com/watch?v=wsADJa-23Sg
“Doom didn’t kill the Amiga…Wolfenstein 3D did”
Their weird choice of graphics hardware was what DOOMed the Amiga to extinction. People have tried very hard to run even Wolf 3D on that hardware and failed. Maybe adding a modern CPU might help, but it will still be limited by the bus.
I would expect even an SPI-based display to work better than the Amiga’s, which is a major shame.
(Apparently Doom can even run on the RP2040, the Raspberry Pi Pico, an Arduino competitor.)
https://github.com/rsheldiii/rp2040-doom-LCD
GRIND comes to the rescue
A Wolfenstein/Doom-like game that runs on a plain 7MHz Amiga:
https://www.youtube.com/watch?v=X-SAnj6E9vY
Yes,
GRIND is mentioned in the video I shared:
https://youtu.be/wsADJa-23Sg?t=864
But it came too late. The Amiga was already long dead by then.
cybergorf,
Just out of curiosity, I tried to find out how they achieved this.
After all, updating textures pixel by pixel in real time is pretty much impossible at that speed, and doing it one vertical slice at a time isn’t feasible either.
This is what Google’s AI (Gemini) thinks is happening, and it makes sense. They would be pre-calculating (almost all, or at least plenty of) the 16-pixel texture combinations. Since they already have a limited number of textures, it is possible, but it would require an estimated 1MB of RAM. (The base Amiga has 512KB of RAM, which is one of the reasons this requires extra RAM; alternatively, they could ship a lower-res version for base hardware.)
This makes sense, because a lot of programming is making trade-offs between storage and real-time calculation. They might have solved the real-time speed issue with ahead-of-time texture stitching.
(Not all texture combinations are possible; that is the saving grace. The 2D nature of the map and the raycasting means only certain combinations need to be precalculated.)
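To make that trade-off concrete, here is a sketch of the general idea; this is not based on GRIND’s actual source, just the standard argument that chunky-to-planar conversion is the expensive per-pixel step you want out of the inner loop:

```c
/* Illustrative only: precompute the planar form of 16-pixel texture spans
 * at load time, so the per-frame work is copying words rather than doing
 * per-pixel bit twiddling. Names and sizes are invented. */
#include <stdint.h>

#define BITPLANES 4   /* 16-colour screen */

/* Convert 16 chunky pixels (one 4-bit colour index each) into one 16-bit
 * word per bitplane: the slow step you run ahead of time, once per span
 * that the limited set of textures and wall heights can actually produce. */
static void pack_planar_16(const uint8_t chunky[16], uint16_t planes[BITPLANES])
{
    for (int p = 0; p < BITPLANES; p++) {
        uint16_t word = 0;
        for (int i = 0; i < 16; i++)
            word |= (uint16_t)((chunky[i] >> p) & 1) << (15 - i);
        planes[p] = word;
    }
}

/* At render time a precomputed entry (looked up by texture/column/scale or
 * whatever key the engine uses) is copied straight into the bitplanes:
 * a few word writes instead of per-pixel math. */
static void blit_precomputed(uint16_t *bitplane_rows[BITPLANES],
                             const uint16_t precomputed[BITPLANES],
                             unsigned word_offset)
{
    for (int p = 0; p < BITPLANES; p++)
        bitplane_rows[p][word_offset] = precomputed[p];
}
```

The table of precomputed spans is where the estimated extra ~1MB would go.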
The Amiga is a classic example of a system attempting to be the computer of tomorrow, constrained by the limits of today. There have been quite a few of them in history, maybe the most notable being the Apple Lisa and, more recently, the PS3.
It’s great being forward-thinking, but being revolutionary rather than evolutionary is a tall order. Apple, for example, probably didn’t foresee the Macintosh outliving the 24-bit memory address space. The designers of the PS3 were expecting a world of pervasive multithreading (like Be Inc), which didn’t quite materialise. The Amiga, again, couldn’t foresee the massive drop in memory prices, which made their custom graphics chipset wizardry obsolete.
The123king,
Interesting view. This seems much less relatable today; we throw so much excess hardware at problems that don’t deserve it, and it hardly bothers anyone. Some of us want things to be efficient out of principle, hearkening back to older times, but we’re the exception.
Memory wasn’t the only bottleneck. Graphics cards had more memory than the CPU could directly access, and the CPU’s limited address window into video memory was the reason behind planar video modes. I hated those modes, although luckily I never really needed to deal with them because they were before my time.
I didn’t experience the Amiga; I learned on IBM PCs, and I had access to VESA VGA’s larger address spaces, which made all those hacks completely unnecessary. In conjunction with a DOS extender, it was actually quite pleasant to do graphics in DOS by that point.
Alfman,
Do not worry, we always find new workloads that make today’s hardware completely obsolete. AI models, for example, require massive amounts of VRAM, making even the top-end Nvidia server GPUs or Mac Studio desktops struggle just to load them into memory.
The123king,
I would argue the opposite. They designed the PS3 for a single purpose: to be the mainstream Blu-ray player in every home. The fact that it could play games was a secondary concern.
This won them the format war, but almost lost them the console war (they can thank Don Mattrick and Kinect for that not happening).
Even though Cell was sold as a “supercomputer” (even I bought into that concept for a while), it was extremely limited, and a design for older times.
Though things are cyclic…
Today we once again have custom designs: TPUs for accelerating certain machine learning models; video encoding and decoding engines, crypto accelerators, ray tracing modules and so on, on CPUs and GPUs; limited-purpose cores to save power.
But the ones who stay relevant are those that can survive these ebbs and flows. Specialized -> Generic -> Specialized -> Generic -> Specialized -> … This has been the case since the dawn of computing.
sukru,
Incidentally, just today I got my hands on gpt-oss-20b. I don’t have enough RAM to run the gpt-oss-120b variant; I am running a 16C/32T CPU with 64GB of RAM. Some processing seems to be offloaded to the GPU, but I don’t have enough VRAM for the whole model, so it’s CPU-bound. Even without a hardware upgrade, I am seeing generational improvements. I had been using llama3.1-70b, but this newer gpt-oss-20b absolutely flies in comparison and scores better to boot. For an open model that runs locally, this may be the best choice right now.
It also has a “thinking” mode, which is very interesting to analyze. With the thinking output enabled, it becomes much clearer how hallucinations come into play. Normal output had been notoriously over-confident. However, the thinking reveals something new and unexpected (to me): the LLM literally debates itself over falsehoods and gets stuck debating those falsehoods for a long time. These conflicts weren’t really apparent on the surface. It’s as if the LLM picked one argument and then squelched the competing arguments, the confident shell not revealing the internal conflict. Hallucinations are bad, but the fact that the LLM is in conflict under the hood gives me hope that future models may be better able to adjust output confidence appropriately. I also noticed that if you ask the LLM about its internal thought process, it doesn’t seem conscious of it. We were never meant to interact with it in this way, but it is interesting to study and observe.