Coding with Balls

June 10, 2016

Amiga Elkstravaganza

Filed under: Uncategorized — codingwithballs @ 08:14

Not too long ago we released “Party Elkstravaganza” – the invitation for Solskogen 2016. It’s a 36 kilobyte intro for Amiga 500 and it took home the 1st prize in the oldschool intro compo at Simulaatio. It’s also the first Spaceballs production in 14 years with music from my old partner in crime, Teis! (The previous one being a grossly underrated diskmag intro.)

The main effect was inspired by 2 animations from El Visio (a man of many great ideas!). I hadn’t yet thought of making a new Solskogen invitation so there wasn’t any real plan to it, but I really liked the patterns and decided to try replicating them in realtime on the A500. (Thinking that even if it failed the process and / or results might still be interesting.)

Patterns and movement

I quickly lost hope of instinctively “seeing” the logic behind the patterns and simply asking El Visio for the algorithms he used didn’t feel like enough of a challenge (with Amiga coding being a prestigious e-sport and everything).
Thus, I ended up staring at the clips for a long time, looking for pattern repetitions, and then just copied the motion without actually understanding it. The result of this was a (fairly inelegant and messy) piece of 68k asm code which generated 2D vertices and edges that matched El Visio’s animations.
Simply precalculating the vertex positions offline and including them as data was of course an option, but doing it at run-time is more fun, takes less space and (most importantly) makes it a lot easier to experiment with the effect while putting the actual production together.

Rendering

Throughout the intro we’re switching between different “scenes”, each one a 50 frames long loop with a specific pattern movement and triangle size.
The data for a full loop is generated before it is shown (while we’re displaying the previous scene) and it consists of 4-color (2 bitplanes) slices that are 320 pixels wide. There are 50 slices (one for each frame) and they’re typically between 30 and 100 pixels tall (the height of each slice depends on the triangle positions for the corresponding frame).

A single slice

When displaying the effect a single slice of data is repeated multiple times (and sometimes mirrored) using the Copper, in order to fill the entire 256 pixel height of the screen.
The reason for doing this rather than naively drawing lines and filling bitplanes for the whole screen with the Blitter is that the latter would be too slow to achieve 50 FPS. We are of course using the Blitter when pre-generating the slices – that processing doesn’t have any impact on the display framerate since it happens in parallel with displaying the previous scene.
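To make the repetition concrete, here is a minimal Python sketch of which slice row ends up on each screen line. It is not the intro's code (that is 68k asm driving the Copper), and the function name, the `mirror` flag and the rule that every second repeat is flipped are assumptions made for illustration. On the real hardware the same mapping would presumably come from Copper writes to bitplane pointers or modulos.

```python
SCREEN_HEIGHT = 256  # visible lines, as stated in the text


def slice_row_for_scanline(y, slice_height, mirror=False):
    """Return which row of the slice is shown on screen line y.

    Assumption: with mirroring enabled, every second repeat of the
    slice is drawn upside down.
    """
    repeat, row = divmod(y, slice_height)
    if mirror and repeat % 2 == 1:
        row = slice_height - 1 - row
    return row


# A 100 pixel tall slice, mirrored, covering the whole screen.
rows = [slice_row_for_scanline(y, 100, mirror=True) for y in range(SCREEN_HEIGHT)]
```

The point is that only `slice_height` rows of bitplane data ever need to exist per frame, no matter how tall the screen is.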

As for memory usage the different loops in the intro range from around 140k to 310k.
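Those numbers follow from the slice format. A rough back-of-the-envelope sketch in Python, assuming rows are stored tightly packed with no padding (the actual memory layout isn't described here) and using made-up uniform slice heights:

```python
SLICE_WIDTH_PIXELS = 320
BITPLANES = 2
FRAMES_PER_LOOP = 50


def slice_bytes(height):
    """Bytes for one slice: 2 bitplanes, 40 bytes (320 pixels) per row."""
    bytes_per_row = SLICE_WIDTH_PIXELS // 8
    return bytes_per_row * BITPLANES * height


def loop_bytes(heights):
    """Total bytes for one loop, given the height of each of its slices."""
    return sum(slice_bytes(h) for h in heights)


# Hypothetical loops where every slice has the same height.
small_loop = loop_bytes([35] * FRAMES_PER_LOOP)  # 140000 bytes
large_loop = loop_bytes([78] * FRAMES_PER_LOOP)  # 312000 bytes
```

So an average slice height of roughly 35 to 78 pixels lands in the 140k to 310k range.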

Colors (of which there are a lot)

We’re using a total of 4 bitplanes, but not all the 16 color registers that enables. There’s 2 bitplanes for the patterns (background + two triangle colors) and 2 bitplanes for the text overlays (background/transparent + outline and text fill colors).
All color values are in 12-bit RGB (which is the OCS/ECS hardware maximum) but there’s a lot of semi-controlled flickering to provide temporal antialiasing and make things look extra cool on 50 Hz CRT screens.

The colors for the overlay (or rather, the overlay blended on the patterns) are simply set once per frame. They’re flickered slightly every frame and also flashed in sync with the music (of course!).
The triangle patterns are colored using two Copper gradients. Each of these has 64 entries which means the colors can change every 4th scanline (for 256 lines display height). The gradients are also offset 2 scanlines every second frame in order to smooth them out (again: especially nice on CRTs).
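As a sketch of that scanline-to-gradient mapping (Python for readability, not the actual Copper list generator): the direction of the 2 scanline offset and the clamping at the last entry are assumptions.

```python
SCREEN_HEIGHT = 256
GRADIENT_ENTRIES = 64
LINES_PER_ENTRY = SCREEN_HEIGHT // GRADIENT_ENTRIES  # 4 scanlines per color


def gradient_index(y, frame):
    """Gradient entry used on scanline y in the given frame.

    Assumption: on odd frames the color changes happen 2 scanlines
    earlier, so over two frames the eye sees a change every 2 lines.
    """
    offset = 2 if frame % 2 == 1 else 0
    index = (y + offset) // LINES_PER_ENTRY
    return min(index, GRADIENT_ENTRIES - 1)  # clamp at the bottom of the screen
```

On scanline 2, for example, even frames still show entry 0 while odd frames already show entry 1, which is what blends the steps together on a CRT.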
The gradients are generated and modified in different ways throughout the intro by interpolating between different colors and brightness levels. As the basis for this I used a pre-generated table of colors containing 256 different gradients, each with 64 entries. This is actually a photo of neon tubes which has been cropped and post-processed in Photoshop. In raw form this takes up 32k but slightly reorganized and delta-encoded it crunched very well (keeping the file size well below the important 40k limit). In other words: no reason to bother with removing unused data or trying to generate something with code.
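Interpolating 12-bit colors boils down to blending each 4-bit channel separately. A minimal sketch, using floats for clarity where 68k code would more likely use fixed point; the function names are made up for this example:

```python
def lerp_rgb12(color_a, color_b, t):
    """Blend two 12-bit 0xRGB colors; t runs from 0.0 (a) to 1.0 (b)."""
    result = 0
    for shift in (8, 4, 0):  # red, green, blue nibbles
        a = (color_a >> shift) & 0xF
        b = (color_b >> shift) & 0xF
        channel = round(a + (b - a) * t)
        result |= channel << shift
    return result


def make_gradient(top, bottom, entries=64):
    """Build a gradient table fading from one color to another."""
    return [lerp_rgb12(top, bottom, i / (entries - 1)) for i in range(entries)]


gradient = make_gradient(0x000, 0xFFF)
```

With only 16 levels per channel a 64 entry gradient has visible steps, which is where the flickering and frame-to-frame offsets come in.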

The base color table - Obviously 12 bit RGB isn't all that hot on its own so we'll have to add a bit of Flicker & Flash!

The generated gradients applied to the effect. (The first three patterns use Copper mirroring and the gradients are swapped for each slice.)

Text plotter

The overlays are generated at runtime (as opposed to in DPaint or Photoshop). This was necessary both to keep the file size down and to have enough memory available for the triangle pattern data. The plotter uses a bitmap font (one size only) with 2 bitplanes (outline and fill). Each character is manually positioned using a bare-bones editor and all the text is stored as arrays of {CharacterIndex, Xpos, Ypos} which are then used to plot characters with the Blitter. The editor itself (not included in the intro) was also implemented in 68k assembler because I have nothing better to do with my time.
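The {CharacterIndex, Xpos, Ypos} layout can be sketched like this. It's a plain Python stand-in for the Blitter-based plotter: the pixel values (0 = transparent, 1 = outline, 2 = fill), the tiny 3x3 glyph and all names are hypothetical.

```python
from typing import NamedTuple

TRANSPARENT = 0  # hypothetical pixel values: 0 = transparent, 1 = outline, 2 = fill


class PlacedChar(NamedTuple):
    char_index: int  # which glyph in the font
    x: int           # left edge on screen, in pixels
    y: int           # top edge on screen, in pixels


def plot_text(bitmap, font, placed_chars):
    """Copy each glyph into the bitmap at its stored position."""
    for pc in placed_chars:
        for row, pixels in enumerate(font[pc.char_index]):
            for col, value in enumerate(pixels):
                if value != TRANSPARENT:
                    bitmap[pc.y + row][pc.x + col] = value


# One made-up 3x3 glyph: an outline ring around a single fill pixel.
font = [[[1, 1, 1],
         [1, 2, 1],
         [1, 1, 1]]]
bitmap = [[0] * 8 for _ in range(4)]
plot_text(bitmap, font, [PlacedChar(0, 1, 0), PlacedChar(0, 4, 1)])
```

Storing a position per character costs a few bytes each but allows the hand-tuned placement described above.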

The overlays are an example of what you can get away with through composition, coloring and motion. Some of the text screens look absolutely horrid on their own but (in my opinion) work out nicely in the final product.

Control flow and logistics

Sounds awfully enterprise’ish, but it’s just about “what should happen when”.

Only about half of the available RAM on a standard A500 is accessible to the custom HW chips (meaning that the other half can’t be used for things that are on-screen). This meant data for the next scene had to be generated on-the-fly and shuffled around in parallel with displaying the current scene.

While interrupts and pseudo-multi-threading aren’t exactly rocket surgery it was still a bit finicky to combine everything with the sync and progression of the intro.
There’s VBlank interrupt code which handles all color changes and updates to the on-screen 50 FPS effect (mainly by generating new Copper lists). It also initiates the jobs running in parallel with the effect, which are:

  • Generate new pattern and store it in “public memory”. A heavy job which can take anywhere from 5 to 20 seconds depending on the pattern.
  • Move new pattern into chip mem. This happens once the screen has faded to black, in order to hide glitches when the old data is overwritten.
  • Plot new text overlay. This starts after the interrupt code has switched off the previous overlay (as there was no memory for doublebuffering the overlays).

The mainloop for the non-interrupt code. Exciting stuff!

The interrupt code stealing cycles from the pattern generation each frame imposed some limitations on the visuals and ordering of the parts. If the next pattern already takes a lot of time to generate and we make it even slower by using complex color gradients in the current part then we’ll have to watch the same scene for a very long time. It could be argued that we don’t need many different patterns if we have nice color variations, but when I’ve made a decent piece of code I’d like to show it off a bit.

Size optimization

As there was never any real risk of exceeding the 64k limit of the compo at Simulaatio I didn’t bother much with size optimizations. However, when I saw how small it got when crunched with CrunchMania (50-something kilobytes) I figured we should cater to the real oldschool connoisseurs by getting it below 40k.
In the end it turned out to be very easy: Delta-encoding the color table (in addition to the high-precision sine table) obviously improved crunch rate quite a bit. Blueberry’s awesome Shrinkler also gave much better results than CrunchMania (as expected). And finally: Teis delivered a final version of the soundtrack that was both cooler and smaller than the previous one.
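For reference, byte-wise delta-encoding looks roughly like this. It's a generic sketch: how the color table was reorganized before encoding isn't described here, so the example simply uses a made-up ramp.

```python
def delta_encode(data: bytes) -> bytes:
    """Store each byte as the difference from the previous one (mod 256)."""
    out = bytearray()
    previous = 0
    for value in data:
        out.append((value - previous) & 0xFF)
        previous = value
    return bytes(out)


def delta_decode(deltas: bytes) -> bytes:
    """Rebuild the original bytes by summing up the differences."""
    out = bytearray()
    value = 0
    for delta in deltas:
        value = (value + delta) & 0xFF
        out.append(value)
    return bytes(out)


# A smooth ramp turns into a long run of identical bytes.
ramp = bytes(range(64))
encoded = delta_encode(ramp)  # one 0 followed by sixty-three 1s
```

Smoothly changing data like gradients and sine tables becomes long runs of small, repeated values, which is exactly what a cruncher likes. The decoder is just a running sum, so it costs next to nothing in the intro itself.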

Stuff that didn’t happen

As always there were some ideas and potential optimizations that never made it.

  • Keeping 2 smaller patterns in chip memory at the same time and switching seamlessly between them. This could’ve provided a bit more variation and most of the code was actually in place.
  • Playing more with the overlay and background colors, for instance by using gradients for the text fill.
  • Optimizing the pattern generation enough that each frame slice could be rendered in real-time. This would have freed up lots of memory and allowed much more variation in the effect patterns. I’m not sure it would have worked, but precalculating the Blitter config data for all the lines and / or doing CPU line drawing in parallel with the Blitter might be worthwhile.

Final thoughts

Doing a short one-trick-pony intro (as opposed to a larger demo) was a nice and fairly smooth experience (even the last 30h crunch before the deadline wasn’t too bad). I’ll definitely do more projects like this in order to play with new ideas (I tend to be a lot more motivated when there’s an actual release target in front of me).

And of course: you should all come to Solskogen. It’s a great party with a really unique atmosphere and a very nice rural location not too far from Oslo. 🙂
