Coding with Balls

May 19, 2018

Goa Gubbar – Datastorm invitation

Filed under: Uncategorized — codingwithballs @ 10:02

Time for another Amiga 500 write-up! The invitation for this summer’s Datastorm went out not too long ago:

There are several reasons why I like to do party invitations. In addition to the obvious “support an event you like”-aspect they tend to be good vehicles for one-trick ponies where you let an effect linger while you try to convey some more-or-less relevant party info.

This one was a fairly quick project that went from idea to release within a week. I’d been thinking about different effects and visual styles for it for a while, but when Wasp from the Datastorm crew informed me that the tickets would be released soon I still hadn’t come up with anything that’d fit.
With the added (mostly self-imposed) time pressure I decided to save the hardcore effects for another time and go with something simple, trying to do fresh-looking visuals with basic tech.

After a couple of quick detours I ended up with the following ingredients:

  • A bob plotter used to draw 1-bitplane circles of different sizes. (Reusing an old one turned out to be a bad idea. More on that later).
  • Two simple routines for changing the bob positions (or *particles* if you’re a bit more modern).
  • A simple copperlist to make it look less empty.

Display setup

The whole intro runs in 3 bitplanes, meaning that there are 8 “active” color registers. The actual number of colors on-screen is slightly higher due to the copperlist which changes color registers every 16th line. Updating this copperlist with the CPU was fast enough that it didn’t require double-buffering (which you sometimes need for larger lists to avoid display glitches when the raster beam catches up with the CPU writes).

2 bitplanes are used for the bobs, which are updated on alternate frames in order to achieve *screen motion* at 50 frames per second even when the total number of bobs requires 2 full frames to render. I.e. we update one bitplane containing half of the particles on even-numbered frames, and the other bitplane with the other half on odd frames. This ends up looking a lot smoother than just updating the full display at 25 FPS.
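
A minimal Python sketch of that alternating update, only to illustrate the scheduling. The intro itself is 68k assembler, and the names `plane_to_update` and `bobs_for_frame` are made up for this example:

```python
def plane_to_update(frame: int) -> int:
    """Return which of the two bob bitplanes gets redrawn on this frame."""
    return frame & 1


def bobs_for_frame(bobs: list, frame: int) -> list:
    """Even frames redraw the first half of the bobs, odd frames the second half."""
    half = len(bobs) // 2
    return bobs[:half] if plane_to_update(frame) == 0 else bobs[half:]
```

Each bitplane is only redrawn every 2nd frame, but since something on screen changes every frame the motion still reads as 50 FPS.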

Particle motion

There are two different types of motion used here, both of them done in 2D on vertex positions with 3 bits of fractional precision. The actual *rendering* of the bobs doesn’t care about sub-pixel precision though: the coordinates are simply shifted down 3 bits before calculating the screen position.

The exploding particles use the simplest possible physics model with pseudo-randomized initial speed and constant acceleration.

Basically, for each particle:
position[i].xy += speed[i].xy
speed[i].xy += acceleration.xy

In the actual code the speed (“forces”) and acceleration have 4 more bits of precision than the particle position, so the speed gets a signed shift of 4 before it is added to the position.
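
For readers who don’t speak 68k, here is a rough Python sketch of the same update with the fixed-point shifts spelled out. It assumes positions with 3 fractional bits and speed / acceleration with 4 more (so 7 in total), as described above. The function names and the sample values are invented for the example:

```python
POS_FRAC_BITS = 3      # position: 3 fractional bits
EXTRA_SPEED_BITS = 4   # speed and acceleration: 4 more bits than the position


def step(pos, speed, accel):
    """Advance one particle by one frame.

    Python's >> on negative ints is an arithmetic (signed) shift,
    which matches a signed shift on the 68k.
    """
    pos = (pos[0] + (speed[0] >> EXTRA_SPEED_BITS),
           pos[1] + (speed[1] >> EXTRA_SPEED_BITS))
    speed = (speed[0] + accel[0], speed[1] + accel[1])
    return pos, speed


def to_screen(pos):
    """Drop the sub-pixel bits before plotting the bob."""
    return pos[0] >> POS_FRAC_BITS, pos[1] >> POS_FRAC_BITS
```

The extra precision on the speed means a small constant acceleration can still build up smoothly, even though the bob itself only ever lands on whole pixels.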

The other type of motion, where the bobs slide around to form the silhouette over the text overlays, is actually just sine distortions. Here it’s used on vertex positions but it’s pretty much the same type of distortion used on bitplane graphics in the classic “fractal zoomer / Dweezil zoomer” made *so* popular by Stellar’s Bananamen intro from 1993. Using it to “rebuild an image” by reversing the distortion isn’t a recent idea either, just check Babenoise from 1995.

So, when the intro starts we first do 150 “forward” iterations on the bob positions and save the relevant parameters, such as constant speed, sine offset, amplitude, period (here the latter two were baked into the different sine tables for performance).
Then, when the effect is shown, we simply reverse the distortion using the same parameters in order to play the whole thing backwards and move the bobs into their original positions.
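
A small Python sketch of why this kind of distortion can be played backwards exactly. This is not the intro’s code: the table sizes, amplitudes and the `forward` / `backward` names are assumptions chosen for the example. The point is just that each step only adds a value looked up from the *other* coordinate, so it can be undone by subtracting the same values in the opposite order:

```python
import math

TABLE_SIZE = 256
# Hypothetical tables with amplitude and period baked in.
SINE_X = [round(12 * math.sin(2 * math.pi * i / TABLE_SIZE)) for i in range(TABLE_SIZE)]
SINE_Y = [round(9 * math.sin(4 * math.pi * i / TABLE_SIZE)) for i in range(TABLE_SIZE)]


def forward(x, y, speed_x):
    """One 'forward' iteration: constant speed plus sine offsets."""
    x += speed_x + SINE_X[y % TABLE_SIZE]
    y += SINE_Y[x % TABLE_SIZE]
    return x, y


def backward(x, y, speed_x):
    """Exact inverse of forward(): undo the two additions in reverse order."""
    y -= SINE_Y[x % TABLE_SIZE]
    x -= speed_x + SINE_X[y % TABLE_SIZE]
    return x, y
```

Running `forward` 150 times at startup and then `backward` once per frame at display time brings every bob back to exactly where it started, since everything is integer arithmetic.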

Of course, as there aren’t _that_ many particles and each scene is quite short, it would’ve been possible to just use any non-reversible distortion and simply store the coordinates for each frame in memory to be replayed at display time. If the point had been to get as many particles as technically possible then that would have been the obvious approach (saving the CPU cycles required for per-frame reverse distortion).
The approach used in the invitation feels more elegant though, so I opted for pleasing my coder’s ego instead.

Again, the 68k code for this is quite simple:

Now with a few comments to make the whole “screenshots of assembly code”-thing a bit easier to stomach. 🙂


This is Basic Oldschool Democode and shouldn’t have caused any problems, but my decision to reuse some old code wasn’t entirely ideal. The bob plotter, which I’ve used several times before, was made for scenarios where there’s an almost constant number of bobs on-screen and they’re all fairly large. Thus it cared about things like clipping on the Y-axis, and the setup code was a bit sloppy since the large blits would hide the CPU cycles spent (i.e. the CPU would always have enough time to get things ready before the blitter, working in parallel with the CPU, was ready to draw the next bob).
This time, however, the bobs were smaller, of non-uniform size, and there were more of them. The blitter ended up waiting for the CPU to finish its clipping and setup work, which didn’t pay off anyway because there wasn’t much data to clip for the smaller bobs.
Rewriting the plotter would’ve been a really quick job but instead I figured it’d all be fast enough with the “update half of the bobs every 2nd frame”-trick mentioned above. When I realized that it wasn’t, and got slowdowns in the parts with most particles, it was late at night and I was too tired to do a proper job of it. It ended up as a mess of just enough hacks to make the already completed particle movements run at 50 fps. If there’s any lesson to be learned here it’s that sometimes all you need is an ok idea and a bit of kludging, not flawless code. 🙂


In the end it turned out that the Datastorm ticket release had to be postponed due to unexpected intervention from The Man (showing severe lack of respect for underground retrocomputing cults). However, I’m happy the perceived deadline turned this into a really quick project. It was fun to do something that wasn’t trying to show off hardcore code, but instead just play around with basic concepts (and a color scheme inspired by ice cream wrappers). I’m quite pleased with the end result feeling more like a TV ad than a traditional “invitation intro”.

And of course, nothing can stop Datastorm so things are back on track now and I’m told the tickets will be released soon.

You should probably go there, it’s great. 🙂


February 19, 2017


Filed under: Uncategorized — codingwithballs @ 03:19

About a week back we released a new demo called “Makt” at Datastorm in Gothenburg and won the Amiga 500 demo competition there.
While we *strongly* recommend you watch it on proper hardware and a nice fat CRT you can also check out this youtube capture:


When I heard Datastorm would return after a 3-year break I knew we had to make something special for the Amiga competition there. Mainly because I’ve really enjoyed previous Datastorms (it’s *the* oldschool party for me), but also because it was likely there’d be some decent competition. (And it was! Although it’d be cool if more groups ventured away from the comfort of 1992-style..)

I also knew exactly which project to choose from my list of “stuff I’d really like to implement (but rarely get the time to start)”. For a while I’d been playing with the idea of making A500 versions of the main effects in You are Lucy and Dataskull and if I managed to pull it off I hoped it’d be a proper party banger.

I obviously didn’t plan to port those effects directly. Lucy and Dataskull were made for high-end 060 / AGA machines and even there their chunkybuffer effects were chugging along at 16 to 25 frames per second. Instead I wanted to do something that would give a similar look & feel, while running on much less capable (and much more fun!) hardware.

Note: While writing this I’ve been informed that there are unimaginative and uninformed people out there who believe the whole thing is just an animation (“it’s only 1 bitplane so you have room for a lot. durrr…”). While hilarious it’s of course also utter bollocks. 🙂

So, what *is* the trick then? Image-based rendering! Just like on serious computers! Or.. well… BOBs, really.

Bob the billboard imposter

All the effects in the demo are drawn using a (fairly low) number of bobs which are used to stencil parts of a texture into a single bitplane. In other words: you move small pixel masks around in the framebuffer while also scrolling the texture that you can see “through” the mask.

It’s all pretty much Amiga Blitter 101, although I did have to spend a bit of time rediscovering the blitter’s reverse mode for the first time in ages.
But while it’s basic stuff at this level (not unlike “plotting pixels in a buffer”), the really interesting bit comes when you decide *how* to move things around and what to put in the source textures. What we’ve effectively got here is a fast way of drawing a lot of moving pixels by just playing with the on-screen bob positions and the texture coordinates. As such, the “real work” was to use this rendering technique to create visuals that looked powerful (and quite atypical) for the old A500.

Details on the bob rendering:

  • All bobs are 32×32 pixels (48×32 blit size to enable scrolling)
  • We use suitably noisy / dithered masks to make things nice’n’fuzzy and avoid any unattractive sharp edges where the bobs overlap.
  • The bob plotter is fairly fast by itself but definitely not record material. I also did some basic experiments interleaving it with the coordinate transformations but saw surprisingly little improvement. Things seemed fast enough for what we needed so I didn’t pursue it further.
  • In the final demo (with copper post-processing & music) all effects run at 25 fps on a standard A500, except for the wobbly intestine which drops down to 16 fps for a few seconds.
  • Different effects use different texture sizes, from 96×96 in the cube parts to 960×256 for the intestine.
  • In the spirit of Just Hacking It As We Go Along each part has its own bob routine that’s 90% the same as the others.
  • Most of the parts draw just 70-80 bobs per frame. The intestine uses the most, maxing out at around 100 on-screen bobs.
  • Everything is drawn into just 1 bitplane. The 2 most recent frames are displayed using 2 bitplanes and then a bunch of flickering copper gradients are applied on top.

On to the effect code!


This was the first thing I tried out after deciding to do bob imposters, mainly because it seemed like it’d give results with fairly little work.
The goal was to make “something that looks like You are Lucy on A500” and I even ended up using a texture (post-processed and dithered to 2 colors) and depth map from there.
The parallax effect in Lucy does per-pixel depth distortion but that’s of course a no-go on A500. Instead we just assign a depth value to each of the bobs and do per-bob depth distortion. If the bobs overlap a bit and the depth projection works out, then it just might look ok! 🙂

An important point for all the effects was that the source images needed to be detailed enough to give a feeling of depth and surface texture. In practice this was handled by playing with contrast and blur in Photoshop before noise-dithering down to 2 colors.

Texture from You are Lucy


Depth map (just a slightly blurred version of the texture)


1 bitplane texture used in Makt


In action!

Aaaand action!

Other tech stuff:

  • The dataset consists of one picture (256×256 x 1bpl) and around 80 vertices for the bobs.
  • The bobs are pre-sorted and never switch drawing order
  • The depth transforms are done in two stages:
    • Regular projection (makes the face turn slightly as it moves around the screen).
    • A Face Dragging Vector for more brutal mimicry (e.g. when it gets stretched and torn apart).
  • As far as I remember there’s no animation of the texture coordinates at all. It could probably have been used for more interesting distortion and stretching.
  • Like in several of the other parts the blitter only clears every 2nd line of the current framebuffer. This adds a nice little messy trail of pixels after the bobs and also helps hide some of the glitches where they don’t overlap.
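
To make the two-stage depth transform a bit more concrete, here’s a hedged Python sketch. None of the constants or names (`FOCAL`, `BASE_Z`, `place_bob`) come from the demo, and the exact maths used there may well differ. It only shows the general idea of giving each bob a single depth value, pushing it along a drag vector in proportion to that depth, and then doing a regular perspective divide:

```python
FOCAL = 256    # hypothetical focal length
BASE_Z = 256   # hypothetical distance to bobs with depth 0


def place_bob(x, y, depth, face_pos=(0, 0), drag=(0, 0)):
    """Return the screen position of one bob, relative to the screen centre.

    depth is the bob's value sampled from the depth map (0..255 here,
    larger = closer to the viewer, must stay below BASE_Z).
    """
    # Stage 2 in the list above: the drag vector moves closer bobs further.
    wx = x + face_pos[0] + ((drag[0] * depth) >> 8)
    wy = y + face_pos[1] + ((drag[1] * depth) >> 8)
    # Stage 1: regular projection. Closer bobs get scaled up more, so they
    # shift further when the face moves away from the centre.
    z = BASE_Z - depth
    return (wx * FOCAL) // z, (wy * FOCAL) // z
```

With depth 0 a bob stays put, while a bob at depth 128 ends up twice as far from the centre, which is what makes the flat picture appear to turn as it moves.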

Wobbly Intestine

Pretty much the same effect as the face, just with a larger texture and more bobs (higher density and larger object). The intention here was to take it a bit slower & more majestic to give the idea that it miiight be the last effect already.

The waving and pulsating is just a simple sine-distortion. I had originally planned a much more interesting twist here (no, not actual twisting) but I got a bit short on time (and also noticed that I was already quite close to dropping below 25 fps as it was).
Due to the higher number of bobs this was the only part where I had to do any culling prior to the actual plotting stage in order to (mostly) reach 25 fps. The quick zoom-out before the waving begins drops down to every 3rd frame though (i.e. 16.6 fps).

Warped cubes

This is all done by changing the texture coordinates and there’s no movement of the actual bobs themselves except for a randomized 0-7 pixels jitter (which adds a *lot* to the final look though!)
The deformation is just the standard thing you’d do for a grid expander: scaling the texture coordinates based on the bob’s distance from a given point (mainly the center of the screen because I’m old). However, since we’re just offsetting the textures per-bob rather than stretching them it looks a bit rawer and less generic. It just becomes a mess if you distort too much of course, so we try to avoid that.
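
As a rough illustration, a grid-expander style offset could look something like the Python below. The 8.8 fixed-point scale, the `strength` parameter and the function name are all assumptions made for the sketch, and the random 0-7 pixel jitter is left out. Only the 96×96 texture size is taken from the demo:

```python
TEXTURE_SIZE = 96  # the cube texture is 96x96


def texture_offset(bob_x, bob_y, centre_x, centre_y, strength):
    """Pick the texture coordinate shown through one (stationary) bob.

    The further the bob is from the centre point, the more its texture
    coordinate gets pushed outwards. The bob itself never moves.
    """
    dx = bob_x - centre_x
    dy = bob_y - centre_y
    dist2 = dx * dx + dy * dy
    scale = 256 + ((strength * dist2) >> 8)   # 8.8 fixed point, 256 = 1.0
    u = centre_x + ((dx * scale) >> 8)
    v = centre_y + ((dy * scale) >> 8)
    return u % TEXTURE_SIZE, v % TEXTURE_SIZE
```

Since the whole 32×32 bob gets the same offset, the texture isn’t stretched inside a bob. It just jumps a little between neighbouring bobs, which is where the rawer look comes from.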

The patterned cube is rendered into a 96×96 bitplane buffer which is then used as the source texture for the bobs. The cube drawing is really naive and inefficient but I never got around to optimizing it. That would’ve allowed for a fair few more bobs on screen but as the effect still looked ok it wasn’t a major priority.


This was a bit more fun and, together with the Skull, one of the few effects to use animated bob masks and multiple textures.
The idea is simple enough though: just adapt a normal “endless zoomer” effect to stencil-bob rendering:

  1. “Pre-generate several images of the same motif with different zoom levels.” In the demo there’s 4 badly looping ones.
  2. “Zoom a bit on one image.” Here done simply by applying a uniform scale to all the texture coordinates, while the bobs themselves don’t move at all. Obviously we’re just sliding different bits of the underlying image around, but at scaling factors in the range of 0.7 – 1.3 it looks decent.
  3. “Blend between two images at different zoom levels.” Which we did by jittering in more and more pixels from the next image, while removing the bobs using the previous texture.

As it would’ve been too slow to draw all the bobs for both images at the same time we just jittered in 8 at a time. I.e. for a given zoom factor you have:
– Draw X bobs with texture 0
– Draw 80-8-X bobs with texture 1
– Draw 8 bobs with texture 1, using masks with increasing amounts of pixels.
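
The three steps above can be sketched as a per-frame “draw plan” in Python. The total of 80 bobs and the 8 fading bobs are taken from the list, while `FULL_MASK`, the mask numbering and the function name are hypothetical:

```python
TOTAL_BOBS = 80
FADING_BOBS = 8
FULL_MASK = 7   # hypothetical index of the mask with all pixels set


def draw_plan(x, fade_mask):
    """Return (texture, mask) for every bob drawn this frame, in draw order.

    x         -- number of bobs still showing the previous texture (0)
    fade_mask -- mask index for the 8 bobs currently being jittered in
    """
    if not 0 <= x <= TOTAL_BOBS - FADING_BOBS:
        raise ValueError("x out of range")
    plan = [(0, FULL_MASK)] * x
    plan += [(1, FULL_MASK)] * (TOTAL_BOBS - FADING_BOBS - x)
    plan += [(1, fade_mask)] * FADING_BOBS
    return plan
```

Once `fade_mask` has reached the full mask, `x` drops by 8 and the next group of bobs starts fading in, so the number of bobs plotted per frame stays constant throughout the blend.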


Jitter-mixing in action (some artifacts here due to reading outside of the texture)

With proper textures and a bit of coloring


Moving around in the picture just came down to offsetting the center of the coordinate scaling. This of course won’t affect the pre-generated images so if you move too far, too slow and don’t flash the screen enough then it might look a bit crap.


Both this part and the warped cubes would go nicely in a 4k intro as there’s really no data to speak of. We’re rendering cubes to textures again, but this time we’re keeping 64 of them in memory and updating one per frame. Different textures are then selected for each cluster of bobs that ends up on the screen. Each cube consists of 4 bobs and the depth scaling is simply done by moving the bobs closer together or further apart.
For each line-of-bobs there are 8 cubes, giving a total of 3 lines x 8 cubes x 4 bobs = 96 bobs, although there’s always some that are off-screen and get culled. (On a side-note: I was really lazy this time around and just immediately culled anything close to the screen edges. It’s sloppy but it’s less of a concern with the kind of busy & noisy visuals we have here.)

The bobs can move freely around in 3D but as I couldn’t use too many of them I opted for simple linear patterns and only rotating around the z-axis. I kinda wish I’d pushed that a little bit further.
The movement patterns were set up (there were just 3 lines in 3d space, so “keyframed” is kinda misleading) by hooking in mouse & keyboard controls to the effect code itself. Perhaps a bit overkill but I think the end result got better than what I’d have managed by just typing in coordinates by hand.

Adjusting movement patterns


This is also the only part where there’s just 1 bitplane enabled instead of mixing the two most recent frames. I wanted fast movement while still being able to make out the sharp patterns in the cubes and the pseudo-afterburn of the 2nd bitplane just made the visuals too smudgy.


This is the fun one. The face & the wobbly intestine were effectively just “2D images with a bit of depth added” (and the cuberush more traditional imposter billboarding) but here I wanted something that I could play around with properly in 3D. It’s somewhat related to Dataskull (they’re both rotating a bunch of points that make up a skull) but while Dataskull uses (comparatively) many particles, Makt relies on less than 100 of them and clever texturing instead.

As for the subject matter I just like a good skull effect. They’re interesting to look at, corny enough to remind us that all of this demostuff is just a good laugh, and kinda tough & evil in the right setting.

The basic principle of the effect is easy, as always:
1. Generate some images of a 3D skull from different angles. We used 4 different images to represent 180 degrees rotation around the y-axis.
2. Also generate depth maps for the same angles (i.e. just dump the z-buffer).
3. For each of the 4 images make a small batch of bobs. Assign the bobs z-values based on the depth map.
4. Find some way to draw this *efficiently* while using bobs from different batches based on how you’d like to rotate the on-screen skull.

Points 1. – 3. here are pretty much the same as what we did for the face and the intestine. The main difference is that the texture and depth map were now generated by rendering a skull object in OpenGL rather than retouching photos of faces and tree branches in Photoshop.


One of the depth buffers, this time in 16 bit packed into 2 color channels

The 4 images used to create the rotating skull:

skull-2 skull-1 skull0 skull1

The last point – actually making stuff look good and doing it with reasonable performance – required quite a bit of experimentation and content-specific tweaking.
This is what we ended up with:

  • Generate the data for each of the 4 images / batches as described above.
  • Obviously you only need to deal with 2 of the 4 batches in any specific frame. For instance: if you want to render the object at 22 degrees rotation (around the y-axis) then you use the image (and accompanying bob batch) representing 0 degrees, the next one representing 60 degrees, and then just “mix them together in some way”.
  • At the rendering level the “mixing together” was done in the same way as for the zoomer: use bob masks with different amounts of pixels.
  • Determining which bobs to remove and which to add while rotating was done mainly based on each bob’s x-coordinate in the original “straight-on” position (i.e. the position the texture was generated in). This required quite a bit of tweaking to make sure enough bobs were actually removed (so we didn’t always render 2 full batches) without too many gaps appearing. Of course: for performance reasons this culling of bobs was done *before* anything was actually rotated.
  • Rotation around the x-axis (when the skull nods or tips back-/forward) is just basic 3D rotation and required no custom work. We just had to make sure it didn’t tip too far as there were no textures showing the top or bottom of the skull (quite similar to Dataskull which also had a hole in the head).
  • Sorting the bobs in real-time (on top of all the other stuff going on) was a bit too slow. What we did instead was to pre-sort and merge 2 & 2 batches (i.e. one buffer with batches A+B, one with B+C and one with C+D). Simply presorting based on the z-value of each bob in its original “straight-on” position worked surprisingly well.
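
A Python sketch of the batch selection and the pre-sorted pairs described in the list above. The 60 degree spacing follows from the example (0 and 60 degrees), but the actual angles of the four images, the z convention (larger = further away) and all the names are assumptions:

```python
# Assumed angles for skull-2, skull-1, skull0 and skull1.
BATCH_ANGLES = [-120, -60, 0, 60]
STEP = 60


def pick_batches(angle):
    """Return (lower batch, upper batch, degrees past the lower batch)."""
    lower = (angle - BATCH_ANGLES[0]) // STEP
    lower = max(0, min(lower, len(BATCH_ANGLES) - 2))
    return lower, lower + 1, angle - BATCH_ANGLES[lower]


def presort_pair(batch_a, batch_b):
    """Merge two batches of (x, y, z) bobs into one far-to-near draw list.

    Done once at startup, using the z-value of each bob in its
    "straight-on" position. Each entry keeps track of its source batch
    (0 = batch_a, 1 = batch_b) so the plotter can pick the right texture.
    """
    tagged = [(bob, 0) for bob in batch_a] + [(bob, 1) for bob in batch_b]
    return sorted(tagged, key=lambda item: item[0][2], reverse=True)
```

For 22 degrees `pick_batches` selects the 0 and 60 degree batches, i.e. the third pre-merged buffer (C+D), and the “22 degrees past” value is what would drive the choice of bob masks when mixing the two.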

Dissolving a single batch of bobs as they rotate


And using all 4 batches & textures. In the demo some of the glitches and noticeable transitions are covered up by the 2nd bitplane “afterglow” and color flashes. 🙂

And that’s about it really. Summing it up now it seems very straight-forward but there was a significant amount of trying, adjusting and tweaking involved. 🙂


As always in recent years there were some ad-hoc tools involved. The main “tool” this time was a messy piece of 68k asm code for manually placing bobs in 2D space, sampling each bob’s z-value from a depth map and then checking the results on the fly. It’s a “tool” only in the most rudimentary sense, with mouse control and F-keys used to control different drawing and rendering modes. I had various versions of this for each of the effects depending on the specific characteristics of each and I hope I never have to edit any of those sources again.

A word of warning though: This is very much the development process *I* prefer. If you were to do similar effects on Amiga (that would be awesome by the way!) then you’d probably be better off doing a lot more of the data generation in a high-level language on a PC. For instance: generating a texture atlas with impostors from many different angles might give much better results than manually picking them from a small number of full images. That said, when experimenting with new stuff I like to stay in Asm-One (on a blisteringly fast WinUae-emulated Amiga) to minimize the amount of mental context switches, benefit from all the old bits of code I’ve got and be able to freely move prototype code to the actual effect (when it’s not too slow).

Post processing

Nothing new in the copper department for this demo, except for the orange bars briefly used in the early zoomer glitch-outs. The rest of the copper coloring was taken directly from Party Elkstravaganza and then dumbed down to only use one base color (whereas Elkstravaganza blended multiple). We’re also just sampling from the same color table that I described in a previous post.
I originally planned to do a lot more fancy stuff here but then I kinda fell in love with the rawer single-gradient look and stuck with it.

Data accounting

It’s actually a rather small demo. Uncrunched it comes in at just above 500k and after going through Cranker it’s 322k. No attempt was ever made to reduce the file size during development as I tend to postpone that until it’s really required. Here the only real concern was memory usage rather than disk space. We kept the voice sample separate from the tune itself so that it could be kept in slowmem until it was needed. Other than that there was none of the tedious janitorial RAM-shuffling you sometimes have to do.

The larger bits of data are:

  • 131k for the soundtrack
  • 72k for the voice sample
  • 40k texture data for the Skull (four 1-bitplane images at 320×256, where 30-50% is completely empty)
  • 40k texture data for the Zoomer (four images at 320×256)
  • 30k texture data for the wobbly intestine (960×256, again with a lot of empty space)
  • 8k texture for the face (256×256)
  • 37k of sine tables (!) just because I forgot they were there.
  • 34k for the intro text, “MAKT” logo, end text, bob masks & stencil patterns for the cubes
  • 32k color table (same as in Elkstravaganza but not delta-modulated this time as there was no need to crunch)
  • 13k (or thereabouts) of inefficiently stored bob coordinate data, including at least 2 batches that were never used.
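
The texture sizes in the list are easy to sanity-check, since a 1-bitplane image takes width × height / 8 bytes. A quick Python check (the dictionary just repeats the numbers from the list, and the names are made up):

```python
def bitplane_kib(width, height, images=1):
    """Size in KiB of `images` 1-bitplane pictures of the given dimensions."""
    return width * height * images // 8 // 1024


# The "larger bits of data" as listed above, in kilobytes.
LISTED_K = {
    "soundtrack": 131, "voice sample": 72, "skull": 40, "zoomer": 40,
    "intestine": 30, "face": 8, "sine tables": 37, "texts and masks": 34,
    "color table": 32, "bob coordinates": 13,
}
```

The listed items add up to 437k, so the remainder of the “just above 500k” would be code plus whatever smaller bits of data aren’t itemized here.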

Missed opportunities

Of which there were *so many* this time around!
Even when excluding the crazier and potentially-impossible ideas I’d say that the final product is only about 65% of what we aimed for. Some of the missing bits might show up in a later demo but I’ll definitely do something completely different first.

Things that weren’t:

  • Mixing different bob masks in the same frame. This looked promising in the trials and there are actually more masks in the demo data.
  • Morphing and growing stuff on objects. Would’ve made the skull way more evil.
  • Lots of ideas for abstract patterns in both the effects and the backgrounds (we don’t fear the black background of death but it’s not always what we aim for either).
  • Some fairly glitchy bitplane-distortions were also implemented but never used.
  • Feedback effects! The bob rendering would be well suited for noisy variations on Dweezil-style chaos zoomers.
  • The entire first part. Lug00ber finished the soundtrack for it and I have one-and-a-half effects ready (completely different stuff from what’s in the main part).
  • The intro sequence for part 2 was not planned to be just text. 🙂


It was great fun working on this one. The effects were fun to play around with and looked *almost* the way I imagined them in my head. I would of course have liked to spend more time at the party than just two late nights but the last two days of hotel coding were quite enjoyable and without any of the desperation & doubts that can appear when you’re over-tired and fed up.

In summary: we still like to make demos and we still enjoy winning compos so we’ll continue with both.

June 10, 2016

Amiga Elkstravaganza

Filed under: Uncategorized — codingwithballs @ 08:14

Not too long ago we released “Party Elkstravaganza” – the invitation for Solskogen 2016. It’s a 36 kilobyte intro for Amiga 500 and it took home the 1st prize in the oldschool intro compo at Simulaatio. It’s also the first Spaceballs production in 14 years with music from my old partner in crime, Teis! (The previous one being a grossly underrated diskmag intro.)

The main effect was inspired by 2 animations from El Visio (a man of many great ideas!). I hadn’t yet thought of making a new Solskogen invitation so there wasn’t any real plan to it, but I really liked the patterns and decided to try replicating them in realtime on the A500. (Thinking that even if it failed the process and / or results might still be interesting.)

Patterns and movement

I quickly lost hope of instinctively “seeing” the logic behind the patterns and simply asking El Visio for the algorithms he used didn’t feel like enough of a challenge (with Amiga coding being a prestigious e-sport and everything).
Thus, I ended up staring at the clips for a long time, looking for pattern repetitions, and then just copied the motion without actually understanding it. The result of this was a (fairly inelegant and messy) piece of 68k asm code which generated 2D vertices and edges that matched El Visio’s animations.
Simply precalculating the vertex positions offline and including them as data was of course an option, but doing it at run-time is more fun, takes less space and (most importantly) makes it a lot easier to experiment with the effect while putting the actual production together.


Throughout the intro we’re switching between different “scenes”, each one a 50 frames long loop with a specific pattern movement and triangle size.
The data for a full loop is generated before it is shown (while we’re displaying the previous scene) and it consists of 4-color (2 bitplanes) slices that are 320 pixels wide. There are 50 slices (one for each frame) and they’re typically between 30 and 100 pixels tall (the height of each slice depends on the triangle positions for the corresponding frame).

A single slice

A single slice

When displaying the effect a single slice of data is repeated multiple times (and sometimes mirrored) using the Copper, in order to fill the entire 256 pixel height of the screen.
The reason for doing this rather than naively drawing lines and filling bitplanes for the whole screen with the Blitter is that the latter would be too slow to achieve 50 FPS. We do use the Blitter when pre-generating the slices, but that processing doesn’t have any impact on the display framerate since it happens in parallel with displaying the previous scene.

As for memory usage the different loops in the intro range from around 140k to 310k.
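
To illustrate the slice repetition, here’s a small Python sketch of which line of the slice ends up on a given screen line. On the real thing this is done by the Copper manipulating the bitplane fetch rather than by a lookup like this, and the function names are invented. The second function just applies the dimensions given above (320 pixels wide, 2 bitplanes) to estimate the size of a loop:

```python
SCREEN_HEIGHT = 256
BYTES_PER_SLICE_LINE = 320 // 8 * 2   # 320 pixels wide, 2 bitplanes


def slice_line_for(screen_y, slice_height, mirror=False):
    """Return the line of the slice that is shown on screen line screen_y."""
    repeat, line = divmod(screen_y, slice_height)
    if mirror and repeat & 1:
        # Every other repetition is shown upside down.
        line = slice_height - 1 - line
    return line


def loop_bytes(slice_heights):
    """Memory needed for one loop, given the height of each of its slices."""
    return sum(height * BYTES_PER_SLICE_LINE for height in slice_heights)
```

A 100 pixel tall slice gets shown about two and a half times per frame, and a loop of 50 slices that are each 35 lines tall would need 140000 bytes.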

Colors (of which there are a lot)

We’re using a total of 4 bitplanes, but not all the 16 color registers that enables. There’s 2 bitplanes for the patterns (background + two triangle colors) and 2 bitplanes for the text overlays (background/transparent + outline and text fill colors).
All color values are in 12bit RGB (which is the OCS/ECS hardware maximum) but there’s a lot of semi-controlled flickering to provide temporal antialiasing and make things look extra cool on 50hz CRT screens.

The colors for the overlay (or rather, the overlay blended on the patterns) are simply set once per frame. They’re flickered slightly every frame and also flashed in sync with the music (of course!).
The triangle patterns are colored using two Copper gradients. Each of these has 64 entries which means the colors can change every 4th scanline (for 256 lines display height). The gradients are also offset 2 scanlines every second frame in order to smoothen them out (again: especially nice on CRTs).
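
In Python terms the mapping from scanline to gradient entry could be sketched like this. The direction of the 2-line offset and the clamping at the bottom of the screen are assumptions, as is the function name:

```python
GRADIENT_ENTRIES = 64
LINES_PER_ENTRY = 256 // GRADIENT_ENTRIES   # = 4 scanlines per color change


def gradient_index(scanline, frame):
    """Return which of the 64 gradient entries colors this scanline."""
    offset = 2 if frame & 1 else 0   # shift by half an entry on odd frames
    index = (scanline + offset) // LINES_PER_ENTRY
    return min(index, GRADIENT_ENTRIES - 1)
```

On odd frames the color boundaries sit 2 lines away from where they were on the previous frame, so on a CRT the 4-line steps blur into something that looks closer to 2-line steps.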
The gradients are generated and modified in different ways throughout the intro by interpolating between different colors and brightness levels. As the basis for this I used a pre-generated table of colors containing 256 different gradients, each with 64 entries. This is actually a photo of neon tubes which has been cropped and post-processed in Photoshop. In raw form this takes up 32k but slightly reorganized and delta-encoded it crunched very well (keeping the file size well below the important 40k limit). In other words: no reason to bother with removing unused data or try to generate something with code.
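
Delta-encoding is simple enough to show in a few lines of Python. Exactly how the table was reorganized in the intro isn’t covered here, so splitting the 12-bit colors into separate R, G and B streams is just one plausible guess, and all the function names are made up:

```python
def split_channels(colors):
    """Split 12-bit 0xRGB colors into separate R, G and B nibble lists."""
    return [[(c >> shift) & 0xF for c in colors] for shift in (8, 4, 0)]


def delta_encode(values, modulus=16):
    """Store each value as the difference from the previous one."""
    out, prev = [], 0
    for v in values:
        out.append((v - prev) % modulus)
        prev = v
    return out


def delta_decode(deltas, modulus=16):
    """Rebuild the original values by summing up the differences."""
    out, prev = [], 0
    for d in deltas:
        prev = (prev + d) % modulus
        out.append(prev)
    return out
```

A smooth gradient turns into long runs of the same small numbers (mostly 0 and 1), which is exactly the kind of data a cruncher loves. The raw table is 256 gradients × 64 entries × 2 bytes = 32768 bytes, matching the 32k mentioned above.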

The base color table - Obviously 12 bit RGB isn't all that hot on its own so we'll have to add a bit of Flicker & Flash!



The generated gradients applied to the effect. (The first three patterns use Copper mirroring and the gradients are swapped for each slice.)

Text plotter

The overlays are generated at runtime (as opposed to in DPaint or Photoshop). This was necessary both to keep the file size down and to have enough memory available for the triangle pattern data. The plotter uses a bitmap font (one size only) with 2 bitplanes (outline and fill). Each character is manually positioned using a bare-bones editor and all the text is stored as arrays of {CharacterIndex, Xpos, Ypos} which are then used to plot characters with the Blitter. The editor itself (not included in the intro) was also implemented in 68k assembler because I have nothing better to do with my time.
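
The data format is about as simple as it gets. Here’s a Python sketch of plotting such an array into a single bitplane, using a made-up font format (one string of bits per glyph row). The real plotter uses the Blitter and handles 2 bitplanes (outline and fill), which is left out here:

```python
def plot_text(entries, font, width, height):
    """Plot (CharacterIndex, Xpos, Ypos) entries into a 1-bitplane screen.

    font maps a character index to a list of rows, e.g. ["11", "10"].
    Returns the screen as a list of rows of 0/1 pixels.
    """
    screen = [[0] * width for _ in range(height)]
    for char_index, xpos, ypos in entries:
        for row, bits in enumerate(font[char_index]):
            for col, bit in enumerate(bits):
                if bit == "1":
                    screen[ypos + row][xpos + col] = 1
    return screen
```

Storing a position per character costs a few bytes each, but it allows for manual kerning and free placement of every single letter.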

The overlays are an example of what you can get away with through composition, coloring and motion. Some of the text screens look absolutely horrid on their own but (in my opinion) work out nicely in the final product.

Control flow and logistics

Sounds awfully enterprise’ish, but it’s just about “what should happen when”.

Only about half of the available RAM on a standard A500 is accessible to the custom HW chips (meaning that the other half can’t be used for things that are on-screen). This meant data for the next scene had to be generated on-the-fly and shuffled around in parallel with displaying the current scene.

While interrupts and pseudo-multi-threading aren’t exactly rocket surgery it was still a bit finicky to combine everything with the sync and progression of the intro.
There’s VBlank interrupt code which handles all color changes and updates to the on-screen 50 FPS effect (mainly by generating new Copper lists). It also initiates the jobs running in parallel with the effect, which are:
  • Generate new pattern and store it in “public memory”. A heavy job which can take anywhere from 5 – 20 seconds depending on the pattern.
  • Move new pattern into chip mem – happens once the screen has faded to black, in order to hide glitches when the old data is overwritten.
  • Plot new text overlay – starts after the interrupt code has switched off the previous overlay (as there was no memory for doublebuffering the overlays).


The mainloop for the non-interrupt code. Exciting stuff!

The interrupt code stealing cycles from the pattern generation each frame imposed some limitations on the visuals and ordering of the parts. If the next pattern already takes a lot of time to generate and we make it even slower by using complex color gradients in the current part then we’ll have to watch the same scene for a very long time. It could be argued that we don’t need many different patterns if we have nice color variations, but when I’ve made a decent piece of code I’d like to show it off a bit.

Size optimization

As there was never any real risk of exceeding the 64k limit of the compo at Simulaatio I didn’t bother much with size optimizations. However, when I saw how small it got when crunched with CrunchMania (50-something kilobytes) I figured we should cater to the real oldschool connoisseurs by getting it below 40k.
In the end it turned out to be very easy: Delta-encoding the color table (in addition to the high-precision sine table) obviously improved crunch rate quite a bit. Blueberry’s awesome Shrinkler also gave much better results than CrunchMania (as expected). And finally: Teis delivered a final version of the soundtrack that was both cooler and smaller than the previous one.

Stuff that didn’t happen

As always there were some ideas and potential optimizations that never made it.

  • Keeping 2 smaller patterns in chip memory at the same time and switching seamlessly between them. This could’ve provided a bit more variation and most of the code was actually in place.
  • Playing more with the overlay and background colors, for instance by using gradients for the text fill.
  • Optimizing the pattern generation enough that each frame slice could be rendered in real-time. This would have freed up lots of memory and allowed much more variation in the effect patterns. I’m not sure it would have worked but precalculating the Blitter config data for all the lines and / or doing CPU linedrawing in parallel with the Blitter might have been worthwhile.

Final thoughts

Doing a short one-trick-pony intro (as opposed to a larger demo) was a nice and fairly smooth experience (even the last 30h crunch before the deadline wasn’t too bad). I’ll definitely do more projects like this in order to play with new ideas (I tend to be a lot more motivated when there’s an actual release target in front of me).

And of course: you should all come to Solskogen. It’s a great party with a really unique atmosphere and a very nice rural location not too far from Oslo. 🙂

May 10, 2016


Filed under: Uncategorized — codingwithballs @ 19:12

A couple of people have asked about the effects in Dataskull so I figured I’d post this (originally from an email right after the demo compo).

And just to remind you what it’s all about: Dataskull is a demo for “high-end Amigas” (AGA, 68060 CPU at 50MHz or above) which was released at last year’s Assembly in Helsinki (where it ended up 3rd in the Amiga compo).

This was a fairly quick project done in about a week (right after the much more time-consuming Korreks) because I didn’t want to show up at Assembly without a demo when they had their first dedicated Amiga compo since 1999 (and because my friend Wrec deserved a birthday greeting!)

The main effect was based on some old ideas and unfinished code I’d wanted to pick up again and its main purpose was to look massive and voxel’ish.

The tech

It’s actually just z-sorted point sprites with a lot of hacking. 🙂 (Note: these aren’t Amiga hardware sprites, but “CPU sprites” plotted into a chunky buffer).

Offline processing

I started with volume data from a CT scan of a skull which I downsampled and thresholded to a more manageable size, storing the resulting points (those with high enough density) as regular 3d coords.
Since this was still far too much data for the poor A1200 I then did optimistic (lossy) view-dependent pre-culling to bring it down to a manageable number of sprites. This was basically just brute-forced by rendering the object from a lot of different angles and then removing the points that didn’t contribute enough to the final image (because they were too small, occluded by other sprites, or not visible in enough frames).

The final skull object used in the demo consists of just 1909 3D points (the cluster of spheres is 1828 points).


The points were transformed in 3D, z-sorted (cheaper than a z-/span- buffer due to the slow memory, small sprites and relatively high penalty of conditionals) and rendered with basic depth shading (and 2D screenspace clipping) into a 160 x 96 pixels buffer (1/4 size of the final screen).
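The transform / sort / plot pipeline can be sketched roughly like this in Python. This is a simplified illustration, not the demo's code: it assumes a single rotation around the Y axis, a made-up focal length and camera distance, and 1-pixel "sprites" instead of sized squares or circles.

```python
import math

W, H = 160, 96  # size of the low-res chunky buffer


def render_points(points, angle, focal=80.0, cam_z=4.0):
    """Rotate around the Y axis, project, sort back to front and plot shaded pixels."""
    s, c = math.sin(angle), math.cos(angle)
    projected = []
    for x, y, z in points:
        xr, zr = x * c + z * s, -x * s + z * c
        depth = zr + cam_z
        if depth <= 0.1:  # behind (or too close to) the camera
            continue
        sx = int(W / 2 + focal * xr / depth)
        sy = int(H / 2 + focal * y / depth)
        if 0 <= sx < W and 0 <= sy < H:  # 2D screenspace clipping
            shade = max(1, min(255, int(255 / depth)))  # nearer points are brighter
            projected.append((depth, sx, sy, shade))
    projected.sort(reverse=True)  # farthest first, so nearer points overwrite them
    buf = [[0] * W for _ in range(H)]
    for _depth, sx, sy, shade in projected:
        buf[sy][sx] = shade
    return buf
```

Because the sprites are drawn farthest-first, no per-pixel depth test is needed while plotting, which is the trade-off described above.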

This was then used as the source for various combinations of 2 post-processing effects which produced the final 320 x 192 screen buffer.
The noise filter does edge detection on the z-shaded sprite buffer and spawns 2D dots which move pseudo-randomly out from the edges.
The voronoi filter just uses precalculated look-up tables to draw voronoi cells that are colored based on the contents of the 160 x 96 sprite buffer. (In other words, it’s just basic “tunnel effect tables”.) There are 8 different tables for different amounts of cells (“voronoi resolution”, if you will). Each table is 320 x 192 x 16 bits.
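A hedged Python sketch of what such a table could look like: for every pixel of the 320 x 192 output it stores the offset into the 160 x 96 sprite buffer of the nearest cell site, so that applying the filter is a plain table look-up. The random site placement, the brute-force nearest-site search and the function names are assumptions for illustration only.

```python
import random

SRC_W, SRC_H = 160, 96    # sprite buffer
DST_W, DST_H = 320, 192   # final screen buffer


def build_table(num_cells, seed=1):
    """For every output pixel, precompute the source-buffer offset of its nearest cell site."""
    rng = random.Random(seed)
    sites = [(rng.randrange(DST_W), rng.randrange(DST_H)) for _ in range(num_cells)]
    table = []
    for y in range(DST_H):
        for x in range(DST_W):
            sx, sy = min(sites, key=lambda s: (s[0] - x) ** 2 + (s[1] - y) ** 2)
            # Offsets stay below 160 * 96 = 15360, so they fit in 16 bits.
            table.append((sy // 2) * SRC_W + sx // 2)
    return table


def apply_table(table, src):
    """Per frame: colour each output pixel from the source pixel its cell points at."""
    return [src[offset] for offset in table]
```

All the expensive work (finding the nearest site) happens once in `build_table`; the per-frame cost is one read and one write per output pixel, regardless of how many cells there are.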

And while all of this runs at ~16FPS (every third frame) on an A1200 with a 50MHz 68060 CPU we adjust the hardware color registers from the vblank interrupt at 50 FPS to add instant coolness and give an illusion of higher effect frame rate (at the risk of triggering some photosensitive epilepsy).

I also switched between 2 different sprite renderers throughout the demo to balance performance against the zoom level of the object: One slightly faster version that just plotted fixed-size squares, and one slower but better looking version that plotted circular(‘ish) shapes at different sizes.

Highest voronoi resolution – sprite size adjusted to match the on-screen size of the object

Too small sprites causing ugly gaps – Increase sprite size or apply aggressive post-processing!

Lowest voronoi resolution – ok sprite size

A couple of weeks after the compo I also did some work on a final version of the demo. The intro part has been improved quite a bit, but I never got around to fixing the sync and post-proc for the main part. I’ll see if I manage to get it out at some point.

March 29, 2015

Slow and inaccurate screenspace voronoi vectorization.

Filed under: Uncategorized — codingwithballs @ 16:56

<@Slummy_> speaking of anti-advisory.. I’ve been doing screenspace voronoi vectorization in 100% asm recently. 😀

<@Slummy_> worst thing is that it’s just for playing around, or “offline precalc” at most, so it could’ve been done in any sane language on a PC

<@Slummy_> but hey, it’s been too long since I did any amigacoding so it seemed like a decent way of playing a bit

<@Raylight-PwL> Slummy_: any tip on algo for that btw? voronoi is da shit you know!

Basically the task is: take a 2D voronoi diagram rendered as pixels and convert each cell to a polygon.

Without going into any detail of how (or why) it was implemented in 68k asm, this is the basic approach. It’ll obviously also work for other 2D graphics with “voronoi-like” characteristics (convex shapes with lots of straight edges) but it’s far from a general purpose vectorizer.

Render your voronoi with a different ID (colour) for each cell.


Identify all the corners / vertices in the image. I do this by scanning through the buffer and comparing each pixel to 3 of its neighbours: (x+1), (y+1) and (x+1, y+1), as well as some special-casing for edges. In the general case a pixel is on a vertex if it has a different colour than at least 2 of its 3 neighbours, and those neighbours are also different.

For each detected vertex store its screen space position (x,y) as well as the IDs of the 4 sampled pixels (i.e. the 3 or 4 cells that this vertex is part of).
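The general case of the vertex test can be sketched in a few lines of Python. "Differs from at least 2 of its 3 neighbours, and those neighbours also differ" boils down to the 2x2 block containing 3 or more distinct IDs. This sketch skips the special-casing for screen edges, and the function name is made up.

```python
def find_vertices(ids):
    """Return (x, y, cell_ids) for every 2x2 block that touches 3 or more cells."""
    h, w = len(ids), len(ids[0])
    vertices = []
    for y in range(h - 1):
        for x in range(w - 1):
            block = {ids[y][x], ids[y][x + 1], ids[y + 1][x], ids[y + 1][x + 1]}
            if len(block) >= 3:
                vertices.append((x, y, block))
    return vertices


# Three cells meeting in a T-junction: exactly one vertex.
image = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 3],
]
found = find_vertices(image)
```

Keeping the set of IDs with each vertex is what later lets you collect all vertices belonging to a given cell.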



Then, for each cell in the screen buffer find:

  • All vertices participating in the cell
  • The cell’s bounding box
  • The center of the bounding box


To get a polygon out of this we need to sort the vertices according to winding order. In other words: for each vertex find the angle of the vector from the centre of the bounding box and then sort the vertices according to those.

Finding the angle, either through the magic of basic trigonometry or the glorious path of brute force, is left as an exercise for the reader. 🙂
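For the trigonometry route, a minimal Python sketch using `math.atan2` could look like this. It assumes the bounding box is taken from the cell's vertices (rather than from its pixels) and that the cell is convex, so the centre lies inside it; `cell_polygon` is a made-up name.

```python
import math


def cell_polygon(vertices):
    """Order a cell's vertices by angle around the centre of their bounding box."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    return sorted(vertices, key=lambda v: math.atan2(v[1] - cy, v[0] - cx))


square = [(0, 0), (4, 4), (4, 0), (0, 4)]
ordered = cell_polygon(square)
```

Since only the ordering matters, any monotonic stand-in for the angle (quadrant plus slope, for instance) would do as well, which is handy on a CPU without a cheap arctangent.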


And voila!


While it’s neither clever nor elegant it worked as intended.

November 17, 2011

So, it’s been really quiet in here…. but we made a new demo!

Filed under: Uncategorized — codingwithballs @ 19:23

I will get back to the effects from Norwegian Kindness in the not-so-distant future, but in the meantime, check out “You are Lucy” – the latest AGA/060 demo from Spaceballs. It won the oldschool demo competition at Kindergarden 2011, and a videocapture can be found at: or on youtube.

As a matter of fact, a large part of You are Lucy is based on an effect derived from Norwegian Kindness. It’s been massively optimized (through the well-proven technique of slapping one’s forehead, saying “doh!”, and reimplementing from scratch) and has several new features, but the fundamental concept is still the same as in the blue’ish twister from NK.

Separated at birth:

A playful twister showing a bit of voxelized curves in full 3D. (Norwegian Kindness)

A handsome model wearing the latest in volumetric post-processing. (You are Lucy)

April 17, 2011

Bitplanes + colour registers = A private rainbow!

Filed under: Uncategorized — Tags: , , — codingwithballs @ 13:10

Most of the parts in Norwegian Kindness use a simple, but effective, trick to achieve a bit more colour and (all-important) Freshness. As the Amiga has bitplane-based graphics, regardless of whether your actual effect code works in a “chunky” 1-byte-per-pixel mode or not, we’re able to combine the output of the main effect code (your typical cube, tunnel or zoomer) with a separate colour overlay, somewhat similar to colouring a greyscale photo. From the CPU’s point of view this operation has no performance cost, as it’s handled by the Amiga graphics hardware instead of requiring explicit (CPU) per-pixel blending. The only additional overall impact is the increased DMA/memory traffic caused by the hardware having to read more bitplane data, but exactly the same cost would have been incurred by using all 8 bitplanes (i.e. rendering 256 colors) in the effect code itself. In “modern” demos this additional overhead is also negligible compared to the time spent by the CPU on various tasks.
As a result we can go from this (effect only, rendered in 64 colours with a linear greyscale palette):

to this (no change in effect code, only 2 bitplanes of static overlay added + a new palette with 4 * 64 entries):

without any performance impact whatsoever.

The 2 bitplane overlay used to “select” between the 4 different 64 colour palettes for each pixel in this case looks like this (colouring only for illustration purposes):

In the demo this trick was applied to almost all the effects (we’ve never cared much for subtlety), with the exception of the blue’ish voxel twister, the initial greyscale clouds, and the text screens. In most cases (as above) the effect itself uses 64 colours and therefore renders into 6 bitplanes (all effects are originally rendered in chunky pixel mode and then converted to bitplanes). This leaves 2 bitplanes for the colour overlay (since the Amiga AGA chipset supports a total of 8 bitplanes), allowing us to create a fairly low-resolution gradient with 4 different palettes. Two effects (the flying anti-aliased cubes and the red-white volume/grid effect) are instead rendered in 32 colours, providing 3 bitplanes / 8 palettes and a somewhat smoother gradient.
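In index terms the trick is just bit concatenation, as this small Python sketch tries to show. It assumes the overlay occupies the two topmost bitplanes and that the palette is laid out as four consecutive 64-entry ramps; both are illustrative assumptions, and the names are made up.

```python
EFFECT_PLANES = 6   # 64-colour effect
OVERLAY_PLANES = 2  # selects one of 4 palettes


def build_palette(ramps):
    """Concatenate one 64-entry ramp per overlay value into a 256-entry palette."""
    palette = []
    for ramp in ramps:
        if len(ramp) != 1 << EFFECT_PLANES:
            raise ValueError("each ramp needs 64 entries")
        palette.extend(ramp)
    return palette


def displayed_colour(palette, effect_value, overlay_value):
    """Per pixel, the overlay bits end up as the top bits of the colour index."""
    return palette[(overlay_value << EFFECT_PLANES) | effect_value]


# Tag every entry with (ramp number, brightness) to make the look-up visible.
ramps = [[(p, v) for v in range(64)] for p in range(1 << OVERLAY_PLANES)]
palette = build_palette(ramps)
```

On the real machine nothing like `displayed_colour` ever runs on the CPU: the display hardware forms the 8-bit index from the bitplanes while fetching them, which is why the colouring comes for free.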

What could have been better?
As always things took a lot longer than expected, and even with top-notch Datastorm organizers extending deadlines until compo-start, and numerous friendly providers of Jaloviina, Aquavit and beer, there were issues we simply did not get around to fixing in time for release (as for the “final release” it was just intended to get the demo out, not change the demo noticeably).

In some parts the dithered static overlays are very noticeable, creating a visible aliasing effect and breaking the illusion of a smooth colour gradient. This could have been avoided/improved on both by making the overlays dynamic (animating them) and choosing colour gradients with less contrast between the 4/8 different palettes. It also appears that some video projectors and screens amplify the banding/aliasing effect, making it appear “worse than it actually is” (this might be caused by conversion of the 24bit Amiga colour output to 16bit RGB565, or similar gremlins in the pixel-tubes). During the development of the demo I considered toning down the colour gradient in some parts (for instance the orange-purple clouds), but eventually decided against it as I was afraid it would lose some of its effect on the big-screen (again with the subtlety…) Unfortunately I never got around to animating the overlays, which could have been done in an interrupt at 50Hz (i.e. higher than the effect refresh rate, thus creating an illusion of higher overall performance and a smoother final image). However, Ephidrena used this animated technique in their demo Dancing In Software Foam Oil (some might argue that the effect is less noteworthy, but the issues due to “steep gradients” are also a lot less noticeable).

April 15, 2011

Norwegian Kindness? Is that a joke?

Filed under: Uncategorized — Tags: , , — codingwithballs @ 17:03

Yes, it is.

But it’s also an AGA/68060 demo which was released at Datastorm 2011, where it won the Amiga demo competition.
Although the demo group Spaceballs hasn’t been completely dead in recent years, NK is still somewhat of a “comeback demo” as it’s our first production to contain new effects since around 2004 (rather than rehashed code from yesteryear mixed with blurred video borrowed from the internet). The ideas behind the different effects were thought out between 2005 and 2010, but the majority of the actual code (except for basic frameworks and the infamous ‘Texture Mapper from 1999’) was developed between September ’10 and February ’11.  Most of the effects were implemented in September and October (with testing and optimization cycles relying heavily on WinUAE, irc and email due to the lack of an actual Amiga). The actual “linking” and demomaking took place during one intense week right before Kindergarden (where we were unable to finish the demo) and 1.5 week before/during Datastorm.
As always the demo ended up as something quite different than what we originally planned, but in the end I’m fairly happy with it as an in-your-face effect show. The fact that we beat Ephidrena, the uncrowned kings of subtle design touches on Amiga (you won’t notice, but they know it’s there!), shows that good old-fashioned “flash the screen and keep hitting them with new effects”-design ™ still works with the retro-geeks of today.

The next post will deal with the colour gradient trick used in NK (and in the Eph/Kvg/RNO demo from Datastorm), before we get into some of the more effect-specific code later on.
