Welcome to the official Attitude website - Europe's #1 C64 scene disk magazine
On Speedcode
by Puterman/Fairlight

If you want to impress anyone besides the worst lamers with your advanced code hacking abilities, you have to optimize your code. And most of the time, when you optimize something for speed on the C-64, you'll use speedcode, because there's no other way to make it as fast. So basically, using speedcode is a necessary skill for any wannabe demo coder. (In case you're uneducated enough not to know what speedcode means: it means unrolled loops, and if you don't know what that means, check the examples.)

There are some disadvantages, of course. First of all, it may "waste" lots of memory. Second, you'll have to write code to generate it (unless you type it all in, which is universally stupid, of course). Then there's the usual set of programming problems, like using it in the right place and so on, but I won't go into that. You'll have to do some work yourself. Although it might not be very trendy right now, you can actually learn a lot by writing code (unlike people who expect the famous VIC Article and coding articles in "Attitude" to give them all the answers).

I'll start with a very simple (and common) example to get you going: code to clear a charset. If the charset is at $2000, the code should look like this:

lda #$00
sta $2000
sta $2001
sta $2002
[...]
sta $27fe
sta $27ff

Here's some code to generate the sta's (I'll leave the generation of the lda as an exercise to the reader :)). The generated speedcode is placed at $2800.

lda #$00 ; address of speedcode
ldx #$28
sta $02
stx $03
loop ldy #$00
lda #$8d ; opcode for sta $xxxx
sta ($02),y
lo lda #$00 ; address of charset lo
iny
sta ($02),y
hi lda #$20 ; address of charset hi
iny
sta ($02),y

lda $02 ; advance speedcode pointer by 3 bytes
clc
adc #$03
sta $02
bcc skip
inc $03

skip lda lo+1 ; increase charset address
clc
adc #$01
sta lo+1
lda hi+1
adc #$00
sta hi+1
cmp #$28 ; done when the charset address reaches $2800
bne loop

This will result in 3*$0800=$1800 bytes of speedcode. Running it will take $2000 cycles (4*$0800=$2000=8192). Don't forget to add an rts to the end of it, so that you can jsr to $2800 and regain control after the speedcode has finished executing.
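For comparison, here is a minimal sketch of the same generator written in C, as you might do when pre-generating the speedcode on another machine. The function name gen_clear and its parameters are made up for this illustration. Unlike the assembly generator above, it also emits the lda and the closing rts, so the result is 3 bytes longer than the bare sta's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_LDA_IMM = 0xa9, OP_STA_ABS = 0x8d, OP_RTS = 0x60 };

/* Emit "lda #value", one "sta addr" per byte of the target area, then "rts".
   Returns the number of bytes written to out.
   (gen_clear is a hypothetical helper, not taken from any real tool.) */
size_t gen_clear(uint8_t *out, uint16_t target, uint16_t len, uint8_t value)
{
    size_t n = 0;

    assert(out != NULL);
    assert((uint32_t)target + len <= 0x10000u); /* stay inside 64K */

    out[n++] = OP_LDA_IMM;
    out[n++] = value;
    for (uint32_t i = 0; i < len; i++) {
        uint16_t addr = (uint16_t)(target + i);
        out[n++] = OP_STA_ABS;
        out[n++] = (uint8_t)(addr & 0xff); /* low byte first */
        out[n++] = (uint8_t)(addr >> 8);   /* then high byte */
    }
    out[n++] = OP_RTS;
    return n;
}
```

Calling gen_clear(buf, 0x2000, 0x0800, 0x00) with a buffer of at least 3*$0800+3 bytes fills it with the clear routine for the charset at $2000.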

Now, that was simple, about as simple as it gets actually. As you can see, it needs quite a lot of memory, and if your memory is running short, you should of course use something like this instead:

lda #$00
ldx #$7f
loop sta $2000,x
sta $2080,x
sta $2100,x
[...]
sta $2700,x
sta $2780,x
dex
bpl loop

That one doesn't need much memory, and will complete in about 10880 cycles (128 passes of sixteen sta $xxxx,x plus dex and bpl; unlike sta $xxxx, the indexed store always takes 5 cycles, not 4), so it's about a third slower than the speedcode, while you'll save quite a lot of memory. "Wow-wow-wow, man", you say (or something like that, possibly in some exotic language), "2700 cycles gained and $1800 bytes wasted, sounds like this speedcode thing sucks!" Well, that depends on what you want to achieve, and of course, this is the simplest possible example. Sometimes there are many more cycles to save, because you don't have to reserve a register as loop counter, update addresses etc. And some routines just aren't possible without speedcode, for example SHFLI. You really don't have 5 cycles to spare on each line if you're going to code a kickass raster effect.
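If you want to check the arithmetic yourself, the cycle tally for both versions can be sketched in a few lines of C. The function names are invented for this example, and the figures assume standard 6502 timings: 4 cycles for sta $xxxx, 5 for sta $xxxx,x, 2 for dex, and 3 for a taken branch that doesn't cross a page boundary.

```c
#include <assert.h>

/* 6502 cycle counts for the instructions used in the two clear routines. */
enum {
    CYC_STA_ABS          = 4, /* sta $xxxx */
    CYC_STA_ABS_X        = 5, /* sta $xxxx,x (stores always pay the extra cycle) */
    CYC_DEX              = 2,
    CYC_BRANCH_TAKEN     = 3, /* assumes no page crossing */
    CYC_BRANCH_NOT_TAKEN = 2
};

/* Fully unrolled version: one sta $xxxx per byte. */
unsigned long unrolled_cycles(unsigned bytes)
{
    return (unsigned long)bytes * CYC_STA_ABS;
}

/* Loop version: stores_per_pass indexed stores, then dex and bpl, per pass. */
unsigned long looped_cycles(unsigned stores_per_pass, unsigned passes)
{
    unsigned long per_pass;

    assert(passes > 0);
    per_pass = (unsigned long)stores_per_pass * CYC_STA_ABS_X
             + CYC_DEX + CYC_BRANCH_TAKEN;
    /* The final branch falls through, which is one cycle cheaper. */
    return per_pass * passes - (CYC_BRANCH_TAKEN - CYC_BRANCH_NOT_TAKEN);
}
```

Here unrolled_cycles(0x800) covers the speedcode and looped_cycles(16, 128) covers the loop with sixteen stores per pass; neither includes the initial lda/ldx or a final rts.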

So, I guess we should move on to another example that shows the advantages of speedcode a bit more clearly. Let's say you have a loop where the body looks something like this:

iny
ldx tab,y
lda tab2,x
sta tab3,x

If we're going to put this into a loop, we'll run out of registers to use as loop index, since both X and Y are already taken. So we'll have to save and restore a register each time we run the loop code. It'd look something like this:

ldx #$00
loop stx xsave+1
iny
ldx tab,y
lda tab2,x
sta tab3,x
xsave ldx #$00
dex
bne loop

The body of the loop will be executed $100 times, and the loop overhead (stx, ldx #, dex and a taken bne: 4+2+2+3) is 11 cycles each time, giving a total of $0b00 (=2816) cycles, while the actual body of the code runs in 15 cycles (2+4+4+5, assuming the indexed loads don't cross a page boundary), giving a total of $0f00 (=3840) cycles. The corresponding speedcode would look like this:

iny ; 0
ldx tab,y
lda tab2,x
sta tab3,x
iny ; 1
ldx tab,y
lda tab2,x
sta tab3,x
[...]
iny ; $ff
ldx tab,y
lda tab2,x
sta tab3,x

So with the speedcode we don't get any overhead at all, so we save a bit more than 40% of the cycles that the loop version needed. Of course, we're still wasting some space, but saving around 2800 cycles every frame should make it worth it in quite a few cases.
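Generating this kind of speedcode is just a matter of writing the same few opcodes over and over. A rough sketch in C could look like the following; gen_unrolled and emit_abs are names invented for this example, and the table addresses are whatever you pass in, since the article doesn't say where tab, tab2 and tab3 live.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 3-byte instruction (opcode + 16-bit address, low byte first). */
static size_t emit_abs(uint8_t *out, size_t n, uint8_t opcode, uint16_t addr)
{
    out[n++] = opcode;
    out[n++] = (uint8_t)(addr & 0xff);
    out[n++] = (uint8_t)(addr >> 8);
    return n;
}

/* Emit `copies` repetitions of the loop body, followed by rts.
   Returns the number of bytes written (10 per copy, plus 1). */
size_t gen_unrolled(uint8_t *out, unsigned copies,
                    uint16_t tab, uint16_t tab2, uint16_t tab3)
{
    size_t n = 0;

    assert(out != NULL);
    for (unsigned i = 0; i < copies; i++) {
        out[n++] = 0xc8;                  /* iny */
        n = emit_abs(out, n, 0xbe, tab);  /* ldx tab,y */
        n = emit_abs(out, n, 0xbd, tab2); /* lda tab2,x */
        n = emit_abs(out, n, 0x9d, tab3); /* sta tab3,x */
    }
    out[n++] = 0x60;                      /* rts */
    return n;
}
```

For $100 copies this comes to $0a01 bytes, which gives you an idea of the space you trade for the saved cycles.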

The two examples above are very simple compared to some real-world stuff, so they're quickly and easily generated by a short piece of assembly code. If you need to generate some more complicated code, and you're as lazy as me, you might consider using a high-level language to pre-generate the speedcode. This might also be necessary if it takes a lot of time to generate the code. I like C, and I think it fits this purpose very well, because it's weakly typed (right, so computer scientists don't like it, because it doesn't protect them from some bugs like e.g. Ada or SML do, but then again, this is about demo coding, not about proving the correctness of concurrent systems or something like that) and it has bitwise operators, thus leaving the control of the code to you, instead of getting in your way, like some languages like to do (end of political section).

Pre-generated speedcode will of course mean that your stuff takes longer to load, and as always, it's your job to decide if it's worth it or not.
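To get pre-generated speedcode onto the C-64, one common approach is to save it as a PRG file, which is simply the data preceded by a two-byte load address (low byte first). The sketch below builds such an image in memory; build_prg is a name made up for this example, and how you link or load the result is up to you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a PRG image: 2-byte load address (low, high) followed by the code.
   out must have room for len + 2 bytes. Returns the image size.
   (build_prg is a hypothetical helper for illustration only.) */
size_t build_prg(uint8_t *out, uint16_t load_addr,
                 const uint8_t *code, size_t len)
{
    assert(out != NULL && code != NULL);
    assert((size_t)load_addr + len <= 0x10000u); /* must fit below $10000 */

    out[0] = (uint8_t)(load_addr & 0xff);
    out[1] = (uint8_t)(load_addr >> 8);
    memcpy(out + 2, code, len);
    return len + 2;
}
```

The returned buffer can be written to disk with a plain fwrite; every byte in it is a byte that has to be loaded, which is the cost mentioned above.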

In case you want some real-world examples, you might want to check out the sideborder plotter (the rotating hand) in "Emanation Machine". The speedcode is at $1000, and the main code of the routine starts at $0803, so somewhere around there you'll find the code that generates it. You'll find an example of pre-generated speedcode in that yellow rotating thingy part in "We/Laser". I'll leave it as an exercise to the interested student to find it and analyze it (that's another way of saying that I'm too lazy to check where it's located). If you want more examples, check out the code for any decent part, there's bound to be some speedcode somewhere in there.

The deadline is near, and so is the end of this article. It'd be interesting to get some feedback on these coding articles, because I've written quite a few by now, and I don't have a clue if anyone at all reads them. So please send me an email and tell me what you think. The address is uxm165t@tninet.se. See you in the next issue of "Attitude"!

PUTERMAN/FAIRLIGHT
