# Fastest possible multiplication routine?

Página 1/7
| 2 | 3 | 4 | 5 | 6

Hi,

today I stumbled over a routine I developed for my old 3D demos and which I uploaded to the CPCWiki some time ago:

http://www.cpcwiki.com/index.php/Programming:Ultrafast_Multiplication

As this is 100% Z80 and not platform related it maybe interesting for you.
AFAIK it's the fastest way of multiplication which is possible on Z80 for this precision (16bit x 8/7bit).
I wonder, if it could be even done faster, and how much faster the R800 does with its inbuilt MUL-commands?

CU,
Prodatron

Login sesión o register para postear comentarios

The code is a bit hard to read. How ultra fast is it?

Sometimes I like to use ":" and put multiple commands in one line
The whole thing uses (more or less) ONE routine for EACH rigth-side value, that's it. The speed depends on the rigth-side value (8/7 bit) as different routines are selected then.

Do you have some cpu cycle numbers? worst case/best case/ average...

Actually I guess I can do that myself. Its not that hard...

By doing shifts on DE (or BC) and accumulate on HL, you'll be faster in some cases

MUL073 for example can be implemented like this (input args may need to be modified to fit your code):

```MUL073:
ld      h,d
ld      l,e
sla     e
rl      d
sla     e
rl      d
sla     e
rl      d
sla     e
rl      d
ld      a,h
ret
```

Saves 26 cycles if I did the math correct.

Also, why do you use:

```        PUSH BC
RET
```

```        JP (HL)
```

I did a similar routine not too long ago. Don't have it on this computer but the I think I did the table lookup like this:

```Mul:
ld      h,MulTab
ld      b,(hl)
inc     l
ld      h,(hl)
ld      h,b
jp      (hl)
```

Saves another 20 cycles

Anyone with a real turbo-r to tell us about the r800 mul routines?Actually i am interested to buy a turbo-r by the way.How much would a good price be? 200-250 euros?

Sorry for double post

I think some of the cases can be optimized quite a bit. here are some examples.
These particular cases are maybe 10-30% faster. But I think most of the cases in your routine are close to optimal. But if say 20% of the cases can be made significantly faster, I think it will be noticeable on the overall performance.

```MUL007:
ld      d,h
ld      e,l
ld      a,h
ret

MUL011:
ld      d,h
ld      e,l
ld      a,h
ret

MUL063:
ld      d,h
ld      e,l
rr      h
rr      l
rr      h
rr      l
xor     a
ld      h,l
ld      l,a
sbc     hl,de
ld      a,h
ret

MUL064:
rr      h
rr      l
rr      h
ld      a,l
rrca
ret

MUL065:
ld      d,h
ld      e,l
rr      h
rr      l
rr      h
rr      l
ld      h,l
ld      l,0
ld      a,h
ret

MUL073:
ld      d,h
ld      e,l
sla     e
rl      d
sla     e
rl      d
sla     e
rl      d
sla     e
rl      d
ld      a,h
ret

MUL127:
ld      d,h
ld      e,l
rr      h
rr      l
xor     a
ld      h,l
ld      l,a
sbc     hl,de
ld      a,h
ret
```

Some of the versions I put here may be wrong. I'll try to find the real implementation and post it here

Página 1/7
| 2 | 3 | 4 | 5 | 6