TempleOS,

Note that you can (and should) reply to specific posts instead of starting a new comment thread each time.

"For everybody who does not know the joys of MUL and DIV without CPU instructions, have a look:"

Anyways, I noticed your mul uses a conditional jump inside the loop that I think could be avoided with a bit of bit manipulation. Here's some untested code that eliminates the inner jump. Who knows if it'd make a difference in execution time on x86 -- this is obviously just for fun -- but it might matter on earlier processors, where branches were expensive.

mov rax, A      ; input
mov rbx, B      ; input
mov rcx, 64     ; loop counter
xor rdx, rdx    ; output
.next:
mov rsi, rax
and rsi, 0x0001 ; isolate lsb = 0 , 1
neg rsi         ; = 0 , 0xffff...
and rsi, rbx    ; = 0 , rbx
add rdx, rsi    ; add in
shr rax, 1
shl rbx, 1
loop .next
; rdx = A * B
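In C, the same branchless trick looks like this. This is just my sketch of the loop above, not a drop-in replacement for anything:

```c
#include <stdint.h>

/* Branchless shift-add multiply, mirroring the assembly above:
   the low bit of a is negated into an all-ones/all-zeros mask,
   so b is "added" unconditionally -- no jump inside the loop. */
uint64_t mul_branchless(uint64_t a, uint64_t b)
{
    uint64_t acc = 0;
    for (int i = 0; i < 64; i++) {
        uint64_t mask = (uint64_t)0 - (a & 1); /* 0 or 0xffff...ffff */
        acc += b & mask;                       /* adds b only when lsb was set */
        a >>= 1;                               /* consume one bit of a */
        b <<= 1;                               /* shift the partial product */
    }
    return acc;                                /* a * b modulo 2^64 */
}
```

A decent compiler may well turn the plain `if (a & 1) acc += b;` version into the same thing, but spelling the mask out shows the trick.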

"Taylor is slow to converge, so I've heard. I'm sure there was a standard way to do trig by 1975."

It's very easy to recognize on a plot after even just 3 Taylor terms; each additional 2 terms adds roughly another full sine cycle. It takes shape pretty quickly around the origin, but I hadn't measured the actual accuracy.

It's not very good, but Wikipedia does have a picture:

http://en.wikipedia.org/wiki/Taylor_series

Anyways, I was just curious whether you knew what algorithms the early computers actually used -- not that it matters much.

Here's another attempt at combining operations and using one less register. Anyone else give it a shot?

mov rax, A      ; input
mov rbx, B      ; input
xor rdx, rdx    ; output
.next:
xor rcx, rcx    ; rcx = 0 (also clears the carry flag)
rcr rax, 1      ; isolate lsb into carry flag
sbb rcx, 0      ; rcx = 0 - carry = 0 , 0xffff...
and rcx, rbx    ; rcx = 0 , rbx
add rdx, rcx    ; add in
shl rbx, 1      ; sets ZF once B's bits are exhausted
jnz .next
; rdx = A * B

Depending on the typical values passed in, one might test whether rax = 0 at each iteration and exit early when it is. The test adds an instruction to every iteration, but for small inputs it could save far more by skipping the remaining loop cycles.
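The early-exit idea, sketched in C for clarity (a plain conditional version just to show the loop-count saving; register-level details aside):

```c
#include <stdint.h>

/* Shift-add multiply that stops as soon as a runs out of set bits.
   Worst case is still 64 iterations, but small multipliers finish
   in just a few -- the trade-off described above. */
uint64_t mul_early_exit(uint64_t a, uint64_t b)
{
    uint64_t acc = 0;
    while (a != 0) {      /* exit early once every bit of a is consumed */
        if (a & 1)
            acc += b;     /* conditional add of the partial product */
        a >>= 1;
        b <<= 1;
    }
    return acc;           /* a * b modulo 2^64 */
}
```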

*Edited 2013-04-04 19:14 UTC*

"Anyways I was just curious if you knew what algorithms the early computers actually used, not that it matters much."

If you read up on numerical algorithms, you might find some gems. For example, I do know that lookup tables are small and very worthwhile. There are also trigonometric identities to exploit -- keep pi and pi/4 in constants somewhere, and map everything to the first quadrant. Then use the double-angle formulae and so on to make the initial value really small, do one good sine or cosine Taylor series expansion, and manipulate the result algebraically into the value you want.

There are also continued fractions, rational-function approximations and other algorithms, all interesting in their own right.

For example, the tangent or arctangent (I cannot remember which) is particularly inefficient as a Taylor expansion. A continued fraction truncated at some depth tends to give a FAR BETTER calculation.
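For the curious: arctangent is the usual example. Gauss's continued fraction is atan(x) = x / (1 + x^2/(3 + 4x^2/(5 + 9x^2/(7 + ...)))), and a truncated version can be evaluated bottom-up. A sketch, with the depth picked by eye:

```c
/* Truncated continued fraction for arctangent:
   the n-th partial numerator is n^2 * x^2, the n-th
   partial denominator is 2n + 1. Evaluated innermost-first.
   Converges far faster than the Taylor series near |x| = 1. */
double atan_cf(double x, int depth)
{
    double x2 = x * x;
    double d = 2.0 * depth + 1.0;    /* innermost denominator */
    for (int n = depth; n >= 1; n--)
        d = (2.0 * n - 1.0) + n * (double)n * x2 / d;
    return x / d;
}
```

For comparison, the Taylor series for atan(1) needs thousands of terms for a handful of digits, while this fraction at depth 20 is already close to machine precision.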

Do note, however, that all of these methods suffer from accumulated error -- the error terms add up, and they can be bigger than the actual value if you are not careful enough. One must be careful to stop further calculation once the improvement in accuracy is more than destroyed by the degradation in precision.

It is no joke that the standard sine and cosine calculations are hundreds of instructions long, if not a lot more. It is also the reason why computer games have all sorts of crazy approximation methods that run far faster, and why computer games use up more and more resources -- people just stop bothering with the older approximation methods, so resource usage creeps upward for no real gain.

Member since: 2013-04-03

The Apple and C64 had trig functions, or at least the C64 did. Taylor is slow to converge, so I've heard. I'm sure there was a standard way to do trig by 1975. The C64 ROM was 20K for everything! Interpolation and a table can't be right, but I don't know.

Just doing floating-point MUL and DIV on a 6502 is pretty damn hard, or at least tedious. You have ADD, SUB, AND, OR, NOT, SHIFTs and BRANCHes on 8 bits. You have no idea! Integer MUL and DIV take about 12 instructions.

For everybody who does not know the joys of MUL and DIV without CPU instructions, have a look:

http://www.templeos.org/Wb/Demo/Asm/DivByHand.html

http://www.templeos.org/Wb/Demo/Asm/MulByHand.html