From: CBFalconer
Date: Wed, Mar 21 2001 1:11 pm
Email: CBFalconer
Groups: comp.os.cpm
Subject: Falconer FP package part 5
"arobase, Salle multimédia" wrote:
> Falconer FP package part 5:
> ; FUNCTION.ASM
> ; ------------
> ;
> ; See FALCONER.WS4 as doc.
> ;
> ; (Retyped by Emmanuel ROCHE.)
> ;
> ;--------------------------------
> ; External routines required, see FLTARITH
> ;--------------------------------
> ;
> ; External arithmetic error trap
> ;
> extrn aerc
> ;
> ; External floating arithmetic
> ;
> extrn fadd,fdiv,fdivr
> extrn fint,fixr
> extrn fmul,frcip,fsubr
... snip ...
Those look familiar. They ARE NOT the earlier version
published in DDJ, but a later and MUCH improved version. I made
further improvements, but don't have them here. The major speed
improvement was in the integer multiply, which improved the whole
package.
The ".lvl" operations were for an assembler that kept track of
stack depth, so I could easily find silly goofs. Some of the
macros depended on the numerical internal assignments of register
values, something like A=0, B=1 ... PSW=7 (I forget the details,
so users beware). Similarly for the rtn macro, which basically
checked that I had cleaned the stack correctly.
Since the package did proper rounding, it outperformed (in
accuracy) many of the 24 bit significand systems then current.
Since it did all arithmetic in registers, it was an order of
magnitude faster than anybody else's. The roughly 4.7 digit
accuracy was more than adequate for my purposes at the time.
I have squirreled away your messages - who knows, I may want to use
it again some time. I do have paper listings of it, but nothing
machine readable. Thanks.
Just for the record, if I make no transcription errors, here is
the code for the improved multiplication:
;
; Modifications of routines by
; Jerry L. Goodrich,
; Pennsylvania State University,
; replaces previous .imul .idiv routines, faster but longer
;
; Unsigned integer multiply
; Operand range 0 to 65535, prod 0..4295*10^6 (approx)
; (dehl) := (bc) * (de)
; d,e,h,l
.imul:  push    psw
        mov     a,e     ; low order multiplier byte
        push    d       ; save high mult. byte
        call    bmult   ; do 1 byte mult
        xthl            ; save low order product, get multiplier
        push    psw     ; save hi order byte of 1st product
        mov     a,h     ; hi order mult. byte
        call    bmult   ; 2nd 1 byte mult.
        mov     d,a     ; position hi order prod. byte
        pop     psw     ; hi order byte of 1st prod
        add     h       ; update 3rd prod byte
        mov     e,a     ; to e
        jnc     imul1
        inr     d       ; propagate any carry
imul1:  mov     a,l     ; low byte, 2nd prod
        pop     h       ; low 2 bytes, 1st prod
        add     h       ; combine
        mov     h,a
        jnc     poppsw
        inx     d       ; propagate carry
; " "
; pop psw and exit
poppsw: pop     psw
        ret
;
; 8 by 16 bit unsigned multiplication
; (ahl) := (a) * (bc)
; (d) := (e) := 0
; a,f,d,e,h,l
bmult:  lxi     d,8     ; d := 0, e := bit ctr
        mov     h,d
        mov     l,d     ; clear hl
        add     a       ; 1st mult. bit to cy
        jnc     bmult2  ; 0 bit
bmult1: dad     b       ; 1 bit, add to partial product
        adc     d       ; cy to (a) rh bit
bmult2: dcr     e
        rz              ; done
        dad     h       ; left shift product
        adc     a       ; into (a), mul bit to cy
        jc      bmult1  ; bit is 1
        jmp     bmult2  ; bit is 0
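
For anyone who wants to check the arithmetic without an 8080 to hand, the two routines above can be modelled roughly in C. This is only a sketch of how the listing appears to work, not part of the package: the names bmult_model and imul_model are made up for illustration, and the model assumes the usual 8080 behaviour (dad sets only carry, dcr leaves carry alone). The trick worth noticing is that A does double duty in bmult: multiplier bits leave from the top while high-order product bits enter at the bottom.

```c
#include <stdint.h>

/* Hypothetical model of bmult: (ahl) := (a) * (bc), 8 x 16 -> 24 bits.
   'a' plays register A, 'hl' plays HL, 'cy' plays the carry flag. */
uint32_t bmult_model(uint8_t a, uint16_t bc)
{
    uint16_t hl = 0;
    unsigned cy = a >> 7;           /* add a: first multiplier bit to carry */
    int e = 8;                      /* bit counter, as in register E */

    a = (uint8_t)(a << 1);
    for (;;) {
        if (cy) {
            /* dad b, then adc d with d = 0: add the multiplicand to the
               partial product and fold the carry into the low end of A */
            uint32_t sum = (uint32_t)hl + bc;
            hl = (uint16_t)sum;
            a = (uint8_t)(a + (sum >> 16));
        }
        if (--e == 0)               /* dcr e / rz */
            break;
        /* dad h: shift HL left; adc a: shift A left, taking HL's carry in
           at the bottom and dropping the next multiplier bit off the top */
        unsigned hl_cy = hl >> 15;
        hl = (uint16_t)(hl << 1);
        cy = a >> 7;
        a = (uint8_t)((a << 1) | hl_cy);
    }
    return ((uint32_t)a << 16) | hl;
}

/* Hypothetical model of .imul: (dehl) := (bc) * (de), built from two
   8 x 16 products, the second one weighted by 256. */
uint32_t imul_model(uint16_t bc, uint16_t de)
{
    uint32_t p1 = bmult_model((uint8_t)(de & 0xFF), bc);   /* bc * low byte  */
    uint32_t p2 = bmult_model((uint8_t)(de >> 8), bc);     /* bc * high byte */

    /* third product byte: high byte of p1 plus middle byte of p2 */
    uint8_t d = (uint8_t)(p2 >> 16);
    unsigned s = (unsigned)((p1 >> 16) & 0xFF) + (unsigned)((p2 >> 8) & 0xFF);
    uint8_t e = (uint8_t)s;
    if (s > 0xFF)
        d++;                        /* inr d: propagate any carry */

    /* second product byte: low byte of p2 plus middle byte of p1 */
    uint16_t de_out = (uint16_t)((d << 8) | e);
    s = (unsigned)(p2 & 0xFF) + (unsigned)((p1 >> 8) & 0xFF);
    uint8_t h = (uint8_t)s;
    if (s > 0xFF)
        de_out++;                   /* inx d: propagate carry */

    uint8_t l = (uint8_t)p1;        /* lowest byte comes straight from p1 */
    return ((uint32_t)de_out << 16) | ((uint32_t)h << 8) | l;
}
```

With this model, imul_model(65535, 65535) gives 4294836225, which agrees with the "0..4295*10^6 (approx)" product range noted in the listing's header comment.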
--
Chuck F