Findpixels

Findpixels... some of the prettiest blobs of code around. They convert an x and y position into a pointer to the screen byte plus a bitmask for the pixel inside it (assuming a screen width of 128 pixels and a screen at $FC00, i.e. the TI-85 and TI-86). And why try to make your own when it probably won't even be half as good as one of these? Choose your flavour and enjoy.
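
To make "pointer and bitmask" a bit more concrete, here is a minimal sketch of how the result is typically used. It assumes a routine with the (b,c) to hl:a convention from the table below (x in b, y in c). The labels PlotPixel, ErasePixel, FlipPixel and TestPixel are made up for this example and don't come from any of the routines.

```z80
; address = $FC00 + 16*y + x/8     (16 bytes per 128-pixel row)
; bitmask = $80 >> (x & 7)         (leftmost pixel is bit 7)
; e.g. x=10, y=3  ->  hl=$FC31, a=%00100000

PlotPixel:              ; set the pixel at (b,c)
	call FindPixel
	or (hl)
	ld (hl),a
	ret

ErasePixel:             ; clear the pixel at (b,c)
	call FindPixel
	cpl                 ; invert mask: every bit set except ours
	and (hl)
	ld (hl),a
	ret

FlipPixel:              ; toggle the pixel at (b,c)
	call FindPixel
	xor (hl)
	ld (hl),a
	ret

TestPixel:              ; z flag set if the pixel at (b,c) is off
	call FindPixel
	and (hl)
	ret
```

For a routine with a different input convention, such as (d,e) or (c,b), only the registers you load before the call change.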

Routine            time (cycles)  size (bytes)  registers       destroys  notes
ÜBERoutine         203 / 726      24 / 23       (d,e) to hl:a   b         smallest; b=0
CLEM's #1          131            28            (b,c) to hl:a   -         perfect balance
CLEM's #2          127            35            (b,c) to hl:a   bc        -
ZAPO               105            40            (c,b) to hl:a   de        needs mem at $8700
Eble-Yopp-Yopp..   122            34+           (e,d) to hl:a   c         code alignment
Edited Eble-Yopp   127            35            (b,c) to hl:a   de        hl minus $FC00
Yopp's             1391           34            (b,c) to hl:a   bde       don't use!

Yopp's

Right. Let's start with this one... Anybody's first findpixel should be better than this. What does he do? He multiplies by adding 16 y times. One of the worst ideas imaginable (at least add y 16 times, but don't do that either.) Close your eyes and scroll down.

;---------------------------------------------------------------------
;  Routine by James Yopp
;
;  134-2647 cycles - 34 bytes
;  (b,c) to hl:a - destroys: bde
;---------------------------------------------------------------------

FindPixel:  ; ©1997 James Yopp. If you use it, leave this comment in the source
	push bc
	ld h,$FC
	srl b
	srl b
	srl b
	ld l,b
	ld b,c
	ld de,16
	xor a
BLoop:
	add hl,de       ; runs before the test below, so 16 is added y+1 times
	cp b
	jr z,XitLoop
	dec b
	jr BLoop
XitLoop:
	pop bc
	ld a,b
	and %00000111
	ld b,a
	inc b
	ld a,%00000001
ALoop:
	rrca
	djnz ALoop
	ret
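
For comparison, the row offset doesn't need a loop at all: y*16 is just four doublings. A minimal sketch of that idea follows. The label RowAddress is hypothetical, and this is not one of the routines in this article, just the plain shift version of the multiplication.

```z80
RowAddress:             ; in: c = y (0-63)   out: hl = $FC00 + 16*y
	ld h,0
	ld l,c              ; hl = y
	add hl,hl           ; hl = 2*y
	add hl,hl           ; hl = 4*y
	add hl,hl           ; hl = 8*y
	add hl,hl           ; hl = 16*y
	ld de,$FC00
	add hl,de           ; hl = start of row y
	ret
```

The cost is the same for every y. The faster routines below go one step further: they load h with $3F and l with 4*y, so that the last two doublings produce the $FC00 base for free.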

Eble-Yopp-Yopp...

Onwards, to another Yopp routine, but the addition of a few Ebles makes it a lot better! As far as I know, this one's the most commonly used, probably because it looks so cool with the .org displacement thingy. If only users knew what it means. It means the FP_Bits table has to start on an 8-byte boundary, so unless the routine sits at exactly the right place in your program, it'll grow in size because of the padding. Not very handy. It is fast though. Only the ZAPO routine is faster, but this one is easier. Conclusion: for the fastest "easy" routine take this one. Just check the placement.

;---------------------------------------------------------------------
;  Eble-Yopp-Yopp-Eble-Eble-Eble-Yopp-Eble Faster FindPixel Routine
;
;  122 cycles - 34-41 bytes
;  (e,d) to hl:a - destroys: c
;
;  118 cycles - 33-40 bytes
;  (e,d) to hl:c - destroys: a
;---------------------------------------------------------------------

FindPixel:
	ld hl,FP_Bits
	ld a,e
	and $07         ; bit offset
	or l            ; add bit offset to HL
	ld l,a
	ld c,(hl)       ; c=bitmask
	ld h,$3F
	ld a,d
	add a,a
	add a,a
	ld l,a          ; l=y*4
	ld a,e
	rra
	add hl,hl
	rra
	add hl,hl       ; hl=$fc00+y*16
	rra             ; a=x/8
	or l            ; add x
	ld l,a          ; video-byte
	ld a,c          ; bitmask (can be removed)
	ret

	.org (($+8) & 0FFF8h)   ; align FP_Bits on the next 8-byte boundary
FP_Bits:
	.db $80,$40,$20,$10,$08,$04,$02,$01
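
The alignment matters because "or l" only behaves like an addition when the low three bits of l are zero. Here is a sketch of what goes wrong otherwise, plus one possible alignment-free replacement for the first lines of the routine. The addresses in the comments are invented for the example, the FP_NoCarry label is hypothetical, and the replacement is an illustration rather than part of the original routine.

```z80
; FP_Bits at $9A38 (aligned):   l=$38, x&7=5  ->  $38 or 5 = $3D = $38+5   ok
; FP_Bits at $9A3C (unaligned): l=$3C, x&7=5  ->  $3C or 5 = $3D, but $3C+5 = $41

	ld hl,FP_Bits
	ld a,e
	and $07             ; bit offset
	add a,l             ; a real addition instead of or
	ld l,a
	jr nc,FP_NoCarry
	inc h               ; carry into the high byte
FP_NoCarry:
	ld c,(hl)           ; c=bitmask
```

This costs three extra bytes and a few cycles, so aligning the table is still the better deal whenever you can manage it.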

Edited Eble-Yopp

Edited Eble-Yopp? Mostly rebuilt, I'd say. A very nice routine built around a great idea: it uses the little-known instruction rld. Neat. Too bad it didn't quite work out though :( Contrary to what the author may believe, it's actually slower and larger than the original (not by much, but still.) I wouldn't use it, but it sure works cool ;)
Warning: hl will not point to the screen, but will hold the offset into the screen! You'll have to add $FC00 to use it like any other findpixel.

;---------------------------------------------------------------------
;  An edited EBLE-YOPP Routine
;
;  127 cycles - 35 bytes
;  (b,c) to hl:a - destroys: de
;---------------------------------------------------------------------

Find_Pixel:
	ld de,FP_Bits
	ld a,b
	and %00000111
	add a,e
	ld e,a
	adc a,d
	sub e
	ld d,a                ; (de)=bitmask

	ld hl,FP_RLD
	ld (hl),c             ; FP_RLD=y
	ld a,b                ; a=x
	rra
	rra
	rra                   ;   x/8
	rld                   ; a(hl)=16y+x/8
	and %00000011
	ld l,(hl)
	ld h,a
	ld a,(de)
	ret

FP_Bits:
	.db 128,64,32,16,8,4,2,1
FP_RLD:
	.db 0                 ; temp byte needed, but doesn't have to be in program
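
To see what rld does here, and how to get from the offset to a real screen address, a small worked example may help. The values x=100, y=43 are picked arbitrarily; the code is a sketch of one way to use the routine, not something from the original author.

```z80
; rld trace for x=100, y=43 ($2B):
;   before:  a low nibble = x/8 = $C     (FP_RLD) = $2B
;   rld:     a low nibble = $2           (FP_RLD) = $BC
;   result:  hl = $02BC = 700 = 16*43 + 12

	ld b,100            ; x
	ld c,43             ; y
	call Find_Pixel     ; hl = $02BC (offset), a = bitmask
	ld de,$FC00         ; de is destroyed by Find_Pixel anyway
	add hl,de           ; hl = $FEBC, a real screen address
	or (hl)
	ld (hl),a           ; set the pixel
```

add hl,de leaves a alone, so the bitmask survives the fix-up.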

ZAPO

ZAPO's long gone now, but they left us this great findpixel. Guess what? It's the fastest one yet! Only 105 cycles. Furthermore, the routine itself is also the smallest one I know of. One problem: it requires you to run some initialization code before use. This makes your program 19 bytes larger, and it'll also use a little memory at $8700 (the 8-byte bitmask table). If you need a really fast findpixel, this one's the best. Otherwise, better not.

;---------------------------------------------------------------------
;  The ZAPO even faster findpixel
;  by ~Code7~
;
;  105 cycles - 21 bytes
;  (c,b) to hl:a - destroys: de
;
;  101 cycles - 20 bytes
;  (c,a) to hl:a - destroys: de
;---------------------------------------------------------------------

FindPixel:
	ld a,b
FindPixelShort:
	ld h,$3f
	add a,a
	add a,a
	ld l,a     ; hl=$3f00+4*y
	ld a,c
	rra
	add hl,hl
	rra
	add hl,hl  ; hl=$fc00+16*y
	rra        ; a=x/8
	add a,l
	ld l,a
	ld d,$87
	ld a,c
	and d      ; a=x&7 (d=$87=%10000111 and x<128)
	ld e,a
	ld a,(de)
	ret

;  the bitmask table must be placed at $8700;
;  run this once before the first FindPixel call
;
;  163 cycles  19 bytes    destroys: bcdehl

InitializeFP:
	ld de,$8700
	ld hl,Offset
	ld bc,8
	ldir
	ret
Offset:
	.db $80,$40,$20,$10,$8,$4,$2,$1
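
Putting the two pieces together, a program using this routine might start roughly like this. It's a minimal sketch: the Start label and the coordinates are made up, and it assumes nothing else writes to $8700-$8707 between the initialization and the last FindPixel call.

```z80
Start:
	call InitializeFP   ; once, before the first FindPixel call

	ld c,20             ; x
	ld b,10             ; y
	call FindPixel      ; hl = $FCA2, a = %00001000
	or (hl)
	ld (hl),a           ; set pixel (20,10)
	ret
```

The speed comes from the table sitting at an address whose high byte doubles as the mask: "and d" with d=$87 both isolates x&7 and leaves de pointing straight at the right table entry.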

Clem #2

Nothing too special about this one. Good optimizing, nice routine, but there are better ones out there.

;---------------------------------------------------------------------
;  CLEM's FINDPIXEL
;  by Clem
;
;  127 cycles - 35 bytes
;  (b,c) to hl:a - destroys: bc
;---------------------------------------------------------------------

FindPixel:
	ld h,63
	ld a,c
	add a,a
	add a,a
	ld l,a    ; hl=$3f00+4*y
	ld a,b
	rra
	add hl,hl
	rra
	add hl,hl ; hl=$fc00+16*y
	rra       ; a=x/8
	or l
	ld l,a
	ld a,b
	and 7
	ld bc,FP_Bits
	add a,c
	ld c,a
	adc a,b
	sub c
	ld b,a
	ld a,(bc)
	ret
FP_Bits:
	.db $80,$40,$20,$10,$08,$04,$02,$01
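
The five instructions after ld bc,FP_Bits are a compact way of adding the 8-bit value in a to the 16-bit pointer in bc without needing a third register. Here is a worked trace, assuming (purely for illustration) that FP_Bits sits at $9AFE so that the addition has to carry into the high byte:

```z80
	                    ; a=$05, bc=$9AFE
	add a,c             ; a=$03, carry set   ($05+$FE=$103)
	ld c,a              ; c=$03              (new low byte, flags untouched)
	adc a,b             ; a=$03+$9A+1=$9E
	sub c               ; a=$9E-$03=$9B      (takes the low byte back out)
	ld b,a              ; bc=$9B03 = $9AFE+5
```

Because this is a real 16-bit addition, the table can sit anywhere, unlike the one in Eble-Yopp-Yopp.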

Clem #1

Same author, different routine. This one is special. At 28 bytes it's one of the smallest (only the Snake86 one is smaller), and it's still quite fast (7% slower than Eble-Yopp.) And, bonus, no registers are destroyed!! In a lot of cases this one would be ideal.

;---------------------------------------------------------------------
;  CLEM's FIND_PIXEL
;  by Clem
;
;  131 cycles - 28 bytes
;  (b,c) to hl:a - destroys: none
;---------------------------------------------------------------------

FindPixel:
	ld h,63
	ld a,c
	add a,a
	add a,a
	ld l,a    ; hl=$3f00+4*y
	ld a,b
	rra
	add hl,hl
	rra
	add hl,hl ; hl=$fc00+16*y
	rra       ; a=x/8
	or l
	ld l,a
	ld a,b
	and 7
	cpl
	rlca
	rlca
	rlca
	ld (FP_Bit),a
	xor a
FP_Bit =$+1
	set 0,a
	ret
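
The trick in the tail of this routine is that it builds the opcode of a "set n,a" instruction and patches it into itself. The second byte of set n,a is %11nnn111, which is exactly what cpl plus three rlca's produce. Below is an annotated copy of those last lines, traced for x&7 = 2; it is the same code as above with comments added, not a new routine.

```z80
; trace for x&7 = 2 (we want bitmask %00100000, i.e. bit 5):
	and 7               ; a=%00000010
	cpl                 ; a=%11111101   low three bits = 7-2 = 5
	rlca
	rlca
	rlca                ; a=%11101111 = $EF
	ld (FP_Bit),a       ; patch the second opcode byte of the set below
	xor a               ; a=0
FP_Bit =$+1
	set 0,a             ; by now the bytes here read $CB,$EF = set 5,a
	ret                 ; a=%00100000
```

Since the routine writes into its own code, it only works when it runs from writable memory.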

Über

And last but not least, the smallest findpixel ever! Look at it: only 24 bytes! And with an average of 203 cycles it's still quite usable. For most progs, which don't need that speed anyhow, this one's your choice. And afterwards, b will be 0, so you can use that too to shave off another byte or so (think ld c,16 \ add hl,bc.)

I just called it the über routine because it wasn't really released by anyone. I took it out of Snake86 (unreleased project by David Phillips, Jonah Cohen and Ben Mickle). The byte-part seems to come straight out of the Eble-Yopp-Yopp routine, while we've seen the bit-part before in the routine by just Yopp (so that wasn't all bad after all ;). Those combined give a kickass piece 'a code of just 25 bytes. When I showed it to my friend Free Bird he thought of a way to shave off another byte (I won't let him have all the credit though. It's simply a routine by and for everybody :))

Also see the comments in the code for how to shave off yet another byte, courtesy of Xavier van Rijnsoever. It does make it a lot slower, but in most cases it'll still do fine (still only 115µs on average).

So voici! Beat this:

;---------------------------------------------------------------------
;  Ueber-FindPixel (Snake86 + Free Bird)
;  by Friends of the 86
;
;  144-263 cycles - 24 bytes
;  (d,e) to hl:a - destroys: b
;
;  140-4475 cycles (726 avg) - 23 bytes
;  (d,e) to hl:a - destroys: b
;---------------------------------------------------------------------

FindPixel:
	ld l,e
	add hl,hl
	add hl,hl
	ld h,$fc/4  ; hl=$3f00+4*y
	ld a,d
	and 7
	inc a       ; remove this and replace the line below for small version
	ld b,a
	ld a,d
	rra
	add hl,hl
	rra
	add hl,hl   ; hl=$fc00+16*y
	rra         ; a=x/8
	or l
	ld l,a
	ld a,1      ; change to ld a,128 for small version
PixelLoop:
	rrca
	djnz PixelLoop
	ret
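
As an illustration of the "b will be 0" bonus, here is a small sketch that draws two pixels on top of each other. The label TwoPixels and the coordinates are made up for this example.

```z80
TwoPixels:
	ld d,20             ; x
	ld e,10             ; y
	call FindPixel      ; hl -> screen byte, a = bitmask, b = 0
	ld c,16             ; bc = 16, since b is already 0
	ld d,a              ; keep the bitmask
	or (hl)
	ld (hl),a           ; pixel (20,10)
	add hl,bc           ; one row down
	ld a,d
	or (hl)
	ld (hl),a           ; pixel (20,11)
	ret
```

With any other routine you'd need ld bc,16 here, which is one byte longer than ld c,16.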