author Rangi <remy.oukaour+rangi42@gmail.com> 2019-02-18 12:20:10 -0500
committer Rangi <remy.oukaour+rangi42@gmail.com> 2019-02-18 12:20:10 -0500
commit 5c7169948cf27a98c1dd49cf29818e352f493bbc (patch)
tree cf4235d5251f284b2b9bdb4cd0c7775d35bd26a9
parent b17fc465e7f144715466e1afde66614e0b6a21f1 (diff)
More
-rw-r--r-- Optimizing-assembly-code.md 294
1 files changed, 258 insertions, 36 deletions
diff --git a/Optimizing-assembly-code.md b/Optimizing-assembly-code.md
index 3e26854..9ee065a 100644
--- a/Optimizing-assembly-code.md
+++ b/Optimizing-assembly-code.md
@@ -6,26 +6,34 @@ Most of these tricks come from either [Jeff's GB Assembly Code Tips v1.0](http:/
## Contents
- [Registers](#registers)
- - [Set `a` to zero](#set-a-to-zero)
+ - [Set `a` to 0](#set-a-to-0)
- [Set `a` to some constant minus `a`](#set-a-to-some-constant-minus-a)
+ - [Invert the bits of `a`](#invert-the-bits-of-a)
+ - [Multiply `hl` by 2](#multiply-hl-by-2)
- [Add `a` to a 16-bit register](#add-a-to-a-16-bit-register)
- [Loading from an offset to `hl`](#loading-from-an-offset-to-hl)
- [Exchanging two 16-bit registers](#exchanging-two-16-bit-registers)
-- [Branching](#branching)
- - [Compare `a` to zero](#compare-a-to-zero)
+ - [Loading a constant into `[hl]`](#loading-a-constant-into-hl)
+ - [Loading two constants into a register pair](#loading-two-constants-into-a-register-pair)
+ - [Loading a constant into `[hl]` and incrementing or decrementing `hl`](#loading-a-constant-into-hl-and-incrementing-or-decrementing-hl)
+ - [Incrementing or decrementing `[hl]`](#incrementing-or-decrementing-hl)
+- [Branching (control flow)](#branching-control-flow)
+ - [Compare `a` to 0](#compare-a-to-0)
- [Compare `a` to 1](#compare-a-to-1)
- [Compare `a` to 255](#compare-a-to-255)
- - [Chaining comparisons](#chaining-comparisons)
-- [Functions](#functions)
+- [Subroutines (functions)](#subroutines-functions)
- [Tail call optimization](#tail-call-optimization)
- [Calling `hl`](#calling-hl)
- [Inlining](#inlining)
+ - [Fallthrough](#fallthrough)
+- [Jump and lookup tables](#jump-and-lookup-tables)
+ - [Chaining comparisons](#chaining-comparisons)
## Registers
-### Set `a` to zero
+### Set `a` to 0
Don't do:
@@ -75,6 +83,41 @@ But do:
```
+### Invert the bits of `a`
+
+Don't do:
+
+```asm
+ xor $ff ; 2 bytes, 2 cycles
+```
+
+But do:
+
+```asm
+ cpl ; 1 byte, 1 cycle
+```
+
+
+### Multiply `hl` by 2
+
+Don't do:
+
+```asm
+ ; 4 bytes, 4 cycles
+ sla l
+ rl h
+```
+
+But do:
+
+```asm
+ ; 1 byte, 2 cycles
+ add hl, hl
+```
+
+(The `SpeciesItemBoost` routine in [engine/battle/effect_commands.asm](../blob/master/engine/battle/effect_commands.asm) actually uses the slower `sla l` / `rl h` sequence!)
+
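The same instruction chains for larger powers of two. A minimal sketch, assuming the product still fits in 16 bits (each `add hl, hl` sets the carry flag if bit 15 overflowed, and the overflowed bit is otherwise lost):

```asm
	; 3 bytes, 6 cycles
	add hl, hl ; hl = hl * 2
	add hl, hl ; hl = hl * 4
	add hl, hl ; hl = hl * 8
```
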
+
### Add `a` to a 16-bit register
(The example uses `hl`, but `bc` or `de` would also work.)
@@ -113,7 +156,7 @@ But do:
.no_carry:
```
-or better (doesn't require a label):
+Or better (doesn't require a label):
```asm
; 5 bytes, 5 cycles
@@ -175,9 +218,123 @@ If you care about size:
```
-## Branching
+### Loading a constant into `[hl]`
+
+Don't do:
+
+```asm
+ ; 3 bytes, 4 cycles
+ ld a, CONST
+ ld [hl], a
+```
+
+But do:
+
+```asm
+ ; 2 bytes, 3 cycles
+ ld [hl], CONST
+```
+
+
+### Loading two constants into a register pair
+
+(The example uses `bc`, but `hl` or `de` would also work.)
+
+Don't do:
+
+```asm
+ ; 4 bytes, 4 cycles
+ ld b, ONE
+ ld c, TWO
+```
+
+But do:
+
+```asm
+ ; 3 bytes, 3 cycles
+ ld bc, ONE << 8 | TWO
+```
+
+Or better, use the `lb` macro in [macros/code.asm](../blob/master/macros/code.asm):
+
+```asm
+ ; 3 bytes, 3 cycles
+ lb bc, ONE, TWO
+```
+
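The contents of macros/code.asm are not shown here, so the following is only a sketch of how such a macro could be written, not necessarily the repository's own definition. It uses the older `name: MACRO` syntax; newer RGBDS releases prefer `MACRO name`.

```asm
; Sketch only: load two 8-bit constants into one register pair.
; \1 = register pair, \2 = high byte, \3 = low byte
lb: MACRO
	ld \1, ((\2) & $ff) << 8 | ((\3) & $ff)
ENDM

	lb bc, 5, 10 ; assembles the same as: ld bc, $050a
```

Masking each argument with `$ff` keeps a value that is out of range from spilling into the other half of the pair.
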
+
+### Loading a constant into `[hl]` and incrementing or decrementing `hl`
+
+Don't do:
+
+```asm
+ ; 2 bytes, 4 cycles
+ ld [hl], a
+ inc hl
+```
+
+But do:
+
+```asm
+ ; 1 byte, 2 cycles
+ ld [hli], a
+```
+
+And don't do:
+
+```asm
+ ; 2 bytes, 4 cycles
+ ld [hl], a
+ dec hl
+```
+
+But do:
+
+```asm
+ ; 1 byte, 2 cycles
+ ld [hld], a
+```
+
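The saving adds up when several consecutive bytes are written. A small sketch, where `wBuffer` is a made-up label used only for illustration:

```asm
	; Clear three consecutive bytes starting at wBuffer (hypothetical label)
	ld hl, wBuffer
	xor a       ; a = 0
	ld [hli], a ; wBuffer + 0
	ld [hli], a ; wBuffer + 1
	ld [hl], a  ; wBuffer + 2; no increment needed after the last store
```
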
+
+### Incrementing or decrementing `[hl]`
+
+Don't do:
+
+```asm
+ ; 3 bytes, 5 cycles
+ ld a, [hl]
+ inc a
+ ld [hl], a
+```
+
+But do:
+
+```asm
+ ; 1 byte, 3 cycles
+ inc [hl]
+```
+
+And don't do:
+
+```asm
+ ; 3 bytes, 5 cycles
+ ld a, [hl]
+ dec a
+ ld [hl], a
+```
+
+But do:
+
+```asm
+ ; 1 byte, 3 cycles
+ dec [hl]
+```
+
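`inc [hl]` and `dec [hl]` set the zero flag from the result and leave `a` untouched, so the new value can be tested without loading it. Note that the three-instruction version leaves the new value in `a`, so the two forms are only interchangeable when later code does not rely on that. A sketch of a countdown, where `wTimer` is a made-up label:

```asm
	; Hypothetical countdown timer stored at wTimer
	ld hl, wTimer
	dec [hl] ; sets z when the byte reaches 0; a is not modified
	jr nz, .still_running
	; ... timer expired ...
.still_running:
```
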
+
+## Branching (control flow)
-### Compare `a` to zero
+
+### Compare `a` to 0
Don't do:
@@ -257,34 +414,8 @@ with:
```
-### Chaining comparisons
-
-Don't do:
-
-```asm
- cp 1
- jr z, .equals1
- cp 2
- jr z, .equals2
- cp 3
- jr z, .equals3
- ; ...
-```
-
-But do:
-
-```asm
- dec a
- jr z, .equals1
- dec a
- jr z, .equals2
- dec a
- jr z, .equals3
- ; ...
-```
-
+## Subroutines (functions)
-## Functions
### Tail call optimization
@@ -355,3 +486,94 @@ if `GetOffset` is only called a handful of times. Instead, do:
```
You can set `(some code)` apart with blank lines and put a comment on top to make its self-contained nature clear without the extra `call` and `ret`.
+
+
+### Fallthrough
+
+Don't do:
+
+```asm
+ ...
+ call Function
+ ret
+
+Function:
+ ...
+```
+
+But do:
+
+```asm
+ ...
+ ; fallthrough
+Function:
+ ...
+```
+
+You can still `call Function` elsewhere, but one tail call can be optimized into a fallthrough.
+
+
+## Jump and lookup tables
+
+
+### Chaining comparisons
+
+Don't do:
+
+```asm
+ cp 1
+ jr z, .equals1
+ cp 2
+ jr z, .equals2
+ cp 3
+ jr z, .equals3
+ ...
+```
+
+But do:
+
+```asm
+ dec a
+ jr z, .equals1
+ dec a
+ jr z, .equals2
+ dec a
+ jr z, .equals3
+ ...
+```
+
+Or do:
+
+```asm
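+ dec a ; make the index 0-based (a = 1 selects the first entry)
+ ld hl, .jumptable
+ ld e, a
+ ld d, 0 ; de = index
+ add hl, de
+ add hl, de ; hl = .jumptable + index * 2 (each dw entry is 2 bytes)
+ ld a, [hli] ; low byte of the target address
+ ld h, [hl] ; high byte of the target address
+ ld l, a ; hl = target address
+ jp hl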
+
+.jumptable:
+ dw .equals1
+ dw .equals2
+ dw .equals3
+ ...
+```
+
+Or better, do:
+
+```asm
+ dec a
+ ld hl, .jumptable
+ rst JumpTable
+ ...
+
+.jumptable:
+ dw .equals1
+ dw .equals2
+ dw .equals3
+ ...
+```
\ No newline at end of file