author | Rangi <remy.oukaour+rangi42@gmail.com> | 2019-02-18 12:20:10 -0500 |
---|---|---|
committer | Rangi <remy.oukaour+rangi42@gmail.com> | 2019-02-18 12:20:10 -0500 |
commit | 5c7169948cf27a98c1dd49cf29818e352f493bbc (patch) | |
tree | cf4235d5251f284b2b9bdb4cd0c7775d35bd26a9 /Optimizing-assembly-code.md | |
parent | b17fc465e7f144715466e1afde66614e0b6a21f1 (diff) |
More
Diffstat (limited to 'Optimizing-assembly-code.md')
-rw-r--r-- | Optimizing-assembly-code.md | 294 |
1 files changed, 258 insertions, 36 deletions
diff --git a/Optimizing-assembly-code.md b/Optimizing-assembly-code.md
index 3e26854..9ee065a 100644
--- a/Optimizing-assembly-code.md
+++ b/Optimizing-assembly-code.md
@@ -6,26 +6,34 @@ Most of these tricks come from either [Jeff's GB Assembly Code Tips v1.0](http:/
 ## Contents
 
 - [Registers](#registers)
-  - [Set `a` to zero](#set-a-to-zero)
+  - [Set `a` to 0](#set-a-to-0)
   - [Set `a` to some constant minus `a`](#set-a-to-some-constant-minus-a)
+  - [Invert the bits of `a`](#invert-the-bits-of-a)
+  - [Multiply `hl` by 2](#multiply-hl-by-2)
   - [Add `a` to a 16-bit register](#add-a-to-a-16-bit-register)
   - [Loading from an offset to `hl`](#loading-from-an-offset-to-hl)
   - [Exchanging two 16-bit registers](#exchanging-two-16-bit-registers)
-- [Branching](#branching)
-  - [Compare `a` to zero](#compare-a-to-zero)
+  - [Loading a constant into `[hl]`](#loading-a-constant-into-hl)
+  - [Loading two constants into a register pair](#loading-two-constants-into-a-register-pair)
+  - [Loading a constant into `[hl]` and incrementing or decrementing `hl`](#loading-a-constant-into-hl-and-incrementing-or-decrementing-hl)
+  - [Incrementing or decrementing `[hl]`](#incrementing-or-decrementing-hl)
+- [Branching (control flow)](#branching-control-flow)
+  - [Compare `a` to 0](#compare-a-to-0)
   - [Compare `a` to 1](#compare-a-to-1)
   - [Compare `a` to 255](#compare-a-to-255)
-  - [Chaining comparisons](#chaining-comparisons)
-- [Functions](#functions)
+- [Subroutines (functions)](#subroutines-functions)
   - [Tail call optimization](#tail-call-optimization)
   - [Calling `hl`](#calling-hl)
   - [Inlining](#inlining)
+  - [Fallthrough](#fallthrough)
+- [Jump and lookup tables](#jump-and-lookup-tables)
+  - [Chaining comparisons](#chaining-comparisons)
 
 
 ## Registers
 
 
-### Set `a` to zero
+### Set `a` to 0
 
 Don't do:
 
@@ -75,6 +83,41 @@ But do:
 ```
 
 
+### Invert the bits of `a`
+
+Don't do:
+
+```asm
+	xor $ff ; 2 bytes, 2 cycles
+```
+
+But do:
+
+```asm
+	cpl ; 1 byte, 1 cycle
+```
+
+
+### Multiply `hl` by 2
+
+Don't do:
+
+```asm
+	; 6 bytes, 6 cycles
+	sla l
+	rl h
+```
+
+But do:
+
+```asm
+	; 1 byte, 2 cycles
+	add hl, hl
+```
+
+(The `SpeciesItemBoost` routine in [engine/battle/effect_commands.asm](../blob/master/engine/battle/effect_commands.asm) actually does this!)
+
+
 ### Add `a` to a 16-bit register
 
 (The example uses `hl`, but `bc` or `de` would also work.)
@@ -113,7 +156,7 @@ But do:
 .no_carry:
 ```
 
-or better (doesn't require a label):
+Or better (doesn't require a label):
 
 ```asm
 	; 5 bytes, 5 cycles
@@ -175,9 +218,123 @@ If you care about size:
 ```
 
 
-## Branching
+### Loading a constant into `[hl]`
+
+Don't do:
+
+```asm
+	; 3 bytes, 4 cycles
+	ld a, CONST
+	ld [hl], a
+```
+
+But do:
+
+```asm
+	; 2 bytes, 3 cycles
+	ld [hl], CONST
+```
+
+
+### Loading two constants into a register pair
+
+(The example uses `bc`, but `hl` or `de` would also work.)
+
+Don't do:
+
+```asm
+	; 4 bytes, 4 cycles
+	ld b, ONE
+	ld c, TWO
+```
+
+But do:
+
+```asm
+	; 3 bytes, 3 cycles
+	ld bc, ONE << 8 | TWO
+```
+
+Or better, use the `lb` macro in [macros/code.asm](../blob/master/macros/code.asm):
+
+```asm
+	; 3 bytes, 3 cycles
+	lb bc, ONE, TWO
+```
+
+
+### Loading a constant into `[hl]` and incrementing or decrementing `hl`
+
+Don't do:
+
+```asm
+	; 2 bytes, 4 cycles
+	ld [hl], a
+	inc hl
+```
+
+But do:
+
+```asm
+	; 1 bytes, 2 cycles
+	ld [hli], a
+```
+
+And don't do:
+
+```asm
+	; 2 bytes, 4 cycles
+	ld [hl], a
+	dec hl
+```
+
+But do:
+
+```asm
+	; 1 bytes, 2 cycles
+	ld [hld], a
+```
+
+
+### Incrementing or decrementing `[hl]`
+
+Don't do:
+
+```asm
+	; 3 bytes, 5 cycles
+	ld a, [hl]
+	inc a
+	ld [hl], a
+```
+
+But do:
+
+```asm
+	; 1 bytes, 3 cycles
+	inc [hl]
+```
+
+And don't do:
+
+```asm
+	; 3 bytes, 5 cycles
+	ld a, [hl]
+	dec a
+	ld [hl], a
+```
+
+But do:
+
+```asm
+	; 1 bytes, 3 cycles
+	dec [hl]
+```
+
+
+## Branching (control flow)
 
-### Compare `a` to zero
+
+### Compare `a` to 0
 
 Don't do:
 
@@ -257,34 +414,8 @@ with:
 ```
 
 
-### Chaining comparisons
-
-Don't do:
-
-```asm
-	cp 1
-	jr z, .equals1
-	cp 2
-	jr z, .equals2
-	cp 3
-	jr z, .equals3
-	; ...
-```
-
-But do:
-
-```asm
-	dec a
-	jr z, .equals1
-	dec a
-	jr z, .equals2
-	dec a
-	jr z, .equals3
-	; ...
-```
-
+## Subroutines (functions)
 
-## Functions
 
 ### Tail call optimization
 
@@ -355,3 +486,94 @@ if `GetOffset` is only called a handful of times. Instead, do:
 ```
 
 You can set `(some code)` apart with blank lines and put a comment on top to make its self-contained nature clear without the extra `call` and `ret`.
+
+
+### Fallthrough
+
+Don't do:
+
+```asm
+	...
+	call Function
+	ret
+
+Function:
+	...
+```
+
+But do:
+
+```asm
+	...
+	; fallthrough
+Function:
+	...
+```
+
+You can still `call Function` elsewhere, but one tail call can be optimized into a fallthrough.
+
+
+## Jump and lookup tables
+
+
+### Chaining comparisons
+
+Don't do:
+
+```asm
+	cp 1
+	jr z, .equals1
+	cp 2
+	jr z, .equals2
+	cp 3
+	jr z, .equals3
+	...
+```
+
+But do:
+
+```asm
+	dec a
+	jr z, .equals1
+	dec a
+	jr z, .equals2
+	dec a
+	jr z, .equals3
+	...
+```
+
+Or do:
+
+```asm
+	dec a
+	ld hl, .jumptable
+	ld e, a
+	ld d, 0
+	add hl, de
+	add hl, de
+	ld a, [hli]
+	ld h, [hl]
+	ld l, a
+	jp hl
+
+.jumptable:
+	dw .equals1
+	dw .equals2
+	dw .equals3
+	...
+```
+
+Or better, do:
+
+```asm
+	dec a
+	ld hl, .jumptable
+	rst JumpTable
+	...
+
+.jumptable:
+	dw .equals1
+	dw .equals2
+	dw .equals3
+	...
+```
\ No newline at end of file
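
The byte and cycle counts quoted in the comments of this diff can be cross-checked with a small lookup table. The sketch below is not part of the commit: the `COSTS` table and the `tally` helper are hypothetical names introduced here, and it assumes cycles are counted in machine cycles (1 machine cycle = 4 clock ticks), which appears to be the convention the diff's comments use.

```python
# Hypothetical helper, not part of the commit: tallies size and duration for
# the Game Boy CPU instructions that appear in the sections added by this diff.
# Each entry maps an instruction form to (bytes, machine cycles).
COSTS = {
    "xor n8": (2, 2),
    "cpl": (1, 1),
    "sla r8": (2, 2),       # CB-prefixed, so 2 bytes
    "rl r8": (2, 2),        # CB-prefixed, so 2 bytes
    "add hl, r16": (1, 2),
    "ld r8, n8": (2, 2),
    "ld r16, n16": (3, 3),
    "ld [hl], r8": (1, 2),
    "ld r8, [hl]": (1, 2),
    "ld [hl], n8": (2, 3),
    "ld [hli], a": (1, 2),
    "ld [hld], a": (1, 2),
    "inc r16": (1, 2),
    "dec r16": (1, 2),
    "inc r8": (1, 1),
    "dec r8": (1, 1),
    "inc [hl]": (1, 3),
    "dec [hl]": (1, 3),
}


def tally(instructions):
    """Return (total bytes, total machine cycles) for a list of instruction forms."""
    total_bytes = sum(COSTS[form][0] for form in instructions)
    total_cycles = sum(COSTS[form][1] for form in instructions)
    return total_bytes, total_cycles


if __name__ == "__main__":
    # `sla l` followed by `rl h`, versus `add hl, hl`
    print(tally(["sla r8", "rl r8"]))
    print(tally(["add hl, r16"]))
    # `ld a, [hl]` / `inc a` / `ld [hl], a`, versus `inc [hl]`
    print(tally(["ld r8, [hl]", "inc r8", "ld [hl], r8"]))
    print(tally(["inc [hl]"]))
```

Under these assumptions the `sla l` / `rl h` pair tallies to 4 bytes and 4 cycles rather than the 6 and 6 written in the diff's comment; the other counts in the added sections that this table covers agree with it.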