author Rangi <remy.oukaour+rangi42@gmail.com> 2019-02-19 18:45:43 -0500
committer Rangi <remy.oukaour+rangi42@gmail.com> 2019-02-19 18:45:43 -0500
commit ab7db971ff16ed4a9101a50b8dcaa3d875c1eb20
tree 27f7551aa5d85921d3e4adf12cd2b0b0c76b694b /Optimizing-assembly-code.md
parent b8e1245b7a56779276a669024eb3c5d21c11e583
Opt
Diffstat (limited to 'Optimizing-assembly-code.md')
-rw-r--r-- Optimizing-assembly-code.md | 87
1 files changed, 39 insertions, 48 deletions
diff --git a/Optimizing-assembly-code.md b/Optimizing-assembly-code.md
index f45a44a..2cc248a 100644
--- a/Optimizing-assembly-code.md
+++ b/Optimizing-assembly-code.md
@@ -7,30 +7,29 @@ Most of these tricks come from either [Jeff's GB Assembly Code Tips v1.0](http:/
- [Registers](#registers)
- [Set `a` to 0](#set-a-to-0)
- - [Set `a` to some constant minus `a`](#set-a-to-some-constant-minus-a)
- [Invert the bits of `a`](#invert-the-bits-of-a)
+ - [Set `a` to some constant minus `a`](#set-a-to-some-constant-minus-a)
- [Multiply `hl` by 2](#multiply-hl-by-2)
- [Add `a` to a 16-bit register](#add-a-to-a-16-bit-register)
- - [Increment a 16-bit register](#increment-decrement-a-16-bit-register)
+ - [Increment or decrement a 16-bit register](#increment-or-decrement-a-16-bit-register)
- [Load from an address to `hl`](#load-from-an-address-to-hl)
- [Exchange two 16-bit registers](#exchange-two-16-bit-registers)
- [Load two constants into a register pair](#load-two-constants-into-a-register-pair)
- [Load a constant into `[hl]`](#load-a-constant-into-hl)
- - [Load a constant into `[hl]` and increment or decrement `hl`](#load-a-constant-into-hl-and-increment-or-decrement-hl)
+ - [Load a constant into `[hl]` and incrementing or decrementing `hl`](#load-a-constant-into-hl-and-incrementing-or-decrementing-hl)
- [Increment or decrement `[hl]`](#increment-or-decrement-hl)
- [Branching (control flow)](#branching-control-flow)
- [Relative jumps](#relative-jumps)
- [Compare `a` to 0](#compare-a-to-0)
- [Compare `a` to 1](#compare-a-to-1)
- [Compare `a` to 255](#compare-a-to-255)
- - [Add `a` to `hl` without using a 16-bit register](#add-a-to-hl-without-using-a-16-bit-register)
- [Subroutines (functions)](#subroutines-functions)
- [Tail call optimization](#tail-call-optimization)
- [Call `hl`](#call-hl)
- [Inlining](#inlining)
- [Fallthrough](#fallthrough)
- [Jump and lookup tables](#jump-and-lookup-tables)
- - [Chain comparisons](#chaining-comparisons)
+ - [Chain comparisons](#chain-comparisons)
## Registers
@@ -66,38 +65,38 @@ Don't use the optimized versions if you need to preserve flags. As such, `ld a,
```
-### Set `a` to some constant minus `a`
+### Invert the bits of `a`
Don't do:
```asm
- ; 4 bytes, 4 cycles
- ld b, a
- ld a, CONST
- sub b
+ xor $ff ; 2 bytes, 2 cycles
```
But do:
```asm
- ; 3 bytes, 3 cycles
- cpl
- add CONST + 1
+ cpl ; 1 byte, 1 cycle
```
-### Invert the bits of `a`
+### Set `a` to some constant minus `a`
Don't do:
```asm
- xor $ff ; 2 bytes, 2 cycles
+ ; 4 bytes, 4 cycles
+ ld b, a
+ ld a, CONST
+ sub b
```
But do:
```asm
- cpl ; 1 byte, 1 cycle
+ ; 3 bytes, 3 cycles
+ cpl
+ add CONST + 1
```
@@ -150,7 +149,7 @@ and don't do:
But do:
```asm
- ; 5 bytes, 5 cycles
+ ; 5 bytes, 5 or 6 cycles
add l
ld l, a
jr nc, .no_carry
@@ -159,19 +158,28 @@ But do:
.no_carry:
```
-Or better (doesn't require a label):
+Or better, do:
```asm
; 5 bytes, 5 cycles
- add l ; = a + l
- ld l, a ; cache a + l
- adc h ; = a + l + carry + h
- sub l ; = carry + h
+ add l
+ ld l, a
+ adc h
+ sub l
ld h, a
```
+Or if you can spare another 16-bit register and want to optimize for size over speed, do:
+
+```asm
+ ; 4 bytes, 5 cycles
+ ld d, 0
+ ld e, a
+ add hl, de
+```
+
-### Increment / decrement a 16-bit register
+### Increment or decrement a 16-bit register
When possible, avoid doing:
@@ -179,13 +187,19 @@ When possible, avoid doing:
inc hl ; 1 byte, 2 cycles
```
-If you can ensure that the low byte won't overflow (or that it won't matter if it does), then do this:
+```asm
+ dec hl ; 1 byte, 2 cycles
+```
+
+If the low byte won't overflow, then do:
```asm
inc l ; 1 byte, 1 cycle
```
-Further, if you intend your code to run on DMG (black & white GB), avoiding 16-bit inc/dec means less occasions to trigger the OAM corruption bug.
+```asm
+ dec l ; 1 byte, 1 cycle
+```
### Load from an address to `hl`
@@ -451,29 +465,6 @@ with:
```
-### Add `a` to `hl` without using a 16-bit register
-
-Don't do:
-
-```asm
- ; 4 bytes, 5 cycles
- ld d, 0
- ld e, a
- add hl, de
-```
-
-But do:
-
-```asm
- ; 5 bytes, 5 or 6 cycles
- add l
- ld l, a
- jr nc, .no_carry
- inc h
-.no_carry
-```
-
-
## Subroutines (functions)
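
Note on the diff above: this commit drops the inline comments that explained the `add l` / `ld l, a` / `adc h` / `sub l` / `ld h, a` sequence, and it also moves the `cpl` / `add CONST + 1` section. Both rewrites rely on 8-bit wraparound arithmetic, so a quick way to convince yourself they are correct is to model them outside the assembler. The sketch below is plain Python, not Game Boy code; the helper names (`add8`, `sub8`, `add_a_to_hl`, `const_minus_a`) are made up for illustration and only model the register values and the carry flag, not the other flags or timing.

```python
def add8(a, b, carry_in=0):
    """8-bit add with optional carry-in; returns (result, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def sub8(a, b):
    """8-bit subtract with wraparound (the resulting flags are not modeled)."""
    return (a - b) & 0xFF


def add_a_to_hl(a, h, l):
    """Model of: add l / ld l, a / adc h / sub l / ld h, a."""
    a, carry = add8(a, l)      # add l   : a = a + l, carry set on overflow
    l = a                      # ld l, a : new low byte is now in l
    a, _ = add8(a, h, carry)   # adc h   : a = l + h + carry
    a = sub8(a, l)             # sub l   : a = h + carry
    h = a                      # ld h, a : new high byte
    return h, l


def const_minus_a(a, const):
    """Model of: cpl / add CONST + 1, i.e. ~a + CONST + 1 = CONST - a (mod 256)."""
    a ^= 0xFF                                # cpl
    a, _ = add8(a, (const + 1) & 0xFF)       # add CONST + 1 (operand masked to 8 bits)
    return a


if __name__ == "__main__":
    h, l = add_a_to_hl(0x20, 0x12, 0xF0)
    print(f"hl = ${h:02x}{l:02x}")                 # $12f0 + $20
    print(f"a = ${const_minus_a(0x05, 0x30):02x}")  # $30 - $05
```

One caveat the model makes visible: for `CONST` = 255 the operand `CONST + 1` is 256, which only works here because the model masks it to 8 bits. Whether an assembler accepts or truncates that immediate is an assumption you would need to check for your toolchain.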