Lifting the Scales
“He that is strucken blind can not forget the precious treasure of his eyesight lost.”
— William Shakespeare
The viewing problem
I’ve been getting increasingly frustrated by a particular problem while using the Enbugger. I have, I think, created one of the most error-prone mechanisms of programming ever conceived in the history of computing…and that was the point. However, I’m getting to the stage where the short programs I’m writing are sufficiently long that examining them byte by byte is impractical. What I need is a way to dump a section of memory on the screen so I can quickly see what was typed in and, inevitably, where to correct it.
So we’re going to write a dump utility.
First things first: we need a way to put a string on the screen. Okay, that’s not true. First we need to know what constitutes a string, and that’s not as straightforward as you might think. A string is a data construct consisting of a number of characters where order has significance. That seems a generic enough definition.
But computers don’t know about strings, or the alphabet, or anything but numbers really. This means we need a way to have a bunch of bytes in memory mean the same thing as a sentence. For the sake of moving things along we are going to use ASCIIZ strings. In this format each character in the string is encoded by mapping it to an ASCII character and storing the relevant value in memory. The string is terminated by a 0x0 byte, also known as a NULL character.
This is not a great way of encoding text, but it’s simple and it doesn’t take much space or effort. It also happens to be the mechanism BIOS uses and we’re still attached to that particular umbilical at present. At some far later date the default string encoding method will be chosen as something else, but this will do for now.
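To make the ASCIIZ idea concrete, here is a minimal Python sketch of the encoding described above. The function names `to_asciiz` and `from_asciiz` are made up for illustration and are not part of the Enbugger or the BIOS; the sketch only models the byte layout.

```python
def to_asciiz(text: str) -> bytes:
    """Encode text as ASCII bytes followed by a single 0x00 terminator."""
    data = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    if 0 in data:
        raise ValueError("an ASCIIZ string cannot contain a 0x00 byte")
    return data + b"\x00"


def from_asciiz(memory: bytes, start: int = 0) -> str:
    """Read characters from memory[start:] up to, but not including, the first 0x00."""
    end = memory.index(0, start)  # raises ValueError if there is no terminator
    return memory[start:end].decode("ascii")


encoded = to_asciiz("Hi!")
print(encoded.hex(" "))      # 48 69 21 00
print(from_asciiz(encoded))  # Hi!
```

Note that the length of the string is never stored anywhere. The only way to find the end is to walk the bytes until a zero turns up, which is exactly what a print routine for this format has to do.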
Components
First we want a way to dump ASCIIZ strings onto the screen and, because the BIOS expects strings like this, it’s pretty simple.
#-------------------------------------------------------------------------------
# Writes an ASCIIZ (NULL terminated) string to the screen
#
# Notes:
# Right now we're using BIOS interrupt 0x10, function 0x0e to print in teletype mode so it
# handles moving the cursor and scrolling for us.
#
# Inputs:
# DS:SI - Pointer to the string
#
# Destroys:
# AX, BX
#-------------------------------------------------------------------------------
1:
2180: lodsb ac
# End of string?
2181: or al, al 08 c0
2183: jz 2 74 09
2185: mov ah, 0e b4 0e # int 0x10, 0x0e is BIOS teletype print
2187: mov bx, 07 bb 07 00 # Select page 0, white on black text
218a: int 10 cd 10 # Print character in AL
218c: jmp 1 eb f2
2:
218e: retf cb
Now we have a more interesting problem to solve. Let’s say we have a byte of memory and its value is 0xcd, or 205 in decimal, or 1100 1101 in binary. We, as dwarves, recognise that 1100 is a hexadecimal ‘C’. We’ve made a little mapping in our head that says 1100 binary is 12 decimal, which is a ‘C’ in hexadecimal. We’ve mapped numbers into characters, and to dump out a number as a character we’ll need to let the computer do the same. And here’s how we’re going to do it.
#-------------------------------------------------------------------------------
# Return ASCII hex representation of a byte
#
# Notes:
# The reason for the ordering of AX is most of the time when this is called
# the result will immediately be put into memory and x86 is little-endian so
# we can store the two chars in AX in the right order with one word sized MOV.
#
# Inputs:
# DL - The byte to convert
#
# Outputs:
# AL - ASCII hex of the high nibble
# AH - ASCII hex of the low nibble
#
# Destroys:
# AX
#-------------------------------------------------------------------------------
# Split the nibbles of DL into AH (high nibble) and AL (low nibble)
2140: mov al, dl 88 d0
2142: mov ah, dl 88 d4
2144: and ax, f00f 25 0f f0
2147: shr ah, 4 c0 ec 04
# ASCII '0' is 0x30 so translate accordingly
214a: add ax, 3030 05 30 30
# ASCII 'A' is 0x41 so adjust the low-nibble char (currently in AL) if required
214d: cmp al, 3a 3c 3a
214f: jl skipLow 7c 02
2151: add al, 7 04 07
skipLow:
# Swap bytes so we can use short opcodes
2153: xchg al, ah 86 c4
# Adjust the high-nibble char (now in AL) if required
2155: cmp al, 3a 3c 3a
2157: jl skipHigh 7c 02
2159: add al, 7 04 07
skipHigh:
215b: retf cb
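For anyone who would rather read the logic than the opcodes, the routine above can be modelled in a few lines of Python. This is a sketch of the same steps, not a replacement for the listing; the function name `byte_to_hex_chars` is invented here, and the variables `al` and `ah` simply stand in for the registers.

```python
def byte_to_hex_chars(dl: int):
    """Model of the routine: returns (AL, AH), the ASCII codes for (high, low) nibble."""
    al = dl & 0x0F         # and ax, f00f leaves the low nibble in AL...
    ah = (dl & 0xF0) >> 4  # ...and shr ah, 4 moves the high nibble down in AH
    al += 0x30             # add ax, 3030 turns 0-9 into '0'-'9'
    ah += 0x30
    if al >= 0x3A:         # past '9', so skip the 7 codes between '9' and 'A'
        al += 7
    al, ah = ah, al        # xchg al, ah
    if al >= 0x3A:         # same adjustment for the other character
        al += 7
    return al, ah


al, ah = byte_to_hex_chars(0xCD)
# A little-endian word store writes AL first, then AH.
print(bytes([al, ah]).decode("ascii"))  # CD
```

The swap in the middle is why the result comes back with the high nibble in AL: stored as a word, the two characters land in memory in reading order.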
We bring the whole thing together by looping through a piece of memory byte by byte, building an output line and then putting the line on screen.
#-------------------------------------------------------------------------------
# Dumps a section of memory to screen
#
# Notes:
# Dumps 256 bytes in 16x16 grid. We'll use 3100->3151 as a buffer for each
# line of the output.
#
# Inputs:
# DS:SI - Pointer to start in memory
#
# Destroys:
# AX, BX, CX, DX, BP, SI, DI, [3100-3151]
#-------------------------------------------------------------------------------
# A little prep work, initialise our buffer to be spaces
21c0: mov cx, 0028 b9 28 00
21c3: mov di, 1d00 bf 00 1d
21c6: mov ax, 2020 b8 20 20
21c9: rep stosw f3 ab
# We're using ASCIIZ strings so add a NULL terminator
21cb: mov [1d51], 00 c6 06 51 1d 00
# May as well use BP for the line count since it's unused
21d0: mov bp, 0010 bd 10 00
1:
21d3: mov di, 1d00 bf 00 1d
21d6: mov cx, 0010 b9 10 00
2:
# Load byte at ds:si into AL
21d9: lodsb ac
# Convert byte to hex characters
21da: mov dl, al 88 c2
21dc: call 0000:2140 9a 40 21 00 00
# Store in buffer and add space
21e1: stosw ab
21e2: inc di ff c7
21e4: loop 2 e2 f3
# Put buffer on screen
21e6: mov di, si 89 f7
21e8: mov si, 1d00 be 00 1d
21eb: call 0000:2180 9a 80 21 00 00
21f0: mov si, di 89 fe
21f2: dec bp ff cd
21f4: jnz 1 75 dd
21f6: retf cb
If anyone’s paying attention, or even reading in the first place, you’ll have noticed that the 0x1d00 and 0x1d51 in the code are not the 0x3100 and 0x3151 I gave in the comment. The discrepancy exists because, thanks to being able to dump memory, I’ve realised the Enbugger calls things with DS and ES set to 0x140. That, of course, means reducing any memory offsets by 0x1400. A strange quirk, but I’m not changing the Enbugger now.
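The arithmetic behind that adjustment is the usual real-mode rule: the segment is multiplied by 16 and added to the offset. A tiny Python sketch, with a helper name (`linear_address`) chosen purely for illustration:

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


print(hex(linear_address(0x140, 0x1D00)))  # 0x3100
print(hex(linear_address(0x140, 0x1D51)))  # 0x3151
```

So with DS and ES at 0x140, offset 0x1d00 really is the buffer at 0x3100 that the comment talks about.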
This isn’t nice code, but it gets the job done. We’re building tools to build better tools here. The important thing is I can now do this.

I think you can agree that’s significantly easier to use than peeking through 256 bytes individually.
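If you don’t have an Enbugger to hand, the shape of the output can be approximated with a short Python model of the dump loop. This is a sketch under a few assumptions: the names `dump_lines` and `LINE_WIDTH` are invented, memory is just a `bytes` object, and the trailing padding is stripped for readability, which the real routine does not do.

```python
LINE_WIDTH = 80  # the listing fills 0x28 words (80 bytes) of buffer with spaces


def dump_lines(memory, start=0, rows=16, cols=16):
    """Return one text line per row, each holding `cols` bytes as hex pairs."""
    lines = []
    for row in range(rows):
        buffer = bytearray(b" " * LINE_WIDTH)
        di = 0
        for col in range(cols):
            value = memory[start + row * cols + col]  # lodsb
            buffer[di:di + 2] = b"%02X" % value       # stosw: two hex characters
            di += 3                                   # 2 from stosw, 1 from inc di
        lines.append(buffer.decode("ascii").rstrip())
    return lines


for line in dump_lines(bytes(range(256)))[:2]:
    print(line)
# 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
# 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
```

Because the buffer starts out full of spaces, skipping one position after each pair is all it takes to get the separators for free.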
Convention
Up until now I’ve been pretty slapdash about inputs and outputs to components, but we’ve got a few built now and it’s time to be a little bit more formal.
Here’s how this is going to work.
Immediate values passed to components will use registers in this order:
DX, SI, BX, BP, AX, DI, CX, Stack
Return values will be passed back in this order:
AX, DI, BX, BP, DX, SI, CX, Stack
Memory locations should be passed in via DS:SI and back via ES:DI.
That ordering is not written in stone, but I’ll be trying to follow it. What you’ll mostly see is input using DX (Data Register) and output using AX (Accumulator). The x86 has lots of fun short opcodes to use if you keep data in AX, and a lot of memory functions expect DS:SI and ES:DI. In real mode BX can be used in addressing, but we’re not planning on staying in real mode so it will eventually just be another general purpose register. CX is saved until last simply because it’s used for loop counters, so reserving it saves storing it every single time you call a component. Remember that it’s the caller that is responsible for preserving register contents, not the component.
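As a rough illustration of how the ordering plays out, here is a small Python sketch that hands out registers to a list of values in the order given above. The helper `assign` and the value names are hypothetical; nothing like this exists in the actual code, where the convention is simply followed by hand.

```python
INPUT_ORDER = ["DX", "SI", "BX", "BP", "AX", "DI", "CX"]
OUTPUT_ORDER = ["AX", "DI", "BX", "BP", "DX", "SI", "CX"]


def assign(names, order):
    """Pair each value name with the next register; leftovers go on the stack."""
    return {
        name: (order[i] if i < len(order) else "Stack")
        for i, name in enumerate(names)
    }


print(assign(["byte"], INPUT_ORDER))         # {'byte': 'DX'}
print(assign(["hex_chars"], OUTPUT_ORDER))   # {'hex_chars': 'AX'}
```

That single-value case is the hex conversion component from earlier: the byte goes in through DX (DL, specifically) and the two characters come back in AX.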
Onwards
You may have already figured out where I’m headed. Now that I’ve made viewing memory easier, the next logical step is to make editing memory easier. So I’ll be pushing on towards making a hex editor to help wean us off the Enbugger. It will probably work a lot like the Hexer editor but be significantly more limited. That is a long way in the future though.
I love that the spell-checker keeps trying to correct Shakespeare. Know your place, machine!
— Curufir