Addressing, the real issue
If you gaze too long into the abyss, the abyss may start giggling.
Some history
Shortly after the first flickering rays of light fell upon the newly formed Earth, Intel produced a processor called the 8080. The 8080 combined two 8-bit registers to make a 16-bit value for pointers and could address 65,536 bytes (64KiB) of memory. Every pointer value mapped to exactly one memory location, a 1:1 relationship, and all was good and understandable.
Then Intel created the 8086, which only had 16-bit registers. However, they wanted more addressable memory (1MiB). They could have done the same trick and combined two registers into a 32-bit value for pointers, which would have solved that problem and maintained the 1:1 relationship. But maybe it wasn’t technically possible at the time, or maybe they wanted to ease compatibility with the 8080, and at this point who cares? All we know is that instead the bowels of hell opened and gave birth to segmentation.
The engineers who came up with this were significantly smarter than me and will have had excellent reasons for why things were done this way. I don’t know what they were; all I know is I hate this addressing system.
Segmented addressing
You have a segment register and an offset register.
To get a physical address you multiply the segment register by 16 and add the offset register.
An example
CS contains 0x0500 and IP contains 0x00ff, so the address is 0500:00ff in segmented notation.
To get the physical address we multiply CS by 16 (shift left by 4):
0x0500 * 0x10 = 0x5000
Then we add the offset:
0x5000 + 0x00ff = 0x50ff
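The same arithmetic as a minimal Python sketch. The function name physical_address is my own invention for illustration and not part of any real toolchain, and it deliberately ignores what happens when the result exceeds 1MiB.

```python
def physical_address(segment, offset):
    """Translate a real-mode segment:offset pair into a physical address.

    Hypothetical helper for illustration; ignores results above 1MiB.
    """
    # Multiplying by 16 is the same as shifting left by 4 bits.
    return (segment << 4) + offset


# The CS:IP pair from the example above.
print(hex(physical_address(0x0500, 0x00FF)))  # 0x50ff
```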
Hold up
The more mathematically inclined reader is asking themselves a question at this point. And that question is likely to be “Isn’t there more than one way to generate the same physical address?” Yes. For example, 0x50ff can also be addressed as 0000:50ff, 050f:000f and many other combinations.
Incidentally, this is why at the top of pretty much every boot sector you’ll see a far jump to set the code segment. The BIOS loads the sector at physical address 0x7c00, but it can express that address with any segment:offset pair it sees fit.
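To get a feel for how many "other combinations" there are, here is a rough Python sketch that brute-forces every 16-bit segment and keeps the ones that can reach a given physical address. The function name aliases is hypothetical, and the sketch assumes no wrap-around above 1MiB, so it only counts the straightforward pairs.

```python
def aliases(physical):
    """List every (segment, offset) pair that reaches a physical address.

    Hypothetical helper for illustration; assumes no wrap-around above 1MiB.
    """
    pairs = []
    for segment in range(0x10000):
        # Whatever the segment doesn't cover, the offset has to make up.
        offset = physical - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            pairs.append((segment, offset))
    return pairs


pairs = aliases(0x50FF)
print(len(pairs))                # 1296
print((0x0000, 0x50FF) in pairs)  # True
print((0x050F, 0x000F) in pairs)  # True
```

So even under that simplifying assumption, this one unremarkable address has well over a thousand names.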
So the 1:1 relationship has been lost and now there’s a logical address space which must be translated by the CPU into the physical address space. Or, in our case, translated by me any time there’s a memory instruction.
Equally observant readers will have remembered I said the 8086 could address 1MiB of memory, and noticed that the segment:offset algorithm can produce an address above that. They can reassure themselves with the knowledge that it wraps around to the bottom of memory. No, I’m not making this up.
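A small Python sketch of that wrap-around, assuming the 20 address lines of the original 8086 (later processors complicate this, which I'm ignoring here). The names ADDRESS_MASK and physical_address_8086 are made up for illustration.

```python
# Assumption: the original 8086 with 20 address lines, so 1MiB of address space.
ADDRESS_MASK = 0xFFFFF


def physical_address_8086(segment, offset):
    """Segment:offset translation with anything above 1MiB wrapping to the bottom."""
    return ((segment << 4) + offset) & ADDRESS_MASK


# The largest value the algorithm can produce before wrapping.
print(hex((0xFFFF << 4) + 0xFFFF))                 # 0x10ffef
# What that becomes once the carry into bit 20 is dropped.
print(hex(physical_address_8086(0xFFFF, 0xFFFF)))  # 0xffef
# FFFF:0010 lands exactly on physical address 0.
print(hex(physical_address_8086(0xFFFF, 0x0010)))  # 0x0
```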
Why is this relevant?
You’ve got a shiny new 64-bit x86 processor, right? It was probably made in the last decade, at least. So why would this bit of ancient history have any meaning?
Because the processor still starts in an 8086-compatible mode called real mode. Yes, even today, although it’s almost certainly emulated. So the first thing any x86 operating system, or more usually its bootloader, does is run away from this mode and lock segmentation in the cellar where it belongs.
However, that’s a long way away for DwarfOS, so for now I’ll have to deal with all the quirks and perils real mode has to offer. Segmentation and all.
— Curufir