Does anyone have a similar article with more detail? I don't quite want to read the datasheet of your favorite microprocessor, but I would like a decent amount more detail than what's provided. Especially before UEFI/BIOS.
UEFI is an interface implemented by firmware (literally, Unified Extensible Firmware Interface), it's not the firmware itself. Saying "it starts the machine" is a bit of a nomenclature faux pas. The firmware starts the machine, you talk to the firmware via UEFI.
This post skips all the interesting things in the modern firmware dance. Not the least of which is when you call ExitBootServices() you're already in long mode. There's no need for the journey through real and protected.
Where do I read more about this?
Here's one source: https://depletionmode.com/uefi-boot.html
As should be clear from the reset vector, the 80286 and its successors actually boot in unreal mode. On the 80386, the base address of the code segment is 0xffff0000, which cannot be obtained by shifting the 16-bit CS register by 4. The descriptor cache simply gets loaded with the correct value at reset. Writing to CS in real mode overwrites the cached value with CS * 16.
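To make the arithmetic in that comment concrete, here is a minimal Python sketch. It is only an illustration of the address math, not a model of any real CPU. The function names are made up for this example, and the offset 0xFFF0 and selector 0xF000 are my assumption of the documented 80386 reset values; the comment itself only gives the base 0xffff0000.

```python
# Illustrative only: compares "segment shifted by 4" addressing with the
# hidden (cached) base that the comment says is loaded at reset.
# Names are hypothetical; 0xF000 and 0xFFF0 are assumed 80386 reset values.

def shifted_address(segment: int, offset: int) -> int:
    """Address you get after software writes a 16-bit value to CS in real mode."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

def cached_address(cached_base: int, offset: int) -> int:
    """Address formed from a descriptor-cache base that was loaded directly."""
    return cached_base + (offset & 0xFFFF)

# Highest address reachable with any segment:offset pair via the shift rule.
HIGHEST_SHIFTED = shifted_address(0xFFFF, 0xFFFF)

# First fetch after reset, using the base quoted in the comment.
RESET_FETCH_386 = cached_address(0xFFFF0000, 0xFFF0)

# Where the same offset lands once CS has been rewritten in real mode.
AFTER_CS_RELOAD = shifted_address(0xF000, 0xFFF0)

if __name__ == "__main__":
    print(hex(HIGHEST_SHIFTED), hex(RESET_FETCH_386), hex(AFTER_CS_RELOAD))
```

The point is just that the reset fetch address sits far above anything the shift rule can produce, which is why the cached base has to be loaded specially at reset.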
Seems like there are many useful suggestions for the author. Here is mine: maybe an interactive style would work much better for educational content.
There is a well praised post on HN: https://www.nan.fyi/database, built with the framework: https://github.com/nandanmen/NotANumber
Hard to read on my phone due to faded text.
The styling is bad on a desktop browser too. If you use Firefox or Firefox Mobile then reader mode is good for cases like this.
The self deprecatingly downvoted look.
The topic is interesting but it seems to be targeted for my grandmother.
> When power stabilizes, the CPU resets itself to a tiny, old‑fashioned mode called real mode. Real mode dates back to the original 8086 chip. The rules are simple on purpose. Memory addresses are built from two values the CPU keeps in special fast storage called registers. You combine a segment and an offset like this:
> physical_address = (segment << 4) + offset
Your grandmother sounds unusually proficient with this sort of thing.
I don't know, I just don't like the tone. This is a complex subject where the target audience should probably already know what a hexadecimal number or an interrupt is, and the explanation of a CPU register ought to be better than: "A register is a tiny slot inside the CPU. It holds a number the CPU is using right now." If the subject interests you, you deserve better.
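For anyone who wants to poke at the quoted segment/offset rule, here is a rough Python sketch. One thing the formula implies is that many different segment:offset pairs name the same physical address. The address 0x7C00 is just my example value (it is not from the article quote), and the function names are hypothetical.

```python
# Illustrative sketch of the quoted formula:
#   physical_address = (segment << 4) + offset
# Both inputs are treated as 16-bit values; no address wraparound is modeled.

def physical_address(segment: int, offset: int) -> int:
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

def aliases(target: int) -> list[tuple[int, int]]:
    """Every (segment, offset) pair that resolves to the same target address."""
    pairs = []
    for segment in range(0x10000):
        offset = target - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            pairs.append((segment, offset))
    return pairs

# Two spellings of the same address (0x7C00 is an assumed example value).
A = physical_address(0x07C0, 0x0000)
B = physical_address(0x0000, 0x7C00)
ALIASES_OF_7C00 = aliases(0x7C00)

if __name__ == "__main__":
    print(hex(A), hex(B), len(ALIASES_OF_7C00))
```

Because the segment is only shifted by 4 bits, consecutive segments overlap by all but 16 bytes, which is where the aliasing comes from.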
>A register is a tiny slot inside the CPU. It holds a number the CPU is using right now
What's your issue with this? Would you prefer it mentioned the x86 register doesn't always correspond to the same place in the register file?
There's nothing wrong with it but most people striving to do this would make each bit clickable to detail or something so that you can read at your level. I imagine this was a constraint of the site framework or the author's writing style.
But hey what the heck, it's fine. An LLM can rewrite it to whatever level of knowledge you like so the deepest level is optimal.
Agreed. A lot of these articles leave me with more questions than answers.
These blog posts really annoy me because I feel like with 20% more effort you could have something worth reading.
One of the things we were taught in uni was audience analysis. I think about it a lot. What's expected to already be known? What acronyms or phrases need defining? Etc. This is an art I'm far from perfect at, and it seems a lot of tech writers are too.
Related to this topic, what is the best way to replace the code involved in the entire boot process? This is useful when sanitizing a system received from a provider that may not be trustworthy, as malware could be hidden at low levels.
The disk could be wiped from the BIOS. One could also run “fwupdmgr update” from a live USB to update the motherboard firmware and then reinstall the operating system. However, I’m not sure if this would completely clear the system.
This is old school BIOS boot. EFI bootloaders work very differently.
and with GRUB running under UEFI, it actually uses UEFI load procedures instead of fumbling with 16-bit CS, DS, SS registers
GRUB is mentioned but not detailed.
Here are some details: https://www.pixelbeat.org/docs/disk/
It's a weird article for me. On one side it is an interesting topic. On the other hand why are we explaining what a hex number is? Who is interested in this level of detail but doesn't know hex? Maybe I'm overanalyzing.
At the same time this doesn't address my biggest open question on the topic - how do we get from the physical push to the reset vector? Somehow that magic works in HW, physics and electronics - how?
ARM and lots of non-x86 architectures often use a series of bootloaders to kick up ram, wake up parts of the hardware, blah blah, and read devicetree blobs to know what the hardware looks like
Light gray text on white??
Nice to see the good old hacker energy & independent blogs explaining things showing up on top of hacker news. Welcome change from insufferable agent this and vibe that
I'm going to save this guy's blog in one step
https://webaim.org/resources/contrastchecker/
(this is the site: https://webaim.org/resources/contrastchecker/?fcolor=D0D0D0&...)
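If you'd rather check the numbers locally, here is a small Python sketch of the contrast-ratio formula as I understand it from the WCAG 2.x definition. The foreground D0D0D0 comes from the link above; the white background is an assumption based on the other comments, since the rest of the URL is cut off. Function names are made up for the example.

```python
# Sketch of the WCAG 2.x contrast ratio calculation (my reading of the spec).
# Foreground D0D0D0 is from the linked URL; background FFFFFF is assumed.

def channel_to_linear(value_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to its linear-light value."""
    c = value_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(hex_color: str) -> float:
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return (0.2126 * channel_to_linear(r)
            + 0.7152 * channel_to_linear(g)
            + 0.0722 * channel_to_linear(b))

def contrast_ratio(foreground: str, background: str) -> float:
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)

RATIO = contrast_ratio("D0D0D0", "FFFFFF")

if __name__ == "__main__":
    print(f"{RATIO:.2f}:1")
```

Under those assumptions the ratio comes out around 1.5:1, a long way short of the 4.5:1 that WCAG AA asks for with normal-size text.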
Video device initialization is intimately intertwined and a dependency for all this early boot stuff. I was hoping to learn more but it's not even mentioned. Still, neat.
If you can find a copy of this on the high seas, it's a great resource. I wrote my own OS by starting with this and the Linux source in the mid-90s:
https://www.amazon.com/-/he/Developing-32-Bit-Operating-Syst...
It's not a dependency for Linux boot at all. You can do well with a serial port alone, as anyone who has brought up e.g. an ARM SoC in Linux will attest.
Also it's not very interesting either. At simplest, Linux just needs to take a pointer to a beginning of a framebuffer and some metadata, and will write to the framebuffer whenever there's something to update.
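As a rough illustration of "a pointer plus some metadata", here is a toy Python sketch that uses a bytearray as a stand-in for the framebuffer memory. This is not how the Linux kernel does it; the field names (width, height, pitch, bytes_per_pixel) and the tiny 4x2 geometry are assumptions for the example.

```python
# Toy model of a linear framebuffer: a flat byte buffer plus metadata.
# All names and sizes here are hypothetical, chosen only for illustration.
from dataclasses import dataclass

@dataclass
class FramebufferInfo:
    width: int            # visible pixels per row
    height: int           # number of rows
    pitch: int            # bytes per row in memory (may include padding)
    bytes_per_pixel: int

def put_pixel(fb: bytearray, info: FramebufferInfo,
              x: int, y: int, color: bytes) -> None:
    """Write one pixel by computing its byte offset from the metadata."""
    if not (0 <= x < info.width and 0 <= y < info.height):
        raise ValueError("pixel outside the visible area")
    if len(color) != info.bytes_per_pixel:
        raise ValueError("color size does not match bytes_per_pixel")
    start = y * info.pitch + x * info.bytes_per_pixel
    fb[start:start + info.bytes_per_pixel] = color

# A 4x2 display with 4-byte pixels and rows padded out to 32 bytes.
info = FramebufferInfo(width=4, height=2, pitch=32, bytes_per_pixel=4)
fb = bytearray(info.pitch * info.height)
put_pixel(fb, info, 1, 1, bytes([0xFF, 0x00, 0x00, 0x00]))
```

The pitch is kept separate from width times bytes per pixel because rows in memory are often padded, so the row stride has to come from the metadata rather than be computed.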
If you would like to see an actually "interesting" boot, I recommend checking out how Raspberry Pis boot.
It is a unique monstrosity that boots from the video / GPU core instead of one of the ARM cores. It has an arcane undocumented architecture.
Soekris (rip) had an x86 network device. Four 10/100s and the disk was a CF. Could only serial console that thing - or SSH once it's running. Best router I ever had.
Also, in 2000 when Windows crashed you could get a serial debugger. Wonder if they still do that?
People still need to do driver development, so you can still set up a Windows PC to expose a kernel debug interface over a serial port: https://learn.microsoft.com/en-us/windows-hardware/drivers/d...
Maybe not linux specifically, but POST requires a video device software (BIOS Option ROM or UEFI GOP Drivers) of some sort does it not? That's been my experience with all PCs for 30 years. But maybe there are cases where it doesn't?
edit: Apparently it's a desktop motherboard firmware thing. Ubiquitous but not technically a requirement for POSTing a computer.
I've found AM4/AM5 boards will still boot Linux without a discrete or integrated GPU, running a GPUless CPU, not an APU.
I'm probably going to read this, but who thought putting light grey text on a white background was a good idea?
fascinating how it's all over the place wrt level of detail. and absolutely unreadable. luckily the layout is simple and reader mode works.
> Hex is base 16
i would argue that someone who understands bases (in the first place) and understands what the << operator does (the context where base 16 is explained), but doesn't understand what base 16 is, doesn't exist. this is the kind of haphazard approach of this article i'm talking about. even the author's name, 0xkato, is an example of this.
as to the content, i wish it had touched on TPM, PCRs, UEFI secure boot, and ME pre-boot.
i'm forgiving all the actual errors since it is a pretty broad overview.
i'm guessing first-year uni student.
rather amazed a post like this can make it to the #1 spot.
Funny how those three posts are in hacker news top 5 now. I guess today is the low level appreciation day.
edit: format
Oh hey, a fellow noticing person!
yes, and the bar is not at all at the same level.
weekend hackernews best hackernews