Wasted Neurons Wednesday - The Architecture of MS-DOS

I have a lot of neurons wasted on MS-DOS. Probably more than for any other technology I've ever worked with.

Which is a bit ironic, as most people would say that the entire problem with MS-DOS is that there's so little of it...

And I'll admit that MS-DOS has an architecture in the same way that a bivouac does - there are the beginnings of something there, but it's not exactly going to fill a textbook.

That very small architecture is pretty simply described with the following handy architecture diagram:
IO.SYS --> MSDOS.SYS --> COMMAND.COM

Yep. That's it. That's the whole of it all.

IO.SYS
IO.SYS provides a set of BIOS-like Input/Output routines. It's the Hardware Abstraction Layer of MS-DOS.
If you were lucky enough to have one of the first ever 40MB hard disks, you'd have found that it didn't actually work with an IBM Compatible PC. Considerable changes were needed to make MS-DOS handle more than 32MB (yes, megabytes) of disk - and those changes were made in IO.SYS. (I would like to point out that the 32MB limit is a partitioning and filesystem one, rooted in FAT12/FAT16 limitations. But that seems overly tedious, so I'll skip that bit.)
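For anyone who does want the tedious bit, the arithmetic is short. This is only a back-of-envelope sketch: it assumes 512-byte sectors and a sector number stored in 16 bits, which is the usual explanation for the limit, and the variable names are mine rather than anything from DOS.

```python
# Back-of-envelope arithmetic for the 32MB limit.
# Assumptions: 512-byte sectors, and a sector number that fits in 16 bits.
BYTES_PER_SECTOR = 512
SECTOR_NUMBER_BITS = 16

max_sectors = 2 ** SECTOR_NUMBER_BITS        # how many sectors can be addressed
max_bytes = max_sectors * BYTES_PER_SECTOR   # total addressable bytes
max_megabytes = max_bytes // (1024 * 1024)   # expressed in megabytes

print(f"{max_sectors} sectors x {BYTES_PER_SECTOR} bytes = {max_megabytes}MB")
```

Run out of sector numbers and you run out of disk, no matter how big the drive is.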
IO.SYS also creates the various DOS devices (CON, NUL, COM1/2, LPT1) that allow access to the relevant hardware.
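One side effect of those devices is that their names are reserved: ask DOS for a file called CON.TXT and you get the console, not a file. Here's a rough sketch of that rule. The function name is hypothetical, and the device list adds a few commonly documented names (AUX, PRN, CLOCK$, extra COM and LPT ports) to the ones mentioned above; the exact set varied between DOS versions.

```python
# Hypothetical helper: does a DOS filename actually refer to a device?
# The set is an assumption based on commonly documented names, not read from IO.SYS.
DOS_DEVICES = {
    "CON", "NUL", "AUX", "PRN", "CLOCK$",
    "COM1", "COM2", "COM3", "COM4",
    "LPT1", "LPT2", "LPT3",
}

def is_dos_device(filename):
    """Return True if the base name (path and extension ignored) is a device name."""
    # Drop any directory part; accept either slash style.
    base = filename.replace("/", "\\").rsplit("\\", 1)[-1]
    # "CON:" with a trailing colon is an accepted way to name a device.
    base = base.rstrip(":")
    # Drop a drive prefix such as "C:".
    if len(base) > 2 and base[1] == ":":
        base = base[2:]
    # Only the part before the first dot is compared.
    base = base.split(".", 1)[0]
    return base.upper() in DOS_DEVICES

print(is_dos_device("con.txt"))     # device name with an extension
print(is_dos_device("CONFIG.SYS"))  # an ordinary file
```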
The last (and first!) job of IO.SYS is to be the bootloader - it loads MSDOS.SYS, parses and processes CONFIG.SYS, and finally hands control over to the shell.

MSDOS.SYS
MSDOS.SYS provides the many and varied APIs that MS-DOS has. Interrupt 21h, and, um, interrupt 21h. Oh, and don't forget interrupt 21h!
OK, so MS-DOS was actually more than just interrupt 21h. But 99% of everything you'd ever ask DOS to do was on Interrupt 21h, so it usually felt like it was all that mattered. Many of the other interrupts in DOS were one-trick ponies, like interrupt 20h - which just stopped the current program - whereas interrupt 21h actually took options and did useful things like create files or folders.
In fact, if you look at this Wikipedia article on the MS-DOS API, you'll notice it really is almost all interrupt 21h. The other interrupts are just for disk reads and writes, or are reserved for DOS functions. The most obvious exception to this is interrupt 27h, which is used to Terminate and Stay Resident. (We are not going to cover TSRs today. Maybe in a future Wasted Neurons Wednesday.)
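The "took options" part is worth a quick illustration. A program picked the interrupt 21h function it wanted by putting a number in the AH register (the high byte of AX) before making the call. The sketch below models only that selection step in Python; the table holds a handful of well-known function numbers, and the function name and the wording of the descriptions are mine.

```python
# A few well-known interrupt 21h function numbers, selected via the AH register.
# This is a small illustrative subset, not the full API.
INT21_FUNCTIONS = {
    0x09: "Print string",
    0x30: "Get DOS version",
    0x39: "Create directory",
    0x3C: "Create file",
    0x3D: "Open file",
    0x3E: "Close file",
    0x4C: "Terminate with return code",
}

def describe_int21(ax):
    """Name the interrupt 21h function that a 16-bit AX value would select."""
    ah = (ax >> 8) & 0xFF  # high byte: the function number
    al = ax & 0xFF         # low byte: a per-function argument
    name = INT21_FUNCTIONS.get(ah, "not in this sketch")
    return f"AH={ah:02X}h AL={al:02X}h: {name}"

print(describe_int21(0x4C00))
print(describe_int21(0x3900))
```

One interrupt, one register, and a very long list of numbers to memorise.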

COMMAND.COM
Where IO.SYS provides the low level hardware access and emulated devices, and MSDOS.SYS provides the DOS API, COMMAND.COM is the public face of MS-DOS.
It is the thing that provides this famous string of characters:
C:\>
COMMAND.COM also handles all the "internal" DOS commands, such as COPY, DEL and MKDIR, for which there is no separate executable program.
COMMAND.COM doesn't provide much that any other programs need - the only parts of note are its default handlers for CTRL+BREAK and critical errors. Otherwise, COMMAND.COM is more of an interface layer than a critical part of the operating system.
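The internal/external split can be sketched in a few lines. This is a toy model, not COMMAND.COM's real logic: the function names are hypothetical, the internal command set is a small sample, and the only DOS behaviour it leans on is that external commands were looked up as .COM, then .EXE, then .BAT.

```python
# Toy model of how a DOS shell might decide what to do with a command line.
# INTERNAL_COMMANDS is a small sample, not the complete list.
INTERNAL_COMMANDS = {"COPY", "DEL", "MKDIR", "DIR", "CD", "TYPE", "REN"}
EXTERNAL_EXTENSIONS = (".COM", ".EXE", ".BAT")  # search order for programs on disk

def classify(command_line):
    """Return 'internal', 'external', or 'empty' for a typed command line."""
    words = command_line.split()
    if not words:
        return "empty"
    name = words[0].upper()
    return "internal" if name in INTERNAL_COMMANDS else "external"

def candidate_files(command_line):
    """List the filenames the shell would look for, in order, for an external command."""
    if classify(command_line) != "external":
        return []
    name = command_line.split()[0].upper()
    return [name + ext for ext in EXTERNAL_EXTENSIONS]

print(classify("copy a.txt b.txt"))
print(candidate_files("format a:"))
```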

 
And that's it. I'd draw a diagram, but it would just be three boxes on top of each other. Nothing to it.

Oh, and if you're using IBM PC DOS or some versions of Novell DOS, then the names of the two system files are different but the functions are the same:
IO.SYS == IBMBIO.COM
MSDOS.SYS == IBMDOS.COM
COMMAND.COM == COMMAND.COM (this filename stays the same as 3rd party programs might run it - for instance, to run batch files.)

Yet we ran entire industries on that architecture for over a decade.

And for some reason, I'm still wasting neurons on this stuff...