Windows Multitasking: A Historical Aside


As some of you may know, I came up in the world of microcomputers, having had my earliest experiences with PC-DOS 3.21 on an IBM PC (yes, the 5150). I first started as a simple user, merely running programs and playing with them. My first foray into what you might–if you were feeling particularly generous–call programming was writing DOS batch files, augmented by small utilities assembled directly into memory using DEBUG.COM. At that point, I was only reading the assembly code out of books and magazines, and typing it into DEBUG’s primitive user interface. What these strange PUSH and MOV and INT instructions did was still a mystery to me. I was nine years old.

After this, I dug out the elegantly-bound BASIC reference manual and started hacking away on simple programs in BASICA. I did this until I was about 14, when I got a 90MHz Pentium box with a whopping 16MB of RAM and Windows 3.11. By that time, I was familiar with CALL ABSOLUTE and using EXE2BIN. I got into Visual Basic programming somewhere in the neighborhood of v3, and started getting familiar with Turbo Pascal for Windows and dabbling with MFC apps in Visual C++ 1.0. VB3 compiled to threaded pseudocode: threaded in the sense that the binary output of the compiler was threaded with the interpreter that ran the code, and the interpreter was embedded in the compiled executable. VB3 also could not use Windows APIs that took callbacks as parameters, or APIs that used a technique called “subclassing”, nor could VB3 generate DLLs. Suffice it to say that any VB3 programmer who wanted to develop an application of any real complexity would, at some point, resort to implementing a DLL in another, more capable language.

During my foray into VB3, 4, and 5–the latter of which finally got native code compilation and an “AddressOf” operator to support callbacks–I became intimately familiar with the strengths and weaknesses of the Windows operating system (3.1, 95, 98, and NT 4, specifically), as well as many obscure details of their internal implementations. Somewhere along the way, I became an avid fan of IBM’s OS/2 Warp and Commodore’s Amiga OS, and finally got into Linux in 1996, and various commercial UNIX implementations and the Macintosh after that. While I can say that many of these alternatives to Microsoft products are technically superior, and while IBM’s OS/2 and DEC’s OpenVMS have a slight edge for me over the others, I’ll never swear allegiance to one operating system or family of operating systems to the exclusion of all others. I find them all to have their interesting or entertaining points. While I don’t use Windows, other than for the occasional game of World of Warcraft, or when I’m called professionally to develop a Windows application, I try to be as accurate in my criticism of it as possible, and avoid blanket statements.

To that end, I want to address (and possibly dispel) a myth about the operating system family that many of my fellow geeks love to hate more than all others. You guessed it: Windows.

Windows doesn’t have “real” multitasking

This myth pops up from time to time with several variants, ranging from “Windows doesn’t have real multitasking” to “Windows only got real multitasking in version [insert Windows version here]”, and for the most part, this is false.

To understand this, we need to first understand that there are two distinct product lines that have borne the name “Windows” over the years. The first, consumer-oriented family includes Windows 1.x, 2.x, /286, /386, 3.0, 3.1, 3.11, 95, 98, and an abomination of a product called Windows Me. This Windows product line started off as something of a graphical shell for MS-DOS, taking over more and more core OS functions until Windows 95 finally reduced the role of DOS to being essentially a boot loader and an environment which could be called up on-demand to run old DOS applications. Versions 1.x-3.1 were strictly 16-bit environments, running 16-bit code in one of three modes:

  • Real mode: essentially used the processor of the machine as a fast 8086, with a 1MB address space and no virtual memory or hardware support for multitasking. Task switching had to be “faked” purely in software. Introduced in Windows 1.0, and removed in Windows 3.1.
  • Standard mode: used the native mode of the rather braindead 80286 processor, even on 386 and higher systems. Increased the memory addressing capabilities of Windows applications and provided some degree of hardware support for memory protection. Could only run a single MS-DOS application under Windows, although Windows applications could be cooperatively multitasked. Introduced in Windows/286, removed in Windows 3.11.
  • 386 Enhanced mode: used the protected mode of the 80386 and higher processors. The 80386 provided a “virtual 8086” feature to run multiple DOS sessions in what amounts to a set of virtual machines while the CPU was in protected mode. Windows applications were still 16-bit, and ran together in 16-bit protected mode inside a single “system” virtual machine. Disk and file access were still provided by DOS or system BIOS calls running in virtual 8086 mode, until Windows for Workgroups 3.11 introduced 32-bit disk, file, and network access, which essentially made that product a DOS replacement. Introduced in Windows/386, continued in essence until Windows Me.
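The 1MB limit mentioned under real mode falls straight out of how an 8086 forms addresses: a 16-bit segment is shifted left four bits and added to a 16-bit offset, giving a 20-bit physical address. Here is a small sketch of that arithmetic in Python. The function name real_mode_address is my own, and the wrap-around at 20 bits models the original 8086 only (later processors can behave differently at the very top of the range).

```python
ADDRESS_SPACE_BYTES = 1 << 20  # 2**20 bytes, i.e. 1MB


def real_mode_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address an 8086 forms from segment:offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left four bits (multiply by 16), add the offset,
    # and keep only 20 bits, because the 8086 has 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # B800:0000 is the classic colour text-mode video buffer.
    print(hex(real_mode_address(0xB800, 0x0000)))
```

Every address an application could name therefore lands somewhere in that 1MB range, which is why real-mode Windows had so little room to work with.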

These versions of Windows are mostly implemented in C and assembly language, are non-portable and intimately tied to Intel x86 hardware, did not support multiple threads before Windows 95, and do not support more than one CPU or CPU core in a system.

The other family of Windows operating systems is Windows NT, including Windows NT 3.1, 3.5, 3.51, 4.0, Windows 2000, Windows XP, Windows Vista, Windows 7, and Windows 8. The Windows NT family shares no code with the other Windows products, and its kernel was designed from the ground up by Dave Cutler, who also designed OpenVMS. The product line was fully 32-bit at its inception, later adding a 64-bit version. The NT kernel was designed for portability, having run not only on x86 CISC hardware, but also AXP, PowerPC, and MIPS RISC hardware. NT also supports multiple threads, as well as compatibility layers supporting the rudimentary APIs from the POSIX standard and OS/2 1.x.

So, what constitutes “real” multitasking?

Most computer geeks–myself included–would say that a true multitasking operating system should be “preemptive”. This means that the operating system’s scheduling algorithm may, at any time, suspend the currently-running process, save its state, and hand over control to another process. This is in contrast to “cooperative” multitasking, where the scheduler may only become involved when a process “yields”. MacOS through version 9 and Windows 1.x-3.11 are all examples of cooperative multitasking systems. In Windows 1.x-3.x, this yielding is done by means of a particular system call included in every application’s main event loop. Cooperative multitasking can work well only if application programmers are careful to yield, as one process failing to yield will irretrievably freeze all of the others, and the only way to recover is to reboot the entire computer. This approach does, however, make certain aspects of OS design easier: because each application must explicitly yield control to the scheduler, system call implementations never have to be designed as reentrant code (all system calls will have returned by the time the application yields control to the scheduler). This distinction is key to understanding some of the difficulties when we look at the multitasking implementation in Windows 95, 98, and Me, so do try to remember it.
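To make the cooperative model concrete, here is a minimal sketch in Python, with generators standing in for applications and a bare `yield` standing in for the yielding system call. Everything here (run_cooperative, polite, greedy, demo_cooperative) is a hypothetical name of my own, not anything from Windows. The "greedy" task is deliberately bounded so that the example terminates; a real non-yielding application would simply never give control back.

```python
from collections import deque


def polite(name, units, log):
    """A well-behaved task: one unit of work, then yield to the scheduler."""
    for _ in range(units):
        log.append(name)
        yield


def greedy(name, units, log):
    """A badly-behaved task: does all of its work before yielding once."""
    for _ in range(units):
        log.append(name)
    yield


def run_cooperative(tasks):
    """Round-robin scheduler that regains control only when a task yields."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # runs until the task chooses to yield
        except StopIteration:
            continue            # task finished; do not reschedule it
        ready.append(task)


def demo_cooperative():
    log = []
    run_cooperative([polite("A", 2, log), greedy("G", 3, log), polite("B", 2, log)])
    return log


if __name__ == "__main__":
    print(demo_cooperative())
```

In the resulting log, the units of work for A and B interleave, while all of G's work appears in one uninterrupted run: the scheduler has no way to step in until G decides to yield. A preemptive scheduler would not need that cooperation.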

Windows NT has used preemptive multitasking since its introduction, and its system calls have been reentrant from the beginning. We will not discuss it further here.

Windows 95: Preemptive multitasking with compromises

Windows 95, as I mentioned before, introduced support for 32-bit applications to the consumer-oriented Windows product family. It also introduced preemptive multitasking and multithreading to this Windows product line for the first time. This implementation was not, however, without compromises. To understand these compromises, let’s look at the design constraints under which Windows 95 was built:

  • It needed to run at least as fast as Windows 3.1 on an 80386 CPU with 4MB of RAM
  • It needed to be more stable than Windows 3.1
  • It needed to support 32-bit applications
  • It needed to run Windows 3.1 and DOS applications with minimal compatibility problems
  • Support for preemptive multitasking and multithreading was considered a high priority, although 16-bit applications would still share a single memory space and would themselves be cooperatively multitasked, as other approaches would have introduced timing and compatibility issues with existing 16-bit applications

To untangle this thread, let’s look at how the Windows API is structured. There are three main components to the Windows API, namely, KERNEL, USER, and GDI. KERNEL provides services like file access and memory management to application processes running in CPU ring three (least privileged). USER provides support for the common GUI elements, such as windows, text boxes, list boxes, and radio buttons. GDI (for Graphics Device Interface) provides device-independent 2D drawing primitives, font rendering, color management, and support for printing. They are all implemented as DLLs, or Dynamic Link Libraries, which are analogous to shared libraries in the Linux/UNIX world. The main things to consider about these APIs, with regard to Windows 95’s multitasking support, are the “bit-ness” of each implementation and whether it is reentrant.

The thousands of API calls supported by these three libraries amounted to approximately an 800K memory footprint in their 16-bit Windows 3.11 implementations. Given that 16-bit Windows applications still had to run unmodified on Windows 95, and given a maximum memory footprint of about 3MB for the entire operating system, providing both 16-bit and 32-bit versions of these libraries would have been clearly impossible. Instead, the designers chose to provide something called the “flat thunk layer”, which is essentially a thin set of shims allowing 32-bit code to call 16-bit code, and vice-versa. KERNEL did get the luxury of a fully 32-bit version, for many reasons (mostly involving memory management), but the 32-bit versions of USER and GDI were essentially shims calling into their 16-bit counterparts.
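To give a feel for what a thunk does, here is a toy model in Python (which is, of course, nothing like what the real thing was written in). A 32-bit entry point narrows its arguments to 16-bit values and hands them to the 16-bit routine that does the actual work. The names gdi16_line_to and gdi32_line_to are invented for illustration, and rejecting values that do not fit in 16 bits is an assumption of this sketch, not a description of how the real flat thunks treated them.

```python
import struct


def gdi16_line_to(packed_args: bytes) -> tuple:
    """Hypothetical 16-bit routine: expects two signed 16-bit coordinates."""
    x, y = struct.unpack("<hh", packed_args)
    return ("LineTo16", x, y)


def gdi32_line_to(x: int, y: int) -> tuple:
    """Hypothetical 32-bit entry point that thunks down to the 16-bit routine."""
    for value in (x, y):
        # Assumption of this sketch: refuse anything a 16-bit word cannot hold.
        if not -32768 <= value <= 32767:
            raise ValueError(f"{value} does not fit in a signed 16-bit word")
    # Repack the arguments in the layout the 16-bit side expects and call it.
    return gdi16_line_to(struct.pack("<hh", x, y))


if __name__ == "__main__":
    print(gdi32_line_to(10, -20))
```

The point of the design is that only one real implementation exists: the 32-bit side carries almost no code of its own, which is how the memory budget was kept.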

The difficulty at this point was that the 16-bit USER and GDI libraries were not reentrant code, and could not have been made reentrant without breaking backwards compatibility with existing 16-bit applications, and preemptive multitasking was a requirement for 32-bit applications.

Enter Win16Mutex

So, the process scheduler for Windows 95 was indeed preemptive for 32-bit applications, and it could preempt the entire backwards-compatibility subsystem for running 16-bit applications, although doing so would require halting all such applications running on the system. However, as I mentioned before, all applications in Windows 95 called into 16-bit code when using functions from the USER or GDI libraries, and this code was not reentrant. So, Windows 95’s solution was to set a mutex called Win16Mutex every time execution entered 16-bit code, whether because the scheduler handed control to a 16-bit process or because a 32-bit application called down into USER or GDI, so that 16-bit code could never be re-entered as a result of process preemption. When the 16-bit code returned, the mutex was cleared.
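The idea can be sketched with an ordinary lock. In the Python below, NonReentrantLibrary is a made-up stand-in for the 16-bit USER and GDI code (it raises an error if it is ever entered twice at once), and win16_mutex plays the role of Win16Mutex. All of the names are hypothetical, and Python threads are only a loose analogy for preemptively scheduled processes.

```python
import threading

win16_mutex = threading.Lock()  # stands in for Win16Mutex


class NonReentrantLibrary:
    """Toy stand-in for 16-bit code that must never be entered twice at once."""

    def __init__(self):
        self.inside = False
        self.calls = 0

    def call(self):
        if self.inside:
            raise RuntimeError("re-entered non-reentrant code")
        self.inside = True
        self.calls += 1      # unsafe shared state, protected only by the mutex
        self.inside = False


def thunked_call(lib):
    """Take the mutex on the way into 16-bit code, release it on the way out."""
    with win16_mutex:
        lib.call()


def demo_serialized_calls(threads=4, calls_each=1000):
    lib = NonReentrantLibrary()
    errors = []

    def worker():
        try:
            for _ in range(calls_each):
                thunked_call(lib)
        except RuntimeError as exc:
            errors.append(exc)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return lib.calls, errors


if __name__ == "__main__":
    print(demo_serialized_calls())
```

However the threads happen to be scheduled, only one of them is ever inside the library at a time, so the non-reentrant code never notices that preemption exists.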

This approach kept the memory requirements low, but also created an ugly hole where the benefits of preemptive multitasking could be completely lost: if any process crashed while Win16Mutex was set, no other processes could use the core APIs of the system, potentially causing just as bad a freeze as a non-yielding application in a cooperative multitasking system. This was carried forward in Windows 98 and Windows Me, and is the cause of many of these three systems’ stability problems. However, preemptive multitasking has been a part of even these rather mediocre operating systems for some time.
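The failure mode is easy to reproduce in miniature. In this hedged Python sketch, one party takes a lock standing in for Win16Mutex and never gives it back, and a second thread then tries to make a "USER or GDI call" of its own. The name demo_stuck_holder and the timeout are assumptions added so that the example finishes; nothing in the text suggests the real system gave up waiting.

```python
import threading


def demo_stuck_holder(timeout=0.05):
    mutex = threading.Lock()  # stands in for Win16Mutex
    mutex.acquire()           # the holder dies here and never releases it

    results = []

    def other_app():
        # Any other application now blocks on its next call into 16-bit code.
        got_it = mutex.acquire(timeout=timeout)
        results.append(got_it)
        if got_it:
            mutex.release()

    t = threading.Thread(target=other_app)
    t.start()
    t.join()
    return results


if __name__ == "__main__":
    print(demo_stuck_holder())
```

The second thread is still being scheduled preemptively the whole time; it just cannot do anything useful, which from the user's chair looks exactly like a frozen machine.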


Hopefully, this article has been an interesting view into some challenges faced by engineers working for “the enemy” back in the 1990s, and perhaps into operating system design considerations in general.