MS-DOS
Introduction
The operating system is the most significant item of system software and is present in almost every computer system, except for a few very specialized applications. In this chapter we will study two operating systems: MS-DOS and WINDOWS. MS-DOS has been one of the world’s most extensively used operating systems and, being a single-user system, it is much simpler than multi-user operating systems such as UNIX. More specifically, DOS provides a method for organizing and using the information stored on disks, application and system programs, and the computer system itself. DOS also locates information on disk or in memory, instructs the computer how to read the information that we have stored on disk without our needing to know the physical location of the data, and controls other physical devices.
WINDOWS, on the other hand, is an operating system with a graphical user interface. Earlier versions of WINDOWS effectively sat ‘on top’ of MS-DOS, but later versions, such as WINDOWS 95/98 and WINDOWS NT, removed the need for MS-DOS as a separate entity.
Brief History of MS-DOS
DOS is a single-user operating system. The history of MS-DOS is intimately associated with that of the IBM Personal Computer and, of course, its ‘compatible’ computers. In the early 1980s, IBM introduced a number of personal computers based on a 16-bit microprocessor chip, the Intel 8088. IBM wanted to bring the PC to market as early as possible and did not have enough time to develop its own operating system. IBM therefore approached Digital Research, but no agreement was reached.
In 1980, Tim Paterson of Seattle Computer Products wrote an operating system, called 86-DOS, for the company’s Intel-based products. 86-DOS was designed to be similar to CP/M. Microsoft purchased 86-DOS and, working with IBM, developed it into a commercial product. Finally, in 1981, the IBM PC was introduced with PC-DOS 1.0 (for IBM machines) and MS-DOS 1.0 (for non-IBM machines). In 1983, version 2.0 of MS-DOS appeared, which was a major advance in the design of the system. This version met the requirements of a newly announced IBM computer, the PC/XT, and introduced a hierarchical file directory structure modeled on that of UNIX.
Since then several versions have appeared. The latest version of DOS is 7.1, which came in 1998 with various small amendments to 7.0. However, a large number of PCs in India still use versions 3.0 to 6.0. Table 7.1 shows the step-by-step development of MS-DOS.
Actually DOS comes in two flavors:
PC-DOS (Personal Computer Disk Operating System) – This operating system was written by the Microsoft Corporation and is used on a variety of IBM PC models and compatible microcomputers.
MS-DOS (Microsoft Disk Operating System) – This operating system was also written by the Microsoft Corporation. It is used on non-IBM systems, that is, compatible systems based on Intel’s 80x86 family of microprocessors. The reason is that MS-DOS was written in Intel 8088 assembly language.
From here on, we use the term DOS rather than MS-DOS.
| Version | Year | Main Features |
|---|---|---|
| 1.0 | 1981 | Based on IBM PC, single-level directory structure, floppy disk storage only |
| 2.0 | 1983 | Based on IBM PC/XT, built-in 10 MB hard disk, hierarchical directory structure |
| 3.0 | 1984 | Based on IBM PC/AT, built-in 20 MB hard disk, RAM disks, read-only files |
| 3.1, 3.2, 3.3 | 1984 to 1987 | Support for networks, 3½-inch floppy disks, multiple 32 MB disk partitions, support for new IBM PS/2 computers |
| 4.0 | 1988 | Window-based command shell, up to 2 GB disk partition |
| 5.0 | 1991 | Improved shell and extended commands |
| 6.0 | 1993 | Improved efficiency and functionality of overall system, such as doubling the space available on your hard disk, or making more available to each application |
| 6.2 | 1993 | Various small amendments to 6.0 |
| 7.0 | 1995 | Included virtual memory management, process management and multiprogramming |
| 7.1 | 1998 | Various small amendments to 7.0 |
Table 7.1: Main Features of MS-DOS
- MS-DOS was initially designed as a relatively simple single-user system. Because it is essentially a single-user system, multi-user applications have to be implemented using a network.
- MS-DOS is designed to provide us with commands to manage data on disks. It contains a host of commands and programs that enable it to store information on any disk connected to our computer system.
- MS-DOS also controls other devices that are connected to the computer system.
- MS-DOS version 3.3 supported disk partitions of up to 32 MB, while user program memory was limited to 640 KB.
- The standardization and openness of the MS-DOS architecture have greatly benefited software and hardware developers and computer users, resulting in a rich market of high-quality products such as WordStar, Lotus 1-2-3, Borland’s Turbo languages, etc.
- Many application programs are considered obsolete the moment a new release is announced, but this is not necessarily true of MS-DOS, because earlier versions are often sold and supported for some time after the newest version is introduced. This is so because the earlier version still has all the capabilities required by some users, and new versions are always backward compatible.
- MS-DOS provides a graphical shell interface, DOS Shell, which gives us the means to visualize and perform complicated operations without having to remember every single step the system must take to accomplish its tasks.
- Since about 1983, various versions of window-based interfaces have been available as a partner to MS-DOS. Originally Windows was an application program that extended the capability of the underlying MS-DOS system, but now MS-DOS is run as an application program under the Windows system, to provide continuity for previous MS-DOS users.
How to Install DOS?
DOS can be loaded into the RAM of your PC either from a floppy disk or from the hard disk drive. Whichever method is used, three system files, IO.SYS, MSDOS.SYS and COMMAND.COM, are loaded into RAM. IO.SYS and MSDOS.SYS are hidden, read-only files (that is why you cannot list them normally), whereas the third one, COMMAND.COM, is the command processor.
If any of these three files is missing, the system reports the following message:
Non-System disk or disk error
Replace and strike any key when ready
The computer system thus gives the user a chance to insert the proper diskette into the disk drive so that DOS can be loaded properly and the system can respond in an interactive mode. If everything is in order, DOS announces that it is ready for work by displaying the DOS prompt:
A:\> if DOS is loaded from a floppy diskette
Or
C:\> if DOS is loaded from the hard disk
When we see the DOS prompt A:\> or C:\>, we are talking to DOS, and DOS is waiting for us to tell it what to do. That is, DOS is waiting for a DOS command, an instruction about what to do next.
Terms You Should Know
Before discussing DOS further, you should know some basic terminology used when working in the DOS environment.
Program – A program is a set of instructions written in a computer language. A set of programs is called software or application programs. These instructions are stored in files and tell your computer to perform a specific task.
Files/Directories – A file is a collection of related information. Files are usually stored on disks and could contain letters, reports, memos, and so on. Similarly, a directory contains information about files, such as the names of files, their sizes, and other file-related information. You will study the details of files and directories in a subsequent section.
Volume Label – When you use a brand new disk, you can easily put a label on the outside of it to help you identify its contents. You can also give each of your disks an internal name, called a volume label. You can look at the volume label on a disk by displaying its directory.
Disk Drive and the DOS Prompt – Generally files and programs are stored on a floppy disk or a hard disk. Floppy disks are usually referred to as the A drive and the B drive, whereas a hard disk is usually referred to as the C drive. If you do not specify a drive name when you type a filename, DOS automatically searches for the file on the disk in the default drive. In other words, the default drive is where DOS searches first when you type a command. When DOS is ready to receive a command from the user, it displays a prompt that contains the default drive letter followed by a colon (:), a backslash (\) and a greater-than sign (>), as in:
C:\>
Now DOS searches for files and programs on the hard disk. However, you can change the default drive by typing the letter of the desired drive followed by a colon and pressing the Enter key. For example, if you want to work with files on drive A, you can change the default drive to A as follows:
C:\>A:
Command – DOS commands are special programs that let you work with entire files. When you type DOS commands, you are asking the computer to perform special tasks. For example, if you want to copy the files on the DOS disk, you use the diskcopy command. You will learn more about DOS commands in subsequent sections.
Devices & Device Names – When you interact with your computer system, you enter information through an input device, say a keyboard, and expect the result from an output device, say a monitor. In other words, the computer uses pieces of hardware called devices to receive input and produce output. Device names are special names given to each device that your computer system knows about.
Directory Structure of DOS
DOS organizes data in terms of files and directories. A file is basically a collection of related information. From the operating system’s point of view, a directory is a file of files, that is, a file containing detailed information about the other files belonging to that directory. Each directory entry contains the following information:
- Filename
- Extension
- Size
- Date last modified
- Time last modified
- Starting location of the file on the disk and
- File attributes
When a file is referenced, DOS uses this information to locate the file. Since directories reside on disk, they are considered special files.
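As a rough illustration of the fields listed above, the sketch below unpacks one directory entry with Python's struct module. It assumes the classic 32-byte FAT entry layout (8-byte name, 3-byte extension, 1 attribute byte, 10 reserved bytes, then time, date, starting cluster and size). The function names and the sample entry are invented for the example and are not taken from a real disk.

```python
import struct
from datetime import datetime

# Assumed layout of a classic 32-byte FAT directory entry (little-endian):
# name(8) ext(3) attributes(1) reserved(10) time(2) date(2) start cluster(2) size(4)
ENTRY_FORMAT = "<8s3sB10sHHHI"


def parse_dir_entry(raw: bytes) -> dict:
    """Decode one 32-byte directory entry into its named fields."""
    name, ext, attr, _reserved, time, date, start, size = struct.unpack(ENTRY_FORMAT, raw)
    # Date packs year-1980 (7 bits), month (4 bits), day (5 bits);
    # time packs hours (5 bits), minutes (6 bits), seconds/2 (5 bits).
    modified = datetime(
        1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F,
        time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2,
    )
    return {
        "filename": name.decode("ascii").rstrip(),
        "extension": ext.decode("ascii").rstrip(),
        "attributes": attr,
        "modified": modified,
        "start_cluster": start,
        "size": size,
    }


def build_sample_entry() -> bytes:
    """Build a hypothetical entry (values chosen only for illustration)."""
    date = ((1993 - 1980) << 9) | (3 << 5) | 10  # 10 March 1993
    time = (6 << 11) | (0 << 5) | 0              # 06:00:00
    return struct.pack(ENTRY_FORMAT, b"COMMAND ", b"COM", 0x20,
                       bytes(10), time, date, 2, 52925)


entry = parse_dir_entry(build_sample_entry())
print(entry)
```

Note that the entry stores only the starting location of the file; the remaining locations have to be found elsewhere, which is why the starting cluster is kept as a single number.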
Like other operating systems, DOS has a hierarchical directory structure. This means that a directory can contain other directories as well as files. Every disk must have at least one directory, called the root directory, which must be located in the first track. The root directory can have a fixed number of entries, depending upon the capacity of the disk. DOS installs the root directory when it prepares the disk. The root directory is identified in DOS by the symbol ‘\’.
You can create your own directories, which are subdirectories of the root directory. In each subdirectory you can create further subdirectories, which in turn can have subdirectories, as shown in figure-1. In other words, you can create as many subdirectories as you want. Such an organized directory structure is called a multilevel or hierarchical directory system. The first level in a multilevel directory is the root directory. As you create new directories for groups of files, or for other users of the computer, the directory system grows. Within each new directory (subdirectory) you can add new files or create new subdirectories.
From this figure, it is clear that there are four subdirectories under the root directory, named User1, User2, User3 and User4. The User1 directory contains one file, Fl1. The User2 directory contains one subdirectory, named Forms, and one file, Fl2. The User3 directory has one file, named Fl3, and User4 contains one directory, named Letters, and one file, named Fl4. Similarly, the subdirectories Forms and Letters contain directories as well as files.
Figure 1
You can go to any directory by starting at the root directory. The directory that you are in is called the working or current directory. When your computer system is first started, the root directory is the working directory. Similarly, when you create a file, it is created in the working directory.
In a multilevel directory structure, you must tell DOS where the files are located in the directory system. To reach any other directory or file from the current directory, you can specify the complete path. For example, if you are currently working in the root directory and you want to list the files in the Accounts directory under Forms in User2, you type the following command:
C:\>dir \User2\Forms\Accounts
In DOS a backslash is used to separate directories from other directories and files. Here the first backslash indicates the root directory. So this command travels from the root directory to the User2 directory, then to the Forms directory, and finally displays all filenames in the Accounts directory.
Path and Pathnames
You can easily move from one directory to another by using a pathname. A pathname is a sequence of directory names followed by a filename. A path differs from a pathname in that it does not include a filename. Each directory name is separated from the previous one by using a backslash (\). For example,
\User2\Forms\Accounts\Fl9
is the pathname of the Fl9 file in Accounts and
\User4\Letters\Fl8
is the pathname of the Fl8 file in Letters. If you specify a pathname starting from the root directory, it is referred to as an absolute pathname. On the other hand, if your working directory is, say, User2, then you can specify the pathname of the Fl9 file in Accounts as:
Forms\Accounts\Fl9
Here the pathname does not start with a backslash (\). The absence of a leading backslash tells the operating system’s command interpreter that the path is relative to the current directory. Such a pathname is called a relative pathname.
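The difference between absolute and relative pathnames can be sketched in a few lines of Python. This is only an illustration of the rule described above, not how the DOS command interpreter is implemented. It uses the standard ntpath module, which understands backslash-separated paths on any platform. The helper name resolve is hypothetical.

```python
import ntpath


def resolve(current_dir: str, pathname: str) -> str:
    """Return the absolute pathname that `pathname` refers to."""
    # A leading backslash means the pathname starts at the root directory.
    if pathname.startswith("\\"):
        return ntpath.normpath(pathname)
    # Otherwise the pathname is taken relative to the current directory.
    return ntpath.normpath(ntpath.join(current_dir, pathname))


absolute = resolve(r"\User2", r"\User4\Letters\Fl8")
relative = resolve(r"\User2", r"Forms\Accounts\Fl9")
parent = resolve(r"\User2\Forms", r"..\Fl2")

print(absolute)  # \User4\Letters\Fl8
print(relative)  # \User2\Forms\Accounts\Fl9
print(parent)    # \User2\Fl2
```

The last call uses the ".." entry to step up one level from the current directory before descending again.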
Merits of Hierarchical Directory Structure
- In a hierarchical directory structure we can have two different files with the same user-assigned name in different directories, because the full access paths of the two files are distinct. But remember that file names within a given directory must be unique.
- Hierarchical directories provide easy sharing of files, thereby avoiding the need to copy an entire file or directory. This is achieved by using the concept of linking, in which we create links to an already existing directory or file.
- Hierarchical directories also provide better protection.
- In a hierarchical directory, the reviewing, listing, and general manipulation of shorter topical subdirectories is much more convenient than dealing with a single, large flat directory.
Implementation of a Directory Structure
So far we have seen the different aspects and merits of the hierarchical directory structure. Now we will see how it is implemented internally.
Hierarchical directories are usually implemented by separating the symbol tables of names from the physical description of files. File names are kept in symbolic file directories (SFDs). For each directory and subdirectory there is a separate symbolic file directory in the system. The symbolic file directories shown in figure-2 describe the file system introduced in figure-1. The primary use of a symbolic file directory is to establish a correspondence between a user-assigned file name and an internal system-assigned file ID. Each physical file is assigned a unique internal ID that is used by the system to reference the file. This means that an SFD gives only the file name and its ID number, which is used as an index into the basic file directory (BFD).
| File ID | Type | Size | Usage Count | Access Rights | Address (block no.) | Other Information |
|---|---|---|---|---|---|---|
| 1 | BFD | 10 | 1 | RW | 5 | … |
| 2 | DIR | 1 | 1 | RO | 2 | … |
| 3 | DIR | 1 | 1 | RO | 18 | … |
| 4 | DIR | 1 | 1 | WO | 26 | … |
| 5 | DIR | 1 | 1 | RW | 32 | … |
| 6 | TXT | 240 | 1 | RO | 9012 | … |
| 7 | OBJ | 46 | 1 | RW | 456 | … |
| 8 | DIR | 1 | 1 | RO | 208 | … |
| 9 | TXT | 126 | 1 | WO | 1404 | … |
| 10 | DIR | 1 | 2 | RW | 102 | … |
| 11 | TXT | 4028 | 1 | WO | 56 | … |
| 12 | DIR | 1 | 1 | RO | 78 | … |
| 13 | TXT | 312 | 1 | WO | 4226 | … |
| 14 | OBJ | 76 | 1 | RO | 2556 | … |
| 15 | … | … | … | … | … | … |
| 16 | DIR | 1 | 1 | RO | 190 | … |
| 17 | … | … | … | … | … | … |
| 18 | … | … | … | … | … | … |
| 19 | DIR | 1 | 1 | RW | 288 | … |
| … | … | … | … | … | … | … |
| … | … | … | … | … | … | … |
Figure 2 Basic File Directory (BFD)
Figure 2 Symbolic File Directories (SFDs)
The BFD gives information about each physical file, including its type, size, access rights, usage count, and other file-related information, but not its name. For the sake of simplicity and efficiency, the internal file ID is used as an index into the BFD. This is shown in figure-2. From this figure it is clear that if the same physical file is shared by two different directories, under the same name or different names (say, the directory Project in both the Rahul and Pooja directories), then there will be one entry in each of the two SFDs, with the appropriate (same or different) file names but both containing the same file ID number. The usage count in the BFD entry will be 2 in this case. Note that an SFD entry does not have a usage count.
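The relationship between SFDs and the BFD can be modeled with two dictionaries. The sketch below is a simplified illustration, not an actual implementation. It assumes, purely for the example, that the shared Project directory is file ID 10 (the one row in the BFD table with a usage count of 2). The names BfdEntry, link and lookup are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BfdEntry:
    """Physical description of a file; note that it holds no name."""
    file_type: str
    size: int
    usage_count: int
    access: str
    address: int


# BFD indexed by file ID. Values mirror row 10 of the table, except that
# the usage count starts at 0 and is raised as links are created.
bfd = {10: BfdEntry("DIR", 1, 0, "RW", 102)}

# One SFD per directory: each maps a user-assigned name to a file ID.
sfds = {"Rahul": {}, "Pooja": {}}


def link(directory: str, name: str, file_id: int) -> None:
    """Add an SFD entry for an existing file and bump its usage count."""
    sfds[directory][name] = file_id
    bfd[file_id].usage_count += 1


def lookup(directory: str, name: str) -> BfdEntry:
    """Resolve a name to its physical description via the file ID."""
    return bfd[sfds[directory][name]]


link("Rahul", "Project", 10)
link("Pooja", "Project", 10)

print(lookup("Rahul", "Project"))
print(lookup("Pooja", "Project").usage_count)  # 2
```

Both lookups reach the same BFD entry, so the file itself exists only once even though it appears in two directories.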
Since directories are stored and accessed as files, the BFD in figure-2 contains entries for all system files as well as user-defined files, including directories, subdirectories and the BFD itself. Another point to note is that the root SFD and the BFD are assigned fixed IDs and stored at known disk addresses. Figure-2 shows that the BFD has file ID 1 and starts at block number 5, and the root SFD has file ID 2 and starts at block number 2.
Directories may be of varying length, so EOF (End Of File) is used to mark the end of a directory. Whenever a new file or directory is created, the operating system looks for an empty directory entry. Now look at the entries marked in the SFDs. Each SFD also has two special entries marked as “.” and “..”. The entry with a single dot “.” refers to the directory itself, and the entry with double dots “..” refers to its parent directory. In other words, the single dot answers the question “where am I now?” and the double dots are used to go one level up from the current working directory.
File System Implementation in MS-DOS
Disks are normally divided into 512-byte physical sectors, and in MS-DOS space is allocated in terms of clusters. A cluster is simply a multiple of the disk’s physical sector size, which is usually 512 bytes. Typical cluster sizes are 512, 1024 and 2048 bytes. The cluster is therefore the basic unit of allocation. In MS-DOS, when a file is created it consists of a single cluster (the minimum space assigned to a file, even if it contains only a few bytes). If the file is later extended, the cluster will eventually fill up and more clusters can be assigned to the file as needed. When allocating disk space to a file, MS-DOS employs a chained allocation strategy in which a table of pointers, separate from the data, is maintained. This table is called the File Allocation Table (FAT).
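The effect of cluster-based allocation can be worked out with a little arithmetic. The sketch below follows the description above, in which even a tiny file occupies one whole cluster. The function names and the 1024-byte default are assumptions made for the example.

```python
def clusters_needed(file_size: int, cluster_size: int = 1024) -> int:
    """Number of clusters allocated to a file of `file_size` bytes."""
    # Round up to a whole number of clusters; per the description above,
    # a file is given at least one cluster.
    return max(1, -(-file_size // cluster_size))


def slack_bytes(file_size: int, cluster_size: int = 1024) -> int:
    """Allocated but unused bytes at the end of the last cluster."""
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size


for size in (10, 1024, 1025, 3000):
    print(size, clusters_needed(size), slack_bytes(size))
```

A 10-byte file still consumes a full 1024-byte cluster, which is why larger cluster sizes waste more space on small files.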
When we format a disk with the format command, MS-DOS writes a FAT onto the disk and creates an empty directory, called the root directory. MS-DOS does not keep all of a file’s disk addresses in its directory entry; rather, it keeps track of file blocks via the FAT. This table has one entry for each cluster on its enclosing volume. The values in the FAT are pointers to other FAT entries. Thus, instead of the clusters themselves being chained together, the appropriate links are defined by a chain of entries within the FAT. The directory entry contains the number of the first file block (the starting cluster address), and this number is used as an index into the FAT. The addresses of the other clusters of a file can be obtained by repeating this process and following the chain of pointers contained in the FAT.
Two identical copies of the FAT are kept on each volume for safety, and the FAT is stored at a predefined location. In earlier versions of MS-DOS a complete copy of the FAT was kept in main memory in order to speed up disk processing. In later versions, as hard disk sizes grew, only the actively used portions of the FAT are brought into memory as needed.
Each entry in the FAT (except the first two) corresponds to one cluster of disk space. The FAT also keeps track of the free space on the disk, so that room can be found for new files. Initially all FAT entries are set to zero, indicating a free cluster. The first entry of the FAT indicates the disk system type, while the second entry is always filled with a fixed hexadecimal value. Special reserved values are used in FAT entries to mark the end of a file, free (unused) clusters and bad clusters.
Suppose we have two files, file-A and file-B. File-A uses 3 blocks, say block numbers 3, 5 and 9, and file-B uses 4 blocks, say block numbers 10, 8, 4 and 15. To read file-A we start with block 3 and follow the chain all the way to the end. File-B can be read in the same way, starting from block number 10. The end of the chain of pointers is identified by the code FFFF. Unused entries (free clusters) are marked by 0000.
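The chain-following procedure for file-A and file-B can be sketched as follows. This is a toy model: the FAT is a plain Python list with 16 entries, the helper names are hypothetical, and the special meaning of the first two FAT entries is ignored for simplicity.

```python
END_OF_CHAIN = 0xFFFF  # marks the last block of a file
FREE = 0x0000          # marks an unused cluster

fat = [FREE] * 16      # toy FAT with entries 0..15


def chain_file(blocks: list) -> None:
    """Record a file's blocks in the FAT as a linked chain."""
    for current, following in zip(blocks, blocks[1:]):
        fat[current] = following
    fat[blocks[-1]] = END_OF_CHAIN


def follow_chain(start: int) -> list:
    """Return every block of a file, given only its starting block."""
    blocks = []
    cluster = start
    while cluster != END_OF_CHAIN:
        blocks.append(cluster)
        cluster = fat[cluster]  # each entry points to the next block
    return blocks


chain_file([3, 5, 9])        # file-A
chain_file([10, 8, 4, 15])   # file-B

file_a = follow_chain(3)
file_b = follow_chain(10)
free_clusters = [i for i in range(2, len(fat)) if fat[i] == FREE]

print(file_a)         # [3, 5, 9]
print(file_b)         # [10, 8, 4, 15]
print(free_clusters)  # [2, 6, 7, 11, 12, 13, 14]
```

Only the starting block (3 or 10) needs to be stored in the directory entry; everything else is recovered from the table.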
The FAT file system comes in three versions, FAT-12, FAT-16 and FAT-32, depending on how many bits a disk address contains. For all FATs, the size of the disk block can be set to some multiple of 512 bytes. The first version of MS-DOS used FAT-12 with 512-byte blocks, giving a maximum partition size of 2^12 × 512 = 2,097,152 bytes, or about 2 MB. Of the 4096 possible disk addresses, 10 were used as special markers, such as end of file, unused block, bad block, etc., so only 4086 × 512 bytes were available for file contents. If the FAT-12 block size is increased to 1 KB, the maximum disk size becomes 4 MB. Later versions of the FAT-12 file system worked up to 64 MB, which was adequate for floppy disks and for hard disks up to 64 MB.
Beyond this size FAT-12 became a problem, which is why FAT-16 was introduced. FAT-16 uses 16-bit addresses and additionally allows block sizes of 8 KB, 16 KB and 32 KB. Thus the largest disk partition that can be supported by FAT-16 is 2^16 × 32 KB = 2 GB. An 8 GB disk can therefore hold four partitions of 2 GB each. However, for very large files such as digital video, a 2 GB file holds only about 9 to 10 minutes of video, so even with four partitions the disk can store at most about 40 minutes of video.
This limitation was overcome by the FAT-32 file system, which was introduced with the second release of Windows 95. FAT-32 actually uses only the low-order 28 bits of the disk address rather than all 32 bits. A partition could theoretically be 2^28 × 2^15 bytes, but in practice it is limited to 2 TB (2048 GB), because the operating system keeps track of partition size in 512-byte sectors using a 32-bit number, and 2^9 × 2^32 bytes = 2 TB.
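The partition-size figures quoted above can be checked with a few lines of arithmetic. The sketch below simply recomputes them. The variable names are chosen for the example, and KB, MB, GB and TB are taken as powers of 1024.

```python
KB = 1024
MB = KB ** 2
GB = KB ** 3
TB = KB ** 4

# FAT-12: 12-bit addresses, 512-byte blocks
fat12_max = 2 ** 12 * 512              # 2,097,152 bytes = 2 MB
fat12_usable = (2 ** 12 - 10) * 512    # 10 addresses reserved as markers
fat12_max_1k = 2 ** 12 * 1 * KB        # 1 KB blocks -> 4 MB

# FAT-16: 16-bit addresses, blocks of up to 32 KB
fat16_max = 2 ** 16 * 32 * KB          # 2 GB

# FAT-32: 28 usable address bits, 32 KB (2^15-byte) blocks
fat32_theoretical = 2 ** 28 * 2 ** 15  # 8 TB in theory
# Limit from counting 512-byte sectors with a 32-bit number
fat32_sector_limit = 2 ** 9 * 2 ** 32  # 2 TB in practice

print(fat12_max, fat12_max // MB, "MB")
print(fat12_usable)
print(fat16_max // GB, "GB")
print(fat32_theoretical // TB, "TB theoretical")
print(fat32_sector_limit // TB, "TB limit")
```

The theoretical FAT-32 size works out to 8 TB, four times the 2 TB ceiling imposed by the 32-bit sector count.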