Unit – 2
System architecture and memory management
Four registers of the 80386 locate the data structures that control segmented memory management:
1. GDTR Global Descriptor Table Register
2. LDTR Local Descriptor Table Register
3. IDTR Interrupt Descriptor Table Register
4. TR Task Register
2.1.1 GDTR Global Descriptor Table Register
1. A descriptor table is simply a memory array of 8-byte entries that contain descriptors.
2. A descriptor table is variable in length and may contain up to 8192 (2^(13)) descriptors. The first entry of the GDT (INDEX=0) is not used by the processor.
3. The processor locates the GDT and the current LDT in memory by means of the GDTR and LDTR registers. These registers store the base addresses of the tables in the linear address space and store the segment limits. The instructions LGDT and SGDT give access to the GDTR; the instructions LLDT and SLDT give access to the LDTR.
4. The LGDT and SGDT instructions load and store the GDTR register, respectively. On power up or reset of the processor, the base address is set to the default value of 0 and the limit is set to 0FFFFH. A new base address must be loaded into the GDTR as part of the processor initialization process for protected-mode operation.
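The relationship between table size, limit, and the value loaded by LGDT can be sketched in a few lines of Python. This is only an illustration: it assumes the limit holds the offset of the last valid byte (table size in bytes minus one), which is consistent with the reset value of 0FFFFH for a full table of 8192 entries, and it assumes the 6-byte memory operand layout of a 16-bit limit followed by a 32-bit base. The function names are hypothetical.

```python
import struct

DESCRIPTOR_SIZE = 8  # every descriptor-table entry is 8 bytes long


def table_limit(num_descriptors: int) -> int:
    """Return the limit for a table holding num_descriptors entries.

    Assumption: the limit is the offset of the last valid byte,
    i.e. one less than the table size in bytes.
    """
    if not 1 <= num_descriptors <= 8192:
        raise ValueError("a descriptor table holds 1 to 8192 descriptors")
    return num_descriptors * DESCRIPTOR_SIZE - 1


def pack_table_register_operand(base: int, limit: int) -> bytes:
    """Pack the 6-byte operand: 16-bit limit first, then 32-bit base."""
    return struct.pack("<HI", limit, base)
```

A full table of 8192 descriptors gives a limit of 0FFFFH, the largest value that fits in the 16-bit limit field.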
2.1.2 LDTR Local Descriptor Table Register
1. The LDTR register holds the 16-bit segment selector, base address (32 bits in protected mode; 64 bits in IA-32e mode), segment limit, and descriptor attributes for the LDT. The base address specifies the linear address of byte 0 of the LDT segment; the segment limit specifies the number of bytes in the segment.
2. The LLDT and SLDT instructions load and store the segment selector part of the LDTR register, respectively. The segment that contains the LDT must have a segment descriptor in the GDT. When the LLDT instruction loads a segment selector into the LDTR, the base address, limit, and descriptor attributes from the LDT descriptor are automatically loaded into the LDTR.
3. When a task switch occurs, the LDTR is automatically loaded with the segment selector and descriptor for the LDT for the new task. The contents of the LDTR are not automatically saved prior to writing the new LDT information into the register.
4. On power up or reset of the processor, the segment selector and base address are set to the default value of 0 and the limit is set to 0FFFFH.
2.1.3 IDTR Interrupt Descriptor Table Register
1. The IDTR register holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and 16-bit table limit for the IDT. The base address specifies the linear address of byte 0 of the IDT; the table limit specifies the number of bytes in the table.
2. The LIDT and SIDT instructions load and store the IDTR register, respectively. On power up or reset of the processor, the base address is set to the default value of 0 and the limit is set to 0FFFFH.
3. The base address and limit in the register can then be changed as part of the processor initialization process.
2.1.4 TR Task Register
1. The task register (TR) identifies the currently executing task by pointing to the TSS. Figure 7-3 shows the path by which the processor accesses the current TSS.
2. The task register has both a "visible" portion (i.e., can be read and changed by instructions) and an "invisible" portion (maintained by the processor to correspond to the visible portion; cannot be read by any instruction). The selector in the visible portion selects a TSS descriptor in the GDT.
3. The processor uses the invisible portion to cache the base and limit values from the TSS descriptor.
4. Holding the base and limit in a register makes execution of the task more efficient, because the processor does not need to repeatedly fetch these values from memory when it references the TSS of the current task.
5. The instructions LTR and STR are used to modify and read the visible portion of the task register. Both instructions take one operand, a 16-bit selector located in memory or in a general register.
6. LTR (Load task register) loads the visible portion of the task register with the selector operand, which must select a TSS descriptor in the GDT.
7. LTR also loads the invisible portion with information from the TSS descriptor selected by the operand. LTR is a privileged instruction; it may be executed only when CPL is zero.
8. LTR is generally used during system initialization to give an initial value to the task register; thereafter, the contents of TR are changed by task switch operations.
9. STR (Store task register) stores the visible portion of the task register in a general register or memory word. STR is not privileged.
Key takeaways
- The processor locates the GDT and the current LDT in memory by means of the GDTR and LDTR registers. These registers store the base addresses of the tables in the linear address space and store the segment limits.
- The base address specifies the linear address of byte 0 of the LDT segment; the segment limit specifies the number of bytes in the segment.
- The base address and limit in the IDTR can be changed as part of the processor initialization process.
- The instructions LTR and STR are used to modify and read the visible portion of the task register. Both instructions take one operand, a 16-bit selector located in memory or in a general register.
Systems instructions deal with such functions as:
1. Verification of pointer parameters
1.1 ARPL -- Adjust RPL
1.2 LAR -- Load Access Rights
1.3 LSL -- Load Segment Limit
1.4 VERR -- Verify for Reading
1.5 VERW -- Verify for Writing
2. Addressing descriptor tables
2.1 LLDT -- Load LDT Register
2.2 SLDT -- Store LDT Register
2.3 LGDT -- Load GDT Register
2.4 SGDT -- Store GDT Register
3. Multitasking
3.1 LTR -- Load Task Register
3.2 STR -- Store Task Register
4. Coprocessing and Multiprocessing
4.1 CLTS -- Clear Task-Switched Flag
4.2 ESC -- Escape instructions
4.3 WAIT -- Wait until Coprocessor not Busy
4.4 LOCK -- Assert Bus-Lock Signal
5. Input and Output
5.1 IN -- Input
5.2 OUT -- Output
5.3 INS -- Input String
5.4 OUTS -- Output String
6. Interrupt control
6.1 CLI -- Clear Interrupt-Enable Flag
6.2 STI -- Set Interrupt-Enable Flag
6.3 LIDT -- Load IDT Register
6.4 SIDT -- Store IDT Register
7. Debugging
7.1 MOV -- Move to and from debug registers
8. TLB testing
8.1 MOV -- Move to and from test registers
9. System Control:
9.1 SMSW -- Store MSW
9.2 LMSW -- Load MSW
9.3 HLT -- Halt Processor
9.4 MOV -- Move to and from control registers
The instructions SMSW and LMSW are provided for compatibility with the 80286 processor. 80386 programs access the MSW in CR0 via variants of the MOV instruction. HLT stops the processor until receipt of an INTR or RESET signal.
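The relationship between the MSW and CR0 can be illustrated with a small Python sketch. It assumes the MSW is the low-order 16 bits of CR0 and uses the conventional bit positions for PE, MP, EM, TS, and PG; the helper names are hypothetical and the model ignores all other CR0 bits.

```python
# Bit masks for selected CR0 flags (assumed positions: PE=0, MP=1, EM=2, TS=3, PG=31).
CR0_FLAGS = {
    "PE": 1 << 0,   # protection enable
    "MP": 1 << 1,   # math present
    "EM": 1 << 2,   # emulation
    "TS": 1 << 3,   # task switched (cleared by CLTS)
    "PG": 1 << 31,  # paging enable; lies outside the 16-bit MSW
}


def msw(cr0: int) -> int:
    """Return the machine status word: the low 16 bits of CR0."""
    return cr0 & 0xFFFF


def flags_set(cr0: int) -> list:
    """List the names of the modeled flags that are set in cr0."""
    return [name for name, mask in CR0_FLAGS.items() if cr0 & mask]
```

Because PG sits in the upper half of CR0, it is not visible through the MSW, which is one reason 80386 software uses MOV to and from CR0 instead.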
Key takeaways
- The instructions SMSW and LMSW are provided for compatibility with the 80286 processor. 80386 programs access the MSW in CR0 via variants of the MOV instruction. HLT stops the processor until receipt of an INTR or RESET signal.
To translate a logical address into a linear address, the processor uses the following data structures:
2.3.1 Descriptors
2.3.2 Descriptor tables
2.3.3 Selectors
2.3.4 Segment Registers
2.3.1 Descriptors
1. The segment descriptor provides the processor with the data it needs to map a logical address into a linear address. Descriptors are created by compilers, linkers, loaders, or the operating system, not by applications programmers. All types of segment descriptors take one of these formats. Segment-descriptor fields are:
2. BASE: Defines the location of the segment within the 4-gigabyte linear address space. The processor concatenates the three fragments of the base address to form a single 32-bit value.
3. LIMIT: Defines the size of the segment. When the processor concatenates the two parts of the limit field, a 20-bit value results. The processor interprets the limit field in one of two ways, depending on the setting of the granularity bit:
3.1 In units of one byte, to define a limit of up to 1 megabyte.
3.2 In units of 4 Kilobytes, to define a limit of up to 4 gigabytes. The limit is shifted left by 12 bits when loaded, and low-order one-bits are inserted.
4. Granularity bit: Specifies the units with which the LIMIT field is interpreted. When the bit is clear, the limit is interpreted in units of one byte; when set, the limit is interpreted in units of 4 Kilobytes.
5. TYPE: Distinguishes between various kinds of descriptors.
6. DPL (Descriptor Privilege Level): Used by the protection mechanism.
7. Segment-Present bit: If this bit is zero, the descriptor is not valid for use in address transformation; the processor will signal an exception when a selector for the descriptor is loaded into a segment register. The operating system is free to use the locations marked AVAILABLE. Operating systems that implement segment-based virtual memory clear the present bit in either of these cases:
7.1 When the linear space spanned by the segment is not mapped by the paging mechanism.
7.2 When the segment is not present in memory.
8. Accessed bit: The processor sets this bit when the segment is accessed; i.e., a selector for the descriptor is loaded into a segment register or used by a selector test instruction. Operating systems that implement virtual memory at the segment level may, by periodically testing and clearing this bit, monitor frequency of segment usage.
9. Creation and maintenance of descriptors is the responsibility of systems software, usually requiring the cooperation of compilers, program loaders or system builders, and the operating system.
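The way the processor reassembles the fields above can be sketched as a small decoder. This is an illustration, not a definitive implementation: it assumes the standard 8-byte layout for code and data segment descriptors (limit bits 15..0, base bits 23..0, an access byte holding P, DPL, and TYPE, a byte holding G and limit bits 19..16, then base bits 31..24). The class and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class SegmentDescriptor(NamedTuple):
    base: int             # 32-bit base, joined from three fragments
    limit: int            # raw 20-bit limit, joined from two fragments
    granularity: bool     # G bit: False = byte units, True = 4-Kilobyte units
    present: bool         # segment-present bit
    dpl: int              # descriptor privilege level, 0..3
    type: int             # 4-bit TYPE field; bit 0 is the accessed bit
    effective_limit: int  # limit in bytes after applying granularity


def decode_descriptor(raw: bytes) -> SegmentDescriptor:
    """Decode one 8-byte segment descriptor (assumed standard layout)."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    limit_lo, base_lo, base_mid, access, flags_limit, base_hi = struct.unpack(
        "<HHBBBB", raw
    )
    base = base_lo | (base_mid << 16) | (base_hi << 24)
    limit = limit_lo | ((flags_limit & 0x0F) << 16)
    granularity = bool(flags_limit & 0x80)
    # With G set, the limit is shifted left 12 bits and one-bits are inserted.
    effective = ((limit << 12) | 0xFFF) if granularity else limit
    return SegmentDescriptor(
        base=base,
        limit=limit,
        granularity=granularity,
        present=bool(access & 0x80),
        dpl=(access >> 5) & 0x3,
        type=access & 0x0F,
        effective_limit=effective,
    )
```

For example, a descriptor with a raw limit of FFFFFH and the granularity bit set decodes to an effective limit of FFFFFFFFH, which covers the full 4-gigabyte linear address space.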
2.3.2 Descriptor Tables
1. Segment descriptors are stored in either of two kinds of descriptor table:
1.1 The global descriptor table (GDT)
1.2 A local descriptor table (LDT)
2. A descriptor table is simply a memory array of 8-byte entries that contain descriptors. A descriptor table is variable in length and may contain up to 8192 (2^(13)) descriptors. The first entry of the GDT (INDEX=0) is not used by the processor.
3. The processor locates the GDT and the current LDT in memory by means of the GDTR and LDTR registers. These registers store the base addresses of the tables in the linear address space and store the segment limits.
4. The instructions LGDT and SGDT give access to the GDTR; the instructions LLDT and SLDT give access to the LDTR.
2.3.3 Selectors
1. The selector portion of a logical address identifies a descriptor by specifying a descriptor table and indexing a descriptor within that table.
2. Selectors may be visible to applications programs as a field within a pointer variable, but the values of selectors are usually assigned (fixed up) by linkers or linking loaders.
3. Index: Selects one of 8192 descriptors in a descriptor table. The processor simply multiplies this index value by 8 (the length of a descriptor), and adds the result to the base address of the descriptor table in order to access the appropriate segment descriptor in the table.
4. Table Indicator: Specifies to which descriptor table the selector refers. A zero indicates the GDT; a one indicates the current LDT.
5. Because the first entry of the GDT is not used by the processor, a selector that has an index of zero and a table indicator of zero (i.e., a selector that points to the first entry of the GDT), can be used as a null selector.
6. The processor does not cause an exception when a segment register (other than CS or SS) is loaded with a null selector. It will, however, cause an exception when the segment register is used to access memory. This feature is useful for initializing unused segment registers so as to trap accidental references.
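The selector fields and the descriptor lookup described above can be sketched as follows. The sketch assumes the usual 16-bit selector layout (index in bits 15..3, table indicator in bit 2, requested privilege level in bits 1..0); the function names and the example table addresses are hypothetical.

```python
DESCRIPTOR_SIZE = 8  # length of one descriptor in bytes


def decode_selector(selector: int):
    """Split a 16-bit selector into (index, table_indicator, rpl)."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    index = selector >> 3
    table_indicator = (selector >> 2) & 1  # 0 = GDT, 1 = current LDT
    rpl = selector & 0x3
    return index, table_indicator, rpl


def descriptor_address(selector: int, gdt_base: int, ldt_base: int) -> int:
    """Linear address of the descriptor that the selector refers to."""
    index, table_indicator, _ = decode_selector(selector)
    table_base = ldt_base if table_indicator else gdt_base
    return table_base + index * DESCRIPTOR_SIZE


def is_null_selector(selector: int) -> bool:
    """True when the selector points at the unused first GDT entry."""
    index, table_indicator, _ = decode_selector(selector)
    return index == 0 and table_indicator == 0
```

With a GDT at 1000H, selector 0010H (index 2, GDT) refers to the descriptor at 1010H. Selectors 0000H through 0003H are all null selectors because only the index and table indicator are examined.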
2.3.4 Segment Registers
1. The 80386 stores information from descriptors in segment registers, thereby avoiding the need to consult a descriptor table every time it accesses memory.
2. Every segment register has a "visible" portion and an "invisible" portion. The visible portions of these segment address registers are manipulated by programs as if they were simply 16-bit registers. The invisible portions are manipulated by the processor.
3. The operations that load these registers are normal program instructions. These instructions are of two classes:
3.1 Direct load instructions; for example, MOV, POP, LDS, LSS, LGS, LFS. These instructions explicitly reference the segment registers.
3.2 Implied load instructions; for example, far CALL and JMP. These instructions implicitly reference the CS register, and load it with a new value.
4. Using these instructions, a program loads the visible part of the segment register with a 16-bit selector. The processor automatically fetches the base address, limit, type, and other information from a descriptor table and loads them into the invisible part of the segment register.
5. Because most instructions refer to data in segments whose selectors have already been loaded into segment registers, the processor can add the segment-relative offset supplied by the instruction to the segment base address with no additional overhead.
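The benefit of the invisible portion can be illustrated with a toy model. In this sketch the descriptor table is a hypothetical Python object that counts how often it is read, and a descriptor is reduced to a (base, limit) pair; none of these names come from the 80386 documentation.

```python
from dataclasses import dataclass


class DescriptorTable:
    """Hypothetical stand-in for a descriptor table held in memory."""

    def __init__(self, entries):
        self.entries = entries  # maps index -> (base, limit)
        self.reads = 0          # number of table accesses so far

    def fetch(self, index):
        self.reads += 1
        return self.entries[index]


@dataclass
class SegmentRegister:
    selector: int = 0  # visible portion
    base: int = 0      # invisible portion, cached from the descriptor
    limit: int = 0     # invisible portion, cached from the descriptor

    def load(self, selector, table):
        """Model a segment-register load: one table access, then cache."""
        self.base, self.limit = table.fetch(selector >> 3)
        self.selector = selector

    def linear_address(self, offset):
        """Add the offset to the cached base; no table access is needed."""
        if offset > self.limit:
            raise ValueError("offset exceeds the segment limit")
        return (self.base + offset) & 0xFFFFFFFF
```

After a single load, any number of memory references through the register use the cached base and limit, so the table's read counter stays at one.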
Key takeaways
- Descriptors are created by compilers, linkers, loaders, or the operating system, not by applications programmers.
- The processor locates the GDT and the current LDT in memory by means of the GDTR and LDTR registers. These registers store the base addresses of the tables in the linear address space and store the segment limits.
- The processor does not cause an exception when a segment register (other than CS or SS) is loaded with a null selector. It will, however, cause an exception when the segment register is used to access memory. This feature is useful for initializing unused segment registers so as to trap accidental references.
- The 80386 stores information from descriptors in segment registers, thereby avoiding the need to consult a descriptor table every time it accesses memory.
1. In the second phase of address transformation, the 80386 transforms a linear address into a physical address. This phase of address transformation implements the basic features needed for page-oriented virtual-memory systems and page-level protection.
2. The page-translation step is optional. Page translation is in effect only when the PG bit of CR0 is set. This bit is typically set by the operating system during software initialization. The PG bit must be set if the operating system is to implement multiple virtual 8086 tasks, page-oriented protection, or page-oriented virtual memory.
2.4.1 Page Frame
1. A page frame is a 4K-byte unit of contiguous addresses of physical memory. Pages begin on 4K-byte boundaries and are fixed in size.
2.4.2 Linear Address
1. A linear address refers indirectly to a physical address by specifying a page table, a page within that table, and an offset within that page.
2. The addressing mechanism uses the DIR field as an index into a page directory, uses the PAGE field as an index into the page table determined by the page directory, and uses the OFFSET field to address a byte within the page determined by the page table.
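The three fields can be extracted with shifts and masks. This sketch assumes a 10-bit DIR field, a 10-bit PAGE field, and a 12-bit OFFSET field, which follows from tables of 1K entries and pages of 4K bytes; the function name is hypothetical.

```python
def split_linear_address(linear: int):
    """Split a 32-bit linear address into (dir, page, offset)."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("a linear address is a 32-bit value")
    dir_index = (linear >> 22) & 0x3FF   # bits 31..22: page-directory index
    page_index = (linear >> 12) & 0x3FF  # bits 21..12: page-table index
    offset = linear & 0xFFF              # bits 11..0: byte within the page
    return dir_index, page_index, offset
```

For example, linear address 00403025H splits into DIR = 1, PAGE = 3, and OFFSET = 025H.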
2.4.3 Page Tables
1. A page table is simply an array of 32-bit page specifiers. A page table is itself a page, and therefore contains 4 Kilobytes of memory or at most 1K 32-bit entries.
2. Two levels of tables are used to address a page of memory. At the higher level is a page directory. The page directory addresses up to 1K page tables of the second level. A page table of the second level addresses up to 1K pages.
3. All the tables addressed by one page directory, therefore, can address 1M pages (2^(20)). Because each page contains 4K bytes (2^(12) bytes), the tables of one page directory can span the entire physical address space of the 80386 (2^(20) times 2^(12) = 2^(32)).
4. The physical address of the current page directory is stored in the CPU register CR3, also called the page directory base register (PDBR).
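The two-level lookup can be sketched as a short table walk. The sketch makes several assumptions that the text above does not spell out: each 32-bit entry holds a page-frame address in bits 31..12 and a present flag in bit 0, and physical memory is modeled by a hypothetical read_dword callback. Protection checks and the accessed and dirty bits are ignored.

```python
PRESENT = 0x1            # assumed: bit 0 of an entry marks it present
FRAME_MASK = 0xFFFFF000  # assumed: bits 31..12 hold the page-frame address


class PageFault(Exception):
    """Raised when a directory or table entry is not present."""


def translate(linear: int, cr3: int, read_dword) -> int:
    """Walk the page directory and page table for one linear address.

    read_dword(addr) is a hypothetical callback that returns the 32-bit
    value stored at a physical address.
    """
    dir_index = (linear >> 22) & 0x3FF
    page_index = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF

    # Level 1: CR3 locates the page directory; each entry is 4 bytes.
    pde = read_dword((cr3 & FRAME_MASK) + 4 * dir_index)
    if not pde & PRESENT:
        raise PageFault(f"page table for {linear:#010x} not present")

    # Level 2: the directory entry locates the page table.
    pte = read_dword((pde & FRAME_MASK) + 4 * page_index)
    if not pte & PRESENT:
        raise PageFault(f"page for {linear:#010x} not present")

    return (pte & FRAME_MASK) | offset
```

With a directory at 1000H whose entry 1 points to a table at 2000H, and table entry 3 pointing to the frame at 5000H, linear address 00403025H translates to physical address 5025H.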
2.4.4 Page Translation Cache
1. For greatest efficiency in address translation, the processor stores the most recently used page-table data in an on-chip cache. Only if the necessary paging information is not in the cache must both levels of page tables be referenced.
2. The existence of the page-translation cache is invisible to applications programmers but not to systems programmers; operating-system programmers must flush the cache whenever the page tables are changed. The page-translation cache can be flushed by either of two methods:
2.1 By reloading CR3 with a MOV instruction; for example:
MOV CR3, EAX
2.2 By performing a task switch to a TSS that has a different CR3 image than the current TSS.
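Why the cache must be flushed after the page tables change can be shown with a toy model. The class below is a hypothetical simplification: it caches page-number-to-frame-number mappings without any size limit or replacement policy, and its flush method stands in for reloading CR3.

```python
class TranslationCache:
    """Toy model of a page-translation cache (unbounded, no eviction)."""

    def __init__(self, walk):
        self._walk = walk   # callback: linear page number -> frame number
        self._entries = {}  # cached translations
        self.walks = 0      # how many times the page tables were consulted

    def translate(self, linear: int) -> int:
        page = linear >> 12
        if page not in self._entries:
            # Cache miss: consult the page tables and remember the result.
            self.walks += 1
            self._entries[page] = self._walk(page)
        return (self._entries[page] << 12) | (linear & 0xFFF)

    def flush(self) -> None:
        """Discard all cached translations (models MOV CR3, EAX)."""
        self._entries.clear()
```

If the mapping for a page is changed in the tables but the cache is not flushed, the model keeps returning the old frame; only after flush does the next access walk the tables again and pick up the new mapping.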
Key takeaways
- The PG bit must be set if the operating system is to implement multiple virtual 8086 tasks, page-oriented protection, or page-oriented virtual memory.
- The addressing mechanism uses the DIR field as an index into a page directory, uses the PAGE field as an index into the page table determined by the page directory, and uses the OFFSET field to address a byte within the page determined by the page table.
- All the tables addressed by one page directory, therefore, can address 1M pages (2^(20)). Because each page contains 4K bytes (2^(12) bytes), the tables of one page directory can span the entire physical address space of the 80386 (2^(20) times 2^(12) = 2^(32)).
- For greatest efficiency in address translation, the processor stores the most recently used page-table data in an on-chip cache.
1. When paging is enabled, the transformation from a logical address to a physical address proceeds in two phases: segment translation followed by page translation. By appropriate choice of options and parameters to both phases, memory-management software can implement several different styles of memory management.
2.5.1 "Flat" Architecture
1. When the 80386 is used to execute software designed for architectures that don't have segments, it may be expedient to effectively "turn off" the segmentation features of the 80386.
2. The 80386 does not have a mode that disables segmentation, but the same effect can be achieved by initially loading the segment registers with selectors for descriptors that encompass the entire 32-bit linear address space.
3. Once loaded, the segment registers don't need to be changed. The 32-bit offsets used by 80386 instructions are adequate to address the entire linear-address space.
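A descriptor that encompasses the entire linear address space has a base of zero, a raw limit of FFFFFH, and the granularity bit set. The sketch below builds such descriptors, assuming the standard 8-byte layout for code and data segments. The access-byte values 9AH (present, DPL 0, readable code) and 92H (present, DPL 0, writable data) and the flag nibble CH (granularity and 32-bit default size) are conventional choices used here for illustration, and the function name is hypothetical.

```python
import struct


def encode_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack an 8-byte segment descriptor (assumed standard layout).

    limit  -- raw 20-bit limit field
    access -- access byte (P, DPL, descriptor kind, TYPE)
    flags  -- upper nibble of byte 6 (bit 3 is the granularity bit)
    """
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("the limit field is 20 bits wide")
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                              # limit bits 15..0
        base & 0xFFFF,                               # base bits 15..0
        (base >> 16) & 0xFF,                         # base bits 23..16
        access & 0xFF,                               # access byte
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # flags + limit 19..16
        (base >> 24) & 0xFF,                         # base bits 31..24
    )


# Flat code and data descriptors: base 0, limit FFFFFH in 4-Kilobyte units.
FLAT_CODE = encode_descriptor(0, 0xFFFFF, access=0x9A, flags=0xC)
FLAT_DATA = encode_descriptor(0, 0xFFFFF, access=0x92, flags=0xC)
```

With the granularity bit set, the 20-bit limit FFFFFH expands to FFFFFFFFH, so every 32-bit offset falls inside the segment.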
2.5.2 Segments Spanning Several Pages
1. The architecture of the 80386 permits segments to be larger or smaller than the size of a page (4 Kilobytes). For example, suppose a segment is used to address and protect a large data structure that spans 132 Kilobytes.
2. In a software system that supports paged virtual memory, it is not necessary for the entire structure to be in physical memory at once.
3. The structure is divided into 33 pages, any number of which may not be present. The applications programmer does not need to be aware that the virtual memory subsystem is paging the structure in this manner.
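The page count in this example follows from simple arithmetic, sketched below. The function name is hypothetical, and the second call shows an assumption-dependent detail: if the segment does not start on a page boundary, it touches one extra page.

```python
PAGE_SIZE = 4096  # bytes per page


def pages_spanned(segment_base: int, segment_size: int) -> int:
    """Number of pages touched by a segment of segment_size bytes."""
    if segment_size < 1:
        raise ValueError("segment size must be at least one byte")
    first_page = segment_base // PAGE_SIZE
    last_page = (segment_base + segment_size - 1) // PAGE_SIZE
    return last_page - first_page + 1


aligned = pages_spanned(0x00000000, 132 * 1024)    # page-aligned base
unaligned = pages_spanned(0x00000800, 132 * 1024)  # base in mid-page
```

A page-aligned 132-Kilobyte structure spans exactly 132 / 4 = 33 pages.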
2.5.3 Pages Spanning Several Segments
1. On the other hand, segments may be smaller than the size of a page. For example, consider a small data structure such as a semaphore.
2. Because of the protection and sharing provided by segments it may be useful to create a separate segment for each semaphore. But, because a system may need many semaphores, it is not efficient to allocate a page for each. Therefore, it may be useful to cluster many related segments within a page.
2.5.4 Non-Aligned Page and Segment Boundaries
1. The architecture of the 80386 does not enforce any correspondence between the boundaries of pages and segments. It is perfectly permissible for a page to contain the end of one segment and the beginning of another. Likewise, a segment may contain the end of one page and the beginning of another.
2.5.5 Aligned Page and Segment Boundaries
1. Memory-management software may be simpler, however, if it enforces some correspondence between page and segment boundaries. For example, if segments are allocated only in units of one page, the logic for segment and page allocation can be combined. There is no need for logic to account for partially used pages.
2.5.6 Page-Table per Segment
1. An approach to space management that provides even further simplification of space-management software is to maintain a one-to-one correspondence between segment descriptors and page-directory entries. Each descriptor has a base address in which the low-order 22 bits are zero; in other words, the base address is mapped by the first entry of a page table.
2. A segment may have any limit from 1 byte to 4 megabytes. Depending on the limit, the segment is contained in from 1 to 1K page frames. A task is thus limited to 1K segments (a sufficient number for many applications), each containing up to 4 Mbytes.
3. The descriptor, the corresponding page-directory entry, and the corresponding page table can be allocated and deallocated simultaneously.
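The arithmetic behind this scheme can be sketched as follows. The notion of a numbered segment "slot" (0 to 1023, one per page-directory entry) and the function names are hypothetical conveniences for the illustration.

```python
PAGE_SIZE = 1 << 12     # 4 Kilobytes per page
SEGMENT_SPAN = 1 << 22  # 4 megabytes mapped by one page-directory entry


def segment_base(slot: int) -> int:
    """Base address for segment slot 0..1023; low-order 22 bits are zero."""
    if not 0 <= slot < 1024:
        raise ValueError("a page directory has 1K entries")
    return slot << 22


def directory_index(base: int) -> int:
    """Page-directory entry that maps a segment with this base."""
    if base & (SEGMENT_SPAN - 1):
        raise ValueError("base must have its low-order 22 bits clear")
    return base >> 22


def page_frames_needed(size_bytes: int) -> int:
    """Page frames needed for a segment of size_bytes (1 byte to 4 MB)."""
    if not 1 <= size_bytes <= SEGMENT_SPAN:
        raise ValueError("segment size must be 1 byte to 4 megabytes")
    return -(-size_bytes // PAGE_SIZE)  # ceiling division
```

Slot 3, for instance, has base 00C00000H and is mapped by page-directory entry 3; 1K such segments of 4 megabytes each cover the full 2^(32)-byte linear address space.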
Key takeaways
- When the 80386 is used to execute software designed for architectures that don't have segments, it may be expedient to effectively "turn off" the segmentation features of the 80386.
- The structure is divided into 33 pages, any number of which may not be present. The applications programmer does not need to be aware that the virtual memory subsystem is paging the structure in this manner.
- The architecture of the 80386 does not enforce any correspondence between the boundaries of pages and segments.
- There is no need for logic to account for partially used pages.
- A segment may have any limit from 1 byte to 4 megabytes. Depending on the limit, the segment is contained in from 1 to 1K page frames.
- The descriptor, the corresponding page-directory entry, and the corresponding page table can be allocated and deallocated simultaneously.