Segmentation has been included in 80x86 microprocessors to encourage programmers to split their applications into logically related entities, such as subroutines or global and local data areas. However, Linux uses segmentation in a very limited way. In fact, segmentation and paging are somewhat redundant, since both can be used to separate the physical address spaces of processes: segmentation can assign a different linear address space to each process, while paging can map the same linear address space into different physical address spaces. Linux prefers paging to segmentation for the following reasons:
Memory management is simpler when all processes use the same segment register values — that is, when they share the same set of linear addresses.
One of the design objectives of Linux is portability to a wide range of architectures; RISC architectures in particular have limited support for segmentation.
The 2.4 version of Linux uses segmentation only when required by the 80x86 architecture. In particular, all processes use the same logical addresses, so the total number of segments to be defined is quite limited, and it is possible to store all Segment Descriptors in the Global Descriptor Table (GDT). This table is implemented by the array gdt_table referred to by the gdt variable.
Local Descriptor Tables are not used by the kernel, although the modify_ldt() system call allows processes to create their own LDTs. This turns out to be useful to applications (such as Wine) that execute segment-oriented Microsoft Windows applications.
Here are the segments used by Linux:
A kernel code segment. The fields of the corresponding Segment Descriptor in the GDT have the following values:
Base = 0x00000000
Limit = 0xfffff
G (granularity flag) = 1, for segment size expressed in pages
S (system flag) = 1, for normal code or data segment
Type = 0xa, for code segment that can be read and executed
DPL (Descriptor Privilege Level) = 0, for Kernel Mode
D/B (32-bit address flag) = 1, for 32-bit offset addresses
Thus, the linear addresses associated with that segment start at 0 and reach the addressing limit of 2^32 - 1. The S and Type fields specify that the segment is a code segment that can be read and executed. Its DPL value is 0, so it can be accessed only in Kernel Mode. The corresponding Segment Selector is defined by the __KERNEL_CS macro. To address the segment, the kernel just loads the value yielded by the macro into the cs register.
A kernel data segment. The fields of the corresponding Segment Descriptor in the GDT have the following values:
Base = 0x00000000
Limit = 0xfffff
G (granularity flag) = 1, for segment size expressed in pages
S (system flag) = 1, for normal code or data segment
Type = 2, for data segment that can be read and written
DPL (Descriptor Privilege Level) = 0, for Kernel Mode
D/B (32-bit address flag) = 1, for 32-bit offset addresses
This segment is identical to the previous one (in fact, they overlap in the linear address space), except for the value of the Type field, which specifies that it is a data segment that can be read and written. The corresponding Segment Selector is defined by the __KERNEL_DS macro.
A user code segment shared by all processes in User Mode. The fields of the corresponding Segment Descriptor in the GDT have the following values:
Base = 0x00000000
Limit = 0xfffff
G (granularity flag) = 1, for segment size expressed in pages
S (system flag) = 1, for normal code or data segment
Type = 0xa, for code segment that can be read and executed
DPL (Descriptor Privilege Level) = 3, for User Mode
D/B (32-bit address flag) = 1, for 32-bit offset addresses
The S and DPL fields specify that the segment is not a system segment and its privilege level is equal to 3; it can thus be accessed both in Kernel Mode and in User Mode. The corresponding Segment Selector is defined by the __USER_CS macro.
A user data segment shared by all processes in User Mode. The fields of the corresponding Segment Descriptor in the GDT have the following values:
Base = 0x00000000
Limit = 0xfffff
G (granularity flag) = 1, for segment size expressed in pages
S (system flag) = 1, for normal code or data segment
Type = 2, for data segment that can be read and written
DPL (Descriptor Privilege Level) = 3, for User Mode
D/B (32-bit address flag) = 1, for 32-bit offset addresses
This segment overlaps the previous one: they are identical, except for the value of Type. The corresponding Segment Selector is defined by the __USER_DS macro.
A Task State Segment (TSS) for each processor. The linear address space corresponding to each TSS is a small subset of the linear address space corresponding to the kernel data segment. All the Task State Segments are sequentially stored in the init_tss array; in particular, the Base field of the TSS descriptor for the nth CPU points to the nth component of the init_tss array. The G (granularity) flag is cleared, while the Limit field is set to 0xeb, since the TSS is 236 bytes long. The Type field is set to 9 (available 32-bit TSS) or 11 (busy 32-bit TSS), and the DPL is set to 0, since processes in User Mode are not allowed to access TSSs. You will find details on how Linux uses TSSs in Section 3.3.2.
A default Local Descriptor Table (LDT) that is usually shared by all processes. This segment is stored in the default_ldt variable. The default LDT includes a single entry consisting of a null Segment Descriptor. Each processor has its own LDT Segment Descriptor, which usually points to the common default LDT segment; its Base field is set to the address of default_ldt and its Limit field is set to 7. When a process requiring a nonempty LDT is running, the LDT descriptor in the GDT corresponding to the executing CPU is replaced by the descriptor associated with the LDT that was built by the process. You will find more details of this mechanism in Chapter 3.
Four segments related to the Advanced Power Management (APM) support. APM consists of a set of BIOS routines devoted to the management of the power states of the system. If the kernel supports APM, four entries in the GDT store the descriptors of two data segments and two code segments containing APM-related kernel functions.
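As a rough illustration of how the field values listed for the segments above fit together, the following C sketch packs them into the 8-byte Segment Descriptor format of the 80x86 and computes how many bytes a given Limit and G pair covers. The struct and function names are hypothetical and are not taken from the kernel source; the Present bit, which the lists above do not mention, is assumed to be set.

```c
#include <stdint.h>

/* Hypothetical container for the descriptor fields discussed in the text. */
struct seg_fields {
    uint32_t base;   /* 32-bit Base */
    uint32_t limit;  /* 20-bit Limit */
    unsigned g;      /* granularity flag: 1 = Limit counted in 4 KB pages */
    unsigned s;      /* system flag: 1 = normal code or data segment */
    unsigned type;   /* 4-bit Type */
    unsigned dpl;    /* Descriptor Privilege Level (0..3) */
    unsigned db;     /* D/B flag: 1 = 32-bit offset addresses */
};

/* Pack the fields into the 64-bit descriptor layout of the 80x86. */
uint64_t encode_descriptor(struct seg_fields f)
{
    uint64_t d = 0;

    d |= (uint64_t)(f.limit & 0xffff);              /* Limit bits 15..0  */
    d |= (uint64_t)(f.base & 0xffffff) << 16;       /* Base bits 23..0   */
    d |= (uint64_t)(f.type & 0xf) << 40;            /* Type              */
    d |= (uint64_t)(f.s & 1) << 44;                 /* S                 */
    d |= (uint64_t)(f.dpl & 3) << 45;               /* DPL               */
    d |= (uint64_t)1 << 47;                         /* Present: assumed 1 */
    d |= (uint64_t)((f.limit >> 16) & 0xf) << 48;   /* Limit bits 19..16 */
    d |= (uint64_t)(f.db & 1) << 54;                /* D/B               */
    d |= (uint64_t)(f.g & 1) << 55;                 /* G                 */
    d |= (uint64_t)((f.base >> 24) & 0xff) << 56;   /* Base bits 31..24  */
    return d;
}

/* Number of bytes covered by a segment with the given Limit and G flag. */
uint64_t segment_size_bytes(uint32_t limit, unsigned g)
{
    uint64_t units = (uint64_t)(limit & 0xfffff) + 1;

    return g ? units * 4096 : units;
}
```

Under these assumptions, the kernel code segment encodes to 0x00cf9a000000ffff and the user data segment to 0x00cff2000000ffff. The Limit value 0xfffff with G set covers 2^32 bytes, while the TSS Limit of 0xeb with G cleared covers 236 bytes.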
In conclusion, as shown in Figure 2-5, the GDT includes a set of common descriptors plus a pair of segment descriptors for each existing CPU: one for the TSS and one for the LDT segment. For efficiency, some entries in the GDT are left unused, so that segment descriptors usually accessed together are kept in the same 32-byte line of the hardware cache (see Section 2.4.7 later in this chapter).
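The cache-line argument can be made concrete with a small sketch. It assumes, purely for illustration, that each CPU is given four consecutive 8-byte slots (a TSS descriptor, an LDT descriptor, and two unused entries), that the first such group starts at index 12, and that the GDT itself begins on a 32-byte boundary. None of these numbers or names comes from the text above; they are hypothetical stand-ins for the layout in Figure 2-5.

```c
/* All constants below are assumptions made for illustration only. */
enum {
    DESC_SIZE       = 8,   /* bytes per Segment Descriptor */
    CACHE_LINE      = 32,  /* cache line size cited in the text */
    FIRST_TSS_ENTRY = 12,  /* assumed index of the TSS descriptor of CPU 0 */
    ENTRIES_PER_CPU = 4    /* assumed: TSS, LDT, and two unused slots */
};

/* GDT index of the TSS descriptor of a given CPU. */
unsigned tss_index(unsigned cpu)
{
    return FIRST_TSS_ENTRY + cpu * ENTRIES_PER_CPU;
}

/* GDT index of the LDT descriptor of a given CPU (right after its TSS). */
unsigned ldt_index(unsigned cpu)
{
    return tss_index(cpu) + 1;
}

/* Cache line, counted from the start of the GDT, holding a given entry. */
unsigned cache_line_of(unsigned index)
{
    return index * DESC_SIZE / CACHE_LINE;
}
```

With this spacing, the TSS and LDT descriptors of one CPU always share a cache line, and descriptors belonging to different CPUs never do, at the cost of two unused entries per CPU.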
As stated earlier, the Current Privilege Level of the CPU indicates whether the processor is in User or Kernel Mode and is specified by the RPL field of the Segment Selector stored in the cs register. Whenever the CPL is changed, some segmentation registers must be correspondingly updated. For instance, when the CPL is equal to 3 (User Mode), the ds register must contain the Segment Selector of the user data segment, but when the CPL is equal to 0, the ds register must contain the Segment Selector of the kernel data segment.
A similar situation occurs for the ss register. It must refer to a User Mode stack inside the user data segment when the CPL is 3, and it must refer to a Kernel Mode stack inside the kernel data segment when the CPL is 0.
When switching from User Mode to Kernel Mode, Linux always makes sure that the ss register contains the Segment Selector of the kernel data segment.
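The relationship between the cs selector, the CPL, and the data selector that must accompany it can be sketched in C as follows. The selector constants are example values assumed for this illustration (the text above names the macros but does not give their values), and the function and type names are hypothetical.

```c
#include <stdint.h>

/* The three fields of a 16-bit Segment Selector. */
struct selector_fields {
    unsigned index;  /* bits 15..3: entry number in the descriptor table */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* bits 1..0: Requestor Privilege Level */
};

struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.index = sel >> 3;
    f.ti    = (sel >> 2) & 1;
    f.rpl   = sel & 3;
    return f;
}

/* Assumed example values standing in for the four selector macros. */
enum {
    EX_KERNEL_CS = 0x10,
    EX_KERNEL_DS = 0x18,
    EX_USER_CS   = 0x23,
    EX_USER_DS   = 0x2b
};

/* The CPL is the RPL field of the selector currently loaded in cs. */
unsigned cpl_from_cs(uint16_t cs)
{
    return decode_selector(cs).rpl;
}

/* Data (and stack) selector that matches a given code selector. */
uint16_t data_selector_for(uint16_t cs)
{
    return cpl_from_cs(cs) == 3 ? EX_USER_DS : EX_KERNEL_DS;
}
```

With these example values, a cs of 0x23 decodes to GDT index 4 with RPL 3, so ds and ss must hold the user data selector; a cs of 0x10 decodes to GDT index 2 with RPL 0, so they must hold the kernel data selector.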