Chapter 16. What CPUID?

Multiple manufacturers produce different models of 80×86-type microprocessors. Some are highly specialized variations of the Intel processors, but most are not: they are clones of the Intel processor family with their own designs, and those designs call for alternate optimization methods. Most of these manufacturers publish technical manuals, usually as PDF files that can be downloaded from the Internet and used for all your custom optimization needs. If the project you're coding for uses custom hardware, then you are probably using a custom processor such as National Semiconductor's NS486SXF under an operating system such as pSOS. When you are designing code for a specific processor, your code can be highly optimized and tuned accordingly.

When the hardware you are writing code for is a little more generic, you need a method to identify the exact model of processor that the code is running on. Each manufacturer has written a sample CPU detection algorithm that uses the CPUID instruction. This is great, but these code samples are not exactly compatible with each other. Since it would be impractical for every programmer to merge all of these samples, this chapter does that work for you. You can find all sorts of variations of the following program on the Internet, but the version here is designed to be expandable and versatile.

Most of these Intel processors are derivatives of one another, but if we take a closer look at their "family type" we note a pattern of 80(x)86, where the x represents a family number: a 3 is the 80386, and so on. Using this family number we can group processors into categories of functionality, as each group has its own subset of instructions that it can execute.

Other manufacturers have second-sourced various models of the 80×86 processor line. Intel and AMD are the primary manufacturers, but other manufacturers have brought to market their own modified or less expensive versions of these same processors.

Workbench Files:Benchx86chap03projectplatform

 

            project    platform
  CPU ID    cpuid      vc6
                       vc.net

CPUID

  Mnemonic   P    PII   K6   3D!   3Mx+   SSE   SSE2   A64   SSE3   E64T
  CPUID      ✓    ✓     ✓    ✓     ✓      ✓     ✓      ✓     ✓      ✓

cpuid

This instruction uses the value stored in the EAX register as a function identifier and returns the related requested information in the various associated registers.

With the release of the Pentium chip, Intel introduced the CPUID instruction, which gives detailed information about the capabilities of the individual processor. It was also introduced into the re-release of the Intel 80486 processor, and AMD has implemented it in all models since the Am486. This makes it easier to identify the capabilities of the CPU being tested.

Before trying to use this instruction, test whether bit #21 (the ID flag) of EFLAGS/RFLAGS is writable. If it is, the CPUID instruction exists and can therefore be called. Application code mainly uses the PUSHFD/PUSHFQ and POPFD/POPFQ instructions to manipulate the EFLAGS/RFLAGS register.

        pushfd                  ; push EFLAGS register
        pop    eax              ; pop those flags into EAX
        xor    eax,EFLAGS_ID    ; flip ID bit#21 (00200000h) in EFLAGS
        push   eax              ; push modified flags on stack
        popfd                   ; pop flags back into EFLAGS
        pushfd                  ; push resulting EFLAGS on stack
        pop    ecx              ; pop those flags into ECX
        xor    eax,ecx          ; zero if the bit stayed flipped
        jnz    $nope            ; jump if bit did not stay flipped

   ; If here then bit flipped so CPUID exists
           cpuid

At a very minimum, all CPUs that support the CPUID instruction support both functions #0 and #1.

  Flags   O.flow   Sign   Zero   Aux   Parity   Carry
            -       -      -      -      -        -

Flags: None are altered by this opcode.

Function

Returned Data

EAX=0

EAX = The highest CPUID function number this CPU can handle. The Intel Pentium and 486 return a 1 in EAX. The Pentium Pro returns a 2 in EAX. The EBX, EDX, ECX registers contain a text identifier.

           ebx   edx   ecx
Amd     = Auth, enti, cAMD
Centaur = Cent, aurH, auls
Cyrix   = Cyri, xIns, tead
Intel   = Genu, ineI, ntel

EAX=1

EAX = Version Information.

   Bits 0...3 – Stepping ID

   Bits 4...7 – Model

   Bits 8...11 – Generation / family

   Bits 12...15 – Reserved

   Bits 16...19 – Extended model

   Bits 20...27 – Extended family

   Bits 28...31 – Reserved.

EBX =

   Bits 0...7 – Brand Index

   Bits 8...15 – CLFLUSH line size

   Bits 16...23 – (Intel) # of logical processors (AMD) Reserved

   Bits 24...31 – Processor's initial local APIC ID

ECX = (Intel) Feature info. (AMD) Reserved

EDX= Feature info

Intel EAX=2

EAX, EBX, ECX, EDX = Cache and TLB information

Intel EAX=3

EAX, EBX, ECX, EDX = Reserved

Intel EAX=4

EAX =

   Bits 0...4 – Cache type

   Bits 5...7 – Cache level

   Bit 8 – Self-initializing cache

   Bit 9 – Fully associative cache

   Bits 10...13 – Reserved

   Bits 14...25 – Number of threads sharing cache

   Bits 26...31 – Number of processor cores on the die

EBX =

   Bits 0...11 L = System coherency line size

   Bits 12...21 P = Physical line partitions

   Bits 22...31 W = Ways of associativity

ECX = 0...31 Number of sets

EDX = Reserved

Intel EAX=5

EAX =

   Bits 0...15 – Smallest monitor-line byte size

   Bits 16...31 – Reserved

EBX, ECX, EDX = Reserved

AMD, Cyrix, and WinChip

EAX= 80000000h

If the vendor string returned by function #0 matches AMD, Cyrix, or WinChip, test for this function. If a non-zero value is returned in EAX, an extended function set is supported. As with function #0, the EAX register contains the highest extended function that the CPU can handle.

Intel

EAX= 80000000h

EAX = Maximum input value for extended CPUIDs

EBX, ECX, EDX = Reserved

AMD, Cyrix, and WinChip

EAX= 80000001h

See the Intel – Standard CPUID ECX-Feature Flags section.

EAX = Processor signature

EBX, ECX = Reserved

See the AMD – Extended #1 CPUID EDX-Feature Flags section.

Intel

EAX= 80000001h

Extended processor signature and extended feature bits. See the Intel – Extended #1 CPUID EDX-Feature Flags section.

AMD, Cyrix, WinChip, and Intel

EAX= 80000002h 80000003h 80000004h

EAX, EBX, ECX, EDX = 4 registers * 4 bytes * 3 functions = 48-byte text string

AMD

EAX= 80000005h

TLB and L1 cache information

Intel

EAX= 80000005h

EAX, EBX, ECX, EDX = Reserved

AMD, Cyrix, WinChip, and Intel

EAX= 80000006h

L2 Cache bits

ECX =

   Bits 0...7 – Cache line size

   Bits 8...11 – Lines per tag

   Bits 12...15 – L2 Associativity

   Bits 16...31 – Number of 1K cache blocks

EAX, EBX, EDX = Reserved

AMD

EAX= 80000007h

EDX = Advanced power management

EAX, EBX, ECX = Reserved

Intel

EAX= 80000007h

EAX, EBX, ECX, EDX = Reserved

AMD, Intel

EAX= 80000008h

EAX =

   Bits 0...7 – Physical address bits

   Bits 8...15 – Virtual address bits

   Bits 16...31 – Reserved

EBX, ECX, EDX = Reserved
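The bit layout listed for function #1 can be unpacked with a few shifts and masks. The following is a minimal sketch in C, not the book's detection code: the names `CpuSignature`, `decode_signature`, and `display_family` are hypothetical, and the rule of adding the extended family only when the base family is 0Fh is the convention commonly documented by the vendors rather than something stated in the table above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raw fields of the function #1 EAX value (hypothetical type name). */
typedef struct {
    unsigned stepping;    /* bits 0...3   */
    unsigned model;       /* bits 4...7   */
    unsigned family;      /* bits 8...11  */
    unsigned ext_model;   /* bits 16...19 */
    unsigned ext_family;  /* bits 20...27 */
} CpuSignature;

/* Split the version information into its fields. */
CpuSignature decode_signature(uint32_t eax)
{
    CpuSignature s;
    s.stepping   =  eax        & 0x0Fu;
    s.model      = (eax >> 4)  & 0x0Fu;
    s.family     = (eax >> 8)  & 0x0Fu;
    s.ext_model  = (eax >> 16) & 0x0Fu;
    s.ext_family = (eax >> 20) & 0xFFu;
    return s;
}

/* Assumed convention: the extended family counts only when family is 0Fh. */
unsigned display_family(const CpuSignature *s)
{
    assert(s != NULL);
    return (s->family == 0x0Fu) ? s->family + s->ext_family : s->family;
}
```

For example, `decode_signature(0x00000F43)` yields family 15, model 4, stepping 3, which would fit the 80(x)86 family grouping described at the start of the chapter.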

The initial CPUID call gives us the manufacturer ID string.

  Intel:        db       "GenuineIntel"

                mov      eax,0            ; Function #0
                cpuid

                cmp      ebx,dword ptr Intel
                jne      $Nope            ; Jump if not a match
                cmp      edx,dword ptr Intel+4
                jne      $Nope            ; Jump if not a match
                cmp      ecx,dword ptr Intel+8
                jne      $Nope            ; Jump if not a match

  ; We have a match!!! (If an Intel chip!)
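The same comparison can be expressed in C once the three registers have been captured. This is a sketch under assumptions: `vendor_string` and `vendor_is` are hypothetical helper names, and the register values are expected to have been obtained elsewhere (for example by an assembly routine such as the one above). The bytes are extracted by shifting so the result does not depend on the byte order of the machine running the C code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build the 12-character ID from EBX, EDX, ECX (in that order),
   taking the low byte of each register first. */
void vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };

    assert(out != NULL);
    for (size_t r = 0; r < 3; ++r)
        for (size_t b = 0; b < 4; ++b)
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFFu);
    out[12] = '\0';
}

/* Nonzero when the registers spell the given 12-character name. */
int vendor_is(uint32_t ebx, uint32_t edx, uint32_t ecx, const char *name)
{
    char id[13];

    assert(name != NULL);
    vendor_string(ebx, edx, ecx, id);
    return strcmp(id, name) == 0;
}
```

With EBX = 756E6547h, EDX = 49656E69h, and ECX = 6C65746Eh, `vendor_string` produces "GenuineIntel".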

Standard CPUID EDX-Feature Flags

  ; CPUID (EDX= flags)  <<< Command EAX=1

  CPUIDFLG_  Code         Bit  Flag Descriptions
  FPU        000000001h    0   Floating-point support
  VME        000000002h    1   Virtual Mode Extensions
  DE         000000004h    2   Debugging Extensions
  PSE        000000008h    3   Page Size Extension
  TSC        000000010h    4   RDTSC supported
  MSR        000000020h    5   RDMSR and WRMSR
  PAE        000000040h    6   Physical Address Extensions
  MCE        000000080h    7   Machine Check Exception
  CX8        000000100h    8   CMPXCHG8B supported
  APIC       000000200h    9   Advanced Programmable Interrupt Controller
  ---        000000400h   10   Reserved
  SEP        000000800h   11   SYSENTER, SYSEXIT supported
  MTRR       000001000h   12   Memory-type Range Reg
  PGE        000002000h   13   Page Global Enable
  MCA        000004000h   14   Machine Check Architecture
  CMOV       000008000h   15   CMOV supported
  PAT        000010000h   16   Page Attribute Table
  PSE36      000020000h   17   36-bit Page-Size Extensions
  PSN        000040000h   18   (Intel) Processor Serial # (AMD) Reserved
  CLFLUSH    000080000h   19   CLFlush enabled
  ---        000100000h   20   Reserved
  DS         000200000h   21   (Intel) Debug Store (AMD) Reserved
  ACPI       000400000h   22   (Intel) Thermal Monitor (AMD) Reserved
  MMX        000800000h   23   MMX supported
  FXSR       001000000h   24   Fast floating-point save and load
  SSE        002000000h   25   SSE supported
  SSE2       004000000h   26   SSE2 supported
  SS         008000000h   27   (Intel) Self Snoop (AMD) Reserved
  HTT        010000000h   28   (Intel) HTT (HyperThread) (AMD) Reserved
  TM         020000000h   29   (Intel) Thermal Monitor (AMD) Reserved
  ---        040000000h   30   Reserved
  PBE        080000000h   31   (Intel) Pending Break (AMD) Reserved

Intel — Standard CPUID ECX-Feature Flags

   ; CPUID (ECX= flags) <<< Command EAX=1

  CPUIDFLG_  Code         Bit    Flag Descriptions
  SSE3       000000001h   0      SSE3 supported
  ---        00000000xh   1, 2   Reserved
  MONITOR    000000008h   3      MONITOR, MWAIT supported
  DS_CPL     000000010h   4      CPL Qualified Debug Store
  ---        0000000x0h   5, 6   Reserved
  EIST       000000080h   7      Enhanced Intel SpeedStep
  TM2        000000100h   8      Thermal Monitor 2
  ---        000000200h   9      Reserved
  CID        000000400h   10     Context ID
  ---        00000xx00h   11-13  Reserved
  xTPR       000004000h   14     Send Task Priority Messages
  ---                     15-31  Reserved

Intel — Extended #1 CPUID EDX-Feature Flags

   ; CPUID (EDX= flags) <<< Command EAX=8000:0001h

  CPUIDFLG_  Code         Bit    Flag Descriptions
  ---                     0-28   Reserved
  EM64T      020000000h   29     EM64T supported
  ---                     30-31  Reserved

AMD — Extended #1 CPUID EDX-Feature Flags

   ; CPUID (EDX= flags) <<< Command EAX= 8000:0001h

  AMD_EFLG   Code         Bit     Flag Descriptions
  FPU        000000001h   0       Floating Point support
  VME        000000002h   1       Virtual Mode Extensions
  DE         000000004h   2       Debugging Extensions
  PSE        000000008h   3       Page Size Extension
  TSC        000000010h   4       RDTSC supported
  MSR        000000020h   5       RDMSR and WRMSR
  PAE        000000040h   6       Physical Address Extensions
  MCE        000000080h   7       Machine Check Exception
  CX8        000000100h   8       CMPXCHG8B supported
  APIC       000000200h   9       Advanced Programmable Interrupt Controller
  ---        000000400h   10      Reserved
  SEP        000000800h   11      SYSCALL, SYSRET enabled
  MTRR       000001000h   12      Memory-type Range Reg
  PGE        000002000h   13      Global Page Extension
  MCA        000004000h   14      Machine Check Architecture
  CMOV       000008000h   15      CMOV supported
  PAT        000010000h   16      Page Attribute Table
  PSE36      000020000h   17      Page-Size Extensions
  ---        0000x0000h   18, 19  Reserved
  NEPP       000100000h   20      No-Execute Page Protection
  ---        000200000h   21      Reserved
  MMXEXT     000400000h   22      MMX Extensions supported
  MMX        000800000h   23      MMX supported
  FXSAVE     001000000h   24      FXSAVE, FXRSTOR enable
  FFXSAVE    002000000h   25      Fast FXSAVE, FXRSTOR
  ---                     26-28   Reserved
  EM64T      020000000h   29      EM64T / AMD64 (long)
  3DNOWX     040000000h   30      3DNow! MMX+ supported
  3DNOW      080000000h   31      3DNow! supported

PIII Serial License

For the PIII processor, which carried the original SSE instruction set, Intel created a processor serial number feature, but after a political uproar over it as an infringement upon privacy it was removed from successive processors. In some respects it was a good thing to be able to track a particular computer, such as one used to violate the rules of an online gaming network: an exact machine could be banned by its fingerprint. However, others felt that people would lose their anonymity while on the Internet.

    mov eax,1
    cpuid

    test  edx,CPUIDFLG_PSN
    jz $xit

      ; CPUID serial number is supported and enabled!

    push eax
    mov eax,3
    cpuid
    pop eax

      ; eax:edx:ecx = 96-bit serial number in capitalized hex digits.
      ;               XXXX-XXXX-XXXX-XXXX-XXXX-XXXX
    $xit:
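Once the three registers have been captured, the grouped text form shown in the comment above could be produced as follows. This is only a sketch: `format_psn` and `PSN_STR_LEN` are hypothetical names, and the register order (EAX from function #1, then EDX and ECX from function #3) simply follows the eax:edx:ecx comment in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PSN_STR_LEN 30   /* 24 hex digits + 5 dashes + terminating NUL */

/* Format eax:edx:ecx as XXXX-XXXX-XXXX-XXXX-XXXX-XXXX in capital hex. */
void format_psn(uint32_t eax, uint32_t edx, uint32_t ecx,
                char out[PSN_STR_LEN])
{
    int n;

    assert(out != NULL);
    n = snprintf(out, PSN_STR_LEN, "%04X-%04X-%04X-%04X-%04X-%04X",
                 (unsigned)(eax >> 16), (unsigned)(eax & 0xFFFFu),
                 (unsigned)(edx >> 16), (unsigned)(edx & 0xFFFFu),
                 (unsigned)(ecx >> 16), (unsigned)(ecx & 0xFFFFu));
    assert(n == PSN_STR_LEN - 1);   /* always exactly 29 characters */
    (void)n;
}
```

For example, EAX = 00000673h, EDX = 12345678h, and ECX = 9ABCDEF0h format as 0000-0673-1234-5678-9ABC-DEF0.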

Sample CPU Detection Code

There are a lot of features in the CPUID, but most of them are not needed for what we are doing here. I have documented some of what this instruction does (a lot more than what I normally need), but I strongly recommend that if you are truly interested in this instruction that you download the manufacturer's technical manuals.

Most programs being written these days are primarily written for a Protected Mode environment and so we only need to deal with, at a minimum, the first processor capable of truly running in Protected Mode — the 386 processor. (The 80286 does not count!) This CPU detection algorithm detects the model, manufacturer, and capabilities, and sets flags as such. As we really only deal with 32-bit modes in this book, we do not bother detecting for an 8086, 80186, or an 80286. We do, however, detect for a 386 or above. In our algorithm we use the following CPU IDs.

This instruction has been enhanced since I wrote Vector Game Math Processors as newer instructions have been added to the processor. It has been used throughout the book, but let us examine it a bit closer.

 ;      CPU Detect - definition IDs

 CPU_386         = 3       ; 80386
 CPU_486         = 4       ; 80486
 CPU_PENTIUM     = 5       ; P5 (Pentium)
 CPU_PENTIUM_PRO = 6       ; Pentium Pro
 CPU_PII         = 6       ; PII

Prior to the Pentium processor, a computer system would optionally have a separate floating-point chip, which contained an FPU. In the case of CPUs, no functionality is lost as one upgrades to a more advanced processor; they are all downward compatible. This is not the case with the FPU: some functionality was lost, so if you are writing any floating-point instructions, you should know which FPU you are coding for. Some external FP chips did not exactly match the processor but were compatible with it.

 ; Legacy CPUs and compatible FPU coprocessors
 ;               CPU_086         NONE, FPU_087
 ;               CPU_186         NONE, FPU_087
 ;               CPU_286         NONE, FPU_287
 ;               CPU_386         NONE, FPU_287, FPU_387
 ;               CPU_486         NONE, FPU_387, FPU_487

 ;        FPU Detect - definition IDs

 FPU_NONE        = 0             ; No FPU chip
 FPU_087         = 1             ; 8087
 FPU_287         = 2             ; 80287
 FPU_387         = 3             ; 80387
 FPU_487         = CPU_486
 FPU_PENTIUM     = CPU_PENTIUM
 FPU_PII         = CPU_PII

The various manufacturers implemented the same functionality as Intel but more recently have begun to add their own extensions. As a result, the feature sets form unions and intersections rather than a single progression, and so we use individual flags to indicate CPU capability.

x86 CPU Detect — Bit Flags

  typedef enum
  {
     CPUBITS_FPU       = 0x0001, // FPU flag
     CPUBITS_MMX       = 0x0002, // MMX flag
     CPUBITS_3DNOW     = 0x0004, // 3DNow! flag
     CPUBITS_FXSR      = 0x0008, // Fast FP Store
     CPUBITS_SSE       = 0x0010, // SSE
     CPUBITS_SSE2      = 0x0020, // SSE (Ext 2)
     CPUBITS_3DNOW_MMX = 0x0040, // 3DNow! (MMX Ext)
     CPUBITS_3DNOW_EXT = 0x0080, // 3DNow! (Ext)
     CPUBITS_3DNOW_SSE = 0x0100, // 3DNow! Professional
     CPUBITS_HTT       = 0x0200, // Hyperthreading Tech
     CPUBITS_SSE3      = 0x0400, // Prescott NI
     CPUBITS_EM64T     = 0x0800, // EM64T supported
     CPUBITS_AMD64     = 0x1000, // AMD Long Mode
  } CPUBITS;
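To show how these flags might be filled in, here is a minimal C sketch that translates the function #1 EDX and ECX feature words into CPUBITS values. It is not the book's CpuDetect(): the function name `cpubits_from_std` is hypothetical, the masks are copied from the feature tables earlier in the chapter, and the 3DNow!, EM64T, and AMD64 bits are left out because they come from extended function 8000:0001h.

```c
#include <assert.h>
#include <stdint.h>

/* Function #1 feature masks, copied from the tables above. */
enum {
    CPUIDFLG_FPU  = 0x00000001,  /* EDX bit 0  */
    CPUIDFLG_MMX  = 0x00800000,  /* EDX bit 23 */
    CPUIDFLG_FXSR = 0x01000000,  /* EDX bit 24 */
    CPUIDFLG_SSE  = 0x02000000,  /* EDX bit 25 */
    CPUIDFLG_SSE2 = 0x04000000,  /* EDX bit 26 */
    CPUIDFLG_HTT  = 0x10000000,  /* EDX bit 28 */
    CPUIDFLG_SSE3 = 0x00000001   /* ECX bit 0  */
};

/* Subset of the CPUBITS flags defined above. */
enum {
    CPUBITS_FPU  = 0x0001,
    CPUBITS_MMX  = 0x0002,
    CPUBITS_FXSR = 0x0008,
    CPUBITS_SSE  = 0x0010,
    CPUBITS_SSE2 = 0x0020,
    CPUBITS_HTT  = 0x0200,
    CPUBITS_SSE3 = 0x0400
};

/* Map the standard feature words to CPUBITS (hypothetical helper). */
unsigned cpubits_from_std(uint32_t edx, uint32_t ecx)
{
    unsigned bits = 0;

    if (edx & CPUIDFLG_FPU)  bits |= CPUBITS_FPU;
    if (edx & CPUIDFLG_MMX)  bits |= CPUBITS_MMX;
    if (edx & CPUIDFLG_FXSR) bits |= CPUBITS_FXSR;
    if (edx & CPUIDFLG_SSE)  bits |= CPUBITS_SSE;
    if (edx & CPUIDFLG_SSE2) bits |= CPUBITS_SSE2;
    if (edx & CPUIDFLG_HTT)  bits |= CPUBITS_HTT;
    if (ecx & CPUIDFLG_SSE3) bits |= CPUBITS_SSE3;
    return bits;
}
```

Keeping the application-level flags separate from the raw CPUID masks means the rest of the program never has to know which register or function a capability came from.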

Each manufacturer has its own unique optimization methods and so we get a vendor name.

x86 CPU Detect — Vendors

Example 16-1. ...inc???CpuAsm.h

  typedef enum
  {
      CPUVEN_UNKNOWN   = 0, // Unknown
      CPUVEN_INTEL     = 1, // Intel
      CPUVEN_AMD       = 2, // AMD
      CPUVEN_CYRIX     = 3, // Cyrix
      CPUVEN_CENTAUR   = 4, // IDT Centaur (WinChip)
      CPUVEN_NATIONAL  = 5, // National Semiconductor
      CPUVEN_UMC       = 6, // UMC
      CPUVEN_NEXGEN    = 7, // NexGen
      CPUVEN_RISE      = 8, // Rise
      CPUVEN_TRANSMETA = 9  // Transmeta
  } CPUVEN;

We use the following data structure to reference the extracted CPU information.

Cpu Detect — Information

  typedef struct CpuInfoType
  {
      uint  nCpuId;   // CPU type identifier
      uint  nFpuId;   // floating-point Unit  ID
      uint  nBits;    // Feature bits
      uint  nMfg;     // Manufacturer
      byte  nProcCnt; // # of logical processors
      byte  pad[3];
 } CpuInfo;

 CpuInfo struct 4
         nCpuId   dd  0 ; CPU type identifier
         nFpuId   dd  0 ; Floating-point unit identifier
         nBits    dd  0 ; Feature bits
         nMfg     dd  0 ; Manufacturer
         nProcCnt db  0 ; # of logical processors
         pad      db  0,0,0
 CpuInfo ends

This book's CPU detection uses the following data structure for finding matching vendor information. Each microprocessor that supports the CPUID instruction has encoded a 12-byte text string identifying the manufacturer.

  ;      Vendor Data Structure

  VENDOR STRUCT 4
         vname  BYTE   '------------'
         Id     DWORD  CPUVEN_UNKNOWN
  VENDOR ENDS

  VENDOR { "AMD ISBETTER", CPUVEN_AMD }       ; AMD Proto
  VENDOR { "AuthenticAMD", CPUVEN_AMD }       ; AMD
  VENDOR { "CyrixInstead", CPUVEN_CYRIX }     ; Cyrix & IBM
  VENDOR { "GenuineIntel", CPUVEN_INTEL }     ; Intel
  VENDOR { "CentaurHauls", CPUVEN_CENTAUR }   ; Centaur
  VENDOR { "UMC UMC UMC ", CPUVEN_UMC }       ; UMC (retired)
  VENDOR { "NexGenDriver", CPUVEN_NEXGEN }    ; NexGen (retired)
  VENDOR { "RiseRiseRise", CPUVEN_RISE }      ; Rise
  VENDOR { "GenuineTMx86", CPUVEN_TRANSMETA } ; Transmeta
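A C counterpart of this lookup might look like the sketch below. The type `VendorEntry` and the function `vendor_lookup` are hypothetical names; the table entries mirror the assembly table above, and anything not listed falls through to CPUVEN_UNKNOWN.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    CPUVEN_UNKNOWN = 0, CPUVEN_INTEL = 1, CPUVEN_AMD = 2,
    CPUVEN_CYRIX = 3, CPUVEN_CENTAUR = 4, CPUVEN_NATIONAL = 5,
    CPUVEN_UMC = 6, CPUVEN_NEXGEN = 7, CPUVEN_RISE = 8,
    CPUVEN_TRANSMETA = 9
} CPUVEN;

typedef struct {
    const char *name;   /* 12-character CPUID vendor string */
    CPUVEN      id;
} VendorEntry;

static const VendorEntry vendor_table[] = {
    { "AMD ISBETTER", CPUVEN_AMD       },
    { "AuthenticAMD", CPUVEN_AMD       },
    { "CyrixInstead", CPUVEN_CYRIX     },
    { "GenuineIntel", CPUVEN_INTEL     },
    { "CentaurHauls", CPUVEN_CENTAUR   },
    { "UMC UMC UMC ", CPUVEN_UMC       },
    { "NexGenDriver", CPUVEN_NEXGEN    },
    { "RiseRiseRise", CPUVEN_RISE      },
    { "GenuineTMx86", CPUVEN_TRANSMETA }
};

/* Linear search; compares at most the 12 vendor characters. */
CPUVEN vendor_lookup(const char *id12)
{
    size_t i;

    assert(id12 != NULL);
    for (i = 0; i < sizeof vendor_table / sizeof vendor_table[0]; ++i)
        if (strncmp(id12, vendor_table[i].name, 12) == 0)
            return vendor_table[i].id;
    return CPUVEN_UNKNOWN;
}
```

A linear search is enough here because the table is tiny and the lookup runs once at startup.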

Example 16-2. ...RootApp.cpp

  #include "CpuAsm.h"             // CPU module

      CpuInfo cinfo;
      char szBuf[ CPU_SZBUF_MAX ];

      CpuDetect( &cinfo );       // Detect CPU

      cout << "\nCPU Detection Code Snippet\n\n";
            // Fills in buffer 'szBuf' with CPU information!
      cout << CpuInfoStr( szBuf, &cinfo ) << endl;

      CpuSetup( &cinfo );         // Now set up function pointers
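The body of CpuInfoStr() is not listed in this chapter. As a rough illustration only, a formatter that builds a one-line summary from a CpuInfo structure could be sketched in C as below. Everything here is an assumption: the function name `cpu_info_summary`, the extra buffer-size parameter, the vendor spellings other than INTEL, and the order in which the flag labels are emitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same fields as the chapter's CpuInfo, spelled with plain C types. */
typedef struct CpuInfoType {
    unsigned int  nCpuId;    /* CPU type identifier    */
    unsigned int  nFpuId;    /* floating-point unit ID */
    unsigned int  nBits;     /* CPUBITS feature bits   */
    unsigned int  nMfg;      /* CPUVEN manufacturer    */
    unsigned char nProcCnt;  /* # of logical processors */
    unsigned char pad[3];
} CpuInfo;

/* Assumed labels, indexed by CPUVEN value. */
static const char *const kVendorNames[] = {
    "UNKNOWN", "INTEL", "AMD", "CYRIX", "CENTAUR",
    "NATIONAL", "UMC", "NEXGEN", "RISE", "TRANSMETA"
};

/* CPUBITS value and label, in the (assumed) order of output. */
static const struct { unsigned int bit; const char *label; } kBitLabels[] = {
    { 0x0001, "FPU"  }, { 0x0002, "MMX"  }, { 0x0008, "FXSR" },
    { 0x0010, "SSE"  }, { 0x0020, "SSE2" }, { 0x0400, "SSE3" },
    { 0x0200, "HTT"  }
};

/* Write "CpuId:<n> '<vendor>' <flags...>" into buf, truncating if needed. */
char *cpu_info_summary(char *buf, size_t cap, const CpuInfo *info)
{
    const size_t nVendors = sizeof kVendorNames / sizeof kVendorNames[0];
    const size_t nLabels  = sizeof kBitLabels / sizeof kBitLabels[0];
    size_t i;
    int n;

    assert(buf != NULL && info != NULL && cap > 0);

    n = snprintf(buf, cap, "CpuId:%u '%s'", info->nCpuId,
                 kVendorNames[info->nMfg < nVendors ? info->nMfg : 0]);

    /* Append one label per set bit while there is room left. */
    for (i = 0; i < nLabels && n > 0 && (size_t)n < cap; ++i) {
        if (info->nBits & kBitLabels[i].bit)
            n += snprintf(buf + n, cap - (size_t)n, " %s",
                          kBitLabels[i].label);
    }
    return buf;
}
```

Returning the buffer pointer lets the call sit directly inside an output expression, as the listing above does with CpuInfoStr().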

This is an example of what gets filled into the ASCII buffer with a call to the function CpuInfoStr().

 "CpuId:15 'INTEL' FPU MMX FXSR SSE SSE2 SSE3 HTT"

That took care of the initial detection code. Now comes the fun part: function mapping. Every function you write should have a slower default version written in a high-level language such as C, with the processor-specific versions layered on top of it. This is really very simple. First there are the private definitions:

        void FmdSetup(const CpuInfo * const pcinfo);

        void vmp_FMulGeneric(float * const pfD, float fA, float fB);
        void vmp_FMulAsm3DNow(float * const pfD, float fA, float fB);
        void vmp_FMulAsmSSE(float * const pfD, float fA, float fB);

        void vmp_FDivGeneric(float * const pfD, float fA, float fB);
        void vmp_FDivAsm3DNow(float * const pfD, float fA, float fB);
        void vmp_FDivAsmSSE(float * const pfD, float fA, float fB);

        void vmp_FDivFastAsm3DNow(float * const pfD, float fA, float fB);
        void vmp_FDivFastAsmSSE(float * const pfD, float fA, float fB);

Then there are the public application definitions:

   // Multiplication
   typedef void (*vmp_FMulProc)(float * const pfD, float fA, float fB);
   extern vmp_FMulProc vmp_FMul;

   // Division
   typedef void (*vmp_FDivProc)(float * const pfD, float fA, float fB);
   extern vmp_FDivProc vmp_FDiv;
   extern vmp_FDivProc vmp_FDivFast;

There are the generic as well as processor-based functions such as:

   // Multiplication

   void vmp_FMulGeneric(float * const pfD, float fA, float fB)
   {
       ASSERT_PTR4(pfD);

       *pfD = fA * fB;
   }

The initialization code assigns the appropriate processor-based function to the public function pointer:

   void CpuSetup(const CpuInfo * const pcinfo)
   {
       ASSERT_PTR4(pcinfo);

       if (CPUBITS_SSE & pcinfo->nBits)
         {
           vmp_FMul =              vmp_FMulAsmSSE;
           vmp_FDiv =              vmp_FDivAsmSSE;
           vmp_FDivFast =          vmp_FDivFastAsmSSE; // ***FAST***
         }
       else if (CPUBITS_3DNOW & pcinfo->nBits)
         {
           vmp_FMul =              vmp_FMulAsm3DNow;
           vmp_FDiv =              vmp_FDivAsm3DNow;
           vmp_FDivFast =          vmp_FDivFastAsm3DNow; //***FAST***
         }

      else
         {
           vmp_FMul =              vmp_FMulGeneric;
           vmp_FDiv =              vmp_FDivGeneric;
           vmp_FDivFast =          vmp_FDivGeneric;
         }
  }

You will probably need to play with the mapping until you get used to it. You could use case statements, function table lookups, or other methods, but due to similarity of processor types I find the conditional branching with Boolean logic seems to work best.

What is supplied should be thought of as a starting point. It should be included with most applications, even those that do not use any custom assembly code, as it will compile a breakdown of the computer that ran the application. With custom assembly code, it is the building block for writing cross-processor code. There is one more bit of "diagnostic" information that you can use: the processor speed. It can give you an idea of why your application is not running well. (Sometimes processors do not run at their marked speed, either through misconfiguration or overheating.) This is discussed in Chapter 18, "System."

The listed information can be obtained by using the included function CpuDetect(); however, from your point of view, who manufactured the CPU is not nearly as important as the CPUBITS flags listed above. Each of those bits being set indicates the existence of the associated functionality. Your program merely checks the bit and selects the matching set of code. If the processor sets the CPUBITS_3DNOW bit, the program vectors to the 3DNow!-based algorithm. If the CPUBITS_SSE bit is set, it vectors to that set of code. Keep in mind that when I first started writing this book neither existed on the same CPU, but while I was writing it, AMD came out with 3DNow! Professional. This is a union of the two superset families (excluding SSE3), for which there is also a CPU bit definition. However, that can easily change in the future. My recommendation is to rank the candidates from highest to lowest performance in the initialization logic of your program, based upon your application's criteria.
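One way to express such a priority ranking is a small table that is scanned from the fastest candidate to the slowest. The C sketch below is an illustration rather than the book's code: `FMulChoice`, `fmul_priority`, and `select_fmul` are hypothetical names, and the SSE and 3DNow! entries are plain C stand-ins for the assembly routines so that the example builds on any machine.

```c
#include <assert.h>
#include <stddef.h>

enum { CPUBITS_3DNOW = 0x0004, CPUBITS_SSE = 0x0010 };

typedef void (*vmp_FMulProc)(float * const pfD, float fA, float fB);

/* Generic C fallback, as in the chapter. */
void FMulGeneric(float * const pfD, float fA, float fB)
{
    assert(pfD != NULL);
    *pfD = fA * fB;
}

/* Stand-ins for the assembly versions (assumed, for illustration). */
void FMulSSEStandIn(float * const pfD, float fA, float fB)
{
    FMulGeneric(pfD, fA, fB);
}

void FMul3DNowStandIn(float * const pfD, float fA, float fB)
{
    FMulGeneric(pfD, fA, fB);
}

typedef struct {
    unsigned     required;  /* CPUBITS that must all be set */
    vmp_FMulProc fn;        /* routine to use when they are */
} FMulChoice;

/* Ordered from highest to lowest expected performance. */
static const FMulChoice fmul_priority[] = {
    { CPUBITS_SSE,   FMulSSEStandIn   },
    { CPUBITS_3DNOW, FMul3DNowStandIn },
    { 0,             FMulGeneric      }   /* always matches */
};

/* Return the first routine whose required bits are all present. */
vmp_FMulProc select_fmul(unsigned nBits)
{
    size_t i;

    for (i = 0; i < sizeof fmul_priority / sizeof fmul_priority[0]; ++i)
        if ((nBits & fmul_priority[i].required) == fmul_priority[i].required)
            return fmul_priority[i].fn;
    return FMulGeneric;
}
```

Changing the ranking for a particular application then means reordering table rows rather than rewriting the branching logic.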
