Chapter 16. What CPUID?

Multiple manufacturers make different models of 80x86-type microprocessors. Some are highly specialized variations of the Intel processors, but most are not: they are clones of the Intel processor family with their own designs, and those designs require alternate optimization methods. Most of these manufacturers publish technical manuals, usually in PDF format, that can be downloaded from the Internet and used for all your custom optimization needs. If the project you're coding for uses custom hardware, then you are probably using a custom processor such as National Semiconductor's NS486SXF under an operating system such as pSOS. When you are designing code for a specific processor, your code can be highly optimized and tuned accordingly.

When the hardware you are writing code for is more generic, the programmer needs a method to identify the exact model of processor that the code is running on. Each manufacturer has written sample CPU detection code that uses the CPUID instruction. This is great, but these code samples are not exactly compatible with each other. Since it is impractical to write code that encapsulates all of these samples, I have written this chapter to help you. You can find all sorts of variations of the following program on the Internet, but the following is designed to be expandable and versatile.

Most of these Intel processors are derivatives of one another, but if we take a closer look at their "family type" we will note a pattern of 80(x)86, where the x represents a family number. A 3 would be the 80386, and so on. Using this family type number we can group a processor into a category of functionality, as each group has its own subset of instructions that it can execute.

Other manufacturers have second-sourced various models of the 80x86 processor line. Intel and AMD are the primary manufacturers, but other manufacturers have brought to market their modified or less expensive versions of these same processors.

Workbench Files: \Bench\x86\chap03\project\platform

	project	platform
CPU ID	cpuid	vc6
		vc.net

CPUID

Mnemonic	P	PII	K6	3D!	3Mx+	SSE	SSE2	A64	SSE3	E64T
CPUID

cpuid

This instruction uses the value stored in the EAX register as a function identifier and returns the related requested information in the various associated registers.

With the release of the Pentium chip, Intel instituted the CPUID instruction, which gives detailed information of the capabilities of the individual processor. This was also introduced into the re-release of the Intel 80486 processor. AMD has implemented it in all models since the Am486. This makes it easier to identify the capabilities of the CPU being tested.

Before trying to use this instruction, bit #21 of the EFLAGS/RFLAGS register must be tested to see if it is writable. If it is, the CPUID instruction exists and can therefore be called. Application code mainly uses the PUSHFD/PUSHFQ and POPFD/POPFQ instructions to manipulate the EFLAGS/RFLAGS register.

        pushfd                  ; push EFLAGS register
        pop    eax              ; pop those flags into EAX
        xor    eax,EFLAGS_ID    ; flip ID bit#21 in EFLAGS
        push   eax              ; push modified flags on stack
        popfd                   ; pop flags back into EFLAGS
        pushfd                  ; Push resulting EFLAGS on stack
        pop    ecx              ; pop those flags into ECX
        xor    eax,ecx          ; Compare flags written with flags read back
        and    eax,EFLAGS_ID    ; Keep only the ID bit of the difference
        jnz    $nope            ; Jump if bit did not stay flipped

   ; If here then bit flipped so CPUID exists
           cpuid

At a very minimum, all CPUs that support the CPUID instruction support both functions #0 and #1.

Flags	O.flow	Sign	Zero	Aux	Parity	Carry
	-	-	-	-	-	-

Flags: None are altered by this opcode.

Function	Returned Data
EAX=0	EAX = The highest CPUID function number this CPU can handle. The Intel Pentium and 486 return a 1 in EAX. The Pentium Pro returns a 2 in EAX.
	EBX, EDX, ECX = 12-byte text identifier, read in that register order:
	AMD = "Auth", "enti", "cAMD"
	Centaur = "Cent", "aurH", "auls"
	Cyrix = "Cyri", "xIns", "tead"
	Intel = "Genu", "ineI", "ntel"
EAX=1	EAX = Version information:
	Bits 0...3 – Stepping ID
	Bits 4...7 – Model
	Bits 8...11 – Generation / family
	Bits 12...15 – Reserved
	Bits 16...19 – Extended model
	Bits 20...27 – Extended family
	Bits 28...31 – Reserved
	EBX =
	Bits 0...7 – Brand Index
	Bits 8...15 – CLFLUSH line size
	Bits 16...23 – (Intel) # of logical processors (AMD) Reserved
	Bits 24...31 – Processor's initial local APIC ID
	ECX = (Intel) Feature info (AMD) Reserved
	EDX = Feature info
Intel EAX=2	EAX, EBX, ECX, EDX = Cache and TLB information
Intel EAX=3	EAX, EBX, ECX, EDX = Reserved
Intel EAX=4	EAX =
	Bits 0...4 – Cache type
	Bits 5...7 – Cache level
	Bit 8 – Self-initializing cache
	Bit 9 – Fully associative cache
	Bits 10...13 – Reserved
	Bits 14...25 – Number of threads sharing cache
	Bits 26...31 – Number of processor cores on the die
	EBX =
	Bits 0...11 – L = System coherency line size
	Bits 12...21 – P = Physical line partitions
	Bits 22...31 – W = Ways of associativity
	ECX = Bits 0...31 – Number of sets
	EDX = Reserved
Intel EAX=5	EAX =
	Bits 0...15 – Smallest monitor-line byte size
	Bits 16...31 – Reserved
	EBX, ECX, EDX = Reserved
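
The version fields returned in EAX by function #1 can be pulled apart with a few shifts and masks. The following is only a sketch in C: the `CpuSignature` type and `cpu_signature_decode()` name are hypothetical and are not part of the book's CpuAsm module, and the raw EAX value is assumed to have been captured already by a CPUID call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of the EAX value returned by CPUID function #1.
   (Hypothetical type, for illustration only.) */
typedef struct {
    unsigned stepping;   /* bits 0...3   */
    unsigned model;      /* bits 4...7   */
    unsigned family;     /* bits 8...11  */
    unsigned ext_model;  /* bits 16...19 */
    unsigned ext_family; /* bits 20...27 */
} CpuSignature;

/* Split a raw signature into its bit fields. */
static void cpu_signature_decode(uint32_t eax, CpuSignature *out)
{
    assert(out != NULL);

    out->stepping   = (unsigned)( eax        & 0x0Fu);
    out->model      = (unsigned)((eax >> 4)  & 0x0Fu);
    out->family     = (unsigned)((eax >> 8)  & 0x0Fu);
    out->ext_model  = (unsigned)((eax >> 16) & 0x0Fu);
    out->ext_family = (unsigned)((eax >> 20) & 0xFFu);
}
```

Passing a signature such as 00000F29h yields family 15, model 2, stepping 9, with both extended fields zero.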

AMD, Cyrix, and WinChip

EAX= 80000000h

If string identifier with function #0 matches for AMD, Cyrix, or WinChip, test for this function. If a non-zero value is returned in EAX, an extended function set is supported, just like function #0. The EAX register contains the highest extended function that the CPU can handle.

Intel

EAX= 80000000h

EAX = Maximum input value for extended CPUIDs

EBX, ECX, EDX = Reserved

AMD, Cyrix, and WinChip

EAX= 80000001h

See the Intel – Standard CPUID ECX-Feature Flags section.

EAX = Processor signature

EBX, ECX = Reserved

See the AMD – Extended #1 CPUID EDX-Feature Flags section.

Intel

EAX= 80000001h

Extended processor signature and extended feature bits. See the Intel – Extended #1 CPUID EDX-Feature Flags section.

AMD, Cyrix, WinChip, and Intel

EAX= 80000002h 80000003h 80000004h

EAX, EBX, ECX, EDX = 4 registers * 4 bytes * 3 functions = 48-byte text string (the processor name)
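
As an illustration of how the 48 bytes fit together, the sketch below concatenates the twelve register values in order (EAX, EBX, ECX, EDX for 80000002h, then the same for 80000003h and 80000004h). The function name and the `regs` array layout are assumptions made for this example; the caller is assumed to have already executed the three CPUID calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CPU_BRAND_LEN 48  /* 3 functions x 4 registers x 4 bytes */

/* regs[0..3]  = EAX, EBX, ECX, EDX from function 80000002h
   regs[4..7]  = the same registers from 80000003h
   regs[8..11] = the same registers from 80000004h
   out must have room for CPU_BRAND_LEN + 1 bytes. */
static void cpu_brand_from_regs(const uint32_t regs[12], char *out)
{
    size_t i;

    assert(regs != NULL && out != NULL);

    for (i = 0; i < CPU_BRAND_LEN; i++) {
        /* Each register holds four characters, lowest byte first. */
        out[i] = (char)((regs[i / 4] >> (8 * (i % 4))) & 0xFFu);
    }
    out[CPU_BRAND_LEN] = '\0';  /* terminate in case the CPU did not */
}
```

Extracting bytes with shifts rather than copying memory keeps the result independent of the byte order of the machine running the decoder.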

AMD

EAX= 80000005h

TLB and L1 cache information

Intel

EAX= 80000005h

EAX, EBX, ECX, EDX = Reserved

AMD, Cyrix, WinChip, and Intel

EAX= 80000006h

L2 Cache bits

ECX =

Bits 0...7 – Cache line size

Bits 8...11 – Lines per tag

Bits 12...15 – L2 Associativity

Bits 16...31 – Number of 1K cache blocks

EAX, EBX, EDX = Reserved
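
The ECX layout above can be unpacked as follows. This is a minimal sketch with hypothetical names (`L2CacheInfo`, `l2_cache_decode`); it only splits the fields listed above and does not interpret the encoded associativity value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of ECX from function 80000006h (hypothetical type). */
typedef struct {
    unsigned line_size;     /* bits 0...7:   cache line size          */
    unsigned lines_per_tag; /* bits 8...11:  lines per tag            */
    unsigned assoc_code;    /* bits 12...15: L2 associativity (coded) */
    unsigned size_kb;       /* bits 16...31: number of 1K blocks      */
} L2CacheInfo;

/* Split the raw ECX value into its bit fields. */
static void l2_cache_decode(uint32_t ecx, L2CacheInfo *out)
{
    assert(out != NULL);

    out->line_size     = (unsigned)( ecx        & 0xFFu);
    out->lines_per_tag = (unsigned)((ecx >> 8)  & 0x0Fu);
    out->assoc_code    = (unsigned)((ecx >> 12) & 0x0Fu);
    out->size_kb       = (unsigned)((ecx >> 16) & 0xFFFFu);
}
```

For example, an ECX value of 02006140h decodes to a 64-byte line, one line per tag, an associativity code of 6, and 512 blocks of 1K.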

AMD

EAX= 80000007h

EDX = Advanced power management

EAX, EBX, ECX = Reserved

Intel

EAX= 80000007h

EAX, EBX, ECX, EDX = Reserved

AMD, Intel

EAX= 80000008h

EAX =

Bits 0...7 – Physical address bits

Bits 8...15 – Virtual address bits

Bits 16...31 – Reserved

EBX, ECX, EDX = Reserved

The initial CPUID call gives us the manufacturer ID string.

  Intel:        db       "GenuineIntel"

                mov      eax,0            ; Function #0
                cpuid

                cmp      ebx,dword ptr Intel
                jne      $Nope            ; Jump if not a match
                cmp      edx,dword ptr Intel+4
                jne      $Nope            ; Jump if not a match
                cmp      ecx,dword ptr Intel+8
                jne      $Nope            ; Jump if not a match

  ; We have a match!!! (If an Intel chip!)
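
The same comparison can be expressed in C once the three registers have been captured. The helper names below are hypothetical, and the sketch assumes the register values were obtained elsewhere (for instance by an assembly stub such as the one above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CPU_VENDOR_LEN 12

/* Copy the four characters packed in one register, lowest byte first. */
static void unpack4(uint32_t reg, char *dst)
{
    int i;
    for (i = 0; i < 4; i++)
        dst[i] = (char)((reg >> (8 * i)) & 0xFFu);
}

/* Rebuild the vendor string from the function #0 results.
   Note the register order: EBX, then EDX, then ECX. */
static void cpu_vendor_from_regs(uint32_t ebx, uint32_t edx, uint32_t ecx,
                                 char out[CPU_VENDOR_LEN + 1])
{
    assert(out != NULL);

    unpack4(ebx, out);
    unpack4(edx, out + 4);
    unpack4(ecx, out + 8);
    out[CPU_VENDOR_LEN] = '\0';
}

/* Nonzero when the registers spell the given 12-character vendor name. */
static int cpu_vendor_is(uint32_t ebx, uint32_t edx, uint32_t ecx,
                         const char *name)
{
    char vendor[CPU_VENDOR_LEN + 1];

    assert(name != NULL);
    cpu_vendor_from_regs(ebx, edx, ecx, vendor);
    return strcmp(vendor, name) == 0;
}
```

With EBX = 756E6547h, EDX = 49656E69h, and ECX = 6C65746Eh, the rebuilt string is "GenuineIntel".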

Standard CPUID EDX-Feature Flags

  ; CPUID (EDX= flags)  <<< Command EAX=1

CPUIDFLG_	Code	Bit	Flag Descriptions
FPU	000000001h	0	Floating-point support
VME	000000002h	1	Virtual Mode Extensions
DE	000000004h	2	Debugging Extensions
PSE	000000008h	3	Page Size Extension
TSC	000000010h	4	RDTSC supported
MSR	000000020h	5	RDMSR and WRMSR
PAE	000000040h	6	Physical Address Extensions
MCE	000000080h	7	Machine Check Exception
CX8	000000100h	8	CMPXCHG8B supported
APIC	000000200h	9	Advanced Programmable Interrupt Controller
---	000000400h	10	Reserved
SEP	000000800h	11	SYSENTER, SYSEXIT supported
MTRR	000001000h	12	Memory-type Range Reg
PGE	000002000h	13	Page Global Enable
MCA	000004000h	14	Machine Check Architecture
CMOV	000008000h	15	CMOV supported
PAT	000010000h	16	Page Attribute Table
PSE36	000020000h	17	36-bit Page-Size Extensions
PSN	000040000h	18	(Intel) Processor Serial # (AMD) Reserved
CLFLUSH	000080000h	19	CLFlush enabled
---	000100000h	20	Reserved
DS	000200000h	21	(Intel) Debug Store (AMD) Reserved
ACPI	000400000h	22	(Intel) Thermal Monitor (AMD) Reserved
MMX	000800000h	23	MMX supported
FXSR	001000000h	24	Fast floating-point save and load
SSE	002000000h	25	SSE supported
SSE2	004000000h	26	SSE2 supported
SS	008000000h	27	(Intel) Self Snoop (AMD) Reserved
HTT	010000000h	28	(Intel) HTT (HyperThread) (AMD) Reserved
TM	020000000h	29	(Intel) Thermal Monitor (AMD) Reserved
---	040000000h	30	Reserved
PBE	080000000h	31	(Intel) Pending Break (AMD) Reserved

Intel — Standard CPUID ECX-Feature Flags

   ; CPUID (ECX= flags) <<< Command EAX=1

CPUIDFLG_	Code	Bit	Flag Descriptions
SSE3	000000001h	0	SSE3 supported
---	00000000xh	1, 2	Reserved
MONITOR	000000008h	3	MONITOR, MWAIT supported
DS_CPL	000000010h	4	CPL Qualified Debug Store
---	0000000x0h	5, 6	Reserved
EIST	000000080h	7	Enhanced Intel SpeedStep
TM2	000000100h	8	Thermal Monitor 2
---	000000200h	9	Reserved
CID	000000400h	10	Context ID
---	00000xx00h	11-13	Reserved
xTPR	000004000h	14	Send Task Priority Messages
---		15-31	Reserved

Intel — Extended #1 CPUID EDX-Feature Flags

   ; CPUID (EDX= flags) <<< Command EAX=8000:0001h

CPUIDFLG_	Code	Bit	Flag Descriptions
---		0-28	Reserved
EM64T	020000000h	29	EM64T supported
---		30-31	Reserved

AMD — Extended #1 CPUID EDX-Feature Flags

   ; CPUID (EDX= flags) <<< Command EAX= 8000:0001h

AMD_EFLG	Code	Bit	Flag Descriptions
FPU	000000001h	0	Floating Point support
VME	000000002h	1	Virtual Mode Extensions
DE	000000004h	2	Debugging Extensions
PSE	000000008h	3	Page Size Extension
TSC	000000010h	4	RDTSC supported
MSR	000000020h	5	RDMSR and WRMSR
PAE	000000040h	6	Physical Address Extensions
MCE	000000080h	7	Machine Check Exception
CX8	000000100h	8	CMPXCHG8B supported
APIC	000000200h	9	Advanced Programmable Interrupt Controller
---	000000400h	10	Reserved
SEP	000000800h	11	SYSCALL, SYSRET enabled
MTRR	000001000h	12	Memory-type Range Reg
PGE	000002000h	13	Global Page Extension
MCA	000004000h	14	Machine Check Architecture
CMOV	000008000h	15	CMOV supported
PAT	000010000h	16	Page Attribute Table
PSE	000020000h	17	Page-Size Extensions
---	0000x0000h	18, 19	Reserved
NEPP	000100000h	20	No-Execute Page Protection
---	000200000h	21	Reserved
MMXEXT	000400000h	22	MMX Extensions supported
MMX	000800000h	23	MMX supported
FXSAVE	001000000h	24	FXSAVE, FXRSTOR enable
FFXSAVE	002000000h	25	Fast FXSAVE, FXRSTOR
---		26-28	Reserved
EM64T	020000000h	29	EM64T / AMD64 (long)
3DNOWX	040000000h	30	3DNow! MMX+ supported
3DNOW	080000000h	31	3DNow! supported

PIII Serial License

Intel introduced a processor serial number feature with the PIII, the processor that carried the original SSE instruction set, but after a political uproar over it as an infringement upon privacy, the feature was removed from successive processors. In some respects it was a good thing to be able to track a particular computer, such as a violator of an online gaming network. An exact machine could be banned due to its fingerprint. However, others felt that people would lose their anonymity while on the Internet.

    mov eax,1
    cpuid

    test  edx,CPUIDFLG_PSN
    jz $xit

      ; CPUID serial number is supported and enabled!

    push eax
    mov eax,3
    cpuid
    pop eax

      ; eax:edx:ecx = 96-bit serial number in capitalized hex digits.
      ;               XXXX-XXXX-XXXX-XXXX-XXXX-XXXX
    $xit:
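
Once the three registers are in hand, printing the serial number in the grouped form shown in the comment is a matter of formatting six 16-bit pieces. The sketch below assumes the EAX:EDX:ECX ordering described above; the `psn_format()` name and its buffer-size convention are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PSN_STR_LEN 29  /* 24 hex digits + 5 dashes */

/* Format a 96-bit serial number (EAX:EDX:ECX, high to low) as
   XXXX-XXXX-XXXX-XXXX-XXXX-XXXX in capitalized hex digits.
   out_size must be at least PSN_STR_LEN + 1. */
static void psn_format(uint32_t eax, uint32_t edx, uint32_t ecx,
                       char *out, size_t out_size)
{
    assert(out != NULL && out_size > PSN_STR_LEN);

    snprintf(out, out_size, "%04X-%04X-%04X-%04X-%04X-%04X",
             (unsigned)(eax >> 16), (unsigned)(eax & 0xFFFFu),
             (unsigned)(edx >> 16), (unsigned)(edx & 0xFFFFu),
             (unsigned)(ecx >> 16), (unsigned)(ecx & 0xFFFFu));

    assert(strlen(out) == PSN_STR_LEN);
}
```

For instance, EAX = 00000673h, EDX = 12345678h, and ECX = 9ABCDEF0h format as "0000-0673-1234-5678-9ABC-DEF0".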

Sample CPU Detection Code

There are a lot of features in the CPUID, but most of them are not needed for what we are doing here. I have documented some of what this instruction does (a lot more than what I normally need), but I strongly recommend that if you are truly interested in this instruction that you download the manufacturer's technical manuals.

Most programs being written these days are primarily written for a Protected Mode environment and so we only need to deal with, at a minimum, the first processor capable of truly running in Protected Mode — the 386 processor. (The 80286 does not count!) This CPU detection algorithm detects the model, manufacturer, and capabilities, and sets flags as such. As we really only deal with 32-bit modes in this book, we do not bother detecting for an 8086, 80186, or an 80286. We do, however, detect for a 386 or above. In our algorithm we use the following CPU IDs.

This instruction has been enhanced since I wrote Vector Game Math Processors as newer instructions have been added to the processor. It has been used throughout the book, but let us examine it a bit closer.

 ;      CPU Detect - definition IDs

 CPU_386         = 3       ; 80386
 CPU_486         = 4       ; 80486
 CPU_PENTIUM     = 5       ; P5 (Pentium)
 CPU_PENTIUM_PRO = 6       ; Pentium Pro
 CPU_PII         = 6       ; PII

Prior to the Pentium processor, a computer system would optionally have a floating-point chip, which contained an FPU. In the case of CPUs, no functionality is lost as one upgrades to a more advanced processor; they are all downward compatible. This is not the case with the FPU. Some functionality was lost, so if you are writing any floating-point instructions, you should know which FPU you are coding for. Some external FP chips did not exactly match the processor but were compatible.

 ; Legacy CPUs and compatible FPU coprocessors
 ;               CPU_086         NONE, FPU_087
 ;               CPU_186         NONE, FPU_087
 ;               CPU_286         NONE, FPU_287
 ;               CPU_386         NONE, FPU_287, FPU_387
 ;               CPU_486         NONE, FPU_387, FPU_487

 ;        FPU Detect - definition IDs

 FPU_NONE        = 0             ; No FPU chip
 FPU_087         = 1             ; 8087
 FPU_287         = 2             ; 80287
 FPU_387         = 3             ; 80387
 FPU_487         = CPU_486
 FPU_PENTIUM     = CPU_PENTIUM
 FPU_PII         = CPU_PII

The various manufacturers originally implemented the same functionality as Intel, but more recently they have begun adding their own extensions. Because the resulting feature sets overlap only partially (unions and intersections can be drawn), we use individual flags to indicate CPU capability.

x86 CPU Detect — Bit Flags

  typedef enum
  {
     CPUBITS_FPU       = 0x0001, // FPU flag
     CPUBITS_MMX       = 0x0002, // MMX flag
     CPUBITS_3DNOW     = 0x0004, // 3DNow! flag
     CPUBITS_FXSR      = 0x0008, // Fast FP Store
     CPUBITS_SSE       = 0x0010, // SSE
     CPUBITS_SSE2      = 0x0020, // SSE (Ext 2)
     CPUBITS_3DNOW_MMX = 0x0040, // 3DNow! (MMX Ext)
     CPUBITS_3DNOW_EXT = 0x0080, // 3DNow! (Ext)
     CPUBITS_3DNOW_SSE = 0x0100, // 3DNow! Professional
     CPUBITS_HTT       = 0x0200, // Hyperthreading Tech
     CPUBITS_SSE3      = 0x0400, // Prescott NI
     CPUBITS_EM64T     = 0x0800, // EM64T supported
     CPUBITS_AMD64     = 0x1000, // AMD Long Mode
  } CPUBITS;
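
One way the feature-flag tables earlier in the chapter might be folded into these bits is sketched below. It is not the book's CpuDetect() implementation: the `CpuidRaw` type and `cpu_bits_from_raw()` name are hypothetical, only the unambiguous flags are mapped, and the caller is assumed to pass zero for `ext_edx` unless the vendor check has confirmed that the extended function is meaningful for that CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Subset of the CPUBITS values listed above. */
enum {
    CPUBITS_FPU   = 0x0001,
    CPUBITS_MMX   = 0x0002,
    CPUBITS_3DNOW = 0x0004,
    CPUBITS_FXSR  = 0x0008,
    CPUBITS_SSE   = 0x0010,
    CPUBITS_SSE2  = 0x0020,
    CPUBITS_HTT   = 0x0200,
    CPUBITS_SSE3  = 0x0400
};

/* Raw CPUID results gathered elsewhere (hypothetical type). */
typedef struct {
    uint32_t std_edx; /* EDX from function #1                        */
    uint32_t std_ecx; /* ECX from function #1                        */
    uint32_t ext_edx; /* EDX from function 80000001h, 0 if not valid */
} CpuidRaw;

/* Translate raw feature flags into CPUBITS values. */
static unsigned cpu_bits_from_raw(const CpuidRaw *raw)
{
    unsigned bits = 0;

    assert(raw != NULL);

    if (raw->std_edx & (1u << 0))  bits |= CPUBITS_FPU;   /* FPU   */
    if (raw->std_edx & (1u << 23)) bits |= CPUBITS_MMX;   /* MMX   */
    if (raw->std_edx & (1u << 24)) bits |= CPUBITS_FXSR;  /* FXSR  */
    if (raw->std_edx & (1u << 25)) bits |= CPUBITS_SSE;   /* SSE   */
    if (raw->std_edx & (1u << 26)) bits |= CPUBITS_SSE2;  /* SSE2  */
    if (raw->std_edx & (1u << 28)) bits |= CPUBITS_HTT;   /* HTT   */
    if (raw->std_ecx & (1u << 0))  bits |= CPUBITS_SSE3;  /* SSE3  */
    if (raw->ext_edx & (1u << 31)) bits |= CPUBITS_3DNOW; /* 3DNow! */

    return bits;
}
```

Keeping the translation in one function means the rest of the application only ever tests CPUBITS values and never touches raw CPUID bit positions.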

Each manufacturer has its own unique optimization methods and so we get a vendor name.

x86 CPU Detect — Vendors

Example 16-1. ...inc???CpuAsm.h

  typedef enum
  {
      CPUVEN_UNKNOWN   = 0, // Unknown
      CPUVEN_INTEL     = 1, // Intel
      CPUVEN_AMD       = 2, // AMD
      CPUVEN_CYRIX     = 3, // Cyrix
      CPUVEN_CENTAUR   = 4, // IDT Centaur (WinChip)
      CPUVEN_NATIONAL  = 5, // National Semiconductor
      CPUVEN_UMC       = 6, // UMC
      CPUVEN_NEXGEN    = 7, // NexGen
      CPUVEN_RISE      = 8, // Rise
      CPUVEN_TRANSMETA = 9  // Transmeta
  } CPUVEN;

We use the following data structure to reference the extracted CPU information.

Cpu Detect — Information

  typedef struct CpuInfoType
  {
      uint  nCpuId;   // CPU type identifier
      uint  nFpuId;   // floating-point Unit  ID
      uint  nBits;    // Feature bits
      uint  nMfg;     // Manufacturer
      byte  nProcCnt; // # of logical processors
      byte  pad[3];

 } CpuInfo;

 CpuInfo struct 4
         nCpuId   dd  0 ; CPU type identifier
         nFpuId   dd  0 ; Floating-point unit identifier
         nBits    dd  0 ; Feature bits
         nMfg     dd  0 ; Manufacturer
         nProcCnt db  0 ; # of logical processors
         pad      db  0,0,0
 CpuInfo ends

This book's CPU detection uses the following data structure for finding matching vendor information. Each microprocessor that supports the CPUID instruction has encoded a 12-byte text string identifying the manufacturer.

  ;      Vendor Data Structure

  VENDOR STRUCT 4
         vname  BYTE   '------------'
         Id     DWORD  CPUVEN_UNKNOWN
  VENDOR ENDS

  VENDOR { "AMD ISBETTER", CPUVEN_AMD }       ; AMD Proto
  VENDOR { "AuthenticAMD", CPUVEN_AMD }       ; AMD
  VENDOR { "CyrixInstead", CPUVEN_CYRIX }     ; Cyrix & IBM
  VENDOR { "GenuineIntel", CPUVEN_INTEL }     ; Intel
  VENDOR { "CentaurHauls", CPUVEN_CENTAUR }   ; Centaur
  VENDOR { "UMC UMC UMC ", CPUVEN_UMC }       ; UMC (retired)
  VENDOR { "NexGenDriver", CPUVEN_NEXGEN }    ; NexGen (retired)
  VENDOR { "RiseRiseRise", CPUVEN_RISE }      ; Rise
  VENDOR { "GenuineTMx86", CPUVEN_TRANSMETA } ; Transmeta
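
A C rendering of the same lookup might look like the sketch below. The table contents and the CPUVEN values mirror the listings in this chapter; the `VendorEntry` type and `cpu_vendor_lookup()` name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    CPUVEN_UNKNOWN   = 0, CPUVEN_INTEL   = 1, CPUVEN_AMD      = 2,
    CPUVEN_CYRIX     = 3, CPUVEN_CENTAUR = 4, CPUVEN_NATIONAL = 5,
    CPUVEN_UMC       = 6, CPUVEN_NEXGEN  = 7, CPUVEN_RISE     = 8,
    CPUVEN_TRANSMETA = 9
} CPUVEN;

/* One row of the vendor table (hypothetical type). */
typedef struct {
    const char *vname; /* 12-character CPUID vendor string */
    CPUVEN      id;
} VendorEntry;

static const VendorEntry vendors[] = {
    { "AMD ISBETTER", CPUVEN_AMD },
    { "AuthenticAMD", CPUVEN_AMD },
    { "CyrixInstead", CPUVEN_CYRIX },
    { "GenuineIntel", CPUVEN_INTEL },
    { "CentaurHauls", CPUVEN_CENTAUR },
    { "UMC UMC UMC ", CPUVEN_UMC },
    { "NexGenDriver", CPUVEN_NEXGEN },
    { "RiseRiseRise", CPUVEN_RISE },
    { "GenuineTMx86", CPUVEN_TRANSMETA }
};

/* Return the vendor ID for a 12-character vendor string,
   or CPUVEN_UNKNOWN when no table row matches. */
static CPUVEN cpu_vendor_lookup(const char *vname)
{
    size_t i;

    assert(vname != NULL);

    for (i = 0; i < sizeof vendors / sizeof vendors[0]; i++) {
        if (strncmp(vname, vendors[i].vname, 12) == 0)
            return vendors[i].id;
    }
    return CPUVEN_UNKNOWN;
}
```

Adding a new manufacturer then only requires a new table row, which matches the expandable intent of the VENDOR structure.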

Example 16-2. ...RootApp.cpp

  #include "CpuAsm.h"             // CPU module

      CpuInfo cinfo;
      char szBuf[ CPU_SZBUF_MAX ];

      CpuDetect( &cinfo );       // Detect CPU

      cout << "\nCPU Detection Code Snippet\n\n";
            // Fills in buffer 'szBuf' with CPU information!
      cout << CpuInfoStr( szBuf, &cinfo ) << endl;

      CpuSetup( &cinfo );         // Now set up function pointers

This is an example of what gets filled into the ASCII buffer with a call to the function CpuInfoStr().

 "CpuId:15 'INTEL' FPU MMX FXSR SSE SSE2 SSE3 HTT"

That took care of the initial detection code. Now comes the fun part: function mapping. Every function you write should have a slower default version written in a high-level language such as C. This is really very simple. First there are the private definitions:

        void CpuSetup(const CpuInfo * const pcinfo);

        void vmp_FMulGeneric(float * const pfD, float fA, float fB);
        void vmp_FMulAsm3DNow(float * const pfD, float fA, float fB);
        void vmp_FMulAsmSSE(float * const pfD, float fA, float fB);

        void vmp_FDivGeneric(float * const pfD, float fA, float fB);
        void vmp_FDivAsm3DNow(float * const pfD, float fA, float fB);
        void vmp_FDivAsmSSE(float * const pfD, float fA, float fB);

        void vmp_FDivFastAsm3DNow(float * const pfD, float fA, float fB);
        void vmp_FDivFastAsmSSE(float * const pfD, float fA, float fB);

Then there are the public application definitions:

   // Multiplication
   typedef void (*vmp_FMulProc)(float * const pfD, float fA, float fB);
   extern vmp_FMulProc vmp_FMul;

   // Division
   typedef void (*vmp_FDivProc)(float * const pfD, float fA, float fB);
   extern vmp_FDivProc vmp_FDiv;
   extern vmp_FDivProc vmp_FDivFast;

There are the generic as well as processor-based functions such as:

   // Multiplication

   void vmp_FMulGeneric(float * const pfD, float fA, float fB)
   {
       ASSERT_PTR4(pfD);

       *pfD = fA * fB;
   }

The initialization code assigns the appropriate processor-based function to the public function pointer:

   void CpuSetup(const CpuInfo * const pcinfo)
   {
       ASSERT_PTR4(pcinfo);

       if (CPUBITS_SSE & pcinfo->nBits)
         {
           vmp_FMul =              vmp_FMulAsmSSE;
           vmp_FDiv =              vmp_FDivAsmSSE;
           vmp_FDivFast =          vmp_FDivFastAsmSSE; // ***FAST***
         }

       else if (CPUBITS_3DNOW & pcinfo->nBits)
         {
           vmp_FMul =              vmp_FMulAsm3DNow;
           vmp_FDiv =              vmp_FDivAsm3DNow;
           vmp_FDivFast =          vmp_FDivFastAsm3DNow; //***FAST***
         }

      else
         {
           vmp_FMul =              vmp_FMulGeneric;
           vmp_FDiv =              vmp_FDivGeneric;
           vmp_FDivFast =          vmp_FDivGeneric;
         }
  }

You will probably need to play with the mapping until you get used to it. You could use case statements, function table lookups, or other methods, but due to similarity of processor types I find the conditional branching with Boolean logic seems to work best.
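
For comparison, the function table lookup alternative mentioned above might be sketched as follows. Everything here is illustrative: the row type, the `fmul_select()` helper, and the two stub functions standing in for hand-written assembly versions are hypothetical, and only the multiply pointer is shown.

```c
#include <assert.h>
#include <stddef.h>

#define CPUBITS_3DNOW 0x0004u
#define CPUBITS_SSE   0x0010u

typedef void (*FMulProc)(float *pfD, float fA, float fB);

/* Slow default version written in C. */
static void FMulGeneric(float *pfD, float fA, float fB)
{
    assert(pfD != NULL);
    *pfD = fA * fB;
}

/* Stand-ins for the assembly versions (hypothetical stubs). */
static void FMulSSEStub(float *pfD, float fA, float fB)
{
    FMulGeneric(pfD, fA, fB);
}

static void FMul3DNowStub(float *pfD, float fA, float fB)
{
    FMulGeneric(pfD, fA, fB);
}

/* One candidate implementation and the feature bits it requires. */
typedef struct {
    unsigned need;
    FMulProc fmul;
} FMulRow;

/* Rows are ordered from highest to lowest priority. */
static const FMulRow fmul_table[] = {
    { CPUBITS_SSE,   FMulSSEStub   },
    { CPUBITS_3DNOW, FMul3DNowStub },
    { 0,             FMulGeneric   }  /* requires nothing: always matches */
};

/* Return the first implementation whose required bits are all set. */
static FMulProc fmul_select(unsigned nBits)
{
    size_t i;

    for (i = 0; i < sizeof fmul_table / sizeof fmul_table[0]; i++) {
        if ((nBits & fmul_table[i].need) == fmul_table[i].need)
            return fmul_table[i].fmul;
    }
    return FMulGeneric;  /* not reached: the last row matches everything */
}
```

The table makes the priority order explicit in one place, at the cost of one table per function pointer; the if/else chain above stays more compact when many pointers share the same conditions.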

What is supplied should be thought of as a starting point. It should be included with most applications, even those that do not use any custom assembly code, as it can report a breakdown of the computer that ran the application. With custom assembly code, it is the building block of writing cross-processor code. There is one more bit of "diagnostic" information that you can use: the processor speed. It can give you an idea of why your application is not running well. (Sometimes processors do not run at their marked speed, either through misconfiguration or overheating.) This is discussed in Chapter 18, "System."

The listed information can be obtained by using the included function CpuDetect(); however, from your point of view, who manufactured the CPU is not nearly as important as the CPUBITS flags listed above. Each bit that is set indicates the existence of the associated functionality. Your program merely checks the bit and selects the matching set of code. If the processor sets the CPUBITS_3DNOW bit, the program vectors to the 3DNow!-based algorithm. If the CPUBITS_SSE bit is set, it vectors to that set of code. Keep in mind that when I first started writing this book the two did not exist on the same CPU, but while I was writing it, AMD came out with 3DNow! Professional. This is a union of the two superset families (excluding SSE3), for which there is also a CPU bit definition. However, that can easily change in the future. My recommendation is to rank the instruction sets from highest to lowest performance in the initialization logic of your program, based upon your application's criteria.
