Chapter 15. Binary-Coded Decimal (BCD)

Converting an ASCII string to binary-coded decimal is as easy as pie (or is it a piece of cake?). In BCD, each byte holds two decimal digits: the lower 4-bit nibble and the upper 4-bit nibble each store a value from 0 to 9 (think of a two-digit hex byte in which the upper six values, A through F, are never used).

Workbench Files:Benchx86chap15projectplatform

 

project

platform

ASE to VMP

ase2vmp

vc6

BCD 2N

cd

vc.net

BCD

Table 15-1. ASCII numerical digit to hex and decimal values

ASCII     0     1     2     3     4     5     6     7     8     9
Hex       0x30  0x31  0x32  0x33  0x34  0x35  0x36  0x37  0x38  0x39
Decimal   48    49    50    51    52    53    54    55    56    57
BCD       0     1     2     3     4     5     6     7     8     9
Binary    0000  0001  0010  0011  0100  0101  0110  0111  1000  1001

Converting an ASCII digit to a BCD nibble is as easy as subtracting '0' (0x30, or 48 decimal) from the ASCII character code; the result is a value in the range {0...9}.

   byte ASCIItoBCD(char c)
   {
     ASSERT(('0' <= c) && (c <= '9'));

     return (byte)(c - '0');
   }

When the 8086 processor was first manufactured the FPU was a separate optional chip (8087). There was a need for some BCD operations similar to other processors and so it was incorporated into the CPU. The 8087 had some BCD support as well. When the 64-bit processor was developed, it was decided that BCD support was not required anymore as the FPU was an alternative method.

The FPU uses the first nine bytes to support 18 BCD digits. The uppermost bit of the 10th byte indicates the value is negative if set or positive if the bit is clear.


Figure 15-1. Ten-byte BCD data storage. The MSB of the far left byte (byte #9) is the sign bit, and the rightmost nine bytes (#8...0) contain the BCD value pairs. The 18th BCD digit resides in the upper nibble of byte #8 and the 1st BCD digit resides in the lower nibble of byte #0.
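As a rough sketch of that layout, the following C function fills a ten-byte buffer from a 64-bit integer. The name int64_to_bcd10 and its return convention are my own inventions for illustration, not part of the book's sample code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: encode v into the 10-byte packed BCD layout.
   Bytes 0..8 hold 18 digits (lowest digit pair in byte 0) and
   bit 7 of byte 9 is the sign. Returns 0 on success, or -1 if the
   magnitude of v needs more than 18 decimal digits. */
static int int64_to_bcd10(int64_t v, uint8_t out[10])
{
    uint64_t mag = (v < 0) ? (uint64_t)0 - (uint64_t)v : (uint64_t)v;
    int i;

    if (mag > 999999999999999999ULL)
        return -1;                       /* will not fit in 18 digits */

    memset(out, 0, 10);
    for (i = 0; i < 9; i++) {
        uint8_t lo = (uint8_t)(mag % 10);    /* lower nibble digit */
        mag /= 10;
        uint8_t hi = (uint8_t)(mag % 10);    /* upper nibble digit */
        mag /= 10;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    if (v < 0)
        out[9] = 0x80;                   /* sign bit in the 10th byte */
    return 0;
}
```

Encoding -1234 this way should leave 0x34 in byte #0, 0x12 in byte #1, zeros in bytes #2 through #8, and 0x80 in byte #9.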

Setting the upper nibble of a byte is merely a matter of shifting a BCD digit left by four bits and then logically ORing (or summing) it with the lower nibble.

   byte BCDtoByte(byte lo, byte hi)
   {
     return (hi << 4) | lo;
   }
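To tie the two helpers together, here is a small sketch that packs a pair of ASCII digits into one BCD byte and splits it apart again. The names bcd_pack, bcd_from_ascii_pair, and bcd_unpack are hypothetical, and the code assumes the caller has already verified that both characters are digits.

```c
#include <stdint.h>

/* Pack two decimal digits (each 0..9): hi goes in the upper nibble. */
static uint8_t bcd_pack(uint8_t hi, uint8_t lo)
{
    return (uint8_t)((hi << 4) | (lo & 0x0F));
}

/* Pack the two ASCII digits at s[0] and s[1], e.g. "42" becomes 0x42.
   Assumes both characters are in the range '0'..'9'. */
static uint8_t bcd_from_ascii_pair(const char *s)
{
    return bcd_pack((uint8_t)(s[0] - '0'), (uint8_t)(s[1] - '0'));
}

/* Split a packed BCD byte back into its two digits. */
static void bcd_unpack(uint8_t b, uint8_t *hi, uint8_t *lo)
{
    *hi = (uint8_t)(b >> 4);
    *lo = (uint8_t)(b & 0x0F);
}
```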

DAA — Decimal Adjust AL (After) Addition

Mnemonic   P    PII  K6   3D!  3Mx+  SSE  SSE2  A64  SSE3  E64T
DAA        ✓    ✓    ✓    ✓    ✓     ✓    ✓     32   ✓     32

(✓ = supported; 32 = 32-bit mode only)

daa        Signed

The DAA general-purpose instruction adjusts the AL register, and the EFLAGS with it, for a decimal carry after the addition of two packed BCD values.

Flags   O.flow  Sign  Zero  Aux  Parity  Carry
        -       -     -     X    -       X

Flags: The Aux and Carry flags are set to 1 if an addition resulted in a decimal carry in their associated 4-bit nibble; otherwise they are cleared to 0.

        xor     eax,eax         ; Reset Carry(s)
   $L1: mov     al,[edi]        ; D = D + A
        adc     al,[esi]
        daa
        mov     [edi],al        ; Store result
        dec     esi
        dec     edi
        dec     ecx
        jne     $L1             ; Loop for n BCD bytes

Note that this function steps through memory in reverse byte order, which is not processor efficient. High digits are in low offset bytes, and low digits are in high offset bytes: {N...0}. So the operation must go to the end of the buffer and traverse memory backward from low-digit pairs to high-digit pairs. If not working with the FPU to handle BCD, then each nibble pair could be stored in reverse order: {0...N}. Only when they need to be displayed or printed would there be a reverse increment through memory. Note this is backward to the ordering of the FPU! The sample code uses this method.
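For checking the assembly against slow scalar code, the same packed BCD addition can be sketched in C. This is a digit-by-digit emulation rather than a literal model of DAA, and the name bcd_add is hypothetical. It walks the buffers from the last byte to the first, just as the loop above does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dst = dst + src for n packed BCD bytes, with
   the low digit pair in the LAST byte. Returns the final decimal
   carry (0 or 1). Assumes every nibble is a valid digit 0..9. */
static unsigned bcd_add(uint8_t *dst, const uint8_t *src, size_t n)
{
    unsigned carry = 0;

    while (n--) {
        unsigned lo = (dst[n] & 0x0Fu) + (src[n] & 0x0Fu) + carry;
        unsigned hi = (dst[n] >> 4) + (src[n] >> 4);

        if (lo > 9) { lo -= 10; hi++; }       /* carry out of low nibble (Aux) */
        carry = 0;
        if (hi > 9) { hi -= 10; carry = 1; }  /* carry out of the byte (Carry) */

        dst[n] = (uint8_t)((hi << 4) | lo);
    }
    return carry;
}
```

Adding {0x00, 0x01} to {0x12, 0x99} (that is, 1299 + 1) should leave {0x13, 0x00} with no carry out.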

DAS — Decimal Adjust AL (After) Subtraction

Mnemonic   P    PII  K6   3D!  3Mx+  SSE  SSE2  A64  SSE3  E64T
DAS        ✓    ✓    ✓    ✓    ✓     ✓    ✓     32   ✓     32

das        Signed

The DAS general-purpose instruction adjusts the AL register, and the EFLAGS with it, for a decimal borrow after the subtraction of two packed BCD values.

Flags   O.flow  Sign  Zero  Aux  Parity  Carry
        -       -     -     X    -       X

Flags: The Aux and Carry flags are set to 1 if a subtraction resulted in a decimal borrow in their associated 4-bit nibble; otherwise they are cleared to 0.

        xor  eax,eax     ; Reset Carry(s)
   $L1: mov  al,[edi]    ; D = D - A
        sbb  al,[esi]
        das
        mov  [edi],al    ; Store result
        dec  esi
        dec  edi
        dec  ecx
        jne  $L1         ; Loop for n BCD bytes
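A scalar C counterpart to this subtraction loop might look like the following sketch. It is a digit-by-digit emulation rather than a literal model of DAS, and the name bcd_sub is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dst = dst - src for n packed BCD bytes, with
   the low digit pair in the LAST byte. Returns the final decimal
   borrow (0 or 1). Assumes every nibble is a valid digit 0..9. */
static unsigned bcd_sub(uint8_t *dst, const uint8_t *src, size_t n)
{
    unsigned borrow = 0;

    while (n--) {
        int lo = (dst[n] & 0x0F) - (src[n] & 0x0F) - (int)borrow;
        int hi = (dst[n] >> 4) - (src[n] >> 4);

        if (lo < 0) { lo += 10; hi--; }        /* borrow from the upper nibble */
        borrow = 0;
        if (hi < 0) { hi += 10; borrow = 1; }  /* borrow from the next byte */

        dst[n] = (uint8_t)((hi << 4) | lo);
    }
    return borrow;
}
```

Subtracting {0x00, 0x01} from {0x13, 0x00} (that is, 1300 - 1) should leave {0x12, 0x99} with no borrow out.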

AAA — ASCII Adjust (After) Addition

Mnemonic   P    PII  K6   3D!  3Mx+  SSE  SSE2  A64  SSE3  E64T
AAA        ✓    ✓    ✓    ✓    ✓     ✓    ✓     32   ✓     32

aaa        Signed

The AAA general-purpose instruction adjusts the AL register and the EFLAGS for a decimal carry after the addition of two unpacked BCD (or ASCII) digits. If the low nibble of the result is greater than 9, or the Aux flag is set, then AL is set to the remainder in the range {0...9} and AH is incremented.

Flags   O.flow  Sign  Zero  Aux  Parity  Carry
        -       -     -     X    -       X

Flags: The Aux and Carry flags are set to 1 if a decimal carry resulted; otherwise they are cleared to 0.

        add     al,ah
        aaa
        or      al,'0'    ; '0' + {0...9} = ASCII '0...9'
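The adjustment can be modeled in C as a sanity check. This sketch follows my understanding of the documented behavior in 32-bit mode (add 6 to AL and 1 to AH, then clear the upper nibble of AL); the name add_aaa is hypothetical, and the inputs are assumed to be valid unpacked or ASCII digits.

```c
#include <stdint.h>

/* Hypothetical helper: add two digits the way ADD followed by AAA
   would. *ah is incremented on a decimal carry; the return value is
   the adjusted AL, always in the range 0..9. */
static uint8_t add_aaa(uint8_t a, uint8_t b, uint8_t *ah)
{
    uint8_t  al  = (uint8_t)(a + b);                   /* add      */
    int      aux = ((a & 0x0F) + (b & 0x0F)) > 0x0F;   /* Aux flag */
    uint16_t ax  = (uint16_t)((*ah << 8) | al);

    if ((al & 0x0F) > 9 || aux)                        /* aaa      */
        ax = (uint16_t)(ax + 0x106);                   /* AL += 6, AH += 1 */

    *ah = (uint8_t)(ax >> 8);
    return (uint8_t)(ax & 0x0F);                       /* upper nibble cleared */
}
```

With AH starting at zero, adding '9' and '9' should return 8 and leave 1 in AH, which reads as the decimal result 18.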

AAS — ASCII Adjust AL (After) Subtraction

Mnemonic   P    PII  K6   3D!  3Mx+  SSE  SSE2  A64  SSE3  E64T
AAS        ✓    ✓    ✓    ✓    ✓     ✓    ✓     32   ✓     32

aas        Signed

The AAS general-purpose instruction adjusts the AL register and the EFLAGS after a subtraction of two unpacked BCD (or ASCII) digits. If the subtraction indicates that a borrow has occurred, then AL is set to the remainder in the range {0...9} and AH is decremented.

Flags   O.flow  Sign  Zero  Aux  Parity  Carry
        -       -     -     X    -       X

Flags: The Aux and Carry flags are set to 1 if a decimal borrow resulted; otherwise they are cleared to 0.

        sub     al,'7'
        aas
        or      al,'0'        ; '0' + {0...9} = ASCII '0...9'
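A matching C model for the subtraction case is sketched below. The name sub_aas is hypothetical, and the model assumes valid digit inputs, for which subtracting 6 from AL never needs an extra borrow from AH.

```c
#include <stdint.h>

/* Hypothetical helper: subtract digit b from digit a the way SUB
   followed by AAS would. *ah is decremented on a decimal borrow; the
   return value is the adjusted AL, always in the range 0..9. */
static uint8_t sub_aas(uint8_t a, uint8_t b, uint8_t *ah)
{
    uint8_t al  = (uint8_t)(a - b);             /* sub      */
    int     aux = (a & 0x0F) < (b & 0x0F);      /* Aux flag: nibble borrow */

    if ((al & 0x0F) > 9 || aux) {               /* aas      */
        al  = (uint8_t)(al - 6);
        *ah = (uint8_t)(*ah - 1);
    }
    return (uint8_t)(al & 0x0F);                /* upper nibble cleared */
}
```

With AH holding 1, subtracting 7 from 3 should return 6 and leave 0 in AH, which is the decimal calculation 13 - 7 = 6.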

AAM — ASCII Adjust AX (After) Multiplication

Mnemonic   P    PII  K6   3D!  3Mx+  SSE  SSE2  A64  SSE3  E64T
AAM        ✓    ✓    ✓    ✓    ✓     ✓    ✓     32   ✓     32

aam        Signed

The AAM general-purpose instruction converts the binary product left in AL by the multiplication of two unpacked BCD digits into two unpacked BCD digits: AH receives the tens digit (AL / 10) and AL keeps the ones digit (AL mod 10). The EFLAGS are adjusted according to the resulting value in AL.

Flags   O.flow  Sign  Zero  Aux  Parity  Carry
        -       X     X     -    X       -

Flags: The Sign, Zero, and Parity flags are set to the resulting value in the AL register.

        mul     bh              ; AX = AL * BH
        aam
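In C terms the default (base 10) form of the adjustment is just a divide and a remainder. The name aam_emulate below is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: model AAM on the AL and AH registers.
   AH receives the tens digit and AL keeps the ones digit. */
static void aam_emulate(uint8_t *al, uint8_t *ah)
{
    *ah = (uint8_t)(*al / 10);
    *al = (uint8_t)(*al % 10);
}
```

After multiplying the digits 7 and 9, AL holds 63; the adjustment should leave 6 in AH and 3 in AL.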

AAD — ASCII Adjust AX (Before) Division

Mnemonic   P    PII  K6   3D!  3Mx+  SSE  SSE2  A64  SSE3  E64T
AAD        ✓    ✓    ✓    ✓    ✓     ✓    ✓     32   ✓     32

aad        Signed

The AAD general-purpose instruction prepares for a division operation by converting the two unpacked BCD digits in AH (tens) and AL (ones) into a single binary value in AL (AL = AH * 10 + AL) and clearing AH. The EFLAGS are adjusted according to the resulting value in AL.

Flags   O.flow  Sign  Zero  Aux  Parity  Carry
        -       X     X     -    X       -

Flags: The Sign, Zero, and Parity flags are set to the resulting value in the AL register.

        and     eax,0000111100001111b
        aad
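The default (base 10) form of this adjustment is the reverse of AAM and can be modeled in C in two lines. The name aad_emulate is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: model AAD on the AL and AH registers.
   AL becomes AH * 10 + AL (truncated to 8 bits) and AH is cleared. */
static void aad_emulate(uint8_t *al, uint8_t *ah)
{
    *al = (uint8_t)(*al + *ah * 10);
    *ah = 0;
}
```

With the unpacked digits 4 in AH and 2 in AL, the adjustment should leave the binary value 42 in AL, ready for a DIV.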

FBLD — FPU (BCD Load)

Mnemonic   P    PII  K6   3D!  3Mx+  SSE  SSE2  A64  SSE3  E64T
FBLD       ✓    ✓    ✓    ✓    ✓     ✓    ✓     ✓    ✓     ✓

FPU        fbld source        BCD        80

How does this all work? Well, the FPU has a single instruction that loads a BCD value and converts it to an 80-bit (10-byte) double extended precision floating-point value that it stores on the FPU stack. This can then be written back to computer memory as double-precision floating-point. Simple, fast, and minimal excess code and nothing time intensive.

Example 15-1. ...chap15ase2vmputil.cpp

    unsigned char bcd[10];
    double f;

    __asm {
      fbld tbyte ptr bcd        ; Load (80-bit) BCD
      fstp f                    ; Write 64-bit double-precision
    }

The returned floating-point value contains the BCD number as an integer with no fractional component. For example:

   byte bcd[10] = {0x68, 0x23, 0x45, 0x67, 0x89, 0x98, 0x87, 0x76, 0x65,
                   0x80};

The float returned is –657,687,988,967,452,368.0

At this point the decimal place needs to be adjusted to its correct position by multiplying by a power of ten, 10^-n. This can be done with either a simple table lookup or a call to the function pow(10,-e), but the table lookup is faster. And speed is what it is all about.
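For testing on a compiler without inline assembly, the load-and-scale step can be approximated in portable C. This is only a sketch: the name bcd10_to_double is hypothetical, and it divides by a power of ten built in a loop instead of multiplying by a table entry. Keep in mind that a double holds only about 15 to 16 significant decimal digits, so a full 18-digit BCD value is rounded when it is stored as double-precision.

```c
#include <stdint.h>

/* Hypothetical helper: decode a 10-byte packed BCD value (digit pairs
   in bytes 0..8, sign in bit 7 of byte 9) and move the decimal point
   frac_digits places to the left. */
static double bcd10_to_double(const uint8_t bcd[10], int frac_digits)
{
    uint64_t mag = 0;
    double value, scale = 1.0;
    int i;

    /* Walk from the high digit pair (byte 8) down to the low pair. */
    for (i = 8; i >= 0; i--)
        mag = mag * 100 + (uint64_t)((bcd[i] >> 4) * 10 + (bcd[i] & 0x0F));

    value = (double)mag;              /* 18 digits always fit in 64 bits */
    if (bcd[9] & 0x80)
        value = -value;               /* sign bit set: negative */

    while (frac_digits-- > 0)
        scale *= 10.0;                /* 10^n */
    return value / scale;             /* same effect as value * 10^-n */
}
```

Feeding it the bytes {0x45, 0x23, 0x01} followed by zeros, with two fractional digits, should yield a value very close to 123.45.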

Graphics 101

All of you who start a processing tool to convert art resources or game resources into a game database and then leave to have lunch, get a soda, have a snack, go to the bathroom, pick up your kids from school, or go home, all yell, "ME!"

WOW! That was loud! It could be heard reverberating across the planet.

Those of you who have worked on games in the past, did you meet your timelines? Did you find yourself working lots of extra (crunch) time to meet a milestone? (We will ignore E3 and the final milestones!) How often do you have to wait for a tool to complete a data conversion? Add up all that "waiting" time. What did your tally come to?

You don't really know? Here is a thought: Add a wee bit of code to your program and write the results to an accumulative log file. Then check it from time to time to see where some of that time is going.
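That wee bit of code can be as small as the following sketch. The function name log_elapsed and the log file name in the comment are made up for illustration. Note that clock() reports processor time on many platforms rather than wall-clock time, so treat the numbers as a rough guide.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: append one "label: seconds" line to a log
   stream that is already open. Returns the fprintf() result, which
   is negative on error. */
static int log_elapsed(FILE *log, const char *label,
                       clock_t start, clock_t end)
{
    double seconds = (double)(end - start) / (double)CLOCKS_PER_SEC;
    return fprintf(log, "%s: %.3f s\n", label, seconds);
}

/* Typical use inside a tool (the file name is an assumption):
 *
 *   clock_t t0 = clock();
 *   ... run the conversion ...
 *   FILE *fp = fopen("tooltime.log", "a");    // "a" = accumulate
 *   if (fp) { log_elapsed(fp, "ase2vmp", t0, clock()); fclose(fp); }
 */
```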

Some people believe in optimizing the game only if there is time somewhere in the schedule. Management quite often counts the time beans and decides that getting the milestone met is much more important than early ongoing debugging or optimization. But just think of that time savings if your tools are written with optimization. Just do not tell management about it or they will think they can ship the product early.

3D rendering tools are expensive and so programmers typically do not have ready access to a live tool. They sometimes write plug-ins, but quite often they will merely write an ASCII scene exporter (ASE) file parser to import the 3D data into their tools that generate the game databases. With this method, programmers do not have to have a licensed copy of a very expensive tool sitting on their desks.

This little item brings up a trivial item of artist versus programmer wars. It all comes down to who will have the task of running the tools to export and convert data into a form loaded and used by a game application. Neither typically wants the task and both consider it mundane, but it is nevertheless required. Artists need to run the tools occasionally so as to check results of their changes to art resources. Programmers occasionally need to run the tools to test changes to database designs, etc. But nobody wants to do it all the time. So my suggestion is to automate the tools and incorporate the who and what into the game design, technical design, and art bibles for the project. In that way there will be no misperception.

Let's talk about something else but related to assembly.

In this particular case, an ASE file is an ASCII export from 3D Studio MAX. How many of you have actually written a parser and have wondered where all your processing time had gone? Did you use streaming file reads to load a line at a time, or a block read to read the entire file into memory?

I personally write ASE parsers by loading the entire file into memory even when they are 20MB or larger in size. The core ASE parser code included with this book can actually parse an entire 20MB file and convert about 1.15 million floating-point values from ASCII to doubles in a few seconds. But here is where it really gets interesting!

ASCII String to Double-Precision Float

Calling the standard C language function atof() to convert an ASCII floating-point value to single or double-precision will add significant time onto your processing time for those large ASE files.

But I have good news for you. The following function will carve those hours back to something a lot more reasonable. What it does is take advantage of a little-known functionality within the floating-point unit of the 80×86 processor.

As discussed in Chapter 8, the FPU loads and handles the following data types:

  • (4-byte) single-precision floating-point

  • (8-byte) double-precision floating-point

  • (10-byte) double extended-precision floating-point

  • (10-byte) binary-coded decimal (BCD)

ASCII to Double

Note that the following code sample expects a plain floating-point number with no exponent. ASE files do not contain exponents, just really long ASCII floating-point numbers, which is why this code traps values with more than 18 digits.

Example 15-2. ...chap15ase2vmputil.cpp

   double exptbl[] = // -e
   {
     1.0,                 0.1,
     0.01,                0.001,
     0.0001,              0.00001,
     0.000001,            0.0000001,
     0.00000001,          0.000000001,
     0.0000000001,        0.00000000001,
     0.000000000001,      0.0000000000001,
     0.00000000000001,    0.000000000000001,
     0.0000000000000001,  0.00000000000000001,
     0.000000000000000001
   };                             // Limit 18 places

   double ASCIItoDouble(const char *pStr)
   {
   #ifdef CC_VMP_WIN32
     unsigned int dig[80], *pd;
     unsigned char bcd[10+2], *pb;
     double f;
     int n, e;
     const char *p;

     ASSERT_PTR(pStr);

     *(((uint32*)bcd)+0) = 0;   // Clear (12 bytes)
     *(((uint32*)bcd)+1) = 0;
     *(((uint32*)bcd)+2) = 0;   // 2 + 2 spare bytes

       // Collect negative/positive – and delimiters are pre-stripped.

     p = pStr;
     if ('-' == *p)
     {
       *(bcd+9) = 0x80; // Set the negative bit into the BCD
       p++;
     }

       // Collect digits and remember position of decimal point

     *dig = 0;             // Prepend a leading zero
     e = n = 0;
     pd = dig+1;

       // Stop collecting once past 18 digits so dig[] cannot overflow

     while ((n <= 18) && ('0' <= *p) && (*p <= '9'))
     {
       *pd++ = (*p++ - '0'); // Collect a digit
       n++;

       // The decimal place is checked after the first digit as no
       // floating-point value should start with a decimal point.
       // Even values between 0 and 1 should have a leading zero! 0.1
       if ('.' == *p)      // Decimal place?
       {                   // Remember its position
         e = n;
         p++;
       }
     }

       // Check for a really BIG (and thus ridiculous) number

     if (n > 18)          // More than 18 digits?
     {
       return atof(pStr);
     }

     if (e)               // 0=1.0 1=0.1 2=0.01 3=0.001, etc.
     {
       e = n - e;         // Get correct exponent
     }

       // repack into BCD (preset lead zeros)
       // last to first digit

     n = (n+1)>>1;          // Number of digit pairs (rounded up)
     pb = bcd;              // Calc. 1st BCD character position

     while (n--)            // loop for digit pairs
     {
       pd -= 2;             // Roll back to last 2 digits
       *pb++ = ((*(pd+0)<<4) | *(pd+1)); // blend two digits
     }

     __asm {
       fbld tbyte ptr bcd    ; Load (10-byte) BCD
       fstp f                ; Write 64-bit double-precision
     }

     return f * exptbl[e];                    // FASTER
   //  return f * pow( 10.0, (double) -e );   // FAST
   #else
     return atof(pStr);                       // Really SLOW
   #endif
   }

If you do not believe me about the speed, replace all the atof() calls in your current tool with a macro that assigns 0.0 and measure the difference in speed. Or better yet, embed the atof() call within this function and compare the two results using a precision slop factor. By now you should be very aware that you never, ever compare two floating-point numbers for equivalence unless a precision slop factor (accuracy) is utilized.
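One possible shape for that comparison is sketched here. The name nearly_equal and the choice of a relative tolerance (scaled by the larger magnitude, with a floor of 1.0) are my assumptions; pick a slop factor that suits your data.

```c
/* Absolute value without pulling in the math library. */
static double abs_d(double x)
{
    return (x < 0.0) ? -x : x;
}

/* Hypothetical helper: returns 1 if a and b agree to within slop,
   measured relative to the larger magnitude (but never less than
   1.0, so values near zero are compared with an absolute tolerance). */
static int nearly_equal(double a, double b, double slop)
{
    double diff  = abs_d(a - b);
    double scale = (abs_d(a) > abs_d(b)) ? abs_d(a) : abs_d(b);

    if (scale < 1.0)
        scale = 1.0;
    return diff <= slop * scale;
}
```

Inside a debug build one might then write something like ASSERT(nearly_equal(f * exptbl[e], atof(pStr), 1e-12)) just before returning from the conversion function.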

Tip

One should always test optimized code (vector based or not) in conjunction with slow scalar code written in C to ensure that the code is functioning as required.

One more thing: If you insist on using atof() or sscanf(), copy the ASCII number to a scratch buffer before processing it with either of these two functions because processing them within a 20MB file dramatically increases the processing time by hours. Apparently these conversion functions scan the string until they reach the terminator, which in the case of an ASE file can be a few megabytes away instead of a few bytes.
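A minimal sketch of that scratch-buffer approach follows. The name atof_token, the 64-byte buffer size, and the set of characters accepted as part of a number are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the number at p into a small scratch
   buffer and convert only that copy, so atof() never scans beyond
   it. If end is not NULL, *end receives the first character after
   the number. */
static double atof_token(const char *p, const char **end)
{
    char   scratch[64];
    size_t n = 0;

    while (n < sizeof(scratch) - 1 &&
           (p[n] == '-' || p[n] == '+' || p[n] == '.' ||
            (p[n] >= '0' && p[n] <= '9')))
        n++;

    memcpy(scratch, p, n);
    scratch[n] = '\0';          /* terminator is now only bytes away */

    if (end)
        *end = p + n;
    return atof(scratch);
}
```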
