Chapter 15. Binary-Coded Decimal (BCD)

Converting an ASCII string to binary-coded decimal is as easy as pie (or is it a piece of cake?). In BCD, the lower 4-bit nibble and the upper 4-bit nibble of every byte each store a value from 0 to 9 (think of a two-digit hex value in which the upper six values, A through F, are never used).

Workbench Files: \Bench\x86\chap15\project\platform

	project	platform
ASE to VMP	ase2vmp	vc6
BCD 2^N	cd	vc.net

BCD

Table 15-1. ASCII numerical digit to hex and decimal values

ASCII	0	1	2	3	4	5	6	7	8	9
Hex	0x30	0x31	0x32	0x33	0x34	0x35	0x36	0x37	0x38	0x39
Decimal	48	49	50	51	52	53	54	55	56	57
BCD	0	1	2	3	4	5	6	7	8	9
Binary	0000	0001	0010	0011	0100	0101	0110	0111	1000	1001

Converting an ASCII digit to a BCD nibble is as easy as subtracting 0x30 (ASCII '0', or 48 decimal) from the ASCII character value, which leaves a result in the range {0...9}.

   byte ASCIItoBCD(char c)
   {
     ASSERT(('0' <= c) && (c <= '9'));

     return (byte)(c - '0');
   }

When the 8086 processor was first manufactured, the FPU was a separate, optional chip (the 8087). Other processors of the time offered BCD operations, so similar instructions were incorporated into the CPU; the 8087 had some BCD support as well. When the 64-bit processor was developed, it was decided that the CPU's BCD instructions were no longer required, since the FPU offered an alternative method.

The FPU uses the first nine bytes to hold 18 BCD digits. The uppermost bit of the tenth byte is the sign: the value is negative if the bit is set or positive if the bit is clear.

Figure 15-1. Ten-byte BCD data storage. The MSB of the far-left byte (byte #9) is the sign bit, and the rightmost nine bytes (#8...0) contain the BCD digit pairs. The 18th BCD digit resides in the upper nibble of byte #8, and the 1st BCD digit resides in the lower nibble of byte #0.

Setting the upper nibble of a byte is merely a matter of shifting a BCD digit left by four bits, then logically ORing (or summing) it with the lower nibble.

   byte BCDtoByte(byte lo, byte hi)
   {
     return (hi << 4) | lo;
   }
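Going the other way is just as simple. The following is a minimal sketch, not part of the book's sample code: the helper names BCDhi, BCDlo, and BCDtoASCII are made up for illustration, and the byte typedef is assumed to match the one used above.

```c
#include <assert.h>

typedef unsigned char byte;   /* assumption: same 'byte' type as above */

/* Upper nibble (tens digit) of a packed BCD byte. */
static byte BCDhi(byte packed)
{
    return (byte)(packed >> 4);
}

/* Lower nibble (ones digit) of a packed BCD byte. */
static byte BCDlo(byte packed)
{
    return (byte)(packed & 0x0F);
}

/* Convert one BCD digit {0...9} back to its ASCII character. */
static char BCDtoASCII(byte digit)
{
    assert(digit <= 9);
    return (char)('0' + digit);
}
```

For the packed byte 0x42, BCDhi() yields 4 and BCDlo() yields 2.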

DAA — Decimal Adjust AL (After) Addition

Mnemonic	P	PII	K6	3D!	3Mx+	SSE	SSE2	A64	SSE3	E64T
DAA								32		32

daa

Signed

The DAA general-purpose instruction adjusts the AL register to a valid packed BCD value after the addition of two packed BCD bytes, and sets the EFLAGS to reflect any decimal carry.

Flags	O.flow	Sign	Zero	Aux	Parity	Carry
	-	-	-	X	-	X

Flags: The Aux and Carry flags are set to 1 if an addition resulted in a decimal carry in their associated 4-bit nibble; otherwise they are cleared to 0.

        xor     eax,eax         ; Reset Carry(s)
   $L1: mov     al,[edi]        ; D = D + A
        adc     al,[esi]
        daa
        mov     [edi],al        ; Store result
        dec     esi
        dec     edi
        dec     ecx
        jne     $L1             ; Loop for n BCD bytes

Note that this function steps through memory in reverse byte order, which is not processor efficient. High digits are in low offset bytes, and low digits are in high offset bytes: {N...0}. So the operation must go to the end of the buffer and traverse memory backward from low-digit pairs to high-digit pairs. If not working with the FPU to handle BCD, then each nibble pair could be stored in reverse order: {0...N}. Only when they need to be displayed or printed would there be a reverse increment through memory. Note this is backward to the ordering of the FPU! The sample code uses this method.
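For comparison, here is a rough C equivalent of the ADC/DAA loop. It is only a sketch under stated assumptions: the name BCDadd is hypothetical, the buffers are stored low digit pairs first ({0...N}) so that memory is walked forward, and both inputs are assumed to hold valid packed BCD.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;   /* assumption: same 'byte' type as above */

/* Add n packed BCD bytes: dst = dst + src, low-order byte first.
   Returns the final decimal carry (0 or 1). */
static int BCDadd(byte *dst, const byte *src, size_t n)
{
    int carry = 0;
    size_t i;

    assert(dst != NULL && src != NULL);

    for (i = 0; i < n; i++)
    {
        /* Add the low digits (with carry-in), then the high digits. */
        int lo = (dst[i] & 0x0F) + (src[i] & 0x0F) + carry;
        int hi = (dst[i] >> 4) + (src[i] >> 4);

        if (lo > 9) { lo -= 10; hi++; }   /* nibble carry (Aux flag)  */
        carry = (hi > 9);                 /* byte carry (Carry flag)  */
        if (carry) hi -= 10;

        dst[i] = (byte)((hi << 4) | lo);
    }
    return carry;
}
```

Adding {0x01, 0x00} (decimal 1) to {0x99, 0x12} (decimal 1299) leaves {0x00, 0x13} (decimal 1300) with no final carry.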

DAS — Decimal Adjust AL (After) Subtraction

Mnemonic	P	PII	K6	3D!	3Mx+	SSE	SSE2	A64	SSE3	E64T
DAS								32		32

das

Signed

The DAS general-purpose instruction adjusts the AL register to a valid packed BCD value after the subtraction of two packed BCD bytes, and sets the EFLAGS to reflect any decimal borrow.

Flags	O.flow	Sign	Zero	Aux	Parity	Carry
	-	-	-	X	-	X

Flags: The Aux and Carry flags are set to 1 if a subtraction resulted in a decimal borrow in their associated 4-bit nibble; otherwise they are cleared to 0.

        xor  eax,eax     ; Reset Carry(s)
   $L1: mov  al,[edi]    ; D = D - A
        sbb  al,[esi]
        das
        mov  [edi],al    ; Store result
        dec  esi
        dec  edi
        dec  ecx
        jne  $L1         ; Loop for n BCD bytes

AAA — ASCII Adjust (After) Addition

Mnemonic	P	PII	K6	3D!	3Mx+	SSE	SSE2	A64	SSE3	E64T
AAA								32		32

aaa

Signed

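The AAA general-purpose instruction adjusts the AL register after the addition of two unpacked BCD (or ASCII) digits. If the low nibble of the result is greater than 9, or the Aux flag is set, a decimal carry has occurred: AL is set to the remaining digit {0...9} and AH is incremented. In either case the upper nibble of AL is cleared.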

Flags	O.flow	Sign	Zero	Aux	Parity	Carry
	-	-	-	X	-	X

Flags: The Aux and Carry flags are set to 1 if a decimal carry resulted; otherwise they are cleared to 0.

        add     al,ah
        aaa
        or      al,'0'    ; '0' + {0...9} = ASCII '0...9'
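In C terms, the add/aaa/or sequence amounts to roughly the following. This is a sketch of the arithmetic only, not an emulation of the instruction's exact flag behavior; ASCIIaddDigit is a hypothetical helper, and the carry-in parameter is an addition of mine so that digits can be chained.

```c
#include <assert.h>

/* Add two ASCII digits plus a carry-in (0 or 1).
   Returns the ASCII result digit; *carry receives the decimal
   carry, which is what AAA would add to AH. */
static char ASCIIaddDigit(char a, char b, int *carry)
{
    int sum;

    assert(('0' <= a) && (a <= '9'));
    assert(('0' <= b) && (b <= '9'));
    assert((*carry == 0) || (*carry == 1));

    sum = (a - '0') + (b - '0') + *carry;
    *carry = (sum > 9);
    if (*carry) sum -= 10;

    return (char)(sum | '0');   /* '0' + {0...9} = ASCII '0...9' */
}
```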

AAS — ASCII Adjust AL (After) Subtraction

Mnemonic	P	PII	K6	3D!	3Mx+	SSE	SSE2	A64	SSE3	E64T
AAS								32		32

aas

Signed

The AAS general-purpose instruction adjusts the AL register after the subtraction of two unpacked BCD (or ASCII) digits. If the subtraction indicates that a decimal borrow has occurred, then AL is set to the remaining digit {0...9} and AH is decremented.

Flags	O.flow	Sign	Zero	Aux	Parity	Carry
	-	-	-	X	-	X

Flags: The Aux and Carry flags are set to 1 if a decimal borrow resulted; otherwise they are cleared to 0.

        sub     al,'7'
        aas
        or      al,'0'        ; '0' + {0...9} = ASCII '0...9'

AAM — ASCII Adjust AX (After) Multiplication

Mnemonic	P	PII	K6	3D!	3Mx+	SSE	SSE2	A64	SSE3	E64T
AAM								32		32

aam

Signed

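The AAM general-purpose instruction adjusts the AX register after the multiplication of two unpacked BCD digits. The binary product in AL is divided by 10; the quotient (tens digit) is stored in AH and the remainder (ones digit) in AL.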

Flags	O.flow	Sign	Zero	Aux	Parity	Carry
	-	X	X	-	X	-

Flags: The Sign, Zero, and Parity flags are set according to the resulting value in the AL register.

        mul     bh              ; AX = AL * BH
        aam

AAD — ASCII Adjust AX (Before) Division

Mnemonic	P	PII	K6	3D!	3Mx+	SSE	SSE2	A64	SSE3	E64T
AAD								32		32

aad

Signed

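The AAD general-purpose instruction prepares the AX register for a division operation. The two unpacked BCD digits in AH (tens) and AL (ones) are combined into a binary value in AL (AL = AH × 10 + AL), and AH is cleared.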

Flags	O.flow	Sign	Zero	Aux	Parity	Carry
	-	X	X	-	X	-

Flags: The Sign, Zero, and Parity flags are set to the resulting value in the AL register.

        and     eax,0000111100001111b
        aad
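The arithmetic behind AAM and AAD is easier to see in C. The sketch below models only what happens to AH and AL, not the flags; emulateAAM and emulateAAD are hypothetical names, and the registers are passed as plain byte pointers.

```c
#include <assert.h>

typedef unsigned char byte;   /* assumption: same 'byte' type as above */

/* What AAM does to AX: split the binary value in AL into two
   unpacked BCD digits, AH = tens and AL = ones. */
static void emulateAAM(byte *ah, byte *al)
{
    assert(*al <= 99);        /* product of two BCD digits is 0...81 */
    *ah = (byte)(*al / 10);
    *al = (byte)(*al % 10);
}

/* What AAD does to AX: combine two unpacked BCD digits into a
   binary value in AL and clear AH, ready for a DIV. */
static void emulateAAD(byte *ah, byte *al)
{
    assert((*ah <= 9) && (*al <= 9));
    *al = (byte)(*ah * 10 + *al);
    *ah = 0;
}
```

Starting with AL = 63 (7 × 9), emulateAAM() leaves AH = 6 and AL = 3; running emulateAAD() on that pair returns AL to 63 with AH = 0.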

FBLD — FPU (BCD Load)

Mnemonic	P	PII	K6	3D!	3Mx+	SSE	SSE2	A64	SSE3	E64T
FBLD

FPU

fbld

source

BCD

How does this all work? Well, the FPU has a single instruction that loads a BCD value and converts it to an 80-bit (10-byte) double extended precision floating-point value that it stores on the FPU stack. This can then be written back to computer memory as double-precision floating-point. Simple, fast, and minimal excess code and nothing time intensive.

Example 15-1. ...\chap15\ase2vmp\util.cpp

    unsigned char bcd[10];
    double f;

    __asm {
      fbld tbyte ptr bcd        ; Load (80-bit) BCD
      fstp f                    ; Write 64-bit double-precision
    }

The returned floating-point value contains the BCD number as an integer with no fractional component. For example:

   byte bcd[10] = {0x68, 0x23, 0x45, 0x67, 0x89, 0x98, 0x87, 0x76, 0x65,
                   0x80};

The float returned is –657,687,988,967,452,368.0
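To check a value like this without the FPU, the same ten-byte layout can be decoded in portable C. This is only a sketch for verification purposes: BCD80toInt is a hypothetical name, and it returns a 64-bit integer rather than a double because a 64-bit double carries only about 15 to 17 significant decimal digits, so a full 18-digit value may be rounded once it is stored as double precision.

```c
#include <assert.h>

/* Decode the FPU's 10-byte packed BCD layout into a signed integer.
   bcd[0] holds the two lowest digits; bit 7 of bcd[9] is the sign. */
static long long BCD80toInt(const unsigned char bcd[10])
{
    long long v = 0;
    int i;

    assert(bcd != 0);

    for (i = 8; i >= 0; i--)          /* high digit pairs first */
    {
        v = v * 100 + (bcd[i] >> 4) * 10 + (bcd[i] & 0x0F);
    }
    return (bcd[9] & 0x80) ? -v : v;
}
```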

At this point the decimal place needs to be adjusted to its correct position by multiplying the value by a power of ten, 10^-n. This can be done with either a simple table lookup or a call to the function pow(10,-e), but the table lookup is faster. And speed is what it is all about.

Graphics 101

All of you who start a processing tool to convert art resources or game resources into a game database and then leave to have lunch, get a soda, have a snack, go to the bathroom, pick up your kids from school, or go home, all yell, "ME!"

WOW! That was loud! It could be heard reverberating across the planet.

Those of you who have worked on games in the past, did you meet your timelines? Did you find yourself working lots of extra (crunch) time to meet a milestone? (We will ignore E³ and the final milestones!) How often do you have to wait for a tool to complete a data conversion? Add up all that "waiting" time. What did your tally come to?

You don't really know? Here is a thought: Add a wee bit of code to your program and write the results to an accumulative log file. Then check it from time to time to see where some of that time is going.

Some people believe in optimizing the game only if there is time somewhere in the schedule. Management quite often counts the time beans and decides that getting the milestone met is much more important than early ongoing debugging or optimization. But just think of that time savings if your tools are written with optimization. Just do not tell management about it or they will think they can ship the product early.

3D rendering tools are expensive and so programmers typically do not have ready access to a live tool. They sometimes write plug-ins, but quite often they will merely write an ASCII scene exporter (ASE) file parser to import the 3D data into their tools that generate the game databases. With this method, programmers do not have to have a licensed copy of a very expensive tool sitting on their desks.

This little item brings up a trivial item of artist versus programmer wars. It all comes down to who will have the task of running the tools to export and convert data into a form loaded and used by a game application. Neither typically wants the task and both consider it mundane, but it is nevertheless required. Artists need to run the tools occasionally so as to check results of their changes to art resources. Programmers occasionally need to run the tools to test changes to database designs, etc. But nobody wants to do it all the time. So my suggestion is to automate the tools and incorporate the who and what into the game design, technical design, and art bibles for the project. In that way there will be no misperception.

Let's talk about something else but related to assembly.

In this particular case, an ASE file is an ASCII export from 3D Studio MAX. How many of you have actually written a parser and have wondered where all your processing time had gone? Did you use streaming file reads to load a line at a time, or a block read to read the entire file into memory?

I personally write ASE parsers by loading the entire file into memory even when they are 20MB or larger in size. The core ASE parser code included with this book can actually parse an entire 20MB file and convert about 1.15 million floating-point values from ASCII to doubles in a few seconds. But here is where it really gets interesting!

ASCII String to Double-Precision Float

Calling the standard C language function atof() to convert each ASCII floating-point value to single or double precision adds significant time to the processing of those large ASE files.

But I have good news for you. The following function will carve those hours back to something a lot more reasonable. What it does is take advantage of a little-known functionality within the floating-point unit of the 80×86 processor.

As discussed in Chapter 8, the FPU loads and handles the following data types:

(4-byte) single-precision floating-point
(8-byte) double-precision floating-point
(10-byte) double extended-precision floating-point
(10-byte) binary-coded decimal (BCD)

ASCII to Double

Note that the following code sample expects a normal floating-point number with no exponent. The ASE files do not contain exponents, just really long ASCII floating-point numbers, which is why this code traps values of more than 18 digits and hands them to atof() instead.

Example 15-2. ...\chap15\ase2vmp\util.cpp

   double exptbl[] = // -e
   {
     1.0,                 0.1,
     0.01,                0.001,
     0.0001,              0.00001,
     0.000001,            0.0000001,
     0.00000001,          0.000000001,
     0.0000000001,        0.00000000001,
     0.000000000001,      0.0000000000001,
     0.00000000000001,    0.000000000000001,
     0.0000000000000001,  0.00000000000000001,
     0.000000000000000001
   };                             // Limit 18 places

   double ASCIItoDouble(const char *pStr)
   {
   #ifdef CC_VMP_WIN32
     unsigned int dig[80], *pd;
     unsigned char bcd[10+2], *pb;
     double f;
     int n, e;
     const char *p;

     ASSERT_PTR(pStr);

     *(((uint32*)bcd)+0) = 0;   // Clear (12 bytes)
     *(((uint32*)bcd)+1) = 0;
     *(((uint32*)bcd)+2) = 0;   // 2 + 2 spare bytes

        // Collect negative/positive - delimiters are pre-stripped.

     p = pStr;
     if ('-' == *p)
     {
       *(bcd+9) = 0x80; // Set the negative bit into the BCD
       p++;
     }

        // Collect digits and remember position of decimal point

     *dig = 0;             // Prepend a leading zero
     e = n = 0;
     pd = dig+1;

     while (('0' <= *p) && (*p <= '9') && (n < 79)) // Stay within dig[]
     {
       *pd++ = (*p++ - '0'); // Collect a digit
       n++;

       // The decimal place is checked after the first digit as no
       // floating-point value should start with a decimal point.
       // Even values between 0 and 1 should have a leading zero! 0.1
       if ('.' == *p)      // Decimal place?
       {                   // Remember its position
         e = n;
         p++;
       }
     }

        // Check for a really BIG (and thus ridiculous) number

     if (n > 18)          // More than 18 digits?
     {
       return atof(pStr);
     }

     if (e)               // 0=1.0 1=0.1 2=0.01 3=0.001, etc.
     {
       e = n - e;         // Get correct exponent
     }

        // repack into BCD (preset lead zeros)
        // last to first digit

     n = (n+1)>>1;          // Number of digit pairs (rounded up)
     pb = bcd;              // Calc. 1st BCD character position

     while (n--)            // loop for digit pairs
     {
       pd -= 2;             // Roll back to last 2 digits
       *pb++ = (unsigned char)((*(pd+0)<<4) | *(pd+1)); // blend two digits
     }

     __asm {
       fbld tbyte ptr bcd    ; Load (10-byte) BCD
       fstp f                ; Write 64-bit double-precision
     }

     return f * exptbl[e];                    // FASTER
   //  return f * pow( 10.0, (double) -e );   // FAST
   #else
     return atof(pStr);                       // Really SLOW
   #endif
   }

If you do not believe me about the speed, replace all the atof() calls in your current tool with a macro that simply assigns 0.0 and measure the difference in speed. Or better yet, embed a call to atof() within this function and compare the two results using a precision slop factor. By now you should be very aware that you never, ever compare two floating-point numbers for equivalence unless a precision slop factor (accuracy) is utilized.
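A comparison with a slop factor might look something like the following. It is a minimal sketch: the name NearlyEqual and the choice of a relative tolerance with an absolute floor of 1.0 are my assumptions, and the right tolerance depends on your data.

```c
#include <assert.h>

/* Absolute value without pulling in the math library. */
static double AbsD(double x)
{
    return (x < 0.0) ? -x : x;
}

/* Return nonzero if a and b agree to within the given slop factor.
   The tolerance scales with the larger magnitude, but never drops
   below slop itself, so values near zero still compare sensibly. */
static int NearlyEqual(double a, double b, double slop)
{
    double diff  = AbsD(a - b);
    double scale = (AbsD(a) > AbsD(b)) ? AbsD(a) : AbsD(b);

    assert(slop >= 0.0);

    if (scale < 1.0)
        scale = 1.0;
    return diff <= slop * scale;
}
```

A validation build could then call both converters on every number and flag any pair for which NearlyEqual(ASCIItoDouble(p), atof(p), 1e-12) fails.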

Tip

One should always test optimized code (vector based or not) in conjunction with slow scalar code written in C to ensure that the code is functioning as required.

One more thing: If you insist on using atof() or sscanf(), copy the ASCII number to a scratch buffer before processing it with either of these two functions because processing them within a 20MB file dramatically increases the processing time by hours. Apparently these conversion functions scan the string until they reach the terminator, which in the case of an ASE file can be a few megabytes away instead of a few bytes.
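A scratch-buffer wrapper could be as simple as the sketch below. The name AtofScratch, the 64-byte buffer, and the set of characters treated as part of a number are all assumptions for illustration; like the converter above, it does not handle exponents.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy one numeric token out of a large text buffer into a small
   scratch buffer, then convert the short, terminated copy.
   *ppEnd (if not NULL) receives the position just past the token. */
static double AtofScratch(const char *p, const char **ppEnd)
{
    char scratch[64];
    size_t len, n;

    assert(p != NULL);

    /* Token = sign, digits, and decimal point only (no exponent). */
    len = strspn(p, "+-.0123456789");
    n = (len < sizeof(scratch)) ? len : sizeof(scratch) - 1;

    memcpy(scratch, p, n);
    scratch[n] = '\0';          /* atof() now stops after a few bytes */

    if (ppEnd)
        *ppEnd = p + len;
    return atof(scratch);
}
```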
