Chapter 20. MASM vs. NASM vs. TASM vs. WASM

This chapter reminds me of the old Batman and Robin show where in a fight scene we see sound effect words flash on screen such as "OOF," "KABONG," "ZING," "MASM," "ZOT," "TASM," "ZANG," "WASM," and "POW." These, by an amazing coincidence, are the war cry of the assembly language programmer. Wars have started for far less than trying to get one of these programmers to use a different C/C++ compiler or assembler. At one company I spent almost two years writing a good portion of the application core libraries and all the documentation for their SDK that allowed internal and external programmers to write online computer games using dedicated game servers. For the last year and a half there I worked on their Win32 Network API SDK (not to be confused with Microsoft's GameSDK). For the first four months there I wrote the DOS SDK, which uses 16-bit and various forms of 32-bit Extended DOS. It was a mix and match of C/C++ compilers, assemblers, linkers, and DOS extenders. It seemed every company had their own flavor. They had gotten used to their favorite combination and nothing was better! So every now and then I had to create libraries for that new flavor to entice new clients. Some of the code I currently write also uses the High C/C++ compiler with MASM or Pharlap's 386ASM. I do not use it these days, but there is also the Watcom C/C++ with their WASM Assembler. Occasionally on software I write today, I get inquiries if my libraries are compatible with the Borland TASM Assembler. I have used all of these and a few others, and to date my favorite is MASM by Microsoft.

There is a form of assembly that we should not forget: in-line assembly. Some people swear by it. I, on the other hand, swear at it! I rarely use it, and then only for some specific type of data conversion that needs to be fast and cannot afford the overhead of a function call into a pure assembly routine. It is akin to programming with one arm tied behind one's back. A lot of macro assembler functions are not available.

I have read book reviews in which advocates of non-MASM assemblers indicate a book could have been a lot better if the author had used TASM instead of MASM. Again, a personal bias! Although I have a few apprehensions about MASM, I have a personal bias for it. In writing this book I have tried to appease the critics by keeping the examples as generic as possible, and if this was not good enough for you, "RASPBERRIES!" MASM is not only available separately by download, it is also built into the Visual C++ 6 and VC .NET compilers.

You should always use the latest and greatest version of your favorite assembler because if you do not, your version could have bugs (I find them all the time) or be too old to support some of the newer instructions. Back when MMX first came out I had to use the IAMMX.INC file by Intel with MASM as a workaround just to support MMX instructions. Since then MMX support has been built into MASM. Now for SSE3 support you need to either get the latest VC .NET or download the ia_pni.inc file to get assembly instruction macro emulation. With one other company's assembler I had to hand-code the opcodes to make sure I had the appropriate JMP instruction. There was a bug, and the jump instruction that I had coded in assembly code was not the jump instruction being encoded into machine code. A bug was being introduced into compiled code because of a bug in the assembler itself!

With the latest instruction sets there seem to be two assemblers at the forefront in supporting recently introduced assembly instructions: MASM and NASM. No matter whose assembler you're using, I use the following as placeholders for the arguments being passed into the example code used in this book:

arg1    equ     8           ; Argument #1
arg2    equ     (arg1+4)    ; Argument #2
arg3    equ     (arg2+4)    ; Argument #3
arg4    equ     (arg3+4)    ; Argument #4
arg5    equ     (arg4+4)    ; Argument #5
arg6    equ     (arg5+4)    ; Argument #6
arg7    equ     (arg6+4)    ; Argument #7
arg8    equ     (arg7+4)    ; Argument #8


; void unzip(byte *pRaw, byte *pZip, uint nWidth);


         public   unzip
unzip    proc     near
         push     ebp
         mov      ebp,esp
         push     ebx
         push     esi
         push     edi


         mov      esi,[ebp+arg1]    ; pRaw
         mov      edi,[ebp+arg2]    ; pZip
         mov      ecx,[ebp+arg3]    ; nWidth


      ;
      ;
      ;


         pop      edi
         pop      esi
         pop      ebx
         pop      ebp
         ret
unzip    endp

You will note that I used arg1 instead of 8 as shown below:

mov     eax,[ebp+8]

As an alternative to arg1, you could define a more meaningful name for the argument:

pRaw = arg1
mov    eax,[ebp+pRaw]
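
To see where the value 8 for arg1 comes from, it can help to picture the stack as a C structure laid out upward from EBP. The following is only a sketch under the assumptions used in this chapter's examples (32-bit code, a near call, and the push ebp / mov ebp,esp prologue); the names StackFrame32 and arg_offset are made up for illustration and do not appear in any assembler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical picture of the 32-bit stack as seen from EBP right after
   "push ebp / mov ebp,esp". Every slot is one 4-byte dword. */
typedef struct StackFrame32 {
    uint32_t saved_ebp;   /* [ebp+0]  caller's EBP, pushed by the prologue   */
    uint32_t ret_addr;    /* [ebp+4]  return address pushed by the near CALL */
    uint32_t arg1;        /* [ebp+8]  pRaw                                   */
    uint32_t arg2;        /* [ebp+12] pZip                                   */
    uint32_t arg3;        /* [ebp+16] nWidth                                 */
} StackFrame32;

/* Byte offset from EBP of argument n (1-based), mirroring the equates:
   arg1 is 8 and each later argument sits 4 bytes further along. */
size_t arg_offset(unsigned n)
{
    return 8u + 4u * (n - 1u);
}
```

With this layout, offsetof(StackFrame32, arg1) evaluates to 8, which matches arg_offset(1) and the arg1 equate.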

The following information is a brief overview and you should refer to your assembler's documentation for specific information.

MASM — Microsoft Macro Assembler

When using Macro Assembler by Microsoft you should use the latest and greatest version because of the extended instruction sets. However, to use those new instructions you need to turn on functionality.

When using this assembler, the first thing you need to do is activate the appropriate CPU target by using one of the following assembler directives depending on what processor will be executing that section of code. There are several of these directives such as .386, .486, .586, etc. If the target is for an embedded 486, then obviously the .586 directive would not be used, as instructions would be allowed that the 486 would not understand. When you write your code for a single processor you can merely set the appropriate directive(s) at the top of the file, but quite often a single file will contain sets of code unique to individual processors.

  • .686 — This allows Model 6 type x86 code to be assembled. It is the .MMX directive on the following line of the sample file that actually allows MMX instructions to be assembled. You can pretty much always include it, as most processors being released these days support MMX. You do need to make sure that the code is only going to be executed by one of those processors, however.

    The directives not only target processors but certain instruction sets, and so care must be used when setting the appropriate directives.

  • .MMX — An alternate method is to set the supported instruction sets such as this directive for enabling MMX code.

  • .K3D — This is the directive for the 3DNow! instruction set. As you do not want an Intel processor trying to execute one of these instructions, only insert this above 3DNow! instruction code. These are also order dependent and so this must occur after the .MMX directive.

  • .XMM — Use this if you are using any SSE-based instructions requiring XMM registers.

There are other legacy declarations such as .387, .286, .386, .386P, .486, .486P, etc. The suffix "P" indicates an enabling of privileged instructions.

For more information, see http://msdn2.microsoft.com/library/afzk3475(en-us,vs.80).aspx.
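
The directives only control what the assembler will accept; making sure that a given block of code runs only on a processor that supports it is still up to the program. One way to picture that is a small run-time selection on the C side. The sketch below is an assumption for illustration only: the CPU_FEAT_* bits, the CodePath names, and the priority order are all hypothetical, and a real program would fill in the feature bits from CPUID.

```c
/* Hypothetical capability bits; a real program would derive these
   from CPUID at start-up. */
enum {
    CPU_FEAT_MMX   = 1 << 0,
    CPU_FEAT_3DNOW = 1 << 1,
    CPU_FEAT_SSE   = 1 << 2
};

/* Hypothetical labels for the per-processor code sections in one file. */
typedef enum { PATH_GENERIC, PATH_MMX, PATH_3DNOW, PATH_SSE } CodePath;

/* Pick the most capable section the processor can execute.
   The priority order (SSE, then 3DNow!, then MMX) is an assumption. */
CodePath select_code_path(unsigned features)
{
    if (features & CPU_FEAT_SSE)
        return PATH_SSE;
    if ((features & CPU_FEAT_3DNOW) && (features & CPU_FEAT_MMX))
        return PATH_3DNOW;      /* 3DNow! code also relies on MMX state */
    if (features & CPU_FEAT_MMX)
        return PATH_MMX;
    return PATH_GENERIC;        /* plain x86 code, safe everywhere */
}
```

A processor reporting no feature bits falls through to PATH_GENERIC, so the 3DNow! section is never reached on a processor that lacks it.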

Here is a sample file that you should be able to drop into a Win32 application in conjunction with the Visual C++ compiler:

        TITLE zipX86M.asm – My x86 (MASM) Assembly
        PAGE    53,160
;       This module is designed to Blah Blah Blah!
;
;       Created - 20 May 98 - J.Leiterman
;       Tabs = 8

        .686
        .MMX
        .K3D
        .model flat, C

        .data
        ALIGN 4
foo     dd       0                     ; Data value

zipX86M SEGMENT USE32 PUBLIC 'CODE'

;
; void unzip(byte *pRaw, byte *pZip, uint nWidth);
;
         align     16
unzip    PROC C PUBLIC USES ebx esi edi pRaw:PTR, pZip:PTR,
           nWidth:DWORD


         mov      esi,pRaw
         mov      edi,pZip
         mov      ecx,nWidth
         ;
         ;
         ;
         ret
unzip    endp


zipX86M  ends
         end

The function is declared PUBLIC, meaning it's global in definition and can therefore be accessed by functions in other files.

unzip  PROC C PUBLIC USES ebx esi edi pRaw:PTR, pZip:PTR,
         nWidth:DWORD

For convenience, you can specify the registers to push onto the stack and in what order. The RET instruction is actually a macro when used within this PROC, and therefore the registers are popped automatically in a reverse order wherever a RET instruction is encountered. The coup de grâce? No more pesky code like:

mov   esi,[ebp+arg1]        ; pRaw

Instead, you just use:

mov    esi,pRaw

The assembler expands the PROC macro and takes care of everything for you, making your code a little more readable.

You will notice that I used the default data segment (.data) as this is a flat memory model, but I declared a 32-bit Protected Mode code segment. The reasoning is that I tend to group my assembly files using an object-oriented approach and as such all my decompression functions/procedures would reside within this segment. Other assembly code related to other functionality would be contained in a different file with a different segment name. They can occur with the same segment name but they wouldn't appear very organized, especially in the application address/data map.

zipX86M SEGMENT USE32 PUBLIC 'CODE'
:
:
zipX86M ends

Since segments are being mentioned, I am going to give you a snapshot of segments back in the days of DOS and DOS extenders. Code and data were differentiated by 16-bit code/data versus 32-bit code/data addressing. The following is a snippet of code from those days.

;      Segment Ordering Sequence


       INN_CODE32   segment para USE32 'CODE'
       INN_CODE32   ends
       _TEXT        segment
       _TEXT        ends
       INN_DATA32   segment para USE32 'DATA'
       INN_DATA32   ends


       DGROUP       GROUP INN_DATA32
       CGROUP       GROUP _TEXT
       CGROUP       GROUP INN_CODE32

We also must not forget the (end) signal to the assembler that it has reached the end of the file:

End

I personally think this is just a carryover from the good old days of much simpler assemblers. With the advent of macros such as the following, you can turn on or off various sections of code and not just the bottom portion of your file:

if 0
else
endif
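
C programmers will recognize the same idea in the C preprocessor. As a rough parallel (a sketch only; the function name active_branch is made up for illustration), the disabled branch below is discarded before the compiler proper ever sees it, much as the assembler skips the disabled half of an if/else/endif block:

```c
/* Returns a number identifying which branch was compiled in. */
int active_branch(void)
{
#if 0
    return 0;   /* disabled: removed by the preprocessor */
#else
    return 1;   /* enabled: this is the code that gets compiled */
#endif
}
```

Changing the 0 to a 1 swaps which branch survives, without deleting any source text.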

Visual C++ has never really had a peaceful coexistence with its own MASM Assembler. In the early days of around version 3.x you had to assemble your files using batch files or external make files and only link the *.obj files into your project files. Microsoft has fortunately made this a little simpler, but in my opinion it still seems shortsighted. My assumption is that they would prefer you to use either inline assembly or none at all. (But I've been known to be wrong before!)

The first thing you need to do is add the MASM hooks into your version 5.0 or above Visual C++ environment. Select the Tools|Options menu item, and then select the Directories tab. Set the following to the path of your MASM installation:

Executable Files:   c:\masm\bin
                    c:\masm\binr
Include Files:      c:\masm\include

With your project loaded in your FileView tab, just right-click on the project files folder, and select the pop-up menu item Add Files to Project. The Insert Files into Project dialog box will be displayed. That dialog seems to support almost every file type known except for assembly! What you need to do is select the All Files (*.*) option, select the assembly file you desire, and then press the OK button.

Now that the file occurs in your list of files in your project, right-click on that file and select the Settings item from the pop-up menu. In the Commands edit box insert the following:

ml @MyGame.amk ..\util\unzipx86.asm

This will execute the assembler using the option switches defined in the MyGame.amk file. In the Outputs edit box insert the following:

unzipx86.obj

Then press the OK button.

To make my life simpler I use a file, such as the following, that I refer to as my assembly make file. I clone it from project to project, as you'll never know when you'll need to tweak it.

File: MYGAME.AMK
      /I../util
      /c
      /coff
      /Cp
      /Fl
      /Fm
      /FR
      /Sg
      /Zd
      /Zi

For those of you who would prefer to use in-line assembly or just plain don't have an assembler, you can do the same thing with the following from within your C/C++ code.

Figure 20-1. VC6 assembler configuration display

void unzip(byte *pRaw, byte *pZip, uint nWidth)
{
    __asm {
        mov      esi,pRaw
        mov      edi,pZip
        mov      ecx,nWidth

      };
}

You should be very careful when you mix C/C++ and in-line assembly code: unless you push the registers you use to save them (and pop them afterward), you may overwrite values that the surrounding C/C++ code depends on. Setting a breakpoint at the beginning of your function and then examining the source code during run time can help point out any register use conflicts.

MASM is my favorite macro assembler as it has an excellent macro expansion ability. Not only can new instructions be incorporated by use of macros, but the predefined macro expansions can be used to advantage as they are C-like. In some cases, I find it better than C. In fact, in-line assembly sucks! (Another technical term!) (Note: I only said in some cases!) The following are some of the highlights. For details, read the technical manuals. For example, the MASM toolset has the following manuals:

  • Environment and Tools

  • Programmer's Guide

  • Reference

In the following charts, notice the C method on the left and the MASM method on the right.

Defines are pretty similar; however, enums do not exist and so must be emulated with a standard equate.

#define FOO 3                  FOO = 3


typedef enum
{
    CPUVEN_UNKNOWN = 0,        CPUVEN_UNKNOWN = 0
}  CPUVEN;

MASM can contain a structure definition just like C:

typedef struct CpuInfoType {            CpuInfo struct 4
   uint   nCpuId;  // CPU Id               nCpuId  dd 0   ; CPU Id
   uint   nFpuId;  // FPU Id               nFpuId  dd 0   ; FPU Id
   uint   nBits;   // Feature              nBits   dd 0   ; Feature
   uint   nMfg;    // Mfg                  nMfg    dd 0   ; Mfg.
   uint16 SerialNum[6];                    SerialNum dw 0,0,0,0,0,0
   uint   nSpeed;  // Speed                nSpeed  dd 0   ; Speed
} CpuInfo;                              CpuInfo ends
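
When a structure like this is shared between C and assembly, the two definitions have to agree byte for byte. One way to guard against drift on the C side is a set of compile-time checks. This is a sketch under stated assumptions: it substitutes the fixed-width types uint32_t and uint16_t for the book's uint and uint16 (assumed to be 32 and 16 bits wide), and it assumes a C11 compiler for _Static_assert. The expected offsets follow from the dd/dw sizes on the MASM side.

```c
#include <stddef.h>
#include <stdint.h>

/* C side of the shared structure, using fixed-width types so the
   field sizes match the MASM dd (4 bytes) and dw (2 bytes) entries. */
typedef struct CpuInfoType {
    uint32_t nCpuId;        /* offset  0: CPU Id          */
    uint32_t nFpuId;        /* offset  4: FPU Id          */
    uint32_t nBits;         /* offset  8: Feature bits    */
    uint32_t nMfg;          /* offset 12: Manufacturer    */
    uint16_t SerialNum[6];  /* offset 16: 6 words = 12 bytes */
    uint32_t nSpeed;        /* offset 28: Speed           */
} CpuInfo;

/* Compile-time layout checks (C11). The build fails if the C layout
   stops matching the offsets implied by the assembly definition. */
_Static_assert(offsetof(CpuInfo, nMfg) == 12, "nMfg offset mismatch");
_Static_assert(offsetof(CpuInfo, SerialNum) == 16, "SerialNum offset mismatch");
_Static_assert(offsetof(CpuInfo, nSpeed) == 28, "nSpeed offset mismatch");
_Static_assert(sizeof(CpuInfo) == 32, "CpuInfo size mismatch");
```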

In C there is no looping macro expansion; there is only one-shot (a definition gets expanded). However, some special macro functionality is available when using a MACRO assembler.

REPEAT

MASM supports the repeat declaration when used in conjunction with a local temporary argument.

i = 0
   REPEAT 5
mov [i + ebx],eax
i = i + 4
   ENDM

The i is a temporary assemble-time variable that exists only while the code is being expanded. In this example the body of the REPEAT block is replicated five times, and 4 (the size of the write) is added to i on every iteration. So the code is unrolled:

mov [0 + ebx],eax
mov [4 + ebx],eax
mov [8 + ebx],eax
mov [12 + ebx],eax
mov [16 + ebx],eax
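
For comparison, here is roughly what those five stores do when written as an ordinary C loop. This is a sketch for illustration, not the author's code: the function name fill5 is hypothetical, base stands in for EBX and value for EAX. The difference is that the C loop counts at run time unless the compiler chooses to unroll it, while REPEAT always produces the five stores at assemble time.

```c
#include <stdint.h>
#include <string.h>

/* Store the same 32-bit value at byte offsets 0, 4, 8, 12, and 16
   from base, matching the five unrolled mov instructions. */
void fill5(uint8_t *base, uint32_t value)
{
    for (unsigned i = 0; i < 5 * 4; i += 4) {
        /* mov [i + ebx],eax ; memcpy avoids alignment assumptions */
        memcpy(base + i, &value, sizeof value);
    }
}
```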

WHILE

MASM also supports a while loop.

i = 0
   WHILE i LT 20    ; i < 20
mov [i + ebx],eax
i = i + 4
   ENDM

This expands to essentially the same code as the REPEAT example. The example was a simple loop, but WHILE is typically used when the loop condition is more complex than a fixed count.

FOR

MASM also supports a for loop.

   FOR arg, <1,3,5,7,9,11,13,17,19,23>
mov al,arg      ; OUT to DX takes its data from AL, not an immediate
out dx,al
   ENDM

As mentioned, these are examples of MASM-related code. Those assemble-time loops are something not available to a C compiler. Other items are available, including data/code segment specification and access to all assembly instructions, while inline assembly has only a limited set of instructions available. The macro assembler allows code and data to be intermixed, while a C compiler does not. The IF-ELSE-ENDIF conditionals are also available, along with other features found in a standard C compiler.

Compiler Intrinsics

The more recent Visual C++ and Intel compilers support a method of programming in assembly language referred to as intrinsics. This is where the functionality of SIMD instructions has been wrapped within C wrappers and compiled into code as inline code. Let us examine the following example:

void test(float *c, float a, float b)
{
   *c = a + b;
}

Not to oversimplify the power of using intrinsics to get code up and running quickly, the following code uses intrinsics in conjunction with (__m128) XMM registers and SSE single-precision floating-point instructions. Note that it looks more complicated, but I chose a simple scalar expression to resolve.

#include <xmmintrin.h>


void test(float *c, float a, float b)
{
  __m128 ta, tb;

   ta = _mm_load_ps(&a);
   tb = _mm_load_ps(&b);
   ta = _mm_add_ps(ta, tb);
   _mm_store_ps(c, ta);
}

But underneath in the pure assembly code generated by the compiler this breaks down to something similar to the following:

push     ebx
mov      ebx,esp
sub      esp,8
and      esp,0FFFFFFF0h   ; 16-byte align stack
add      esp,4
push     ebp
mov      ebp,dword ptr [ebx+4]
mov      dword ptr [esp+4],ebp
mov      ebp,esp
sub      esp,98h
push     esi
push     edi
  ; __m128 ta, tb
  ; ta = _mm_load_ps(&a);
lea      eax,[ebx+0Ch]
movaps   xmm0,xmmword ptr [eax]
movaps   xmmword ptr [ebp-30h],xmm0
movaps   xmm0,xmmword ptr [ebp-30h]
movaps   xmmword ptr [ebp-10h],xmm0
  ; tb = _mm_load_ps(&b);
lea      eax,[ebx+10h]
movaps   xmm0,xmmword ptr [eax]
movaps   xmmword ptr [ebp-40h],xmm0
movaps   xmm0,xmmword ptr [ebp-40h]
movaps   xmmword ptr [ebp-20h],xmm0
  ; ta = _mm_add_ps(ta, tb);
movaps   xmm0,xmmword ptr [ebp-20h]
movaps   xmm1,xmmword ptr [ebp-10h]
addps    xmm1,xmm0
movaps   xmmword ptr [ebp-50h],xmm1
movaps   xmm0,xmmword ptr [ebp-50h]
movaps   xmmword ptr [ebp-10h],xmm0
  ; _mm_store_ps(c, ta);
movaps   xmm0,xmmword ptr [ebp-10h]
mov      eax,dword ptr [ebx+8]
movaps   xmmword ptr [eax],xmm0


pop      edi
pop      esi
mov      esp,ebp
pop      ebp
mov      esp,ebx
pop      ebx
ret
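
The packed _ps intrinsics used above operate on four floats at a time, and _mm_load_ps expects a 16-byte aligned address. When only a single pair of floats is being added, the scalar _ss forms from the same header are another option. The following is a minimal sketch, not the author's code: test_scalar and the HAVE_SSE guard are names assumed for illustration, and the plain C fallback is there only so the sketch still builds where SSE is not enabled.

```c
/* Use the SSE header only where the compiler reports SSE support. */
#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical scalar variant of test(): adds one pair of floats. */
void test_scalar(float *c, float a, float b)
{
#if HAVE_SSE
    __m128 ta = _mm_load_ss(&a);   /* load one float into the low lane */
    __m128 tb = _mm_load_ss(&b);
    ta = _mm_add_ss(ta, tb);       /* add the low lanes only           */
    _mm_store_ss(c, ta);           /* write one float, no alignment needed */
#else
    *c = a + b;                    /* plain C fallback */
#endif
}
```

Calling test_scalar(&r, 1.5f, 2.25f) leaves 3.75f in r on either path.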