 -=( ---------------------------------------------------------------------- )=-
 -=( Natural Selection Issue #1 -------------- Using Advanced MASM Features )=-
 -=( ---------------------------------------------------------------------- )=-

 -=( 0 : Contents --------------------------------------------------------- )=-

 0 : Contents
 1 : Why Use MASM
 2 : Debugging with SoftICE
 3 : Using Anonymous Labels
 4 : Flow Control Directives
 5 : Proper Procedure Usage
 6 : Calling API with INVOKE
 7 : Final Thoughts On MASM

 -=( 1 : Why Use MASM ----------------------------------------------------- )=-

 The choice  of MASM  over TASM  is a  personal one,  however there are various
 advanced features  that MASM  provides (some similar  functions are  found  in
 TASM) that can sway your choice, depending on your programming style.

 For those that prefer SoftICE as a debugger, MASM debugging symbols allow  you
 to do source-level debugging.

 It has advanced support for some high level commands like IF and WHILE, and it
 can do automatic type checking when  invoking API and PROCs, along with  stack
 management to create and destroy local variables automatically.

 Finally, it doesn't require a .DATA section, and it is FREE!

 -=( 2 : Debugging with SoftICE ------------------------------------------- )=-

 SoftICE  can  do  source-level debugging  on  MASM  executables compiled  with
 Codeview information, by using the below command line switches.  It's so  much
 easier stepping through your own .ASM than staring at a screen full of opcodes
 and no labels.

    For Compiling
        /c                          : Compile only we link ourself
        /Cp                         : Preserve case on all symbols
        /coff                       : COFF format
        /Zi                         : Include symbolic debugging information
    For Linking
        /debug                      : Include symbolic debugging information
        /debugtype:cv               : Codeview format
        /pdb:none                   : Do not generate a PDB, it's not needed
        /subsystem:windows          : Win32 application
    Start SoftICE
        nmsym.exe /translate:source,package,always /source:Virus.ASM
                                       /load:execute,break Virus.EXE

 If  your virus  relocates in  memory,  it's  possible to  map your  source and
 symbols to a new address with  the SYMLOC command.  From within SoftICE,  type
 MAP32 Virus.  It will print a list of sections for that executable and  assign
 a number to each.

 Now type SYMLOC CS NUMBER BASE.  NUMBER is next to the name of the main  Virus
 Code section, BASE is  the delta offset of  your virus start.  Now  type SRC a
 few times and you'll be source level debugging again.

 -=( 3 : Using Anonymous Labels ------------------------------------------- )=-

 MASM doesn't support TASM's Local Labels, instead it has Anonymous Labels that
 have no tag name.  You  declare one as @@:, and  use @F to reference the  next
 @@:,  and  @B to  reference  the last  @@:.   The only  problem  is that  they
 obviously cannot be nested, and if you need to get their offset, you may  have
 to put it into a register separately, MASM doesn't do arithmetic on them.

    Example:        XOR     EAX, EAX
                    MOV     ECX, 10
                @@: CMP     EAX, 5
                    JE      @F
                    INC     EAX
                    LOOP    @B
                    JMP     DROP_OUT
                @@: ...

 -=( 4 : Flow Control Directives ------------------------------------------ )=-

 MASM allows you to block code within directives like a HLL, reducing the  need
 for many obscure local within small loops and tests.

 .IF  blocks take  simple conditional  expressions and  execute code  up to  an
 .ENDIF  directive.  It's  limited to  register tests,  it cannot  do tests  on
 conditions involving  memory operands.   .ELSEIF and  .ELSE directives  can be
 used inside an .IF block.  Note that the  end of each block has a jump to  the
 .ENDIF automatically inserted.

    Example:        .IF     EAX == 1
                            MOV     EBX, 1
                    .ELSEIF EAX == 2
                            MOV     EBX, 2
                    .ELSE
                            MOV     EBX, 3
                    .ENDIF

 .WHILE/.WEND and .REPEAT/.UNTIL and .REPEAT/.UNTILCXZ are directives for  loop
 blocks.  Each evaluates  the condition at  the beginning or  end of the  loop,
 while .UNTILCXZ will also exit the loop if ECX == 0.

    Example:        .WHILE  EAX <> 12
                            INC     EAX
                    .WEND

                    .REPEAT
                            INC     EAX
                    .UNTIL  EAX == 12

                    MOV     ECX, 12
                    .REPEAT
                            INC EAX
                    .UNTILCXZ

 Loop directives also accept another directive, .BREAK and .CONTINUE, which are
 necessary to  exit the  loop as  there are  no labels  available.  .BREAK will
 completely exit the loop, .CONTINUE will  return to the condition test of  the
 loop.  Both allow  an alternate form,  with an .IF  as their parameter  with a
 condition.  No .ENDIF is necessary.

    Example:        .WHILE  TRUE
                            INC     EAX
                            .BREAK  .IF     EAX == 10
                    .WEND

    Example:        .REPEAT
                            INC     EAX
                            .CONTINUE .IF   EAX == 10
                            INC EBX
                    .UNTIL  EAX == 12

 Conditional expressions  for these  directives can  be more  complex using  OR
 (||), AND  (&&), and  NOT (!)  symbols.  You  can also  do single tests on the
 flags register using CARRY?, OVERFLOW?, PARITY?, SIGN?, ZERO?.

    Example:        DEC     EAX
                    .IF     (ZERO?)
                            DEC     EAX
                    .ELSEIF (EAX == 1 && EBX == 0)
                            DEC     EBX
                    .ENDIF

 -=( 5 : Proper Procedure Usage ------------------------------------------- )=-

 PROCs have been extended.  Firstly, a USES clause lets you specify any of  the
 registers your PROC  uses, to save  on entry and  restore on exit.   Note that
 restoration code replaces ALL RET instructions.


    Example:        MYPROC  PROC    USES EBX ECX
                            .IF     EBX == 1
                                    RET
                            .ELSE
                                    MOV     ECX, 2
                                    RET
                            .ENDIF
                    MYPROC  ENDP

    Compile:        MYPROC  PROC
                            PUSH    EBX
                            PUSH    ECX
                            CMP     EBX, 1
                            JNE     @F
                            POP     ECX
                            POP     EBX
                            RET
                        @@: MOV     ECX, 2
                            POP     ECX
                            POP     EBX
                            RET
                    MYPROC  ENDP

 Secondly, you can  specify names for  parameters passed on  the stack to  your
 PROC.  References  to these  parameters are  transparently converted  to [ebp]
 [offset] by MASM.

    Example:        MYPROC  PROC    USES EBX ECX,
                                    P1:DWORD, P2:DWORD
                            MOV     EBX, [P1]
                            MOV     ECX, [P2]
                            SUB     EBX, ECX
                            MOV     EAX, EBX
                            RET
                    MYPROC  ENDP

    Compile:        MYPROC  PROC    USES EBX ECX,
                                    P1:DWORD, P2:DWORD
                            PUSH    EBP
                            MOV     EBP, ESP
                            PUSH    EBX
                            PUSH    ECX

                            MOV     EBX, [EBP][8]
                            MOV     ECX, [EBP][0CH]
                            SUB     EBX, ECX
                            MOV     EAX, ECX

                            POP     ECX
                            POP     EBX
                            LEAVE
                            RET     8
                    MYPROC  ENDP

 Next, you can further set up the stack frame by declaring local variables onto
 the stack.  This is done with the LOCAL directive.  Note that these  variables
 are not initialized, you must do it manually.

    Example:        MYPROC  PROC    USES EBX ECX,
                                    P1:DWORD, P2:DWORD
                            LOCAL   CAT:DWORD
                            MOV     EAX, [CAT]
                            RET
                    MYPROC  ENDP

    Compile:        MYPROC  PROC
                            PUSH    EBP
                            MOV     EBP, ESP
                            ADD     ESP, -4
                            PUSH    EBX
                            PUSH    ECX

                            MOV     EAX, [EBP][-4]

                            POP     ECX
                            POP     EBX
                            LEAVE
                            RET     8
                    MYPROC  ENDP

 -=( 6 : Calling API with INVOKE ------------------------------------------ )=-

 INVOKE pushes a list of arguments onto the stack then CALLs a PROC.  It's very
 similar to the extended CALL in TASM.

    Example:                INVOKE  MYPROC, [ECX], ADDR CAT

    Compile:                LEA     EAX, [EBP][-4]
                            PUSH    EAX
                            PUSH    [ECX]
                            CALL    MYPROC

 There are a  few caveats with  INVOKE.  To forward  reference a procedure,  it
 requires a  PROTOtype declaration  earlier on  in the  file.  Secondly, if you
 want to forward reference a symbol as an argument, it will need to be put into
 a register first and then use the register in the INVOKE.

    Example:        MYPROC  PROTO   :DWORD, :DWORD
                            ...
                            MOV     EAX, [KITTEN]
                            INVOKE  MYPROC, [ECX], EAX
                            ...
                    KITTEN  DD 0
                            ...
                    MYPROC  PROC    USES EBX ECX,
                                    P1:DWORD, P2:DWORD

 Also, you cannot use an OFFSET directive in an INVOKE, instead use ADDR, which
 will result in EAX (and EDX if you provide something that is not a DWORD size)
 being overwritten as  shown below.  MASM  will warn you  if EAX is  used in an
 argument and has been overwritten.

    Example:                INVOKE  MYPROC, EAX, ADDR CAT

    Compile:                LEA     EAX, [EBP][-4]
                            PUSH    EAX
                            PUSH    EAX
                            CALL    MYPROC
                            Error A2133: register value overwritten by INVOKE

 However, if a register has been overwritten, and it's the argument to the CALL
 section of the Compile code, then MASM will not warn you even though your code
 will be incorrect.  It's  a small bug to  keep in mind.  Also,  if the TMYPROC
 PTR bit confuses you, keep reading, it's explained next.

    Example:                LEA     EAX, [MYPROC]
                            INVOKE  TMYPROC PTR EAX, EBX, ADDR CAT
    Compile:                LEA     EAX, [EBP][-4]
                            PUSH    EAX
                            PUSH    EBX
                            CALL    EAX
                            MASM will not give you an error for this bad code

 INVOKE can't be used 'as is' on Pointers to API/PROC, because it performs type
 checking on the arguments and doesn't  know what function it needs to  compare
 against.  You cannot use  a PROTO in this  case, instead you need  to create a
 TYPEDEF.  In  this way,  INVOKE can  be used  to call  API in  a virus through
 Pointers, and still do type checking.

    Example:                TMYPROC TYPEDEF PROTO :DWORD, :DWORD
                            LEA     EAX, [MYPROC]
                            INVOKE  TMYPROC PTR EAX, [ECX], ADDR CAT
                            ...

                    MYPROC  PROC    USES EBX ECX,
                                    P1:DWORD, P2:DWORD
                            LOCAL   CAT:DWORD
                            ...

 Our final  lesson with  INVOKE is  using it  to pass  data structures.  If you
 think about  it, it's  difficult to  pass a  massive 300B  structure using the
 stack.  Instead,  it passes  a Pointer  to the  STRUCT.  Note  that inside the
 PROC, you will need to load up and dereference properly, as shown below.

    Example:        TMSPROC TYPEDEF PROTO :DWORD, :DWORD

                    DSTRUCT STRUCT
                    ONE     DD      0
                    TWO     DD      0
                    ENDS
                    DATA    DSTRUCT {}
                            ...
                            LEA     EBX, [MSPROC]
                            INVOKE  TMSPROC PTR EBX, ADDR DATA, NULL
                            ...

                    MSPROC  PROC DATA:PTR DSTRUCT, OTHER:DWORD
                            MOV     EAX, [DATA]
                            INC     [EAX][DSTRUCT.TWO]
                            ...

 -=( 7 : Final Thoughts on MASM ------------------------------------------- )=-

 Using a  TYPEDEF on  every API  call reduces time spent chasing bugs caused by
 mismatched parameters.  It also forces good commenting technique by  declaring
 what each invocation refers to.

 People are generally reluctant to use  .IF and .REPEAT directives as they  are
 not pure  assembler code.   However, their  use expresses  the purpose of your
 code more clearly than  the most explicit comment  could, and they do  compile
 down to a fairly simple format.

 We've seen  the scene  go through  lots of  changes for favourite  assemblers,
 from MASM, to TASM, to A86, to NASM and TASM.  There's no reason to hold  back
 on MASM anymore.  It's everything you want and more.  Think about it.

 -=( ---------------------------------------------------------------------- )=-
 -=( Natural Selection Issue #1 --------------- (c) 2002 Feathered Serpents )=-
 -=( ---------------------------------------------------------------------- )=-
