Simple Instructions

From SkullSecurity
Revision as of 20:51, 12 March 2007 by Ron (talk | contribs)
Assembly Language Tutorial

This section will go over some basic assembly instructions that you'll likely see frequently. Some of the instructions shown here are tricky, and some have special properties (such as the registers they use). Additionally, x86 assembly consists of hundreds of different instructions. As a result, you'll likely want to find a complete reference book or website to have alongside you. This page, however, will give enough of an introduction to get you started.

Pointers and Dereferencing

First, we'll start with the hard stuff. If you understood the pointers section, this shouldn't be too bad. If you didn't, you should probably go back and refresh your memory.

Recall that a pointer is a datatype that stores an address as its value. Since registers are simply 32-bit values with no actual types, any register may or may not be a pointer, depending on what is stored. It is the responsibility of the program to treat pointers as pointers and to treat non-pointers as non-pointers.

If a value is a pointer, it can be dereferenced. Recall that dereferencing a pointer retrieves the value stored at the address being pointed to. In assembly, this is generally done by putting square brackets ("[" and "]") around the register. For example:

  • eax -- is the value stored in eax
  • [eax] -- is the value pointed to by eax

This will be discussed more in upcoming sections.
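If you think in C, the difference between a register and a dereferenced register can be sketched roughly as below. The function names are made up for illustration, and the register names in the comments are only an assumption about where a compiler might keep the values.

```c
#include <stdint.h>

/* Hypothetical helper: 'p' plays the role of eax holding an address. */
uint32_t read_through(const uint32_t *p)
{
    /* p   ~ eax    (the address itself)           */
    /* *p  ~ [eax]  (the value stored at that address) */
    return *p;
}

/* Hypothetical helper: 'p' plays the role of ebx; this stores 'value'
   into the memory that ebx points at, i.e. into [ebx]. */
void write_through(uint32_t *p, uint32_t value)
{
    *p = value;
}
```

Calling read_through(&x) returns whatever is stored in x, while write_through(&x, 3) changes x itself, because both functions work on the memory being pointed to and not on the pointer.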

Moving Data Around

The commands in this section deal with moving around numbers and pointers.

mov, movsx, movzx

mov is the command used for assignment, much like the "=" sign in most languages. mov can move data between a register and memory, between two registers, or from a constant into a register or memory. Here are some examples:

mov eax, 1               ; set eax to 1 (eax = 1)
mov edx, ecx             ; set edx to whatever ecx is (edx = ecx)
mov eax, 18h             ; set eax to 0x18
mov eax, [ebx]           ; set eax to the value in memory that ebx is pointing at
mov dword ptr [ebx], 3   ; move the number 3 into the 4 bytes of memory that ebx is pointing at (the size has to be given, since neither operand is a register)

movsx and movzx are special versions of mov which copy a smaller value into a bigger register, treating the value as either signed (movsx) or unsigned (movzx).

movsx means move with sign extension. The data is moved from a smaller register (or memory location) into a bigger register, and the sign is preserved by padding the upper bits with either 0's (for positive values) or F's (for negative values). Here are some examples of a 16-bit value being extended to 32 bits:

  • 0x1000 becomes 0x00001000, since it was positive
  • 0x7FFF becomes 0x00007FFF, since it was positive
  • 0xFFFF becomes 0xFFFFFFFF, since it was negative (note that 0xFFFF is -1 in 16-bit signed, and 0xFFFFFFFF is -1 in 32-bit signed)
  • 0x8000 becomes 0xFFFF8000, since it was negative (note that 0x8000 is -32768 in 16-bit signed, and 0xFFFF8000 is -32768 in 32-bit signed)

movzx means move with zero extension. The data is moved from a smaller register (or memory location) into a bigger register, and the sign is ignored: the upper bits are always padded with 0's. Here are the same 16-bit values extended to 32 bits:

  • 0x1000 becomes 0x00001000
  • 0x7FFF becomes 0x00007FFF
  • 0xFFFF becomes 0x0000FFFF
  • 0x8000 becomes 0x00008000
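The two kinds of extension can be sketched in C as below. This is only an illustration of the padding rule described above, not how any particular compiler or processor implements it, and the function names are made up. It assumes a 16-bit source and a 32-bit destination.

```c
#include <stdint.h>

/* Roughly what movsx does with a 16-bit source: if the top bit (the
   sign bit) is set, pad the upper 16 bits with F's, otherwise with 0's. */
uint32_t sign_extend16(uint16_t v)
{
    if (v & 0x8000u)
        return 0xFFFF0000u | (uint32_t)v;
    return (uint32_t)v;
}

/* Roughly what movzx does with a 16-bit source: the upper 16 bits are
   always 0, no matter what the sign bit is. */
uint32_t zero_extend16(uint16_t v)
{
    return (uint32_t)v;
}
```

For example, sign_extend16(0x8000) gives 0xFFFF8000, while zero_extend16(0x8000) gives 0x00008000, matching the lists above.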

lea

lea is very similar to mov, except that math can be done on the original value before it's used. The "[" and "]" characters always surround the second parameter, but in this case they don't indicate dereferencing; it is easiest to think of them as just being part of the formula.

lea is generally used for calculating array offsets, since the address of an element of the array can be found with [arraystart + offset*datasize]. lea can also be used for quickly doing math, often with an addition and a multiplication. Examples of both uses are below.

Here are some examples of using lea:

lea     eax, [eax+eax]   ; Double the value of eax -- eax = eax * 2
lea     edi, [esi+0Bh]   ; Add 11 to esi and store the result in edi
lea     eax, [esi+ecx*4] ; This is generally used for indexing an array of integers. esi is a pointer to the beginning of an array, and ecx is the index of the element that is to be retrieved. The index is multiplied by 4 because integers are 4 bytes long. eax will end up storing the address of the ecx'th element of the array.
lea     edi, [eax+eax*2] ; Triple the value of eax and store the result in edi -- edi = eax * 3
lea     edi, [eax+ebx*2] ; This likely indicates that eax points to an array of 16-bit (2 byte) values, and that ebx is an index into it. Note the similarities between this and the previous example: the same math is being done, but for a different reason.
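Both uses of lea have a rough C equivalent, sketched below. The function names are made up, and the lea instructions in the comments are only what a compiler might plausibly generate, not a guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Array indexing: compute the address of element 'index' in an array
   of 4-byte integers, in the spirit of lea eax, [esi+ecx*4].
   Only the address is calculated; no memory is read. */
const int32_t *element_address(const int32_t *base, size_t index)
{
    /* Pointer arithmetic scales 'index' by sizeof(int32_t), which is 4. */
    return base + index;
}

/* Quick math: x + x*2, in the spirit of lea edi, [eax+eax*2]. */
uint32_t triple(uint32_t x)
{
    return x + x * 2;
}
```

Note that element_address returns a pointer. To get the value of the element itself, the result still has to be dereferenced, just as the address produced by lea would be.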

Math and Logic

The commands in this section deal with math and logic. Some are simple, and others (like multiplication and division) are pretty tricky.

add, sub

A register can have another register, a constant value, or a value in memory (a dereferenced pointer) added to or subtracted from it. The syntax of addition and subtraction is fairly simple:

add eax, 3   ; Adds 3 to eax -- eax = eax + 3
add ebx, eax ; Adds the value of eax to ebx -- ebx = ebx + eax
sub ecx, 3   ; Subtracts 3 from ecx -- ecx = ecx - 3

inc, dec

and, or, xor

neg

mul, div

Jumping Around

jmp

call

cmp, test

jz/je, jnz/jne, jl, jg, jle, jge

Manipulating the Stack

push, pop

ret