IEEE Floating Point numbers (Part 3) + Assembly Language

By Ranti
 · 
March 9, 2023
Diagram showing the sign, exponent and mantissa represented as fields in the IEEE 32-bit notation

$$\text{If } N = -1.011 \times 2^{15}$$
$$\text{SIGN BIT: } 1$$
$$\text{EXPONENT BITS: } 15 + 127 = 142 \text{ in decimal, which is } 10001110 \text{ in binary}$$
$$\text{MANTISSA BITS: } 011 \; 000...0 \text{ (20 zeros)}$$

Diagram showing N represented as fields in the IEEE 32-bit notation
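The field values for N can be checked on a real machine. The following is a minimal sketch, assuming `float` is a 32-bit IEEE 754 value (as it is on x86-64). The names `split_float` and `struct float_fields` are hypothetical and only used here for illustration.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of a 32-bit IEEE 754 value (hypothetical type). */
struct float_fields {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, bias of 127 still included */
    uint32_t mantissa;  /* 23 bits, the digits after the binary point */
};

/* Hypothetical helper: split a float into sign, exponent and mantissa.
 * Assumes float is stored as a 32-bit IEEE 754 value. */
struct float_fields split_float(float value)
{
    uint32_t bits;
    struct float_fields f;

    memcpy(&bits, &value, sizeof bits);  /* copy the raw bytes of the float */
    f.sign     = bits >> 31;             /* top bit */
    f.exponent = (bits >> 23) & 0xFFu;   /* next 8 bits */
    f.mantissa = bits & 0x7FFFFFu;       /* low 23 bits */
    return f;
}
```

Calling `split_float(-45056.0f)` (that is, -1.011 x 2^15 written in decimal) should give sign 1, exponent 142 and mantissa 0x300000, which is 011 followed by 20 zeros.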
Place values in a binary number with a fractional part:
$$\begin{array}{cccccccc} 4 & 2 & 1 & . & \frac{1}{2} & \frac{1}{4} & \frac{1}{8} & \frac{1}{16} \\ 2^2 & 2^1 & 2^0 & & 2^{-1} & 2^{-2} & 2^{-3} & 2^{-4} \\ 1 & 1 & 0 & . & 0 & 1 & 0 & 1 \end{array}$$
$$110.0101_2 = 4 + 2 + 0.25 + 0.0625 = 6.3125$$
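The place-value sum can be sketched in C. This is an illustrative helper, not part of any library: `binary_to_double` is a hypothetical name, and the function assumes its input contains only the characters 0, 1 and at most one point.

```c
/* Hypothetical helper: evaluate a binary string such as "110.0101"
 * by adding up the place value of every digit. */
double binary_to_double(const char *bits)
{
    double value = 0.0;
    const char *p = bits;

    /* Integer part: each new digit doubles everything read so far. */
    for (; *p != '\0' && *p != '.'; p++)
        value = value * 2.0 + (*p - '0');

    /* Fractional part: place values 1/2, 1/4, 1/8, 1/16, ... */
    if (*p == '.') {
        double place = 0.5;
        for (p++; *p != '\0'; p++) {
            value += (*p - '0') * place;
            place /= 2.0;
        }
    }
    return value;
}
```

For the number in the diagram, `binary_to_double("110.0101")` adds 4 + 2 + 0.25 + 0.0625 and returns 6.3125.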

Steps to encode a number in IEEE Floating Point

  1. Identify the sign: if the number is positive, sign = 0; if negative, sign = 1
  2. Convert the number to binary
  3. Normalise the binary number (e.g. 1.001 x 2^5)
  4. Identify the exponent and add the bias (+127)
  5. Convert the exponent + bias to binary
  6. Take the digits after the binary point as the mantissa
  7. Pad the mantissa with trailing zeros until it is 23 bits long

Example Number = -287.5

  1. SIGN = 1
  2. Binary = 100011111.1
  3. Normalised = 1.000111111 x 2^8
  4. Exponent + bias = 8 + 127 = 135
  5. 135 in binary = 10000111
  6. Mantissa after point = 000111111
  7. Padded mantissa = 00011111100000000000000

Result (sign, exponent, mantissa) = 1 10000111 00011111100000000000000
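The seven steps can be followed almost literally in code. The sketch below is a simplified illustration, not how hardware or a C library does it: `encode_float` is a hypothetical name, it assumes the input is finite, non-zero and within the normal range of a 32-bit float, and it cuts off extra mantissa bits instead of rounding them.

```c
#include <stdint.h>

/* Hypothetical helper: build the 32-bit pattern by following the steps.
 * Assumes x is finite, non-zero and in the normal float range.
 * Extra mantissa bits are truncated, not rounded. */
uint32_t encode_float(double x)
{
    uint32_t sign = (x < 0.0) ? 1u : 0u;       /* step 1: sign bit */
    double mag = (x < 0.0) ? -x : x;
    int exponent = 0;
    uint32_t mantissa = 0;

    /* steps 2 and 3: normalise to 1.xxx times 2^exponent */
    while (mag >= 2.0) { mag /= 2.0; exponent++; }
    while (mag < 1.0)  { mag *= 2.0; exponent--; }

    /* steps 4 and 5: add the bias; the integer is already binary */
    uint32_t biased = (uint32_t)(exponent + 127);

    /* steps 6 and 7: take 23 digits after the point, zeros fill the rest */
    double frac = mag - 1.0;
    for (int i = 0; i < 23; i++) {
        frac *= 2.0;
        mantissa <<= 1;
        if (frac >= 1.0) {
            mantissa |= 1u;
            frac -= 1.0;
        }
    }

    return (sign << 31) | (biased << 23) | mantissa;
}
```

For the worked example, `encode_float(-287.5)` returns 0xC38FC000, which is the bit pattern 1 10000111 00011111100000000000000.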

Computer Architecture

  • How the hardware looks to an assembly language programmer, since assembly is the "human readable" version of machine code
  • Allows the programmer to see the machine's instructions and memory

Instruction-set Architecture (ISA) --> The instructions the programmer can use and how they look

  • The micro-architecture is the part of the machine that is not visible to the programmer: how the instructions are implemented (e.g. the circuitry that makes up the computer)

Assembly Language

  • Consists of simple instructions: ADD, SUBTRACT, MULTIPLY, COMPARE, JUMP, MOVE DATA, AND, OR, XOR, etc.
  • The data used for the instructions comes from the registers or from memory
Diagram showing relationship between CPU, Registers, Memory

Registers

  • Very fast
  • Holds less data
  • Accessed by name

Memory

  • Slow
  • Holds lots of data
  • Accessed via addresses

Intel x86-64 Architecture

  • 64-bit architecture (x64)

The x64 registers

  • Each general-purpose register can hold 64 bits:
    • One 64-bit pointer
    • One 64-bit integer
    • Smaller integers (32, 16 or 8 bits)
  • There are 16 general-purpose registers, each holding at most 64 bits: %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %r8, %r9, ..., %r15, %rsp, %rbp (%rsp and %rbp have special purposes: the stack pointer and the base pointer)

For each register:

Diagram showing the number of bits each part of a register holds
  • %rax = The full 64-bit register
  • %eax = The lower 32 bits of the %rax register
  • %ax = The lower 16 bits of the %rax register
  • The lower half of a register is named by replacing the r with an e (%e_x), so for %rax the lower half is %eax
    • The r's are replaced with e's
  • For %r8 ... %r15 the lower halves are named %r8d ... %r15d
    • They end with d's
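One way to picture the narrower register names is as truncations of the same 64-bit value. The sketch below models that in C with plain integers. `low32` and `low16` are hypothetical helper names, and the model only covers reading the smaller names, not writing to them.

```c
#include <stdint.h>

/* Hypothetical model: the value seen through %eax is the low 32 bits
 * of the value held in %rax. */
uint32_t low32(uint64_t rax)
{
    return (uint32_t)rax;   /* keeps bits 0..31 */
}

/* Hypothetical model: the value seen through %ax is the low 16 bits. */
uint16_t low16(uint64_t rax)
{
    return (uint16_t)rax;   /* keeps bits 0..15 */
}
```

If %rax held 0x1122334455667788, then under this model %eax would read as 0x55667788 and %ax as 0x7788.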

Instructions

Move instructions

mov source, destination: copies the data from source to destination (the data moves from left to right)

E.g. mov %rcx, %rsi (copies the data from %rcx into %rsi)

The source doesn't have to be a register. It can be a constant. A constant starts with a $ sign

E.g. mov $23, %rdx (puts the value 23 into %rdx)

The source or destination can also be a memory location, but not both: a single mov cannot copy from one memory location directly to another

Arithmetic Instructions

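add source, destination: destination += source

sub source, destination: destination -= source

imul source, destination: destination *= source (integer multiplication)

inc destination: destination++

dec destination: destination--

and source, destination: destination &= source (bitwise AND)

or source, destination: destination |= source (bitwise OR)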

Examples

add $1, %rax (%rax = %rax + 1)

sub %rcx, %rdx (%rdx = %rdx - %rcx)
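To see how the source and destination operands interact, it can help to model a few registers as C variables and write each instruction as the C statement it corresponds to. This is only a teaching model: `struct regs` and `run_example` are hypothetical names, and the starting values are made up for the example.

```c
#include <stdint.h>

/* Hypothetical model of three registers as plain variables. */
struct regs {
    int64_t rax, rcx, rdx;
};

/* Each statement mirrors the instruction in its comment:
 * the source is written first, the destination second. */
struct regs run_example(void)
{
    struct regs r = {0, 0, 0};

    r.rdx = 23;        /* mov  $23, %rdx                         */
    r.rcx = 5;         /* mov  $5, %rcx                          */
    r.rax = r.rcx;     /* mov  %rcx, %rax                        */
    r.rax += 1;        /* add  $1, %rax                          */
    r.rdx -= r.rcx;    /* sub  %rcx, %rdx  (%rdx = %rdx - %rcx)  */
    r.rax *= r.rdx;    /* imul %rdx, %rax                        */
    r.rcx++;           /* inc  %rcx                              */
    r.rdx--;           /* dec  %rdx                              */
    return r;
}
```

Tracing it by hand: %rax goes 5, 6, then 6 x 18 = 108; %rcx ends at 6; %rdx goes 23, 18, then 17.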

Comparison Operations

cmp op2, op1: compare op1 with op2

The hardware remembers the result of the comparison (in the flags register) but does not do anything with it. A conditional jump placed after the cmp acts on that result

JUMP Operations

jmp label: always jump to the label

Conditional jumps come after a cmp op2, op1. They jump only if the comparison of op1 with op2 came out as stated; otherwise execution continues with the next instruction

jg label: jump if op1 > op2

je label: jump if op1 == op2

jge label: jump if op1 >= op2

jl label: jump if op1 < op2

jle label: jump if op1 <= op2
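The compare-then-jump pattern maps closely onto an `if` followed by a `goto` in C. The following is a small sketch of that idea. `max_with_jumps` is a hypothetical function, and the instructions in the comments show the assumed correspondence, not compiler output.

```c
/* Hypothetical example: return the larger of a and b using only a
 * comparison followed by a conditional jump. */
int max_with_jumps(int a, int b)
{
    int result = a;

    if (a >= b)        /* cmp b, a  then  jge done */
        goto done;     /* taken only when a >= b   */
    result = b;        /* reached only when a < b  */
done:
    return result;
}
```

When a >= b the jump skips the assignment, so `result` keeps the value of a. Otherwise execution falls through to the next statement and `result` becomes b.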

Example: a factorial function in C:

int fac(int n) {
    int i, prod = 1;
    for (i = 1; i <= n; i++)
        prod = prod * i;
    return prod;
}

The same function in x86-64 assembly (AT&T syntax):

  .globl _fac               # make the symbol visible to the linker
_fac:
  pushq %rbp                # always need these two lines
  movq %rsp, %rbp

  # put prod in %eax, since it is the 32-bit return value
  # we'll put i in %ecx

  movl $1, %eax             # prod = 1
  movl $1, %ecx             # i = 1

_LOOP_TOP:                  # top of loop
  cmpl %edi, %ecx           # compare i (%ecx) to n (%edi)
  jg DONE                   # if i > n, jump out of loop
  imull %ecx, %eax          # prod *= i
  incl %ecx                 # i++
  jmp _LOOP_TOP             # jump to top of loop

DONE:                       # nothing more to do, result is already in %eax
  popq %rbp                 # always need these two lines
  retq

Other things to note about the assembly code

  • _fac is the name of the function
  • _LOOP_TOP and DONE are labels
  • By convention, an integer return value is placed in the %rax register (%eax for a 32-bit int)
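To make the link between the C loop and the assembly easier to follow, the function can be rewritten in C using only `if` and `goto`, one statement per instruction. This is a hand-written sketch with a hypothetical name, `fac_goto`. The instruction in each comment is the one it is meant to mirror.

```c
/* Hypothetical rewrite of fac() that follows the assembly line by line. */
int fac_goto(int n)
{
    int prod = 1;       /* movl $1, %eax              */
    int i = 1;          /* movl $1, %ecx              */

loop_top:
    if (i > n)          /* cmpl %edi, %ecx ; jg DONE  */
        goto done;
    prod *= i;          /* imull %ecx, %eax           */
    i++;                /* incl %ecx                  */
    goto loop_top;      /* jmp _LOOP_TOP              */

done:
    return prod;        /* result is already in %eax  */
}
```

For example, `fac_goto(5)` goes round the loop five times and returns 120. For n = 0 the first comparison already jumps to `done` and the function returns 1.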

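Generating assembly code: gcc -S fac.c (the capital -S stops after compilation and writes the assembly to fac.s)

  • A q is often appended to an instruction that works on 64-bit operands (e.g. movq)
  • An l is appended to an instruction that works on 32-bit operands (e.g. movl)
  • The suffix is usually optional because the assembler can usually tell the operand size from the registers used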