Week 7 notes

Comp 264-002, Spring 2019, MWF, 11:30-12:20, Cuneo 218

Readings (from BOH3)

Chapter 1 (though we haven't covered much of section 9 yet, on concurrency)

Section 2.1 (you can skip 2.1.7 for now, on bitwise operations, and 2.1.9 on shift operations)
Section 2.2 (don't sweat the B2Uw notation for now, though all it's doing is formally defining the conversion from strings of bits to integers)
Section 2.3 on integer arithmetic
Section 2.4 on floating-point arithmetic

Section 3.1
Section 3.2
Section 3.3
Section 3.4

Study guide

Machine code

x86-64 cheat sheet

%rax    often used for function return value
%rcx    arg4
%rdx    arg3
%rsi    "index" register, arg2
%rdi    arg1; see BOH p 245
%rbp    "base pointer", not always needed
%rsp    stack pointer; some special hardware implications
%r8     arg5; r8-r15 were added with x86-64
%r9     arg6
%rip    the instruction pointer; not directly available

Operand formats (cf BOH3 p 181)

Move instructions (and most others) cannot have two memory operands; memory-to-memory moves are disallowed. The source may be an immediate, a register or memory; the destination may be a register or memory.

immediate                            movl $13, %eax               move decimal 13 into %eax
                                     (hex immediate operands use the $0xdeadbeef format)
immediate to memory                  movl $0x25, (%rdi)           move 37 to the address pointed to by %rdi
memory absolute                      movl 0xdeadbeef, %eax        no $, so this reads memory at that address; seldom used, except to probe certain fixed addresses
memory indirect                      movl (%rbx), %eax            copy memory pointed to by %rbx to %eax
base+displacement                    movl 100(%rbx), %eax         copy memory 100 bytes past where %rbx points to %eax
indexed                              movl (%rbx,%rdi), %eax       memory at address %rbx + %rdi
indexed with displacement            movl 16(%rbx,%rdi), %eax     memory at address %rbx + %rdi + 16
scaled indirect                      movl (,%rdi,4), %eax         memory at address 4*%rdi. Rare. Note the leading comma.
scaled indexed                       movl (%rbx,%rdi,4), %eax     memory at address %rbx + 4*%rdi. Common with arrays.
scaled indirect with displacement    movl 16(,%rdi,4), %eax       memory at address 4*%rdi + 16
scaled indexed with displacement     movl 16(%rbx,%rdi,4), %eax   memory at address %rbx + 4*%rdi + 16

(movl moves 4 bytes, so the register operand here is the 32-bit %eax; moving 8 bytes into %rax would use movq.)
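
All of the memory operand formats above are special cases of one address calculation: displacement + base + index*scale. The sketch below spells that out in C. The helper name effective_address and its parameter names are made up for illustration; element is included only to show the kind of C array access that a compiler will typically (not necessarily) translate into a scaled indexed operand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the address named by the operand disp(base,index,scale).
 * A missing base or index register is modeled here by passing 0. */
uint64_t effective_address(int32_t disp, uint64_t base, uint64_t index, unsigned scale)
{
    /* The hardware only accepts scale factors of 1, 2, 4 or 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    /* The displacement is signed, so sign-extend it before adding. */
    return base + index * scale + (uint64_t)(int64_t)disp;
}

/* With a in %rdi and i in %rsi, a compiler will typically read a[i]
 * with something like  movl (%rdi,%rsi,4), %eax  since sizeof(int) is 4. */
int element(const int *a, long i)
{
    return a[i];
}
```

For example, with base 0x1000, index 3, scale 4 and displacement 16, the address works out to 0x1000 + 12 + 16 = 0x101c.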


Practice 3.5: decode(long *xp, long *yp, long *zp)

// xp is in %rdi, yp is in %rsi, zp is in %rdx (this is the standard allocation for arg1, arg2, arg3)

movq    (%rdi), %r8        // what is moved?
movq    (%rsi), %rcx
movq    (%rdx), %rax
movq    %r8, (%rsi)
movq    %rcx, (%rdx)
movq    %rax, (%rdi)

Why don't we move (%rdi) directly to (%rsi)? (two reasons)

To what extent can these instructions be reordered?
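
As a reference point for the questions above, here is one C function that is consistent with the six instructions. The local variable names x, y and z are chosen for illustration; the comments pair each statement with the instruction it corresponds to. Note that all three loads happen before any store, which matters if two of the pointers refer to the same location.

```c
#include <assert.h>
#include <stddef.h>

/* One C reading of the six movq instructions in the practice problem. */
void decode(long *xp, long *yp, long *zp)
{
    assert(xp != NULL && yp != NULL && zp != NULL);

    long x = *xp;   /* movq (%rdi), %r8  */
    long y = *yp;   /* movq (%rsi), %rcx */
    long z = *zp;   /* movq (%rdx), %rax */

    *yp = x;        /* movq %r8,  (%rsi) */
    *zp = y;        /* movq %rcx, (%rdx) */
    *xp = z;        /* movq %rax, (%rdi) */
}
```

Starting with the three locations holding 1, 2 and 3, a call leaves them holding 3, 1 and 2: the values rotate by one position.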







mul37.c; multest.c

mul.c; muul.c; mulskel.c; mul.sh

3.6: Conditional execution (not on midterm)

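Condition codes, set by add and sub (but not lea):

CF    carry flag: the operation produced a carry out of the most significant bit (unsigned overflow)
OF    overflow flag: the operation caused two's-complement (signed) overflow
ZF    zero flag: the result was zero
SF    sign flag: the result was negative

The last two are used in comparison operations: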

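cmp S,D sets the condition codes as if it had calculated D-S (which is sort of backwards in AT&T notation), without storing the result. If ZF is set, then D==S. If SF is set and there was no signed overflow, then D<S; in general, D<S as signed values exactly when SF and OF differ. CF and OF are also set; CF is set if D<S via unsigned comparison.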
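
The flag rules for cmp can be checked by simulating them in C. This is only a sketch for 64-bit operands: the struct flags type and the function names are invented for illustration, and the subtraction is done on unsigned values so that it wraps around the way the hardware does instead of invoking undefined behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four condition codes. */
struct flags { bool zf, sf, cf, of; };

/* Flags as set by  cmpq S, D  which behaves like computing D - S. */
struct flags cmp_flags(int64_t s, int64_t d)
{
    uint64_t us = (uint64_t)s, ud = (uint64_t)d;
    uint64_t t = ud - us;                  /* wraps modulo 2^64 */
    bool sign_s = us >> 63, sign_d = ud >> 63, sign_t = t >> 63;

    struct flags f;
    f.zf = (t == 0);                       /* D == S */
    f.sf = sign_t;                         /* result looks negative */
    f.cf = ud < us;                        /* unsigned D < S (a borrow) */
    /* Signed overflow: operands had different signs and the result's
     * sign differs from D's sign. */
    f.of = (sign_d != sign_s) && (sign_t != sign_d);
    return f;
}

/* Signed D < S: this is the condition that jl tests. */
bool signed_less(struct flags f) { return f.sf != f.of; }

/* Unsigned D < S: this is the condition that jb tests. */
bool unsigned_below(struct flags f) { return f.cf; }
```

The overflow case is the reason SF alone is not enough: comparing D = INT64_MIN with S = 1 gives a wrapped result that looks positive (SF clear) but sets OF, so SF != OF still reports D<S.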

There are also testb through testq, based on D & S instead of D-S.

There is a series of set instructions for setting a one-byte register to 1 if some combination of the condition codes is true, and to 0 otherwise. These are rarely used.

By far the most common users of condition codes are the conditional-jump instructions.

If we're writing code for unsigned comparison, we'll use ja/jb. Signed comparison will use js/jg/jl.
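
The choice between the two families of jumps comes straight from the C types being compared. The two functions below (names invented for illustration) have the same body but different operand types; a compiler will typically use the signed condition (jl, or the matching setl/cmovl) for the first and the unsigned condition (jb, or setb/cmovb) for the second, though the exact instructions depend on the compiler and optimization level.

```c
/* Signed comparison: typically compiled with the "less" condition (jl/setl). */
int less_signed(long a, long b)
{
    return a < b;
}

/* Unsigned comparison: typically compiled with the "below" condition (jb/setb). */
int below_unsigned(unsigned long a, unsigned long b)
{
    return a < b;
}
```

The same bit pattern gives different answers: less_signed(-1, 1) is 1, but below_unsigned((unsigned long)-1, 1) is 0, because -1 converted to unsigned long is the largest possible value.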

Friday: demo of that. Just finished absdiff_se and jl

Jump instructions are encoded using the relative offset to the destination. This value is added (via signed addition) to the program counter, which at that point holds the address of the instruction following the jump; hence the name PC-relative addressing. The offset can be 1, 2 or 4 bytes.
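
The target calculation is small enough to write down directly. In this sketch the function name jump_target and its parameters are made up; the offset is passed as an already sign-extended value, which is what the hardware does with a 1-byte or 4-byte encoded offset.

```c
#include <stdint.h>

/* Hypothetical helper: destination of a PC-relative jump.
 * next_instr_addr is the address of the instruction after the jump. */
uint64_t jump_target(uint64_t next_instr_addr, int32_t rel)
{
    /* Signed addition: a negative offset jumps backward. */
    return next_instr_addr + (uint64_t)(int64_t)rel;
}
```

For example, a 2-byte jump at address 0x3 with offset byte 0x03 is followed by an instruction at 0x5, so it lands at 0x8. An offset byte of 0xf8 is -8 as a signed byte, so a jump whose following instruction is at 0xd goes back to 0x5.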

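AMD recommends against making a ret instruction the destination of a jump, or placing it immediately after a conditional jump, because its branch predictor handles that case poorly. So gcc often emits rep; ret (disassembled as repz retq) in place of a plain ret.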

absdiff_se: does the goto version lead to the same code?


Why would conditional moves ever be an improvement?

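    branch prediction: a mispredicted conditional jump costs many cycles, while a conditional move involves no jump to mispredict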
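
To make the comparison concrete, here are two C versions of an absolute-difference function. The first uses conditional control flow; the second computes both candidate results and then selects one, which is the shape of code a conditional move implements. The names absdiff and cmovdiff are used here for illustration and this version omits any side effects. Whether a compiler actually emits cmov for either one depends on the compiler and optimization flags.

```c
/* Conditional control: only one of the two subtractions is executed. */
long absdiff(long x, long y)
{
    long result;
    if (x < y)
        result = y - x;
    else
        result = x - y;
    return result;
}

/* Conditional data transfer: both subtractions are always executed,
 * then one result is selected. The final "if" maps onto a cmov. */
long cmovdiff(long x, long y)
{
    long rval = y - x;
    long eval = x - y;
    long ntest = x >= y;
    if (ntest)
        rval = eval;
    return rval;
}
```

The second form does slightly more arithmetic, but its running time does not depend on how predictable the comparison x < y is.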

Conditional moves from memory, and segfaults

long cread(long *xp) {
    return (xp ? *xp : 0);
}

This must not use cmov. Why?
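
A conditional move with a memory source reads that memory whether or not the condition holds, so a cmov-based translation of cread would dereference xp even when it is null. The sketch below shows cread next to a rewritten variant, here called cread_alt (a name chosen for illustration), that moves the choice onto the pointer: the selection between xp and the address of a local zero is safe to do unconditionally, and only the selected pointer is dereferenced. Whether a given compiler turns the selection into cmov is not guaranteed.

```c
#include <stddef.h>

/* Must be compiled with a branch: *xp may only be read when xp is non-null. */
long cread(long *xp)
{
    return (xp != NULL ? *xp : 0);
}

/* Variant that is safe for a conditional move: choose the pointer first,
 * then perform exactly one dereference of a pointer known to be valid. */
long cread_alt(long *xp)
{
    long zero = 0;
    long *p = (xp != NULL ? xp : &zero);
    return *p;
}
```

Both functions return 0 for a null pointer and the pointed-to value otherwise.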


while loops:

dofact.c: there is only one label

while.c: Note that the condition has been moved to the bottom. Does this improve anything? This is an example of the jump-to-middle format.


whilefact.Og.s: jump-to-middle

whilefact.O1.s: guarded-do example

forfact.c: jump-to-middle.
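
The two loop translations named above can be written out in C using goto, which mirrors the structure of the generated assembly. This is a sketch with invented function names, not the contents of the course files: fact_while is an ordinary while loop, fact_jm is its jump-to-middle form, and fact_gd is its guarded-do form.

```c
/* Ordinary while loop. */
long fact_while(long n)
{
    long result = 1;
    while (n > 1) {
        result *= n;
        n = n - 1;
    }
    return result;
}

/* Jump-to-middle: jump straight to the test at the bottom of the loop. */
long fact_jm(long n)
{
    long result = 1;
    goto test;
loop:
    result *= n;
    n = n - 1;
test:
    if (n > 1)
        goto loop;
    return result;
}

/* Guarded-do: test once up front, then run a do-while loop. */
long fact_gd(long n)
{
    long result = 1;
    if (n <= 1)
        goto done;
loop:
    result *= n;
    n = n - 1;
    if (n > 1)
        goto loop;
done:
    return result;
}
```

In both goto forms the loop body ends with a single conditional jump back to the top, so each iteration executes one jump instead of a conditional jump at the top plus an unconditional jump at the bottom.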