Week 5 notes

Comp 264-002, Spring 2019, MWF, 11:30-12:20, Cuneo 218

Readings (from BOH3)

Chapter 1 (though we haven't covered much of section 9 yet, on concurrency)

Section 2.1 (you can skip 2.1.7 for now, on bitwise operations, and 2.1.9 on shift operations)
Section 2.2 (don't sweat the B2Uw notation for now, though all it's doing is formally defining the conversion from strings of bits to integers)
Section 2.3 on integer arithmetic (we'll save floating point, in 2.4, for later)

Programming assignment 2

Write a function endian(int x) that converts x from little-endian format to big-endian, and vice-versa. In other words, if the bytes of x are b0b1b2b3, then the function returns b3b2b1b0. If x = 0x0102a3b4, then the result of endian(x) is 0xb4a30201.

On a little-endian machine, the function is the equivalent of ntohl(), declared in <arpa/inet.h>. (On a big-endian machine ntohl() does nothing.)

There are two general approaches:

1. Byte manipulation: cast &x to (unsigned char *). Now you have an array of four bytes, which you can easily reorder.

2. Numeric manipulation: use & and shifts to extract the bytes, and then reassemble them.
    unsigned ux = x;    /* work in unsigned so the shifts cannot sign-extend or overflow */
    unsigned b0 = ux & 0xff;
    unsigned b1 = (ux & (0xffu << 8)) >> 8;
    ...
    result = (b0 << 24) + (b1 << 16) + ...

Test your function with some examples.
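
Here is a minimal sketch of the first approach (byte manipulation), assuming a 4-byte int. The name endian_bytes is made up for illustration; it is one possible shape for the function, not the required solution.

```c
#include <assert.h>

/* Hypothetical name: reverse the four bytes of x through a byte view.
 * Assumes sizeof(int) == 4. */
int endian_bytes(int x) {
    assert(sizeof(int) == 4);
    unsigned char *p = (unsigned char *) &x;  /* p[0..3]: bytes of x in memory order */
    unsigned char tmp;

    tmp = p[0]; p[0] = p[3]; p[3] = tmp;      /* swap the outer pair */
    tmp = p[1]; p[1] = p[2]; p[2] = tmp;      /* swap the inner pair */
    return x;
}
```

Because the bytes are reversed in memory, the sketch behaves the same on little-endian and big-endian hardware. A useful sanity check is that applying it twice gives back the original value.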


C: see c.html

rows_and_columns.c: an illustration of the effect of caches. Row order and column order visit exactly the same array components, but in different orders.
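
The two traversal orders can be sketched as below. This is not the code in rows_and_columns.c; the array size, the fill pattern, and the function names are assumptions chosen for illustration.

```c
#include <stddef.h>

#define N 512   /* illustrative size, not taken from rows_and_columns.c */

static int grid[N][N];

/* Fill with a simple pattern so the two sums can be compared. */
void fill_grid(void) {
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            grid[i][j] = (int) ((i + j) % 7);
}

/* Row order: the inner loop walks consecutive addresses. */
long sum_row_order(void) {
    long sum = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            sum += grid[i][j];
    return sum;
}

/* Column order: the inner loop jumps N ints between accesses. */
long sum_col_order(void) {
    long sum = 0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            sum += grid[i][j];
    return sum;
}
```

Both functions return the same sum; any difference in running time comes only from the order in which memory is touched.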

branchpredict.c: an illustration of the effect of branch prediction. The two array passes -- before and after sorting -- do exactly the same work, just in different orders. Also, what's with that 'µ'? Is it a byte?
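
A sketch of the kind of loop involved, assuming the usual form of this demo (a data-dependent if inside the loop). It is not the code in branchpredict.c; the names sum_at_least and cmp_int are hypothetical.

```c
#include <stdlib.h>

/* Sum the elements that are >= threshold.
 * The if is the branch the processor tries to predict. */
long sum_at_least(const int *a, size_t n, int threshold) {
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] >= threshold)
            sum += a[i];
    return sum;
}

/* Comparison function for qsort(): ascending order of ints. */
int cmp_int(const void *p, const void *q) {
    int x = *(const int *) p;
    int y = *(const int *) q;
    return (x > y) - (x < y);
}
```

To try it, fill an array with values in 0..255, time sum_at_least(a, n, 128), call qsort(a, n, sizeof a[0], cmp_int), and time it again. The result is identical; on sorted data the branch outcome changes only once.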


BOH3 p 100:

In 2002, it was discovered that code supplied by Sun Microsystems to implement the XDR library, a widely used facility for sharing data structures between programs, had a security vulnerability arising from the fact that multiplication can overflow without any notice being given to the program. Code similar to that containing the vulnerability is shown below:

/*
 * Illustration of code vulnerability similar to that found in Sun’s XDR library.
 */
void* copy_elements(void *ele_src[], int ele_cnt, size_t ele_size) {
    /*
    * Allocate buffer for ele_cnt objects, each of ele_size bytes
    * and copy from locations designated by ele_src
    */
    void *result = malloc(ele_cnt * ele_size);
    if (result == NULL)      /* malloc failed */
        return NULL;
    void *next = result;
    int i;
    for (i = 0; i < ele_cnt; i++) {    /* Copy object i to destination */
        memcpy(next, ele_src[i], ele_size);
        /* Move pointer to next memory region */
        next += ele_size;
    }
    return result;
}

The caller passes in ele_cnt and ele_size; their product is the number of bytes to be allocated and copied.

What if ele_size = 4096 = 2^12 and ele_cnt = 1,048,577 = 2^20 + 1? Then, with 32-bit arithmetic, ele_cnt * ele_size is 4096 (why?).

That means that the malloc() call didn't allocate nearly enough memory, and so the memcpy() calls in the loop copy far too much data into the buffer, causing memory corruption.
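
The arithmetic can be checked directly. The sketch below computes the product in 32-bit unsigned arithmetic, where (2^20 + 1) * 2^12 = 2^32 + 2^12 is reduced mod 2^32. The second function is one common way to guard the multiplication; it is a hypothetical helper, not the fix Sun shipped.

```c
#include <stdint.h>
#include <stddef.h>

/* The product as 32-bit arithmetic computes it: reduced mod 2^32. */
uint32_t product32(uint32_t ele_cnt, uint32_t ele_size) {
    return ele_cnt * ele_size;
}

/* Hypothetical guard: store the product in *out and return 1 if it fits
 * in a size_t; return 0 (leaving *out alone) if it would overflow. */
int checked_product(size_t ele_cnt, size_t ele_size, size_t *out) {
    if (ele_size != 0 && ele_cnt > SIZE_MAX / ele_size)
        return 0;                       /* would overflow */
    *out = ele_cnt * ele_size;
    return 1;
}
```

A caller would test the return value of checked_product() before calling malloc() and give up if it is 0.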


The fractional part is 23 bits and can represent any value n/2^23, 0 ≤ n < 2^23. If we put a 1 to the left of the binary point, we get a number that is at least 1 and less than 2.

exponent: represents an unsigned value 0 ≤ expfield ≤ 255. The actual exponent is e = expfield - 127, if expfield != 0 and expfield != 255, so the range for e is -126 ≤ e ≤ 127. For e in this range, the fractional part is assumed to have a 1. in front of it. This is called the normalized case.

For expfield == 0, the number is denormalized. We still assume e=-126, but there is not an assumed 1. in front of the fractional part. That is, we go from:

    0    0000 0001   0 ... 000    // frac is 1.00...0

to

    0    0000 0000    1 ... 111        // frac is 0.111  ... 11

and on downward until

    0    0000 0000     0...00001

Representation of 0.0
    what is -0.0??

What is the smallest 32-bit float > 1.0?

What is the largest 32-bit float?

What is the smallest 32-bit float > 0.0?
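
One way to explore these questions is to build floats from bit patterns and print them. The sketch below assumes float is IEEE 754 single precision; the function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes IEEE 754 single precision).
 * memcpy avoids the aliasing problems of casting pointers. */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* The reverse direction: expose the bit pattern of a float. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Try printf("%.10g\n", bits_to_float(...)) on patterns such as 0x3f800001, 0x7f7fffff, 0x00000001 and 0x80000000, and work out from the sign, expfield, and frac fields why each value comes out as it does.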

Monday:

Problem 2.71 on page 133: what is wrong with xbyte? Even before that, however, what is right with it?

Hint: note that the function is supposed to return signed values. That is, if the extracted byte is negative, the returned int should be sign-extended.

Observation: signed bytes are frustrating. If you are using them, think about why.
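
To make "sign-extended" concrete, here is a small sketch that extracts byte k of a word as a signed value. It is not written under the operator restrictions of problem 2.71, so it is not a solution to that problem; the name signed_byte is hypothetical.

```c
/* Hypothetical helper: return byte k (0 = least significant) of word,
 * interpreted as a signed 8-bit value and widened to int. */
int signed_byte(unsigned word, int k) {
    int b = (int) ((word >> (8 * k)) & 0xffu);  /* 0..255, zero-extended */
    return (b ^ 0x80) - 0x80;                   /* 128..255 become -128..-1 */
}
```

For the word 0x12F4807F, bytes 0 through 3 are 0x7F, 0x80, 0xF4 and 0x12. The two with the high bit set (0x80 and 0xF4) must come back negative.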



Infinity and NaN

If expfield = 1111 1111, then if frac == 0... 000 the number represents +∞ or -∞, depending on the sign bit. If frac is any nonzero value, then the float represented is NaN: not a number.
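
The cases so far (zero, denormalized, normalized, infinity, NaN) can be told apart from the bit fields alone. A sketch, with hypothetical enum and function names:

```c
#include <stdint.h>

enum fclass { F_ZERO, F_DENORM, F_NORMAL, F_INF, F_NAN };  /* hypothetical names */

/* Classify a 32-bit float pattern by its expfield and frac fields. */
enum fclass classify_bits(uint32_t bits) {
    uint32_t expfield = (bits >> 23) & 0xffu;   /* 8 exponent bits */
    uint32_t frac     = bits & 0x7fffffu;       /* 23 fraction bits */

    if (expfield == 0xffu)
        return frac == 0 ? F_INF : F_NAN;
    if (expfield == 0)
        return frac == 0 ? F_ZERO : F_DENORM;
    return F_NORMAL;
}
```

The sign bit is ignored here, so +∞ and -∞ both classify as F_INF, and 0.0 and -0.0 both classify as F_ZERO.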

Failure of associativity

Wednesday:
    When you're adding up a series of smaller and smaller numbers, you get the best accuracy starting at the small end.
    Example: sum1() and sum2()
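
A sketch of both points. The first two functions show grouping changing the result of a three-term sum. The other two add the series 1/k^2 in float from each end. These are not the sum1() and sum2() from class; the names and the choice of series are assumptions.

```c
/* Same three operands, grouped two ways. */
double group_left(double a, double b, double c)  { return (a + b) + c; }
double group_right(double a, double b, double c) { return a + (b + c); }

/* Sum 1/k^2 for k = 1..n, largest terms first. Once the running sum is
 * large compared with the next term, the term is rounded away entirely. */
float sum_forward(int n) {
    float s = 0.0f;
    for (int k = 1; k <= n; k++)
        s += 1.0f / ((float) k * (float) k);
    return s;
}

/* Same terms, smallest first, so the small terms get to accumulate. */
float sum_backward(int n) {
    float s = 0.0f;
    for (int k = n; k >= 1; k--)
        s += 1.0f / ((float) k * (float) k);
    return s;
}
```

Try group_left(3.14, 1e20, -1e20) against group_right(3.14, 1e20, -1e20), and compare sum_forward(1000000) with sum_backward(1000000); the exact limit of the series is π²/6.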

Practice problem 2.43, p 107


Practice problem 2.54, p 125

Which are true:

A. x == (int) (double)x

B. x == (int) (float) x

C. d == (double)(float) d

D. f == (float)(double) f

E. f == -(-f)

BOH3 example: 1 bit sign, 4 bit expfield, 3 bit fraction (p 116, fig 2.35)

    Note especially the transition from smallest-normalized to largest-denormalized.

Friday:

Program in C

Comparability

Monotonicity: if u≤v, then x+u ≤ x+v. This is not true for fixed-width integer arithmetic; one side may overflow and the numeric values wrap around. With float, both sides may become infinity (or -infinity if x == -infinity), but the inequality still holds, provided no NaN is involved.
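
A sketch of the contrast. Signed overflow is undefined behavior in C, so the wraparound is done in unsigned arithmetic and converted back. That conversion is implementation-defined before C23 but wraps on ordinary two's-complement machines. Both function names are hypothetical.

```c
#include <stdint.h>

/* 32-bit two's-complement addition with wraparound. */
int32_t wrap_add(int32_t a, int32_t b) {
    return (int32_t) ((uint32_t) a + (uint32_t) b);
}

/* Returns 1 when x+u <= x+v holds in float arithmetic. */
int float_sum_ordered(float x, float u, float v) {
    return x + u <= x + v;
}
```

With x = INT32_MAX - 1, u = 1, v = 2, the integer sum x+v wraps to a negative number and so is less than x+u. With x = 3.0e38f, u = 1.0f, v = 1.0e38f, the float sum x+v overflows to infinity and the ordering is preserved.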

Floating-point hardware

mysqrt() example
    how to scale for larger numbers: mysqrt(1e6) and larger
    best value for epsilon
    accuracy of last binary digit
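
For reference, a square root by Newton's method might look like the sketch below. This is an assumption about the general approach, not the mysqrt() from class. In particular, making the stopping test relative to a is just one possible answer to the scaling question.

```c
/* Hypothetical sketch: Newton's method for sqrt(a), a >= 0.
 * Stops when |x*x - a| <= eps * a, a tolerance relative to a. */
double newton_sqrt(double a, double eps) {
    if (a <= 0.0)
        return 0.0;
    double x = a > 1.0 ? a : 1.0;        /* starting guess, at least sqrt(a) */
    for (int i = 0; i < 200; i++) {      /* iteration cap in case eps is too small */
        double err = x * x - a;
        if (err < 0.0)
            err = -err;
        if (err <= eps * a)
            break;
        x = 0.5 * (x + a / x);           /* Newton step */
    }
    return x;
}
```

Experiment with eps: a value near 1e-12 stops quickly, while a value below the precision of double can never be met, and then the iteration cap is what ends the loop.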


What is the first int n for which (float) n == (float) (n+1)?
    demo in float.c
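
A brute-force search is enough to find it. The sketch below is not float.c; it assumes IEEE 754 single-precision float and an int of at least 32 bits, and the function name is hypothetical.

```c
/* Search upward from 0 for the first n whose float conversion
 * equals that of n+1. Returns -1 if none is found. */
int first_float_collision(void) {
    for (int n = 0; n < 2147483647; n++) {
        volatile float a = (float) n;        /* volatile forces rounding to float */
        volatile float b = (float) (n + 1);
        if (a == b)
            return n;
    }
    return -1;
}
```

Before running it, predict the answer from the number of significant bits in a float, then print the result in hex and compare.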

float and ==: some people argue you should never use == with float.

Table on p 60: revisit two's-complement

XDR vuln


Machine code

processor history:
    core i7, 0.78 billion transistors
    core i7-Sandy Bridge    1.17 billion transistors
    core i7-Haswell    1.4 billion transistors

x86-64 (sometimes called "x64")

The register file

%rax often used for function return value; %eax = low 32 bits, %ax = low 16 bits, %al = low 8 bits
%rbx
%rcx arg4
%rdx arg3
%rsi "index" register, arg2
%rdi arg1
%rbp "base pointer", not always needed
%rsp stack pointer
%r8 added later, arg5
%r9 arg6
%r10
%r11
%r12
%r13
%r14
%r15
%rip the instruction pointer
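
The argument registers above follow the System V x86-64 calling convention used on Linux. A small function with six integer arguments shows the mapping; the function itself is made up for illustration.

```c
/* Under the System V x86-64 calling convention, the first six integer
 * arguments arrive in the registers noted, and the result leaves in %rax. */
long sum6(long a,   /* %rdi */
          long b,   /* %rsi */
          long c,   /* %rdx */
          long d,   /* %rcx */
          long e,   /* %r8  */
          long f)   /* %r9  */
{
    return a + b + c + d + e + f;   /* returned in %rax */
}
```

Compiling this with gcc -Og -S and reading the generated .s file shows the same register names in the assembly.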