Comp 353/453: Database Programming, Corboy
523, 7:00 Thursdays
Week 12, Apr 20
Read in Elmasri & Navathe (EN)
- Chapter 17, File Structures
- Chapter 18, Indexes
- Chapter 19, Query Optimization
- Chapter 21, Concurrency
Some notes on my version of works_on.php
I decided to come up with a reasonable version of works_on.php. Here
are a few design notes for creating your own version. There is quite a
bit more here than I am expecting you to do.
1. I've added a function simple_query($db, $query, $arg_array) to
lib353pdo.php. It returns a PDOStatement object if the query succeeds,
or a string (the error message) if the query fails. You can test the
return type using the function is_string($res).
2. There are two general page categories involved. The first has a
textbox for the ssn and a drop-down menu for the project, and a table
of employees. You come here on first landing. There are two buttons:
one to "submit SSN & project" and one to "look up employee
projects". The second page is reached if the user has provided an SSN
and project; there are two cases. If there is
a works_on entry for that SSN & project, you get update and delete
buttons; if not, you get a button to insert a new record.
This means in the main dispatch section of the works_on.php file, there are six cases:
if (isset($_POST['submit_ssn_proj']))             // from form 1
else if (isset($_POST['get_employee_projects']))  // from form 1
else if (isset($_POST['workson_new']))            // from form 2
else if (isset($_POST['workson_update']))         // from form 2
else if (isset($_POST['workson_delete']))         // from form 2
else                                              // initial arrival; form 1
Should I have split this up into multiple php files? This is not clear.
Unfortunately, after doing the validation checking sometimes I want to
return to the same form, and other times to switch to the new form. For
example, if one provides an illegal SSN in the first,
'submit_ssn_proj', case, then one ends up back on form 1 with a
message. However, if the SSN was valid, we move to form 2. Thus, from
within the php file we can't tell statically what form we're going to
produce (see also #10, below, for an example where sometimes we call
print_form1 and sometimes print_form2).
I ended up leaving it all in one file.
3. There is very little in the way of conditionals for the three
workson_(insert/delete/update) cases. However, for submit_ssn_proj and
get_employee_projects I had to verify that the SSN was present, and was
the SSN of a valid employee.
4. A method called print_form1 has three parameters: $db, as usual, but
also $message and $last_ssn. The $message is displayed towards the top
of the page; it can be null or the empty string (but my code checks for
that, and if it is, doesn't insert the <p> tags around it). The
$last_ssn is the SSN value to display by default; I should probably use
the SSN value last entered in almost every case, even if it is invalid.
The $message parameter works for both error messages:
print_form1($db, "ERROR: you must choose an employee!", "")
or status messages (see #5 below).
5. In my original version of form 2, I cheated a bit. Depending on
whether the record was to be a new insertion or an update, I had php
variables create strings that defined different html buttons; my code
looked something like this.
$hours = get_hours_employee($db, $ssn, $pnumber);
// returns 0 for no preexisting entry!
if ($hours == 0) {        // new entry
    $wo_name = "workson_new";
    $wo_str  = "new work assignment";
    $delbutton = "";      // no delete button!
} else {                  // update existing entry
    $wo_name = "workson_update";
    $wo_str  = "update work assignment";
    $delbutton = "<input type=\"submit\" name=\"workson_delete\" value=\"delete work assignment\">";
}
print <<<END
...
<input type="submit" name="$wo_name" value="$wo_str">
$delbutton
END;
I eventually decided this was a bad idea, and switched it to the
conventional form: the html is in the form of static strings, but
different strings are printed in the if and the else clauses, eg
if ($hours == 0) {
    print '<input type="submit" name="workson_new" value="new work assignment">';
} else ....
This makes
the code decidedly easier to read.
6. My form1 has a "look up employee projects" button. This generates a
long string, which defines a horizontal table. I then insert this table
into the page using the $message parameter to print_form1:
$projects=get_projects($db, $ssn);
.... error_checking
$projstring = projs_to_table($projects);
print_form1($db, "Employee $ssn: $projstring", $ssn);
The $projstring is of the form "<table><tr><td>proj 1: 15
hours</td><td>proj 10: 25 hours</td></tr></table>". I build it using a
modification of the method of table_format_pdo() in lib353pdo.php.
7. Note how projects are listed in the drop-down menu: as pnumber, then
": ", then pname. Getting a query to return this form is easy using
SQL's concat() function. However, I then had to take the selected
string and extract the project number. Here's sample code:
$pnumber = strstr($pstring, ":", TRUE); // the portion before the first ":"
8. Note how I can verify that an insert/update/delete worked by using the "look up employee projects" button.
9. While I made an effort to keep track of the previously-entered SSN,
I did nothing to preserve the pre-selected project. I probably should
have.
10. When you add error checking, the complexity increases. For example, here's a sketch of my if ($_POST['employee_projects']) code:
if ($ssn == null || $ssn == "") { // no ssn!
print_form1($db, "ERROR: you must choose an employee!", "");
} else if (! is_valid_employee($db, $ssn)) {
print_form1($db, "ERROR: no employee with SSN: $ssn!", $ssn);
} else {
look up $ssn, $pnumber in db
if (query failed) {
print_form1($db, "ERROR: employee-projects query failed!", $ssn);
} else {
do some setup
print_form2($db, $message, $ssn); // $message identifies $ssn & $proj for debugging
}
}
11. Earlier I gave you makeSelectMenuString, which takes a menu name
and an array of options and returns the html for creating a drop-down
menu. Suppose you want a drop-down menu for employee lnames. You can
combine this with simple_query(), above, as follows:
$qresult = simple_query($db, "select lname from employee", array()); // no prepared params
if (is_string($qresult)) die($qresult);
$menuentries = $qresult->fetchAll(PDO::FETCH_COLUMN);
$menustring = makeSelectMenuString("lname_list", $menuentries);
12. Demo in class of some error-checking
Finally, note http://www.php.net/manual/en/features.persistent-connections.php. For our purposes, creating a db connection has negligible overhead.
Ch 18: indexing
It is common for databases to provide indexes for files. An index can be on either a key field or a non-key field; in the latter case it is called a clustering index. The index can either be on a field by which the file is sorted or not. An index can have an entry for every record, in which case it is called dense; if not, it is called sparse.
An index with one entry per distinct value of a nonkey field is always
sparse: if every record had a unique value for that field, the field
would in fact be a key after all.
A file can have multiple indexes, but the file itself can be structured
only for one index. We'll start with that case. The simplest file
structuring for the purpose of indexing is simply to keep the file
sorted on the attribute being indexed; this allows binary search. For a
while, we will also restrict attention to single-level indexes.
Note that keeping a file sorted by the primary key helps in lookup, and
makes range queries (such as invoicenum between 100000 and 120000) more
efficient. However, a good set of indexes can make sorted files
unnecessary, in which case the simple "heap of records" organization
might be used.
Primary Index
A primary index is an index on the primary key of a sorted file. (Note that an index on the primary key is not a "primary index" if the file is not maintained in sorted order on that key.) The
index consists of an ordered list of pairs ⟨k,p⟩, where k is the
first key value to appear in the block pointed to by p (this first
record of each block is sometimes called the anchor record).
To find a value k in the file, we find consecutive ⟨k1,p⟩ and ⟨k2,p+1⟩ where k1≤k<k2; in that case, the record with key k
must be on block p. This is an example of a sparse index. A primary index is usually much smaller than the file itself. See Fig 18.1.
Example 1 on EN6 p 635: 30,000 records, 10 per block, for 3000
blocks. Direct binary search takes 12 block accesses. The index entries
are 9+6 bytes long, so 1024/15 = 68 fit per 1024-byte block. The index has 3000/68 = 45
blocks; binary search requires 6 block accesses, plus one more for the
actual data block itself.
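The arithmetic above can be sketched in a few lines. This is just a back-of-the-envelope cost model, not anything a DBMS actually runs; the function name and parameters are mine, for illustration:

```python
import math

def search_cost(num_records, records_per_block, entry_bytes=15, block_bytes=1024):
    """Block accesses: binary search on the data file vs. lookup via a
    sparse primary index (the numbers of EN6 Example 1)."""
    data_blocks = math.ceil(num_records / records_per_block)   # 3000
    direct = math.ceil(math.log2(data_blocks))                 # 12 accesses
    entries_per_block = block_bytes // entry_bytes             # 68
    index_blocks = math.ceil(data_blocks / entries_per_block)  # 45
    via_index = math.ceil(math.log2(index_blocks)) + 1         # 6, plus the data block
    return direct, via_index

print(search_cost(30000, 10))   # → (12, 7)
```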
Clustering index
We can also imagine the file is ordered on a nonkey field (think Employee.dno). In this case we create a clustering index.
The index structure is the same as before, except now the block pointer
points to the first block that contains any records with that value;
see Fig 18.2. Because the file is ordered by the nonkey field in question, all the other records for the same field value will be adjacent.
Clustering indexes are of necessity sparse. It is not necessary to
include every value of the attribute in the index; we only need the
attribute values that appear first in each block. But there is a
tradeoff: if we skip some attribute values, we will likely want an
index entry for every block instead, and for a nonkey index that may
mean many more entries than one per distinct value. In Fig 18.2 we have
an entry for every distinct value; we could remove the entries for
Dept_number=2 and Dept_number=4.
Another approach to clustering indexes may be seen in Fig 18.3, in
which blocks do not contain records with different cluster values. Note that this is again a method of organizing the file for the purpose of the index.
Secondary Indexes
Now suppose we want to create an index for Employee by (fname, lname),
assumed for the moment to be a secondary key. The record file itself is
ordered by Ssn. An index on a secondary key
will necessarily be dense: since the file isn't ordered by the
secondary key, we cannot use block anchors. A common arrangement is
simply to have the index list ⟨key,block⟩ pairs for every key value
appearing; if there are N records in the file then there will be N
entries in the index, and the only savings is that the index records
are smaller. See Fig 18.4. If B is the number of blocks in the original
file, and BI is the number of blocks in the index, then BI ≤ B, but not
by much, and log(BI) ≃ log(B), the search time. Note, though, that
unindexed search is now linear, because the file is not ordered on the
secondary key.
Example 2, EN6, p 640: 30,000 records, 10 per block. Without an
index, searching takes 1500 blocks on average. Blocks in the index hold
68 records, as before, so the index needs 30,000/68 = 442 blocks;
log2(442) ≃ 9.
Secondary indexes can also be created on nonkey
fields. We can create dense indexes as above, but now with multiple
entries for the indexed attribute, one for each
record (Option 1 in EN6 p 640). A second approach (Option 2 in EN6 p
640) is to have the index consist of a single entry for each value of
the indexed attribute, followed by a list of record pointers.
The third option, perhaps the most common, is for each index entry to point to blocks of record
pointers, as in Fig 18.5. Think of the
employee file ordered by SSN, and we are creating an index on dno.
Given a value for dno, the records with that value will be scattered
throughout the file. So what the index contains is pairs
⟨dno,pointer⟩, where the pointer is to a block that contains a list of
pointers to the actual data blocks that contain employee records with
that value for dno. We don't have to use an entire block for
this purpose; one block might have a list of pointers for several
different values of dno.
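A toy sketch of this option-3 arrangement, with a handful of made-up employee records; the record "pointers" here are just positions in a Python list standing in for block addresses:

```python
from collections import defaultdict

employees = [            # (ssn, lname, dno): file ordered by ssn, not by dno
    ("111", "Smith", 5), ("222", "Wong", 4), ("333", "Zelaya", 5),
    ("444", "Wallace", 4), ("555", "Narayan", 5), ("666", "Borg", 1),
]

# One index entry per dno value, pointing to the list of matching record pointers.
dno_index = defaultdict(list)
for pos, (ssn, lname, dno) in enumerate(employees):
    dno_index[dno].append(pos)

# Given a dno, follow the pointer list to the records scattered through the file:
print([employees[p][1] for p in dno_index[5]])   # → ['Smith', 'Zelaya', 'Narayan']
```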
Hashing
Hashing can be used to create a form of index, even if we do not structure the file that way. Fig 18.15 illustrates an example. We can
use hashing with equality comparisons, but not order comparisons, which
is to say hashing can help us find a record with ssn=123456789, but
not records with salary between 40000 and 50000. Hash indexes are, in
theory, a good index for joins. Consider the join
select e.lname, w.pno, w.hours from employee e, works_on w where e.ssn = w.essn;
We might choose each record e from employee, and want to find all
records in works_on with e.ssn = w.essn. A hash index for works_on, on
the field essn, can be useful here. Note this can work even though
works_on.essn is not a key field.
There is one problem: we are likely to be retrieving blocks of
works_on in semi-random order, which means one disk access for each
record. However, this is still faster than some alternatives.
Multilevel indexes
Perhaps our primary sorted index grows so large that we'd like an index for it. At that point we're creating a multi-level index. To create the ISAM index, we start with the primary
index, with an index entry for the anchor record of each block, or a secondary index, with an entry for each record. Call
this the base level, or first level, of the index. We now create a second level index containing an index entry for the anchor record of each block of the first-level index. See Fig 18.6. This is called an indexed sequential file, or an ISAM file.
This technique works as well on secondary keys, except that the first level is now much larger.
What is involved in inserting
new records into an ISAM structure? The first-level index has to be
"pushed down"; unless we have left space ahead of time, most blocks
will need to be restructured. Then the second and higher-level indexes
will also need to be rebuilt, as anchor records have changed.
Example: we have two records per block, and the first level is
Data file: 1 2 3
5 7 9 20
1st-level index: 1 3 7 20
What happens when we insert 8? 4? 6?
What happens when we "push up a level", that is, add an entry that forces us to have one higher level of index?
EN6 Example 3 (p 644): 30,000 records and can fit 68 index entries per block. The first-level index is 30,000/68 = 442 blocks,
second-level index is 442/68 = 7 blocks; third-level index is 1 block.
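The level-by-level arithmetic generalizes easily; a small sketch (the names are mine, not EN's):

```python
import math

def index_levels(entries, fanout=68):
    """Blocks at each level of a multilevel (ISAM-style) index, bottom
    level first, using the numbers of EN6 Example 3."""
    levels = []
    blocks = math.ceil(entries / fanout)
    while blocks > 1:
        levels.append(blocks)
        blocks = math.ceil(blocks / fanout)
    levels.append(blocks)           # the single top-level block
    return levels

print(index_levels(30000))   # → [442, 7, 1]
```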
B-trees (Bayer trees)
Consider a binary search tree for the moment. We decide which of the two leaf nodes to pursue at each point based on comparison.
We can just as easily build N-ary search trees, with N leaf nodes and N-1 values stored in the node. Consider the ternary example for a moment.
Next, note that we can have a different N at each node!
Bayer trees: let's have a max of 4 data items per node.
1 4 8 9 20 22 25 28
How can we add nodes so as to remain balanced? We also want to minimize partial blocks. A B-tree of order p means that each block has at most p tree pointers, and p-1 key values. In addition, all but the top node has at least (p-1)/2 key values.
"push-up" algorithm: we add a new value to a leaf block. If there is room, we are done. If not, we split the block and push up the middle value to the parent block.
The parent block may now also have to be split.
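Here is a minimal sketch of the push-up insertion algorithm, keys only, with at most 4 keys per node as in the example above; deletion and B+-tree leaf chaining are omitted:

```python
import bisect

MAX_KEYS = 4   # order p = 5: at most 4 key values per node

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty children list => leaf

def insert(node, key):
    """Insert key under node; return (median, new_right_node) if node split."""
    if not node.children:                    # leaf: insert in sorted order
        bisect.insort(node.keys, key)
    else:                                    # internal: descend into the right child
        i = bisect.bisect(node.keys, key)
        split = insert(node.children[i], key)
        if split:                            # child split: absorb pushed-up median
            median, right = split
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) > MAX_KEYS:            # overflow: split, push up the middle key
        mid = len(node.keys) // 2
        right = Node(node.keys[mid+1:], node.children[mid+1:])
        median, node.keys = node.keys[mid], node.keys[:mid]
        node.children = node.children[:mid+1]
        return median, right
    return None

def tree_insert(root, key):
    split = insert(root, key)
    if split:                                # root split: the tree grows one level
        median, right = split
        root = Node([median], [root, right])
    return root

root = Node()
for k in [1, 4, 8, 9, 20, 22, 25, 28, 5, 7]:
    root = tree_insert(root, k)
```

Splitting is handled on the way back up the recursion: a node that overflows returns its median key and new right sibling, and the parent absorbs them (possibly overflowing in turn).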
B+ trees: these are a slight technical improvement where we
replicate all the key values on the bottom level. Inside the tree we
still use key values, but values only rather than full records; this
allows for more values per block (because keys-only are smaller than
full records), and thus a fatter tree. B+-trees are also better suited
for standalone indexes; the original B-tree approach was fundamentally
about a way of organizing the file itself.
Any index in effect defines an iterator
for traversing the set of records. B-trees allow access to the database
in sorted order, though if the file is not kept organized by the key in
question then we must load a separate block for each record.
Databases all support some mechanism of creating indexes, of specific types. For example, MySQL allows
CREATE INDEX indexname ON employee(ssn) USING BTREE; // or USING HASH
Example query (E&N6, p 660):
select e.fname, e.lname from employee e where e.dno = 4 and e.age=59;
If we have an index on dno, then access that cluster using the index and search for age.
If we have an index on age, we can use that.
If we have an index on (dno, age), we have a single lookup! But we also have expensive indexes to maintain.
Indexes can be ordered or not; ordered indexes help us find employees
with e.age >= 59. Ordered and nonordered indexes both support
lookup; ordered indexes also in effect support return of an iterator yielding all DB records starting at the specified point. BTREE and ISAM indexes are ordered; HASH indexes are not.
Indexes v File Organization
What's the difference between creating a hashed file organization and a
hashed index? Or a B-tree file organization versus a B-tree index? An
index needs to stand alone, and allow finding any given record; a
file-organization method generally allows us to retrieve blocks of records at a time.
A B-tree index technically has to be a B+-tree, as a B-tree includes
file blocks at interior levels. The usual B/B+ tree also includes many
related records in single leaf blocks, but that is straightforward to
give up (at the cost of much less efficient traversal).
Query processing
First, note it is relatively easy to sort
records. Typically we will use merge-sort, where we subdivide into
blocks small enough to fit into memory. We sort those with, say,
quicksort. We then make passes until each pair of consecutive blocks is
merged.
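A sketch of the idea, with heapq.merge standing in for the merge passes; a real external sort merges only as many runs at a time as it has memory buffers:

```python
import heapq
import random

def external_sort(records, run_size):
    """Merge-sort sketch: sort memory-sized runs (the in-memory quicksort
    step), then merge the sorted runs."""
    runs = [sorted(records[i:i+run_size]) for i in range(0, len(records), run_size)]
    return list(heapq.merge(*runs))

data = random.sample(range(1000), 100)
assert external_sort(data, run_size=10) == sorted(data)
```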
Select
Some examples
- select * from employee e where e.ssn = '123456789'
- select * from department where dnumber >=5
- select * from employee where dno = 5
- select * from employee where dno = 5 and salary > 30000 and sex = 'F'
- select * from works_on where essn='123456789' and pno=10
Methods of implementation:
Linear search: we read each disk block once.
Binary search: this is an
option if the selection condition is an equality test on a key
attribute, and the file is ordered by that attribute.
Primary index: same requirement as above, plus we need an index
Hash search: equality comparison on any attribute for which we have a hash index (need not be a key attribute)
Primary index and ordered comparison: use the index to find all departments with dnumber > 5
Clustered index (includes hash indexes)
B-tree index
If we are searching for "dno=5 AND salary=4000" we can use an index for either attribute, or a composite index if one exists.
In general, we'd like to choose the condition with the fewest results, and then continue with the remaining conditions.
For OR conditions, we just have to search for each.
Join
Mostly we focus on equijoins:
- select * from employee e, department d where e.dno = d.dnumber
- select * from department d, employee e where d.mgr_ssn = e.ssn
Methods of implementation:
1. Nested-loop join: this is where we think of a for-loop:
for (e in employee) {
for (d in department) {
if (e.dno = d.dnumber) print(e,d);
}
}
This is quadratic. Though note that we can go through both files in block-by-block fashion.
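A sketch of the block-by-block version: the inner file is scanned once per block of the outer file, rather than once per outer record. Table contents here are made up:

```python
def block_nested_loop_join(R, S, key_r, key_s, block_size=2):
    """Block nested-loop join sketch: read R a block at a time; for each
    R-block, scan S once and match every record in the block."""
    out = []
    for i in range(0, len(R), block_size):
        r_block = R[i:i+block_size]
        for s in S:                      # one pass over S per R-block
            for r in r_block:
                if r[key_r] == s[key_s]:
                    out.append((r, s))
    return out

employees = [{"ssn": "1", "dno": 5}, {"ssn": "2", "dno": 4}]
departments = [{"dnumber": 5, "dname": "Research"}, {"dnumber": 4, "dname": "Admin"}]
pairs = block_nested_loop_join(employees, departments, "dno", "dnumber")
print([(e["ssn"], d["dname"]) for e, d in pairs])   # → [('1', 'Research'), ('2', 'Admin')]
```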
2. index method: if we have an index on one of the attributes, we can use it:
for (e in employee) {
d = lookup(department, e.ssn);
if (d != null) print_it(e,d);
}
Note that any index will do, but that this may
involve retrieving several disk blocks for each e and will almost
certainly involve retrieving at least one disk block (from department)
for every e in employee. Still, it's better than Method 1.
3. Sort-merge join: we sort both files on the attribute in question,
and then do a join-merge. This takes a single linear pass. This is most
efficient if the files are already sorted, but note that it's still faster than 1 (and possibly faster than 2) if we have to sort the files.
4. Partition-hash join: Let the relations (record sets) be R and S. We
partition both files into Ri = {r in R | hash(r) = i}. Now we note that
the join R ⋈ S is simply the disjoint union of the Ri ⋈ Si. In most
cases, either Ri or Si will be small enough to fit in memory.
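A sketch with made-up works_on and employee rows; each bucket pair Ri ⋈ Si is joined with an in-memory hash table:

```python
from collections import defaultdict

def partition_hash_join(R, S, key_r, key_s, nbuckets=4):
    """Partition-hash join sketch: partition both inputs by hash of the
    join key; the full join is the disjoint union of per-bucket joins."""
    def partition(recs, key):
        buckets = defaultdict(list)
        for rec in recs:
            buckets[hash(rec[key]) % nbuckets].append(rec)
        return buckets
    Rb, Sb = partition(R, key_r), partition(S, key_s)
    out = []
    for i in Rb:
        built = defaultdict(list)        # in-memory hash table on this R-bucket
        for r in Rb[i]:
            built[r[key_r]].append(r)
        for s in Sb.get(i, []):          # probe with the matching S-bucket
            for r in built.get(s[key_s], []):
                out.append((r, s))
    return out

works_on = [{"essn": 1, "pno": 10}, {"essn": 1, "pno": 20}, {"essn": 2, "pno": 10}]
employees = [{"ssn": 1, "lname": "Smith"}, {"ssn": 2, "lname": "Wong"}]
pairs = partition_hash_join(works_on, employees, "essn", "ssn")
print(len(pairs))   # → 3
```

Note that this works even though works_on.essn is not a key: the hash table maps each key value to a list of matching records.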
A good hash function helps; we want the buckets to be relatively uniform. We won't get that exactly, but we can hope for a Poisson distribution. With N items hashed into K buckets, we expect an average of a = N/K per bucket. The probability that a given bucket holds exactly i items is approximately a^i·e^(−a)/i!, and the expected number of buckets holding exactly i items is about K·a^i·e^(−a)/i!.
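The Poisson estimate is easy to check numerically; a sketch, with N and K chosen arbitrarily:

```python
import math

def expected_bucket_counts(N, K, max_i=8):
    """Expected number of buckets holding exactly i items when N items are
    hashed uniformly into K buckets (Poisson approximation, a = N/K)."""
    a = N / K
    return {i: K * a**i * math.exp(-a) / math.factorial(i) for i in range(max_i + 1)}

counts = expected_bucket_counts(N=1000, K=500)
# With a = 2, buckets of size 1 and size 2 are equally common, and the
# expected counts over all sizes sum back to K.
```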
The join selection factor is the fraction of records that will participate in the join. In the example
select * from department d, employee e where d.mgr_ssn = e.ssn
all departments will be involved in the join, but almost no employees.
So we'd rather go through the departments, looking up managing
employees, rather than the other way around. Book example, EN6 p 693:
we have indexes on both fields. The employee.ssn index takes 4
accesses, and the dept.mgr_ssn index takes two. We have 50 departments
(10 blocks) and 6000 employees (2000 blocks). To do the join by
employee, we have 2000 accesses for the employees, plus, for each
employee, 3 block accesses to find the dept manager. That's 2000 +
6000*3 = 20,000 accesses. In the other order, it's 10 blocks for all
the departments plus for each dept 5 block accesses to find the
manager, for a total of 10 + 5*50 = 260.
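The cost comparison is simple arithmetic:

```python
def join_cost(outer_blocks, outer_records, lookup_cost):
    """Index-join cost sketch: scan the outer file once, then do one index
    lookup per outer record (the EN6 p 693 example)."""
    return outer_blocks + outer_records * lookup_cost

print(join_cost(2000, 6000, 3))   # employee as outer: 20000 accesses
print(join_cost(10, 50, 5))       # department as outer: 260 accesses
```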
In the most common case of joins, the join field is a foreign key in
one file and a primary key in the other. Suppose we keep all files
sorted by their primary key. Then for any join of this type, we can
traverse the primary-key file block by block; for each primary-key
value we need to do a lookup of the FK attribute in the other file.
This would be method 2 above; note that we're making no use of the
primary index.
Query optimization
As with compilers, the word "optimization" is used loosely.
A major rule is to apply select/project before joins, and in general do
operations first that have greatest potential for reducing the size of
the number of records.
Query tree: See fig 19.4.
Select lname from employee e, works_on w, project p
where e.ssn = w.essn and w.pno = p.pnumber and p.pname = 'Aquarius';
See fig 19.5, but note that doing cross-products was never on the agenda. However, it still does help to do the second join first.
Some heuristic rules:
- Move the more-restrictive SELECT operations down towards the leaf nodes; that will reduce number of records.
- Same with PROJECT; try to do joins last
- Do the joins in the right order, using the join selection factor estimate
- Do the smaller joins first (or, more generally, do the joins that
will result in an absolutely smaller set of records, whether this is
because the relations were small to start with or because one condition
was highly selective).
Transactions
A transaction is a set of operations, which we can idealize as read(X), write(X), which at the end we either commit (save) or rollback/abort (discard). Generally we hope rollbacks are rare.
Abstract transactions: operations are read(X) and write(X), for X a field/record/block/table.
Fig 21.2: two example transactions.
T1: read(X), write(X), read(Y), write(Y)
T2: read(X), write(X)
Ideally, we execute the transactions one at a time, with no overlap. In
practice, because of the considerable I/O-wait for disk retrieval, we
cannot afford that. Transactions will overlap. We need to figure out how to arrange for this to happen safely. Here are two problems:
Lost-update (write/write)
problem: this occurs when one transaction writes a value that another
then overwrites. See Fig 21.3a. T1 is transferring N dollars from X to
Y, while T2 is adding M dollars to X. T2.write(X) overwrites T1.write(X).
T1                      T2
read(X)
X = X-N
                        read(X)
                        X = X+M
write(X)
read(Y)
                        write(X)
Y = Y+N
write(Y)
Dirty-Read (read/write) problem: this occurs when one transaction reads
an item that another then overwrites. This would end up as a
lost-update, but consider Fig 21.3b where T1 actually aborts after
write(X).
T1                      T2
read(X)
X = X-N
write(X)
                        read(X)
                        X = X+M
                        write(X)
read(Y)
abort/rollback
Fig 21.3c shows a corrupted result; an abort would have been an improvement!
The system log, or journal,
is a master list of all operations performed, used in the event of
rollback. Failures leading to rollback may be system crash, lack of
disk space, lookup error (eg SQL error), locking problems, or other
physical or logical errors.
Items read here can be individual record attributes, entire records, or
entire tables. When joining tables, the entire table is read.
The ACID test
This is commonly described as the following set of features.
- Atomicity: either all steps of a transaction are performed, or
none are. We either commit the entire thing or rollback completely.
- Consistency preservation: the individual transaction should preserve database consistency and database invariants
- Isolation: transactions should appear to be executing in isolation.
- Durability: committed transactions survive DB crashes
Note that Durability is arguably implied by Atomicity, or at least
the
two together are implied by some greater "recovery" rule. There are
other redundancies here as well; the main points for transactions are
atomicity and isolation, with the latter understood as a special case
of correctness.
For example, note that Consistency is closely related to Isolation: if each
transaction preserves consistency, and the transactions appear to
execute in Isolation, then a series of transactions will be Consistency
preserving. Consistency would be implied by Serializability: that the
transactions appeared
to have been executed in some serial order. However, this might be a
stronger condition, if we really had a good definition for isolation.
Moral: beware of choosing your terms to fit a clever acronym.
Transaction schedules
Transaction schedules are lists of read/write operations of each
transaction, plus commit/abort. The operations of each individual
transaction must be in their original order.
Eventually we will enforce schedules by using locking, but for now we just consider schedules abstractly.
Examples:
Sa: Fig 21.3a: r1(X), r2(X), w1(X), r1(Y), w2(X), w1(Y), c1, c2
Sb: Fig 21.3b: r1(X), w1(X), r2(X), w2(X), r1(Y), a1
Two operations in a schedule conflict (where the word is to be taken as indicating potential conflict) if:
- They belong to different transactions
- They access the same data item (eg X)
- At least one of the operations is a write()
For example, in Sa, r1(X) and w2(X) conflict. So do r2(X) and w1(X). What does not conflict? r1(X) and r2(X), r1(X) and w1(X).
Conflicting operations are those where interchanging the order can result in a different outcome. A conflict is read-write if it involves a read(X) in one transaction and write(X) in the other, and write-write if it involves two transactions each doing write(X).
A complete schedule is a total order of all the operations, including abort/commits. (The book allows some partial ordering of irrelevant parts.)
Given a schedule S, the committed projection
C(S) is those operations of S that are part of transactions that have
committed. This is sometimes useful in analysing schedules when
transactions are continuously being added.
We can build a transaction precedence graph
based on schedule conflict, given a schedule S. Let us say Ti ⟶
Tj if Ti and Tj conflict as above, and one of the following occurs:
- Ti executes write(X) and later Tj executes read(X)
- Ti executes read(X) and later Tj executes write(X)
- Ti executes write(X) and later Tj executes write(X)
These are the three specific cases of conflict that can occur; note
that at this level we distinguish between read-write and write-read.
Demo: build graphs for Fig 21.3(a), (b). Note that the figure determines the schedule.
Recoverable and nonrecoverable schedules
Given two transactions T and T', we say T reads from T', or T depends on
T', if T' does a write(X) and then later T does a read(X) (this is Case
1 of T'⟶T above). Thus the outcome of T may depend, at least in part,
on the earlier T'.
A schedule is recoverable if no transaction T can commit until all T' where T reads from T' have already committed. Otherwise a schedule is nonrecoverable.
The point is that if T reads from T', then if T' aborts, T must abort
too. So T waits to commit until T' commits. Given a recoverable
schedule, we never have to abort a committed transaction. This is, in practice, an essential rule.
Consider the following modification of Sa above:
S'a: r1(X), r2(X), w1(X), r1(Y), w2(X), c2, w1(Y), c1.
This is recoverable even though there is a risk of lost update. But now consider
Sc: r1(X), w1(X), r2(X), r1(Y), w2(X), c2, a1. Nonrecoverable: T2 reads X from T1, but T2 commits first.
Sd: r1(X), w1(X), r2(X), r1(Y), w2(X), w1(Y), c1, c2. Recoverable, because T2 waited to commit until T1 did.
Suppose T1 aborts in Sd. Then T2 must abort as well, and we have Se:
Se: r1(X), w1(X), r2(X), r1(Y), w2(X), w1(Y), a1, a2
Though recoverability implies no committed transaction needs to be rolled back, we still may have cascading rollback. A schedule is said to avoid cascading rollback if every transaction T only reads items that were earlier written by committed
transactions. In this case, an abort of some other transaction will not
cause T to abort. How can we achieve this in Sd/Se? By delaying
r2(X) until after c1.
There is also a notion of strict
schedule, in which transactions do not use X for either read or write
until all transactions that wrote X have committed. This is more
expensive to require.
strict implies cascadeless implies recoverable, but not the reverse.
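A recoverability check can be sketched directly from the definition. Schedules are encoded as (transaction, operation, item) triples; this sketch ignores aborts when computing reads-from, which is fine for the examples above:

```python
from collections import defaultdict

def reads_from(schedule):
    """schedule: list of (txn, op, item), op in {'r','w','c','a'}; item is
    None for c/a. Returns a map txn -> set of txns whose written value it read."""
    last_writer, rf = {}, defaultdict(set)
    for txn, op, item in schedule:
        if op == 'w':
            last_writer[item] = txn
        elif op == 'r' and last_writer.get(item) not in (None, txn):
            rf[txn].add(last_writer[item])
    return rf

def recoverable(schedule):
    """Recoverable: no transaction commits before everyone it read from has."""
    rf, committed = reads_from(schedule), set()
    for txn, op, _ in schedule:
        if op == 'c':
            if rf[txn] - committed:      # read from a not-yet-committed transaction
                return False
            committed.add(txn)
    return True

Sc = [('T1','r','X'), ('T1','w','X'), ('T2','r','X'), ('T1','r','Y'),
      ('T2','w','X'), ('T2','c',None), ('T1','a',None)]
Sd = [('T1','r','X'), ('T1','w','X'), ('T2','r','X'), ('T1','r','Y'),
      ('T2','w','X'), ('T1','w','Y'), ('T1','c',None), ('T2','c',None)]
print(recoverable(Sc), recoverable(Sd))   # → False True
```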
Consider the following three transactions:
T1: X+=10, Y*=1.1;
T2: Y+=10, Z *= 1.1;
T3: Z+=10, X *= 1.1;
What happens if we do these in order T1, T2, T3? Any serial order gives an acceptable result, though different serial orders can give different values.
What happens if we do all the additions and then all the
multiplications? Note that, if we defer the commits, this can be a
recoverable schedule. And there are no lost updates. But it is still not equivalent to any serial schedule.
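The claim can be checked numerically; a sketch with starting values X = Y = Z = 100 (my choice, for illustration):

```python
def run(ops, state=None):
    """Apply a flat list of (var, kind, amount) operations, kind 'add' or 'mul'."""
    state = state or {'X': 100.0, 'Y': 100.0, 'Z': 100.0}
    for var, kind, amt in ops:
        state[var] = state[var] + amt if kind == 'add' else state[var] * amt
    return state

T1 = [('X', 'add', 10), ('Y', 'mul', 1.1)]
T2 = [('Y', 'add', 10), ('Z', 'mul', 1.1)]
T3 = [('Z', 'add', 10), ('X', 'mul', 1.1)]

serial = run(T1 + T2 + T3)                    # X ≈ 121, Y ≈ 120, Z ≈ 120
adds_then_muls = run([t[0] for t in (T1, T2, T3)] + [t[1] for t in (T1, T2, T3)])
# adds_then_muls gives X ≈ Y ≈ Z ≈ 121: not the result of any serial order.
```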
Serializability
The ideal situation with transactions is that they are executed
serially, but that is too inefficient. Instead, we look for schedules
that are in some sense equivalent
to a serial schedule. In Fig 21.5, (a) and (b) show two serial
schedules. Schedule (c) has a lost update; schedule (d) is ok (in a
sense to be defined below) because we can move T2 to the end of T1.
A schedule is serializable if it is equivalent to a serial schedule. For equivalence, we can use result equivalence, but that is intractable. Instead we use conflict equivalence. Two schedules are conflict-equivalent if the order of any two conflicting operations is the same in both schedules. We can reorder the other (nonconflicting) parts to get what we need.
Conflict-serializability algorithm
Build the transaction precedence (directed) graph above. If
the graph has no cycles, it is conflict-serializable. Do a topological
sort (ie find a total order of the nodes consistent with the Ti⟶Tj
ordering); the original schedule is conflict-equivalent to the schedule
of executing the transactions in that order.
The actual algorithm is as follows: first, find all Ti that do not have
any predecessors, ie Tk with Tk⟶Ti. Do these first, and remove all
Ti⟶Tj links. Now repeat, until there are no more nodes. If at some
stage there are no Ti without predecessors, then there must be a cycle.
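A sketch of the whole procedure: build the precedence graph from the conflicts, then repeatedly remove nodes without predecessors:

```python
from collections import defaultdict

def precedence_graph(schedule):
    """Edges Ti -> Tj for conflicting operations: different txns, same item,
    at least one write, Ti's operation first. schedule: (txn, op, item) triples."""
    edges = defaultdict(set)
    ops = [(t, o, x) for t, o, x in schedule if o in ('r', 'w')]
    for i, (ti, oi, xi) in enumerate(ops):
        for tj, oj, xj in ops[i+1:]:
            if ti != tj and xi == xj and 'w' in (oi, oj):
                edges[ti].add(tj)
    return edges

def serial_order(schedule):
    """Topological sort of the precedence graph, or None if there is a cycle
    (i.e. the schedule is not conflict-serializable)."""
    edges = precedence_graph(schedule)
    txns = {t for t, o, x in schedule}
    order = []
    while txns:
        free = [t for t in txns if not any(t in edges[u] for u in txns if u != t)]
        if not free:
            return None                  # no node without a predecessor: a cycle
        order += sorted(free)
        txns -= set(free)
    return order

Sa = [('T1','r','X'), ('T2','r','X'), ('T1','w','X'), ('T1','r','Y'),
      ('T2','w','X'), ('T1','w','Y')]
print(serial_order(Sa))   # → None  (T1⟶T2 and T2⟶T1: a cycle)
```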
Debit-credit transactions
These are transactions built from pieces of the form read(X), X+=N,
write(X). We do not have to serialize these! As long as we avoid the
lost-update problem, we can interleave increments and decrements from
different transactions without harm, because addition is commutative.
The schedule below is not conflict-serializable, but it yields the same
result, namely X = X+N−M, Y=Y+M−N.
T1                      T2
read(X)
X += N
write(X)
                        read(Y)
                        Y += M
                        write(Y)
read(Y)
Y −= N
write(Y)
                        read(X)
                        X −= M
                        write(X)
Locking
We will only consider relatively simple locks; complex locks tend to be inefficient. A binary lock has two states: locked and unlocked. We can implement a per-field/record/block/table lock with a locking table; at any one time, we are likely to have relatively few locks.
When we lock something, we may have to wait. See Fig 22.1. Ideally, we wait in a queue on the lock.
The actual inner details of lock() and unlock() must be atomic. On multi-processor systems, it is not enough simply to disable interrupts; we must actually involve some kernel locking.
A straightforward improvement on binary locks is read-write locks:
write locks imply exclusive access, but several transactions can
read-lock the same data value concurrently. Note that if we intend to
write X eventually, then we will need to write-lock X before reading X; otherwise we may have lost updates.
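A minimal Python sketch of a read-write lock table (nonblocking "try" form; an actual lock manager would put the requester on a wait queue instead of returning False):

```python
class LockTable:
    """Per-item read/write locks. Several transactions may hold the read
    lock on an item; the write lock is exclusive."""
    def __init__(self):
        self.readers = {}   # item -> set of txns holding a read lock
        self.writer  = {}   # item -> txn holding the write lock, if any

    def try_read_lock(self, txn, item):
        w = self.writer.get(item)
        if w is not None and w != txn:
            return False                      # someone else holds the write lock
        self.readers.setdefault(item, set()).add(txn)
        return True

    def try_write_lock(self, txn, item):
        w = self.writer.get(item)
        other_readers = self.readers.get(item, set()) - {txn}
        if (w is not None and w != txn) or other_readers:
            return False                      # any other lock blocks a write
        self.writer[item] = txn               # upgrade allowed if txn is sole reader
        return True

    def unlock(self, txn, item):
        self.readers.get(item, set()).discard(txn)
        if self.writer.get(item) == txn:
            del self.writer[item]
```

Note the upgrade path: a transaction that already holds the only read lock on X may convert it to a write lock; this is exactly the read-then-write pattern mentioned above.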
Two-phase locking
Two-phase locking is a simple protocol in which all locking operations
(including upgrades of read-lock(X) to write-lock(X)) precede the first
unlock operation. The locking phase is called the expanding phase; the unlock is the shrinking phase.
If all transactions use two-phase locking, then any schedule will be conflict-serializable.
In fig 22.3(a), the two-phase locking protocol is not followed. The schedule in 22.3(c) is allowed by the locks, but it is not serializable.
Fig 22.4 shows the same transactions using the two-phase locking protocol. Here, however, we have the potential for deadlock.
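One way to picture the protocol is as a per-transaction check: once the shrinking phase begins, further lock requests are errors. A hypothetical Python sketch:

```python
class TwoPhaseViolation(Exception):
    pass

class Transaction2PL:
    """Enforces basic two-phase locking: every lock (or upgrade) must
    precede the first unlock; a lock request after any unlock raises."""
    def __init__(self, name):
        self.name = name
        self.shrinking = False   # becomes True at the first unlock
        self.held = set()

    def lock(self, item):
        if self.shrinking:
            raise TwoPhaseViolation(
                "%s: lock(%s) requested after an unlock" % (self.name, item))
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True    # shrinking phase has begun
        self.held.discard(item)

t = Transaction2PL('T1')
t.lock('X'); t.lock('Y')   # expanding phase
t.unlock('X')              # shrinking phase begins
# t.lock('Z') would now raise TwoPhaseViolation
```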
Two-phase locking comes in flavors. The version above is called basic. We also have:
- conservative:
transactions must lock all the items to be accessed at the start of the
transaction. This is deadlock-free, but sometimes the transaction does
not know every item to be read until partway through.
- strict: no write locks are
released until commit/rollback. This means that no other transaction
that reads from this one can do so until the first one has committed.
This leads to a strict schedule, which is then cascadeless. Deadlocks may occur.
- rigorous: no locks at all
are released until commit/rollback. Rigorous two-phase locking is sort
of a mirror image of conservative two-phase locking.
Deadlock
Locking protocols generally introduce the risk of deadlock:
T1                      T2
write_lock(X)
                        write_lock(Y)
write_lock(Y)
                        write_lock(X)
...                     ...
Perhaps the simplest strategy for deadlock prevention is to lock
everything according to one master order. This is impractical for the
programmer to enforce; the database can only do it if we ask for all
the locks together: write_lock(X,Y,Z).
In practical terms, we also have to deal with deadlock detection,
and then decide which transaction will die and which will wait. Note
that it soon becomes obvious in practice that deadlock has occurred:
any transaction that waits for more than a few seconds on a lock, and
is not moving up in the queue for that lock, is likely deadlocked. More
formally, we can construct a wait-for graph, and look for cycles.
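A sketch of cycle detection in the wait-for graph (Python, depth-first search; the node and edge names are made up):

```python
def has_deadlock(waits_for):
    """waits_for: dict mapping a transaction to the set of transactions
    it is blocked on. Deadlock iff the wait-for graph has a cycle."""
    color = {}                        # node -> 'gray' (on stack) or 'black' (done)

    def visit(t):
        color[t] = 'gray'
        for u in waits_for.get(t, ()):
            if color.get(u) == 'gray':
                return True           # back edge: we reached an ancestor -> cycle
            if color.get(u) is None and visit(u):
                return True
        color[t] = 'black'
        return False

    return any(color.get(t) is None and visit(t) for t in waits_for)

# The deadlock above: T1 waits for T2 (holds Y), T2 waits for T1 (holds X).
assert has_deadlock({'T1': {'T2'}, 'T2': {'T1'}})
assert not has_deadlock({'T1': {'T2'}, 'T2': set()})
```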
Timestamps
An alternative to locks entirely is the use of timestamps. We assign
each transaction a timestamp, and then only allow schedules that are
equivalent to executing the transactions in timestamp order. To do
this, each database item has associated with it a read-timestamp and a
write-timestamp; the former is the latest timestamp of all transactions
that have read X, and the latter is the latest timestamp of all
transactions that have written X. If a transaction T attempts read(X),
we first compare T's timestamp with write-timestamp(X). If T's timestamp
is earlier than write-timestamp(X), then a younger transaction has
already written X, so we abort T and put it back in the transaction
queue with a later timestamp; otherwise the read proceeds, and we update
read-timestamp(X) if necessary. Similarly, write(X) is rejected if T's
timestamp is earlier than either read-timestamp(X) or write-timestamp(X).
This protocol can involve cascading rollback, so further constraints are
needed to get it to generate cascadeless or strict schedules (or even
recoverable schedules, for that matter).
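Here is a sketch of the basic timestamp-ordering checks (Python, hypothetical names): a read is rejected if a younger transaction has already written the item, and a write is rejected if a younger transaction has already read or written it:

```python
class Abort(Exception):
    """Transaction must be restarted with a later timestamp."""
    pass

class Item:
    def __init__(self):
        self.read_ts = 0     # latest timestamp among readers
        self.write_ts = 0    # latest timestamp among writers

def to_read(ts, item):
    """Reject the read if a younger transaction already wrote the item."""
    if ts < item.write_ts:
        raise Abort
    item.read_ts = max(item.read_ts, ts)

def to_write(ts, item):
    """Reject the write if a younger transaction already read or wrote it."""
    if ts < item.read_ts or ts < item.write_ts:
        raise Abort
    item.write_ts = ts

x = Item()
to_write(5, x)       # transaction with timestamp 5 writes X
to_read(7, x)        # timestamp 7 reads X: fine, 7 >= 5
# to_write(6, x) would raise Abort: a younger transaction (7) already read X
```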
Transactions in PHP
See http://php.net/manual/en/pdo.transactions.php. Here's an example from that page:
try {
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->beginTransaction();
    $db->exec("insert into staff (id, first, last) values (23, 'Joe', 'Bloggs')");
    $db->exec("insert into salary (id, amount) values (23, 50000)");
    $db->commit();
} catch (Exception $e) {
    $db->rollBack();
    echo "Failed: " . $e->getMessage();
}
Bear in mind that you should not interpose any wait for user input
between the beginTransaction() and the commit(); otherwise you may be
holding a lock "indefinitely".
Basics of forms
A form is enclosed between <form method="post" action="program">
and </form> tags. The usual value for action is "", meaning to
re-invoke the same program (in our case, a php program), but now with posted data values.
Inside the form you can have input items, most written as <input type="input-type" ... >. The input types include
- text (one-line text boxes)
- button (clickable buttons)
- radio (radio buttons)
- checkbox (checkboxes)
- submit (submit buttons, semantically different from ordinary buttons)
- more
Two other common controls are separate elements rather than <input> types:
- <select>: menus with <option> entries; <select ... multiple> allows selecting several items simultaneously
- <textarea>: scrolling text boxes
Here's a big form. You can see it in action at form1.html.
<form action="form1.cgi" method="get" enctype="multipart/form-data">
<input type = "button" value="click me!" name="b1">
<input type = "button" value="me too!" name="b2">
<p> Are we having fun yet?
<center>
<input type="radio" name="fun" value = "yes">Yep!
<input type="radio" name="fun" value = "no">Nope!
<input type="radio" name="fun" value = "maybe" CHECKED>I dunno!
</center>
<p><b>Why are you doing this?</b>
<input type = "checkbox" name="cbox" value="credit" CHECKED>For course credit
<input type = "checkbox" name="cbox" value="money" CHECKED>For the money!
<br>
<p><b>Who are you?</b>
<input type = "text" name="name1" value="I dunno" size=20>
<p>Now for a menu:
<p>
<select size = 1 name="menu1">
<option> first item
<option> second item
<option> number 3!
<option> fourth
<option> last
</select>
<p> now with size=4</p>
<select size = 4 name="menu2" multiple>
<option> first item
<option> second item
<option> number 3!
<option> fourth
<option> last
</select>
<p>Here's a text box</p>
<textarea name="mybox" rows=5 cols = 30>
This message will be in the box. It's plain text, <b>not</b> html!
I'm going to make it long enough that a scrollbar appears,
through the miracle of cut-and-paste.
I'm going to make it long enough that a scrollbar appears,
through the miracle of cut-and-paste.
</textarea>
<p>
<input type="submit">
</form>
Understand the form action, and the idea that one page can have multiple <form ...> ... </form> objects.