Comp 353/453: Database Programming, Corboy
523, 7:00 Thursdays
Week 12, Apr 20
Read in Elmasri & Navathe (EN)
- Chapter 17, File Structures
- Chapter 18, Indexes
- Chapter 19, Query Optimization
- Chapter 21, Concurrency
Some notes on my version of works_on.php
I decided to come up with a reasonable version of works_on.php. Here
are a few design notes for creating your own version. There is quite a
bit more here than I am expecting you to do.
1. I've added a function simple_query($db, $query, $arg_array) to
lib353pdo.php. It returns a PDOStatement object if the query succeeds,
or a string (the error message) if the query fails. You can test the
return type using the function is_string($res).
2. There are two general page categories involved. The first has a
textbox for the ssn and a drop-down menu for the project, and a table
of employees. You come here on first landing. There are two buttons:
one to "submit SSN & project" and one to "look up employee
projects". The second page is reached if the user has provided an SSN
and project; there are two cases. If there is
a works_on entry for that SSN & project, you get update and delete
buttons; if not, you get a button to insert a new record.
This means in the main dispatch section of the works_on.php file, there are six cases:
if (isset($_POST['submit_ssn_proj']))             // from form 1
else if (isset($_POST['get_employee_projects']))  // from form 1
else if (isset($_POST['workson_new']))            // from form 2
else if (isset($_POST['workson_update']))         // from form 2
else if (isset($_POST['workson_delete']))         // from form 2
else                                              // initial arrival; form 1
Should I have split this up into multiple php files? This is not clear.
Unfortunately, after doing the validation checking sometimes I want to
return to the same form, and other times to switch to the new form. For
example, if one provides an illegal SSN in the first,
'submit_ssn_proj', case, then one ends up back on form 1 with a
message. However, if the SSN was valid, we move to form 2. Thus, from
within the php file we can't tell statically what form we're going to
produce (see also #10, below, for an example where sometimes we call
print_form1 and sometimes print_form2).
I ended up leaving it all in one file.
3. There is very little in the way of conditionals for the three
workson_(insert/delete/update) cases. However, for submit_ssn_proj and
get_employee_projects I had to verify that the SSN was present, and was
the SSN of a valid employee.
4. A method called print_form1 has three parameters: $db, as usual, but
also $message and $last_ssn. The $message is displayed towards the top
of the page; it can be null or the empty string (but my code checks for
that, and if it is, doesn't insert the <p> tags around it). The
$last_ssn is the SSN value to display by default; I should probably use
the SSN value last entered in almost every case, even if it is invalid.
The $message parameter works for both error messages:
print_form1($db, "ERROR: you must choose an employee!", "")
or status messages (see #5 below).
5. In my original version of form 2, I cheated a bit. Depending on
whether the record was to be a new insertion or an update, I had php
variables create strings that defined different html buttons; my code
looked something like this.
$hours = get_hours_employee($db, $ssn, $pnumber);
// returns 0 for no preexisting entry!
if ($hours == 0) {        // new entry
    $wo_name = "workson_new";
    $wo_str  = "new work assignment";
    $delbutton = "";      // no delete button!
} else {                  // update existing entry
    $wo_name = "workson_update";
    $wo_str  = "update work assignment";
    $delbutton = "<input type=\"submit\" name=\"workson_delete\" value=\"delete work assignment\">";
}
print <<<END
...
<input type="submit" name="$wo_name" value="$wo_str">
$delbutton
END;
I eventually decided this was a bad idea, and switched it to the
conventional form: the html is in the form of static strings, but
different strings are printed in the if and the else clauses, eg
if ($hours == 0) {
    print '<input type="submit" name="workson_new" value="new work assignment">';
} else ....
This makes
the code decidedly easier to read.
6. My form1 has a "look up employee projects" button. This generates a
long string, which defines a horizontal table. I then insert this table
into the page using the $message parameter to print_form1:
$projects=get_projects($db, $ssn);
.... error_checking
$projstring = projs_to_table($projects);
print_form1($db, "Employee $ssn: $projstring", $ssn);
The $projstring is of the form "<table><tr><td>proj 1: 15
hours</td><td>proj 10: 25 hours</td></tr></table>". I build it using a
modification of the method of table_format_pdo() in lib353pdo.php.
7. Note how projects are listed in the drop-down menu: as pnumber, then
": ", then pname. Getting a query to return this form is easy using
SQL's concat() function. However, I then had to take the selected
string and extract the project number. Here's sample code:
$pnumber = strstr($pstring, ":", TRUE); // the portion before the first ":"
8. Note how I can verify that an insert/update/delete worked by using the "look up employee projects" button.
9. While I made an effort to keep track of the previously-entered SSN,
I did nothing to preserve the pre-selected project. I probably should
have.
10. When you add error checking, the complexity increases. For example, here's a sketch of my if ($_POST['employee_projects']) code:
if ($ssn == null || $ssn == "") { // no ssn!
print_form1($db, "ERROR: you must choose an employee!", "");
} else if (! is_valid_employee($db, $ssn)) {
print_form1($db, "ERROR: no employee with SSN: $ssn!", $ssn);
} else {
look up $ssn, $pnumber in db
if (query failed) {
print_form1($db, "ERROR: employee-projects query failed!", $ssn);
} else {
do some setup
print_form2($db, $message, $ssn); // $message identifies $ssn & $proj for debugging
}
}
11. Earlier I gave you makeSelectMenuString, which takes a menu name
and an array of options and returns the html for creating a drop-down
menu. Suppose you want a drop-down menu for employee lnames. You can
combine this with simple_query(), above, as follows:
$qresult = simple_query($db, "select lname from employee", array()); // no prepared params
if (is_string($qresult)) die($qresult);
$menuentries = $qresult->fetchAll(PDO::FETCH_COLUMN);
$menustring = makeSelectMenuString("lname_list", $menuentries);
12. Demo in class of some error-checking
Finally, note http://www.php.net/manual/en/features.persistent-connections.php. For our purposes, creating a db connection has negligible overhead.
Ch 18: indexing
It is common for databases to provide indexes for files. An index can be on either a key field or a non-key field; in the latter case it is called a clustering index. The index can either be on a field by which the file is sorted or not. An index can have an entry for every record, in which case it is called dense; if not, it is called sparse.
An index with one entry per distinct value of a nonkey field is always
sparse: if every record had a unique value for that field, the field
would in fact be a key after all.
A file can have multiple indexes, but the file itself can be structured
only for one index. We'll start with that case. The simplest file
structuring for the purpose of indexing is simply to keep the file
sorted on the attribute being indexed; this allows binary search. For a
while, we will also restrict attention to single-level indexes.
Note that keeping a file sorted by the primary key helps in lookup, and
makes range queries (such as invoicenum between 100000 and 120000) more
efficient. However, a good set of indexes can make sorted files
unnecessary, in which case the simple "heap of records" organization
might be used.
Primary Index
A primary index is an index on the primary key of a sorted file. (Note that an index on the primary key is not a "primary index" if the file is not maintained in sorted order on that key.) The
index consists of an ordered list of pairs ⟨k,p⟩, where k is the
first key value to appear in the block pointed to by p (this first
record of each block is sometimes called the anchor record).
To find a value k in the file, we find consecutive ⟨k1,p⟩ and ⟨k2,p+1⟩ where k1≤k<k2; in that case, the record with key k
must be on block p. This is an example of a sparse index. A primary index is usually much smaller than the file itself. See Fig 18.1.
Example 1 on EN6 p 635: 30,000 records, 10 per block, for 3000
blocks. Direct binary search takes 12 block accesses. The index entries
are 9+6 bytes long, so 1024/15 = 68 fit per 1024-byte block. The index has 3000/68 = 45
blocks; binary search requires 6 block accesses, plus one more for the
actual data block itself.
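The arithmetic above can be sketched in a few lines. This is just a back-of-the-envelope cost model, not anything a DBMS actually runs; the function name and parameters are mine, for illustration:

```python
import math

def search_cost(num_records, records_per_block, entry_bytes=15, block_bytes=1024):
    """Block accesses: binary search on the data file vs. lookup via a
    sparse primary index (the numbers of EN6 Example 1)."""
    data_blocks = math.ceil(num_records / records_per_block)   # 3000
    direct = math.ceil(math.log2(data_blocks))                 # 12 accesses
    entries_per_block = block_bytes // entry_bytes             # 68
    index_blocks = math.ceil(data_blocks / entries_per_block)  # 45
    via_index = math.ceil(math.log2(index_blocks)) + 1         # 6, plus the data block
    return direct, via_index

print(search_cost(30000, 10))   # → (12, 7)
```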
Clustering index
We can also imagine the file is ordered on a nonkey field (think Employee.dno). In this case we create a clustering index.
The index structure is the same as before, except now the block pointer
points to the first block that contains any records with that value;
see Fig 18.2. Because the file is ordered by the nonkey field in question, all the other records for the same field value will be adjacent.
Clustering indexes are of necessity sparse. It is not necessary to
include every value of the attribute in the index; we only need the
attribute values that appear first in each block. But there is a
tradeoff: if we skip some attribute values, we will likely want an
index entry for every block instead, and for a nonkey index that may
mean many more entries than one per distinct value. In Fig 18.2 we have
an entry for every distinct value; we could remove the entries for
Dept_number=2 and Dept_number=4.
Another approach to clustering indexes may be seen in Fig 18.3, in
which blocks do not contain records with different cluster values. Note that this is again a method of organizing the file for the purpose of the index.
Secondary Indexes
Now suppose we want to create an index for Employee by (fname, lname),
assumed for the moment to be a secondary key. The record file itself is
ordered by Ssn. An index on a secondary key
will necessarily be dense: since the file isn't ordered by the
secondary key, we cannot use block anchors. A common arrangement is
simply to have the index list ⟨key,block⟩ pairs for every key value
appearing; if there are N records in the file then there will be N
entries in the index, and the only savings is that the index records
are smaller. See Fig 18.4. If B is the number of blocks in the original
file, and BI is the number of blocks in the index, then BI ≤ B, but not
by much, and log(BI) ≃ log(B), the search time. Note, though, that
unindexed search is now linear, because the file is not ordered on the
secondary key.
Example 2, EN6, p 640: 30,000 records, 10 per block. Without an
index, searching takes 1500 blocks on average. Blocks in the index hold
68 records, as before, so the index needs 30,000/68 = 442 blocks;
log2(442) ≃ 9.
Secondary indexes can also be created on nonkey
fields. We can create dense indexes as above, but now with multiple
entries for the indexed attribute, one for each
record (Option 1 in EN6 p 640). A second approach (Option 2 in EN6 p
640) is to have the index consist of a single entry for each value of
the indexed attribute, followed by a list of record pointers.
The third option, perhaps the most common, is for each index entry to point to blocks of record
pointers, as in Fig 18.5. Think of the
employee file ordered by SSN, and we are creating an index on dno.
Given a value for dno, the records with that value will be scattered
throughout the file. So what the index contains is pairs
⟨dno,pointer⟩, where the pointer is to a block that contains a list of
pointers to the actual data blocks that contain employee records with
that value for dno. We don't have to use an entire block for
this purpose; one block might have a list of pointers for several
different values of dno.
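A toy sketch of this option-3 arrangement, with a handful of made-up employee records; the record "pointers" here are just positions in a Python list standing in for block addresses:

```python
from collections import defaultdict

employees = [            # (ssn, lname, dno): file ordered by ssn, not by dno
    ("111", "Smith", 5), ("222", "Wong", 4), ("333", "Zelaya", 5),
    ("444", "Wallace", 4), ("555", "Narayan", 5), ("666", "Borg", 1),
]

# One index entry per dno value, pointing to the list of matching record pointers.
dno_index = defaultdict(list)
for pos, (ssn, lname, dno) in enumerate(employees):
    dno_index[dno].append(pos)

# Given a dno, follow the pointer list to the records scattered through the file:
print([employees[p][1] for p in dno_index[5]])   # → ['Smith', 'Zelaya', 'Narayan']
```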
Hashing
Hashing can be used to create a form of index, even if we do not structure the file that way. Fig 18.15 illustrates an example. We can
use hashing with equality comparisons, but not order comparisons, which
is to say hashing can help us find a record with ssn=123456789, but
not records with salary between 40000 and 50000. Hash indexes are, in
theory, a good index for joins. Consider the join
select e.lname, w.pno, w.hours from employee e, works_on w where e.ssn = w.essn;
We might choose each record e from employee, and want to find all
records in works_on with e.ssn = w.essn. A hash index for works_on, on
the field essn, can be useful here. Note this can work even though
works_on.essn is not a key field.
There is one problem: we are likely to be retrieving blocks of
works_on in semi-random order, which means one disk access for each
record. However, this is still faster than some alternatives.
Multilevel indexes
Perhaps our primary sorted index grows so large that we'd like an index for it. At that point we're creating a multi-level index. To create the ISAM index, we start with the primary
index, with an index entry for the anchor record of each block, or a secondary index, with an entry for each record. Call
this the base level, or first level, of the index. We now create a second level index containing an index entry for the anchor record of each block of the first-level index. See Fig 18.6. This is called an indexed sequential file, or an ISAM file.
This technique works as well on secondary keys, except that the first level is now much larger.
What is involved in inserting
new records into an ISAM structure? The first-level index has to be
"pushed down"; unless we have left space ahead of time, most blocks
will need to be restructured. Then the second and higher-level indexes
will also need to be rebuilt, as anchor records have changed.
Example: we have two records per block, and the first level is
Data file: 1 2 3
5 7 9 20
1st-level index: 1 3 7 20
What happens when we insert 8? 4? 6?
What happens when we "push up a level", that is, add an entry that forces us to have one higher level of index?
EN6 Example 3 (p 644): 30,000 records and can fit 68 index entries per block. The first-level index is 30,000/68 = 442 blocks,
second-level index is 442/68 = 7 blocks; third-level index is 1 block.
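The level-by-level arithmetic generalizes easily; a small sketch (the names are mine, not EN's):

```python
import math

def index_levels(entries, fanout=68):
    """Blocks at each level of a multilevel (ISAM-style) index, bottom
    level first, using the numbers of EN6 Example 3."""
    levels = []
    blocks = math.ceil(entries / fanout)
    while blocks > 1:
        levels.append(blocks)
        blocks = math.ceil(blocks / fanout)
    levels.append(blocks)           # the single top-level block
    return levels

print(index_levels(30000))   # → [442, 7, 1]
```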
B-trees (Bayer trees)
Consider a binary search tree for the moment. We decide which of the two leaf nodes to pursue at each point based on comparison.
We can just as easily build N-ary search trees, with N leaf nodes and N-1 values stored in the node. Consider the ternary example for a moment.
Next, note that we can have a different N at each node!
Bayer trees: let's have a max of 4 data items per node.
1 4 8 9 20 22 25 28
How can we add nodes so as to remain balanced? We also want to minimize partial blocks. A B-tree of order p means that each block has at most p tree pointers, and p-1 key values. In addition, all but the top node has at least (p-1)/2 key values.
"push-up" algorithm: we add a new value to a leaf block. If there is room, we are done. If not, we split the block and push up the middle value to the parent block.
The parent block may now also have to be split.
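Here is a minimal sketch of the push-up insertion algorithm, keys only, with at most 4 keys per node as in the example above; deletion and B+-tree leaf chaining are omitted:

```python
import bisect

MAX_KEYS = 4   # order p = 5: at most 4 key values per node

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty children list => leaf

def insert(node, key):
    """Insert key under node; return (median, new_right_node) if node split."""
    if not node.children:                    # leaf: insert in sorted order
        bisect.insort(node.keys, key)
    else:                                    # internal: descend into the right child
        i = bisect.bisect(node.keys, key)
        split = insert(node.children[i], key)
        if split:                            # child split: absorb pushed-up median
            median, right = split
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) > MAX_KEYS:            # overflow: split, push up the middle key
        mid = len(node.keys) // 2
        right = Node(node.keys[mid+1:], node.children[mid+1:])
        median, node.keys = node.keys[mid], node.keys[:mid]
        node.children = node.children[:mid+1]
        return median, right
    return None

def tree_insert(root, key):
    split = insert(root, key)
    if split:                                # root split: the tree grows one level
        median, right = split
        root = Node([median], [root, right])
    return root

root = Node()
for k in [1, 4, 8, 9, 20, 22, 25, 28, 5, 7]:
    root = tree_insert(root, k)
```

Splitting is handled on the way back up the recursion: a node that overflows returns its median key and new right sibling, and the parent absorbs them (possibly overflowing in turn).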
B+ trees: these are a slight technical improvement where we
replicate all the key values on the bottom level. Inside the tree we
still use key values, but values only rather than full records; this
allows for more values per block (because keys-only are smaller than
full records), and thus a fatter tree. B+-trees are also better suited
for standalone indexes; the original B-tree approach was fundamentally
about a way of organizing the file itself.
Any index in effect defines an iterator
for traversing the set of records. B-trees allow access to the database
in sorted order, though if the file is not kept organized by the key in
question then we must load a separate block for each record.
Databases all support some mechanism of creating indexes, of specific types. For example, MySQL allows
CREATE INDEX indexname ON employee(ssn) USING BTREE; // or USING HASH
Example query (E&N6, p 660):
select e.fname, e.lname from employee e where e.dno = 4 and e.age=59;
If we have an index on dno, then access that cluster using the index and search for age.
If we have an index on age, we can use that.
If we have an index on (dno, age), we have a single lookup! But we also have expensive indexes to maintain.
Indexes can be ordered or not; ordered indexes help us find employees
with e.age >= 59. Ordered and nonordered indexes both support
lookup; ordered indexes also in effect support return of an iterator yielding all DB records starting at the specified point. BTREE and ISAM indexes are ordered; HASH indexes are not.
Indexes v File Organization
What's the difference between creating a hashed file organization and a
hashed index? Or a B-tree file organization versus a B-tree index? An
index needs to stand alone, and allow finding any given record; a
file-organization method generally allows us to retrieve blocks of records at a time.
A B-tree index technically has to be a B+-tree, as a B-tree includes
file blocks at interior levels. The usual B/B+ tree also includes many
related records in single leaf blocks, but that is straightforward to
give up (at the cost of much less efficient traversal).
Query processing
First, note it is relatively easy to sort
records. Typically we will use merge-sort, where we subdivide into
blocks small enough to fit into memory. We sort those with, say,
quicksort. We then make passes until each pair of consecutive blocks is
merged.
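A sketch of the idea, with heapq.merge standing in for the merge passes; a real external sort merges only as many runs at a time as it has memory buffers:

```python
import heapq
import random

def external_sort(records, run_size):
    """Merge-sort sketch: sort memory-sized runs (the in-memory quicksort
    step), then merge the sorted runs."""
    runs = [sorted(records[i:i+run_size]) for i in range(0, len(records), run_size)]
    return list(heapq.merge(*runs))

data = random.sample(range(1000), 100)
assert external_sort(data, run_size=10) == sorted(data)
```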
Select
Some examples
- select * from employee e where e.ssn = '123456789'
- select * from department where dnumber >=5
- select * from employee where dno = 5
- select * from employee where dno = 5 and salary > 30000 and sex = 'F'
- select * from works_on where essn='123456789' and pno=10
Methods of implementation:
Linear search: we read each disk block once.
Binary search: this is an
option if the selection condition is an equality test on a key
attribute, and the file is ordered by that attribute.
Primary index: same requirement as above, plus we need an index
Hash search: equality comparison on any attribute for which we have a hash index (need not be a key attribute)
Primary index and ordered comparison: use the index to find all departments with dnumber > 5
Clustered index (includes hash indexes)
B-tree index
If we are searching for "dno=5 AND salary=4000" we can use an index for either attribute, or a composite index if one exists.
In general, we'd like to choose the condition with the fewest results, and then continue with the remaining conditions.
For OR conditions, we just have to search for each.
Join
Mostly we focus on equijoins:
- select * from employee e, department d where e.dno = d.dnumber
- select * from department d, employee e where d.mgr_ssn = e.ssn
Methods of implementation:
1. Nested-loop join: this is where we think of a for-loop:
for (e in employee) {
for (d in department) {
if (e.dno = d.dnumber) print(e,d);
}
}
This is quadratic. Though note that we can go through both files in block-by-block fashion.
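A sketch of the block-by-block version: the inner file is scanned once per block of the outer file, rather than once per outer record. Table contents here are made up:

```python
def block_nested_loop_join(R, S, key_r, key_s, block_size=2):
    """Block nested-loop join sketch: read R a block at a time; for each
    R-block, scan S once and match every record in the block."""
    out = []
    for i in range(0, len(R), block_size):
        r_block = R[i:i+block_size]
        for s in S:                      # one pass over S per R-block
            for r in r_block:
                if r[key_r] == s[key_s]:
                    out.append((r, s))
    return out

employees = [{"ssn": "1", "dno": 5}, {"ssn": "2", "dno": 4}]
departments = [{"dnumber": 5, "dname": "Research"}, {"dnumber": 4, "dname": "Admin"}]
pairs = block_nested_loop_join(employees, departments, "dno", "dnumber")
print([(e["ssn"], d["dname"]) for e, d in pairs])   # → [('1', 'Research'), ('2', 'Admin')]
```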
2. index method: if we have an index on one of the attributes, we can use it:
for (e in employee) {
d = lookup(department, e.ssn);
if (d != null) print_it(e,d);
}
Note that any index will do, but that this may
involve retrieving several disk blocks for each e and will almost
certainly involve retrieving at least one disk block (from department)
for every e in employee. Still, it's better than Method 1.
3. Sort-merge join: we sort both files on the attribute in question,
and then do a join-merge. This takes a single linear pass. This is most
efficient if the files are already sorted, but note that it's still faster than 1 (and possibly faster than 2) if we have to sort the files.
4. Partition-hash join: Let the relations (record sets) be R and S. We
partition both files into Ri = {r in R | hash(r) = i}. Now we note that
the join R ⋈ S is simply the disjoint union of the Ri ⋈ Si. In most
cases, either Ri or Si will be small enough to fit in memory.
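A sketch with made-up works_on and employee rows; each bucket pair Ri ⋈ Si is joined with an in-memory hash table:

```python
from collections import defaultdict

def partition_hash_join(R, S, key_r, key_s, nbuckets=4):
    """Partition-hash join sketch: partition both inputs by hash of the
    join key; the full join is the disjoint union of per-bucket joins."""
    def partition(recs, key):
        buckets = defaultdict(list)
        for rec in recs:
            buckets[hash(rec[key]) % nbuckets].append(rec)
        return buckets
    Rb, Sb = partition(R, key_r), partition(S, key_s)
    out = []
    for i in Rb:
        built = defaultdict(list)        # in-memory hash table on this R-bucket
        for r in Rb[i]:
            built[r[key_r]].append(r)
        for s in Sb.get(i, []):          # probe with the matching S-bucket
            for r in built.get(s[key_s], []):
                out.append((r, s))
    return out

works_on = [{"essn": 1, "pno": 10}, {"essn": 1, "pno": 20}, {"essn": 2, "pno": 10}]
employees = [{"ssn": 1, "lname": "Smith"}, {"ssn": 2, "lname": "Wong"}]
pairs = partition_hash_join(works_on, employees, "essn", "ssn")
print(len(pairs))   # → 3
```

Note that this works even though works_on.essn is not a key: the hash table maps each key value to a list of matching records.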
A good hash function helps; we want the buckets to be relatively uniform. We won't get that exactly, but we can hope for a Poisson distribution. With N items hashed into K buckets, we expect an average of a = N/K per bucket. The probability that a given bucket holds exactly i items is approximately a^i·e^(−a)/i!, and the expected number of buckets holding exactly i items is about K·a^i·e^(−a)/i!.
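The Poisson estimate is easy to check numerically; a sketch, with N and K chosen arbitrarily:

```python
import math

def expected_bucket_counts(N, K, max_i=8):
    """Expected number of buckets holding exactly i items when N items are
    hashed uniformly into K buckets (Poisson approximation, a = N/K)."""
    a = N / K
    return {i: K * a**i * math.exp(-a) / math.factorial(i) for i in range(max_i + 1)}

counts = expected_bucket_counts(N=1000, K=500)
# With a = 2, buckets of size 1 and size 2 are equally common, and the
# expected counts over all sizes sum back to K.
```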
The join selection factor is the fraction of records that will participate in the join. In the example
select * from department d, employee e where d.mgr_ssn = e.ssn
all departments will be involved in the join, but almost no employees.
So we'd rather go through the departments, looking up managing
employees, rather than the other way around. Book example, EN6 p 693:
we have indexes on both fields. The employee.ssn index takes 4
accesses, and the dept.mgr_ssn index takes two. We have 50 departments
(10 blocks) and 6000 employees (2000 blocks). To do the join by
employee, we have 2000 accesses for the employees, plus, for each
employee, 3 block accesses to find the dept manager. That's 2000 +
6000*3 = 20,000 accesses. In the other order, it's 10 blocks for all
the departments plus for each dept 5 block accesses to find the
manager, for a total of 10 + 5*50 = 260.
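The cost comparison is simple arithmetic:

```python
def join_cost(outer_blocks, outer_records, lookup_cost):
    """Index-join cost sketch: scan the outer file once, then do one index
    lookup per outer record (the EN6 p 693 example)."""
    return outer_blocks + outer_records * lookup_cost

print(join_cost(2000, 6000, 3))   # employee as outer: 20000 accesses
print(join_cost(10, 50, 5))       # department as outer: 260 accesses
```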
In the most common case of joins, the join field is a foreign key in
one file and a primary key in the other. Suppose we keep all files
sorted by their primary key. Then for any join of this type, we can
traverse the primary-key file block by block; for each primary-key
value we need to do a lookup of the FK attribute in the other file.
This would be method 2 above; note that we're making no use of the
primary index.
Query optimization
As with compilers, the word "optimization" is used loosely.
A major rule is to apply select/project before joins, and in general do
operations first that have greatest potential for reducing the size of
the number of records.
Query tree: See fig 19.4.
Select lname from employee e, works_on w, project p
where e.ssn = w.essn and w.pno = p.pnumber and p.pname = 'Aquarius';
See fig 19.5, but note that doing cross-products was never on the agenda. However, it still does help to do the second join first.
Some heuristic rules:
- Move the more-restrictive SELECT operations down towards the leaf nodes; that will reduce number of records.
- Same with PROJECT; try to do joins last
- Do the joins in the right order, using the join selection factor estimate
- Do the smaller joins first (or, more generally, do the joins that
will result in an absolutely smaller set of records, whether this is
because the relations were small to start with or because one condition
was highly selective).
Transactions
A transaction is a set of operations, which we can idealize as read(X), write(X), which at the end we either commit (save) or rollback/abort (discard). Generally we hope rollbacks are rare.
Abstract transactions: operations are read(X) and write(X), for X a field/record/block/table.
Fig 21.2: two example transactions.
T1: read(X), write(X), read(Y), write(Y)
T2: read(X), write(X)
Ideally, we execute the transactions one at a time, with no overlap. In
practice, because of the considerable I/O-wait for disk retrieval, we
cannot afford that. Transactions will overlap. We need to figure out how to arrange for this to happen safely. Here are two problems:
Lost-update (write/write)
problem: this occurs when one transaction writes a value that another
then overwrites. See Fig 21.3a. T1 is transferring N dollars from X to
Y, while T2 is adding M dollars to X. T2.write(X) overwrites T1.write(X).
T1                      T2
read(X)
X = X-N
                        read(X)
                        X = X+M
write(X)
read(Y)
                        write(X)
Y = Y+N
write(Y)
Dirty-Read (read/write) problem: this occurs when one transaction reads
an item that another then overwrites. This would end up as a
lost-update, but consider Fig 21.3b where T1 actually aborts after
write(X).
T1                      T2
read(X)
X = X-N
write(X)
                        read(X)
                        X = X+M
                        write(X)
read(Y)
abort/rollback
Fig 21.3c shows a corrupted result; an abort would have been an improvement!
The system log, or journal,
is a master list of all operations performed, used in the event of
rollback. Failures leading to rollback may be system crash, lack of
disk space, lookup error (eg SQL error), locking problems, or other
physical or logical errors.
Items read here can be individual record attributes, entire records, or
entire tables. When joining tables, the entire table is read.
The ACID test
This is commonly described as the following set of features.
- Atomicity: either all steps of a transaction are performed, or
none are. We either commit the entire thing or rollback completely.
- Consistency preservation: the individual transaction should preserve database consistency and database invariants
- Isolation: transactions should appear to be executing in isolation.
- Durability: committed transactions survive DB crashes
Note that Durability is arguably implied by Atomicity, or at least
the
two together are implied by some greater "recovery" rule. There are
other redundancies here as well; the main points for transactions are
atomicity and isolation, with the latter understood as a special case
of correctness.
For example, note that Consistency is closely related to Isolation: if each
transaction preserves consistency, and the transactions appear to
execute in Isolation, then a series of transactions will be Consistency
preserving. Consistency would be implied by Serializability: that the
transactions appeared
to have been executed in some serial order. However, this might be a
stronger condition, if we really had a good definition for isolation.
Moral: beware of choosing your terms to fit a clever acronym.
Transaction schedules
Transaction schedules are lists of read/write operations of each
transaction, plus commit/abort. The operations of each individual
transaction must be in their original order.
Eventually we will enforce schedules by using locking, but for now we just consider schedules abstractly.
Examples:
Sa: Fig 21.3a: r1(X), r2(X), w1(X), r1(Y), w2(X), w1(Y), c1, c2
Sb: Fig 21.3b: r1(X), w1(X), r2(X), w2(X), r1(Y), a1
Two operations in a schedule conflict (where the word is to be taken as indicating potential conflict) if:
- They belong to different transactions
- They access the same data item (eg X)
- At least one of the operations is a write()
For example, in Sa, r1(X) and w2(X) conflict. So do r2(X) and w1(X). What does not conflict? r1(X) and r2(X), r1(X) and w1(X).
Conflicting operations are those where interchanging the order can result in a different outcome. A conflict is read-write if it involves a read(X) in one transaction and write(X) in the other, and write-write if it involves two transactions each doing write(X).
A complete schedule is a total order of all the operations, including abort/commits. (The book allows some partial ordering of irrelevant parts.)
Given a schedule S, the committed projection
C(S) is those operations of S that are part of transactions that have
committed. This is sometimes useful in analysing schedules when
transactions are continuously being added.
We can build a transaction precedence graph
based on schedule conflict, given a schedule S. Let us say Ti ⟶
Tj if Ti and Tj conflict as above, and one of the following occurs:
- Ti executes write(X) and later Tj executes read(X)
- Ti executes read(X) and later Tj executes write(X)
- Ti executes write(X) and later Tj executes write(X)
These are the three specific cases of conflict that can occur; note
that at this level we distinguish between read-write and write-read.
Demo: build graphs for Fig 21.3(a), (b). Note that the figure determines the schedule.
Recoverable and nonrecoverable schedules
Given two transactions T and T', we say T reads from T', or T depends on
T', if T' does a write(X) and then later T does a read(X) (this is Case
1 of T'⟶T above). Thus the outcome of T may depend, at least in part,
on the earlier T'.
A schedule is recoverable if no transaction T can commit until all T' where T reads from T' have already committed. Otherwise a schedule is nonrecoverable.
The point is that if T reads from T', then if T' aborts, T must abort
too. So T waits to commit until T' commits. Given a recoverable
schedule, we never have to abort a committed transaction. This is, in practice, an essential rule.
Consider the following modification of Sa above:
S'a: r1(X), r2(X), w1(X), r1(Y), w2(X), c2, w1(Y), c1.
This is recoverable even though there is a risk of lost update. But now consider
Sc: r1(X), w1(X), r2(X), r1(Y), w2(X), c2, a1. Nonrecoverable: T2 reads X from T1, but T2 commits first.
Sd: r1(X), w1(X), r2(X), r1(Y), w2(X), w1(Y), c1, c2. Recoverable, because T2 waited to commit until T1 did.
Suppose T1 aborts in Sd. Then T2 must abort as well, and we have Se:
Se: r1(X), w1(X), r2(X), r1(Y), w2(X), w1(Y), a1, a2
Though recoverability implies no committed transaction needs to be rolled back, we still may have cascading rollback. A schedule is said to avoid cascading rollback if every transaction T only reads items that were earlier written by committed
transactions. In this case, an abort of some other transaction will not
cause T to abort. How can we achieve this in Sd/Se? By delaying
r2(X) until after c1.
There is also a notion of strict
schedule, in which transactions do not use X for either read or write
until all transactions that wrote X have committed. This is more
expensive to require.
strict implies cascadeless implies recoverable, but not the reverse.
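A recoverability check can be sketched directly from the definition. Schedules are encoded as (transaction, operation, item) triples; this sketch ignores aborts when computing reads-from, which is fine for the examples above:

```python
from collections import defaultdict

def reads_from(schedule):
    """schedule: list of (txn, op, item), op in {'r','w','c','a'}; item is
    None for c/a. Returns a map txn -> set of txns whose written value it read."""
    last_writer, rf = {}, defaultdict(set)
    for txn, op, item in schedule:
        if op == 'w':
            last_writer[item] = txn
        elif op == 'r' and last_writer.get(item) not in (None, txn):
            rf[txn].add(last_writer[item])
    return rf

def recoverable(schedule):
    """Recoverable: no transaction commits before everyone it read from has."""
    rf, committed = reads_from(schedule), set()
    for txn, op, _ in schedule:
        if op == 'c':
            if rf[txn] - committed:      # read from a not-yet-committed transaction
                return False
            committed.add(txn)
    return True

Sc = [('T1','r','X'), ('T1','w','X'), ('T2','r','X'), ('T1','r','Y'),
      ('T2','w','X'), ('T2','c',None), ('T1','a',None)]
Sd = [('T1','r','X'), ('T1','w','X'), ('T2','r','X'), ('T1','r','Y'),
      ('T2','w','X'), ('T1','w','Y'), ('T1','c',None), ('T2','c',None)]
print(recoverable(Sc), recoverable(Sd))   # → False True
```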
Consider the following three transactions:
T1: X+=10, Y*=1.1;
T2: Y+=10, Z *= 1.1;
T3: Z+=10, X *= 1.1;
What happens if we do these in order T1, T2, T3? Any serial order gives an acceptable result, though different serial orders can give different values.
What happens if we do all the additions and then all the
multiplications? Note that, if we defer the commits, this can be a
recoverable schedule. And there are no lost updates. But it is still not equivalent to any serial schedule.
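The claim can be checked numerically; a sketch with starting values X = Y = Z = 100 (my choice, for illustration):

```python
def run(ops, state=None):
    """Apply a flat list of (var, kind, amount) operations, kind 'add' or 'mul'."""
    state = state or {'X': 100.0, 'Y': 100.0, 'Z': 100.0}
    for var, kind, amt in ops:
        state[var] = state[var] + amt if kind == 'add' else state[var] * amt
    return state

T1 = [('X', 'add', 10), ('Y', 'mul', 1.1)]
T2 = [('Y', 'add', 10), ('Z', 'mul', 1.1)]
T3 = [('Z', 'add', 10), ('X', 'mul', 1.1)]

serial = run(T1 + T2 + T3)                    # X ≈ 121, Y ≈ 120, Z ≈ 120
adds_then_muls = run([t[0] for t in (T1, T2, T3)] + [t[1] for t in (T1, T2, T3)])
# adds_then_muls gives X ≈ Y ≈ Z ≈ 121: not the result of any serial order.
```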
Serializability
The ideal situation with transactions is that they are executed
serially, but that is too inefficient. Instead, we look for schedules
that are in some sense equivalent
to a serial schedule. In Fig 21.5, (a) and (b) show two serial
schedules. Schedule (c) has a lost update; schedule (d) is ok (in a
sense to be defined below) because we can move T2 to the end of T1.
A schedule is serializable if it is equivalent to a serial schedule. For equivalence, we can use result equivalence, but that is intractable. Instead we use conflict equivalence. Two schedules are conflict-equivalent if the order of any two conflicting operations is the same in both schedules. We can reorder the other (nonconflicting) parts to get what we need.
Conflict-serializability algorithm
Build the transaction precedence (directed) graph above. If
the graph has no cycles, it is conflict-serializable. Do a topological
sort (ie find a total order of the nodes consistent with the Ti⟶Tj
ordering); the original schedule is conflict-equivalent to the schedule
of executing the transactions in that order.
The actual algorithm is as follows: first, find all Ti that do not have
any predecessors, ie Tk with Tk⟶Ti. Do these first, and remove all
Ti⟶Tj links. Now repeat, until there are no more nodes. If at some
stage there are no Ti without predecessors, then there must be a cycle.
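A sketch of the whole procedure: build the precedence graph from the conflicts, then repeatedly remove nodes without predecessors:

```python
from collections import defaultdict

def precedence_graph(schedule):
    """Edges Ti -> Tj for conflicting operations: different txns, same item,
    at least one write, Ti's operation first. schedule: (txn, op, item) triples."""
    edges = defaultdict(set)
    ops = [(t, o, x) for t, o, x in schedule if o in ('r', 'w')]
    for i, (ti, oi, xi) in enumerate(ops):
        for tj, oj, xj in ops[i+1:]:
            if ti != tj and xi == xj and 'w' in (oi, oj):
                edges[ti].add(tj)
    return edges

def serial_order(schedule):
    """Topological sort of the precedence graph, or None if there is a cycle
    (i.e. the schedule is not conflict-serializable)."""
    edges = precedence_graph(schedule)
    txns = {t for t, o, x in schedule}
    order = []
    while txns:
        free = [t for t in txns if not any(t in edges[u] for u in txns if u != t)]
        if not free:
            return None                  # no node without a predecessor: a cycle
        order += sorted(free)
        txns -= set(free)
    return order

Sa = [('T1','r','X'), ('T2','r','X'), ('T1','w','X'), ('T1','r','Y'),
      ('T2','w','X'), ('T1','w','Y')]
print(serial_order(Sa))   # → None  (T1⟶T2 and T2⟶T1: a cycle)
```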
Debit-credit transactions
These are transactions built from pieces of the form read(X), X+=N,
write(X). We do not have to serialize these! As long as we avoid the
lost-update problem, we can interleave increments and decrements from
different transactions without harm, because addition is commutative.
The schedule below is not conflict-serializable, but it yields the same
result, namely X = X+N−M, Y=Y+M−N.
T1                      T2
read(X)
X += N
write(X)
                        read(Y)
                        Y += M
                        write(Y)
read(Y)
Y −= N
write(Y)
                        read(X)
                        X −= M
                        write(X)
Locking
We will only consider relatively simple locks; complex locks tend to be inefficient. A binary lock has two states: locked and unlocked. We can implement a per-field/record/block/table lock with a locking table; at any one time, we are likely to have relatively few locks.
When we lock something, we may have to wait. See Fig 22.1. Ideally, we wait in a queue on the lock.
The actual inner details of lock() and unlock() must be atomic. On multi-processor systems, it is not enough simply to disable interrupts; we must actually involve some kernel locking.
A straightforward improvement on binary locks is read-write locks:
write locks imply exclusive access, but several transactions can
read-lock the same data value concurrently. Note that if we intend to
write X eventually, then we will need to write-lock X before reading X; otherwise we may have lost updates.
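A minimal Python sketch of a read-write lock table (nonblocking "try" form; an actual lock manager would put the requester on a wait queue instead of returning False):

```python
class LockTable:
    """Per-item read/write locks. Several transactions may hold the read
    lock on an item; the write lock is exclusive."""
    def __init__(self):
        self.readers = {}   # item -> set of txns holding a read lock
        self.writer  = {}   # item -> txn holding the write lock, if any

    def try_read_lock(self, txn, item):
        w = self.writer.get(item)
        if w is not None and w != txn:
            return False                      # someone else holds the write lock
        self.readers.setdefault(item, set()).add(txn)
        return True

    def try_write_lock(self, txn, item):
        w = self.writer.get(item)
        other_readers = self.readers.get(item, set()) - {txn}
        if (w is not None and w != txn) or other_readers:
            return False                      # any other lock blocks a write
        self.writer[item] = txn               # upgrade allowed if txn is sole reader
        return True

    def unlock(self, txn, item):
        self.readers.get(item, set()).discard(txn)
        if self.writer.get(item) == txn:
            del self.writer[item]
```

Note the upgrade path: a transaction that already holds the only read lock on X may convert it to a write lock; this is exactly the read-then-write pattern mentioned above.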
Two-phase locking
Two-phase locking is a simple protocol in which all locking operations
(including upgrades of read-lock(X) to write-lock(X)) precede the first
unlock operation. The locking phase is called the expanding phase; the unlock is the shrinking phase.
If all transactions use two-phase locking, then any schedule will be conflict-serializable.
In fig 22.3(a), the two-phase locking protocol is not followed. The schedule in 22.3(c) is allowed by the locks, but it is not serializable.
Fig 22.4 shows the same transactions using the two-phase locking protocol. Here, however, we have the potential for deadlock.
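One way to picture the protocol is as a per-transaction check: once the shrinking phase begins, further lock requests are errors. A hypothetical Python sketch:

```python
class TwoPhaseViolation(Exception):
    pass

class Transaction2PL:
    """Enforces basic two-phase locking: every lock (or upgrade) must
    precede the first unlock; a lock request after any unlock raises."""
    def __init__(self, name):
        self.name = name
        self.shrinking = False   # becomes True at the first unlock
        self.held = set()

    def lock(self, item):
        if self.shrinking:
            raise TwoPhaseViolation(
                "%s: lock(%s) requested after an unlock" % (self.name, item))
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True    # shrinking phase has begun
        self.held.discard(item)

t = Transaction2PL('T1')
t.lock('X'); t.lock('Y')   # expanding phase
t.unlock('X')              # shrinking phase begins
# t.lock('Z') would now raise TwoPhaseViolation
```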
Two-phase locking comes in flavors. The version above is called basic. We also have:
- conservative:
transactions must lock all the items to be accessed at the start of the
transaction. This is deadlock-free, but sometimes the transaction does
not know every item to be read until partway through.
- strict: no write locks are
released until commit/rollback. This means that no other transaction
that reads from this one can do so until the first one has committed.
This leads to a strict schedule, which is then cascadeless. Deadlocks may occur.
- rigorous: no locks at all
are released until commit/rollback. Rigorous two-phase locking is sort
of a mirror image of conservative two-phase locking.
Deadlock
Locking protocols generally introduce the risk of deadlock:
T1                      T2
write_lock(X)
                        write_lock(Y)
write_lock(Y)
                        write_lock(X)
...                     ...
Perhaps the simplest strategy for deadlock prevention is to lock
everything according to one master order. This is impractical for the
programmer to enforce; the database can only do it if we ask for all
the locks together: write_lock(X,Y,Z).
In practical terms, we also have to deal with deadlock detection,
and then decide which transaction will die and which will wait. Note
that it soon becomes obvious in practice that deadlock has occurred:
any transaction that waits for more than a few seconds on a lock, and
is not moving up in the queue for that lock, is likely deadlocked. More
formally, we can construct a wait-for graph, and look for cycles.
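A sketch of cycle detection in the wait-for graph (Python, depth-first search; the node and edge names are made up):

```python
def has_deadlock(waits_for):
    """waits_for: dict mapping a transaction to the set of transactions
    it is blocked on. Deadlock iff the wait-for graph has a cycle."""
    color = {}                        # node -> 'gray' (on stack) or 'black' (done)

    def visit(t):
        color[t] = 'gray'
        for u in waits_for.get(t, ()):
            if color.get(u) == 'gray':
                return True           # back edge: we reached an ancestor -> cycle
            if color.get(u) is None and visit(u):
                return True
        color[t] = 'black'
        return False

    return any(color.get(t) is None and visit(t) for t in waits_for)

# The deadlock above: T1 waits for T2 (holds Y), T2 waits for T1 (holds X).
assert has_deadlock({'T1': {'T2'}, 'T2': {'T1'}})
assert not has_deadlock({'T1': {'T2'}, 'T2': set()})
```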
Timestamps
An alternative to locks entirely is the use of timestamps. We assign
each transaction a timestamp, and then only allow schedules that are
equivalent to executing the transactions in timestamp order. To do
this, each database item has associated with it a read-timestamp and a
write-timestamp; the former is the latest timestamp of all transactions
that have read X, and the latter is the latest timestamp of all
transactions that have written X. If a transaction T attempts read(X),
we first compare T's timestamp with write-timestamp(X). If T's timestamp
is earlier than write-timestamp(X), then a younger transaction has
already written X, so we abort T and put it back in the transaction
queue with a later timestamp; otherwise the read proceeds, and we update
read-timestamp(X) if necessary. Similarly, write(X) is rejected if T's
timestamp is earlier than either read-timestamp(X) or write-timestamp(X).
This protocol can involve cascading rollback, so further constraints are
needed to get it to generate cascadeless or strict schedules (or even
recoverable schedules, for that matter).
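Here is a sketch of the basic timestamp-ordering checks (Python, hypothetical names): a read is rejected if a younger transaction has already written the item, and a write is rejected if a younger transaction has already read or written it:

```python
class Abort(Exception):
    """Transaction must be restarted with a later timestamp."""
    pass

class Item:
    def __init__(self):
        self.read_ts = 0     # latest timestamp among readers
        self.write_ts = 0    # latest timestamp among writers

def to_read(ts, item):
    """Reject the read if a younger transaction already wrote the item."""
    if ts < item.write_ts:
        raise Abort
    item.read_ts = max(item.read_ts, ts)

def to_write(ts, item):
    """Reject the write if a younger transaction already read or wrote it."""
    if ts < item.read_ts or ts < item.write_ts:
        raise Abort
    item.write_ts = ts

x = Item()
to_write(5, x)       # transaction with timestamp 5 writes X
to_read(7, x)        # timestamp 7 reads X: fine, 7 >= 5
# to_write(6, x) would raise Abort: a younger transaction (7) already read X
```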
Transactions in PHP
See http://php.net/manual/en/pdo.transactions.php. Here's an example from that page:
try {
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->beginTransaction();
    $db->exec("insert into staff (id, first, last) values (23, 'Joe', 'Bloggs')");
    $db->exec("insert into salary (id, amount) values (23, 50000)");
    $db->commit();
} catch (Exception $e) {
    $db->rollBack();
    echo "Failed: " . $e->getMessage();
}
Bear in mind that you should not interpose any wait for user input
between the beginTransaction() and the commit(); otherwise you may be
holding a lock "indefinitely".
Basics of forms
A form is enclosed between <form method="post" action="program">
and </form> tags. The usual value for action is "", meaning to
re-invoke the same program (in our case, a php program), but now with posted data values.
Inside the form you can have input items, most written as <input type="input-type" ... >. The input types include
- text (one-line text boxes)
- button (clickable buttons)
- radio (radio buttons)
- checkbox (checkboxes)
- submit (submit buttons, semantically different from ordinary buttons)
- more
Two other common controls are separate elements rather than <input> types:
- <select>: menus with <option> entries; <select ... multiple> allows selecting several items simultaneously
- <textarea>: scrolling text boxes
Here's a big form. You can see it in action at form1.html.
<form action="form1.cgi" method="get" enctype="multipart/form-data">
<input type = "button" value="click me!" name="b1">
<input type = "button" value="me too!" name="b2">
<p> Are we having fun yet?
<center>
<input type="radio" name="fun" value = "yes">Yep!
<input type="radio" name="fun" value = "no">Nope!
<input type="radio" name="fun" value = "maybe" CHECKED>I dunno!
</center>
<p><b>Why are you doing this?</b>
<input type = "checkbox" name="cbox" value="credit" CHECKED>For course credit
<input type = "checkbox" name="cbox" value="money" CHECKED>For the money!
<br>
<p><b>Who are you?</b>
<input type = "text" name="name1" value="I dunno" size=20>
<p>Now for a menu:
<p>
<select size = 1 name="menu1">
<option> first item
<option> second item
<option> number 3!
<option> fourth
<option> last
</select>
<p> now with size=4</p>
<select size = 4 name="menu2" multiple>
<option> first item
<option> second item
<option> number 3!
<option> fourth
<option> last
</select>
<p>Here's a text box</p>
<textarea name="mybox" rows=5 cols = 30>
This message will be in the box. It's plain text, <b>not</b> html!
I'm going to make it long enough that a scrollbar appears,
through the miracle of cut-and-paste.
I'm going to make it long enough that a scrollbar appears,
through the miracle of cut-and-paste.
</textarea>
<p>
<input type="submit">
</form>
Understand the form action, and the idea that one page can have multiple <form ...> ... </form> objects.