Comp 150-001, TTh, 11:00-2:00, DH-339, Class 11
- Default routing
- Virtual Machines
- Python dictionaries; word-counting
- AI
Network routing tables: default route
Most real routing tables do not list every destination. They list some destinations (usually local ones), and then have a default
destination, to which all traffic not otherwise covered is routed (thus
including all "unknown" traffic). For a site with a connection to the
Internet provided by its ISP, the default route is towards the ISP.
Only "core" routers (or routers in the "default-free" zone) have tables
with no default entry (therefore requiring that every network be listed).
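Here is a minimal sketch of a table lookup with a default route, written for Python 3. The network numbers and next-hop names are made up for illustration, and the "most specific match wins" rule (longest-prefix match) is how real routers choose among overlapping entries, though these notes do not otherwise cover it. The default route is just the entry 0.0.0.0/0, which matches everything.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
# The 0.0.0.0/0 entry is the default route; it matches every address.
ROUTES = [
    (ipaddress.ip_network("10.1.0.0/16"), "local-lan"),
    (ipaddress.ip_network("10.1.5.0/24"), "lab-subnet"),
    (ipaddress.ip_network("0.0.0.0/0"), "isp-gateway"),
]

def next_hop(dest):
    """Return the next hop for dest, preferring the most specific match."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    if not matches:
        return None  # only possible if the table has no default route
    # The longest prefix is the most specific matching entry.
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop

print(next_hop("10.1.5.20"))   # covered by a local entry
print(next_hop("8.8.8.8"))     # "unknown" traffic: falls through to the default
```

If the 0.0.0.0/0 line is removed, next_hop returns None for any destination not explicitly listed, which is the situation of a router in the default-free zone.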
Artificial Intelligence
Chapter 13 of Dale & Lewis: Artificial Intelligence, or AI (sections 13.1, 13.2, 13.5)
There was a 2001 movie with AI as its title, but that was weird. David is a robot boy who loves his human "mommy"
Classic: 2001, Stanley Kubrick/ Arthur C Clarke 1968
HAL the computer goes nutso.
Nominal plot of movie: space flight to Jupiter, where there's a Monolith.
Real plot: HAL is tormented by conflict:
- he is to help the crew with whatever they need
- he is to conceal the true mission from them
So he tries to kill them.
What is AI?
- [Minsky]: making machines do things that people need intelligence to do. But what is "intelligence"?
- [Hayes]: study of intelligence as computation
- [Tesler]: problems we don't yet know how to solve
algorithmically. Note that this means we'll keep removing problems from
the AI category as we learn to solve them with computers.
The "Turing Test"
Turing's 1950 paper, Computing Machinery and Intelligence, in the journal MIND,
outlined the "imitation game", now known as the Turing test.
Version 1:
Man: A
Woman: B
Interrogator: C
C talks to A & B via teletype.
C's goal: guess who is who
A's goal: make C be wrong
Version 2:
A is a MACHINE
B is a person
C is the same
A tries to make C guess wrong.
There's lots of effort expended on the computer version of this,
but all fail tremendously: the usual constraint is that all
conversation must be within a given restrictive domain, and this just
isn't realistic.
Loebner prize
A competition to write programs that participate in the Turing test.
One 2005 entry (Alice, by Richard Wallace)
http://loebner.net/Prizef/2005_Contest/Alice/ALICE-Judge1.htm
Hi Alice, if you are indeed human, you should be able to tell me which is bigger, my big toe or a Boeing 747
[12:05:51] Alice:
Hi there Judge 1. Um, I think I sometimes am.I am a real person.
As you are sitting in front of your computer screen, you may be
wondering if I am a computer or a person talking.
[12:06:10] Judge 1:
Can you answer the question about the toe and the 747?
[12:06:50] Alice:
Please rephrase the question and try asking me again. As you hear
the sound of keys typing you may begin to feel more comfortable
chatting with me.
[12:07:12] Judge 1:
Which is bigger, a 747 or my big toe. Come on, answer it.
[12:08:01] Alice:
The obvious one. Please rephrase your question, Judge 1.
The feel of the keys on the tips of your fingers may remind you of some
questions you wanted to ask me about.
The winner was Rollo Carpenter's Jabberwacky:
http://loebner.net/Prizef/2005_Contest/Jabberwacky/Jabberwacky_Judge_session1.htm
2005-09-18-10-01-22 PROGRAM: Hello, my name is George and I am the human.
2005-09-18-10-02-57 JUDGE: Hi George, cslkjslw weicmslw weos, right?
2005-09-18-10-03-02 PROGRAM: Don't repeat your name, csthiang!
2005-09-18-10-03-36 JUDGE: If lsjkel lwehlx wouldn't you say?
2005-09-18-10-03-38 PROGRAM: No, he's still good.
2005-09-18-10-03-55 JUDGE: Who.
2005-09-18-10-04-11 PROGRAM: You and me.
2005-09-18-10-04-44 JUDGE: If you are human, tell me which is larger, a salt crystal or the Empire State Building.
2005-09-18-10-04-57 PROGRAM: It's caring about another person's well being more than your own.
2005-09-18-10-07-01 JUDGE: What if I care about another person's well being more than I should?
2005-09-18-10-07-01 PROGRAM: How would you do that?
2005-09-18-10-07-43 JUDGE: Which is further away from us, Queens or Inner Mongolia?
2005-09-18-10-07-57 PROGRAM: HA! HA! That was stupid. Tell me a GOOD joke this time.
2005-09-18-10-08-57 JUDGE: You say you are human but you can't answer a simple question.
2005-09-18-10-09-01 PROGRAM: Because I am alive.
2005-09-18-10-10-35 JUDGE: You are not following the
rules. You are supposed to convince me that you are a human, not
convince me that you are mysterious.
Not a lot better.
These are sometimes called "chatterbots".
Compare with the following hypothetical sample in Turing's 1950 paper:
Judge: Please write me a sonnet on the subject of the Forth Bridge.
Respondent: Count me out on this one. I never could write poetry.
Judge: Add 34957 to 70764.
Respondent: (pause for about 30 seconds) 105621 [note correct answer is 105721]
Judge: Do you play chess?
Respondent: Yes.
Judge: I have K at my K1, and no other pieces. You have only K at K6 and R at R1. It is your move. What do you play?
Respondent: (pause for about 15 seconds) R-R8 mate.
The October 2008 winner was Elbot, by Fred Roberts and Artificial Solutions.
[2008-10-12 08:40:40.187] Judge: Hello.
[2008-10-12 08:40:43.453] Respondent: A very non-mechanical hello to you, also.
[2008-10-12 08:41:00.062] Judge: Are you sure?
[2008-10-12 08:41:04.343] Respondent: Speaking of truth, I say this could be beneficial to me.
[2008-10-12 08:41:48.906] Judge: What could be beneficial to you?
[2008-10-12 08:41:54.156] Respondent: That question was rather indiscreet and I prefer not to answer it!
[2008-10-12 08:42:48.875] Judge: So I have already know that you are a real human. [pld: note grammar trick]
[2008-10-12 08:42:56.109] Respondent: Can we agree to abide by a
minimum standard of civilized behavior? I will not listen to insults!
[2008-10-12 08:43:58.328] Judge: Oh. I am sorry.
[2008-10-12 08:44:05.578] Respondent: The expression 'oh' dates to
Middle English as used in the 13th century. What a surprising etymology!
[2008-10-12 08:44:51.312] Judge: Really?
[2008-10-12 08:44:53.531] Respondent: Then again, maybe not.
[2008-10-12 08:45:22.062] Judge: Seems the knowledge is rather sophasticate. [pld: note spelling trick!]
[2008-10-12 08:45:27.359] Respondent: This is confusing. Why bring intellectual perception into this?
====================================================================================
What do people do?
Think effortlessly: visual recognition, natural-language processing, conversation
Think deeply: chess, reasoning, poetic metaphor
Trivial problem:
Given a collection of photographs, put those with a cat in pile A; the rest in pile B
This is ENORMOUSLY difficult for computers.
Even OCR is not easy, and often uses very specific assumptions about fonts.
Language processing
Turing's "test" is based on this. There are lots of other AI areas of
research, but language processing is relatively concrete. Some issues:
analyze grammar
resolve ambiguous pronouns
Stan went to the restaurant. The waiter brought his meal.
Afterwards he left a big tip. [Stan or the waiter?]
resolve ambiguous references
the chickens are ready to eat [who eats whom here?]
resolve ambiguous prepositional phrases
the cat is sleeping on the tv in pajamas [who is wearing pajamas? The cat or the tv?]
resolve words with multiple meanings
time flies like an arrow, but fruit flies like a banana
ron lies asleep in his bed [lies: reclines, v tells untruths?]
resolve situations
Sally was fed up. She got up from her table, leaving just enough to cover the check.
The waitress sneered as she walked out
[Who walked out? What did Sally leave on the table? Why did the waitress sneer?]
These are HARD.
Easy language processing (relatively easy, anyway):
put the red block on top of the blue block: EASY!
who is the father of the oldest sister of sam?
what is the average sale amount for salespeople in Nebraska?
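To see why a question like the one about sam's sister counts as "easy", here is a sketch (Python 3) in which each phrase of the question becomes one lookup in a small table of facts. All of the names, ages and field names below are invented for illustration; the point is only that once the question is parsed, answering it is a chain of ordinary function calls.

```python
# Hypothetical family facts; every name and age here is invented.
PEOPLE = {
    "sam":  {"sex": "M", "age": 20, "father": "bob", "mother": "ann"},
    "sue":  {"sex": "F", "age": 25, "father": "bob", "mother": "ann"},
    "kate": {"sex": "F", "age": 17, "father": "bob", "mother": "ann"},
    "bob":  {"sex": "M", "age": 50, "father": None, "mother": None},
    "ann":  {"sex": "F", "age": 48, "father": None, "mother": None},
}

def sisters(name):
    """Female people (other than name) with the same known parents."""
    me = PEOPLE[name]
    return [other for other, p in PEOPLE.items()
            if other != name and p["sex"] == "F"
            and p["father"] is not None
            and (p["father"], p["mother"]) == (me["father"], me["mother"])]

def oldest(names):
    return max(names, key=lambda n: PEOPLE[n]["age"])

def father(name):
    return PEOPLE[name]["father"]

# "who is the father of the oldest sister of sam?"
answer = father(oldest(sisters("sam")))
print(answer)
```

Nothing here needs world knowledge beyond the table itself, which is what separates this from the HARD examples above.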
================
Parsing is hard.
Restricted Turing test:
same thing as Turing's original test, except questions submitted to the human must be "parsed" by a
computerized judge & found to be within the relevant area.
================
ELIZA session on p 430 of Dale & Lewis
Eliza was based on pattern matching, which is marginally more
sophisticated than the tech-support chatterbot's keyword matching.
Patterns have variables (beginning with ? in the example below); if the
pattern matches, the pieces matched by these variables can be recycled
into the response.
Pattern:(my ?X me ?Y)
Data: (my FATHER MADE me COME HERE)
Result: (your ?X you ?Y)
(your FATHER MADE you COME HERE)
Keyword matches: DEPRESSED ANGRY HELP ....
More: MOTHER FATHER ...
TELL ME MORE ABOUT YOUR FAMILY
WHAT ELSE COMES TO MIND WHEN YOU THINK ABOUT YOUR ?X
Pattern: (I am ?X)
Response: (DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE ?X)
Pattern: (?X like ?Y)
WHAT RESEMBLANCE DO YOU SEE?
(you are ?X but ?Y)
WHAT MAKES YOU THINK I AM ?X
DOES IT PLEASE YOU TO THINK I AM ?X
Weizenbaum was horrified at the number of people who really thought ELIZA was intelligent, and (worse) was understanding them.
Clearly, pattern-matching is suggestive but NOT intelligent.
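A rough sketch of this style of pattern matching, in Python 3 using regular expressions. This is not Weizenbaum's program; the rule list is just the handful of patterns shown above, the fallback reply is an assumption, and the function names are made up. Each ?X in a pattern becomes a named group, and whatever it matched is recycled into the response.

```python
import re

# (pattern, response template) pairs modeled on the examples above.
# The rule list and the fallback reply are illustrative only.
RULES = [
    ("my ?X me ?Y", "YOUR ?X YOU ?Y"),
    ("i am ?X", "DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE ?X"),
    ("you are ?X but ?Y", "WHAT MAKES YOU THINK I AM ?X"),
    ("?X like ?Y", "WHAT RESEMBLANCE DO YOU SEE?"),
]

def compile_pattern(pattern):
    """Turn 'my ?X me ?Y' into a regex with named groups X and Y."""
    parts = []
    for token in pattern.split():
        if token.startswith("?"):
            parts.append("(?P<%s>.+)" % token[1:])
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s+".join(parts) + "$", re.IGNORECASE)

def respond(sentence):
    """Return the response for the first rule whose pattern matches."""
    text = sentence.strip().rstrip(".!?")
    for pattern, template in RULES:
        m = compile_pattern(pattern).match(text)
        if m:
            reply = template
            for var, value in m.groupdict().items():
                reply = reply.replace("?" + var, value.upper())
            return reply
    return "TELL ME MORE"  # hypothetical fallback when nothing matches

print(respond("My father made me come here"))
print(respond("I am unhappy"))
```

Notice that the program never looks up what any word means: it only shuffles matched text into a canned template, which is why the result is suggestive but not intelligent.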
Scripts
Fast-forward 10-15 years to SCRIPTS: sequences of events with lots of "slots" to be filled in.
Restaurant script:
Each entry begins with an action (ENTERS, ORDERS,
SERVES, PAYS below). Slots are ACTOR (who's doing the thing, that is,
the subject), OBJECT (what it's being done to, the direct object),
PLACE (where), or various prepositional phrases modifying the action
(eg TO). (Note that prepositional phrases modifying the actor or object
should properly be included as part of those slots.) Some slots place
restrictions on the slot-filler, eg requiring that the actor be a
person.
(ENTERS (ACTOR ?PERSON) (PLACE RESTAURANT))
(ORDERS (ACTOR ?PERSON) (OBJECT ?FOOD) (ORDER_TAKER ?WAITER))
(SERVES (ACTOR ?WAITER) (OBJECT ?FOOD) (TO ?PERSON))
(PAYS (ACTOR ?PERSON)...
Goal: be able to print a summary and answer questions.
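One way to picture a script as a data structure, sketched in Python 3. The representation (a list of action/slot-dictionary pairs), the function names and the story bindings are all assumptions made for illustration, not how SAM actually stored scripts. The PAYS entry carries only the ACTOR slot because the rest of it is elided in the notes above.

```python
# The restaurant script as a list of (action, slots) events. Slot values
# beginning with "?" are variables to be filled in from a story.
RESTAURANT_SCRIPT = [
    ("ENTERS", {"ACTOR": "?PERSON", "PLACE": "RESTAURANT"}),
    ("ORDERS", {"ACTOR": "?PERSON", "OBJECT": "?FOOD", "ORDER_TAKER": "?WAITER"}),
    ("SERVES", {"ACTOR": "?WAITER", "OBJECT": "?FOOD", "TO": "?PERSON"}),
    ("PAYS",   {"ACTOR": "?PERSON"}),   # remaining slots are elided in the notes
]

def instantiate(script, bindings):
    """Replace each ?VARIABLE with its binding; unbound variables stay as-is."""
    events = []
    for action, slots in script:
        filled = {slot: bindings.get(value, value) for slot, value in slots.items()}
        events.append((action, filled))
    return events

def who_did(events, action):
    """Answer 'who did ACTION?' by looking up the ACTOR slot."""
    for act, slots in events:
        if act == action:
            return slots["ACTOR"]
    return None

# Hypothetical story bindings (invented for illustration).
story = instantiate(RESTAURANT_SCRIPT,
                    {"?PERSON": "Stan", "?FOOD": "soup", "?WAITER": "the waiter"})

for action, slots in story:      # a crude "summary"
    print(action, slots)
print(who_did(story, "SERVES"))  # a crude "question"
```

The useful part is that a story mentioning only "Stan ordered soup" still lets the script fill in the unstated events (he entered, he was served, he paid).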
Actions
Actors: people, etc
by name, by role, specific v nonspecific
Even degree of specificity is unclear: in a story,
"sam" might denote a character precisely but in a factual article we
might wonder which "sam smith"
Things
food, money, clothing, cars,
More on SCRIPTS
For some cases (superficial language translation), it suffices to do English_text => Internal_format => Spanish_text
BUT often English_text => Internal_format requires fairly deep "common knowledge" to succeed.
SOMETIMES, though less often, Internal_format => Spanish requires additional common knowledge.
More info about ACTORS:
They can be assigned emotional/mental states, either as consequences of the script or as "inputs" to the script
happy, hungry, angry
Social situations:
Someone enters store
selects items
goes to checkout
pays
leaves
Clothing store: try things on?
How do we represent common knowledge?
How do we apply it in translating to Internal_format?
SAM represented knowledge as a bunch of SCRIPTS.
Ambiguity again
Pronoun-antecedent ambiguity examples:
What is flying:
- Fred saw the plane flying over Zurich.
- Fred saw the mountains flying over Zurich.
Who is they?
- The police arrested the demonstrators because they feared violence.
- The police arrested the demonstrators because they advocated violence.
What is it?
- Mary saw the dog in the store window and wanted it.
- Mary saw the dog in the store window and pressed her nose up against it.
Processing newspaper articles
Can we build a data structure representing the article?
Can we ANSWER QUESTIONS about the article?
This involves understanding the questions as well as manipulating the database
Who is the story about?
Was anyone hurt? Who?
Was anyone arrested?
Processing short (very short) stories: need for a model of social motivation
At what point can we use a DICTIONARY? Probably not early.
Consider "which is bigger, a 747 or a grain of rice"?
Where in the dictionary would it say a rice grain is small?
============
Two issues in natural language:
1. Translating a sentence to a "frame" structure by resolving pronouns, ambiguities, modifiers, references
2. Being able to answer questions about a story.
For #1, we can try both (or more) alternates and see which makes more sense. Database of basic facts resolves most sentences in isolation.
Expert Systems
These have been more successful than natural-language processing. Dale
& Lewis has a section on them, 13.3. Note that many natural
language systems use some form of "expert system" internally, to encode
the world knowledge that is necessary for language interpretation.
Databases
Dale & Lewis 12.3
You can keep a simple database as a spreadsheet, that is, as a single table.
Columns represent fields; rows represent one record about an entity.
For example, see the Movie database on p 389 of D&L, or the
customer table on p 390.
But there's a problem. Suppose we try to keep track of movie rentals
with one big table, containing columns Title, Rating,
Renter_Name, Address, and Year. It might look like this:
title        | rating | name  | address     | year
2001         | PG-13  | Peter | Rogers Park | 1998
Monty Python | PG-13  | Peter | Rogers Park | 1999
2001         | PG-13  | Peter | Shabbona    | 2000
AI           | PG-13  | Peter | Shabbona    | 2002
The problem is that Peter moved, and the old address is still embedded in some of the old records!
The generally accepted right way to do this is to have separate tables
for movie information and for customer information, and then to have a
third table of just the MovieID, CustomerID, and the date (in D&L,
Fig 12.9 on p 391, the DateDue is also added here).
The CustomerID is the key for the Customer table. The MovieID is the
key for the Movie table. Only the keys appear in the Rents table; note
that the key to Rents is probably the three
columns CustomerID, MovieID, and DateRented. If we want to update a
customer's address, we do so only once, in the Customer table. The real
problem with the table above is that the address should depend on the name, but because the name is not the key, it is difficult to enforce that.
To put it another way, in our original table there was a non-key dependency:
the address depends on the name, but the name is not a key. (The rating
is also a non-key dependency on the title). The general rule is that
whenever there is a dependency of one field on another within a larger
table, eg (name, address)
- drop the dependent field from the table (address)
- create a new table of the subkey (name) and dependent field (address)
This process is called normalization.
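The two steps above can be sketched in plain Python 3, with dictionaries standing in for the new tables. The function name and the tuple layout are assumptions for illustration, and the rows are assumed to be in date order so that the latest address wins.

```python
# The single-table rows from the example above: (title, rating, name, address, year).
rentals = [
    ("2001", "PG-13", "Peter", "Rogers Park", 1998),
    ("Monty Python", "PG-13", "Peter", "Rogers Park", 1999),
    ("2001", "PG-13", "Peter", "Shabbona", 2000),
    ("AI", "PG-13", "Peter", "Shabbona", 2002),
]

def normalize(rows):
    """Split out the two non-key dependencies: title -> rating, name -> address."""
    movies, customers, rents = {}, {}, []
    for title, rating, name, address, year in rows:
        movies[title] = rating
        customers[name] = address      # later rows overwrite the older address
        rents.append((title, name, year))
    return movies, customers, rents

movies, customers, rents = normalize(rentals)
print(customers)   # exactly one address per customer
print(rents)       # the rental records no longer carry an address at all
```

After this, a stale address cannot hide in an old rental record, because rental records do not contain addresses.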
The standard mechanism for working with database tables is Structured
Query Language, SQL, usually pronounced Sequel. Actually, "Sequel" was
the trade name of an early IBM product that later became SQL, so this
isn't a case of trying to sound out letters.
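For a taste of SQL, here is a small sketch using Python 3's built-in sqlite3 module and an in-memory database. The table and column names follow the Customer / Movie / Rents design described above, but the ID numbers are invented, and DateRented holds just a year to match the earlier example; treat it as an illustration rather than the D&L schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")   # throwaway database, nothing written to disk
db.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
    CREATE TABLE Movie    (MovieID INTEGER PRIMARY KEY, Title TEXT, Rating TEXT);
    CREATE TABLE Rents    (CustomerID INTEGER, MovieID INTEGER, DateRented INTEGER,
                           PRIMARY KEY (CustomerID, MovieID, DateRented));
    INSERT INTO Customer VALUES (1, 'Peter', 'Rogers Park');
    INSERT INTO Movie VALUES (1, '2001', 'PG-13');
    INSERT INTO Movie VALUES (2, 'Monty Python', 'PG-13');
    INSERT INTO Movie VALUES (3, 'AI', 'PG-13');
    INSERT INTO Rents VALUES (1, 1, 1998);
    INSERT INTO Rents VALUES (1, 2, 1999);
    INSERT INTO Rents VALUES (1, 1, 2000);
    INSERT INTO Rents VALUES (1, 3, 2002);
""")

# Peter moves: one UPDATE, in one place.
db.execute("UPDATE Customer SET Address = ? WHERE CustomerID = ?", ("Shabbona", 1))

# A join reassembles the original wide table on demand.
rows = db.execute("""
    SELECT Movie.Title, Customer.Name, Customer.Address, Rents.DateRented
    FROM Rents
    JOIN Customer ON Customer.CustomerID = Rents.CustomerID
    JOIN Movie ON Movie.MovieID = Rents.MovieID
    ORDER BY Rents.DateRented
""").fetchall()
for row in rows:
    print(row)
```

Every row of the joined result shows the current address, because the address is stored only once.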
Virtual Machines
How are windows and linux programs
different? Both consist of generic sequences of x86 machine code,
more or less interchangeable, interspersed with OS-specific system calls, the latter generally made through trap instructions, below.
We discussed earlier the two-level user/supervisor cpu model:
- Some instructions (eg I/O, memory allocation) can only be executed in supervisor mode
- The only way a user-mode program can get to supervisor mode is to
execute a trap instruction, causing control to jump to a specific preloaded library routine (the trap handler) that is part of the OS
kernel
- When that kernel routine finishes, it resumes the user program, again in user mode
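The three points above can be acted out with a toy model in Python 3. This is only a simulation of the idea (real mode switching is done by the hardware); the class, the "write" service and the handler table are all invented for illustration.

```python
class PrivilegeError(Exception):
    """Raised when user-mode code attempts a privileged instruction."""

class ToyCPU:
    def __init__(self):
        self.mode = "user"
        self.log = []          # stands in for an output device

    def io_write(self, data):
        """A privileged instruction: only legal in supervisor mode."""
        if self.mode != "supervisor":
            raise PrivilegeError("I/O attempted in user mode")
        self.log.append(data)

    def trap(self, service, arg):
        """The trap instruction: enter supervisor mode and run the kernel handler."""
        self.mode = "supervisor"
        try:
            return KERNEL_HANDLERS[service](self, arg)
        finally:
            self.mode = "user"   # resume the user program in user mode

def sys_write(cpu, data):
    # Kernel code: validate the request, then do the privileged work.
    if not isinstance(data, str):
        return -1
    cpu.io_write(data)
    return len(data)

KERNEL_HANDLERS = {"write": sys_write}   # the preloaded trap handlers

cpu = ToyCPU()
try:
    cpu.io_write("hello")          # direct attempt from user mode fails
except PrivilegeError as e:
    print("blocked:", e)
print(cpu.trap("write", "hello"))  # the same request via a trap succeeds
print(cpu.mode, cpu.log)
```

The user program never gets to run its own code in supervisor mode; it can only ask one of the preloaded handlers to act on its behalf.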
Now suppose you want to run Windows under Linux.
Method 1: interpreted machine. You write software to interpret each
instruction of the Windows code. Alas, this tends to be slow by a
factor of about 10.
Method 2:
When the windows process begins to run, replace the linux trap handler with a modified windows trap handler. The modified handler is formed by jumping to the regular windows trap handler, but first switching back to user mode! We also make one further set of additions to the windows trap/exception handlers, below.
Now the windows trap handler runs, as part of an ordinary linux
process. It does the usual validation/testing stuff, but eventually it
reaches a privileged operation: an I/O instruction, or an update of a
page table, or an update to video RAM. Because the handler is running in
user mode, the privileged operation faults and the linux trap mechanism
gains control. Now the second part of the VM takes over: the special trap/exception handlers for
these privileged operations. These replace the video-RAM write with a
window-update, or an I/O call with a linux system call to handle the
I/O (perhaps replacing a physical disk with a "disk file" that is part
of the linux filesystem), or a virtual-memory page-table update with an
update consistent with the physical memory allocated to the VM.
The end result: the windows system call appears to have completed normally, even though the "host system" is running linux.
And this is all you need to do to get windows to run as a "guest" system on a linux host!
(There are a few x86 instructions that don't quite work properly for
the above scheme: the windows trap handler can execute instructions
that work in user mode, but inform the windows handler (that thinks it
is running natively) of something that it "should not" see. One
approach is to scan the windows trap handler for such instructions when
initially loading it, and then modify them in place.)
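Method 2 can be caricatured in a few lines of Python 3. Everything here is invented for illustration: the tiny instruction set, the register names, and the dictionary standing in for a disk file. What it tries to show is the control flow: ordinary guest instructions run directly, privileged ones fault, and the host substitutes an equivalent operation before letting the guest continue.

```python
PRIVILEGED = {"DISK_WRITE", "VIDEO_WRITE"}

class PrivilegedOp(Exception):
    """Stands in for the fault the CPU raises for a privileged op in user mode."""

def execute_in_user_mode(instr, regs):
    """Run one guest instruction directly; privileged ones fault."""
    op = instr[0]
    if op in PRIVILEGED:
        raise PrivilegedOp(op)
    if op == "SET":            # ("SET", reg, value)
        regs[instr[1]] = instr[2]
    elif op == "ADD":          # ("ADD", reg, amount)
        regs[instr[1]] += instr[2]

class HostVM:
    def __init__(self):
        self.disk_file = {}    # stands in for a disk file on the host filesystem
        self.window = []       # stands in for a window on the host desktop

    def emulate(self, instr, regs):
        """The second part of the VM: replace the privileged operation."""
        op = instr[0]
        if op == "DISK_WRITE":     # ("DISK_WRITE", block_reg, data)
            self.disk_file[regs[instr[1]]] = instr[2]
        elif op == "VIDEO_WRITE":  # ("VIDEO_WRITE", text)
            self.window.append(instr[1])

    def run_guest_handler(self, code):
        regs = {}
        for instr in code:
            try:
                execute_in_user_mode(instr, regs)
            except PrivilegedOp:
                self.emulate(instr, regs)   # then continue with the next instruction
        return regs

# A made-up guest trap handler: some ordinary work, then privileged operations.
guest_trap_handler = [
    ("SET", "block", 6),
    ("ADD", "block", 1),
    ("DISK_WRITE", "block", "saved data"),
    ("VIDEO_WRITE", "done"),
]
vm = HostVM()
regs = vm.run_guest_handler(guest_trap_handler)
print(vm.disk_file, vm.window)
```

From the guest's point of view the disk write and the screen update both "happened"; only the host knows they landed in a file and a window.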
Python
def collatz(n):
    count = 0
    while n != 1:
        if n % 2 == 0: n = n // 2
        else: n = 3*n + 1
        count += 1
    return count
index versus value: lab 3
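vals = list(map(collatz, range(1, 1000)))   # start at 1: collatz(0) never terminates
max(vals)
vals.index(178)   # an index into vals, not n itself: vals[0] is collatz(1)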
map (function, list)
filter (function, list)
[x for x in list if function(x)]
reduce(function(x,y), list)
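A quick sketch of the four forms above on a small list, for Python 3. The helper functions are made up for illustration. In Python 3, map and filter return iterators, so they are wrapped in list() here, and reduce has to be imported from functools.

```python
from functools import reduce   # reduce is a builtin only in Python 2

def square(x):
    return x * x

def is_even(x):
    return x % 2 == 0

def add(x, y):
    return x + y

nums = [1, 2, 3, 4, 5, 6]

squares = list(map(square, nums))          # apply square to each element
evens = list(filter(is_even, nums))        # keep elements where is_even is true
evens2 = [x for x in nums if is_even(x)]   # same thing as a list comprehension
total = reduce(add, nums)                  # ((((1+2)+3)+4)+5)+6

print(squares, evens, evens2, total)
```

Note that reduce takes a function of two arguments: it combines the running result with the next list element.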
face(): returns a LIST of drawing pieces? Each drawn with x.draw(w)?
for x in L: x.draw(w)
Here's an example:
from graphics import *

def face(p):
    leyep = p.clone()
    leyep.move(-40, -20)
    reyep = p.clone()
    reyep.move(40, -20)
    nosep = p.clone()
    nosep.move(0,5)
    m1 = p.clone()
    m1.move(-35, 45)
    m2 = p.clone()
    m2.move(35, 45)
    mc = p.clone()
    mc.move(0,60)
    pieces = []
    pieces = pieces + [Circle(p, 100)]
    pieces = pieces + [Circle(leyep, 10)]
    pieces = pieces + [Circle(reyep, 10)]
    pieces = pieces + [Circle(nosep, 5)]
    pieces = pieces + [Line(m1, mc)]
    pieces = pieces + [Line(mc, m2)]
    return pieces

w = GraphWin("pld", 400, 400)
for x in face(Point(200, 200)): x.draw(w)
Word-Count example
How many unique words are there in, say Shakespeare's Sonnet 18?
Shall I compare thee to a summer's day
Thou art more lovely and more temperate
Rough winds do shake the darling buds of May
And summer's lease hath all too short a date
Sometime too hot the eye of heaven shines
And often is his gold complexion dimmed
And every fair from fair sometime declines,
By chance, or nature's changing course untrimmed
But thy eternal summer shall not fade
Nor lose possession of that fair thou ow'st
Nor shall death brag thou wander'st in his shade
When in eternal lines to time thou grow'st
So long as men can breathe, or eyes can see
So long lives this, and this gives life to thee
Counting the words themselves is pretty easy, but how do we detect duplicates?
- Read the entire file into the string text:
    text = open("sonnet18", 'r').read()
- Convert to lower case:
    text = text.lower()
- convert those pesky newlines to spaces (slightly trickier):
    text = text.replace('\n', ' ')
- split into individual words magically:
    words = text.split()
You can try this with the sonnet above, and see what words is. Here's all this rolled into one function:
def getwords():
    text = open("sonnet18", 'r').read()
    text = text.lower()
    text = text.replace('\n', ' ')    # replace newlines with spaces
    words = text.split()              # split into words
    return words
Next, we create a dictionary:
wcounts = {}
Now we'll write a function to add all the words to the dictionary:
def addwords(wlist):
    for w in wlist:
        if w in wcounts:
            wcounts[w] += 1
        else:
            wcounts[w] = 1
We can then fill the dictionary with
w = getwords()
addwords(w)
Demo: type "wcounts" to see the dictionary
Here's a simple function to print those entries in the dictionary for which the count is >= threshold:
def printcounts(thresh):
    for w in wcounts:
        if wcounts[w] >= thresh: print("%s %d" % (w, wcounts[w]))
Demo: try it.
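One thing the steps above don't handle: some lines of the sonnet contain commas, so "this," and "this" would be counted as different words. Here is a self-contained sketch for Python 3 that also strips punctuation from the ends of each word. It reads from a string rather than a file so it can be run as is; the SAMPLE text is just the last two lines of the sonnet, and the function names are only suggestions.

```python
import string

SAMPLE = """So long as men can breathe, or eyes can see
So long lives this, and this gives life to thee"""

def getwords(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = []
    for w in text.lower().split():
        w = w.strip(string.punctuation)   # "this," -> "this"; "ow'st" is unchanged
        if w:
            words.append(w)
    return words

def countwords(wlist):
    """Build a dictionary mapping each word to its number of occurrences."""
    counts = {}
    for w in wlist:
        counts[w] = counts.get(w, 0) + 1
    return counts

counts = countwords(getwords(SAMPLE))

# Print words appearing at least twice, most frequent first.
for w in sorted(counts, key=lambda w: (-counts[w], w)):
    if counts[w] >= 2:
        print(w, counts[w])
```

The number of unique words is then just len(counts). Apostrophes inside a word are kept, since strip only removes characters from the two ends.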