Comp 150, Dordal, March 17, 2006 (St Patrick's Day)
Goals:
Try to write a function that does the above steps, taking the filename as a parameter (maybe) and returning words. You don't have to do this, but if you don't, you have to show me your results rather than email them. The function would look like:
def getwords(fname):
    text = open(fname, 'r').read()
    ...
    return words
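The steps referred to above are not reproduced here, so the following is only a sketch of what getwords might look like. It assumes the missing steps are: convert the text to lowercase, replace punctuation with spaces, and split on whitespace. The use of string.punctuation is an assumption, not necessarily the method intended for the lab.

```python
import string

def getwords(fname):
    # Read the whole file into one string, then close the file.
    f = open(fname, 'r')
    text = f.read()
    f.close()
    # ASSUMPTION: the elided steps are lowercase / strip punctuation / split.
    text = text.lower()
    for ch in string.punctuation:
        text = text.replace(ch, ' ')   # punctuation becomes a word separator
    words = text.split()               # split on runs of whitespace
    return words
```

Note that under this assumption an apostrophe also counts as punctuation, so "don't" is split into "don" and "t".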
Now we need to count how often each word occurs, which basically means for each word in the list checking to see if we've seen it before and updating the count. Python dictionaries form a really snazzy way to do this. Here's a simple dictionary example (ready for copy/paste into python); note that dictionaries, unlike most variables, must be created before use.
dict = {}            # CREATE the dictionary
dict["foo"] = 1
dict["bar"] = 2
dict["baz"] = 1
dict["foo"] += 2     # increment "foo"'s count
Try this and then type dict and see what it looks like: the words are keys allowing the lookup of numeric values.
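A few more things you can do with a dictionary once it exists are sketched below. This version uses the name d rather than dict, only to avoid hiding Python's built-in dict type; that renaming is a choice made for this sketch, not part of the example above.

```python
d = {}                 # create an empty dictionary
d["foo"] = 1
d["bar"] = 2
d["baz"] = 1
d["foo"] += 2          # "foo" already exists, so += works

foo_count = d["foo"]   # look up a value by its key: 3
has_qux = "qux" in d   # test for a key without raising an error: False
n_keys = len(d)        # number of distinct keys: 3
```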
We'll call the dictionary of wordcounts counts. To add new words to counts and increment existing words, the following works (where w is the word in question):
if w in counts:
    counts[w] += 1     # increment; same as counts[w] = counts[w] + 1
else:
    counts[w] = 1      # create new entry
You can't use counts[w]+=1 if counts[w] isn't already present because the "+=" incrementing operator will need the pre-existing value and there isn't one in this case.
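To see the failure for yourself, the sketch below tries += on a missing key and catches the resulting KeyError. It also shows one alternative to the if/else test, the dictionary get method, which returns a default value when the key is absent. The word list is made-up sample data.

```python
counts = {}

# += on a missing key fails: Python must read counts["spam"] first.
try:
    counts["spam"] += 1
    failed = False
except KeyError:
    failed = True          # this branch runs; counts is still empty

# Alternative to if/else: get() returns its second argument (here 0)
# when the key is not in the dictionary.
for w in ["spam", "eggs", "spam"]:
    counts[w] = counts.get(w, 0) + 1
```

Either form is fine for the lab; the if/else version makes the two cases explicit.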
Do the above for each word w in words. Hint: use a for w in words: loop.
If you make this into a function, it will start like
def dictify(words):
    counts = {}        # create the dictionary
    for w in words:
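One way the finished function might look is sketched here, combining the loop with the if/else test from above. Returning counts at the end is an assumption about how the function is meant to be used, and the sample list is hypothetical.

```python
def dictify(words):
    counts = {}                # create the dictionary
    for w in words:
        if w in counts:
            counts[w] += 1     # seen before: increment
        else:
            counts[w] = 1      # first occurrence: create new entry
    return counts              # ASSUMPTION: hand the dictionary back to the caller

# Hypothetical sample data, only to show the call.
sample = ["the", "cat", "and", "the", "hat"]
counts = dictify(sample)       # counts["the"] is 2, the others are 1
```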
Now we have to analyze the dictionary. It's moderately large. You can print it nicely with
for w in counts:
    print w, counts[w]
Other things you can do:
for w in counts:
    if counts[w]>=k:
        print w, counts[w]
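In the same spirit, here is a sketch that keeps only the words occurring at least k times and lists them with the most frequent first. The function name frequent_words and the sample counts are hypothetical, chosen just for this illustration. The print line is written so that it behaves the same whether print is a statement or a function.

```python
def frequent_words(counts, k):
    # Collect (word, count) pairs for words seen at least k times.
    pairs = []
    for w in counts:
        if counts[w] >= k:
            pairs.append((w, counts[w]))
    # Sort by count, largest first; ties are broken alphabetically.
    pairs.sort(key=lambda p: (-p[1], p[0]))
    return pairs

# Hypothetical counts, only to show the call.
counts = {"the": 5, "cat": 2, "hat": 2, "and": 1}
for w, n in frequent_words(counts, 2):
    print("%s %d" % (w, n))
```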
Email me your python file, or, if you do everything at the console, just show me your final steps.
Much of this lab comes from Zelle's book, pp 370-373