Due: Friday April 23
For this project we will return to analyzing the pcap file of project 3, project3.pcap, except this time we'll do it with Python. We'll also use my packet.py library for reading packet headers.
You will also have to install the python-libpcap library. See pypi.org/project/python-libpcap. To install it on your Lubuntu virtual machine, execute the following as root:
However, you can also install it natively on your machine (see the website for instructions) and do this assignment without your virtual machine.
This library supports real-time capture from interfaces, but we're just going to read from files.
What you are to do is to answer the following:
As an example of how to get started, here is connection1.py. It goes through the pcap file. For each TCP packet, it extracts the socketpair (localaddr, localport, remoteaddr, remoteport), and uses this (along with the packet's direction) as a key to the dictionary CONNECTIONDICT. Corresponding to each key is a list of all packets of that connection, in that direction. The final output is to print the number of packets in the upstream direction for each TCP connection.
Here is the basic scan through the pcap file:
    def process_packets(fname):
        sum = 0
        count = 0
        for length, time, pktbuf in rpcap(fname):
            # here is where we examine each packet
            process_one_pkt(length, time, pktbuf, ETHHDRLEN)
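To make the scan less of a black box, here is a minimal standard-library sketch of what reading a capture file involves. It is not the python-libpcap implementation: read_pcap is a hypothetical stand-in that assumes the classic pcap format (not pcapng) with microsecond timestamps, and it only mirrors the (length, time, pktbuf) order that the loop above unpacks. The exact values rpcap yields for length and time may differ.

```python
import struct

GLOBAL_HDR_LEN = 24   # classic pcap file header: magic, version, zone, sigfigs, snaplen, linktype

def read_pcap(fname):
    """Yield (length, time, pktbuf) for each record of a classic pcap file.

    Hypothetical stand-in for rpcap(); 'length' here is the captured length.
    """
    with open(fname, "rb") as f:
        hdr = f.read(GLOBAL_HDR_LEN)
        if len(hdr) < GLOBAL_HDR_LEN:
            raise ValueError("file too short to be a pcap file")
        # The magic number tells us the byte order the file was written in.
        magic = struct.unpack("<I", hdr[:4])[0]
        if magic == 0xA1B2C3D4:
            endian = "<"
        elif magic == 0xD4C3B2A1:
            endian = ">"
        else:
            raise ValueError("not a classic microsecond-resolution pcap file")
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len.
        rec = struct.Struct(endian + "IIII")
        while True:
            rhdr = f.read(rec.size)
            if len(rhdr) < rec.size:
                break                      # clean end of file
            ts_sec, ts_usec, incl_len, orig_len = rec.unpack(rhdr)
            pktbuf = f.read(incl_len)
            if len(pktbuf) < incl_len:
                break                      # truncated final record
            yield incl_len, ts_sec + ts_usec / 1e6, pktbuf
```

Under those assumptions, `for length, time, pktbuf in read_pcap("project3.pcap"):` could be used in place of the rpcap() loop when the library is not available.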
Here is the process_one_pkt() function. Note how headers are extracted, and how packets are entered into CONNECTIONDICT.
    def process_one_pkt(length, time, pktbuf : bytes, startpos):
        global CONNECTIONDICT
        ethh = ethheader.read(pktbuf, 0)
        if ethh.ethtype != 0x0800: return None    # ignore non-ipv4 packets
        iph = ip4header.read(pktbuf, ETHHDRLEN)
        if not iph: return    # returns None if it doesn't look like an IPv4 packet
        if iph.proto == UDP_PROTO:
            udph = udpheader.read(pktbuf, ETHHDRLEN + iph.iphdrlen)
            # if udph.dstport == 53: print('DNS packet')
            return
        if iph.proto != TCP_PROTO: return    # ignore
        tcph = tcpheader.read(pktbuf, ETHHDRLEN + iph.iphdrlen)    # here we *do* allow for the possibility of header options
        if not tcph: return    # Again, tcpheader.read() returns None if it doesn't look like a TCP packet
        datalen = iph.length - iph.iphdrlen - tcph.tcphdrlen    # can't use len(pktbuf) because of tcpdump-applied trailers
        #print (socket.inet_ntoa(srcaddrb), dstport, dlen)
        if iph.srcaddrb == LOCALADDRB:    # source address is local endpoint
            localport = tcph.srcport
            remoteport = tcph.dstport
            remoteaddrb = iph.dstaddrb
            upstream = True
        else:
            localport = tcph.dstport
            remoteaddrb = iph.srcaddrb
            remoteport = tcph.srcport
            upstream = False
        key = (LOCALADDRB, localport, remoteaddrb, remoteport, upstream)
        if key in CONNECTIONDICT:
            CONNECTIONDICT[key].append(pktbuf)
        else:
            CONNECTIONDICT[key] = [pktbuf]
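Once the dictionary is filled, producing the per-connection upstream counts is a matter of walking its keys. The following is only a sketch of one way to do that, not necessarily how connection1.py does it: upstream_counts and the sample dictionary demo are hypothetical, and the addresses and packet contents are made up for illustration.

```python
import socket

def upstream_counts(conndict):
    """Map each socketpair to the number of upstream packets stored for it."""
    counts = {}
    for key, pkts in conndict.items():
        localaddrb, localport, remoteaddrb, remoteport, upstream = key
        if upstream:
            counts[(localaddrb, localport, remoteaddrb, remoteport)] = len(pkts)
    return counts

# Made-up sample data with the same key layout as CONNECTIONDICT:
# (localaddr, localport, remoteaddr, remoteport, upstream) -> list of packets
demo = {
    (b"\x0a\x00\x00\x05", 50000, b"\x5d\xb8\xd8\x22", 80, True):  [b"p1", b"p2", b"p3"],
    (b"\x0a\x00\x00\x05", 50000, b"\x5d\xb8\xd8\x22", 80, False): [b"p4"],
    (b"\x0a\x00\x00\x05", 50001, b"\x5d\xb8\xd8\x22", 443, False): [b"p5"],
}

for (la, lp, ra, rp), n in upstream_counts(demo).items():
    # inet_ntoa turns the 4-byte address into dotted-decimal text
    print(socket.inet_ntoa(la), lp, socket.inet_ntoa(ra), rp, n)
```

Note that a connection for which only downstream packets were captured has no upstream key, so it does not appear in this output at all.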
Note that, because some "trailers" were added by the original packet-capturing software, tcpdump, the length of the packets should be taken from iph.length, rather than from len(pktbuf) or the "length" variable. (The trailer is identified in the .pcap file as "VSS Monitoring Ethernet trailer"; there are also sometimes padding bytes.)
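The effect of a trailer on the length calculation can be seen on a hand-built frame. This sketch does not use packet.py; it reads the relevant header fields directly with struct, and the function tcp_payload_len, the addresses, the ports, and the 6-byte trailer are all made up for illustration. It assumes an untagged Ethernet frame carrying IPv4 and TCP.

```python
import struct

ETHHDRLEN = 14   # Ethernet header length, assuming no VLAN tag

def tcp_payload_len(pktbuf):
    """TCP payload length computed from header fields rather than len(pktbuf)."""
    ver_ihl, _tos, total_len = struct.unpack_from("!BBH", pktbuf, ETHHDRLEN)
    iphdrlen = (ver_ihl & 0x0F) * 4                          # IHL is in 32-bit words
    tcphdrlen = (pktbuf[ETHHDRLEN + iphdrlen + 12] >> 4) * 4 # data offset, also in words
    return total_len - iphdrlen - tcphdrlen

# Build a synthetic frame: Ethernet + IPv4 + TCP + 5 payload bytes + 6 trailer bytes.
payload = b"hello"
eth = b"\x00" * 12 + b"\x08\x00"
ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + 20 + len(payload), 0, 0, 64, 6, 0,
                 b"\x0a\x00\x00\x05", b"\x0a\x00\x00\x06")
tcp = struct.pack("!HHIIBBHHH", 50000, 80, 0, 0, 5 << 4, 0x18, 8192, 0, 0)
frame = eth + ip + tcp + payload + b"\x00" * 6

print(tcp_payload_len(frame))            # 5: the real payload
print(len(frame) - ETHHDRLEN - 20 - 20)  # 11: the trailer is wrongly counted as data
```

The second figure is what a calculation based on len(pktbuf) would give; it overstates the payload by exactly the trailer length.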
(This program reports 1793 connections. Wireshark reports 1799 (Statistics => Conversations => TCP tab). Why the discrepancy?)
Turn in your Python program or programs, and a brief summary of your answers to 1-5 above. Tell me which program was used to answer which question!