Project 3: small HTTP packets

Here are some HTTP-only websites (as of March 2021). The numbers represent their rank at one point in a ranking of top websites. The data originally came from whynohttps.com.

11.   baidu.com
61.   xinhuanet.com
80.   apache.org
102.   babytree.com
123.   tianya.cn
129.   go.com
153.   gnu.org
160.   soso.com
166.   china.com.cn
280.   drudgereport.com
295.   nginx.org
341.   washington.edu
348.   thestartmagazine.com
365.   rlcdn.com
477.   chinadaily.com.cn
494.   yimg.com
525.   gmw.cn
526.   eastday.com
537.   eepurl.com

I loaded each site with Chrome, and recorded all the traffic at my router, 123,962 packets in all. That file is here as project3.pcap. This file can be opened with Wireshark to see the packets. (There's a small sample of the data in first20.pcap, which might be easier to start with, but be aware that it is very incomplete.)

Both data and ACK packets are shown. Generally speaking, I would expect that, for HTTP connections, almost all the data would be in the downstream packets (to the client), and the upstream packets (from the client) would be ACKs only.

But now let's use the Wireshark Statistics => Packet Lengths option. By default that covers all packets in either direction, so let's look only at packets with destination equal to my router, which is 192.168.1.10. I do that by entering a Wireshark display filter into the box at the bottom. While I'm at it, I restrict attention to TCP packets involving port 80 at either end (I could specify tcp.srcport instead if I wanted to match only the source port):

ip.dst == 192.168.1.10 and tcp.port == 80

This is what we get:

==================================================================================================================================
Packet Lengths:
Topic / Item       Count         Average       Min val       Max val       Rate (ms)     Percent       Burst rate    Burst start 
----------------------------------------------------------------------------------------------------------------------------------
Packet Lengths     25490         1310.07       62            1468          0.0911        100%          1.1600        20.461      
 0-19              0             -             -             -             0.0000        0.00%         -             -           
 20-39             0             -             -             -             0.0000        0.00%         -             -           
 40-79             1947          66.74         62            78            0.0070        7.64%         0.4500        24.556      
 80-159            180           124.47        80            159           0.0006        0.71%         0.1800        29.609      
 160-319           169           235.17        160           317           0.0006        0.66%         0.0500        10.894      
 320-639           440           476.20        320           638           0.0016        1.73%         0.1000        58.891      
 640-1279          516           927.21        641           1275          0.0018        2.02%         0.0800        28.895      
 1280-2559         22238         1462.07       1280          1468          0.0795        87.24%        0.9800        19.605      
 2560-5119         0             -             -             -             0.0000        0.00%         -             -           
 5120 and greater  0             -             -             -             0.0000        0.00%         -             -           

----------------------------------------------------------------------------------------------------------------------------------

Now we are down to 25,490 packets. About 87% of them are in the size group 1280-2559 (for all intents and purposes this is 1280-1514, since a standard Ethernet frame cannot be larger than that; the largest one here is 1468). This is what I expect.
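As a rough cross-check of the table, here is a minimal Python sketch of the bucketing that the Packet Lengths window appears to use. The bucket boundaries are copied from the table above; the function names (bucket_label, histogram, percent) are made up for this illustration and are not part of Wireshark.

```python
# Bucket lower bounds and labels, copied from the Wireshark table above.
BUCKETS = [
    (0, "0-19"), (20, "20-39"), (40, "40-79"), (80, "80-159"),
    (160, "160-319"), (320, "320-639"), (640, "640-1279"),
    (1280, "1280-2559"), (2560, "2560-5119"), (5120, "5120 and greater"),
]


def bucket_label(frame_len):
    """Return the label of the bucket that a frame length falls into."""
    label = BUCKETS[0][1]
    for lower, name in BUCKETS:
        if frame_len >= lower:
            label = name
    return label


def histogram(frame_lengths):
    """Count how many frame lengths land in each bucket."""
    counts = {name: 0 for _, name in BUCKETS}
    for n in frame_lengths:
        counts[bucket_label(n)] += 1
    return counts


def percent(count, total):
    """Share of the total, as a percentage rounded to two places."""
    return round(100.0 * count / total, 2)


if __name__ == "__main__":
    # Counts taken from the table: 1947 and 22238 out of 25490 packets.
    print(percent(1947, 25490), percent(22238, 25490))
```

Feeding the counts from the table into percent() is a quick way to confirm that the two groups discussed here account for nearly all of the inbound packets.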

But 7.64% are in the size range 40-79. What are these small packets? ACKs? If so, what data is being sent from the client?

Your assignment is to try to figure out what all these smaller packets are for. Use as many of Wireshark's features as possible. Because this is HTTP (not HTTPS) traffic, you can observe the actual data.

You can view just these packets by adding the following to the previous display filter (joined with and):

and frame.len >= 40 and frame.len < 80
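Whether a small frame is a bare ACK comes down to whether it carries any TCP payload. The arithmetic can be sketched in Python as below. This assumes Ethernet II framing with a 14-byte header and no VLAN tag, and it ignores any Ethernet padding; the function name tcp_payload_len and its defaults (a 20-byte IPv4 header and a 20-byte TCP header, i.e. no options) are illustrative choices, not values read from the capture. Wireshark reports the same quantity per packet in its tcp.len field, which is probably the easier thing to use in practice.

```python
ETHERNET_HEADER_LEN = 14  # assumption: Ethernet II framing, no VLAN tag


def tcp_payload_len(frame_len, ip_header_len=20, tcp_header_len=20):
    """Bytes of TCP payload left after subtracting the three headers.

    The defaults describe headers without options; TCP options such as
    timestamps make the TCP header longer (up to 60 bytes).
    """
    return frame_len - ETHERNET_HEADER_LEN - ip_header_len - tcp_header_len


if __name__ == "__main__":
    # Hypothetical example: a 66-byte frame whose TCP header is 32 bytes
    # long (20 bytes plus 12 bytes of options) has no room for payload.
    print(tcp_payload_len(66, tcp_header_len=32))
```

A result of zero means the frame holds headers only, so it cannot be carrying HTTP data in either direction.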

Small ACK-sized packets inbound, as identified by the above filter, suggest large packets outbound. It might be better to look explicitly for large packets outbound; the following filter will work (adjust the frame.len threshold as appropriate):

ip.src == 192.168.1.10 and tcp.port == 80 and frame.len >= 500

Look at a few such packets, follow the TCP connection with the tools below, and report on what relatively large data the browser was sending. Try to make sure you don't include the same connection twice (different connections will have different ip.dst / port combinations), and be aware that Wireshark resets the main filter every time you trace a connection.
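To avoid counting the same connection twice, it can help to group the large outbound packets by connection outside Wireshark. The following is a minimal Python sketch, not a replacement for the Wireshark workflow. It assumes the capture is in the classic pcap format with microsecond timestamps (not pcapng), with Ethernet II framing and IPv4; the helper names read_pcap_frames and large_outbound_by_connection are invented for this example.

```python
import socket
import struct
from collections import defaultdict

CLIENT_IP = "192.168.1.10"  # address used in the display filters above
MIN_FRAME_LEN = 500         # same adjustable threshold as the filter


def read_pcap_frames(data):
    """Yield (original_length, frame_bytes) from classic pcap file contents."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled here)")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(data):
        _sec, _usec, incl_len, orig_len = struct.unpack_from(
            endian + "IIII", data, offset)
        offset += 16
        yield orig_len, data[offset:offset + incl_len]
        offset += incl_len


def large_outbound_by_connection(data, client_ip=CLIENT_IP,
                                 min_len=MIN_FRAME_LEN):
    """Count large port-80 frames sent by the client, per connection.

    A connection is keyed by (source port, destination IP, destination port).
    """
    counts = defaultdict(int)
    for orig_len, frame in read_pcap_frames(data):
        if orig_len < min_len or len(frame) < 34:
            continue
        if frame[12:14] != b"\x08\x00":   # EtherType 0x0800 = IPv4
            continue
        ihl = (frame[14] & 0x0F) * 4      # IPv4 header length in bytes
        if frame[23] != 6:                # IP protocol 6 = TCP
            continue
        tcp_start = 14 + ihl
        if len(frame) < tcp_start + 4:    # truncated before the TCP ports
            continue
        src_ip = socket.inet_ntoa(frame[26:30])
        dst_ip = socket.inet_ntoa(frame[30:34])
        src_port, dst_port = struct.unpack_from("!HH", frame, tcp_start)
        if src_ip == client_ip and 80 in (src_port, dst_port):
            counts[(src_port, dst_ip, dst_port)] += 1
    return dict(counts)


if __name__ == "__main__":
    with open("project3.pcap", "rb") as f:
        for conn, n in sorted(large_outbound_by_connection(f.read()).items()):
            print(conn, n)
```

Each key printed is one distinct connection, so picking packets from different keys guarantees they belong to different connections. If the file turns out to be pcapng, the sketch raises an error rather than guessing.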

Here are some useful tools (aside from the creative use of display filters):

One approach is to try ~10 packets at random, and see what kind of traffic flow they are part of.

To submit, write up a report discussing your analysis.