WUMP: Windowing UDP Message Protocols
My UDP Message Protocol comes in three flavors: BUMP, HUMP, and CHUMP.
BUMP: Basic UDP Message Protocol
HUMP: Handoff UDP Message Protocol
CHUMP: Cookie-handling UDP Message Protocol
As a group, these are known as WUMP: Windowing UDP Message Protocol.
These are three related protocols for requesting and downloading a file
from a server. HUMP and CHUMP are basically theoretical attempts to
resolve obscure technical issues; the practical protocol for
implementation is BUMP, and WUMP without further clarification usually
refers to BUMP. All three use sliding windows; a window size of 1
(that is, stop-and-wait) is also supported if the client requests it.
Every packet begins with two bytes for PROTOCOL and OPCODE. PROTOCOL is
1, 2, or 3 respectively for BUMP, HUMP, or CHUMP; OPCODEs are
    REQ             1
    DATA            2
    ACK             3
    ERROR/CANCEL    4
    HANDOFF         5 (HUMP only)
These will generally be defined in an include/import file.
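As a rough sketch of what such an import file might contain in Python, using the numeric values listed above. The helper parse_header and the tuples PROTOCOLS and OPCODES are hypothetical names introduced here for illustration; they are not part of the protocol definition.

```python
# Protocol numbers: the first byte of every packet.
BUMP, HUMP, CHUMP = 1, 2, 3

# Opcodes: the second byte of every packet.
REQ, DATA, ACK, ERROR, HANDOFF = 1, 2, 3, 4, 5

PROTOCOLS = (BUMP, HUMP, CHUMP)
OPCODES = (REQ, DATA, ACK, ERROR, HANDOFF)


def parse_header(packet):
    """Return (protocol, opcode) from the two-byte header, or raise ValueError.

    Hypothetical helper, not part of the protocol definition.
    """
    if len(packet) < 2:
        raise ValueError("packet too short for a WUMP header")
    protocol, opcode = packet[0], packet[1]
    if protocol not in PROTOCOLS:
        raise ValueError("unknown protocol %d" % protocol)
    if opcode not in OPCODES:
        raise ValueError("unknown opcode %d" % opcode)
    if opcode == HANDOFF and protocol != HUMP:
        raise ValueError("HANDOFF is valid only in HUMP")
    return protocol, opcode
```

A receiver could call parse_header on every incoming datagram before looking at any other field.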
Each protocol begins with the client sending a REQ packet to the
server, at port SERVERPORT. As detailed below, the REQ packet contains
both filename and window size. Check the current header/import
files for the numeric value of SERVERPORT.
The server will not transfer
the file from SERVERPORT; the problem is that if multiple simultaneous
transfers are in progress then the server would have a considerable
amount of bookkeeping to do. Instead, the server will create a
child process and will "hand off" the transfer to that process, which
will send the data from some new port CHILDPORT. The main difference
between the three protocols here is in how this handoff is
accomplished. The client never changes its port number; each separate
client instance has a (hopefully) new, unique port, and so there is no
conflict.
The third protocol, CHUMP, contains a "cookie" sent by the
client in its initial REQ and echoed back by the server in every
subsequent packet. The cookie fields don't appear at all in the other
protocols. The CHUMP here is more properly CHUMP8, in that it sends an
8-byte cookie; if 4-byte cookies were desired then a new protocol
CHUMP4 would be defined to support that packet layout. CHUMP16, with
16-byte cookies, would be another possibility. By fixing the cookie size
for each protocol, rather than making it variable and specified in the
request packet, we ensure that DATA always begins at the same offset for
a given protocol. We use
COOKIESIZE below as if it were a constant, nominally 8; separate packet
layouts would have to be defined for other sizes.
Packet layouts
REQ packets:
protocol 1 byte
opcode 1 byte
winsize 2 bytes: size of offered receive window
cookie COOKIESIZE bytes, CHUMP only!
filename N bytes (stored with the NUL below as a C string)
NUL byte 1 byte
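The REQ layout above might be assembled as in the following sketch. Two things here are assumptions, not stated in the layout: multi-byte fields are sent in network (big-endian) byte order, and the filename is encoded as ASCII. The function name build_req is hypothetical.

```python
import struct

BUMP, HUMP, CHUMP = 1, 2, 3
REQ = 1
COOKIESIZE = 8


def build_req(protocol, winsize, filename, cookie=b""):
    """Build a REQ packet: protocol, opcode, winsize, [cookie], filename, NUL.

    Assumptions: network byte order for winsize, ASCII filename.
    """
    if protocol == CHUMP:
        if len(cookie) != COOKIESIZE:
            raise ValueError("CHUMP requires a COOKIESIZE-byte cookie")
    elif cookie:
        raise ValueError("only CHUMP packets carry a cookie")
    name = filename.encode("ascii")
    if b"\0" in name:
        raise ValueError("filename may not contain NUL")
    # "!BBH": 1-byte protocol, 1-byte opcode, 2-byte winsize, no padding.
    header = struct.pack("!BBH", protocol, REQ, winsize)
    return header + cookie + name + b"\0"
```

For example, build_req(BUMP, 4, "foo.txt") yields a 12-byte packet: a 4-byte header, the 7-byte filename, and the terminating NUL.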
DATA packets hold a 4-byte sequence number (blocknum) and up to DATASIZE = 512 bytes of data:
protocol 1 byte
opcode 1 byte
pad 2 bytes; used for alignment only.
blocknum 4 bytes
cookie COOKIESIZE bytes; CHUMP only!
data DATASIZE bytes, or less, up to length of packet
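A minimal sketch of packing and unpacking this DATA layout follows. It assumes network (big-endian) byte order for blocknum, which the layout does not specify, and build_data and parse_data are hypothetical names. Because the cookie size is fixed per protocol, the data offset is 8 bytes for BUMP and HUMP and 8 + COOKIESIZE bytes for CHUMP.

```python
import struct

BUMP, HUMP, CHUMP = 1, 2, 3
DATA = 2
DATASIZE = 512
COOKIESIZE = 8
HEADER = "!BBHI"          # protocol, opcode, pad, blocknum (8 bytes, assumed big-endian)


def build_data(protocol, blocknum, payload, cookie=b""):
    """Build a DATA packet; the pad field is always sent as zero."""
    if len(payload) > DATASIZE:
        raise ValueError("payload longer than DATASIZE")
    if len(cookie) != (COOKIESIZE if protocol == CHUMP else 0):
        raise ValueError("cookie must be COOKIESIZE bytes for CHUMP, empty otherwise")
    return struct.pack(HEADER, protocol, DATA, 0, blocknum) + cookie + payload


def parse_data(packet):
    """Return (protocol, blocknum, cookie, payload) from a DATA packet."""
    protocol, opcode, _pad, blocknum = struct.unpack_from(HEADER, packet, 0)
    if opcode != DATA:
        raise ValueError("not a DATA packet")
    offset = struct.calcsize(HEADER)
    cookie = b""
    if protocol == CHUMP:
        cookie = packet[offset:offset + COOKIESIZE]
        offset += COOKIESIZE
    # The payload is whatever remains, up to the length of the packet.
    return protocol, blocknum, cookie, packet[offset:]
```

An ACK has the same header and cookie fields with no payload, so the same format string would apply with the ACK opcode.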
ACK packets are like DATA, but with no data field:
protocol 1 byte
opcode 1 byte
pad 2 bytes; used for alignment only.
blocknum 4 bytes
cookie COOKIESIZE bytes; CHUMP only!
ERROR packets support error and status reporting and cancellation.
protocol 1 byte
opcode 1 byte
errorcode 2 bytes
cookie COOKIESIZE bytes; CHUMP only!
HANDOFF packets are used by the server in reporting to the client the new port
it will use. Note that a HANDOFF packet, like any other, may be lost.
protocol 1 byte
opcode 1 byte
newport 2 bytes
BUMP: Basic UDP Message Protocol
Client sends REQ to SERVERPORT. Server responds from CHILDPORT with the
first windowful of data. Client first learns of the new CHILDPORT when
this windowful of data arrives; note that the client should be prepared
to "latch on" to CHILDPORT if it sees any of the data in the first window.
    client                      server
      |                           |
      | --REQ-to-SERVERPORT-->    |
      |                           |
      | <-DATA-from-CHILDPORT--   |
      |                           |
PROBLEMS:
When the client sees data coming in from CHILDPORT, it has no explicit
assurance that this data is actually in response to the REQ. The
"handoff" is unauthenticated, so to speak.
If the client sends two REQs, then the server will start up two
CHILDPORTs. Whichever is the first to get data to the client will be
the port chosen for the transfer; the protocol will need a mechanism
for cancelling the transfer on the other port. The simplest such
protocol is for the client to send a CANCEL message to any port from
which it receives data, other than the port it initially "latches on"
to.
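The latch-on and cancel rule just described could be sketched as a small piece of client state. The class name BumpLatch and the action strings are hypothetical; ignoring packets from hosts other than the server is an added assumption, not something the protocol text requires.

```python
class BumpLatch:
    """Tracks which server port the client has latched on to (hypothetical helper)."""

    def __init__(self, server_host):
        self.server_host = server_host
        self.child_port = None      # unknown until the first DATA arrives

    def classify(self, src_host, src_port):
        """Decide what to do with a DATA packet from (src_host, src_port).

        Returns "accept" (process the data), "cancel" (send a CANCEL to the
        sender), or "ignore" (assumption: drop packets from other hosts).
        """
        if src_host != self.server_host:
            return "ignore"
        if self.child_port is None:
            # First DATA seen: latch on to its source port as CHILDPORT.
            self.child_port = src_port
            return "accept"
        if src_port == self.child_port:
            return "accept"
        # DATA from a second child process, e.g. after a duplicate REQ.
        return "cancel"
```

The caller would send an ERROR/CANCEL packet to the source address whenever classify returns "cancel".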
HUMP: Handoff UDP Message Protocol
Client sends REQ to SERVERPORT. Server sets up CHILDPORT, and then
sends a HANDOFF packet from SERVERPORT which contains the new CHILDPORT.
The client then sends an ACK[0] to this CHILDPORT to start the data.
The advantage is that the client receives clear instructions as to
where the data is to come from.
    client                         server
      |                              |
      | -----REQ-to-SERVERPORT-->    |
      |                              |
      | <-HANDOFF-from-SERVERPORT-   |
      |                              |
      | --ACK[0]-to-CHILDPORT-->     |
      |                              |
      | <--DATA-from-CHILDPORT---    |
      |                              |
PROBLEMS:
If the HANDOFF is lost, the child process waits forever on CHILDPORT for
the ACK[0]. The server will need to be sure the child times out promptly.
When the REQ is eventually
resent, then the server will create a *new* child process and CHILDPORT
to handle what it sees as a new request.
The HANDOFF is not to be
retransmitted on timeout. If it were, then if the REQ were
retransmitted we'd again be in the situation with two separate child
processes on two separate CHILDPORTs sending HANDOFFs, and the client
would have to accept the first and CANCEL the second as in BUMP.
Note that if the REQ is lost the server never sees anything and so never creates any child processes.
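On the client side, the HUMP handoff step might look like the following sketch: parse the HANDOFF, then prepare the ACK[0] that starts the data flowing from CHILDPORT. Network byte order is an assumption, and handle_handoff is a hypothetical name.

```python
import struct

HUMP = 2
ACK, HANDOFF = 3, 5


def handle_handoff(packet, server_host):
    """Return ((server_host, CHILDPORT), ack0_packet) for a HUMP HANDOFF packet.

    Assumes big-endian fields; raises ValueError for anything that is not
    a well-formed HUMP HANDOFF.
    """
    if len(packet) != 4:
        raise ValueError("HANDOFF packets are exactly 4 bytes")
    protocol, opcode, newport = struct.unpack("!BBH", packet)
    if protocol != HUMP or opcode != HANDOFF:
        raise ValueError("not a HUMP HANDOFF packet")
    # ACK layout: protocol, opcode, 2-byte pad, 4-byte blocknum (0 here).
    ack0 = struct.pack("!BBHI", HUMP, ACK, 0, 0)
    return (server_host, newport), ack0
```

The client would then send ack0 to the returned address and expect DATA from that same address.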
CHUMP: Cookie-Handling UDP Message Protocol
Client sends a REQ to SERVERPORT. The REQ contains a "cookie": some
long hash of, say, the filename+timestamp. The server includes this
cookie in each DATA packet. We now follow the strategy of BUMP;
*however* the presence of the cookie makes it clear that the incoming
DATA from CHILDPORT is coming as a direct result of the REQ sent to the SERVERPORT.
PROBLEMS:
The cookie consumes bandwidth. A 32-bit cookie doesn't consume too much, although 64 bits might be safer.
If someone eavesdrops on the unencrypted cookie, malicious attacks are still possible.
Note also that the server has no way to protect the client from chaos, if the client always chooses the same cookie.
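One way a client might generate and check cookies is sketched below, following the "hash of filename+timestamp" suggestion. The choice of SHA-256 truncated to COOKIESIZE bytes is an assumption made for illustration, as are the names make_cookie and cookie_matches; the cookie offset of 8 applies to DATA and ACK packets, per the layouts above.

```python
import hashlib
import time

COOKIESIZE = 8
COOKIE_OFFSET = 8   # after protocol, opcode, pad, blocknum in DATA and ACK packets


def make_cookie(filename, timestamp=None):
    """Derive a COOKIESIZE-byte cookie from filename and timestamp.

    SHA-256 is an illustrative choice, not mandated by the protocol.
    """
    if timestamp is None:
        timestamp = time.time()
    material = ("%s|%r" % (filename, timestamp)).encode("utf-8")
    return hashlib.sha256(material).digest()[:COOKIESIZE]


def cookie_matches(packet, expected):
    """True if a CHUMP DATA or ACK packet carries the expected cookie."""
    return packet[COOKIE_OFFSET:COOKIE_OFFSET + COOKIESIZE] == expected
```

Because the timestamp changes between requests, two requests for the same file normally get different cookies, which is what lets the client discard packets from an earlier transfer.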
Final packets
All three protocols will use the sliding-window protocol to send data.
The end of the file will be marked by a packet with number of data
bytes less than 512 (the number of data bytes may be zero, but the UDP
portion of the packet will not have size zero, because of the header). This
strategy comes from TFTP.
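The end-of-file rule can be illustrated with a short sketch that splits file contents into DATA payloads. When the file length is an exact multiple of 512, an extra empty payload is needed so that the last packet is still short. The function names are hypothetical.

```python
DATASIZE = 512


def split_into_blocks(contents):
    """Split file contents into payloads; the last one is always < DATASIZE bytes."""
    blocks = [contents[i:i + DATASIZE] for i in range(0, len(contents), DATASIZE)]
    if not blocks or len(blocks[-1]) == DATASIZE:
        # Empty file, or length is a multiple of DATASIZE: add a zero-length
        # payload so the receiver can recognize the end of the file.
        blocks.append(b"")
    return blocks


def is_final(payload):
    """A payload shorter than DATASIZE marks the end of the file."""
    return len(payload) < DATASIZE
```

A 1024-byte file therefore takes three DATA packets: two full ones and a final one with no data bytes.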
Lost-final-ACK problem
Note that if the client sends the final ACK and exits, and this ACK is
lost, then the server will continue to retransmit the final DATA.
The usual solution is to have the client send the final ACK and then
"linger", or "dally", continuing to listen for duplicate DATA. The
client at this point is only prepared to retransmit the ACK; it has
already received all the data. For this reason, and because the linger
period should be several timeout intervals (e.g. 10 seconds), lingering
is best done in the background.
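A lingering client might be sketched as follows, using a standard UDP socket. The function name linger and its parameters are hypothetical, and the header is assumed to be big-endian. The function only re-sends the final ACK when a duplicate of the final DATA arrives from CHILDPORT; it is meant to be run in a background thread after the file has been delivered.

```python
import socket
import struct
import time

DATA = 2


def linger(sock, child_addr, final_ack, final_blocknum, duration=10.0):
    """Re-send final_ack for each duplicate final DATA seen within duration seconds.

    Returns the number of ACKs re-sent. Assumes the 8-byte header
    (protocol, opcode, pad, blocknum) is big-endian.
    """
    resent = 0
    deadline = time.monotonic() + duration
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        sock.settimeout(remaining)
        try:
            packet, addr = sock.recvfrom(2048)
        except socket.timeout:
            break
        if addr != child_addr or len(packet) < 8:
            continue
        _proto, opcode, _pad, blocknum = struct.unpack_from("!BBHI", packet, 0)
        if opcode == DATA and blocknum == final_blocknum:
            sock.sendto(final_ack, addr)
            resent += 1
    return resent
```

To keep it out of the way, the caller could start it with threading.Thread(target=linger, args=(sock, child_addr, final_ack, final_blocknum)).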
Old-late-duplicates problem
Suppose we transfer data via one of the above protocols; the client uses
CPORT and the server uses CHILDPORT. During the transfer, an extra DATA[2]
is generated (for example, by a retransmission) and is delayed in the
network. Then we transfer another file, and happen to choose the same
ports at both ends. Choosing the same ports makes it arguably the same
"connection", so we'll call it a second incarnation of the connection.
During this second transfer, at the point we're expecting DATA[2], we get
the old late DATA[2] from the first incarnation of the connection!
Oops.
Solutions:
We can make sure at least one endpoint chooses a new port number each
time. This is another justification for the use of the server
CHILDPORT; this is a port choice under the server's control.
Note that CHUMP has some additional protection here: packets with the
wrong cookie are presumably from an old incarnation of the connection,
and can be ignored. It is the client's responsibility to choose
non-duplicate cookies, but then the client is the one who receives all
the data (that is, data does not flow bidirectionally), so this isn't
necessarily inappropriate.