WUMP: Windowing UDP Message Protocols
My UDP Message Protocol comes in three flavors: BUMP, HUMP, and CHUMP.
BUMP: Basic UDP Message Protocol
HUMP: Handoff UDP Message Protocol
CHUMP: Cookie-handling UDP Message Protocol
As a group, these are known as WUMP: Windowing UDP Message Protocol. These
are three related protocols for requesting and downloading a file from a
server. HUMP and CHUMP are basically theoretical attempts to resolve obscure
technical issues; the practical protocol for implementation is BUMP, and
WUMP without further clarification usually refers to BUMP. All three support
sliding windows, although a window size of 1 (that is, stop-and-wait) is
supported if the client requests it.
Every packet begins with two bytes for PROTOCOL and OPCODE. PROTOCOL is 1,
2, or 3 respectively for BUMP, HUMP, or CHUMP; OPCODEs are
    REQ             1
    DATA            2
    ACK             3
    ERROR/CANCEL    4
    HANDOFF         5 (HUMP only)
These will generally be defined in an include/import file.
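As an illustration only, such an import file might look like the following Python sketch. The numeric values come from the table above; the names Protocol, Opcode and parse_prefix are hypothetical and not part of the protocol definition.

```python
from enum import IntEnum

class Protocol(IntEnum):
    BUMP = 1
    HUMP = 2
    CHUMP = 3

class Opcode(IntEnum):
    REQ = 1
    DATA = 2
    ACK = 3
    ERROR = 4
    CANCEL = 4     # alias: ERROR and CANCEL share opcode 4
    HANDOFF = 5    # HUMP only

def parse_prefix(packet: bytes):
    """Decode the two-byte PROTOCOL/OPCODE prefix that begins every packet."""
    if len(packet) < 2:
        raise ValueError("packet too short for PROTOCOL and OPCODE")
    # Protocol(...) and Opcode(...) raise ValueError for unknown values.
    return Protocol(packet[0]), Opcode(packet[1])
```

Unknown protocol or opcode bytes raise ValueError, so a receiver can discard malformed packets early.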
Each protocol begins with the client sending a REQ packet to the server, at
port SERVERPORT. As detailed below, the REQ packet contains both filename
and window size. Check the current header/import files for the numeric
value of SERVERPORT.
The server will not transfer the file from SERVERPORT itself: with multiple
simultaneous transfers in progress, the server would have a considerable
amount of bookkeeping to do. Instead, the server will create a child process and
will "hand off" the transfer to that process, which will send the data from
some new port CHILDPORT. The main difference between the three protocols
here is in how this handoff is accomplished. The client never changes its
port number; each separate client instance has a (hopefully) new, unique
port, and so there is no conflict.
The third protocol, CHUMP, contains a "cookie" sent by the client in
its initial REQ and echoed back by the server in every subsequent packet.
The cookie fields don't appear at all in the other protocols. The CHUMP
here is more properly CHUMP8, in that it sends an 8-byte cookie; if 4-byte
cookies were desired then a new protocol CHUMP4 would be defined to
support that packet layout. CHUMP16, with 16-byte cookies, would be
another possibility. By fixing the cookie size for each protocol, rather
than making it variable and specified in the request packet, we ensure that
DATA always begins at the same offset for a given protocol. We use
COOKIESIZE below as if it were a
constant, nominally 8; separate packet layouts would have to be defined
for other sizes.
Packet layouts
REQ packets:
protocol 1 byte
opcode 1 byte
winsize 2 bytes: size of offered receive window
cookie COOKIESIZE bytes, CHUMP only!
filename N bytes (stored with NUL below as a C string)
NUL byte 1 byte
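A minimal sketch of building and parsing a REQ packet with this layout follows. It assumes network (big-endian) byte order for winsize and UTF-8 for the filename; neither is stated above, so check the current header/import files. The helper names build_req and parse_req are hypothetical.

```python
import struct

BUMP, HUMP, CHUMP = 1, 2, 3
REQ = 1
COOKIESIZE = 8

def build_req(protocol, winsize, filename, cookie=b""):
    """Build a REQ packet: protocol, opcode, winsize, [cookie], filename, NUL."""
    if protocol == CHUMP and len(cookie) != COOKIESIZE:
        raise ValueError("CHUMP requires a COOKIESIZE-byte cookie")
    if protocol != CHUMP and cookie:
        raise ValueError("only CHUMP carries a cookie")
    name = filename.encode("utf-8")   # assumption: encoding is not specified
    if b"\0" in name:
        raise ValueError("filename may not contain NUL")
    # "!" = big-endian, no padding (assumed); B = 1 byte, H = 2 bytes
    return struct.pack("!BBH", protocol, REQ, winsize) + cookie + name + b"\0"

def parse_req(packet):
    """Return (protocol, winsize, cookie, filename) from a REQ packet."""
    protocol, opcode, winsize = struct.unpack_from("!BBH", packet)
    if opcode != REQ:
        raise ValueError("not a REQ packet")
    offset = 4
    cookie = b""
    if protocol == CHUMP:
        cookie = packet[offset:offset + COOKIESIZE]
        offset += COOKIESIZE
    end = packet.index(b"\0", offset)  # raises ValueError if NUL is missing
    return protocol, winsize, cookie, packet[offset:end].decode("utf-8")
```

For example, build_req(BUMP, 4, "a.txt") yields a 10-byte packet: 4 header bytes, 5 filename bytes, and the terminating NUL.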
DATA packets will hold a 4-byte sequence number (blocknum), and up to DATASIZE = 512 bytes of data:
protocol 1 byte
opcode 1 byte
pad 2 bytes; used for alignment only.
blocknum 4 bytes
cookie COOKIESIZE bytes; CHUMP only!
data DATASIZE bytes, or less, up to length of packet
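The DATA layout can be sketched the same way. As before, big-endian byte order and a zero-filled pad field are assumptions, and build_data and parse_data are hypothetical helper names. Note that the data field has no length prefix: it is whatever remains of the packet after the header (and cookie, for CHUMP).

```python
import struct

DATA = 2
CHUMP = 3
COOKIESIZE = 8
DATASIZE = 512
HEADER = "!BBHI"   # protocol, opcode, pad, blocknum: 8 bytes, big-endian (assumed)

def build_data(protocol, blocknum, data, cookie=b""):
    """Build a DATA packet carrying at most DATASIZE bytes of data."""
    if len(data) > DATASIZE:
        raise ValueError("data field exceeds DATASIZE")
    if (protocol == CHUMP) != (len(cookie) == COOKIESIZE):
        raise ValueError("cookie must be present for CHUMP, and only for CHUMP")
    return struct.pack(HEADER, protocol, DATA, 0, blocknum) + cookie + data

def parse_data(packet):
    """Return (protocol, blocknum, cookie, data) from a DATA packet."""
    protocol, opcode, _pad, blocknum = struct.unpack_from(HEADER, packet)
    if opcode != DATA:
        raise ValueError("not a DATA packet")
    offset = struct.calcsize(HEADER)
    cookie = b""
    if protocol == CHUMP:
        cookie = packet[offset:offset + COOKIESIZE]
        offset += COOKIESIZE
    return protocol, blocknum, cookie, packet[offset:]
```

An ACK packet under the same assumptions is just the 8-byte header with the ACK opcode (plus the cookie for CHUMP) and no data.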
ACK packets are like DATA, but with no data field:
protocol 1 byte
opcode 1 byte
pad 2 bytes; used for alignment only.
blocknum 4 bytes
cookie COOKIESIZE bytes; CHUMP only!
ERROR packets support error and status reporting and cancellation.
protocol 1 byte
opcode 1 byte
errorcode 2 bytes
cookie COOKIESIZE bytes; CHUMP only!
HANDOFF packets are used by the server to report to the client the new port
it will use. Note that a HANDOFF packet, like any other, may be lost.
protocol 1 byte
opcode 1 byte
newport 2 bytes
BUMP: Basic UDP Message Protocol
Client sends REQ to SERVERPORT. Server responds from CHILDPORT with the
first windowful of data. Client first learns of the new CHILDPORT when this
windowful of data arrives; note that the client should be prepared to "latch
on" to CHILDPORT if it sees any of
the data in the first window.
client                          server
   |                               |
   |                               |
   | ----REQ-to-SERVERPORT---->    |
   |                               |
   | <---DATA-from-CHILDPORT---    |
   |                               |
PROBLEMS:
When the client sees data coming in from CHILDPORT, it has no explicit
assurance that this data is actually in response to the REQ. The "handoff"
is unauthenticated, so to speak.
If the client sends two REQs, then the server will start up two CHILDPORTs.
Whichever is the first to get data to the client will be the port chosen for
the transfer; the protocol will need a mechanism for cancelling the transfer
on the other port. The simplest such protocol is for the client to send a
CANCEL message to any port from which it receives data, other than the port
it initially "latches on" to.
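The latch-on and CANCEL rule just described might be sketched as follows. This is an illustration, not a reference implementation: the class name PortLatch is hypothetical, the errorcode value used for CANCEL is invented for the example (no errorcode values are assigned in this document), and big-endian byte order is assumed.

```python
import struct

ERROR = 4          # ERROR/CANCEL opcode
CANCEL_CODE = 1    # hypothetical errorcode; no values are assigned in the text

class PortLatch:
    """Tracks which server address a BUMP client has latched on to."""

    def __init__(self, protocol=1):
        self.protocol = protocol
        self.child = None          # unknown until the first DATA arrives

    def on_data(self, src_addr):
        """Handle the source address of an incoming DATA packet.

        Returns None if the packet should be accepted, or the bytes of a
        CANCEL packet that should be sent back to src_addr.
        """
        if self.child is None:
            self.child = src_addr  # latch on to the first port we hear from
            return None
        if src_addr == self.child:
            return None
        # DATA from some other port: a duplicate REQ started a second child.
        return struct.pack("!BBH", self.protocol, ERROR, CANCEL_CODE)
```

The client would call on_data with the address returned by recvfrom for every DATA packet, and send any returned CANCEL packet back to that address.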
HUMP: Handoff UDP Message Protocol
Client sends REQ to SERVERPORT. Server sets up CHILDPORT, and then
sends a HANDOFF packet from SERVERPORT which contains the new CHILDPORT.
The client then sends an ACK[0] to this CHILDPORT to start the data.
The advantage is that the client receives clear instructions as to
where the data is to come from.
client                            server
   |                                 |
   | -----REQ-to-SERVERPORT------>   |
   |                                 |
   | <---HANDOFF-from-SERVERPORT--   |
   |                                 |
   | -----ACK[0]-to-CHILDPORT---->   |
   |                                 |
   | <----DATA-from-CHILDPORT-----   |
   |                                 |
PROBLEMS:
If the HANDOFF is lost, the child process on CHILDPORT would wait forever
for an ACK[0] that never arrives, so the server will need to be sure the
child times out promptly. When the REQ is eventually resent, the server will
create a new child process and CHILDPORT to handle what it sees as a new
request.
The HANDOFF is not to be retransmitted on timeout. If it were, then if the
REQ were also retransmitted we'd again be in the situation with two separate
child processes on two separate CHILDPORTs sending HANDOFFs, and the client
would have to accept the first and CANCEL the second as in BUMP.
Note that if the REQ is lost the server never sees anything and so never
creates any child processes.
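The client side of the HUMP handoff can be sketched as below. The function name handle_handoff is hypothetical, and the sketch assumes big-endian byte order, a zero pad field in the ACK, and that CHILDPORT lives on the same host as SERVERPORT.

```python
import struct

HUMP = 2
ACK = 3
HANDOFF = 5

def handle_handoff(packet, src_addr, server_addr):
    """Process a possible HANDOFF packet on the client.

    Returns (child_addr, ack0_packet) if the packet is a valid HANDOFF
    from SERVERPORT, otherwise None.
    """
    if src_addr != server_addr or len(packet) != 4:
        return None                      # HANDOFF must come from SERVERPORT
    protocol, opcode, newport = struct.unpack("!BBH", packet)
    if protocol != HUMP or opcode != HANDOFF:
        return None
    # ACK[0]: protocol, opcode, pad = 0, blocknum = 0
    ack0 = struct.pack("!BBHI", HUMP, ACK, 0, 0)
    return (server_addr[0], newport), ack0
```

The caller sends ack0_packet to child_addr to start the data, and falls back to resending the REQ if no HANDOFF arrives before its timeout.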
CHUMP: Cookie-Handling UDP Message Protocol
Client sends a REQ to SERVERPORT. The REQ contains a "cookie": some long
hash of, say, the filename+timestamp. The server includes this cookie in
each DATA packet. We now follow the strategy of BUMP; *however* the presence
of the cookie makes it clear that the incoming DATA from CHILDPORT is
coming as a direct result of the REQ sent to the SERVERPORT.
PROBLEMS:
The cookie consumes bandwidth. A 32-bit cookie doesn't consume too much,
although 64 bits might be safer.
If someone eavesdrops on the unencrypted cookie, malicious attacks are still
possible.
Note also that the server has no way to protect the client from chaos, if
the client always chooses the same cookie.
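One possible way for a client to choose and check cookies is sketched below. The choice of SHA-256 truncated to COOKIESIZE bytes is an assumption made for illustration (the text above only says "some long hash"), as are the names make_cookie and cookie_matches and the cookie offset of 8 bytes, which follows from the DATA layout given earlier.

```python
import hashlib
import time

COOKIESIZE = 8

def make_cookie(filename, now=None):
    """Derive a COOKIESIZE-byte cookie from filename + timestamp."""
    if now is None:
        now = time.time_ns()   # nanosecond clock, so repeats are unlikely
    digest = hashlib.sha256(f"{filename}|{now}".encode("utf-8")).digest()
    return digest[:COOKIESIZE]

def cookie_matches(packet, expected, offset=8):
    """Check the cookie of a CHUMP DATA or ACK packet.

    offset=8 skips protocol, opcode, pad and blocknum.
    """
    return packet[offset:offset + COOKIESIZE] == expected
```

A hash of filename+timestamp is predictable to anyone who can guess both inputs; drawing the cookie from os.urandom(COOKIESIZE) instead would avoid that, at the cost of departing from the hash suggestion above. Neither choice helps against an eavesdropper who sees the cookie on the wire.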
Final packets
All three protocols will use the sliding-window protocol to send data. The
end of the file will be marked by a packet with number of data bytes less
than 512 (the number of data bytes may be zero, but the UDP portion of the
packet will not have size zero due to the header.) This strategy comes from
TFTP.
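A consequence of this rule is that a file whose length is an exact multiple of 512 needs one extra, empty DATA packet to mark its end. A short sketch, with hypothetical helper names split_blocks and is_final:

```python
DATASIZE = 512

def split_blocks(content: bytes):
    """Split file content into DATA payloads, ending with a short block."""
    blocks = [content[i:i + DATASIZE] for i in range(0, len(content), DATASIZE)]
    # If the file is empty or ends on a block boundary, append an empty
    # block so the receiver still sees a payload shorter than DATASIZE.
    if not blocks or len(blocks[-1]) == DATASIZE:
        blocks.append(b"")
    return blocks

def is_final(data: bytes) -> bool:
    """A DATA payload shorter than DATASIZE marks the end of the file."""
    return len(data) < DATASIZE
```

For instance, a 1024-byte file is sent as payloads of 512, 512 and 0 bytes, while a 600-byte file is sent as 512 and 88 bytes.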
Lost-final-ACK problem
Note that if the client sends the final ACK and exits, and this ACK is lost,
then the server will continue to retransmit the final DATA.
The usual solution is to have the client send the final ACK and then
"linger", or "dally", continuing to listen for duplicate DATA. The client at
this point is only prepared to retransmit the ACK; it has already received
all the data. For this reason, and because the linger period should be
several timeout intervals (e.g. 10 seconds), lingering is best done in the
background.
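A background linger might be sketched with a thread, as below. The function names are hypothetical, and the sketch assumes the 8-byte big-endian DATA/ACK header described earlier and a caller that has already built the final ACK packet.

```python
import socket
import struct
import threading
import time

DATA = 2

def linger(sock, child_addr, final_ack, final_blocknum, duration=10.0):
    """Re-send final_ack whenever the final DATA reappears, then close."""
    deadline = time.monotonic() + duration
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        sock.settimeout(remaining)
        try:
            packet, addr = sock.recvfrom(2048)
        except socket.timeout:
            break                          # linger period is over
        if addr != child_addr or len(packet) < 8:
            continue                       # not from CHILDPORT, or too short
        _proto, opcode, _pad, blocknum = struct.unpack_from("!BBHI", packet)
        if opcode == DATA and blocknum == final_blocknum:
            sock.sendto(final_ack, child_addr)   # our earlier ACK was lost
    sock.close()

def linger_in_background(sock, child_addr, final_ack, final_blocknum,
                         duration=10.0):
    """Run linger() in a thread so the caller can return immediately."""
    # Non-daemon: the process stays alive until the linger period ends.
    t = threading.Thread(target=linger,
                         args=(sock, child_addr, final_ack,
                               final_blocknum, duration))
    t.start()
    return t
```

The thread takes ownership of the socket and closes it when done, so the main program should not reuse that socket after handing it over.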
Old-late-duplicates problem
Suppose we transfer data via one of the above; the client uses CPORT and the
server uses CHILDPORT. During the transfer, an extra DATA[2] is generated
and is delayed in the network. Then, we transfer another file, and happen to
choose the same ports at both ends. Choosing the same ports makes it arguably
the same "connection", so we'll call it a second incarnation of the
connection. During this second transfer, at the point we're expecting
DATA[2], we get the old late DATA[2] from the first incarnation of the
connection!
Oops.
Solutions:
We can make sure at least one endpoint chooses a new port number each time.
This is another justification for the use of the server CHILDPORT; this is a
port choice under the server's control.
Note that CHUMP has some additional protection here: packets with the wrong
cookie are presumably from an old incarnation of the connection, and can be
ignored. It is the client's responsibility to choose non-duplicate cookies,
but then the client is the one who receives all the data (that is, data does
not flow bidirectionally), so this isn't necessarily inappropriate.
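For the first solution above, one common way to obtain a port for each transfer is to bind to port 0 and let the operating system choose. The helper name new_udp_socket is hypothetical. The OS only promises a port that is currently unused, not one that has never been used, so this reduces the chance of a repeated incarnation but does not eliminate it.

```python
import socket

def new_udp_socket(host=""):
    """Create a UDP socket bound to an OS-chosen port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))    # port 0 asks the OS to pick an unused port
    return sock
```

The chosen port is available afterwards as sock.getsockname()[1]; a server child would use this as its CHILDPORT.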