Start with af_inet.c::tcp_protocol structure, with .handler = tcp_v4_rcv, called for inbound packets.
Next is tcp_ipv4.c::tcp_v4_rcv()
This does preliminary analysis of the inbound packet (in skb)
=> __inet_lookup_skb => __inet_lookup()
   Look up whether the packet matches an existing connection or listening socket (include/net/inet_hashtables.h)
=> tcp_v4_do_rcv [tcp_ipv4.c]
=> tcp_input.c::tcp_rcv_established() | tcp_v4_hnd_req(sk, skb) | tcp_child_process(sk, nsk, skb)
=> tcp_input.c::tcp_rcv_state_process()
   Big switch statement of TCP states
   -> icsk->icsk_af_ops->conn_request(sk, skb)
      == tcp_v4_conn_request() // see table at tcp_ipv4.c::line 1758
Note module interface at start of tcp_ipv4.c
Arriving data packet:
tcp_v4_do_rcv()
tcp_rcv_established() [tcp_input.c]
   # note comment on fast path vs. slow path, and reference to "30 instruction TCP receive"
The latter eventually puts the data into a set of buffers that the receiving application, when it wakes up, can retrieve.
tcp_v4_hnd_req():
   inet_csk_search_req: this is looking for the "request socket", a mini-socket with additional info
   tcp_check_req: checks if there is space in the accept queue
   inet_lookup_established: we *did* just call this: same as __inet_lookup_established with hnum=dport
   main path: ends up returning sk
Caller is tcp_v4_do_rcv(); caller falls through to tcp_rcv_state_process
-> icsk->icsk_af_ops->conn_request(sk, skb)
   == tcp_v4_conn_request() // see table at tcp_ipv4.c::line 1758
tcp_v4_conn_request(): // handles incoming SYN
   // error cases first
   tcp_clear_options();
   tcp_parse_options();
   tcp_openreq_init()
   save saddr/daddr in ireq, which is a cast of req, which is a struct request_sock
   saves req using inet_csk_reqsk_queue_hash_add(sk, req, TCP_TIMEOUT_INIT); // csk = connection-oriented socket
      see also inet_csk_search_req
   calls __tcp_v4_send_synack
tcp_input.c:
int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
   // called by tcp_v4_do_rcv for states besides ESTABLISHED, LISTEN
   ESTABLISHED: tcp_data_queue()
Token Bucket is a rate limiter. HTB (Hierarchical Token Bucket) combines this with fair queuing, which allows excess bandwidth to be shared among currently active users.
HTB is useful for, say, handling multiple customers and giving each of them a pre-configured amount of bandwidth.
First, why is this in the kernel? Mainly to avoid kernel->userland copying; otherwise it would make sense to put this kind of code in userspace. VxWorks (based on BSD) has packet "handles" that are small, can be easily copied to userspace, and which can be used to trigger the sending of the underlying packet at a later point.
Most of the code is in net/sched/sch_htb.c
struct htb_class (line 74): tree of traffic classes.
note rb_root and rb_node fields
see lib/rbtree.c
htb_classify()
jiffies
htb_dequeue_tree()
Note the use of unlikely()