4 Multiple Independent Flows

The biggest asynchronous problem, that pcap files are serial, has to be solved in a scalable manner. Not much can be assumed about the network traffic contained in a pcap savefile other than that Murphy's Law will be in effect. This means we'll have to deal with:

How we implement parallel processing of the pcap savefile will dramatically affect how well we can scale. A few considerations:

4.1 IP Fragments and TCP Streams

There are five major complications with flowreplay:

  1. The IP datagrams may be fragmented, so we won't be able to use the standard 5-tuple (src/dst IP, src/dst port, protocol) to look up which flow a packet belongs to.
  2. IP fragments may arrive out of order, which will complicate ordering of data to be sent.
  3. The TCP segments may arrive out of order, which will complicate ordering of data to be sent.
  4. Packets may be missing in the pcap file because they were dropped during capture.
  5. There are tools like fragrouter which intentionally create non-deterministic situations.

First off, I've decided that I'm not going to worry about fragrouter or its cousins. I'll handle non-deterministic situations one and only one way, so that the way flowreplay handles the traffic will be deterministic. Perhaps I'll make it easy for others to write a plug-in which will change it, but that's not something I'm going to concern myself with now.
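To make the first complication in the list above more concrete, here is a minimal sketch of how a fragment could be matched to its datagram when the ports are not available. It relies on the fact that only the first fragment of an IPv4 datagram carries the TCP/UDP header, while every fragment carries the same source, destination, protocol and Identification field. The struct and function names (pkt_info, frag_key, is_fragment and so on) are made up for illustration and are not part of flowreplay or libnids.

```c
#include <stdint.h>

/* Hypothetical, already-parsed view of one IPv4 packet (host byte order). */
struct pkt_info {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port; /* only meaningful when frag_offset == 0 */
    uint16_t ip_id;              /* IPv4 Identification field */
    uint16_t frag_offset;        /* fragment offset, in 8-byte units */
    uint8_t  more_frags;         /* IPv4 "more fragments" (MF) flag */
    uint8_t  proto;              /* IP protocol number, e.g. 6 for TCP */
};

/* Hypothetical key naming the datagram that a fragment belongs to. */
struct frag_key {
    uint32_t src_ip, dst_ip;
    uint16_t ip_id;
    uint8_t  proto;
};

/* A packet is a fragment if MF is set or its offset is non-zero. */
int is_fragment(const struct pkt_info *p)
{
    return p->more_frags || p->frag_offset != 0;
}

/* Only the fragment at offset 0 starts with the layer 4 header (and ports). */
int has_l4_header(const struct pkt_info *p)
{
    return p->frag_offset == 0;
}

/* Build the datagram key from fields present in every fragment. */
struct frag_key make_frag_key(const struct pkt_info *p)
{
    struct frag_key k;
    k.src_ip = p->src_ip;
    k.dst_ip = p->dst_ip;
    k.ip_id  = p->ip_id;
    k.proto  = p->proto;
    return k;
}

/* Field-by-field comparison, so struct padding never matters. */
int frag_key_equal(const struct frag_key *a, const struct frag_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->ip_id == b->ip_id && a->proto == b->proto;
}
```

Under this assumption, the 5-tuple learned from the first fragment could be stored under its frag_key, and later fragments with an equal key would inherit that flow. If a later fragment shows up before the first one, it would have to be held until the first fragment is seen.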

Missing packets in the pcap file will probably make that flow unplayable. There are probably certain situations where we can make an educated guess, but this is far too complex to worry about for the first stable release.
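As a rough illustration of complications 3 and 4, the sketch below puts the captured segments of one direction of a TCP stream into sequence order and reports whether any payload is missing. It is only a sketch under stated assumptions: the segment struct and stream_is_complete() are hypothetical names, the caller is assumed to know the sequence number of the first payload byte, and the stream is assumed to carry less than 4 GiB so that offsets relative to that first byte do not wrap.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one captured TCP segment. */
struct segment {
    uint32_t seq; /* sequence number of the first payload byte */
    uint32_t len; /* payload length in bytes */
};

/* Base for the comparator below; qsort() has no context argument. */
static uint32_t seg_base;

/* Compare by distance from seg_base, which survives 32-bit wraparound. */
static int seg_cmp(const void *a, const void *b)
{
    uint32_t da = ((const struct segment *)a)->seq - seg_base;
    uint32_t db = ((const struct segment *)b)->seq - seg_base;
    return (da > db) - (da < db);
}

/*
 * Sort segs into stream order, then walk them looking for a hole.
 * first_seq is the sequence number of the first payload byte.
 * Returns 1 if the payload is contiguous, 0 if data is missing.
 */
int stream_is_complete(struct segment *segs, size_t n, uint32_t first_seq)
{
    uint32_t next = 0; /* offset of the next byte we expect */
    size_t i;

    seg_base = first_seq;
    qsort(segs, n, sizeof(*segs), seg_cmp);

    for (i = 0; i < n; i++) {
        uint32_t off = segs[i].seq - first_seq;

        if (off > next)
            return 0; /* hole: something was dropped during capture */
        if (off + segs[i].len > next)
            next = off + segs[i].len; /* retransmits and overlaps are tolerated */
    }
    return 1;
}
```

A flow for which this returns 0 would be a candidate for being marked unplayable rather than guessed at.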

That still leaves creating a basic TCP/IP stack in user space. The good news is that there is already a library which does this, called libnids. As of version 1.17, libnids can process packets from a pcap savefile (it's not documented in the man page, but the code is there).

A potential problem with libnids, though, is that it has to maintain its own state/cache system. This not only means additional overhead, but jumping around in the pcap file, as I'm planning on doing to handle multiple simultaneous flows, is likely to really confuse libnids' state engine. Also, libnids is licensed under the GPL, but I want flowreplay released under a BSD-like license; I need to research whether the two are compatible in this way.

Possible solutions:

4.2 Blocking

As stated earlier, one of the main goals of this project is to keep things single-threaded to make coding plugins easier. One caveat of that is that any function which blocks will cause serious problems.

There are three major cases where blocking is likely to occur:

  1. Opening a socket
  2. Reading from a socket
  3. Writing to a socket

Reading from sockets in a non-blocking manner is easy to handle using poll() or select(). Writing to a socket, or merely opening a TCP socket via connect(), however, requires a different method:

It is possible to do non-blocking IO on sockets by setting the O_NONBLOCK flag on a socket file descriptor using fcntl(2). Then all operations that would block will (usually) return with EAGAIN (operation should be retried later); connect(2) will return EINPROGRESS error. The user can then wait for various events via poll(2) or select(2). [7]

If connect() returns EINPROGRESS, then we'll just have to do something like this:

int e;
socklen_t len = sizeof(e);

if (getsockopt(conn->s, SOL_SOCKET, SO_ERROR, &e, &len) < 0) {
    /* not yet */
    if (errno != EINPROGRESS) { /* yuck. kill it. */
        log_fn(LOG_DEBUG, "in-progress connect failed. Removing.");
        return -1;
    } else {
        return 0; /* no change, see if next time is better */
    }
}
/* the connect has finished. */

Note: It may not be totally right, but it works ok. (That chunk of code gets called after poll returns the socket as writable. If poll returns it as readable, then it's probably because of EOF, meaning the connect failed.) You must poll for both.
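For comparison, here is a more self-contained sketch of the same idea: set O_NONBLOCK with fcntl(), start the connect, and later poll for both readable and writable before asking SO_ERROR how the connect ended. This is not flowreplay code. The helper names set_nonblocking(), start_connect() and check_connect() are hypothetical, and unlike the snippet above this version looks at the value returned through SO_ERROR rather than at errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put a descriptor into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Start a connect on a non-blocking socket.
 * Returns 1 if already connected, 0 if in progress, -1 on failure.
 */
int start_connect(int fd, const struct sockaddr *addr, socklen_t addrlen)
{
    if (connect(fd, addr, addrlen) == 0)
        return 1;
    return (errno == EINPROGRESS) ? 0 : -1;
}

/*
 * Check on a pending connect, waiting at most timeout_ms (0 = do not wait).
 * Returns 1 when connected, 0 when still pending, -1 when it failed.
 */
int check_connect(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int e = 0;
    int n;
    socklen_t len = sizeof(e);

    pfd.fd = fd;
    pfd.events = POLLIN | POLLOUT; /* poll for both, as noted above */
    pfd.revents = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0; /* no change, see if next time is better */

    /* SO_ERROR returns the pending socket error, 0 meaning none. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &e, &len) < 0 || e != 0)
        return -1;

    /* Writable with no error: connected. Anything else is treated as failure. */
    return (pfd.revents & POLLOUT) ? 1 : -1;
}
```

In a single-threaded loop, check_connect() would be called with a timeout of 0 for each pending socket, so no single flow can stall the others.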



Footnotes

[7] socket(7)
Aaron Turner 2005-08-07