|
@@ -0,0 +1,664 @@
|
|
|
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
|
|
+
|
|
|
+<!--Converted with LaTeX2HTML 2002-2-1 (1.70)
|
|
|
+original version by: Nikos Drakos, CBLU, University of Leeds
|
|
|
+* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
|
|
|
+* with significant contributions from:
|
|
|
+ Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
|
|
|
+<HTML>
|
|
|
+<HEAD>
|
|
|
+<TITLE>Flowreplay Design Notes</TITLE>
|
|
|
+<META NAME="description" CONTENT="Flowreplay Design Notes">
|
|
|
+<META NAME="keywords" CONTENT="flowreplay">
|
|
|
+<META NAME="resource-type" CONTENT="document">
|
|
|
+<META NAME="distribution" CONTENT="global">
|
|
|
+
|
|
|
+<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
|
|
|
+<META NAME="Generator" CONTENT="LaTeX2HTML v2002-2-1">
|
|
|
+<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
|
|
|
+
|
|
|
+<LINK REL="STYLESHEET" HREF="flowreplay.css">
|
|
|
+
|
|
|
+</HEAD>
|
|
|
+
|
|
|
+<BODY >
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<P>
|
|
|
+<H1 ALIGN="CENTER"><SPAN ID="hue33">Flowreplay Design Notes</SPAN></H1>
|
|
|
+<DIV CLASS="author_info">
|
|
|
+
|
|
|
+<P ALIGN="CENTER"><STRONG><SPAN ID="hue35">Aaron Turner </SPAN></STRONG></P>
|
|
|
+<P ALIGN="CENTER"><I><SPAN ID="hue37">http://synfin.net/</SPAN></I></P>
|
|
|
+<P ALIGN="CENTER"><STRONG><SPAN ID="hue39">Last Edited:</SPAN>
|
|
|
+<BR><SPAN ID="hue41">October 23, 2003</SPAN></STRONG></P>
|
|
|
+</DIV>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<H1><A NAME="SECTION00010000000000000000">
|
|
|
+<SPAN CLASS="arabic">1</SPAN> <SPAN ID="hue43">Overview</SPAN></A>
|
|
|
+</H1>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue45">Tcpreplay</SPAN><A NAME="tex2html1"
|
|
|
+ HREF="#foot362"><SUP><SPAN CLASS="arabic">1</SPAN></SUP></A> <SPAN ID="hue49">was designed to replay traffic previously captured
|
|
|
+in the pcap format back onto the wire for testing NIDS and other passive
|
|
|
+devices. Over time, it was enhanced to be able to test in-line network
|
|
|
+devices. However, a re-occurring feature request for tcpreplay is
|
|
|
+to connect to a server in order to test applications and host TCP/IP
|
|
|
+stacks. It was determined early on, that adding this feature to tcpreplay
|
|
|
+was far too complex, so I decided to create a new tool specifically
|
|
|
+designed for this.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue51">Flowreplay is designed to replay traffic at Layer
|
|
|
+4 or 7 depending on the protocol rather then at Layer 2 like tcpreplay
|
|
|
+does. This allows flowreplay to connect to one or more servers using
|
|
|
+a pcap savefile as the basis of the connections. Hence, flowreplay
|
|
|
+allows the testing of applications running on real servers rather
|
|
|
+then passive devices. </SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H1><A NAME="SECTION00020000000000000000">
|
|
|
+<SPAN CLASS="arabic">2</SPAN> <SPAN ID="hue53">Features</SPAN></A>
|
|
|
+</H1>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00021000000000000000">
|
|
|
+<SPAN CLASS="arabic">2</SPAN>.<SPAN CLASS="arabic">1</SPAN> <SPAN ID="hue55">Requirements</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<OL>
|
|
|
+<LI><SPAN ID="hue58">Full TCP/IP support, including IP fragments and
|
|
|
+TCP stream reassembly.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue60">Support replaying TCP and UDP flows.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue62">Code should handle each flow/service independently.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue64">Should be able to connect to the server(s) in the
|
|
|
+pcap file or to a user specified IP address.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue66">Support a plug-in architecture to allow adding application
|
|
|
+layer intelligence.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue68">Plug-ins must be able to support multi-flow protocols
|
|
|
+like FTP.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue365">Ship with a default plug-in which will work ``well
|
|
|
+enough'' for simple single-flow protocols like HTTP and telnet.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue366">Flows being replayed ``correctly'' is more important
|
|
|
+then performance (Mbps).</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue74">Portable to run on common flavors of Unix and Unix-like
|
|
|
+systems.</SPAN>
|
|
|
+</LI>
|
|
|
+</OL>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00022000000000000000">
|
|
|
+<SPAN CLASS="arabic">2</SPAN>.<SPAN CLASS="arabic">2</SPAN> <SPAN ID="hue77">Wishes</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<OL>
|
|
|
+<LI><SPAN ID="hue80">Support clients connecting to flowreplay on a limited
|
|
|
+basis. Flowreplay would replay the server side of the connection.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue82">Support other IP based traffic (ICMP, VRRP, OSPF,
|
|
|
+etc) via plug-ins.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue84">Support non-IP traffic (ARP, STP, CDP, etc) via
|
|
|
+plug-ins.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue86">Limit which flows are replayed using user defined
|
|
|
+filters. (bpf filter syntax?)</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue88">Process pcap files directly with no intermediary
|
|
|
+file conversions.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue90">Should be able to scale to pcap files in the 100's
|
|
|
+of MB in size and 100+ simultaneous flows on a P3 500MHz w/ 256MB
|
|
|
+of RAM.</SPAN>
|
|
|
+</LI>
|
|
|
+</OL>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<H1><A NAME="SECTION00030000000000000000">
|
|
|
+<SPAN CLASS="arabic">3</SPAN> <SPAN ID="hue93">Design Thoughts</SPAN></A>
|
|
|
+</H1>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00031000000000000000">
|
|
|
+<SPAN CLASS="arabic">3</SPAN>.<SPAN CLASS="arabic">1</SPAN> <SPAN ID="hue95">Sending and Receiving traffic</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue97">Flowreplay must be able to process multiple connections
|
|
|
+to one or more devices. There are two options:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<OL>
|
|
|
+<LI><SPAN ID="hue100">Use sockets</SPAN><A NAME="tex2html2"
|
|
|
+ HREF="#foot370"><SUP><SPAN CLASS="arabic">2</SPAN></SUP></A> <SPAN ID="hue104">to send and receive data</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue106">Use libpcap</SPAN><A NAME="tex2html3"
|
|
|
+ HREF="#foot371"><SUP><SPAN CLASS="arabic">3</SPAN></SUP></A> <SPAN ID="hue110">to receive packets and libnet</SPAN><A NAME="tex2html4"
|
|
|
+ HREF="#foot372"><SUP><SPAN CLASS="arabic">4</SPAN></SUP></A> <SPAN ID="hue114">to send packets</SPAN>
|
|
|
+</LI>
|
|
|
+</OL>
|
|
|
+<SPAN ID="hue117">Although using libpcap/libnet would allow more simultaneous
|
|
|
+connections and greater flexibility, there would be a very high complexity
|
|
|
+cost associated with it. With that in mind, I've decided to use sockets
|
|
|
+to send and receive data.</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00032000000000000000">
|
|
|
+<SPAN CLASS="arabic">3</SPAN>.<SPAN CLASS="arabic">2</SPAN> <SPAN ID="hue119">Handling Multiple Connections</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue121">Because a pcap file can contain multiple simultaneous
|
|
|
+flows, we need to be able to support that too. The biggest problem
|
|
|
+with this is reading packet data in a different order then stored
|
|
|
+in the pcap file. </SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue123">Reading and writing to multiple sockets is easy
|
|
|
+with select() or poll(), however a pcap file has it's data stored
|
|
|
+serially, but we need to access it randomly. There are a number of
|
|
|
+possible solutions for this such as caching packets in RAM where they
|
|
|
+can be accessed more randomly, creating an index of the packets in
|
|
|
+the pcap file, or converting the pcap file to another format altogether.
|
|
|
+Alternatively, I've started looking at libpcapnav</SPAN><A NAME="tex2html5"
|
|
|
+ HREF="#foot124"><SUP><SPAN CLASS="arabic">5</SPAN></SUP></A> <SPAN ID="hue126">as an alternate means to navigate a pcap file and
|
|
|
+process packets out of order.</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00033000000000000000">
|
|
|
+<SPAN CLASS="arabic">3</SPAN>.<SPAN CLASS="arabic">3</SPAN> <SPAN ID="hue128">Data Synchronization</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue375">Knowing when to start sending client traffic in
|
|
|
+response to the server will be "tricky". Without
|
|
|
+understanding the actual protocol involved, probably the best general
|
|
|
+solution is waiting for a given period of time after no more data
|
|
|
+from the server has been received. Not sure what to do if the client
|
|
|
+traffic doesn't elicit a response from the server (implement some
|
|
|
+kind of timeout?). This will be the basis for the default plug-in.</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00034000000000000000">
|
|
|
+<SPAN CLASS="arabic">3</SPAN>.<SPAN CLASS="arabic">4</SPAN> <SPAN ID="hue133">TCP/IP</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue135">Dealing with IP fragmentation and TCP stream reassembly
|
|
|
+will be another really complex problem. We're basically talking about
|
|
|
+implementing a significant portion of a TCP/IP stack. One thought
|
|
|
+is to use libnids</SPAN><A NAME="tex2html6"
|
|
|
+ HREF="#foot403"><SUP><SPAN CLASS="arabic">6</SPAN></SUP></A> <SPAN ID="hue139">which basically implements a Linux 2.0.37 TCP/IP
|
|
|
+stack in user-space. Other solutions include porting a TCP/IP stack
|
|
|
+from Open/Net/FreeBSD or writing our own custom stack from scratch.</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H1><A NAME="SECTION00040000000000000000">
|
|
|
+<SPAN CLASS="arabic">4</SPAN> <SPAN ID="hue141">Multiple Independent Flows</SPAN></A>
|
|
|
+</H1>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue143">The biggest asynchronous problem, that pcap files
|
|
|
+are serial, has to be solved in a scaleable manner. Not much can be
|
|
|
+assumed about the network traffic contained in a pcap savefile other
|
|
|
+then Murphy's Law will be in effect. This means we'll have to deal
|
|
|
+with:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<UL>
|
|
|
+<LI><SPAN ID="hue146">Thousands of small simultaneous flows (captured
|
|
|
+on a busy network)</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue379">Flows which ``hang'' mid-stream (an exploit
|
|
|
+against a server causes it to crash)</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue150">Flows which contain large quantities of data (FTP
|
|
|
+transfers of ISO's for example)</SPAN>
|
|
|
+</LI>
|
|
|
+</UL>
|
|
|
+<SPAN ID="hue153">How we implement parallel processing of the pcap
|
|
|
+savefile will dramatically effect how well we can scale. A few considerations:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<UL>
|
|
|
+<LI>Most Unix systems limit the maximum number of open file descriptors
|
|
|
+a single process can have. Generally speaking this shouldn't be a
|
|
|
+problem except for highly parallel pcap's.
|
|
|
+</LI>
|
|
|
+<LI>While RAM isn't limitless, we can use mmap() to get around this.
|
|
|
+</LI>
|
|
|
+<LI>Many Unix systems have enhanced solutions to poll() which will improve
|
|
|
+flow management.
|
|
|
+</LI>
|
|
|
+</UL>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00041000000000000000">
|
|
|
+<SPAN CLASS="arabic">4</SPAN>.<SPAN CLASS="arabic">1</SPAN> <SPAN ID="hue157">IP Fragments and TCP Streams</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue159">There are five major complications with flowreplay:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<OL>
|
|
|
+<LI><SPAN ID="hue162">The IP datagrams may be fragmented- we won't be
|
|
|
+able to use the standard 5-tuple (src/dst IP, src/dst port, protocol)
|
|
|
+to lookup which flow a packet belongs to.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue164">IP fragments may arrive out of order which will
|
|
|
+complicate ordering of data to be sent.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue166">The TCP segments may arrive out of order which will
|
|
|
+complicate ordering of data to be sent.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue168">Packets may be missing in the pcap file because
|
|
|
+they were dropped during capture.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue170">There are tools like fragrouter which intentionally
|
|
|
+create non-deterministic situations.</SPAN>
|
|
|
+</LI>
|
|
|
+</OL>
|
|
|
+<SPAN ID="hue173">First off, I've decided, that I'm not going to worry
|
|
|
+about fragrouter or it's cousins. I'll handle non-deterministic situations
|
|
|
+one and only one way, so that the way flowreplay handles the traffic
|
|
|
+will be deterministic. Perhaps, I'll make it easy for others to write
|
|
|
+a plug-in which will change it, but that's not something I'm going
|
|
|
+to concern myself with now.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue175">Missing packets in the pcap file will probably make
|
|
|
+that flow unplayable. There are proabably certain situation where
|
|
|
+we can make an educated guess, but this is far too complex to worry
|
|
|
+about for the first stable release.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue177">That still leaves creating a basic TCP/IP stack
|
|
|
+in user space. The good news it that there is already a library which
|
|
|
+does this called libnids. As of version 1.17, libnids can process
|
|
|
+packets from a pcap savefile (it's not documented in the man page,
|
|
|
+but the code is there).</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue179">A potential problem with libnids though is that
|
|
|
+it has to maintain it's own state/cache system. This not only means
|
|
|
+additional overhead, but jumping around in the pcap file as I'm planning
|
|
|
+on doing to handle multiple simultaneous flows is likely to really
|
|
|
+confuse libnids' state engine. Also, libnids is licensed under the
|
|
|
+GPL, but I want flowreplay released under a BSD-like license; I need
|
|
|
+to research if the two are compatible in this way.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue181">Possible solutions:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<UL>
|
|
|
+<LI><SPAN ID="hue184">Developing a custom wedge between the capture file
|
|
|
+and libnids which will cause each packet to only be processed a single
|
|
|
+time.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue186">Use libnids to process the pcap file into a new
|
|
|
+flow-based format, effectively putting the TCP/IP stack into a dedicated
|
|
|
+utility.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue188">Develop a custom user-space TCP/IP stack, perhaps
|
|
|
+based on a BSD TCP/IP stack, much like libnids is based on Linux 2.0.37.</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue190">Screw it and say that IP fragmentation and out of
|
|
|
+order IP packets/TCP segments are not supported. Not sure if this
|
|
|
+will meet the needs of potential users.</SPAN>
|
|
|
+</LI>
|
|
|
+</UL>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00042000000000000000">
|
|
|
+<SPAN CLASS="arabic">4</SPAN>.<SPAN CLASS="arabic">2</SPAN> <SPAN ID="hue193">Blocking</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue195">As earlier stated, one of the main goals of this
|
|
|
+project is to keep things single threaded to make coding plugins easier.
|
|
|
+One caveat of that is that any function which blocks will cause serious
|
|
|
+problems.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue197">There are three major cases where blocking is likely
|
|
|
+to occur:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<OL>
|
|
|
+<LI><SPAN ID="hue200">Opening a socket</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue202">Reading from a socket</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue204">Writing to a socket</SPAN>
|
|
|
+</LI>
|
|
|
+</OL>
|
|
|
+<SPAN ID="hue207">Reading from sockets in a non-blocking manner is
|
|
|
+easy to solve for using poll() or select(). Writing to a socket, or
|
|
|
+merely opening a TCP socket via connect() however requires a different
|
|
|
+method:</SPAN>
|
|
|
+<P>
|
|
|
+<BLOCKQUOTE>
|
|
|
+<SPAN ID="hue210">It is possible to do non-blocking IO on sockets
|
|
|
+by setting the O_NONBLOCK flag on a socket file descriptor using
|
|
|
+fcntl(2). Then all operations that would block will (usually) return
|
|
|
+with EAGAIN (operation should be retried later); connect(2) will return
|
|
|
+EINPROGRESS error. The user can then wait for various events via poll(2)
|
|
|
+or select(2).</SPAN><A NAME="tex2html7"
|
|
|
+ HREF="#foot382"><SUP><SPAN CLASS="arabic">7</SPAN></SUP></A>
|
|
|
+</BLOCKQUOTE>
|
|
|
+<SPAN ID="hue215">If connect() returns EINPROGRESS, then we'll just
|
|
|
+have to do something like this:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<DL COMPACT>
|
|
|
+<DT>
|
|
|
+<DD><SPAN ID="hue218">int e, len=sizeof(e);</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue220">if (getsockopt(conn->s, SOL_SOCKET, SO_ERROR, &e, &len) < 0) { </SPAN>
|
|
|
+<P>
|
|
|
+ <SPAN ID="hue383"> /* not yet */</SPAN>
|
|
|
+<P>
|
|
|
+ <SPAN ID="hue384"> if(errno != EINPROGRESS){ /* yuck. kill it. */ </SPAN>
|
|
|
+<P>
|
|
|
+ <SPAN ID="hue385"> log_fn(LOG_DEBUG,"in-progress connect failed. Removing."); </SPAN>
|
|
|
+<P>
|
|
|
+ <SPAN ID="hue231"> return -1; </SPAN>
|
|
|
+<P>
|
|
|
+ <SPAN ID="hue233"> } else { </SPAN>
|
|
|
+<P>
|
|
|
+ <SPAN ID="hue386"> return 0; /* no change, see if next time is better */ </SPAN>
|
|
|
+<P>
|
|
|
+ <SPAN ID="hue238"> } </SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue240">} </SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue387">/* the connect has finished. */ </SPAN>
|
|
|
+</DD>
|
|
|
+</DL><BLOCKQUOTE>
|
|
|
+<SPAN ID="hue247">Note: It may not be totally right, but it works
|
|
|
+ok. (that chunk of code gets called after poll returns the socket
|
|
|
+as writable. if poll returns it as readable, then it's probably because
|
|
|
+of eof, connect fails. You must poll for both.</SPAN>
|
|
|
+</BLOCKQUOTE>
|
|
|
+
|
|
|
+<P>
|
|
|
+
|
|
|
+<H1><A NAME="SECTION00050000000000000000">
|
|
|
+<SPAN CLASS="arabic">5</SPAN> <SPAN ID="hue250">pcap vs flow File Format</SPAN></A>
|
|
|
+</H1>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue252">As stated before, the pcap file format really isn't
|
|
|
+well suited for flowreplay because it uses the raw packet as a container
|
|
|
+for data. Flowreplay however isn't interested in packets, it's interested
|
|
|
+in data streams</SPAN><A NAME="tex2html8"
|
|
|
+ HREF="#foot404"><SUP><SPAN CLASS="arabic">8</SPAN></SUP></A> <SPAN ID="hue256">which may span one or more TCP/UDP segments, each
|
|
|
+comprised of an IP datagram which may be comprised of multiple IP
|
|
|
+fragments. Handling all this additional complexity requires a full
|
|
|
+TCP/IP stack in user space which would have additional feature requirements
|
|
|
+specific to flowreplay.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue258">Rather then trying to do that, I've decided to create
|
|
|
+a pcap preprocessor for flowreplay called: flowprep. Flowprep will
|
|
|
+handle all the TCP/IP defragmentation/reassembly and write out a file
|
|
|
+containing the data streams for each flow.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue260">A flow file will contain three sections:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<OL>
|
|
|
+<LI><SPAN ID="hue263">A header which identifies this as a flowprep file
|
|
|
+and the file version</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue265">An index of all the flows contained in the file</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue267">The data streams themselves</SPAN>
|
|
|
+</LI>
|
|
|
+</OL>
|
|
|
+<DIV ALIGN="CENTER">
|
|
|
+<SPAN ID="hue390"><IMG
|
|
|
+ WIDTH="668" HEIGHT="748" ALIGN="BOTTOM" BORDER="0"
|
|
|
+ SRC="img1.png"
|
|
|
+ ALT="\includegraphics{flowheader.eps}"></SPAN>
|
|
|
+</DIV>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue274">At startup, the file header is validated and the
|
|
|
+data stream indexes are loaded into memory. Then the first data stream
|
|
|
+header from each flow is read. Then each flow and subsequent data
|
|
|
+stream is processed based upon the timestamps and plug-ins.</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H1><A NAME="SECTION00060000000000000000">
|
|
|
+<SPAN CLASS="arabic">6</SPAN> <SPAN ID="hue276">Plug-ins</SPAN></A>
|
|
|
+</H1>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue392">Plug-ins will provide the ``intelligence'' in
|
|
|
+flowreplay. Flowreplay is designed to be a mere framework for connecting
|
|
|
+captured flows in a flow file with socket file handles. How data is
|
|
|
+processed and what should be done with it will be done via plug-ins.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue280">Plug-ins will allow proper handling of a variety
|
|
|
+of protocols while hopefully keeping things simple. Another part of
|
|
|
+the consideration will be making it easy for others to contribute
|
|
|
+to flowreplay. I don't want to have to write all the protocol logic
|
|
|
+myself.</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00061000000000000000">
|
|
|
+<SPAN CLASS="arabic">6</SPAN>.<SPAN CLASS="arabic">1</SPAN> <SPAN ID="hue282">Plug-in Basics</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue284">Each plug-in provides the logic for handling one
|
|
|
+or more services. The main purpose of a plug-in is to decide when
|
|
|
+flowreplay should send data via one or more sockets. The plug-in can
|
|
|
+use any</SPAN> <SPAN ID="hue394"><SPAN CLASS="textit">non-blocking</SPAN></SPAN> <SPAN ID="hue288">method
|
|
|
+of determining if it appropriate to send data or wait for data to
|
|
|
+received. If necessary, a plug-in can also modify the data sent.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue290">Each time poll() returns, flowreplay calls the plug-ins
|
|
|
+for the flows which either have data waiting or in the case of a timeout,
|
|
|
+those flows which timed out. Afterwords, all the flows are processed
|
|
|
+and poll() is called on those flows which have their state set to
|
|
|
+POLL. And the process repeats until there are no more nodes in the
|
|
|
+tree.</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00062000000000000000">
|
|
|
+<SPAN CLASS="arabic">6</SPAN>.<SPAN CLASS="arabic">2</SPAN> <SPAN ID="hue292">The Default Plug-in</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue396">Initially, flowreplay will ship with one basic plug-in
|
|
|
+called ``default''. Any flow which doesn't have a specific plug-in
|
|
|
+defined, will use default. The goal of the default plug-in is to work
|
|
|
+``good enough'' for a majority of single-flow protocols such as
|
|
|
+SMTP, HTTP, and Telnet. Protocols which use encryption (SSL, SSH,
|
|
|
+etc) or multiple flows (FTP, RPC, etc) will never work with the default
|
|
|
+plug-in. Furthermore, the default plug-in will only support connections</SPAN><SPAN ID="hue397"><SPAN CLASS="textit">to</SPAN></SPAN> <SPAN ID="hue299">a server, it will not
|
|
|
+support accepting connections from clients.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue398">The default plug-in will provide no data level manipulation
|
|
|
+and only a simple method for detecting when it is time to send data
|
|
|
+to the server. Detecting when to send data will be done by a ``no
|
|
|
+more data'' timeout value. Basically, by using the pcap file as a
|
|
|
+means to determine the order of the exchange, anytime it is the servers
|
|
|
+turn to send data, flowreplay will wait for the first byte of data
|
|
|
+and then start the ``no more data'' timer. Every time more data
|
|
|
+is received, the timer is reset. If the timer reaches zero, then flowreplay
|
|
|
+sends the next portion of the client side of the connection. This
|
|
|
+is repeated until the the flow has been completely replayed or a ``server
|
|
|
+hung'' timeout is reached. The server hung timeout is used to detect
|
|
|
+a server which crashed and never starts sending any data which would
|
|
|
+start the ``no more data'' timer.</SPAN>
|
|
|
+<P>
|
|
|
+<SPAN ID="hue399">Both the ``no more data'' and ``server hung''
|
|
|
+timers will be user defined values and global to all flows using the
|
|
|
+default plug-in.</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H2><A NAME="SECTION00063000000000000000">
|
|
|
+<SPAN CLASS="arabic">6</SPAN>.<SPAN CLASS="arabic">3</SPAN> <SPAN ID="hue309">Plug-in Details</SPAN></A>
|
|
|
+</H2>
|
|
|
+
|
|
|
+<P>
|
|
|
+<SPAN ID="hue311">Each plug-in will be comprised of the following:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<OL>
|
|
|
+<LI><SPAN ID="hue314">An optional global data structure, for intra-flow
|
|
|
+communication</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue316">Per-flow data structure, for tracking flow state
|
|
|
+information</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue318">A list of functions which flow replay will call
|
|
|
+when certain well-defined conditions are met.</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<UL>
|
|
|
+<LI><SPAN ID="hue321">Required functions:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<UL>
|
|
|
+<LI><SPAN ID="hue324">initialize_node() - called when a node in the tree
|
|
|
+created using this plug-in</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue326">post_poll_timeout() - called when the poll() returned
|
|
|
+due to a timeout for this node</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue328">post_poll_read() - called when the poll() returned
|
|
|
+due to the socket being ready</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue330">buffer_full() - called when a the packet buffer
|
|
|
+for this flow is full</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue332">delete_node() - called just prior to the node being
|
|
|
+free()'d</SPAN>
|
|
|
+</LI>
|
|
|
+</UL>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue335">Optional functions:</SPAN>
|
|
|
+<P>
|
|
|
+
|
|
|
+<UL>
|
|
|
+<LI><SPAN ID="hue338">pre_send_data() - called before data is sent</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue340">post_send_data() - called after data is sent</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue342">pre_poll() - called prior to poll()</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue344">post_poll_default() - called when poll() returns
|
|
|
+and neither the socket was ready or the node timed out </SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue346">open_socket() - called after the socket is opened</SPAN>
|
|
|
+</LI>
|
|
|
+<LI><SPAN ID="hue348">close_socket() - called after the socket is closed</SPAN>
|
|
|
+</LI>
|
|
|
+</UL>
|
|
|
+</LI>
|
|
|
+</UL>
|
|
|
+</LI>
|
|
|
+</OL>
|
|
|
+
|
|
|
+<DL COMPACT>
|
|
|
+<DT>
|
|
|
+<DD><P>
|
|
|
+</DD>
|
|
|
+</DL>
|
|
|
+<P>
|
|
|
+
|
|
|
+<H1><A NAME="SECTION00070000000000000000">
|
|
|
+About this document ...</A>
|
|
|
+</H1>
|
|
|
+ <STRONG><SPAN ID="hue33">Flowreplay Design Notes</SPAN></STRONG><P>
|
|
|
+This document was generated using the
|
|
|
+<A HREF="http://www.latex2html.org/"><STRONG>LaTeX</STRONG>2<tt>HTML</tt></A> translator Version 2002-2-1 (1.70)
|
|
|
+<P>
|
|
|
+Copyright © 1993, 1994, 1995, 1996,
|
|
|
+<A HREF="http://cbl.leeds.ac.uk/nikos/personal.html">Nikos Drakos</A>,
|
|
|
+Computer Based Learning Unit, University of Leeds.
|
|
|
+<BR>
|
|
|
+Copyright © 1997, 1998, 1999,
|
|
|
+<A HREF="http://www.maths.mq.edu.au/~ross/">Ross Moore</A>,
|
|
|
+Mathematics Department, Macquarie University, Sydney.
|
|
|
+<P>
|
|
|
+The command line arguments were: <BR>
|
|
|
+ <STRONG>latex2html</STRONG> <TT>-nonavigation -no_subdir -split 0 -show_section_numbers flowreplay.tex</TT>
|
|
|
+<P>
|
|
|
+The translation was initiated by Aaron Turner on 2005-02-10
|
|
|
+<BR><HR><H4>Footnotes</H4>
|
|
|
+<DL>
|
|
|
+<DT><A NAME="foot362">...Tcpreplay</A><A
|
|
|
+ HREF="flowreplay.html#tex2html1"><SUP><SPAN CLASS="arabic">1</SPAN></SUP></A></DT>
|
|
|
+<DD><SPAN ID="hue47">http://tcpreplay.sourceforge.net/</SPAN>
|
|
|
+
|
|
|
+</DD>
|
|
|
+<DT><A NAME="foot370">... </A><A
|
|
|
+ HREF="flowreplay.html#tex2html2"><SUP><SPAN CLASS="arabic">2</SPAN></SUP></A></DT>
|
|
|
+<DD><SPAN ID="hue102">socket(2)</SPAN>
|
|
|
+
|
|
|
+</DD>
|
|
|
+<DT><A NAME="foot371">... </A><A
|
|
|
+ HREF="flowreplay.html#tex2html3"><SUP><SPAN CLASS="arabic">3</SPAN></SUP></A></DT>
|
|
|
+<DD><SPAN ID="hue108">http://www.tcpdump.org/</SPAN>
|
|
|
+
|
|
|
+</DD>
|
|
|
+<DT><A NAME="foot372">... </A><A
|
|
|
+ HREF="flowreplay.html#tex2html4"><SUP><SPAN CLASS="arabic">4</SPAN></SUP></A></DT>
|
|
|
+<DD><SPAN ID="hue112">http://www.packetfactory.net/projects/libnet/</SPAN>
|
|
|
+
|
|
|
+</DD>
|
|
|
+<DT><A NAME="foot124">... </A><A
|
|
|
+ HREF="flowreplay.html#tex2html5"><SUP><SPAN CLASS="arabic">5</SPAN></SUP></A></DT>
|
|
|
+<DD>http://netdude.sourceforge.net/
|
|
|
+
|
|
|
+</DD>
|
|
|
+<DT><A NAME="foot403">... </A><A
|
|
|
+ HREF="flowreplay.html#tex2html6"><SUP><SPAN CLASS="arabic">6</SPAN></SUP></A></DT>
|
|
|
+<DD><SPAN ID="hue377">http://www.avet.com.pl/~nergal/libnids/</SPAN>
|
|
|
+
|
|
|
+</DD>
|
|
|
+<DT><A NAME="foot382">... </A><A
|
|
|
+ HREF="flowreplay.html#tex2html7"><SUP><SPAN CLASS="arabic">7</SPAN></SUP></A></DT>
|
|
|
+<DD><SPAN ID="hue212">socket(7)</SPAN>
|
|
|
+
|
|
|
+</DD>
|
|
|
+<DT><A NAME="foot404">... </A><A
|
|
|
+ HREF="flowreplay.html#tex2html8"><SUP><SPAN CLASS="arabic">8</SPAN></SUP></A></DT>
|
|
|
+<DD><SPAN ID="hue389">A ``data stream'' as I call it is a simplex
|
|
|
+communication from the client or server which is a complete query,
|
|
|
+response or message.</SPAN>
|
|
|
+
|
|
|
+</DD>
|
|
|
+</DL>
|
|
|
+<BR><HR>
|
|
|
+<ADDRESS>
|
|
|
+Aaron Turner
|
|
|
+2005-02-10
|
|
|
+</ADDRESS>
|
|
|
+</BODY>
|
|
|
+</HTML>
|