  1. %% LyX 1.3 created this file. For more info, see http://www.lyx.org/.
  2. %% Do not edit unless you really know what you are doing.
  3. \documentclass[english]{article}
  4. \usepackage{pslatex}
  5. \usepackage[T1]{fontenc}
  6. \usepackage[latin1]{inputenc}
  7. \usepackage{geometry}
  8. \geometry{verbose,letterpaper,tmargin=10mm,bmargin=15mm,lmargin=10mm,rmargin=10mm}
  9. \setcounter{secnumdepth}{4}
  10. \setlength\parskip{\medskipamount}
  11. \setlength\parindent{0pt}
  12. \usepackage{color}
  13. \usepackage{graphicx}
  14. \makeatletter
  15. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Textclass specific LaTeX commands.
  16. \usepackage{verbatim}
  17. \newenvironment{lyxcode}
  18. {\begin{list}{}{
  19. \setlength{\rightmargin}{\leftmargin}
  20. \setlength{\listparindent}{0pt}% needed for AMS classes
  21. \raggedright
  22. \setlength{\itemsep}{0pt}
  23. \setlength{\parsep}{0pt}
  24. \normalfont\ttfamily}%
  25. \item[]}
  26. {\end{list}}
  27. \AtBeginDocument{
  28. \renewcommand{\labelitemii}{\(\ast\)}
  29. \renewcommand{\labelitemiii}{\normalfont\bfseries{--}}
  30. }
  31. \usepackage{babel}
  32. \makeatother
  33. \begin{document}
  34. \title{\textcolor{black}{Flowreplay Design Notes}}
  35. \author{\textcolor{black}{Aaron Turner }\\
  36. \textcolor{black}{http://synfin.net/}}
  37. \date{\textcolor{black}{Last Edited:}\\
  38. \textcolor{black}{October 23, 2003}}
  39. \maketitle
  40. \newpage
  41. \section{\textcolor{black}{Overview}}
  42. \textcolor{black}{Tcpreplay}%
  43. \footnote{\textcolor{black}{http://tcpreplay.sourceforge.net/}%
  44. } \textcolor{black}{was designed to replay traffic previously captured
  45. in the pcap format back onto the wire for testing NIDS and other passive
  46. devices. Over time, it was enhanced to be able to test in-line network
  47. devices. However, a recurring feature request for tcpreplay is
  48. to connect to a server in order to test applications and host TCP/IP
  49. stacks. It was determined early on that adding this feature to tcpreplay
  50. was far too complex, so I decided to create a new tool specifically
  51. designed for this.}
  52. \textcolor{black}{Flowreplay is designed to replay traffic at Layer
  53. 4 or 7, depending on the protocol, rather than at Layer 2 like tcpreplay
  54. does. This allows flowreplay to connect to one or more servers using
  55. a pcap savefile as the basis of the connections. Hence, flowreplay
  56. allows the testing of applications running on real servers rather
  57. than passive devices. }
  58. \section{\textcolor{black}{Features}}
  59. \subsection{\textcolor{black}{Requirements}}
  60. \begin{enumerate}
  61. \item \textcolor{black}{Full TCP/IP support, including IP fragments and
  62. TCP stream reassembly.}
  63. \item \textcolor{black}{Support replaying TCP and UDP flows.}
  64. \item \textcolor{black}{Code should handle each flow/service independently.}
  65. \item \textcolor{black}{Should be able to connect to the server(s) in the
  66. pcap file or to a user specified IP address.}
  67. \item \textcolor{black}{Support a plug-in architecture to allow adding application
  68. layer intelligence.}
  69. \item \textcolor{black}{Plug-ins must be able to support multi-flow protocols
  70. like FTP.}
  71. \item \textcolor{black}{Ship with a default plug-in which will work {}``well
  72. enough'' for simple single-flow protocols like HTTP and telnet.}
  73. \item \textcolor{black}{Flows being replayed {}``correctly'' is more important
  74. than performance (Mbps).}
  75. \item \textcolor{black}{Portable to run on common flavors of Unix and Unix-like
  76. systems.}
  77. \end{enumerate}
  78. \subsection{\textcolor{black}{Wishes}}
  79. \begin{enumerate}
  80. \item \textcolor{black}{Support clients connecting to flowreplay on a limited
  81. basis. Flowreplay would replay the server side of the connection.}
  82. \item \textcolor{black}{Support other IP based traffic (ICMP, VRRP, OSPF,
  83. etc) via plug-ins.}
  84. \item \textcolor{black}{Support non-IP traffic (ARP, STP, CDP, etc) via
  85. plug-ins.}
  86. \item \textcolor{black}{Limit which flows are replayed using user defined
  87. filters. (bpf filter syntax?)}
  88. \item \textcolor{black}{Process pcap files directly with no intermediary
  89. file conversions.}
  90. \item \textcolor{black}{Should be able to scale to pcap files in the 100's
  91. of MB in size and 100+ simultaneous flows on a P3 500MHz w/ 256MB
  92. of RAM.}
  93. \end{enumerate}
  94. \section{\textcolor{black}{Design Thoughts}}
  95. \subsection{\textcolor{black}{Sending and Receiving traffic}}
  96. \textcolor{black}{Flowreplay must be able to process multiple connections
  97. to one or more devices. There are two options:}
  98. \begin{enumerate}
  99. \item \textcolor{black}{Use sockets}%
  100. \footnote{\textcolor{black}{socket(2)}%
  101. } \textcolor{black}{to send and receive data}
  102. \item \textcolor{black}{Use libpcap}%
  103. \footnote{\textcolor{black}{http://www.tcpdump.org/}%
  104. } \textcolor{black}{to receive packets and libnet}%
  105. \footnote{\textcolor{black}{http://www.packetfactory.net/projects/libnet/}%
  106. } \textcolor{black}{to send packets}
  107. \end{enumerate}
  108. \textcolor{black}{Although using libpcap/libnet would allow more simultaneous
  109. connections and greater flexibility, there would be a very high complexity
  110. cost associated with it. With that in mind, I've decided to use sockets
  111. to send and receive data.}
  112. \subsection{\textcolor{black}{Handling Multiple Connections}}
  113. \textcolor{black}{Because a pcap file can contain multiple simultaneous
  114. flows, we need to be able to support that too. The biggest problem
  115. with this is reading packet data in a different order than it is stored
  116. in the pcap file. }
  117. \textcolor{black}{Reading and writing to multiple sockets is easy
  118. with select() or poll(). However, a pcap file has its data stored
  119. serially, but we need to access it randomly. There are a number of
  120. possible solutions for this such as caching packets in RAM where they
  121. can be accessed more randomly, creating an index of the packets in
  122. the pcap file, or converting the pcap file to another format altogether.
  123. Alternatively, I've started looking at libpcapnav}%
  124. \footnote{http://netdude.sourceforge.net/%
  125. } \textcolor{black}{as an alternate means to navigate a pcap file and
  126. process packets out of order.}
  127. \subsection{\textcolor{black}{Data Synchronization}}
  128. \textcolor{black}{Knowing when to start sending client traffic in
  129. response to the server will be \char`\"{}tricky\char`\"{}. Without
  130. understanding the actual protocol involved, probably the best general
  131. solution is waiting for a given period of time after no more data
  132. from the server has been received. Not sure what to do if the client
  133. traffic doesn't elicit a response from the server (implement some
  134. kind of timeout?). This will be the basis for the default plug-in.}
  135. \subsection{\textcolor{black}{TCP/IP}}
  136. \textcolor{black}{Dealing with IP fragmentation and TCP stream reassembly
  137. will be another really complex problem. We're basically talking about
  138. implementing a significant portion of a TCP/IP stack. One thought
  139. is to use libnids}%
  140. \footnote{\textcolor{black}{http://www.avet.com.pl/\textasciitilde{}nergal/libnids/}%
  141. } \textcolor{black}{which basically implements a Linux 2.0.37 TCP/IP
  142. stack in user-space. Other solutions include porting a TCP/IP stack
  143. from Open/Net/FreeBSD or writing our own custom stack from scratch.}
  144. \section{\textcolor{black}{Multiple Independent Flows}}
  145. \textcolor{black}{The biggest asynchronous problem, that pcap files
  146. are serial, has to be solved in a scalable manner. Not much can be
  147. assumed about the network traffic contained in a pcap savefile other
  148. than that Murphy's Law will be in effect. This means we'll have to deal
  149. with:}
  150. \begin{itemize}
  151. \item \textcolor{black}{Thousands of small simultaneous flows (captured
  152. on a busy network)}
  153. \item \textcolor{black}{Flows which {}``hang'' mid-stream (an exploit
  154. against a server causes it to crash)}
  155. \item \textcolor{black}{Flows which contain large quantities of data (FTP
  156. transfers of ISO's for example)}
  157. \end{itemize}
  158. \textcolor{black}{How we implement parallel processing of the pcap
  159. savefile will dramatically affect how well we can scale. A few considerations:}
  160. \begin{itemize}
  161. \item Most Unix systems limit the maximum number of open file descriptors
  162. a single process can have. Generally speaking this shouldn't be a
  163. problem except for highly parallel pcaps.
  164. \item While RAM isn't limitless, we can use mmap() to get around this.
  165. \item Many Unix systems have enhanced solutions to poll() which will improve
  166. flow management.
  167. \end{itemize}
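
The descriptor limit mentioned in the first consideration can be queried at run time. The following is a minimal sketch, assuming a POSIX system; the helper name max\_open\_files() is hypothetical and not part of flowreplay:

\begin{verbatim}
#include <limits.h>
#include <sys/resource.h>

/* Hypothetical helper: return the soft limit on open file descriptors
 * for this process. Returns LONG_MAX if the system reports no fixed
 * limit, or -1 if the limit cannot be read. */
long max_open_files(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX)
        return LONG_MAX;
    return (long)rl.rlim_cur;
}
\end{verbatim}

Since each replayed flow needs at least one socket, this value, less the descriptors already in use, gives a rough upper bound on the number of simultaneous flows.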
  168. \begin{comment}
  169. \textcolor{black}{Unix systems implement a maximum limit on the number
  170. of file descriptors a single process can open. My Linux box for example
  171. craps out at 1021 (it's really 1024, but 3 are reserved for STDIN,
  172. STDOUT, STDERR), which seems to be pretty standard for recent Unix's.
  173. This means we're limited to at most 1020 simultaneous flows if the
  174. pcap savefile is opened once and half that (510 flows) if the savefile
  175. is re-opened for each flow.}%
  176. \footnote{\textcolor{black}{It appears that most Unix-like OS's allow root to
  177. increase the {}``hard-limit'' beyond 1024. Compiling a list of methods
  178. to do this for common OS's should be added to the flowreplay documentation.}%
  179. }
  180. \textcolor{black}{RAM isn't limitless. Caching packets in memory may
  181. cause problems when one or more flows with a lot of data {}``hang''
  182. and their packets have to be cached so that other flows can be processed.
  183. If you work with large pcaps containing malicious traffic (say packet
  184. captures from DefCon), this sort of thing may be a real problem. Dealing
  185. with this situation would require complicated buffer limits and error
  186. handling.}
  187. \textcolor{black}{Jumping around in the pcap file via fgetpos() and
  188. fsetpos() is probably the most disk I/O intensive solution and may
  189. effect performance. However, on systems with enough free memory, one
  190. would hope the system disk cache will provide a dramatic speedup.
  191. The {}``bookmarks'' used by fgetpos/fsetpos are just 64 bit integers
  192. which are relatively space efficent compared to other solutions.}
  193. \textcolor{black}{The other typical asynchronous issue is dealing
  194. with multiple sockets, which we will solve via poll()}%
  195. \footnote{\textcolor{black}{poll(2)}%
  196. }\textcolor{black}{. Each flow will define a} \textcolor{black}{\emph{struct
  197. pollfd}} \textcolor{black}{and the amount of time in ms to timeout.
  198. Then prior to calling poll() we walk the list of flows and create
  199. the array of pollfd's and determine the flow(s) with the smallest
  200. timeout. A list of these flows is saved for when poll() returns. Finally,
  201. the current time is tucked away and the timeout and array of pollfd's
  202. is passed to poll().}
  203. \textcolor{black}{When poll() returns, the sockets that returned ready
  204. have their plug-in called. If no sockets are ready, then the flows
  205. saved prior to calling poll() are processed.}
  206. \textcolor{black}{Once all flows are processed, all the flows not
  207. processed have their timeout decremented by the time difference of
  208. the current time and when poll was last called and we start again.}
  209. \end{comment}
  210. \subsection{\textcolor{black}{IP Fragments and TCP Streams}}
  211. \textcolor{black}{There are five major complications with flowreplay:}
  212. \begin{enumerate}
  213. \item \textcolor{black}{The IP datagrams may be fragmented, so we won't be
  214. able to use the standard 5-tuple (src/dst IP, src/dst port, protocol)
  215. to look up which flow a packet belongs to.}
  216. \item \textcolor{black}{IP fragments may arrive out of order which will
  217. complicate ordering of data to be sent.}
  218. \item \textcolor{black}{The TCP segments may arrive out of order which will
  219. complicate ordering of data to be sent.}
  220. \item \textcolor{black}{Packets may be missing in the pcap file because
  221. they were dropped during capture.}
  222. \item \textcolor{black}{There are tools like fragrouter which intentionally
  223. create non-deterministic situations.}
  224. \end{enumerate}
  225. \textcolor{black}{First off, I've decided that I'm not going to worry
  226. about fragrouter or its cousins. I'll handle non-deterministic situations
  227. one and only one way, so that the way flowreplay handles the traffic
  228. will be deterministic. Perhaps I'll make it easy for others to write
  229. a plug-in which will change it, but that's not something I'm going
  230. to concern myself with now.}
  231. \textcolor{black}{Missing packets in the pcap file will probably make
  232. that flow unplayable. There are probably certain situations where
  233. we can make an educated guess, but this is far too complex to worry
  234. about for the first stable release.}
  235. \textcolor{black}{That still leaves creating a basic TCP/IP stack
  236. in user space. The good news is that there is already a library which
  237. does this called libnids. As of version 1.17, libnids can process
  238. packets from a pcap savefile (it's not documented in the man page,
  239. but the code is there).}
  240. \textcolor{black}{A potential problem with libnids though is that
  241. it has to maintain its own state/cache system. This not only means
  242. additional overhead, but jumping around in the pcap file as I'm planning
  243. on doing to handle multiple simultaneous flows is likely to really
  244. confuse libnids' state engine. Also, libnids is licensed under the
  245. GPL, but I want flowreplay released under a BSD-like license; I need
  246. to research if the two are compatible in this way.}
  247. \textcolor{black}{Possible solutions:}
  248. \begin{itemize}
  249. \item \textcolor{black}{Developing a custom wedge between the capture file
  250. and libnids which will cause each packet to only be processed a single
  251. time.}
  252. \item \textcolor{black}{Use libnids to process the pcap file into a new
  253. flow-based format, effectively putting the TCP/IP stack into a dedicated
  254. utility.}
  255. \item \textcolor{black}{Develop a custom user-space TCP/IP stack, perhaps
  256. based on a BSD TCP/IP stack, much like libnids is based on Linux 2.0.37.}
  257. \item \textcolor{black}{Screw it and say that IP fragmentation and out of
  258. order IP packets/TCP segments are not supported. Not sure if this
  259. will meet the needs of potential users.}
  260. \end{itemize}
  261. \subsection{\textcolor{black}{Blocking}}
  262. \textcolor{black}{As stated earlier, one of the main goals of this
  263. project is to keep things single-threaded to make coding plug-ins easier.
  264. One caveat of that is that any function which blocks will cause serious
  265. problems.}
  266. \textcolor{black}{There are three major cases where blocking is likely
  267. to occur:}
  268. \begin{enumerate}
  269. \item \textcolor{black}{Opening a socket}
  270. \item \textcolor{black}{Reading from a socket}
  271. \item \textcolor{black}{Writing to a socket}
  272. \end{enumerate}
  273. \textcolor{black}{Reading from sockets in a non-blocking manner is
  274. easy to solve using poll() or select(). Writing to a socket, or
  275. merely opening a TCP socket via connect(), however, requires a different
  276. method:}
  277. \begin{quotation}
  278. \textcolor{black}{It is possible to do non-blocking IO on sockets
  279. by setting the O\_NONBLOCK flag on a socket file descriptor using
  280. fcntl(2). Then all operations that would block will (usually) return
  281. with EAGAIN (operation should be retried later); connect(2) will return
  282. EINPROGRESS error. The user can then wait for various events via poll(2)
  283. or select(2).}%
  284. \footnote{\textcolor{black}{socket(7)}%
  285. }
  286. \end{quotation}
  287. \textcolor{black}{If connect() returns EINPROGRESS, then we'll just
  288. have to do something like this:}
  289. \begin{lyxcode}
  290. \textcolor{black}{int~e;~socklen\_t~len=sizeof(e);}
  291. \textcolor{black}{if~(getsockopt(conn->s,~SOL\_SOCKET,~SO\_ERROR,~\&e,~\&len)~<~0)~\{~}
  292. ~\textcolor{black}{~~~/{*}~not~yet~{*}/}
  293. ~\textcolor{black}{~~~if(errno~!=~EINPROGRESS)\{~~/{*}~yuck.~kill~it.~{*}/~}
  294. ~\textcolor{black}{~~~~~~log\_fn(LOG\_DEBUG,\char`\"{}in-progress~connect~failed.~Removing.\char`\"{});~}
  295. ~\textcolor{black}{~~~~~~return~-1;~}
  296. ~\textcolor{black}{~~~\}~else~\{~}
  297. ~\textcolor{black}{~~~~~~return~0;~/{*}~no~change,~see~if~next~time~is~better~{*}/~}
  298. ~\textcolor{black}{~~~\}~}
  299. \textcolor{black}{\}~}
  300. \textcolor{black}{/{*}~the~connect~has~finished.~{*}/~}
  301. \end{lyxcode}
  302. \begin{quote}
  303. \textcolor{black}{Note: It may not be totally right, but it works
  304. OK. That chunk of code gets called after poll() returns the socket
  305. as writable. If poll() returns it as readable, then it's probably because
  306. of EOF, i.e. the connect failed. You must poll for both.}
  307. \end{quote}
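
The sequence described above (set O\_NONBLOCK, call connect(), wait for poll() to report the socket as writable, then read SO\_ERROR) might be sketched as follows. The function names are hypothetical and only the system calls are real. This is one possible arrangement, not necessarily how flowreplay will do it:

\begin{verbatim}
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: put a descriptor into non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: start a connect on a non-blocking socket.
 * Returns 1 if connected immediately, 0 if the connect is still in
 * progress, -1 on failure. */
int start_connect(int fd, const struct sockaddr *addr, socklen_t len)
{
    if (connect(fd, addr, len) == 0)
        return 1;
    return errno == EINPROGRESS ? 0 : -1;
}

/* Hypothetical helper: call once poll() reports the socket writable.
 * Returns 0 if the connect succeeded, the pending error code if it
 * failed, or -1 if getsockopt() itself failed. */
int finish_connect(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}
\end{verbatim}

Note that the outcome of the connect is reported through the value stored by SO\_ERROR rather than through the return value of getsockopt(), which only indicates whether the option could be read.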
  308. \section{\textcolor{black}{pcap vs flow File Format}}
  309. \textcolor{black}{As stated before, the pcap file format really isn't
  310. well suited for flowreplay because it uses the raw packet as a container
  311. for data. Flowreplay, however, isn't interested in packets; it's interested
  312. in data streams}%
  313. \footnote{\textcolor{black}{A {}``data stream'' as I call it is a simplex
  314. communication from the client or server which is a complete query,
  315. response or message.}%
  316. } \textcolor{black}{which may span one or more TCP/UDP segments, each
  317. comprised of an IP datagram which may be comprised of multiple IP
  318. fragments. Handling all this additional complexity requires a full
  319. TCP/IP stack in user space which would have additional feature requirements
  320. specific to flowreplay.}
  321. \textcolor{black}{Rather than trying to do that, I've decided to create
  322. a pcap preprocessor for flowreplay called flowprep. Flowprep will
  323. handle all the TCP/IP defragmentation/reassembly and write out a file
  324. containing the data streams for each flow.}
  325. \textcolor{black}{A flow file will contain three sections:}
  326. \begin{enumerate}
  327. \item \textcolor{black}{A header which identifies this as a flowprep file
  328. and the file version}
  329. \item \textcolor{black}{An index of all the flows contained in the file}
  330. \item \textcolor{black}{The data streams themselves}
  331. \end{enumerate}
  332. \begin{center}\textcolor{black}{\includegraphics{flowheader.eps}}\end{center}
  333. \textcolor{black}{At startup, the file header is validated and the
  334. data stream indexes are loaded into memory. Then the first data stream
  335. header from each flow is read. Then each flow and subsequent data
  336. stream is processed based upon the timestamps and plug-ins.}
  337. \section{\textcolor{black}{Plug-ins}}
  338. \textcolor{black}{Plug-ins will provide the {}``intelligence'' in
  339. flowreplay. Flowreplay is designed to be a mere framework for connecting
  340. captured flows in a flow file with socket file handles. How data is
  341. processed and what should be done with it will be done via plug-ins.}
  342. \textcolor{black}{Plug-ins will allow proper handling of a variety
  343. of protocols while hopefully keeping things simple. Another part of
  344. the consideration will be making it easy for others to contribute
  345. to flowreplay. I don't want to have to write all the protocol logic
  346. myself.}
  347. \subsection{\textcolor{black}{Plug-in Basics}}
  348. \textcolor{black}{Each plug-in provides the logic for handling one
  349. or more services. The main purpose of a plug-in is to decide when
  350. flowreplay should send data via one or more sockets. The plug-in can
  351. use any} \textcolor{black}{\emph{non-blocking}} \textcolor{black}{method
  352. of determining if it is appropriate to send data or wait for data to
  353. be received. If necessary, a plug-in can also modify the data sent.}
  354. \textcolor{black}{Each time poll() returns, flowreplay calls the plug-ins
  355. for the flows which either have data waiting or in the case of a timeout,
  356. those flows which timed out. Afterwards, all the flows are processed
  357. and poll() is called on those flows which have their state set to
  358. POLL. The process repeats until there are no more nodes in the
  359. tree.}
  360. \subsection{\textcolor{black}{The Default Plug-in}}
  361. \textcolor{black}{Initially, flowreplay will ship with one basic plug-in
  362. called {}``default''. Any flow which doesn't have a specific plug-in
  363. defined will use default. The goal of the default plug-in is to work
  364. {}``good enough'' for a majority of single-flow protocols such as
  365. SMTP, HTTP, and Telnet. Protocols which use encryption (SSL, SSH,
  366. etc) or multiple flows (FTP, RPC, etc) will never work with the default
  367. plug-in. Furthermore, the default plug-in will only support connections}
  368. \textcolor{black}{\emph{to}} \textcolor{black}{a server, it will not
  369. support accepting connections from clients.}
  370. \textcolor{black}{The default plug-in will provide no data level manipulation
  371. and only a simple method for detecting when it is time to send data
  372. to the server. Detecting when to send data will be done by a {}``no
  373. more data'' timeout value. Basically, by using the pcap file as a
  374. means to determine the order of the exchange, any time it is the server's
  375. turn to send data, flowreplay will wait for the first byte of data
  376. and then start the {}``no more data'' timer. Every time more data
  377. is received, the timer is reset. If the timer reaches zero, then flowreplay
  378. sends the next portion of the client side of the connection. This
  379. is repeated until the flow has been completely replayed or a {}``server
  380. hung'' timeout is reached. The server hung timeout is used to detect
  381. a server which crashed and never starts sending any data which would
  382. start the {}``no more data'' timer.}
  383. \textcolor{black}{Both the {}``no more data'' and {}``server hung''
  384. timers will be user defined values and global to all flows using the
  385. default plug-in.}
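
The two timers described above can be illustrated with a small poll() loop. This is only a sketch under stated assumptions: the function name and its parameters are hypothetical, the data read is discarded rather than handed to a plug-in, and a real implementation would not block in a loop like this but would fold the timers into the main poll() call shared by all flows.

\begin{verbatim}
#include <poll.h>
#include <unistd.h>

/* Hypothetical sketch of the default plug-in's wait logic.
 * Waits up to hung_ms for the first byte from the server, then keeps
 * reading until idle_ms passes with no new data.
 * Returns the number of bytes read, or -1 if the server never sent
 * anything (the "server hung" case) or an error occurred. */
long wait_for_server_quiet(int fd, int idle_ms, int hung_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char buf[4096];
    long total = 0;
    int timeout = hung_ms;          /* applies until the first byte */

    for (;;) {
        int ready = poll(&pfd, 1, timeout);

        if (ready < 0)
            return -1;              /* poll() failed */
        if (ready == 0)             /* a timer expired */
            return total > 0 ? total : -1;

        ssize_t n = read(fd, buf, sizeof(buf));
        if (n < 0)
            return -1;
        if (n == 0)                 /* server closed the connection */
            return total;
        total += n;
        timeout = idle_ms;          /* reset the "no more data" timer */
    }
}
\end{verbatim}

When this function returns a non-negative value, it would be the client's turn to send the next portion of the flow.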
  386. \subsection{\textcolor{black}{Plug-in Details}}
  387. \textcolor{black}{Each plug-in will be comprised of the following:}
  388. \begin{enumerate}
  389. \item \textcolor{black}{An optional global data structure, for inter-flow
  390. communication}
  391. \item \textcolor{black}{Per-flow data structure, for tracking flow state
  392. information}
  393. \item \textcolor{black}{A list of functions which flowreplay will call
  394. when certain well-defined conditions are met.}
  395. \begin{itemize}
  396. \item \textcolor{black}{Required functions:}
  397. \begin{itemize}
  398. \item \textcolor{black}{initialize\_node() - called when a node in the tree
  399. is created using this plug-in}
  400. \item \textcolor{black}{post\_poll\_timeout() - called when the poll() returned
  401. due to a timeout for this node}
  402. \item \textcolor{black}{post\_poll\_read() - called when the poll() returned
  403. due to the socket being ready}
  404. \item \textcolor{black}{buffer\_full() - called when the packet buffer
  405. for this flow is full}
  406. \item \textcolor{black}{delete\_node() - called just prior to the node being
  407. free()'d}
  408. \end{itemize}
  409. \item \textcolor{black}{Optional functions:}
  410. \begin{itemize}
  411. \item \textcolor{black}{pre\_send\_data() - called before data is sent}
  412. \item \textcolor{black}{post\_send\_data() - called after data is sent}
  413. \item \textcolor{black}{pre\_poll() - called prior to poll()}
  414. \item \textcolor{black}{post\_poll\_default() - called when poll() returns
  415. and neither the socket was ready nor the node timed out}
  416. \item \textcolor{black}{open\_socket() - called after the socket is opened}
  417. \item \textcolor{black}{close\_socket() - called after the socket is closed}
  418. \end{itemize}
  419. \end{itemize}
  420. \end{enumerate}
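
One way to represent the list above in C is a table of function pointers. The sketch below is an assumption for illustration only: the struct names, the single callback signature and the helper functions are hypothetical, and only the hook names come from the list above.

\begin{verbatim}
#include <stddef.h>

struct flow_node;   /* hypothetical per-flow state, defined elsewhere */

/* Hypothetical callback signature: every hook receives the node it
 * applies to and returns 0 on success or -1 on error. */
typedef int (*plugin_hook)(struct flow_node *node);

struct flow_plugin {
    const char *name;
    void *global_data;              /* optional, shared between flows */

    /* required functions */
    plugin_hook initialize_node;
    plugin_hook post_poll_timeout;
    plugin_hook post_poll_read;
    plugin_hook buffer_full;
    plugin_hook delete_node;

    /* optional functions; may be NULL */
    plugin_hook pre_send_data;
    plugin_hook post_send_data;
    plugin_hook pre_poll;
    plugin_hook post_poll_default;
    plugin_hook open_socket;
    plugin_hook close_socket;
};

/* A do-nothing hook, usable as a placeholder. */
int noop_hook(struct flow_node *node)
{
    (void)node;
    return 0;
}

/* Returns 1 if every required hook is set, 0 otherwise. */
int plugin_is_complete(const struct flow_plugin *p)
{
    return p != NULL &&
           p->initialize_node && p->post_poll_timeout &&
           p->post_poll_read && p->buffer_full && p->delete_node;
}
\end{verbatim}

Checking the required hooks once, when a plug-in is registered, would let the main loop call them without testing for NULL each time; only the optional hooks would need a check before each call.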
  423. \end{document}