  1. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
  2. <!--Converted with LaTeX2HTML 2002-2-1 (1.70)
  3. original version by: Nikos Drakos, CBLU, University of Leeds
  4. * revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
  5. * with significant contributions from:
  6. Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
  7. <HTML>
  8. <HEAD>
  9. <TITLE>4 Multiple Independent Flows</TITLE>
  10. <META NAME="description" CONTENT="4 Multiple Independent Flows">
  11. <META NAME="keywords" CONTENT="flowreplay">
  12. <META NAME="resource-type" CONTENT="document">
  13. <META NAME="distribution" CONTENT="global">
  14. <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
  15. <META NAME="Generator" CONTENT="LaTeX2HTML v2002-2-1">
  16. <META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
  17. <LINK REL="STYLESHEET" HREF="flowreplay.css">
  18. <LINK REL="next" HREF="node5.html">
  19. <LINK REL="previous" HREF="node3.html">
  20. <LINK REL="up" HREF="flowreplay.html">
  22. </HEAD>
  23. <BODY >
  24. <DIV CLASS="navigation"><!--Navigation Panel-->
  25. <A NAME="tex2html72"
  26. HREF="node5.html">
  27. <IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next" SRC="next.png"></A>
  28. <A NAME="tex2html70"
  29. HREF="flowreplay.html">
  30. <IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up" SRC="up.png"></A>
  31. <A NAME="tex2html64"
  32. HREF="node3.html">
  33. <IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous" SRC="prev.png"></A>
  34. <BR>
  35. <B> Next:</B> <A NAME="tex2html73"
  36. HREF="node5.html">5 pcap vs flow</A>
  37. <B> Up:</B> <A NAME="tex2html71"
  38. HREF="flowreplay.html">Flowreplay Design Notes</A>
  39. <B> Previous:</B> <A NAME="tex2html65"
  40. HREF="node3.html">3 Design Thoughts</A>
  41. <BR>
  42. <BR></DIV>
  43. <!--End of Navigation Panel-->
  44. <!--Table of Child-Links-->
  45. <A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></A>
  46. <UL CLASS="ChildLinks">
  47. <LI><A NAME="tex2html74"
  48. HREF="node4.html#SECTION00041000000000000000"><SPAN CLASS="arabic">4</SPAN>.<SPAN CLASS="arabic">1</SPAN> <SPAN ID="hue157">IP Fragments and TCP Streams</SPAN></A>
  49. <LI><A NAME="tex2html75"
  50. HREF="node4.html#SECTION00042000000000000000"><SPAN CLASS="arabic">4</SPAN>.<SPAN CLASS="arabic">2</SPAN> <SPAN ID="hue193">Blocking</SPAN></A>
  51. </UL>
  52. <!--End of Table of Child-Links-->
  53. <HR>
  54. <H1><A NAME="SECTION00040000000000000000">
  55. <SPAN CLASS="arabic">4</SPAN> <SPAN ID="hue141">Multiple Independent Flows</SPAN></A>
  56. </H1>
  57. <P>
<SPAN ID="hue143">The biggest asynchronous problem, that pcap files
are serial, has to be solved in a scalable manner. Not much can be
assumed about the network traffic contained in a pcap savefile other
than that Murphy's Law will be in effect. This means we'll have to deal
with:</SPAN>
  63. <P>
  64. <UL>
  65. <LI><SPAN ID="hue146">Thousands of small simultaneous flows (captured
  66. on a busy network)</SPAN>
  67. </LI>
<LI><SPAN ID="hue379">Flows which &quot;hang&quot; mid-stream (for example, when an exploit
against a server causes it to crash)</SPAN>
</LI>
<LI><SPAN ID="hue150">Flows which contain large quantities of data (FTP
transfers of ISOs, for example)</SPAN>
  73. </LI>
  74. </UL>
<SPAN ID="hue153">How we implement parallel processing of the pcap
savefile will dramatically affect how well we can scale. A few considerations:</SPAN>
  77. <P>
  78. <UL>
<LI>Most Unix systems limit the maximum number of open file descriptors
a single process can have. Generally speaking, this shouldn't be a
problem except for highly parallel pcaps.
</LI>
<LI>While RAM isn't limitless, we can use mmap() to map the savefile and
let the kernel page data in and out as needed, rather than reading it all
into memory.
</LI>
<LI>Many Unix systems have enhanced alternatives to poll() (kqueue on the
BSDs and epoll on Linux, for example) which will improve flow management.
  87. </LI>
  88. </UL>
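<P>
The file descriptor limit mentioned above can be inspected and, within
the hard limit, raised at startup. The following is only a minimal sketch
using the standard getrlimit()/setrlimit() calls; the helper name
raise_fd_limit() is hypothetical and not part of flowreplay.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft limit on open file descriptors
 * to the hard limit. On success returns 0 and stores the resulting
 * soft limit in *out; returns -1 if the limit can't be read or raised.
 */
int raise_fd_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
        return -1;

    if (rl.rlim_cur < rl.rlim_max) {
        /* An unprivileged process may raise its soft limit up to the hard limit. */
        rl.rlim_cur = rl.rlim_max;
        if (setrlimit(RLIMIT_NOFILE, &rl) < 0)
            return -1;
    }

    *out = rl.rlim_cur;
    return 0;
}
```

<P>
The returned value gives an upper bound on how many flows could hold an
open socket at the same time, minus whatever descriptors the process
already uses.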
  89. <P>
  90. <H2><A NAME="SECTION00041000000000000000">
  91. <SPAN CLASS="arabic">4</SPAN>.<SPAN CLASS="arabic">1</SPAN> <SPAN ID="hue157">IP Fragments and TCP Streams</SPAN></A>
  92. </H2>
  93. <P>
  94. <SPAN ID="hue159">There are five major complications with flowreplay:</SPAN>
  95. <P>
  96. <OL>
<LI><SPAN ID="hue162">The IP datagrams may be fragmented, so we won't always be
able to use the standard 5-tuple (src/dst IP, src/dst port, protocol)
to look up which flow a packet belongs to: only the first fragment
carries the TCP or UDP ports.</SPAN>
  100. </LI>
  101. <LI><SPAN ID="hue164">IP fragments may arrive out of order which will
  102. complicate ordering of data to be sent.</SPAN>
  103. </LI>
  104. <LI><SPAN ID="hue166">The TCP segments may arrive out of order which will
  105. complicate ordering of data to be sent.</SPAN>
  106. </LI>
  107. <LI><SPAN ID="hue168">Packets may be missing in the pcap file because
  108. they were dropped during capture.</SPAN>
  109. </LI>
  110. <LI><SPAN ID="hue170">There are tools like fragrouter which intentionally
  111. create non-deterministic situations.</SPAN>
  112. </LI>
  113. </OL>
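<P>
To illustrate the first complication: a lookup key could fall back from
the 5-tuple to (src IP, dst IP, protocol, IP ID) whenever a packet is a
fragment, since those are the fields IP reassembly itself matches on.
This is a rough sketch under that assumption, not flowreplay's actual
design; struct flow_key and make_flow_key() are hypothetical names, and
only IPv4 is handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup key; all multi-byte fields are in host byte order. */
struct flow_key {
    uint32_t src, dst;      /* IPv4 addresses */
    uint16_t sport, dport;  /* TCP/UDP ports, 0 for fragments */
    uint16_t ip_id;         /* IP identification, 0 for unfragmented packets */
    uint8_t  proto;         /* IP protocol number */
    uint8_t  is_fragment;   /* non-zero if MF is set or the offset is non-zero */
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Build a key from a raw IPv4 packet (starting at the IP header).
 * Returns 0 on success, -1 if the packet is truncated or not IPv4.
 */
int make_flow_key(const uint8_t *pkt, size_t len, struct flow_key *key)
{
    size_t ihl;
    uint16_t frag;

    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;
    if (ihl < 20 || len < ihl)
        return -1;

    frag = rd16(pkt + 6);                      /* flags + fragment offset */
    key->is_fragment = (frag & 0x3fff) != 0;   /* MF bit or offset != 0 */
    key->proto = pkt[9];
    key->src = rd32(pkt + 12);
    key->dst = rd32(pkt + 16);
    key->sport = key->dport = 0;
    key->ip_id = 0;

    if (key->is_fragment) {
        /* Ports may be absent, so identify the datagram by its IP ID. */
        key->ip_id = rd16(pkt + 4);
    } else if (key->proto == 6 || key->proto == 17) {   /* TCP or UDP */
        if (len < ihl + 4)
            return -1;
        key->sport = rd16(pkt + ihl);
        key->dport = rd16(pkt + ihl + 2);
    }
    return 0;
}
```

<P>
A real implementation would still have to remember which flow the first
fragment of each datagram belonged to, so that later fragments with the
same key can be attached to it.
<P>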
<SPAN ID="hue173">First off, I've decided that I'm not going to worry
about fragrouter or its cousins. I'll handle non-deterministic situations
one and only one way, so that the way flowreplay handles the traffic
will be deterministic. Perhaps I'll make it easy for others to write
a plug-in which will change this, but that's not something I'm going
to concern myself with now.</SPAN>
  120. <P>
<SPAN ID="hue175">Missing packets in the pcap file will probably make
that flow unplayable. There are probably certain situations where
we can make an educated guess, but this is far too complex to worry
about for the first stable release.</SPAN>
  125. <P>
<SPAN ID="hue177">That still leaves creating a basic TCP/IP stack
in user space. The good news is that there is already a library which
does this, called libnids. As of version 1.17, libnids can process
packets from a pcap savefile (it's not documented in the man page,
but the code is there).</SPAN>
  131. <P>
<SPAN ID="hue179">A potential problem with libnids, though, is that
it has to maintain its own state/cache system. This not only means
additional overhead, but jumping around in the pcap file, as I'm planning
on doing to handle multiple simultaneous flows, is likely to really
confuse libnids' state engine. Also, libnids is licensed under the
GPL, but I want flowreplay released under a BSD-like license; I need
to research whether the two are compatible in this way.</SPAN>
  139. <P>
  140. <SPAN ID="hue181">Possible solutions:</SPAN>
  141. <P>
  142. <UL>
<LI><SPAN ID="hue184">Develop a custom wedge between the capture file
and libnids which will cause each packet to be processed only a single
time.</SPAN>
  146. </LI>
  147. <LI><SPAN ID="hue186">Use libnids to process the pcap file into a new
  148. flow-based format, effectively putting the TCP/IP stack into a dedicated
  149. utility.</SPAN>
  150. </LI>
  151. <LI><SPAN ID="hue188">Develop a custom user-space TCP/IP stack, perhaps
  152. based on a BSD TCP/IP stack, much like libnids is based on Linux 2.0.37.</SPAN>
  153. </LI>
<LI><SPAN ID="hue190">Screw it and say that IP fragmentation and
out-of-order IP packets/TCP segments are not supported. Not sure if this
will meet the needs of potential users.</SPAN>
  157. </LI>
  158. </UL>
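<P>
Whichever option is chosen, the reordering part of the problem is small
once segments have been grouped by flow. The sketch below orders
segments by sequence number relative to the first data byte (so 32-bit
wraparound is handled) and reports how many of them can be played before
the first hole. It is only an illustration; struct segment,
sort_segments() and playable_segments() are hypothetical names, and it
assumes all segments lie within 2^32 bytes of the starting sequence
number.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one captured TCP segment of a single flow. */
struct segment {
    uint32_t seq;          /* TCP sequence number of the first payload byte */
    uint32_t len;          /* payload length in bytes */
    const uint8_t *data;   /* payload (not touched by this sketch) */
};

/*
 * Sort segments by their offset from first_seq. Comparing offsets
 * rather than raw sequence numbers keeps the order correct when the
 * sequence space wraps past 2^32. Insertion sort is used because
 * captured segments are usually almost in order already.
 */
void sort_segments(struct segment *segs, size_t n, uint32_t first_seq)
{
    for (size_t i = 1; i < n; i++) {
        struct segment cur = segs[i];
        uint32_t cur_off = (uint32_t)(cur.seq - first_seq);
        size_t j = i;

        while (j > 0 && (uint32_t)(segs[j - 1].seq - first_seq) > cur_off) {
            segs[j] = segs[j - 1];
            j--;
        }
        segs[j] = cur;
    }
}

/*
 * Given sorted segments, return how many can be replayed before the
 * first gap in the byte stream (for example, a packet dropped during capture).
 * Overlapping or retransmitted segments are accepted.
 */
size_t playable_segments(const struct segment *segs, size_t n, uint32_t first_seq)
{
    uint32_t next = 0;   /* next expected offset from first_seq */
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t off = (uint32_t)(segs[i].seq - first_seq);

        if (off > next)
            break;                      /* hole in the stream */
        if (off + segs[i].len > next)
            next = off + segs[i].len;
    }
    return i;
}
```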
  159. <P>
  160. <H2><A NAME="SECTION00042000000000000000">
  161. <SPAN CLASS="arabic">4</SPAN>.<SPAN CLASS="arabic">2</SPAN> <SPAN ID="hue193">Blocking</SPAN></A>
  162. </H2>
  163. <P>
<SPAN ID="hue195">As stated earlier, one of the main goals of this
project is to keep things single-threaded to make coding plugins easier.
One caveat is that any function which blocks will cause serious
problems.</SPAN>
  168. <P>
  169. <SPAN ID="hue197">There are three major cases where blocking is likely
  170. to occur:</SPAN>
  171. <P>
  172. <OL>
  173. <LI><SPAN ID="hue200">Opening a socket</SPAN>
  174. </LI>
  175. <LI><SPAN ID="hue202">Reading from a socket</SPAN>
  176. </LI>
  177. <LI><SPAN ID="hue204">Writing to a socket</SPAN>
  178. </LI>
  179. </OL>
<SPAN ID="hue207">Reading from sockets in a non-blocking manner is
easy to solve using poll() or select(). Writing to a socket, or
merely opening a TCP socket via connect(), however, requires a different
method:</SPAN>
  184. <P>
  185. <BLOCKQUOTE>
  186. <SPAN ID="hue210">It is possible to do non-blocking IO on sockets
  187. by setting the O_NONBLOCK flag on a socket file descriptor using
  188. fcntl(2). Then all operations that would block will (usually) return
  189. with EAGAIN (operation should be retried later); connect(2) will return
  190. EINPROGRESS error. The user can then wait for various events via poll(2)
  191. or select(2).</SPAN><A NAME="tex2html7"
  192. HREF="#foot382"><SUP><SPAN CLASS="arabic">7</SPAN></SUP></A>
  193. </BLOCKQUOTE>
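<P>
As a minimal sketch of what the man page describes, the following sets
O_NONBLOCK on a descriptor and starts a connect() that is allowed to
complete later. The helper names set_nonblocking() and start_connect()
are hypothetical, not existing flowreplay functions.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>

/* Put a descriptor into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return 0;
}

/*
 * Start connecting a non-blocking socket.
 * Returns 1 if the connection completed immediately,
 *         0 if it is in progress (wait for the socket with poll/select),
 *        -1 on any other error.
 */
int start_connect(int fd, const struct sockaddr *addr, socklen_t addrlen)
{
    if (connect(fd, addr, addrlen) == 0)
        return 1;
    if (errno == EINPROGRESS)
        return 0;
    return -1;
}
```

<P>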
  194. <SPAN ID="hue215">If connect() returns EINPROGRESS, then we'll just
  195. have to do something like this:</SPAN>
  196. <P>
  197. <DL COMPACT>
  198. <DT>
<DD><SPAN ID="hue218">int&nbsp;e&nbsp;=&nbsp;0;&nbsp;socklen_t&nbsp;len&nbsp;=&nbsp;sizeof(e);</SPAN>
<P>
<SPAN ID="hue220">if&nbsp;(getsockopt(conn-&gt;s,&nbsp;SOL_SOCKET,&nbsp;SO_ERROR,&nbsp;&amp;e,&nbsp;&amp;len)&nbsp;&lt;&nbsp;0&nbsp;||&nbsp;e&nbsp;!=&nbsp;0)&nbsp;{&nbsp;</SPAN>
<P>
&nbsp;<SPAN ID="hue383">&nbsp;&nbsp;&nbsp;/*&nbsp;not&nbsp;yet,&nbsp;or&nbsp;failed;&nbsp;e&nbsp;holds&nbsp;the&nbsp;pending&nbsp;socket&nbsp;error&nbsp;(0&nbsp;if&nbsp;getsockopt&nbsp;itself&nbsp;failed)&nbsp;*/</SPAN>
<P>
&nbsp;<SPAN ID="hue384">&nbsp;&nbsp;&nbsp;if&nbsp;(e&nbsp;!=&nbsp;EINPROGRESS)&nbsp;{&nbsp;&nbsp;/*&nbsp;yuck.&nbsp;kill&nbsp;it.&nbsp;*/&nbsp;</SPAN>
<P>
&nbsp;<SPAN ID="hue385">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;log_fn(LOG_DEBUG,&#34;in-progress&nbsp;connect&nbsp;failed.&nbsp;Removing.&#34;);&nbsp;</SPAN>
<P>
&nbsp;<SPAN ID="hue231">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;-1;&nbsp;</SPAN>
<P>
&nbsp;<SPAN ID="hue233">&nbsp;&nbsp;&nbsp;}&nbsp;else&nbsp;{&nbsp;</SPAN>
<P>
&nbsp;<SPAN ID="hue386">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;0;&nbsp;/*&nbsp;no&nbsp;change,&nbsp;see&nbsp;if&nbsp;next&nbsp;time&nbsp;is&nbsp;better&nbsp;*/&nbsp;</SPAN>
<P>
&nbsp;<SPAN ID="hue238">&nbsp;&nbsp;&nbsp;}&nbsp;</SPAN>
<P>
<SPAN ID="hue240">}&nbsp;</SPAN>
<P>
<SPAN ID="hue387">/*&nbsp;the&nbsp;connect&nbsp;has&nbsp;finished.&nbsp;*/&nbsp;</SPAN>
  220. </DD>
  221. </DL><BLOCKQUOTE>
<SPAN ID="hue247">Note: It may not be totally right, but it works
OK. (That chunk of code gets called after poll returns the socket
as writable. If poll returns it as readable, then it's probably because
of EOF, meaning the connect failed. You must poll for both.)</SPAN>
  226. </BLOCKQUOTE>
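<P>
Putting the note into practice, a completion check might poll the socket
for both readability and writability and then ask for the pending error
with SO_ERROR. This is a sketch under those assumptions rather than a
definitive implementation; finish_connect() is a hypothetical name.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/*
 * Check on a non-blocking connect() that returned EINPROGRESS.
 * Waits up to timeout_ms milliseconds (0 means just check once).
 * Returns 1 if the socket is connected,
 *         0 if the connect is still pending,
 *        -1 if the connect (or the check itself) failed.
 */
int finish_connect(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLOUT, .revents = 0 };
    int err = 0;
    socklen_t len = sizeof(err);
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return (errno == EINTR) ? 0 : -1;   /* interrupted: try again later */
    if (n == 0)
        return 0;                           /* neither readable nor writable yet */

    /* The socket reported an event; SO_ERROR tells success from failure. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
        return -1;
    return 1;
}
```

<P>
In a single-threaded event loop the timeout would normally be 0, with
the socket left in the main poll() set until it reports an event.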
  227. <P>
  228. <BR><HR><H4>Footnotes</H4>
  229. <DL>
  230. <DT><A NAME="foot382">...&nbsp;</A><A
  231. HREF="node4.html#tex2html7"><SUP><SPAN CLASS="arabic">7</SPAN></SUP></A></DT>
  232. <DD><SPAN ID="hue212">socket(7)</SPAN>
  233. </DD>
  234. </DL>
  254. <ADDRESS>
  255. Aaron Turner
  256. 2005-08-07
  257. </ADDRESS>
  258. </BODY>
  259. </HTML>