Non Cooperative Path Characterization using Packet Spacing Techniques

Supriyo Chakraborty, Bikramjit Walia and D. Manjunath
Department of Electrical Engineering, IIT-Bombay, Mumbai, INDIA
supriyo, bikram, dmanju@ee.iitb.ac.in

Abstract: Non cooperative network path measurements assume that there is no cooperation from the other end of a path. Such methods have to cleverly exploit standard protocol options to initiate probing traffic. Performing measurements in this framework for the 'reverse path' is particularly challenging. We describe the design of iPathmeter2, a non cooperative tool to measure capacity and available bandwidth on the reverse path. The probing traffic of iPathmeter2 consists of a chirp of packet-pairs with appropriate spacing. The spacing between the ACKs from the measuring host and the size of the advertised receive window help shape the transmitting pattern from the remote host to the measuring node. Thus, iPathmeter2 adapts the existing cooperative packet spacing techniques for reverse path, non-cooperative measurements. A utilization based estimator for the available bandwidth is also described. iPathmeter2 is validated by performing measurements under controlled conditions. Results from live tests on the Internet are also reported.

Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY BOMBAY. Downloaded on December 3, 2008 at 03:39 from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

End-to-end bandwidth estimation on a network path is useful in a network measurement and monitoring tool. Example applications of path bandwidth estimates include peer-to-peer applications, service level monitors and dynamic server selection [1]. Two bandwidth related metrics are usually defined for a network path: bottleneck capacity and available bandwidth. In this paper we describe a non-cooperative technique to obtain these performance measures for a path from only one end of the path. An obvious application of such a single ended measurement method would be in monitoring Internet service quality by an end user, e.g., the bandwidth received by the node from a popular web server, where the user typically does not have access to the far-end of any Internet path. Further, we remark that for most users the service quality on a few paths in the Internet will dominate the Internet service quality that they see. Hence by measuring the path characteristics on these paths, the user can quantify its Internet service quality.

An important requirement of a path bandwidth estimation technique is that it be able to measure forward and reverse path characteristics from the measuring node. Most techniques available in the literature can be easily extended to obtain the non-cooperative forward path characteristics, while reverse path measurements require clever techniques. We describe one such technique in this paper. Since we adapt existing techniques for reverse path, non-cooperative measurement, a brief survey of the cooperative measurement techniques is given in the next section. We discuss generic issues in the design of non cooperative estimators in Section III and the design issues that need to be addressed when developing such a tool for the public Internet in Section IV. These design issues are used in iPathmeter2 and Section V describes its basic design. Some experimental results are presented in Section VI.

II. COOPERATIVE BANDWIDTH ESTIMATION

Many estimators are available to estimate the bandwidth metrics [1]-[11]. Almost all of these estimators work in a cooperative framework which requires access to both ends of the path being measured. These estimators work as follows: the sender transmits probing packets according to a specified pattern, and the receiver node timestamps these packets and obtains the deviation from the pattern. These deviations are used in the bandwidth estimation.

There are four basic schemes of packet transmission patterns that are used by the above estimators: (1) packet pair dispersion, (2) variable packet size probing, (3) self induced congestion and (4) train of packet pairs.

In packet pair dispersion (PPD) based bandwidth estimators, two packets (a packet-pair) are transmitted back-to-back to cause them to queue together at the bottleneck link. If the links were rate-based servers then, when the packets arrive at the destination, the dispersion will be the same as the dispersion when they exit the bottleneck link. Thus if $L$ is the length of the probing packet and the dispersion observed is $D$, the path capacity $C$ can be estimated by $C = L/D$. A modification of the basic packet pair technique is packet train probing, which sends multiple back-to-back packets. The dispersion of the packet train is shown to be asymptotically equal to the available capacity even in the presence of cross traffic. Pathrate [4], IGI [8], Cprobe [1] and Spruce [11] use this technique.

In variable packet size (VPS) probing, the capacity of each hop along a path is measured by exploiting the fact that the number of links that a packet traverses on a path can be limited by the TTL field in the IP packet header. On reception of a packet with an expired TTL, a router responds with an ICMP error message back to the sender node. By varying the TTL field within a packet, the minimum RTT (Round Trip Time) for each hop along the path is obtained as a function of the packet size. The capacity of each link on the path is obtained from the estimate of the link capacity on the preceding link and a plot of the minimum RTT (for packets returned from the receiver side of the link) against the packet size. Pathchar [2], Clink [3] and Pchar [10] use this technique.

In the third technique to estimate the bottleneck capacity and available bandwidth, a binary search kind of approach is used. The goal here is to build up a queue of the probing packets at the bottleneck link, thereby causing a 'self induced congestion' (SIC) at the link, and infer its bandwidth from the 'queueing signature'. By varying the packet rate in the packet-train at the source, the available bandwidth can be estimated as the rate for which the queue length begins to increase. The interpacket spacing corresponds to the transmission rate. Pathload [9] and Pathchirp [6] are examples of this approach.

A fourth approach uses a train of packet-pairs (TOPP) [5] in which the spacing between the packets in the train is progressively decreased. When the spacing becomes less than the service time of the bottleneck link, the second probing packet is queued at the bottleneck link and the spacing between the packets at the output of the link starts to increase. Thus the packet spacings in the received train can be used as an estimator of the available bandwidth.

Fig. 1. S1, S2 and S3 are the exponentially decreasing spacings between the ACKs sent from the measuring node (local host) to the remote host. When this spacing is to be S4, there is no packet to ACK at the local host. The ACK that should have been sent according to the algorithm but could not be sent is shown as a dotted line. Thus, long chirps cannot be initiated from the far-end of the path.

Observe from the above discussion that, except for Pathchar and Clink, the other bandwidth estimators are designed for a cooperative framework. Although Pathchar and Clink do not assume a cooperative framework, they depend on the prompt generation of ICMP messages, which, in the days of DoS attacks, is not a reasonable assumption. Further, Pathchar and Clink can only measure forward path characteristics.
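As a small illustration of the packet-pair relation $C = L/D$ from this section, the following Python sketch computes a capacity estimate from one observed dispersion. The function name and the example numbers (a 1500-byte packet and a 6 ms dispersion) are assumptions chosen for illustration; they are not taken from any of the tools surveyed above.

```python
def packet_pair_capacity(packet_bits, dispersion_s):
    """Estimate path capacity (bits/s) as C = L / D.

    packet_bits  -- L, length of the probing packet in bits
    dispersion_s -- D, spacing between the two packets at the receiver, in seconds
    """
    if packet_bits <= 0 or dispersion_s <= 0:
        raise ValueError("packet length and dispersion must be positive")
    return packet_bits / dispersion_s


if __name__ == "__main__":
    # Hypothetical example: a 1500-byte packet seen with 6 ms dispersion
    # corresponds to a bottleneck of roughly 2 Mbps.
    estimate = packet_pair_capacity(1500 * 8, 0.006)
    print(f"estimated capacity: {estimate / 1e6:.2f} Mbps")
```

A single pair gives a noisy estimate when cross traffic is present, which is why the tools above rely on many samples or on packet trains.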
III. NON COOPERATIVE BANDWIDTH ESTIMATION

A non-cooperative tool has to cleverly exploit standard protocol options to initiate probing traffic in the network according to a pattern specified by the measurement algorithm, e.g., like that used by Pathload or Pathchirp. Note though that this pattern has to be initiated in the direction in which the measurements are to be carried out, i.e., for reverse path characterization, the required pattern should be initiated at the far end. We exploit the features of TCP and the fact that at the other end of a path there will likely be open public TCP ports, e.g., HTTP and FTP, in the design of a non-cooperative, reverse path bandwidth estimator.

The TCP feature that we exploit is as follows. Recall that the TCP sending window is affected by the acknowledgments (ACKs) from the receiver and by the advertised 'receiver window' (rwin). The rate at which ACKs are sent by the receiver (the spacing between them) and the size of rwin are used to shape the incoming traffic as per the requirement of the measuring algorithm.

Fig. 2. On the left side is the buffering phase. On the right is the probing phase. The ACKs sent in the probing phase correspond to the packets captured during the buffering phase. Here S1, S2, S3 and S4 are the exponentially distributed spacings between the packets sent from the local host to the remote host. As the spacing at the local host decreases (the rate of probing increases), the packets from the remote host would start getting queued.

IV. DESIGN ISSUES

It is possible to emulate any of the cooperative algorithms in the non cooperative framework. For illustration, consider the possible use of packet chirps as in Pathchirp, where the receiver could send exponentially spaced ACKs with rwin equal to MTU sized packets. Then, one would expect that the remote host would send data packets with interpacket spacing similar to the ACK spacing and hence emulate a chirp from the far-end.
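The exponentially spaced ACKs mentioned above can be sketched as a simple schedule. The Python below is only an illustration under stated assumptions: the function names, the 8 ms starting gap and the spread factor of 2 are hypothetical, and the text does not specify what values a real chirp would use.

```python
def chirp_ack_gaps(first_gap_s, spread_factor, count):
    """Return `count` ACK gaps in seconds, each `spread_factor` times
    shorter than the previous one (an exponentially shrinking chirp)."""
    if first_gap_s <= 0 or spread_factor <= 1 or count < 1:
        raise ValueError("need first_gap_s > 0, spread_factor > 1, count >= 1")
    return [first_gap_s / spread_factor ** k for k in range(count)]


def probing_rates_bps(gaps_s, packet_bytes=1500):
    """Rate the remote host would send at if it released one MTU sized
    packet per ACK with the same spacing as the ACKs."""
    return [packet_bytes * 8 / gap for gap in gaps_s]


if __name__ == "__main__":
    gaps = chirp_ack_gaps(first_gap_s=0.008, spread_factor=2.0, count=4)
    for gap, rate in zip(gaps, probing_rates_bps(gaps)):
        print(f"gap {gap * 1e3:.1f} ms -> probing rate {rate / 1e6:.1f} Mbps")
```

Each step of the schedule probes the path at a higher rate, so a short chirp sweeps a wide range of rates with few packets.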
In our experiments, we could initiate a chirp (exponentially spaced packets) from the far-end on the local network but we could not achieve it over the WAN. The problem here is that, because of the WAN path delays, the receiver may not be able to send a sufficient number of ACKs with the required spacing. This is because a sufficient number of packets are not yet received to allow the sending of ACKs at a high rate. See Fig. 1 for an illustration. To overcome this, the following two phase approach was attempted.

1) Buffering Phase: We first 'buffer' a sufficient number of packets by not acknowledging them. These will be ACKed in the probing phase as per the requirements of the measuring algorithm. After establishing the TCP connection with the remote host, for the first few ACKs, we gradually increase rwin. As rwin is doubled, the remote host responds with two back-to-back packets. The last of the two packets is acknowledged with rwin equal to four times the original rwin. The remote host now responds with four back-to-back packets. These four packets are 'buffered', to be ACKed in the probing phase (as explained below). It is important to ensure that the number of packets so buffered does not cause retransmissions by the remote host, as this would reduce cwnd. See Fig. 2 for an illustration of these events.
2) Probing Phase: The measuring node transmits ACKs for the buffered packets with spacings and rwin values that would cause the remote host to transmit new packets resembling a packet chirp. This is illustrated in Fig. 2.

0-7803-8924-7/05/$20.00 © 2005 IEEE.

While experimenting with the above two phase approach, we found that our probing phase did not perform as expected.
This was because even when the rwin and (our estimate of) cwnd were such as to allow these transmissions, many hosts would not transmit new packets till all their previous transmissions were acknowledged. This, we believe, is due to the use of Nagle's algorithm [12] on the remote host.

We also encountered the following unexpected behavior with respect to a sender's response to rwin. Many hosts would send two packets of size equal to half of rwin for each ACK sent. This behavior was seen for all values of rwin. However, if the ACK spacing were reduced, we received packets equal to rwin, or equal to the path MTU when rwin was greater than the path MTU. The reason for this is not clear.

The above experience leads us to the following outline of the final design. The packets from the remote host should be shaped so that they resemble a "chirp of packet pairs" (COPP), where the spacing between the packets in each packet-pair is successively reduced. This chirp pattern that should be initiated by the remote host has similarities to both TOPP and Pathchirp. We reiterate that ours is a non-cooperative tool and the challenge is to be able to initiate a packet pair from the remote host at any time within the experiment by exploiting the current TCP implementation. We have successfully achieved this in iPathmeter2.

V. IPATHMETER2: DESIGN AND IMPLEMENTATION

We make the following reasonable assumptions in the design of iPathmeter2.
1) An HTTP daemon is running on a remote host at the far-end of the path.
2) A file of a size that will sustain the burst of probe packets is available for download via HTTP at the remote host.
3) Like in Clink and Pathchar, we also assume that the ACKs do not experience any congestion.

The implementation details are as follows:
1) Initialize by setting up firewall rules in the INPUT chain of Iptables to block incoming probe packets from reaching the kernel TCP stack directly. This will prevent the kernel TCP from sending ACKs for these probe packets. Note that the probe packets come as HTTP packets from the remote host with destination port the same as that used by iPathmeter2.
2) We then bind to a randomly chosen port on the local host, establish a TCP session with the remote host using the 3-way handshake and then initiate an HTTP connection with it.
3) The two-phase measurement described in the previous section is then initiated.

The code for iPathmeter2 uses two independent processes, one to send appropriately spaced ACKs and the other to receive the packets from the remote host. Clearly, neither of these should be blocked because the ACK spacings have to be correct and the received packets should be timestamped accurately. Thus we need to poll the NIC for received packets in a non-blocking manner. In iPathmeter2, this is achieved using a system independent library called Libpcap that provides a portable framework for low-level network monitoring. We use it to poll the NIC regularly to capture the probe packets that will be sent by the remote host. This is explained in detail below.

The Buffering Phase: Recall that in this phase we accumulate packets by not acknowledging them. This is achieved as follows. Initially, send an ACK with rwin = 1500 bytes. The remote host responds with two back-to-back packets of 750 bytes each. Do a cumulative acknowledgment by sending an ACK for the second packet, but changing rwin to four times its original value to account for the fact that it can now accept four MTU sized packets. The remote host responds with a series of four back-to-back packets. We however observed deviation in the number of back-to-back packets received from different sites. This may be attributed to various burst mitigation techniques adopted in the TCP implementations [13]. On some of the Linux implementations a variable MAXBURST is used to achieve the same, and it is normally set to a value of 3. We must also mention here that this does not affect our experiment, because we require only about three packets in our buffer to be able to continue with the probing phase. The buffering phase is illustrated in Fig. 3.

Fig. 3. iPathmeter2: The buffering phase is shown on the left and the probing phase on the right. Si are the spacings between the ACK pairs at the measuring node and Ri are the spacings between the corresponding packet-pairs as received at the measuring node. S3 is less than the service time on the bottleneck link of the path; hence R3 > S3.

The Probing Phase: In this phase we do cumulative ACKing for the packets. In the Buffering Phase, the sender's cwnd increases to 3*MSS or more. In this phase we ensure that the cwnd does not decrease in size. Since the amount of data received is always the minimum of cwnd and rwin, we use rwin to control the data being received. At the start of this phase the last packet of the packet train accumulated in the previous phase is acknowledged and rwin = 1500 is advertised. This opens up the cwnd at the sender side and it sends us a packet of size rwin = 1500.

We use duplicate ACKs (DUPACKs) to generate an ACK-pair which in turn will generate a corresponding packet-pair from the remote host. Here we exploit the property that TCP enters into a Fast Retransmit state only on the reception of 3 DUPACKs [14]. By using only one DUPACK, we ensure that the remote host does not enter into a Fast Retransmit state. This eliminates the possibility of a decrease in cwnd. The DUPACK sent contains rwin increased to twice its current value, causing the remote host to send a new packet. The two packets classify as a 'packet pair'. From our assumptions, these packet pairs are injected into the network at the same rate at which their respective ACKs are sent. The above two data packets (forming the packet pair) are received by the measuring host and timestamped. The last of these two packets is used for generating the next ACK pair, and correspondingly the next packet pair from the remote host. The time difference between the original ACK and the duplicate ACK in the new iteration is decreased and thus the rate of probing is increased. This is illustrated in Fig. 3.

As we decrease the spacing between the packet-pairs, it can happen that this spacing corresponds to a rate of probing greater than the available bandwidth on the path. In this case, the second packet in the packet-pair gets queued up behind the first packet on the bottleneck link and the spacing between them is increased at the output. Thus the spacing between this packet pair as observed at the local host will be more than what it was between the corresponding ACK pair.

The experiment is repeated a number of times. After collecting a sufficient number of samples, the TCP session with the remote host is closed. The data collected is fed to the inference engine described below.

Estimator: As can be seen from above, there is no correlation between the packet pairs corresponding to the different ACK pairs, and packets from different packet pairs do not queue up at the bottleneck link together. Hence we cannot use the inference engine of SIC based bandwidth estimators.

The inference engine of iPathmeter2 estimates both the capacity and the available bandwidth of the path. We consider capacity first. Assume that the remote host initiates $N$ chirps of $J$ packet-pairs and that each packet has $L$ bits. At the measuring node, iPathmeter2 obtains the packet-pair spacings for all the packet-pairs in the chirp. Recall that each packet-pair in a chirp corresponds to a different probing rate. Let $d_{j,n}$ denote the spacing at the measuring host for the $j$-th packet-pair in the $n$-th chirp. Define $D_j = \min_{n=1,2,\dots,N} d_{j,n}$, i.e., the minimum spacing at the local-end of the reverse path for the $j$-th packet pair in a chirp. It is easy to see that if for packet-pair $j$ the spacing is less than the transmission time on the bottleneck link, $D_j$ will approach the packet transmission time on the bottleneck link, i.e., $D_j$ approaches a constant for increasing $j$ (the packet-pair spacing is decreasing with increasing $j$). Denote this constant by $D$. The bottleneck capacity is then estimated as $C = L/D$.

The available bandwidth is estimated by considering the fraction of packets that are queued at the bottleneck link. For this we use the estimate of the capacity from above. The local host sends ACK pairs to the remote host with a spacing of $2L/C$ (corresponding to a probing rate of $C/2$), which in turn will generate the packet pairs from the remote host with the above spacing. In each pair, if the second packet is queued at the bottleneck link, the packet spacing will be more than the $2L/C$ that we started with. Thus we can estimate the number of packets that are queued. The utilization of the bottleneck link, $U$, is estimated as

$$U = \frac{N_{\text{packets queued}}}{N_{\text{packets queued}} + N_{\text{packets not queued}}}$$

We will call this the utilization based estimate (UBE) of the available bandwidth. An estimator similar to that of Spruce [11] is also used in iPathmeter2. Here the ACKs from the measuring node are spaced at $L/C$, corresponding to a probing rate of $C$, and the spacings between the received packets are used in Eqn. 2 of [11]. This estimator is denoted by ESS.

VI. EXPERIMENTAL RESULTS

We first show the results from experiments under controlled conditions to validate iPathmeter2. The testbed setup was as shown in Fig. 4. A cross traffic of UDP packets generated as a Poisson process of a specified rate is introduced on the path being measured. A sample trace from this experiment is shown in Fig. 5 with timestamps obtained on the respective machines using tcpdump.

Fig. 4. Testbed setup (local host, remote host and cross traffic generator). All the WAN links are of 2 Mbps.

All the links are of capacity 2 Mbps. The utilization of all the links is low, i.e., there is very little cross traffic on this network path. Table I shows the estimates of the capacity and the available bandwidth (using both the UBE and ESS). The available bandwidth estimates from Pathchirp are also shown. Pathchirp estimates are provided by taking an average of all the per chirp estimates from Pathchirp [6].

TABLE I
BANDWIDTH ESTIMATES UNDER CONTROLLED CONDITIONS
[Table entries are not recoverable from the extracted text.]

Fig. 5. Timing diagram of data flow between the local host (202.141.154.131) and the remote host (202.41.97.50). The two clocks are not synchronized. At the local host, times are referenced to the transmission of the first SYN packet while at the remote host the times are referenced to the reception of this SYN packet. The times shown are in minutes.

We now report some results from measurements on the Internet. iPathmeter2 was used to estimate the bandwidth which the IIT-Bombay network receives from web servers. Several web servers were probed and estimates for capacity and available bandwidth are as provided in Table II. A comparison is not possible as Pathchirp and other tools discussed above all work in cooperative setups.

TABLE II
Web Server       Capacity (Mbps)   Available bandwidth (Mbps)
Yahoo            1.98              1.53
Rediff           1.98              1.18
Nokia            2.015             1.23
Shockwave.com    1.99              0.62
HPSR.com         2.01              1.03
Stanford.edu     1.96              1.16
Infosys.com      0.04              0.016

VII. DISCUSSION AND FUTURE WORK

There were some observations of unexpected behavior from our testing of iPathmeter2 on the Internet. Not all hosts in the Internet behaved as expected with respect to ACK spacing and varying rwin. When rwin = 1500 is advertised, some of the servers respond with MTU sized packets while others send two back-to-back packets of size 750 bytes each. iPathmeter2 responds to such behavior adaptively. More surprisingly, a few of the web servers when probed respond with different size packets for each ACK pair.

The methods that we develop must work in the real world with a variety of possible non standard TCP implementations. iPathmeter2 has been designed with the same objective. With extensive testing we expect to be able to discover more unexpected behaviors and adapt to them.

Finally, we remark that iPathmeter2 is a network friendly tool, and does not significantly alter the network load.

ACKNOWLEDGMENTS

The authors thank K. Anil Kumar for many perceptive remarks and pointers to literature.

REFERENCES

[1] R. Carter and M. Crovella, "Server selection using dynamic path characterization in wide-area networks," in Proc. of IEEE INFOCOM, Kobe, Japan, Apr. 1997, pp. 1014-1021.
[2] V. Jacobson. (1997, Apr.) Pathchar: A tool to infer characteristics of internet paths. [Online]. Available: ftp://ftp.ee.lbl.gov/pathchar/
[3] A. B. Downey, "Using pathchar to estimate internet link characteristics," in Proc. of ACM SIGCOMM, Sept. 1999, pp. 222-223.
[4] C. Dovrolis, P. Ramanathan, and D. Moore, "What do packet dispersion techniques measure?" in Proc. of IEEE INFOCOM, Apr. 2001, pp. 905-914.
[5] B. Melander, M. Bjorkman, and P. Gunningberg, "A new end-to-end probing and analysis method for estimating bandwidth bottlenecks," in Proc. of IEEE GLOBECOM, San Francisco, CA, USA, Nov. 2000.
[6] V. Ribeiro, R. Riedi, R. Baraniuk, J. Navratil, and L. Cottrell, "pathChirp: Efficient available bandwidth estimation for network paths," in Proc. of Passive and Active Measurements (PAM) Workshop, Apr. 2003.
[7] (Dec.) Caida. [Online]. Available: http://www.caida.org/tools/taxonomy
[8] N. Hu and P. Steenkiste, "Evaluation and characterization of available bandwidth probing techniques," IEEE Journal on Selected Areas in Communication, vol. 21, no. 6, pp. 879-894, Aug. 2003.
[9] M. Jain and C. Dovrolis, "End-to-end available bandwidth: Measurement methodology, dynamics, and relation with TCP throughput," in Proc. of ACM SIGCOMM, Aug. 2002, pp. 295-308.
[10] B. A. Mah. (1999, Feb.) pchar: a tool for measuring internet path characteristics. [Online]. Available: http://www.employees.org/~bmah/Software/pchar/
[11] J. Strauss, D. Katabi, and F. Kaashoek, "A measurement study of available bandwidth estimation tools," in Proc. of ACM SIGCOMM, 2003, pp. 39-44.
[12] J. Nagle, "Congestion control in IP/TCP internetworks," RFC 896, Jan. 1984.
[13] S. Floyd, "HighSpeed TCP for large congestion windows," RFC 3649, Dec. 2003.
[14] W. Stevens, "TCP slow start, congestion avoidance, fast retransmit, and fast recovery algorithms," RFC 2001, Jan. 1997.
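The capacity and utilization based estimators of Section V can be sketched in Python as follows. This is a minimal illustration and not the iPathmeter2 implementation. Several choices are assumptions that the text does not specify: the constant $D$ is taken as the mean of the last few $D_j$, a pair is counted as queued when its received spacing exceeds the sent spacing $2L/C$ by more than a 5% tolerance, and the available bandwidth is derived as $C(1 - U)$. All function names and sample numbers are hypothetical.

```python
from statistics import mean


def min_spacing_per_pair(spacings):
    """spacings[n][j] holds d_{j,n}; returns the list [D_1, ..., D_J],
    where D_j is the minimum over all chirps n."""
    return [min(column) for column in zip(*spacings)]


def estimate_capacity(packet_bits, spacings, tail=2):
    """C = L / D, with D approximated by the mean of the last `tail`
    values of D_j (assumed rule for picking the constant D)."""
    d_min = min_spacing_per_pair(spacings)
    return packet_bits / mean(d_min[-tail:])


def estimate_utilization(received_spacings, packet_bits, capacity_bps,
                         tolerance=0.05):
    """U = queued / (queued + not queued) for pairs sent 2L/C apart.
    The tolerance is an assumed threshold for timing noise."""
    sent_spacing = 2 * packet_bits / capacity_bps
    queued = sum(1 for r in received_spacings
                 if r > sent_spacing * (1 + tolerance))
    return queued / len(received_spacings)


def available_bandwidth(capacity_bps, utilization):
    """Assumed mapping from utilization to available bandwidth."""
    return capacity_bps * (1 - utilization)


if __name__ == "__main__":
    L = 1500 * 8  # packet size in bits
    # Two hypothetical chirps of four packet-pairs each (seconds).
    chirps = [[0.0120, 0.0080, 0.0061, 0.0060],
              [0.0121, 0.0081, 0.0060, 0.0062]]
    C = estimate_capacity(L, chirps)
    U = estimate_utilization([0.012, 0.0121, 0.015, 0.012, 0.018], L, C)
    print(f"C = {C / 1e6:.2f} Mbps, U = {U:.2f}, "
          f"A = {available_bandwidth(C, U) / 1e6:.2f} Mbps")
```

Taking the minimum over chirps filters out samples inflated by cross traffic, which is why $D_j$ rather than the average spacing is used for the capacity estimate.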