End-to-End Internet Packet Dynamics

Vern Paxson

Network Research Group, Lawrence Berkeley National Laboratory, University of California, Berkeley

vern@ee.lbl.gov

Abstract

We discuss findings from a large-scale study of Internet packet dynamics conducted by tracing 20,000 TCP bulk transfers between 35 Internet sites. Because we traced each 100 Kbyte transfer at both the sender and the receiver, the measurements allow us to distinguish between the end-to-end behaviors due to the different directions of the Internet paths, which often exhibit asymmetries. We characterize the prevalence of unusual network events such as out-of-order delivery and packet corruption; discuss a robust receiver-based algorithm for estimating "bottleneck bandwidth" that addresses deficiencies discovered in techniques based on "packet pair"; investigate patterns of packet loss, finding that loss events are not well-modeled as independent and, furthermore, that the distribution of the duration of loss events exhibits infinite variance; and analyze variations in packet transit delays as indicators of congestion periods, finding that congestion periods also span a wide range of time scales.

1 Introduction

As the Internet grows larger, measuring and characterizing its dynamics grows harder. Part of the problem is how quickly the network changes. Another part, though, is its increasing heterogeneity. It is more and more difficult to measure a plausibly representative cross-section of its behavior. The few studies to date of end-to-end packet dynamics have all been confined to measuring a handful of Internet paths, because of the great logistical difficulties presented by larger-scale measurement [Mo92, Bo93, CPB93, Mu94]. Consequently, it is hard to gauge how representative their findings are for today's Internet. Recently, we devised a measurement framework in which a number of sites run special measurement daemons ("NPDs") to facilitate measurement. With this framework, the number of Internet paths available for measurement grows as N^2 for N sites, yielding an attractive scaling. We previously used the framework with N = 37 sites to study end-to-end routing dynamics of about 1,000 Internet paths [Pa96].

In this study we report on a large-scale experiment to study end-to-end Internet packet dynamics. Our analysis is based on measurements of TCP bulk transfers conducted between 35 NPD sites. Using TCP—rather than fixed-rate UDP or ICMP "echo" packets as done in [Bo93, CPB93, Mu94]—reaps significant benefits. First, TCP traffic is "real world," since TCP is widely used in today's Internet. Consequently, any network path properties we can derive from measurements of a TCP transfer can potentially be directly applied to tuning TCP performance. Second, TCP packet streams allow fine-scale probing without unduly loading the network, since TCP adapts its transmission rate to current congestion levels.

Using TCP, however, also incurs two serious analysis headaches. First, we need to distinguish between the apparent effects of the transport protocol and those of the network. To do so, we developed tcpanaly, a program that understands the specifics of the different TCP implementations in our study and thus can separate TCP behavior from network behavior [Pa97a]. tcpanaly also forms the basis for the analysis in this paper: after removing TCP effects, it then computes a wide range of statistics concerning network dynamics.

Second, TCP packets are sent over a wide range of time scales, from milliseconds to many seconds between consecutive packets. Such irregular spacing greatly complicates correlational and frequency-domain analysis, because a stream of TCP packets does not give us a traditional time series of constant-rate observations to work with. Consequently, in this paper we do not attempt these sorts of analyses, though we hope to pursue them in future work. See also [Mu94] for previous work in applying frequency-domain analysis to Internet paths.

In §3 we characterize unusual network behavior: out-of-order delivery, replication, and packet corruption. Then in §4 we discuss a robust algorithm for estimating the "bottleneck" bandwidth that limits a connection's maximum rate. This estimation is crucial for subsequent analysis because knowing the bottleneck rate lets us determine when the closely-spaced TCP data packets used for our network probes are correlated with each other. (We note that the stream of ack packets returned by the TCP data receiver in general is not correlated, due to the small size and larger spacing of the acks.) Once we can determine which probes were correlated and which not, we then can turn to analysis of end-to-end Internet packet loss (§5) and delay (§6). In §7 we briefly summarize our findings, a number of which challenge commonly-held assumptions about network behavior.

2 The Measurements

We gathered our measurements using the "NPD" measurement framework we developed and discussed in [Pa96]. 35 sites participated in two experimental runs. The sites include educational institutes, research labs, network service providers, and commercial companies, in 9 countries. We conducted the first run, N1, during Dec. 1994, and the second, N2, during Nov–Dec. 1995. Thus, differences between N1 and N2 give an indication how Internet packet dynamics changed during the course of 1995. Throughout this paper, when discussing such differences, we always limit discussion to the 21 sites that participated in both N1 and N2.

Each measurement was made by instructing daemons running at two of the sites to send or receive a 100 Kbyte TCP bulk transfer, and to trace the results using tcpdump [JLM89]. Measurements occurred at Poisson intervals, which, in principle, results in unbiased measurement, even if the sampling rate varies [Pa96]. In N1 the mean per-site sampling interval was about 2 hours, with each site randomly paired with another. Sites typically participated in about 200 measurements, and we gathered a total of 2,800 pairs of traces. In N2 we sampled pairs of sites in a series of grouped measurements, varying the sampling rate from minutes to days, with most rates on the order of 4–30 minutes. These groups then give us observations of the path between the site pair over a wide range of time scales. Sites typically participated in about 1,200 measurements, for a total of 18,000 trace pairs. In addition to the different sampling rates, the other difference between N1 and N2 is that in N2 we used Unix socket options to assure that the sending and receiving TCPs had big "windows," to prevent window limitations from throttling the transfer's throughput.

We limited measurements to a total of 10 minutes. This limit leads to under-representation of those times during which network conditions were poor enough to make it difficult to complete a 100 Kbyte transfer in that much time. Thus, our measurements are biased towards more favorable network conditions. In [Pa97b] we show that the bias is negligible for North American sites, but noticeable for European sites.

3 Network Pathologies

We begin with an analysis of network behavior we might consider "pathological," meaning unusual or unexpected: out-of-order delivery, packet replication, and packet corruption. It is important to recognize pathological behaviors so subsequent analysis of packet loss and delay is not skewed by their presence. For example, it is very difficult to perform any sort of sound queueing delay analysis in the presence of out-of-order delivery, since the latter indicates that a first-in-first-out (FIFO) queueing model of the network does not apply.

3.1 Out-of-order delivery

Even though Internet routers employ FIFO queueing, any time a route changes, if the new route offers a lower delay than the old one, then reordering can occur [Mo92]. Since we recorded packets at both ends of each TCP connection, we can detect network reordering, as follows. First, we remove from our analysis any trace pairs suffering packet filter errors [Pa97a]. Then, for each arriving packet p_i, we check whether it was sent after the last non-reordered packet. If so, then it becomes the new such packet. Otherwise, we count its arrival as an instance of a network reordering. So, for example, if a flight of ten packets all arrive in the order sent except the last one arrives before all of the others, we consider this to reflect 9 reordered packets rather than 1. Using this definition emphasizes "late" arrivals rather than "premature" arrivals. It turns out that counting late arrivals gives somewhat higher (≈ +25%) numbers than counting premature arrivals—not a big difference, though.
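
To make the counting rule concrete, here is a minimal Python sketch (the function name and trace encoding are ours, for illustration; tcpanaly's actual bookkeeping differs):

```python
def count_reorderings(arrivals):
    """Count out-of-order deliveries using the 'late arrival' rule
    described above.  `arrivals` lists each packet's send-order index
    (0, 1, 2, ...) in the order the packets reached the receiver."""
    last_unreordered = -1   # send index of the last non-reordered packet
    reordered = 0
    for sent_index in arrivals:
        if sent_index > last_unreordered:
            last_unreordered = sent_index   # it becomes the new such packet
        else:
            reordered += 1                  # a "late" arrival
    return reordered

# A flight of ten packets in which the last one sent arrives first:
# the rule counts 9 reordered packets, not 1.
print(count_reorderings([9, 0, 1, 2, 3, 4, 5, 6, 7, 8]))   # -> 9
```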

Observations of reordering. Out-of-order delivery is fairly prevalent in the Internet. In N1, 36% of the traces included at least one packet (data or ack) delivered out of order, while in N2, 12% did. Overall, 2.0% of all of the N1 data packets and 0.6% of the acks arrived out of order (0.3% and 0.1% in N2). Data packets are no doubt more often reordered than acks because they are frequently sent closer together (due to ack-every-other policies), so their reordering requires less of a difference in transit times.

We should not infer from the differences between reordering in N1 and N2 that reordering became less likely over the course of 1995, because out-of-order delivery varies greatly from site to site. For example, fully 15% of the data packets sent by the "ucol" site during N1 arrived out of order, much higher than the 2.0% overall average.

As discussed in [Pa96], we do not claim that the individual sites participating in the measurement framework are plausibly representative of Internet sites in general, so site-specific behavior cannot be argued to reflect general Internet behavior.

Reordering is also highly asymmetric. For example, only 1.5% of the data packets sent to ucol during N1 arrived out of order. This means a sender cannot soundly infer whether the packets it sends are likely to be reordered, based on observations of the acks it receives, which is too bad, as otherwise the reordering information would aid in determining the optimal duplicate ack threshold to use for fast retransmission (see below).

The site-to-site variation in reordering coincides with our earlier findings concerning route flutter among the same sites [Pa96]. We identified two sites as particularly exhibiting flutter, ucol and the "wustl" site. For the part of N1 during which wustl exhibited route flutter, 24% of all of the data packets it sent arrived out of order, a rather stunning degree of reordering. If we eliminate ucol and wustl from the analysis, then the proportion of all of the N1 data packets delivered out-of-order falls by a factor of two. We also note that in N2, packets sent by ucol were reordered only 25 times out of nearly 100,000 sent, though 3.3% of the data packets sent to ucol arrived out of order, dramatizing how over long time scales, site-specific effects can completely change.

Thus, we should not interpret the prevalence of out-of-order delivery summarized above as giving representative numbers for the Internet, but instead form the rule of thumb: Internet paths are sometimes subject to a high incidence of reordering, but the effect is strongly site-dependent, and apparently correlated with route fluttering, which makes sense since route fluttering provides a mechanism for frequently reordering packets.

We observed reordering rates as high as 36% of all packets arriving in a single connection. Interestingly, some of the most highly reordered connections did not suffer any packet loss, and no needless retransmissions due to false signals from duplicate acks. We also occasionally observed humongous reordering "gaps." However, the evidence suggests that these gaps are not due to route changes, but a different effect. Figure 1 shows a sequence plot exhibiting a massive reordering event. This plot reflects packet arrivals at the TCP receiver, where each square marks the upper sequence number of an arriving data packet. All packets were sent in increasing sequence order.

Fitting a line to the upper points yields a data rate of a little over 170 Kbyte/sec, which was indeed the true (T1) bottleneck rate (§4). The slope of the packets delivered late, though, is just under 1 Mbyte/sec, consistent with an Ethernet bottleneck. What has apparently happened is that a router with Ethernet-limited connectivity to the receiver stopped forwarding packets for 110 msec just as sequence 72,705 arrived, most likely because at that point it processed a routing update [FJ94]. It finished between the arrival of 91,137 and 91,649, and began forwarding packets normally again at their arrival rate, namely T1 speed. Meanwhile, it had queued 35 packets while processing the update, and these it now finally forwarded whenever it had a chance, so they went out as quickly as possible, namely at Ethernet speed, but interspersed with new arrivals.

We observed this pattern a number of times in our data—not frequent enough to conclude that it is anything but a pathology, but often enough to suggest that significant momentary increases in networking delay can be due to effects different from both route changes and queueing; most likely due to router forwarding lulls.

Impact of reordering. While out-of-order delivery can violate one's assumptions about the network—in particular, the abstraction that it is well-modeled as a series of FIFO queueing servers—we find it often has little impact on TCP performance. One way it can make a difference, however, is in determining the TCP "duplicate ack" threshold a sender uses to infer that a packet requires retransmission. If the network never exhibited reordering, then as soon as the receiver observed a packet arriving that created a sequence "hole," it would know that the expected in-sequence packet was dropped, and could signal to the sender calling for prompt retransmission. Because of reordering, however, the receiver does not know whether the packet in fact was dropped; it may instead just be late. Presently, TCP senders retransmit if N_d = 3 "dups" arrive, a value chosen so that "false" dups caused by out-of-order delivery are unlikely to lead to spurious retransmissions.

The value of N_d = 3 was chosen primarily to assure that the threshold was conservative. Large-scale measurement studies were not available to further guide the selection of the threshold. We now examine two possible ways to improve the fast retransmit mechanism: by delaying the generation of dups to better disambiguate packet loss from reordering, and by altering the threshold to improve the balance between seizing retransmission opportunities, versus avoiding unneeded retransmissions.

We first look at packet reordering time scales to determine how long a receiver needs to wait to disambiguate reordering from loss. We only look at the time scales of data packet reorderings, since ack reorderings do not affect the fast retransmission process. We find a wide range of times between an out-of-order arrival and the later arrival of the last packet sent before it. One noteworthy artifact in the distribution is the presence of "spikes" at particular values, the strongest at 81 msec. This turns out to be due to a 56 Kbit/sec link, which has a bottleneck bandwidth of about 6,320 user data bytes/sec. Consequently, transmitting a 512 byte packet across the link requires 81.0 msec, so data packets of this size can arrive no closer, even if reordered. Thus we see that reordering can have associated with it a minimum time, which can be quite large.

Inspecting the N1 distributions further, we find that a strategy of waiting 20 msec would identify 70% of the out-of-order deliveries. For N2 the same proportion can be achieved waiting 8 msec, due to its overall shorter reordering times (presumably due to overall higher bandwidths). Thus, even though the upper end of the distribution is very large (12 seconds!), a generally modest wait serves to disambiguate most sequence holes.

We now look at the degree to which false fast retransmit signals due to reordering are actually a problem. We classify each sequence of dups as either good or bad, depending on whether a retransmission in response to it was necessary or unnecessary. When considering a refinement to the fast retransmission mechanism, our interest lies in the resulting ratio of good to bad, R_g:b, controlled by both the dup ack threshold value N_d we consider, and the waiting time W observed by the receiver before generating a dup upon the advent of a sequence hole.
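
As an illustration of how such a classification can be computed, the following simplified sketch tallies good versus bad fast-retransmit signals for given N_d and W over a hypothetical encoding of sequence-hole events (the encoding, names, and the error model for W are ours, not tcpanaly's):

```python
def tally_dup_series(hole_events, Nd=3, W=0.0):
    """Tally good vs. bad fast-retransmit signals for dup-ack
    threshold Nd and receiver waiting time W (seconds).

    Each event is (dup_times, late_arrival): `dup_times` are arrival
    times of above-hole packets that would each elicit a dup ack, and
    `late_arrival` is when the missing packet finally arrived, or
    None if it was genuinely lost."""
    good = bad = 0
    for dup_times, late_arrival in hole_events:
        t0 = dup_times[0]   # the sequence hole is first observed here
        # Simplification: the receiver generates dups only after
        # waiting W, and only while the hole is still open.
        dups = [t for t in dup_times
                if t >= t0 + W and (late_arrival is None or t < late_arrival)]
        if len(dups) >= Nd:             # fast retransmit would trigger
            if late_arrival is None:
                good += 1               # the packet really was lost
            else:
                bad += 1                # the packet was merely reordered
    return good, bad

# One genuine loss and one reordering that resolves after 15 msec:
events = [([0.0, 0.01, 0.02, 0.03, 0.04, 0.05], None),
          ([0.0, 0.005, 0.010, 0.012], 0.015)]
print(tally_dup_series(events, Nd=3, W=0.0))    # -> (1, 1)
print(tally_dup_series(events, Nd=3, W=0.02))   # -> (1, 0): waiting
                                                #    filters the bad signal
```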

For current TCP, N_d = 3 dups and W = 0. For these values, we find in N1, R_g:b = 22, and in N2, R_g:b = 300. The order of magnitude improvement between N1 and N2 is due to the use in N2 of bigger windows (§2), and hence more opportunity for generating good dups. Clearly, the current scheme works well. While N_d = 4 improves R_g:b by about a factor of 2.5, it also diminishes fast retransmit opportunities by about 30%, a significant loss.

For N_d = 2, we gain about 65–70% more fast retransmit opportunities, a hefty improvement, each generally saving a connection from an expensive timeout retransmission. The cost, however, is that R_g:b falls by about a factor of three. If the receiving TCP waited W = 20 msec before generating a second dup, then R_g:b falls only slightly (30% for N1, not at all for N2). Unfortunately, lowering N_d to 2 coupled with the W = 20 msec delay requires both sender and receiver modifications, greatly increasing the problem of deploying the change. Since partial deployment of only the sender change (N_d = 2) significantly increases spurious retransmissions, we conclude that, due to the size of the Internet's installed base, safely lowering N_d is impractical.

We note that the TCP selective acknowledgement ("SACK") option, now pending standardization, also holds promise for honing TCP retransmission [MMSR96]. SACK provides sufficiently fine-grained acknowledgement information that the sending TCP can generally tell which packets require retransmission and which have safely arrived (§5.4). To gain any benefits from SACK, however, requires that both the sender and the receiver support the option, so the deployment problems are similar to those discussed above. Furthermore, use of SACK aids a TCP in determining what to retransmit, but not when to retransmit. Because these considerations are orthogonal, investigating the effects of lowering N_d to 2 merits investigation, even in face of impending deployment of SACK.

We observed one other form of dup ack series potentially leading to unnecessary retransmission. Sometimes a series occurs for which the original ack (of which the others are dups) had acknowledged all of the outstanding data. When this occurs, the subsequent dups are always due to an unnecessary retransmission arriving at the receiving TCP, until at least a round-trip time (RTT) after the sending TCP sends new data. For N_d = 3 these sorts of series are 2-15 times more frequent than bad series, which is why they merit discussion. They are about 10 times rarer than good series. They occur during retransmission periods when the sender has already filled all of the sequence holes and is now retransmitting unnecessarily. Use of SACK eliminates these series. So would the following heuristic: whenever a TCP receives an ack, it notes whether the ack covers all of the data sent so far. If so, it then ignores any duplicates it receives for the ack, otherwise it acts on them in accordance with the usual fast retransmission mechanism.
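
A minimal sender-side sketch of that heuristic might look as follows (the class and field names are illustrative, not taken from any TCP implementation):

```python
class DupAckFilter:
    """Sketch of the heuristic above: when an ack arrives, note
    whether it covers all data sent so far; if it does, ignore its
    duplicates rather than counting them toward fast retransmit."""

    def __init__(self, Nd=3):
        self.Nd = Nd
        self.last_ack = -1
        self.dup_count = 0
        self.covered_all = False

    def should_fast_retransmit(self, ack, highest_seq_sent):
        if ack > self.last_ack:          # a new, advancing ack
            self.last_ack = ack
            self.dup_count = 0
            self.covered_all = (ack >= highest_seq_sent)
            return False
        if self.covered_all:             # dup of an ack that covered
            return False                 # everything outstanding: ignore
        self.dup_count += 1              # ordinary dup: usual mechanism
        return self.dup_count >= self.Nd
```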

3.2 Packet replication

In this section we look at packet replication: the network delivering multiple copies of the same packet. Unlike reordering, it is difficult to see how replication can occur. Our imaginations notwithstanding, it does happen, albeit rarely. We suspect one mechanism may involve links whose link-level technology includes a notion of retransmission, and for which the sender of a packet on the link incorrectly believes the packet was not successfully received, so it sends the packet again.

In N1, we observed only one instance of packet replication, in which a pair of acks, sent once, arrived 9 times, each copy coming 32 msec after the last. The fact that two packets were together replicated does not fit with the explanation offered above for how a single packet could be replicated, since link-layer effects should only replicate one packet at a time. In N2, we observed 65 instances of the network infrastructure replicating a packet, all of a single packet, the most striking being 23 copies of a data packet arriving in a short blur at the receiver. Several sites dominated the N2 replication events: in particular, the two Trondheim sites, "sintef1" and "sintef2", accounted for half of the events (almost all of these involving sintef1), and the two British sites, "ucl" and "ukc", for half the remainder. After eliminating these, we still observed replication events among connections between 7 different sites, so the effect is somewhat widespread.

Surprisingly, packets can also be replicated at the sender, before the network has had much of a chance to perturb them. We know these are true replications and not packet filter duplications, as discussed in [Pa97a], because the copies have had their TTL fields decremented. There were no sender-replicated packets in N1 but 17 instances in N2, involving two sites (so the phenomenon is clearly site-specific).

3.3 Packet corruption

The final pathology we look at is packet corruption, in which the network delivers to the receiver an imperfect copy of the original packet. For data packets, tcpanaly cannot directly verify the checksum because the packet filter used in our study only recorded the packet headers, and not the payload. (For "pure acks," i.e., acknowledgement packets with no data payload, it directly verifies the checksum.) Consequently, tcpanaly includes algorithms to infer whether data packets arrive with invalid checksums, discussed in [Pa97a]. Using that analysis, we first found that one site, "lbli," was much more prone to checksum errors than any other. Since lbli's Internet link is via an ISDN link, it appears quite likely that these are due to noise on the ISDN channels.

After eliminating lbli, the proportion of corrupted packets is about 0.02% in both datasets. No other single site strongly dominated in suffering from corrupted packets, and in N2, most of the sites receiving corrupted packets had fast (T1 or greater) Internet connectivity, so the corruptions are not primarily due to noisy, slow links. Thus, this evidence suggests that, as a rule of thumb, the proportion of Internet data packets corrupted in transit is around 1 in 5,000 (but see below).

A corruption rate of 1 packet in 5,000 is certainly not negligible, because TCP protects its data with a 16-bit checksum. Consequently, on average one bad packet out of 65,536 will be erroneously accepted by the receiving TCP, resulting in undetected data corruption. If the 1 in 5,000 rate is indeed correct, then about one in every 300 million Internet packets is accepted with corruption—certainly, many each day. In this case, we argue that TCP's 16-bit checksum is no longer adequate, if the goal is that globally in the Internet there are very few corrupted packets accepted by TCP implementations. If the checksum were instead 32 bits, then only about one in 2×10^13 packets would be accepted with corruption.
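
The arithmetic behind these figures is easy to check, assuming a corrupted packet slips past an n-bit checksum with probability 2^-n:

```python
corruption_rate = 1 / 5000      # inferred rate of corrupted data packets

for bits in (16, 32):
    undetected = corruption_rate / 2**bits   # accepted despite corruption
    print(f"{bits}-bit checksum: ~1 in {1 / undetected:,.0f} packets")

# 16-bit checksum: ~1 in 327,680,000 packets        (one in ~300 million)
# 32-bit checksum: ~1 in 21,474,836,480,000 packets (about 2 x 10^13)
```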

Finally, we note that the data checksum error rate of 0.02% of the packets is much higher than that measured directly (by verifying the checksum) for pure acks. For pure acks, we found only 1 corruption out of 300,000 acks in N1, and, after eliminating lbli, 1 out of 1.6 million acks in N2. This discrepancy can be partially addressed by accounting for the different lengths of data packets versus pure acks. It can be further reconciled if "header compression" such as CSLIP is used along the Internet paths in our study [Ja90], as that would greatly increase the relative size of data packets to that of pure acks. But it seems unlikely that header compression is widely used for high-speed links, and most of the inferred N2 data packet corruptions occurred for T1 and faster network paths.

One possibility is that the packets tcpanaly infers as arriving corrupted—because the receiving TCP did not respond to them in any fashion—actually were never received by the TCP for a different reason, such as inadequate buffer space. We partially tested for this possibility by computing corruption rates for only those traces monitored by a packet filter running on a machine separate from the receiver (but on the same local network), versus those running on the receiver's machine. The former resulted in slightly higher inferred corruption rates, but not significantly so, so if the TCP is failing to receive the packets in question, it must be due to a mechanism that still enables the packet filter on the receiving machine to receive a copy. One can imagine such mechanisms, but it seems unlikely they would lead to drop rates of 1 in 5,000.

Another possibility is that data packets are indeed much more likely to be corrupted than the small pure ack packets, because of some artifact in how the corruption occurs. For example, it may be that corruption primarily occurs inside routers, where it goes undetected by any link-layer checksum, and that the mechanism (e.g., botched DMA, cache inconsistencies) only manifests itself for packets larger than a particular size.

Finally, we note that bit errors in packets transmitted using CSLIP can result in surprising artifacts when the CSLIP receiver reconstructs the packet header—such as introducing the appearance of in-sequence data, when none was actually sent!

In summary, we cannot offer a definitive answer as to overall Internet packet corruption rates: but the conflicting evidence that corruption may occur fairly frequently argues for further study in order to resolve the question.

4 Bottleneck Bandwidth

In this section we discuss how to estimate a fundamental property of a network connection, the bottleneck bandwidth that sets the upper limit on how quickly the network can deliver the sender's data to the receiver. The bottleneck comes from the slowest forwarding element in the end-to-end chain that comprises the network path. We make a crucial distinction between bottleneck bandwidth and available bandwidth. The former gives an upper bound on how fast a connection can possibly transmit data, while the less-well-defined latter term denotes how fast the connection should transmit to preserve network stability. Thus, available bandwidth never exceeds bottleneck bandwidth, and can in fact be much smaller (§6.3).

We will denote a path's bottleneck bandwidth as ρ_B. For measurement analysis, ρ_B is a fundamental quantity because it determines what we term the self-interference time constant, Q_b, which measures the amount of time required to forward a given packet through the bottleneck element. If a packet carries a total of b bytes and the bottleneck bandwidth is ρ_B byte/sec, then:

    Q_b = b / ρ_B    (1)

in units of seconds. From a queueing theory perspective, Q_b is simply the service time of a b-byte packet at the bottleneck link. We use the term "self-interference" because if the sender transmits two b-byte packets with an interval ΔT_s < Q_b between them, then the second one is guaranteed to have to wait behind the first one at the bottleneck element (hence the use of "Q" to denote "queueing"). We will always discuss Q_b in terms of user data bytes, i.e., TCP packet payload, and for ease of discussion will assume b is constant. We will not use the term for acks.
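
As a worked example of Eqn 1, the 81 msec reordering spike of §3.1 falls directly out of the formula (the function name is ours, for illustration):

```python
def self_interference_time(b, rho_B):
    """Q_b = b / rho_B: service time, in seconds, of a b-byte packet
    at a bottleneck of rho_B byte/sec (Eqn 1)."""
    return b / rho_B

# A 512-byte packet crossing a 56 Kbit/sec link that carries about
# 6,320 user data bytes/sec yields the 81 msec spike of Section 3.1:
print(self_interference_time(512, 6320))   # -> ~0.081 sec
```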

For our measurement analysis, accurate assessment of Q_b is critical. Suppose we observe a sender transmitting packets p1 and p2 an interval ΔT_s apart. Then if ΔT_s < Q_b, the delays experienced by p1 and p2 are perforce correlated, and if ΔT_s ≥ Q_b their delays, if correlated, are not due to self-interference but some other source (such as additional traffic from other connections, or processing delays). Thus, we need to know Q_b so we can distinguish those measurements that are necessarily correlated from those that are not. If we do not do so, then we will skew our analysis by mixing together measurements with built-in delays (due to queueing at the bottleneck) with measurements that do not reflect built-in delays.

4.1 Packet pair

The bottleneck estimation technique used in previous work is based on "packet pair" [Ke91, Bo93, CC96]. The fundamental idea is that if two packets are transmitted by the sender with an interval ΔT_s < Q_b between them, then when they arrive at the bottleneck they will be spread out in time by the transmission delay of the first packet across the bottleneck: after completing transmission through the bottleneck, their spacing will be exactly Q_b. Barring subsequent delay variations, they will then arrive at the receiver spaced not ΔT_s apart, but ΔT_r = Q_b. We then compute ρ_B via Eqn 1.
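
A bare-bones receiver-based packet-pair estimator is then just Eqn 1 applied to the arrival spacing; the rejection checks below mirror the concerns discussed in this section and in §4.2 (the names and encoding are ours, for illustration):

```python
def packet_pair_estimate(b, send_times, recv_times):
    """Sketch of packet pair: two b-byte packets that left the sender
    closer together than Q_b arrive spaced delta_Tr = Q_b, so
    rho_B = b / delta_Tr by Eqn 1.  Returns None for pairs that must
    be rejected (reordered, or never expanded by the bottleneck)."""
    delta_Ts = send_times[1] - send_times[0]
    delta_Tr = recv_times[1] - recv_times[0]
    if delta_Tr <= 0:           # out-of-order delivery (see Section 4.2)
        return None
    if delta_Tr <= delta_Ts:    # pair was not stretched by the bottleneck
        return None
    return b / delta_Tr

# Two 512-byte packets sent 0.5 msec apart, arriving 3.01 msec apart:
print(packet_pair_estimate(512, [0.0, 0.0005], [0.1000, 0.10301]))
# -> ~170,100 byte/sec, i.e., a T1 bottleneck
```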

The principle of the bottleneck spacing effect was noted in Jacobson's classic congestion paper [Ja88], where it in turn leads to the "self-clocking" mechanism. Keshav formally analyzed the behavior of packet pair for a network of routers that all obey the "fair queueing" scheduling discipline (not presently used in the Internet), and developed a provably stable flow control scheme based on packet pair measurements [Ke91]. Both Jacobson and Keshav were interested in estimating available rather than bottleneck bandwidth, and for this variations from Q_b due to queueing are of primary concern (§6.3). But if, as for us, the goal is to estimate ρ_B, then these variations instead become noise we must deal with.

Bolot used a stream of packets sent at fixed intervals to probe several Internet paths in order to characterize delay and loss [Bo93]. He measured round-trip delay of UDP echo packets and, among other analysis, applied the packet pair technique to form estimates of bottleneck bandwidths. He found good agreement with known link capacities, though a limitation of his study is that the measurements were confined to a small number of Internet paths.

Recent work by Carter and Crovella also investigates the utility of using packet pair in the Internet for estimating ρ_B [CC96]. Their work focusses on bprobe, a tool they devised for estimating ρ_B by transmitting 10 consecutive ICMP echo packets and recording the interarrival times of the consecutive replies. Much of the effort in developing bprobe concerns how to filter the resulting raw measurements in order to form a solid estimate. bprobe currently filters by first widening each estimate into an interval by adding an error term, and then finding the point at which the most intervals overlap. The authors also undertook to calibrate bprobe by testing its performance for a number of Internet paths with known bottlenecks. They found in general it works well, though some paths exhibited sufficient noise to sometimes produce erroneous estimates.

One limitation of both studies is that they were based on measurements made only at the data sender. (This is not an intrinsic limitation of the techniques used in either study). Since in both studies, the packets echoed back from the remote end were the same size as those sent to it, neither analysis was able to distinguish whether the bottleneck along the forward and reverse paths was the same. The bottleneck could differ in the two directions due to asymmetric routing, for example [Pa96], or because some media, such as satellite links, can have significant bandwidth asymmetries depending on the direction traversed [DMT96].

For estimating bottleneck bandwidth by measuring TCP traffic, a second problem arises: if the only measurements available are those at the sender, then "ack compression" (§6.1) can significantly alter the spacing of the small ack packets as they return through the network, distorting the bandwidth estimate. We investigate the degree of this problem below.

For our analysis, we consider what we term receiver-based packet pair (RBPP), in which we look at the pattern of data packet arrivals at the receiver. We also assume that the receiver has full timing information available to it. In particular, we assume that the receiver knows when the packets sent were not stretched out by the network, and can reject these as candidates for RBPP analysis. RBPP is considerably more accurate than sender-based packet pair (SBPP), since it eliminates the additional noise and possible asymmetry of the return path, as well as noise due to delays in generating the acks themselves. We find in practice this additional noise can be quite large.

4.2 Difficulties with packet pair

As shown in [Bo93] and [CC96], packet pair techniques often provide good estimates of bottleneck bandwidth. We find, however, four potential problems (in addition to noise on the return path for SBPP). Three of these problems can often be addressed, but the fourth is more fundamental.

Out-of-order delivery. The first problem stems from the fact that for some Internet paths, out-of-order packet delivery occurs quite frequently (§3.1). Clearly, packet pairs delivered out of order completely destroy the packet pair technique, since they result in ΔT_r < 0, which then leads to a negative estimate for ρ_B. Out-of-order delivery is symptomatic of a more general problem, namely that the two packets in a pair may not take the same route through the network, which then violates the assumption that the second queues behind the first at the bottleneck.

Limitations due to clock resolution. Another problem relates to the receiver's clock resolution, C_r, meaning the minimum difference in time the clock can report. C_r can introduce large margins of error around estimates of ρ_B. For example, if C_r = 10 msec, then for b = 512 bytes, packet pair cannot distinguish between ρ_B = 51,200 byte/sec and ρ_B = ∞. We had several sites in our study with C_r = 10 msec. A technique for coping with large C_r is to use packet bunch, in which k back-to-back packets are used, rather than just two. Thus, the overall arrival interval spanned by the packets will be about k−1 times larger than that spanned by a single packet pair, diminishing the uncertainty due to C_r.
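
To see how bunch size k tames clock resolution, the following illustrative computation bounds the bandwidths consistent with a measured span that is only accurate to within ±C_r (a simplified error model of our own, not from the paper):

```python
def bandwidth_bounds(b, k, measured_span, Cr):
    """Bandwidths consistent with a measured arrival span for k
    back-to-back b-byte packets when the clock only resolves Cr
    seconds (simplified: true span within +/- Cr of measurement).
    A bunch spreads over (k-1) * Q_b, so larger k shrinks the
    relative uncertainty."""
    data = (k - 1) * b
    lo = data / (measured_span + Cr)
    hi = float('inf') if measured_span <= Cr else data / (measured_span - Cr)
    return lo, hi

# b = 512 bytes, Cr = 10 msec, true bottleneck 51,200 byte/sec:
print(bandwidth_bounds(512, 2, 0.010, 0.010))   # pair:  (25600.0, inf)
print(bandwidth_bounds(512, 11, 0.100, 0.010))  # bunch: (~46545, ~56889)
```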

Changes in bottleneck bandwidth. Another problem that any bottleneck bandwidth estimation must deal with is the possibility that the bottleneck changes over the course of the connection. Figure 2 shows a sequence plot of data packets arriving at the receiver for a trace in which this happened. The eye immediately picks out a transition from one overall slope to another. The first slope corresponds to 6,600 byte/sec, while the second is 13,300 byte/sec, an increase of a factor of two. Here, the change is due to lbli's ISDN link activating a second channel to double the link bandwidth, but in general bottleneck shifts can occur due to other mechanisms, such as routing changes.

Multi-channel bottleneck links. A more fundamental problem with packet-pair techniques arises from the effects of multi-channel links, for which packet pair can yield incorrect overestimates even in the absence of any delay noise. Figure 3 expands a portion of Figure 2. The slope of the large linear trend in the plot corresponds to 13,300 byte/sec, as earlier noted. However, we see that the line is actually made up of pairs of packets. The slope between the pairs corresponds to a data rate of 160 Kbyte/sec. However, this trace involved lbli, a site with an ISDN link that has a hard limit of 128 Kbit/sec = 16 Kbyte/sec, a factor of ten smaller! Clearly, an estimate of ρ_B = 160 Kbyte/sec must be wrong, yet that is what a packet-pair calculation will yield.

What has happened is that the bottleneck ISDN link uses two channels that operate in parallel. When the link is idle and a packet arrives, it goes out over the first channel, and when another packet arrives shortly after, it goes out over the other channel. They don't queue behind each other! Multi-channel links violate the assumption that there is a single end-to-end forwarding path, with disastrous results for packet-pair, since in their presence it can form completely misleading overestimates for ρ_B.

We stress that the problem is more general than the circumstances shown in this example. First, while in this example the parallelism leading to the estimation error came from a single link with two separate physical channels, the exact same effect could come from a router that balances its outgoing load across two different links. Second, it may be tempting to dismiss this problem as correctable by using packet bunch with k = 3 instead of packet pair. This argument is not compelling without further investigation, however, because packet bunch could be more prone to error for regular bottlenecks; and, more fundamentally, k = 3 only works if the parallelism comes from two channels. If it came from three channels (or load-balancing links), then k = 3 will still yield misleading estimates.

4.3 Robust bottleneck estimation

Motivated by the shortcomings of packet pair, we developed a significantly more robust procedure, "packet bunch modes" (PBM). The main observation behind PBM is that we can deal with packet pair's shortcomings by forming estimates for a range of packet bunch sizes, and by allowing for multiple bottleneck values or apparent bottleneck values. By considering different bunch sizes, we can accommodate limited receiver clock resolutions and the possibility of multiple channels or load-balancing across multiple links, while still avoiding the risk of underestimation due to noise diluting larger bunches, since we also consider small bunch sizes. By allowing for finding multiple bottleneck values, we again accommodate multi-channel (and multi-link) effects, and also the possibility of a bottleneck change.

Allowing for multiple bottleneck values rules out use of the most common robust estimator, the median, since it presupposes unimodality. We instead focus on identifying modes, i.e., local maxima in the density function of the distribution of the estimates. We then observe that:

(i) If we find two strong modes, for which one is found only at the beginning of the connection and one at the end, then we have evidence of a bottleneck change.

(ii) If we find two strong modes which span the same portion of the connection, and if one is found only for a packet bunch size of m and the other only for bunch sizes > m, then we have evidence for an m-channel bottleneck link.

(iii) We can find both situations, for a link that exhibits both a change and a multi-channel link, such as shown in Figure 2.
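
The following schematic sketch conveys the mode-finding idea (cluster raw bandwidth estimates into bins roughly ±20% wide and report local maxima of the density); it is an illustration of ours only, not the actual PBM algorithm of [Pa97b]:

```python
import math
from collections import Counter

def bandwidth_modes(estimates, rel_width=0.20, min_count=3):
    """Cluster raw packet-bunch bandwidth estimates (byte/sec) into
    logarithmic bins about +/- rel_width wide, and report each local
    maximum of the histogram as a candidate bottleneck with nominal
    error bars.  Schematic only; PBM's real filtering is in [Pa97b]."""
    width = 2 * math.log1p(rel_width)           # log-space bin width
    bins = Counter(int(math.log(e) / width) for e in estimates if e > 0)
    modes = []
    for b, n in bins.items():
        if n >= min_count and n >= bins.get(b - 1, 0) and n >= bins.get(b + 1, 0):
            center = math.exp((b + 0.5) * width)
            modes.append((center * (1 - rel_width),    # lower error bar
                          center,
                          center * (1 + rel_width)))   # upper error bar
    return sorted(modes, key=lambda m: m[1])

# Noisy estimates clustered near a T1 bottleneck, plus two Ethernet-
# speed outliers that fail the min_count filter:
ests = [168e3, 171e3, 173e3, 169e3, 175e3, 1.05e6, 0.98e6]
print(bandwidth_modes(ests))   # one mode, bracketing ~170 Kbyte/sec
```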

Turning these observations into a working algorithm entails a great degree of niggling detail, as well as the use of a number of heuristics. Due to space limitations, we defer the particulars to [Pa97b]. We note, though, that one salient aspect of PBM is that it forms its final estimates in terms of error bars that nominally encompass ±20% around the bottleneck estimate, but might be narrower if estimates cluster sharply around a particular value, or wider if limited clock resolution prevents finer bounds. PBM always tries bunch sizes ranging from two packets to five packets. If required by limited clock resolution or the failure to find a compelling bandwidth estimate (about one quarter of all of the traces, usually due to limited clock resolution), it tries progressively larger bunch sizes, up to a maximum of 21 packets. We also note that nothing in PBM is specific to analyzing TCP traffic. All it requires is knowing when packets were sent relative to one another, how they arrived relative to one another, and their size.

We applied PBM to N1 and N2 for those traces for which tcpanaly's packet filter and clock analysis did not uncover any uncorrectable problems [Pa97a, Pa97b]. After removing lbli, which frequently exhibited both bottleneck changes and multi-channel effects, PBM detected a single bottleneck 95–98% of the time; failed to produce an estimate 0–2% of the time (due to excessive noise or reordering); detected a bottleneck change in about 1 connection out of 250; and inferred a multi-channel bottleneck in 1–2% of the connections (though some of these appear spurious). Since all but single bottlenecks are rare, we defer discussion of the others to [Pa97b], and focus here on the usual case of finding a single bottleneck.

Unlike [CC96], we do not know a priori the bottleneck bandwidths for many of the paths in our study. We thus must fall back on self-consistency checks in order to gauge the accuracy of PBM. Figure 4 shows a histogram of the estimates formed for N2. (The N1 estimates are similar, though lower bandwidth estimates are more common.) The 170 Kbyte/sec peak clearly dominates, and corresponds to the speed of a T1 circuit after removing overhead. The 7.5 Kbyte/sec peak corresponds to 64 Kbit/sec links and the 13–14 Kbyte/sec peak reflects 128 Kbit/sec links. The 30 Kbyte/sec peak corresponds to a 256 Kbit/sec link, seen almost exclusively for connections involving a U.K. site. The 1 Mbyte/sec peaks are due to Ethernet bottlenecks, and likely reflect T3-connectivity beyond the limiting Ethernet.

We speculate that the 330 Kbyte/sec peak reflects use of two T1 circuits in parallel, 500 Kbyte/sec reflects three T1 circuits (not half an Ethernet, since there is no easy way to subdivide an Ethernet's bandwidth), and 80 Kbyte/sec comes from use of half of a T1. Similarly, the 100 Kbyte/sec peak most likely is due to splitting a single E1 circuit in half. Indeed, we find non-North American sites predominating these connections, as well as exhibiting peaks at 200–220 Kbyte/sec, above the T1 rate and just a bit below E1. This peak is absent from North American connections.

In summary, we believe we can offer plausible explanations for all of the peaks. Passing this self-consistency test in turn argues that PBM is indeed detecting true bottleneck bandwidths.

We next investigate the stability of bottleneck bandwidth over time. If we consider successive estimates for the same sender/receiver pair, then we find that 50% differ by less than 1.75%; 80%, by less than 10%; and 98% differ by less than a factor of two. Clearly, bottlenecks change infrequently.

The last property of bottleneck bandwidth we investigate is symmetry: how often is the bottleneck from host A to host B the same as that from B to A? Bottleneck asymmetries are an important consideration for sender-based "echo" measurement techniques, since these will observe the minimum bottleneck of the two directions [Bo93, CC96]. We find that for a given pair of hosts, the median estimates in the two directions differ by more than ±20% about 20% of the time. This finding agrees with the observation that Internet paths often exhibit major routing asymmetries [Pa96]. The bottleneck differences can be quite large, with for example some paths T1-limited in one direction but Ethernet-limited in the other. In light of these variations, we see that sender-based bottleneck measurement will sometimes yield quite inaccurate results.

4.4 Efficacy of packet-pair

We finish with a look at how packet pair performs compared to PBM. We confine our analysis to those traces for which PBM found a single bottleneck. If packet pair produces an estimate lying within ±20% of PBM's, then we consider it agreeing with PBM, otherwise not.

We evaluate "receiver-based packet pair" (RBPP, per §4.1) by considering it as PBM limited to packet bunch sizes of 2 packets (or larger, if needed to resolve limited clock resolutions). We find RBPP estimates almost always (97–98%) agree with PBM. Thus, if (1) PBM's general clustering and filtering algorithms are applied to packet pair, (2) we do packet pair estimation at the receiver, (3) the receiver benefits from sender timing information, so it can reliably detect out-of-order delivery and lack of bottleneck "expansion," and (4) we are not concerned with multi-channel effects, then packet pair is a viable and relatively simple means to estimate the bottleneck bandwidth.

我们把RBPP视为数据包群大小限定为2个数据包(若时钟分辨率受限则可更大)的PBM来进行评估(见4.1节)。我们发现RBPP的估计几乎总是(97–98%)与PBM一致。因此,如果(1)把PBM的通用聚类和滤波算法应用于包对,(2)在接收方进行包对估计,(3)接收方能利用发送方的时间信息,从而可靠地检测乱序传输和瓶颈"扩展"的缺失,以及(4)不考虑多通道效应,那么包对就是一种可行且相对简单的瓶颈带宽估计方法。
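
The following is a minimal sketch of receiver-based packet pair, not tcpanaly's implementation: a simple back-to-back check on sender timestamps stands in for the out-of-order and "expansion" filtering, and a median stands in for PBM's full clustering. The trace values and the min_send_gap threshold are hypothetical.

```python
# Minimal receiver-based packet-pair (RBPP) sketch -- our illustration.
# Each record is (send_time, recv_time, size_bytes) for one packet;
# consecutive records form candidate pairs.

from statistics import median

def rbpp_estimate(packets, min_send_gap=1e-4):
    """Estimate bottleneck bandwidth (bytes/sec) from packet pairs.

    Keeps a pair only if it arrived in order and was sent essentially
    back-to-back (sender timing lets the receiver check both). The
    median of the per-pair estimates stands in for PBM's clustering.
    """
    estimates = []
    for (s0, r0, _), (s1, r1, size) in zip(packets, packets[1:]):
        if r1 <= r0:                # out-of-order or coincident arrival
            continue
        if s1 - s0 > min_send_gap:  # not back-to-back: bottleneck spacing
            continue                # would not be fully "expanded"
        estimates.append(size / (r1 - r0))
    return median(estimates) if estimates else None

def agrees_with_pbm(estimate, pbm_estimate, tol=0.20):
    """The paper's agreement criterion: within +/-20% of PBM's value."""
    return abs(estimate - pbm_estimate) <= tol * pbm_estimate

# Hypothetical trace: 512-byte packets sent back-to-back, queued behind
# a ~170 KB/s bottleneck, so they arrive ~3 ms apart.
trace = [(i * 5e-5, 0.1 + i * 3.01e-3, 512) for i in range(6)]
est = rbpp_estimate(trace)
print(est, agrees_with_pbm(est, 170e3))  # ~170100.0 True
```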

We also evaluate "sender-based packet pair" (SBPP), in which the sender makes measurements by itself. SBPP is of considerable interest because a sender can use it without any cooperation from the receiver, making it easy to deploy in the Internet. To fairly evaluate SBPP, we assume use by the sender of a number of considerations for forming sound bandwidth estimates, detailed in [Pa97b]. Even so, we find, unfortunately, that SBPP does not work especially well. In both datasets, the SBPP bottleneck estimate agrees with PBM only about 60% of the time. About one third of the estimates are too low, reflecting inaccuracies induced by excessive delays incurred by the acks on their return. The remaining 5–6% are overestimates (typically 50% too high), reflecting ack compression (6.1).

我们也评估了"以发送方为基础的包对"(SBPP),即由发送方自行测量。SBPP备受关注,因为发送方不需要接收方的任何配合就能使用它,使其很容易在互联网中部署。为了公平地评估SBPP,我们假设发送方采用了[Pa97b]中详述的一系列用于形成可靠带宽估计的措施。即便如此,遗憾的是,我们发现SBPP的效果并不理想。在两组数据集中,SBPP的瓶颈估计都只有约60%与PBM一致。约三分之一的估计值偏低,反映了确认包在返回途中遭受过大延迟所引入的误差;其余5–6%为高估(通常偏高50%),反映了确认包压缩现象(见6.1节)。
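
A minimal sketch of the sender-side variant follows, assuming the simplest possible estimator (data size divided by ack spacing); the timings are hypothetical, chosen only to illustrate how stretched acks yield underestimates and compressed acks yield overestimates.

```python
# Sketch of sender-based packet pair (SBPP) -- our illustration. The
# sender times the acks returning for two back-to-back data packets
# and infers: bottleneck ~= data_size / ack_spacing.

def sbpp_estimate(data_size, ack_time_0, ack_time_1):
    """Bottleneck estimate (bytes/sec) from the spacing of two acks."""
    spacing = ack_time_1 - ack_time_0
    if spacing <= 0:
        return None  # acks collapsed together: no usable estimate
    return data_size / spacing

# Hypothetical true bottleneck: 170e3 bytes/sec, so two 512-byte
# packets arrive ~3 ms apart at the receiver.

# Acks delayed unevenly on the return path stretch the spacing, so the
# estimate comes out too low (the ~1/3 of cases described above):
print(sbpp_estimate(512, 0.100, 0.105))   # 102400.0, an underestimate

# Ack compression squeezes the acks together, so the estimate comes
# out too high (the remaining 5-6% of cases):
print(sbpp_estimate(512, 0.100, 0.102))   # 256000.0, ~50% too high
```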

5 Packet Loss 数据包丢失

In this section we look at what our measurements tell us about packet loss in the Internet: how frequently it occurs and with what general patterns (5.1); differences between loss rates of data packets and acks (5.2); the degree to which loss occurs in bursts (5.3); and how well TCP retransmission matches genuine loss (5.4).

在本节中,我们考察测量结果告诉了我们哪些关于互联网丢包的信息:丢包发生的频率和总体模式(5.1节);数据包与确认包丢失率的差异(5.2节);丢包成簇发生的程度(5.3节);以及TCP重传与真实丢包的匹配程度(5.4节)。

5.1 Loss rates 损失率

A fundamental issue in measuring packet loss is to avoid confusing measurement drops with genuine losses. Here is where the effort to ensure that tcpanaly understands the details of the TCP implementations in our study pays off [Pa97a]. Because we can determine whether traces suffer from measurement drops, we can exclude those that do from our packet loss analysis and avoid what could otherwise be significant inaccuracies.

在测量丢包时,一个基本问题是避免把测量丢包(measurement drops)与真实丢包混淆。我们为确保tcpanaly理解研究中各TCP实现细节所付出的努力,在这里得到了回报[Pa97a]。由于我们能够判断跟踪记录是否存在测量丢包,我们可以把这类记录从丢包分析中剔除,从而避免可能相当严重的误差。

For the sites in common, in N1, 2.7% of the packets were lost, while in N2, 5.2%, nearly twice as many. However, we need to address the question of whether the increase was due to the use of bigger windows in N2 (2). With bigger windows, transfers will often have more data in flight and, consequently, load router queues much more.

对于两组测量共有的站点,N1中有2.7%的数据包丢失,而N2中为5.2%,几乎是前者的两倍。然而,我们需要考察这一增长是否源于N2中使用了更大的窗口(见第2节)。窗口更大时,传输中往往有更多在途数据,从而给路由器队列带来大得多的负载。
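
A sketch of the filtering step, under the assumption that each trace carries a flag indicating whether tcpanaly found measurement drops; the record layout and numbers are hypothetical.

```python
# Sketch (our illustration): compute an aggregate loss rate while
# excluding traces that suffered measurement drops. Field names and
# counts are hypothetical.

def aggregate_loss_rate(traces):
    """Fraction of packets lost, over traces free of measurement drops."""
    sent = lost = 0
    for t in traces:
        if t["has_measurement_drops"]:
            continue  # drop at the monitor, not in the network: exclude
        sent += t["packets_sent"]
        lost += t["packets_lost"]
    return lost / sent if sent else 0.0

traces = [
    {"has_measurement_drops": False, "packets_sent": 200, "packets_lost": 5},
    {"has_measurement_drops": True,  "packets_sent": 180, "packets_lost": 40},
    {"has_measurement_drops": False, "packets_sent": 220, "packets_lost": 6},
]
print(f"{aggregate_loss_rate(traces):.2%}")  # 2.62%, only clean traces counted
```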

We can assess the impact of bigger windows by looking at loss rates of data packets versus those for ack packets. Data packets stress the forward path much more than the smaller ack packets stress the reverse path, especially since acks are usually sent at half the rate of data packets due to ack-every-other-packet policies. On the other hand, the rate at which a TCP transmits data packets adapts to current conditions, while the ack transmission rate does not unless an entire flight of acks is lost, causing a sender timeout. Thus, we argue that ack losses give a clearer picture of overall Internet loss patterns, while data losses tell us specifically about the conditions as perceived by TCP connections.

通过比较数据包的丢失率与确认包的丢失率,我们能够评估更大窗口的影响。数据包对前向路径造成的压力远大于较小的确认包对反向路径造成的压力,尤其是由于每隔一个数据包确认一次的策略,确认包的发送速率通常只有数据包的一半。另一方面,TCP发送数据包的速率会适应当前的网络状况,而确认包的发送速率不会,除非整个确认包序列丢失导致发送方超时。因此,我们认为确认包丢失更清晰地反映了互联网整体的丢包模式,而数据包丢失则具体反映了TCP连接所感知到的网络状况。
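
A sketch of the data-versus-ack comparison; the packet records and loss patterns below are hypothetical, not drawn from N1 or N2.

```python
# Sketch (our illustration): separate loss rates for data packets and
# acks, the comparison used above to factor out window size.

def loss_rate_by_kind(packets):
    """Map packet kind ('data' or 'ack') to its loss rate."""
    totals, losses = {}, {}
    for kind, was_lost in packets:
        totals[kind] = totals.get(kind, 0) + 1
        losses[kind] = losses.get(kind, 0) + (1 if was_lost else 0)
    return {k: losses[k] / totals[k] for k in totals}

# Hypothetical trace: acks flow at roughly half the data-packet rate,
# per the ack-every-other-packet policy.
packets = [("data", i % 38 == 0) for i in range(1000)] + \
          [("ack", i % 35 == 0) for i in range(500)]
print(loss_rate_by_kind(packets))  # e.g. {'data': 2.7%, 'ack': 3.0%}
```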

In N1 2.88% of the acks were lost and 2.65% of the data packets, while in N2 the figures are 5.14% and 5.28%. Clearly, the bulk of the difference between the N1 and N2 loss rates is not due to the use of bigger windows in N2. Thus we conclude that, overall, packet loss rates nearly doubled during 1995. We can refine these figures in a significant way, by conditioning on observing at least one loss during a connection. Here we make a tacit assumption that the network has two states, "quiescent" and "busy," and that we can distinguish between the two because when it is quiescent, we do not observe any (ack) loss.

在N1中,确认包的丢失率为2.88%,数据包为2.65%;在N2中则分别为5.14%和5.28%。很明显,N1和N2丢失率差异的主要部分并非源于N2中使用了更大的窗口。因此我们得出结论:总体而言,1995年间丢包率几乎翻倍。通过以连接期间至少观察到一次丢包为条件,我们可以对这些数字作有意义的细化。这里我们隐含地假设网络存在"静止"和"忙碌"两种状态,并且二者可以区分,因为网络静止时我们观察不到任何(确认包)丢失。
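
The conditioning itself is simple to express; the sketch below (our illustration, with hypothetical per-connection counts) separates quiescent connections from busy ones and computes the loss rate over the busy ones only.

```python
# Sketch (our illustration): condition loss statistics on whether a
# connection saw any ack loss -- the "quiescent" vs. "busy" split.

def busy_loss_rate(connections):
    """Return (fraction of quiescent connections, ack loss rate over
    'busy' connections, i.e. those with at least one ack lost)."""
    busy_sent = busy_lost = quiescent = 0
    for acks_sent, acks_lost in connections:
        if acks_lost == 0:
            quiescent += 1          # network looked quiescent
            continue
        busy_sent += acks_sent      # only busy connections enter the rate
        busy_lost += acks_lost
    frac_quiescent = quiescent / len(connections)
    rate = busy_lost / busy_sent if busy_sent else 0.0
    return frac_quiescent, rate

conns = [(100, 0), (100, 0), (120, 7), (90, 5)]  # hypothetical (sent, lost)
print(busy_loss_rate(conns))  # (0.5, ~0.057): half quiescent, rest lossy
```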

In both N1 and N2, about half the connections had no ack loss. For "busy" connections, the loss rates jump to 5.7% in N1 and 9.2% in N2. Thus, even in N1, if the network was busy (using our simplistic definition above), loss rates were quite high, and for N2 they shot upward to a level that in general will seriously impede TCP performance.

在N1和N2中,大约有一半的连接没有出现确认包丢失。对于"忙碌"的连接,丢失率在N1中跃升至5.7%,在N2中跃升至9.2%。这样,即使在N1中,只要网络处于忙碌状态(按照我们上面的简化定义),丢失率也相当高;而在N2中,丢失率则升高到了通常会严重妨碍TCP性能的水平。

So far, we have treated the Internet as a single aggregated network in our loss analysis. Geography, however, plays a crucial role. To study geographic effects, we partition the connections into four groups: "Europe," "U.S.," "Into Europe," and "Into U.S." European connections have both a European sender and receiver, U.S. have both in the United States. "Into Europe" connections have European data senders and U.S. data receivers. The terminology is backwards here because what we assess are ack loss rates, and these are generated by the receiver. Hence, "Into Europe" loss rates reflect those experienced by packet streams traveling from the U.S. into Europe. Similarly, "Into U.S." are connections with U.S. data senders and European receivers.

到目前为止,在丢包分析中我们一直把互联网当作一个单一的聚合网络。然而地理因素发挥着重要作用。为研究地理因素的影响,我们把连接分为四组:"Europe"、"U.S."、"Into Europe"和"Into U.S."。"Europe"为发送方和接收方均在欧洲的连接,"U.S."为二者均在美国的连接,"Into Europe"为发送方在欧洲、接收方在美国的连接。这里的命名看似颠倒,是因为我们评估的是确认包丢失率,而确认包由接收方产生。因此,"Into Europe"的丢失率反映的是从美国进入欧洲的数据包流所经历的丢失。同样,"Into U.S."为发送方在美国、接收方在欧洲的连接。
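
A sketch of this partition, assuming each connection records only the regions of its sender and receiver; the region codes are hypothetical.

```python
# Sketch (our illustration): partition connections by region as above.
# Note the deliberate reversal: groups are named for where the *acks*
# travel, since ack loss is what is being measured.

def region_group(sender_region, receiver_region):
    """Classify a connection; regions are 'EU' or 'US' here."""
    if sender_region == receiver_region:
        return "Europe" if sender_region == "EU" else "U.S."
    # Acks flow from the receiver back toward the sender, so a European
    # sender with a U.S. receiver means acks travel *into Europe*.
    return "Into Europe" if sender_region == "EU" else "Into U.S."

print(region_group("EU", "EU"))  # Europe
print(region_group("EU", "US"))  # Into Europe: acks go U.S. -> Europe
print(region_group("US", "EU"))  # Into U.S.
```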

Table 1 summarizes loss rates for the different regions, conditioning on whether any acks were lost ("quiescent" or "busy"). The second and third columns give the proportion of all connections that were quiescent in N1 and N2, respectively. We see that except for the trans-Atlantic links going into the U.S., the proportion of quiescent connections is fairly stable. Hence, loss rate increases are primarily due to higher loss rates during the already-loaded "busy" periods. The fourth and fifth columns give the proportion of acks lost for "busy" periods, and the final column summarizes the relative change of these figures. None of the busy loss rates is especially heartening, and the trends are all increasing. The 17% N2 loss rate going into Europe is particularly striking.

表1按是否有确认包丢失("静止"或"忙碌")对不同区域的丢失率作了总结。第二列和第三列分别给出了N1和N2中处于静止状态的连接占全部连接的比例。我们看到,除了进入美国的跨大西洋链路之外,静止连接的比例相当稳定。因此,丢失率的升高主要是由于本已承载负荷的"忙碌"时段内丢失率更高。第四列和第五列给出了"忙碌"时段确认包丢失的比例,最后一列总结了这些数字的相对变化。各忙碌丢失率都不容乐观,而且趋势都在上升。进入欧洲方向17%的N2丢失率尤为惊人。

Within regions, we find considerable site-to-site variation in loss rates, as well as variation between loss rates for packets inbound to the site and those outbound (5.2). We did not, however, find any sites that seriously skewed the above figures.

在这些区域内部,我们发现丢失率存在相当大的站点间差异,流入站点与流出站点的数据包丢失率之间也存在差异(5.2节)。不过,我们没有发现任何严重扭曲上述数字的站点。
