当前位置：文档之家› The Adaptive Bubble Router 1

The Adaptive Bubble Router 1

The Adaptive Bubble Router1

V.Puente,C.Izu,R.Beivide,J.A.Gregorio,F.Vallejo and J.M.Prellezo

Universidad de Cantabria,39005Santander,Spain

University of Adelaide,SA5005Australia

The design of a new adaptive virtual cut-through router for torus networks is presented in this paper.With much lower VLSI costs than adaptive wormhole

routers,the adaptive Bubble router is even faster than deterministic wormhole routers

based on virtual channels.This has been achieved by combining a low-cost deadlock

avoidance mechanism for virtual cut-through networks,called Bubble?ow control,

with an adequate design of the router’s arbiter.

A thorough methodology has been employed to quantify the impact that this

router design has at all levels,from its hardware cost to the system performance

when running parallel applications.At the VLSI level,our proposal is the adaptive

router with the shortest clock cycle and node delay when compared with other state-

of-the-art alternatives.This translates into the lowest latency and highest throughput

under standard synthetic loads.At system level,these gains reduce the execution time

of the benchmarks https://www.doczj.com/doc/8514170668.html,pared with current adaptive wormhole routers,

the execution time is reduced by up to27%.Furthermore,this is the only router that

improves system performance when compared with simpler static designs.

Key Words:Interconnection subsystem;packet deadlock;crossbar arbitration;hardware routers; VLSI design;performance evaluation;cc-NUMA systems;parallel application benchmarks.

1.INTRODUCTION

Parallel and distributed computing performance is generally a fraction of the sum of the computational power provided by the processors that make up the parallel system. This performance degradation can be attributed to two sources:i)the impossibility of adequately balancing the computing load among processors,and ii)the overheads involved in communication between processors and memories.An ef?cient mapping can be found for a number of applications,resulting in a uniform computational load among processors. However,the cost of communication is unavoidable and we can only do our best to minimize it.

Scalable distributed shared-memory multiprocessors with hardware data cache coherence (cc-NUMA)are nowadays the trend in building parallel computers because they provide the most desirable programmability[20].Other existing distributed shared-memory parallel

This work is supported in part by the Spanish CICYT under contract TIC98-1162-C02-01

2V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.PRELLEZO computers rely on the programmer to maintain data coherency throughout the whole system [28].In both cases,system performance can be seriously limited due to the network delays when accessing remote data.The communication subsystem of these machines is composed of a number of interconnected routers arranged in a speci?c topology.The performance of these direct interconnection networks is governed by that of the router and the interconnect. With respect to network topology,Dally[11]and Agarwal[1]recommended the use of low-degree networks belonging to the class of the-ary-cubes.Rings,meshes, tori and hypercubes are representative networks of this class.Some parallel computer manufacturers followed their advice and machines such as the Cray T3D and Cray T3E use three-dimensional tori.In relation to experimental ASCI ultracomputers,the Option Red employs two interconnected2-D meshes and the Blue Mountain uses a three-dimensional torus in the top level of the machine[19,4].

Both T3D and T3E routers[28]use wormhole routing and a set of four and?ve virtual channels per direction respectively.The former uses oblivious dimension order routing (DOR)and avoids network deadlock by changing to the next virtual channel when crossing a wrap-around link.The latter employs an additional virtual channel in order to provide fully adaptive routing.The router employed by the O-2000is the SGI Spider[15]which uses wormhole table-based deterministic routing.Both,T3E and Spider routers,manage messages divided into packets,the maximum size being700bits(10?its)and800bits(5 micropackets,equivalent to the notion of?its)respectively.Moreover,both routers employ memory queues associated to each virtual channel able to store up to22?its in the T3E router and up to5micropackets in the case of the Spider.

In this research,we are going to propose a new packet router with the same functionality as the above commercial routers,but using a different design philosophy.Our goal is to produce the simplest possible hardware design in order to minimize the pin-to-pin router latency without compromising the other performance?gure of merit:the ability to handle large volumes of data,in other words to support high packet throughput.To achieve this goal,we will apply the following architectural strategies:i)Virtual Cut-through?ow control(VCT)[18]will be used instead of wormhole;ii)A new and very simple deadlock avoidance mechanism,based on VCT,called Bubble?ow control will be employed;and iii) A new arbitration policy supporting multiple output requests will be used which is simpler than current policies.

As a consequence of strategies i)and ii)the router uses a minimum number of virtual channels,thus reducing hardware cost and complexity.In addition,strategy iii)allows for a low cost implementation of adaptive routers.The resulting design for a-ary-cube network is a particularly simple router which adequately balances base latency and network throughput.

The results presented in this paper show that our adaptive Bubble router outperforms state-of-the-art commercial solutions.In addition,and due to its good trade-off between simplicity and performance,the adaptive Bubble router is a serious candidate to implement on-chip network routers.Manufacturers are currently designing a new generation of microprocessor supporting cc-NUMA style multiprocessing and including network router and interface on-chip,such as the new Compaq/Alpha EV7[2].

To demonstrate the suitability of the adaptive Bubble router,a quantitative analysis comparing different router designs has been carried out.We adopted packet latency and maximum sustained throughput as the basic parameters to validate the goodness of our proposal.An accurate measurement of these merit?gures can only be obtained by

THE ADAPTIVE BUBBLE ROUTER3 monitoring the low-level characteristics of the corresponding VLSI designs.Therefore, we have designed our router and its corresponding counterparts using high-level VLSI synthesis tools,which provide us with adequate estimations in terms of required silicon area and circuit delays.These metrics are fed into a very detailed register-transfer level simulator to observe the behavior of the network built from each router design. Traditionally,synthetic benchmarks have been extensively used in the technical literature to assess the performance exhibited by the interconnection subsystems.Although this methodology can be useful in order to stress the communication capabilities,it is also very desirable to observe the behavior of the network as a part of a whole system executing real loads.In order to provide such a realistic scenario we will use a set of three benchmarks from the SPLASH-2suite[31]running on a state-of-the-art cc-NUMA multiprocessor.To build this experimental testbed an execution-driven simulator based on Rsim[22]has been developed.In this way,in addition to providing metrics under synthetic traf?c we are also able to show the performance of several designs under the pressure of various real parallel applications.

The rest of this paper is organized as follows.In Section2,both theoretical and intuitive approaches are employed in order to demonstrate deadlock freedom in our adaptive Bubble router;we achieve this goal by using a new?ow control function based on restricted injection.In Section3,we present a detailed description of the design of the Bubble router, with special emphasis on a new crossbar arbitration policy.Section4is devoted to the design of alternative router organizations in order to compare them with the Bubble one.In Section5,a thorough analysis of the performance exhibited by each solution is presented. The conclusions drawn from this research are discussed in Section6.

2.BUBBLE FLOW CONTROL IN-ARY-CUBE NETWORKS

If the routing algorithm and the?ow control function are not carefully designed,the network may suffer from packet deadlock.Furthermore,there are topologies composed of a set of rings,such as a torus network,which are specially deadlock-prone.Consequently,a great deal of research has been devoted to either preventing or resolving network deadlock. Deadlock can be prevented by ordering the acquisition of resources in such a way that static dependencies among packets cannot form a cycle.This can be achieved by restricting the routing such as in the Turn model[16]or by splitting the traf?c into two or more virtual channels so that cyclic dependencies are eliminated[10].There are multiple proposals of adaptive wormhole routers based on these approaches[21,8].Common drawbacks of these proposals are the high hardware complexity of the resulting device and the unbalanced use of the link resources.

A second approach is deadlock recovery.When a deadlock situation is detected,one or more messages are removed so that the cycle is broken,allowing the rest of packets in the network to?nally progress towards their destination[24].

Resource dependencies are not only based on the routing function but on the current status of the network.Duato’s theory[12]determined the necessary and suf?cient conditions for deadlock freedom.As a consequence of this theory,under certain conditions,we can add fully-adaptive virtual channels to any deadlock-free network knowing that the resulting network is deadlock-free as well.Current adaptive router implementations such as the CrayT3E router[28]use virtual channels both to avoid deadlock and to provide fully adaptive channels.

4V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.PRELLEZO Focusing on the particular characteristics of VCT networks,we can?nd simpler deadlock-freedom mechanisms.Under VCT?ow control,the basic resource is the queue attached to the physical(or virtual)channel.This means that message dependencies are localized between adjacent queues.We will see how to take advantage of this to provide a low cost deadlock avoidance mechanism.In order to adequately present this new mechanism,let’s ?rst introduce some de?nitions and after that,we will concentrate on the establishment of a new deadlock free routing for-ary-cube networks.

2.1.Some Formal De?nitions

In this section,we include some basic de?nitions mainly derived from[10],[14]and [6]to make the paper self-contained and to be able to provide a formal de?nition of our proposed?ow control mechanism.

An interconnection network,,is a strongly connected directed graph .The vertices of the graph,,represent the set of processing nodes.The arcs of the graph,,represent the set of queues associated to the communication links interconnecting the nodes.Each queue has capacity for packets.The number of packets currently stored in that queue is denoted.

The set of queues is divided into three subsets:injection queues,delivery queues and network queues.Each processor uses a queue from to send messages that travel through the network employing queues from and when they reach their destination they enter a queue from.Therefore,packets are routed from a queue in the set to a queue in the set.Obviously,.

A routing function,,where is the power set of,provides a set of alternative queues to route a packet located at the head of the queue to its destination node.We also denoted as the destination node of the?rst packet enqueued at.A deterministic routing function provides a single alternative.

A routing subfunction,,for a given routing function is a routing function de?ned in the same domain as but its range(set of alternative next queues)is restricted to a subset.A simple and convenient way to de?ne the routing subfunction is as follows

A?ow control function,de-termines the access permission for a packet located at queue to enter the queue .Thus,packet is allowed to advance from to if. The?ow control function limits the set of alternative next queues provided by the routing function and obviously depends on network status.

THE ADAPTIVE BUBBLE ROUTER5 Similarly to the routing case we can de?ne a?ow control subfunction for a given?ow control function as follows

A legal con?guration is an assignment of a set of packets to each queue in a set such that,and the packets stored in each queue have been routed from some using the routing function,and verifying the?ow control function.

In short,a legal con?guration represents a network state which is reachable under the selected routing and?ow control functions.

A deadlocked con?guration for a given interconnection network, routing function,and?ow control function is a nonempty legal con?guration verifying the following conditions:

In a deadlocked con?guration,packets cannot advance because the?ow control function does not allow the use of any of the alternative queues provided by the routing function.

2.2.Routing and Flow Control Subfunctions for-ary-cube Deadlock-free

Networks

If we are able to?nd out a pair of routing and?ow control subfunctions,and, that provides deadlock freedom,we can apply Duato’s theory[12]for developing a fully adaptive routing algorithm for-ary-cube networks:cube with dimension and nodes in each dimension.

To begin with,we can initially assume that the nodes of the-ary-cube network are linked by just two communication queues in opposite directions.Each processing node, ,can be denoted as,where is the position in the n-cube and therefore represents the node coordinate in the dimension.This means that,given two neighbor nodes along the dimension,and,where, the bidirectional queues connecting both nodes will be denoted as and.This notation unequivocally determines the direction of the channel,the dimension where it is located and the nodes it connects.

Dimensional Order Routing(DOR)is our candidate for the routing subfunction,. We select this function because it is very simple and it also eliminates any resource cyclic dependencies between packets at different dimensions.Formally,this function can be represented as,

where,with,is the?rst dimension in which the coordinates of nodes and differ,and is the neighbor of in such dimension;in other words,the message is routed to.

6V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.PRELLEZO Thus,the message’s path enters at most once into each traveling dimension.Although applying the subfunction eliminates any resource cyclic dependencies between packets at different dimensions,deadlock can still occur within packets traveling along the same dimension.A classical solution[10]is to split each physical link in two(or more) virtual queues and restrict their use in some way.

Our solution does not require any more queues but a?ow control subfunction,, restricting the access of packets into a new network dimension.This function is a minor modi?cation of the virtual cut-through?ow control and it is formally de?ned as follows,

where and for each queue,the condition must be ful?lled.

Packets moving along a dimensional ring advance as per cut-through?ow control. However,the injection of new packets as well as packets changing dimension are only allowed if there is room for not just one but two packets.

Using the pair of subfunctions in a-ary-cube network is a suf?cient condition to prevent deadlock.In fact,if we suppose that under these conditions a deadlock arises in dimension,necessarily some packet was incorporated to this dimension without verifying the function;note that according to the de?nition of this function,it is not possible to reach a legal con?guration in which where

In other words,applying and it is impossible to reach a legal con?guration in which all queues of a dimension and direction are completely full.

It is important to note that the second free packet space that prevents deadlock in a dimension does not have to be in the queue requested by the packet entering that dimension but it could be in any of the other queues at that unidirectional ring,.This allows us to generalize the subfunction as follows,

This subfunction captures all the safe scenarios but the complexity of examining all queues from a unidirectional ring is considerable.Instead,we could look for the second free buffer at the local queue which belongs to the same unidirectional ring as. This strategy is particularly useful for routers with input buffering as it allows the evaluation of based only on local information.

Intuitively,we can understand how Bubble?ow control works by looking at the unidirec-tional ring shown in Figure1.Packets(shaded queue units)are allowed to move(shaded arrows)from one queue to another inside the ring as per virtual cut-through switching. However,packet injection is only allowed at a given router if there are at least two empty packet buffers in the dimension(and direction)requested by the packet.By doing so,we guarantee that there is always at least an empty packet buffer in the ring.That free buffer

THE ADAPTIVE BUBBLE ROUTER

acts as a bubble,guaranteeing that at least one packet is able to progress.Besides,changing dimension is also treated as an injection into the next dimensional ring.

In short,Bubble ?ow control avoids deadlock inside rings by preventing packets from using the potentially last free buffer of a dimensional ring.Thus,combined with DOR routing it prevents deadlock without requiring any virtual

channels.

Injection

From k-1dimension

Injection Injection From k-1dimension

From k-1dimension

Injection

From k-1dimension

FIG.1.Deadlock avoidance in a unidirectional ring by using Bubble ?ow control (

2.3.Fully-Adaptive Routing Based on and

The previous section has shown how to build a deadlock-free -ary -cube network applying the pair and ,on the set of queues .Now,we can add new queues (virtual channels)between nodes where packets can move using any minimal path (no misrouting).If packets are blocked in these queues,they will use as escape queues.We know,according to [14],the resulting network is deadlock-free as https://www.doczj.com/doc/8514170668.html,ing the previous notation,the ?ow control function ,which includes ,will be applied over all queues as follows,

This means that Bubble is the type of ?ow control applied when accessing escape queues and virtual cut-through when accessing adaptive ones.That is,hops from deterministic queues to adaptive ones will take place freely,while hops from adaptive queues into escape ones are treated as injections into the escape rings.The routing function associated with queues provides a set of possible next queues following a path of minimum distance,including the escape queue from the routing subfunction .For 2-D networks provides up to three queues that are chosen in a ?xed order according to this selection function :

1.The adaptive queue of the neighbor along the incoming dimension,if the correspond-ing offset is not zero.That is,the number of hops a packet must travel from the current node to its destination node,along the incoming dimension,is not zero.

2.The adaptive queue of the neighbor along the other dimension,if the corresponding offset of that dimension is not zero.

3.The escape queue of the neighbor along the ?rst dimension,if the offset of this dimension is not zero or the escape queue of the neighbor along the second dimension if the offset of the ?rst dimension is zero.

8V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.PRELLEZO Thus,priority is given to the adaptive queues over the escape ones.Obviously,the function must return true for a packet to progress to the next queue.This selection policy provides a high routing?exibility,allowing packets to follow any minimal path available in the network.

The adaptive routing algorithm resulting from applying and is deadlock-free.As we previously indicated,adding queues to a deadlock-free routing algorithm cannot induce deadlock,regardless of how packets are routed in the additional queues.The only constraint is that the routing algorithm should not consider the queue currently storing the packet when computing the set of routing options.This constraint is met by our routing algorithm because a packet can be routed on both escape or adaptive queues at any router,regardless of the queue storing it.Taking into account that DOR with Bubble?ow control is deadlock-free, it follows that the whole routing algorithm is also deadlock-free.

2.3.1.Starvation

Controlling the injection?ow and the changes of dimension using the Bubble algorithm is an ef?cient technique for avoiding deadlock.However,under certain conditions,this technique may introduce another anomaly:starvation.This problem arises when there is different priority for accessing some of resources.In the escape queues,packets advancing along a dimension have more priority than those that are trying to enter it.

)*,

FIG.2.Example of starvation in deterministic Bubble?ow control(which is resolved by our adaptive proposal).

For example,in Figure2,packets going from router to router have a higher priority than packets trying to be injected at router.Thus,if traf?c between and is not stopped,a packet can be inde?nitely waiting to enter.However,starvation cannot occur for the adaptive Bubble algorithm because the access to the adaptive queues is not prioritized.This means that packets in transit have the same priority to acquire the adaptive queue as incoming packets.In the previous example,the packet to be injected at will succeed entering the next node’s adaptive queue.Even if that queue is temporarily full,a packet unit will eventually become free because the network is deadlock-free.

3.DESIGN ISSUES FOR THE ADAPTIVE BUBBLE ROUTER:ARBITRATION

SCHEMES

This section discusses the design of the Adaptive Bubble Router,a router for-ary -cube networks based on the adaptive routing and?ow control functions described in the previous section.

The complexity of adaptive routers compared with oblivious ones comes mainly from two sources.The?rst one is the extra resources necessary to prevent/recover from dead-lock.Bubble?ow control reduces the minimum number of virtual channels for deadlock prevention to only two.The second one is the arbiter,whose role is to match the input requests with the available output ports.

THE ADAPTIVE BUBBLE ROUTER9 The arbitration among incoming packets is a simple task in a DOR router as each input requests only one of the outputs and,therefore,each output port assignment is independent of the https://www.doczj.com/doc/8514170668.html,plexity rises with adaptivity because each incoming packet can request more than one output port.Independent arbitration for each output port may lead to multiple output resources being assigned to the same packet.Furthermore,adaptive routers add one or more virtual channels in order to avoid deadlock.This,in the best case,duplicates the number of possible requests.

There are various arbitration schemes that deal with this complexity.Among others,we started analyzing the following alternatives:

1.A centralized arbiter that accepts multiple input requests from each incoming message, matching them to all available output ports.Obviously,this approach allows for the maximum number of packets to be routed from input to output in the same cycle.

2.A distributed arbiter,called Output Arbiter Crossbar(OAC).We have limited its complexity by forcing each incoming packet to request a single output port per cycle from its set of possible outputs.Hence,our approach has a complexity similar to a DOR’s arbiter.

3.An arbiter,called Sequential Input Crossbar(SIC),accepting multiple output requests but servicing only one input channel per cycle.A similar option was used in the Cray T3E router[28].

Although the?rst alternative provides the best result from the functional point of view, it has a high VLSI cost in terms of delay that has a signi?cant impact on the router clock cycle and delay.In fact,in[26]we explored a design based on Tamir’s Wave Front arbiter [30],which increased the clock cycle by50%and37%with respect to the second and third schemes.Hence,we discarded that scheme and focused on the remaining two.

As the arbiter is a central building block,its implementation will in?uence the design of other router components such as the routing units and/or the output controllers.We will ?rst present the detailed implementation of the Bubble router based on the OAC arbiter, and then the other arbitration alternatives.

3.1.Bubble Router with OAC Scheme

The main blocks of the proposed Bubble router for a-ary2-cube are shown in Figure3. Extensions for a higher dimension network are straightforward.Internal router components are clocked synchronously but communications with neighboring routers are asynchronous in order to avoid clock skew problems.There is an escape queue and an adaptive queue for each physical link.They behave as described in the previous section.Both of them are implemented as FIFO queues with an additional synchronization module to support the asynchronous communication between routers.Thus,we have9input queues:four escape and four adaptive queues respectively and one injection queue.

Input queues and output ports are connected through a crossbar.As Bubble?ow control is an extension of VCT we can multiplex the physical link among its escape and adap-tive virtual channels on a packet-by-packet basis.Therefore,it is not necessary to send acknowledgment signals for data transmitted each cycle,and so,communication is less sensitive to wire delays.This also means that the arbiter itself can resolve the contention between the virtual channels to use a common output port.Consequently,we eliminate the need for virtual channel controllers to connect the crossbar to the output links and we also reduce the number of crossbar outputs from9to5.This diminishes the latency of packets crossing the router.

10V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.PRELLEZO

FIG.3.OAC Router organization.

The approach of the OAC scheme is an extension to the arbitration strategy used in static routers in which each input requests a single output.Thus,the arbitration for each output is an independent module that applies a round robin policy to prevent starvation of any of the requesting inputs.In our adaptive router there are up to3pro?table outputs so we must issue their requests in a sequential manner,that is,one request per cycle.In other words, we transfer part of the complexity from the arbiter back to the routing unit(RU)attached to its input queue.In each cycle,the RU will request one of the possible outputs,in the order given by the selection function described in Section2.3,until one of them is granted.It is very unlikely that a packet could be inde?nitely requesting access to its output port due to the timing between multiple input requests.A simple policy that deals with this problem after a?xed number of cycles,consists of setting the request to the same output channel until it is granted.As this is a rare event,we set its threshold to a large number so that it has no impact on network performance.

FIG.4.Partial Router Structure for Output Arbiter Crossbar(OAC).

The set of requests from the9RUs is processed by the OAC arbiter,based on the available local output ports and the status of the neighbor’s queues.In short,it grants as many requests as possible,arbitrating among inputs contending for the same output.Thus, arbitration and switching logic is distributed for each output port as shown in Figure4.

THE ADAPTIVE BUBBLE ROUTER11 Thus,the x crossbar can be subdivided into multiplexers of inputs.

As a packet advances towards its destination,its header phit must be updated by decre-menting the offset in the selected dimension.As the RU is requesting a single output,it can issue the correct header phit in the same cycle as it makes the request.

3.2.Bubble Router with SIC Scheme

The Sequential Input Crossbar simpli?es arbitration by servicing only one input queue per cycle.Thus,incoming messages request all pro?table outputs simultaneously and they are sequentially serviced in a round-robin fashion.This means that at most,one input-to-output connection is established per cycle.A similar option was used in the Cray T3E router[28].

When we replace the OAC with the SIC scheme the resulting router,shown in Figure5, differs not only in the crossbar and arbiter elements but also in the routing units that generate the output requests.

FIG.5.SIC Router organization.

The Header Decoder(HD)is a very simple routing unit that decodes the header phit and generates all its output request signals.It also generates header updates for each possible output dimension.After the arbiter grants access to one output,the corresponding header phit is propagated through the crossbar.

Figure6shows the internal structure of the SIC arbiter and its relation to the other router components.Each SIC arbitration cycle involves the following task sequence:

1.Find the next input to be serviced.This is implemented with a token structure that moves in round-robin fashion to the next busy input.

2.Check the availability of the requested outputs.

3.Select one of those available outputs in the order given by our selection function.

4.Establish the crossbar connection and propagate the updated header.

This approach eliminates the contention among multiple inputs but the remaining tasks involved in the arbitration process cannot be carried out in parallel.In short,it is not

12V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.

PRELLEZO

FIG.6.(a)Partial Router Structure for Sequential Input Crossbar(SIC)(b)Arbiter Detail. possible to implement this functionality in a single cycle without penalizing the router’s clock cycle.Thus,we can anticipate the need to split this sequence into two cycles,one for arbitration and a second one for crossbar connection.

3.2.1.Time and Area of the Alternative Bubble Router Designs

In order to evaluate the cost of the Bubble router employing either of the two arbitration schemes we have described both routers using VHDL at RTL level.Feeding these de-scriptions into a high level synthesis tool(Synopsys v1997.08)we have obtained their logic level implementation.These designs have been mapped into0.7m(two metal layers) technology from the ATMEL/ES2foundry.

Physical data links were set17bits wide(1phit),16bits for data and1bit indicating the packet tail.Packets are20phits long,including the header phit which represents the destination address as a tuple of offsets,i.e.,the and8-bit offsets for a2-D torus.We have selected a queue capacity of4packets(80phits);this is a reasonable size for current VLSI technologies without constraining the router clock cycle.Furthermore,longer queues will increase area demands without signi?cantly improving network throughput.

For these features and under typical working conditions with standard cells from Es2 Synopsys design kit V5.2we have obtained the time and area costs for each router.Although these values are estimated by Synopsys in a conservative way[29],they provide us with

THE ADAPTIVE BUBBLE ROUTER13 values very close to the physical domain.Figures7.(a)and7.(b)show the component delays and the resulting pipelines for OAC and SIC respectively.

FIG.7.(a)Adaptive Bubble router pipeline with OAC arbitration,(b)Adaptive Bubble router pipeline with SIC arbitration.

Obviously,the component with highest delay determines the maximum clock frequency of the device.As mentioned before,the SIC implementation has an additional stage,and in both cases the arbitration stage is the one dictating the clock cycle.The FIFO and routing stages consume one cycle each.Finally,because of its asynchronous behavior,the synchronization module spends a variable amount of time(1cycle on average).

Once we had identi?ed the critical component,we optimized the other components focusing on area minimization.In this way we reduce the router’s area and balance the time spent in each pipeline stage.Table1summarizes the cost of both routers.There is a trade-off between the two arbiter schemes,OAC being the fast one and SIC using the less area.SIC needs less area because it has only one round-robin in comparison with the5of OAC(one per port).On the other hand,in OAC the round-robin completes the arbitration while SIC must complete other tasks as mentioned before.Thus,the different delays.

TABLE1

Time and Area for each router implementation under typical conditions.

Router Critical Router Router Area Arbiter and Crossbar

Path()delay()()Area()

BAda-OAC 5.6522.6025.958.98

BAda-SIC 6.1930.9023.34 4.57

Adaptive Bubble router with OAC scheme.

Adaptive Bubble router with SIC scheme.

14V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.PRELLEZO

FIG.8.(a)Pipeline structure for VCAda-OAC router,(b)Pipeline structure for VCAda-SIC router.

4.DESIGN OF ALTERNATIVE ROUTER ORGANIZATIONS

Once the new router proposals have been de?ned,the next step is to assess their per-formance for speci?c networks.However,it is not enough to obtain absolute performance ?gures for both routers.We also need to compare them with other existing proposals using the same design style and methodology.

To achieve this goal,four additional routers have been designed for the same topology.

4.1.Virtual Channel Based Adaptive Wormhole Routers

We have selected two alternative routers based on Duato’s routing algorithm[13].This scheme requires a minimum of three virtual channels per link.Two channels use the Dally&Seitz algorithm[10]and provide the escape way for the third fully-adaptive channel. These two adaptive routers represent standard solutions because they use wormhole ?ow control and virtual channels to avoid deadlock.The?rst one,VCAda-SIC uses SIC arbitration,thus being very close in design philosophy to the Cray T3E router.As we have already seen the bene?ts of OAC arbitration in terms of speed,our second router,VCAda-OAC uses this scheme.In this way,we can isolate the impact that the two arbitration and deadlock avoidance mechanisms have on network performance.

Note that now we have three virtual channels,so for a2D torus there are13crossbar inputs.Moreover,traditional wormhole?ow control implies that channel multiplexing is performed in a?it by?it basis so this task cannot be coupled with the output arbitration as we did in the Bubble routers.Consequently,the crossbar size is and there is one virtual channel controller per output that determines in each cycle which VC has access to the physical link.

As in the router for the Cray T3E,the maximum message length is limited.By doing so,the restriction on the use of adaptive channels can be eliminated,and therefore,the presence in the same queue of?its belonging to different messages is allowed.Although other alternatives for?ow control have been tested,this is the one that offers the best results. The pipeline structure for both routers is similar to those shown for Bubble routers except that they require an additional stage for the VC controller.These pipelined structures are

THE ADAPTIVE BUBBLE ROUTER15 shown in Figures8.(a)and8.(b).Component delays were calculated under the same conditions as those in Section3.2.1.

Although the Cray T3E uses similar arbitration to that of the VCAda-SIC,it uses a multiplexed crossbar.This approach reduces silicon area with respect to our full crossbar implementation.Notwithstanding,?it-level multiplexing imposed by traditional wormhole ?ow control would considerably increase the arbiter complexity.Although multiplexing at a larger granularity could increase overall performance,this solution would further increase arbiter complexity.

As the routing unit attached to the input queue multiplexes?its from the different VCs,those?its will follow different crossbar paths.Thus,multiplexing must be done in coordination with the crossbar module that must rearrange its internal connections to send the current?its to their selected outputs.In other words,the55crossbar must emulate in each multiplexing round the connections of a135non-multiplexed crossbar.A partial solution to cope with this complexity is to increase the?it size from the traditional1phit to a small multiple.In fact,as the Cray T3E has channel pipelining,its?it size is not1but phits.

4.2.DOR Routers:the Baseline Case

In order to measure the cost of adaptivity,two routers using dimensional order routing (DOR)have been designed.

The?rst router,based on[9],uses wormhole switching and two virtual channels to avoid deadlock.We have not used more advanced algorithms for handling virtual channels,like the one proposed in[13],because it increases the router complexity and diminishes the main advantage of deterministic routers:low latency.This router is denoted by VCDOR. The second one,denoted by BDOR,is a VCT router that applies the Bubble mechanism; thus,it requires a single queue/virtual channel per link.Although this router is a good baseline for adaptive Bubble routers,we cannot ignore the risk of starvation as discussed in Section2.3.1.

Figure9shows the pipeline and delays for VCDOR and BDOR obtained when using the conditions and methodology as per previous router implementations.Both routers are simpler than their adaptive counterparts as they have a lower number of VCs and a simple arbiter.

4.3.A Comparison of VLSI Router Cost

This section summarizes the VLSI cost of the six routers under study.To fairly compare among designs,all routers have the same buffer capacity per physical link.As the adaptive Bubble router has two virtual channels,each with a80-phit queue,we have set the BDOR input queue to be160-phit long.The VC-based adaptive routers have three virtual channels so we have employed40-phit queues for the escape channels and an80-phit queue for the adaptive one.

The results for silicon area and time delays for each of these six designs,obtained from the hardware synthesis tools,are shown in Table2.Note that the building blocks differ in each implementation as was described in each original proposal but we apply the same VLSI design style.

In all the designs the arbiter sets the cycle time,except for BDOR in which the FIFO queues are the critical component.As can be seen in this table,the cycle time for BAda-OAC

16V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.PRELLEZO

FIG.9.(a)Pipeline structure for deterministic Bubble router,(b)Pipeline structure for deterministic Virtual Channel router.

TABLE2

Characteristics of router designs.

Router Clock Crossbar Queue Cycle time Area Hop Time

Cycles Size Sizes

BAda-OAC4 5.6525.9522.60

BAda-SIC5 6.1923.3430.90

VCAda-OAC.5 6.2850.6531.40

VCAda-SIC.67.5044.5945.00

BDOR4 5.2511.0121.00

VCDOR5 5.5719.0827.85

is similar to that of the deterministic wormhole router.This is not surprising as both have a similar arbitration philosophy and use the same number of virtual channels.BAda-OAC clock cycle is slightly higher because it also multiplexes incoming packets,however this is compensated for by the fact that VCDOR requires a?fth stage to multiplex packets at the ?it level.Besides,this reduces silicon area for the adaptive Bubble router which uses only 56%of the required area for the adaptive wormhole counterpart.

The hop time is a?gure of merit because together with the average distance,it provides the network base latency.Thus,BDOR will exhibit the lowest latency,followed by BAda-OAC.Note that their hop times are smaller than with VCDOR.Thus,we have achieved our goal of implementing an adaptive VCT router as fast as current static ones.

Finally,it is clear that methods based on Bubble?ow control are,at least,a realistic alternative to the classical wormhole solutions based on virtual channels.

https://www.doczj.com/doc/8514170668.html,WORK ANALYSIS

Besides comparing the routers from the VLSI point of view,we must compare their behavior in terms of the network performance they provide.Obviously,network perfor-

THE ADAPTIVE BUBBLE ROUTER17 mance depends greatly on the VLSI implementation as it determines the internal contention among incoming packets and the speed at which they are forwarded.Thus,any evaluation of network performance shouldn’t ignore the VLSI costs.

We could analyze the network using a VHDL simulator,like Leapfrog(from Cadence) [3],with the VHDL descriptions already developed at the synthesis phase.On the one hand,this method obtains very precise results and allows for the monitoring of any router element.On the other hand,it requires large computational resources,both in simulation time and memory capacity.For instance,the average time for simulating20,000clock cycles of an88torus on an UltraSparc2with128Mbytes is about2.5hours.

With the aim of reducing the design cycle,an object oriented simulator has been devel-oped in C++.This simulator,called SICOSYS(SImulator for COmmunication SYStems) [25],captures the low-level characteristics of each router,including the pipeline stages and the component delays obtained by using VLSI tools.This approach delivers accurate results which are very close to those obtained through VHDL simulation(showing discrepancies of less than4%and with a signi?cant reduction of simulation times).SICOSYS has a mes-sage generator that can produce synthetic loads.Besides,it provides a network interface for the simulation of real application loads as will be described later.The following two sections will present the simulation results obtained in both scenarios.

https://www.doczj.com/doc/8514170668.html,work Performance Under Synthetic Loads

This section presents simulations from an-ary-cube obtained by SICOSYS for each router under the component delays presented in the previous sections.In all the experiments, the traf?c generation rate is uniform and randomly distributed over time.We considered two different message destination patterns:random and speci?c permutations.

In the?rst case,all the nodes have the same probability of becoming the destination for a given message.Two message length distributions were considered for this pattern:?xed length traf?c(40bytes or20phits)and bimodal traf?c:a mix of short and long messages (20phits and200phits).Note that in VCT routers the long messages are fragmented into 10small packets.

In the second case,each node exchanges messages with another particular node.We selected three permutation traf?c patterns:matrix transpose,bit reversal,and perfect-shuf?e.All three cases are frequently used in numerical applications,such as in ADI methods to solve differential equation systems and in FFT algorithms,among others.

https://www.doczj.com/doc/8514170668.html,work Latency at Zero Load

There are many parallel applications that are latency sensitive,so it is important that the network exhibits low latency at low loads.Base latency is a good parameter to estimate this latency,as network contention at such loads is minimal.

Table3shows the base latency for all the router designs and traf?c patterns considered. The corresponding values have been obtained when the load applied to the network was around0.05%of the bisection bandwidth.

As base latency depends on the router complexity,it is logical to see that the adaptive routers exhibit higher values that their static counterparts.However,BAda-OAC presents lower base latency than VCDOR,and it is only8%slower that BDOR.

https://www.doczj.com/doc/8514170668.html,work Latency and Throughput at Variable Load

18V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.PRELLEZO

TABLE3

Base latency(ns)for40-byte messages in an torus network.

Router Random Bimodal M.Transpose P.Shuf?e Bit Reversal

BAda-OAC229.5350.0238.3230.4239.0

BAda-SIC282.7418.9295.3284.6296.2

VCAda-OAC281.7419.0293.7286.3296.6

VCAda-SIC374.4541.5391.9376.8392.9

BDOR212.9330.5221.4212.0225.2

VCDOR248.7373.9260.2247.3264.8

Network throughput and latency are the two metrics used to measure the performance of an interconnection network.

Maximum network throughput gives an idea of the network behavior under heavy loads. Its value,expressed in functional terms(phits accepted per network cycle)and technological terms(phits accepted per nanosecond),can be seen in Table4for the different traf?c patterns and routers considered.

TABLE4

Maximum achievable throughput in an torus network.

Traf?c Random Bimodal M.Transpose P.Shuf?e Bit Reversal Router Phits/cycle Phits/ns Phits/cycle Phits/ns Phits/cycle Phits/ns Phits/cycle Phits/ns Phits/cycle Phits/ns BAda-OAC43.67.7136.8 6.5130.6 5.4028.7 5.0834.1 6.03 BAda-SIC41.2 6.6633.0 5.3427.7 4.4726.5 4.2833.3 5.38 VCAda-OAC38.0 6.0534.5 5.4926.2 4.1828.8 4.5932.3 5.15 VCAda-SIC39.4 5.2434.7 4.6327.3 3.6429.1 3.8732.7 4.36 BDOR38.77.3829.8 5.6714.0 2.6619.0 3.6112.5 2.35 VCDOR36.7 6.5928.1 5.0514.7 2.6420.6 3.7012.4 2.35

First we will focus on the functional performance measured in phits/cycle.This is the value commonly used in the literature.

If we compare adaptive versus deterministic routing,we can see that all adaptive routers slightly improved their static counterparts under random traf?c.As this traf?c is already well balanced the gain can only be attributed to the fact that the adaptive routers have an additional virtual channel which reduces the impact of FIFO blocking.

Bimodal traf?c has multiple packets(ten packets for a200phit message)heading to the same destination node.Thus,there is more temporal contention and adaptive routers can speed up message delivery by sending the packets in parallel over multiple paths.Thus, the gains between the adaptive and the static routers widens.Obviously,adaptive routers outperform the static ones for any of the three permutations.

In relation to the arbitration scheme for adaptive routers,OAC or SIC have a minor impact on the functional performance of either router.In the Bubble-based routers,packet multiplexing is carried out during arbitration;thus,contention to access the multiple virtual channels of an output port is re?ected in a higher number of requesting messages.In such cases,OAC improves on SIC by approximately10%because it is able to grant,when possible,more than one output port per cycle.In the VC-based router,?it multiplexing means that virtual channel contention is resolved by the VC controller,reducing the pressure

THE ADAPTIVE BUBBLE ROUTER

200

400600800100012001400012

34567

L a t e n c y (n s .)

Applied Load (phits/ns.)

BAda-OAC BAda-SIC VCAda-OAC VCAda-SIC

BDOR VCDOR

12345678

0123

45678910

A c c e p t e d L o a d (p h i t s /n s .)

Applied Load (phits/ns.)

BAda-OAC BAda-SIC VCAda-OAC VCAda-SIC

BDOR VCDOR

https://www.doczj.com/doc/8514170668.html,tency and Throughput for a 64-node 2-D Torus under Random Traf?c.

200

40060080010001200140001

2345

L a t e n c y (n s .)

Applied Load (phits/ns.)

BAda-OAC BAda-SIC VCAda-OAC VCAda-SIC

BDOR VCDOR

012

345678

A c c e p t e d L o a d (p h i t s /n s .)

Applied Load (phits/ns.)

BAda-OAC BAda-SIC VCAda-OAC VCAda-SIC

BDOR VCDOR

https://www.doczj.com/doc/8514170668.html,tency and Throughput for a 64-node 2-D Torus under Matrix Transpose Traf?c.

200

40060080010001200140001

23456L a t e n c y (n s .)

Applied Load (phits/ns.)

BAda-OAC BAda-SIC VCAda-OAC VCAda-SIC

BDOR VCDOR

1234567

012

345678

A c c e p t e d L o a d (p h i t s /n s .)

Applied Load (phits/ns.)

BAda-OAC BAda-SIC VCAda-OAC VCAda-SIC

BDOR VCDOR

https://www.doczj.com/doc/8514170668.html,tency and Throughput for a 64-node 2-D Torus under Bit Reversal Traf?c.

on arbitration.Besides,SIC provides for faster adaptivity as each message can request any of the potential outputs in the same cycle,while in OAC every adaptive turn takes at least an extra cycle.This improves performance by only 3%.

In relation to the deadlock mechanism,BAda-OAC and BDOR outperform their VC-based counterparts not only in latency but also in throughput,in spite of having one less virtual lane.In particular,the throughput gain for the BAda-OAC under random traf?c is of 15%when compared with VCAda-OAC.We should note that Bubble is limiting

20V.PUENTE,C.IZU,R.BEIVIDE,J.A.GREGORIO,F.V ALLEJO AND J.M.PRELLEZO

the injection of packets at high loads and this is a mechanism which improves network throughput for both wormhole[23]and cut-through networks[17].In this case,limiting injection proves to be more effective than the additional static virtual channel of VC-based routers.The exception is for SIC-based routers in which the differences in channel multiplexing distort the result.

However,when we incorporate the VLSI cost,by expressing the network throughput in phits consumed per nanosecond,we observe a different scenario.Figures10to12show the network latency versus applied load for three of those patterns.It should be noted that these latency values include injection buffer delays.Injection delays must be taken into account in order to produce reliable results because they depend not only on the traf?c load but also on the traf?c pattern.

The?rst conclusion from these results is that any router proposal cannot be fairly evaluated without considering its hardware implementation.Many adaptive routers which manage to exploit the network resources from a functional point of view cannot compete with implemented static networks as their complexity forfeits any of the gains observed at the upper level.This trade-off between the potential functional gains and the complexity they entail is evident in the VC-based adaptive routers:although they appear to improve VCDOR for random and bimodal traf?c,their throughput is15to30%worse in phits/ns. Consequently,adaptive router implementation should focus on minimizing both its router clock cycle and node delay,as the two?gures together with the functional properties of the routing strategy will determine network performance.

In relation to the arbiter alternatives,the functional difference between SIC and OAC is minimal as discussed before,but their technological difference is considerable.Conse-quently,adaptive routers using OAC outperform their SIC counterparts.Similarly,both VCAda-SIC and BAda-SIC lose ground with respect to their static counterparts due to their technological cost.

The other source of complexity for adaptive routers is their deadlock avoidance mecha-nism.By looking at Table4and Figures10to12we can see that as Bubble has a nearly null cost,all Bubble-based routers outperform their VC-based counterparts for any traf?c pattern.Therefore,BAda-OAC which combines the two low-cost mechanisms is not only the best adaptive router,but it also outperforms both static routers for any traf?c pattern. In short,the BAda-OAC provides the highest throughput with only a minimal increment of latency compared to its static counterpart(which is not starvation free).

5.2.System Performance Under Real Loads

Finally,we have evaluated the real impact that each router design has on the execution time of parallel applications.

Rsim,a DSM multiprocessor simulator[22],emulates a cc-NUMA architecture with state-of-the-art ILP processors.Although Rsim provides a large range of parameters to specify the processor and memory hierarchy characteristics,the network con?guration choices are more limited,as it only emulates a wormhole static mesh.Thus,by replacing its network simulator(Netsim)with our RTL network simulator SICOSYS,we can obtain a realistic evaluation of network performance under real loads.Figure13shows the role that SICOSYS plays in this simulation environment,and the main con?guration parameters of the emulated system.

We use each router alternative to implement a pair of networks:one for request traf?c and the other for reply traf?c.This approach resolves application deadlock due to the

路由器实验(4-1)

实验四：综合路由配置(1) （内容一）：路由器 RIP 动态路由配置 1. 实验目的掌握RIP协议的配置方法：掌握查看通过动态路由协议 RIP学习产生的路由；熟悉广域网线缆的链接方式； 2. 实验背景假设校园网通过一台三层交换机连到校园网出口路由器上，路由器再和校园外的另一台路由器连接。现要做适当配置，实现校园网内部主机与校园网外部主机之间的相互通信。为了简化网管的管理维护工作，学校决定采用 RIPV2 协议实现互通。 3. 技术原理 RIP(Routing Information Protocols,路由信息协议)是应用较早、使用较普遍的IGP内部网管协议，使用于小型同类网络，是距离矢量协议；RIP 协议跳数作为衡量路径开销的，RIP协议里规定最大跳数为15；RIP 协议有两个版本：RIPv1 和RIPv2，RIPv1 属于有类路由协议，不支持VLSM，以广播形式进行路由信息的更新，更新周期为30 秒；RIPv2属于无类路由协议，支持VLSM，以组播形式进行路由更新。4.实验步骤建立 packet tracer 拓扑图（1）在本实验中的三层交换机上划分VLAN10 和VLAN20，其中VLAN10 用于连接校园网主机，VLAN20 用于连接 R1。（2）路由器之间通过V.35 电缆通过串口连接，DCE端连接在R1上，配置其时钟频率64000。（3）主机和交换机通过直连线，主机与路由器通过交叉线连接。（4）在 S3560 上配置 RIPV2 路由协议。（5）在路由器 R1、R2 上配置 RIPV2 路由协议。（6）将PC1、PC2 主机默认网关设置为与直连网路设备接口IP地址。（7）验证 PC1、PC2 主机之间可以互相同信； 5. 实验设备 PC 2 台；Switch_3560 1 台；Router-PT 2 台；直连线；交叉线；DCE 串口线

路由器端口映射的原理及设置方法介绍

路由器端口映射的原理及设置方法介绍端口映射其实就是我们常说的NAT地址转换的一种，其功能就是把在公网的地址转翻译成私有地址，采用路由方式的ADSL宽带路由器拥有一个动态或固定的公网IP，ADSL直接接在HUB 或交换机上，所有的电脑共享上网。这时ADSL的外部地址只有一个，比如61.177.0.7。而内部的IP是私有地址，比如ADSL设为192.168.0.1,下面的电脑就依次设为192.168.0.2到192.168.0.254。在宽带路由器上如何实现NAT功能呢？一般路由器可以采用虚拟服务器的设置和开放主机(DMZ Host)。虚拟服务器一般可以由用户自己按需定义提供服务的不同端口，而开放主机是针对IP 地址，取消防火墙功能，将局域网的单一IP地址直接映射到外部IP之上，而不必管端口是多少，这种方式只支持一台内部电脑。最常用的端口映射是在网络中的服务器使用的是内部私有IP地址，但是很多网友希望能将这类服务器IP地址通过使用端口映射能够在公网上看到这些服务器，这里，我们就需要搞清楚所用服务的端口号，比如，HTTP服务是80，FTP服务则是20和21两个端口，而安仕达连锁店软件需要使用到是3个端口，默认是8088、9000、9001，通过MstarDB服务来设置，见下图。这里我们以最常用的80端口为例，设置一个虚拟HTTP服务器，假设路由器IP地址为192.168.2.1。

第一步，在浏览器中输http://192.168.2.1，进行网络配置。第二步，进入虚拟服务器页面设置NAT端口映射，依次填入3个端口，端口类型为TCP或者ALL，主机IP地址填写内部DB服务器所在的IP，然后保存并重新启动路由器，设置就完成了。以后外网安仕达软件访问路由器的IP，就可以直接连接到我们定义的内部服务器上，实现了路由器内的端口映射，有2个理由需要使用这个技术（英文简称是NAT），（1）服务器直接上网容易被黑客和病毒攻击；（2）许多情况，企业需要通过路由器来支持更多的电脑一起使用因特网。注意：NAT技术是要求服务器IP是固定的，因为NAT里面只绑定了IP，所以要求实施人员要保证服务器的IP固定，要实现这个目标有以下几个方案：（1）关闭路由器的DHCP服务，手动设置服务器的IP （2）打开路由器的DHCP服务，通过网卡绑定的方式来将服务器地址设置为静态地址（见下图）

1路由基本配置实训

实训1 路由器基本配置实训目的：掌握路由器的简单配置、DHCP的配置实训准备知识 ●IP地址、子网掩码、DHCP、网关实训环境 ●Packet Tracer模拟软件实训内容 1.配置接口IP地址【命令格式】 1)Interface <接口> 2)IP address <子网掩码> 3)No shutdown 【说明】命令1用于选择接口，接口一般有fastethernet0/0、0/1，serial 0/0/0等，根据路由器上加载的模块不同而不同；命令2设置IP地址和子网掩码；命令3用于开启接口【举例】设置路由器R1的0号Fastethernet模块0号端口的IP地址为192.168.50.1/24，设置0号Serial模块0号接口的IP地址为10.0.0.1/8 Router(config)#interface f0/0 Router(config-if)#ip address 192.168.50.1 255.255.255.0 Router(config-if)#exit Router(config)#interface s0/0/0 Router(config-if)#ip address 10.0.0.1 255.0.0.0 Router(config-if)#exit

2.配置接口封装【命令格式】 1)Interface <接口> 2)Encapsulation [dot1q][frame-relay][hdlc][ppp] 【说明】命令1用于选择接口，如选择以太网接口，则必须选择子接口，如f0/0.1；命令2设置在该线路上传输数据的封装模式，如果是以太网，封装模式应为802.1Q 标准+VID，如果是广域网，封装模式可以是帧中继、HDLC（高级数据链路控制）和PPP（点到点协议）【举例】将下层交换机的VLAN 100的数据，封装在路由器的Fastethernet0/0的接口 Router(config)#interface f0/0.1 Router(config-subif)#encapsulation dot1q 100 将路由器Serial接口的封装模式设置为PPP Router(config)#interface s0/0/0 Router(config-if)#encapsulation ppp Router(config-if)#no shutdown 3.单臂路由实现多VLAN互通配置如下：

路由器的工作原理

路由器的工作原理路由器的是实现网络互连，在不同网络之间转发数据单元的重要网络设备。路由器主要工作在OSI参考模型的第三层（网络层），路由器的主要任务就是为经过路由器的每个数据帧寻找一条最佳传输路径，并将该数据有效地传送到目的站点。为了完成这项工作，在路由器中保存着各种传输路径的相关数据——路由表（Routing Table），供路由选择时使用。由此可见，选择最佳路径的策略即路由算法是路由器的关键所在。因此，当路由器接收到来自一个网络接口的数据包时，首先根据其中所含的目的地址查询路由表，决定转发路径（转发接口和下一跳地址），然后从ARP缓存中调出下一跳地址的MAC地址，将路由器自己的MAC 地址作为源MAC，下一跳地址的MAC作为目的MAC,封装成帧头，同时IP数据包头的TTL（Time To Live）也开始减数,最后将数据发送至转发端口，按顺序等待，传送到输出链路上去。在这个过程中，路由器被认为了执行两个最重要的基本功能：路由功能与交换功能。路由功能路由功能是指路由器通过运行动态路由协议或其他方法来学习和维护网络拓扑结构，建立，查询和维护路由表。路由表里则保存着路由器进行路由选择时所需的关键信息，包含了目的地址、目的地址的掩码、下一跳地址、转发端口、路由信息来源、路由优先级、度量值（metric）等。路由信息可通过多种协议的学习而来，其来源方式可分为直连路由、静态路由、缺省路由和动态路由。一个路由器上可以同时运行多个不同的路由协议，每个路由协议都会根据自己的选路算法计算出到达目的网络的最佳路径，但是由于选路算法不同，不同的路由协议对某一个特定的目的网络可能选择的最佳路径不同。此时路由器根据路由优先级（决定了来自不同路由来源的路由信息的优先权）选择将具有最高路由优先级（数值最小）的路由协议计算出的最佳路径放置在路由表中，作为到达这个目的网络的转发路径。（优先级顺序：直连路由>静态路由>动态路由（OSPF>RIP））而对于一个特定的路由协议，可以发现到达目的网络的所有路径，根据选路

cisco路由器配置及维护手册

cisco路由器配置及维护手册作者：pixfire 一、路由器简单配置 1. 用串行电缆将PC机串口与路由器CONSOLE口连接，用WIN95的超级终端或NETTERM 软件进行配置。PC机串口设置为波特率9600 数据位8 停止位1。 2. 新出厂的路由器启动后会进入自动配置状态。可按提示对相应端口进行配置。 3 . 命令行状态。若不采用自动配置，可在自动配置完成后，题问是否采用以上配置时，回答N。此时进入命令行状态。 4 进入CONFIG模式。在router>键入enable , 进入router # ,再键入config t ,进入 router(config)# 。 5 配置广域端口。在正确连接好与E1端口的电缆线后，在router # 下键入sh controller cbus 。检验端口物理特性，及连线是否正确。之后，进入config模式。进行以下配置。 int serial <端口号> E1端口号。 ip address <掩码> 广域网地址 bandwidth 2000 传输带宽 clock source line 时钟设定。 6 检验配置。设置完成后，按ctrl Z退出配置状态。键入write mem 保存配置。用sh conf 检查配置信息。用sh int 检查端口状态。二、路由器常用命令 2.1 Exec commands: <1-99> 恢复一个会话 bfe 手工应急模式设置 clear 复位功能

clock 管理系统时钟 configure 进入设置模式 connect 打开一个终端 copy 从tftp服务器拷贝设置文件或把设置文件拷贝到tftp服务器上 debug 调试功能 disable 退出优先命令状态 disconnect 断开一个网络连接 enable 进入优先命令状态 erase 擦除快闪内存 exit 退出exce模式 help 交互帮助系统的描述 lat 打开一个本地传输连接 lock 锁定终端 login 以一个用户名登录 logout

TP-Link路由器端口映射

TP-Link路由器端口映射端口映射前需要明确一下几个概念： A一旦用路由器把几台电脑连接起来，那么这几台电脑+路由器，就相当于形成了一个局域网，每台设备都会分配到一个内部的IP地址。 B.路由器和外部链接，会得到一个外部IP地址，所有设备对外都是这个IP地址。 C.由于大家对外都是同一个IP，所以当数据到达路由器时，需要通过设置，告诉路由器，哪些端口来的数据应该被发布到哪个内部IP上去，也就是通过端口设定，把外部IP和内部的IP在这个端口上统一起来。 D由于端口映射统一的是外部和内部IP，所以想要建主的机器，其内部IP必须固定。 E本教程只针对TP-LINK 的路由器，其他牌子的，可以借鉴，大体相同。 ———————————————————————— 正文开始： 1.进入路由器设置，TP-LINK的一般地址是http://19 2.168.1.1/， 2.用户名和密码一般在你的说明书上标注了

3.选择工具栏的“DHCP服务器”——>“静态地址分配表” 4.选择添加新条目

5.按你键盘上的Win+R键，或者从“开始”菜单选择“运行”，在出现的框中输入CMD，并按回车 6.在出现的窗口输入ipconfig /all，回车

7.找到你对应网卡名字，如果是有线的100M网卡，多半是Realtek出品的，很好识别，无线网卡的公司多，但根据名字也可以识别。 8.找到该名字项下，有个叫做“Physical Address”的，这就是你网卡的“物理地址”，也叫做“MAC地址”，每个设备之间的MAC地址一般是不同的，就像指纹一样，可以用来识别标定设备。 9.在静态地址分配表页面，输入对应的MAC地址和你想要设定的内部IP地址“192.168.1.XXX”最后一位可以自由选择，从2~255，保存，并选择该条目生效。 ——————————至此你的电脑对路由器绑定内部IP完成—————————————— ————————如果你还有其他游戏需要建主，重复以下步骤即可————————10.选择工具栏的“转发规则”——>“虚拟服务器”，并添加新条目（此步骤为进行端口映射）

路由器的基本配置

实验一：路由器的基本配置实验目的 1、掌握配置线、直通线、交叉线的使用 2、子网划分（地址块192.168.1.0/24） 3、路由器不同模式之间的关系和转换 4、通过超级终端对路由器进行基本配 5、路由器密码的配置 6、利用telnet登陆路由器并对其配置 7、验证并保存配置实验设备 1、路由器1台 2、PC机4台 3、配置线（console）线1条、交叉线1条、直通线3条实验过程分解 1、一个网段的配置（1）目标：PC2能ping通路由器对应接口或PC1\PC0能ping通路由器对应接口使用的命令（略） 2、两个网段的配置（1）目标：PC0、PC1、PC2能相互ping通（2）配置PC0、PC1、PC2网关前：PC0、PC1、PC2互ping的结果（3）配置PC0、PC1、PC2网关后：PC0、PC1、PC2互ping的结果实验过程 1.先向界面中添加一台router0 ，4台pc机和一台switch，先用控制线（console）连接router0的console接口和pc0的RS232接口，用交叉线（copper cross-over）连接router0的FastEthernet 0/1接口和pc3的FastEthernet接口，用直通线（copper straight-through）连接router0的FastEthernet 0/0接口和switch，pc1，pc2的FastEthernet接口。如图（配置钱的网络拓扑）

2.点击pc0进入Desktop的Terminal，如图 3.进入Terminal后，如图 4.点击OK进入代码，进入代码编辑界面，如图

01-第1章 IP路由概述

目录第1章 IP路由概述.................................................................................................................1-1 1.1 IP路由和路由表.................................................................................................................1-1 1.1.1 路由和路由段...........................................................................................................1-1 1.1.2 通过路由表选路.......................................................................................................1-2 1.2 路由协议概述.....................................................................................................................1-3 1.2.1 静态路由与动态路由................................................................................................1-4 1.2.2 动态路由协议的分类................................................................................................1-4 1.2.3 路由协议及路由优先级............................................................................................1-5 1.2.4 负载分担与路由备份................................................................................................1-5 1.2.5 路由信息的共享.......................................................................................................1-6 1.3 路由管理.............................................................................................................................1-6 1.3.1 路由表的显示...........................................................................................................1-6 1.3.2 路由管理模块的显示和调试.....................................................................................1-7

路由器、交换机常用维护手册

第一章路由器常用维护手册 1.路由器简单配置 1.1用串行电缆将PC机串口与路由器CONSOLE口连接，用WIN95的超级终端或NETTERM软件进行配置。PC机串口设置为波特率9600 数据位8 停止位1。 1.2新出厂的路由器启动后会进入自动配置状态。可按提示对相应端口进行配置。 1.3命令行状态。若不采用自动配置，可在自动配置完成后，问题是否采用以上配置时，回答N。此时进入命令行状态。 1.4进入CONFIG模式。在router>键入enable , 进入router # ,再键入config t ,进入router(config)# 。 1.5配置广域端口。在正确连接好与E1端口的电缆线后，在router # 下键入sh controller cbus 。检验端口物理特性，及连线是否正确。之后，进入config 模式。进行入下配置。 int serial <端口号> E1端口号。 ip address <掩码> 广域网地址 bandwidth 2000 传输带宽 clock source line 时钟设定。 1.6检验配置。设置完成后，按ctrl Z退出配置状态。键入write mem 保存配置。用sh conf 检查配置信息。用sh int 检查端口状态。第二节路由器常用命令 2.1Exec commands:

<1-99> 恢复一个会话 bfe 手工应急模式设置 clear 复位功能 clock 管理系统时钟 configure 进入设置模式 connect 打开一个终端 copy 从tftp服务器拷贝设置文件或把设置文件拷贝到tftp服务器上debug 调试功能 disable 退出优先命令状态 disconnect 断开一个网络连接 enable 进入优先命令状态 erase 擦除快闪内存 exit 退出exce模式 help 交互帮助系统的描述 lat 打开一个本地传输连接 lock 锁定终端 login 以一个用户名登录 logout 退出终端 mbranch 向树形下端分支跟踪多路由广播 mrbranch 向树形上端分支跟踪反向多路由广播 name-connection 给一个存在的网络连接命名 no 关闭调试功能 pad 打开X.29 PAD连接 ping 发送回显信息 ppp 开始点到点的连接协议 reload 停机并执行冷启动 resume 恢复一个活动的网络连接 rlogin 打开远程注册连接 rsh 执行一个远端命令 send 发送信息到另外的终端行 setup 运行setup命令 show 显示正在运行系统信息 slip 开始SLIP协议 start-chat 在命令行上执行对话描述

各种路由器设置端口映射

各种路由器，内网映射外网大全 2009-04-04 18:06 各种ADSL MODEM端口影射端口映射技术是建站过程中一项关键技术，本次我们通过在论坛征集来自众多建站高手提供的各种不同品牌ADSL MODEM端口映射设置方法和相关应用参数介绍。整理成一份完整的文档资料供更多需要建站的用户参考。在此特别鸣谢提供资料及信息的朋友们！什么情况下需要做端口映射？如果网络情况是下面这样的： internet<--->adsl router<--->hub<--->web server internet<--->adsl modem<--->gateway<--->hub<--->web server 那么internet用户想浏览你的web server，80的请求只能到adsl router或gateway，就过不去了。你就要做一个转发，让80请求到adsl router或gateway 后，达到web server，那web server才有可能回应，并返回给你正确内容。这就是端口转发，也叫端口映射。以下为部分路由器基于Web管理界面的访问地址： 1. DLINK出厂定义的路由器地址是19 2.168.0.1 2. Linksys出厂定义的路由器地址是192.168.1.1 3. 3com出厂定义的路由器地址是192.168.2.1 4. 微软出厂定义的路由器地址是192.168.2.1 5. Netgear出厂定义的路由器地址是192.168.1.1 6. asus出厂定义的路由器地址是192.168.1.1 上面几个是厂家定义地址，如果是带有猫的路由，地址不一定一样。不同厂商的产品也有可能有些差别。例如speedtouch的出厂定义是10.0.0.138，更多型号设备管理界面访问地址，用户可参考产品说明书或者登陆相关产品官方网站了解。以下为在本介绍文档中出现的设备：TP-LINK TD-8800 TP-Link TD8830 TP-LINK TL-R410 Cyrix686 D-Link DI-704P D-Link DSL-500 阿尔卡特（ALCATEL） S6307KH 合勤 642R GREENNET 1500c 腾达（TENDA）TED 8620

实验三路由器基本配置

上机报告姓名学号专业班级计科普1002 课程名称网络系统集成指导教师机房名称（I520）上机日期 2012 年10 月26 日上机项目名称实验三：路由器基本配置上机步骤及内容：实验目的 1．使用路由器配置静态路由、RIP协议、OSPF协议。掌握使用相应协议实现路由选择的方法。 2．通过在路由器上使用相关的检查和排错命令学习如何维护和分析RIP协议、OSPF 协议。 3．通过RIPv1和RIPv2配置过程的不同体会两者在实际使用上的差别。实验要求 1．在路由器上配置静态路由、缺省路由； 2．配置RIP协议；配置OSPF协议； 3．掌握路由器接口配置，测试网络连通性； 4．理解每一步实验的作用，把实验的每一步所完成的任务详细地叙述清楚，并记录在实验报告上； 5．实验结束后上缴实验报告实验仪器设备和材料清单 1．具备两个以太网端口的路由器三台，交换机两台； 2．两台具备以太网接口的PC机，分别连接路由器的内外网口，路由器端口、PC的IP地址可自己分配和设置 3．路由器拓扑结构成环状或线性连接； 4. 实验组网图。

图实验组网图实验内容 1．完成路由器的基本端口配置，静态路由，缺省路由配置； 2．RIPv2协议配置； 3. OSPF协议配置； 4．通过在路由器上使用相关的检查和排错命令学习如何维护和分析RIP协议、OSPF 协议。实验步骤 1．作路由器的端口IP地址配置，代码如下：路由器r1 [H3C]sysname r1 [r1]int e0/1 [r1-Ethernet0/1]ip add 24 [r1-Ethernet0/1] %Oct 28 16:55:42:805 2012 r1 IFNET/4/UPDOWN: Line protocol on the interface Ethernet0/1 is UP [r1-Ethernet0/1]int e0/0 [r1-Ethernet0/0]ip add 24 路由器r2 [H3C]sysname r2 [r2]int e0/1 [r2-Ethernet0/1]ip add 24 [r2-Ethernet0/1] %Oct 28 16:40:33:465 2012 r2 IFNET/4/UPDOWN: Line protocol on the interface Ethernet0/1 is UP [r2-Ethernet0/1]int e0/0

路由器的工作原理及性能指标

路由器的工作原理及性能路由器是一种典型的网络层设备。它是两个局域网之间接帧传输数据，在O SI／RM之中被称之为中介系统，完成网络层中继或第三层中继的任务。路由器负责在两个局域网的网络层间接帧传输数据，转发帧时需要改变帧中的地址。它在OSI／RM中的位置如图1所示。一、原理与作用路由器（Router）是用于连接多个逻辑上分开的网络，所谓逻辑网络是代表一个单独的网络或者一个子网。当数据从一个子网传输到另一个子网时，可通过路由器来完成。因此，路由器具有判断网络地址和选择路径的功能，它能在多网络互联环境中，建立灵活的连接，可用完全不同的数据分组和介质访问方法连接各种子网，路由器只接受源站或其他路由器的信息，属网络层的一种互联设备。它不关心各子网使用的硬件设备，但要求运行与网络层协议相一致的软件。路由器分本地路由器和远程路由器，本地路由器是用来连接网络传输介质的，如光纤、同轴电缆、双绞线；远程路由器是用来连接远程传输介质，并要求相应的设备，如电话线要配调制解调器，无线要通过无线接收机、发射机。一般说来，异种网络互联与多个子网互联都应采用路由器来完成。路由器的主要工作就是为经过路由器的每个数据帧寻找一条最佳传输路径，并将该数据有效地传送到目的站点。由此可见，选择最佳路径的策略即路由算法是路由器的关键所在。为了完成这项工作，在路由器中保存着各种传输路径的相

关数据——路径表（Routing Table），供路由选择；时使用。路径表中保存着子网的标志信息、网上路由器的个数和下一个路由器的名字等内容。路径表可以是由系统管理员固定设置好的，也可以由系统动态修改，可以由路由器自动调整，也可以由主机控制。 1．静态路径表由系统管理员事先设置好固定的路径表称之为静态（static）路径表，一般是在系统安装时就根据网络的配置情况预先设定的，它不会随未来网络结构的改变而改变。 2．动态路径表动态（Dynamic）路径表是路由器根据网络系统的运行情况而自动调整的路径表。路由器根据路由选择协议（Routing Protocol）提供的功能，自动学习和记忆网络运行情况，在需要时自动计算数据传输的最佳路径。二、路由器的优缺点 1．优点适用于大规模的网络；复杂的网络拓扑结构，负载共享和最优路径；能更好地处理多媒体；安全性高；隔离不需要的通信量；节省局域网的频宽；减少主机负担。 2．缺点它不支持非路由协议；安装复杂；价格高。三、路由器的功能（1）在网络间截获发送到远地网段的报文，起转发的作用。（2）选择最合理的路由，引导通信。为了实现这一功能，路由器要按照某种路由通信协议，查找路由表，路由表中列出整个互联网络中包含的各个节点，以及节点间的路径情况和与它们相联系的传输费用。如果到特定的节点有一条以上路径，则基于预先确定的准则选择最优（最经济）的路径。由于各种网络段和其相互连接情况可能发生变化，因此路由情况的信息需要及时更新，这是由所使用的路由信息协议规定的定时更新或者按变化情况更新来完成。网络中的每个路由器按照这一规则动态地更新它所保持的路由表，以便保持有效的路由信息。（3）路由器在转发报文的过程中，为了便于在网络间传送报文，按照预定的规则把大的数据包分解成适当大小的数据包，到达目的地后再把分解的数据包

2.实验二、路由器的日常维护与管理(详解版)资料

实验二、路由器的日常维护与管理 1、实验目的通过本实验可以： 1)掌握路由接口IP地址的配置及接口的激活 2)掌握telnet的使用及配置 3)熟悉CDP的使用及配置 4)了解基本的debug调试命令 5)理解并实现设备之间的桥接 6)绘制基本的网络拓扑图 7)掌握数据通信的可达性测试 8)掌握路由器的密码恢复步骤 9)熟悉TFTP服务器的使用 10)掌握路由器配置文件的备份与恢复 11)掌握路由器IOS文件的备份、升级和恢复 2、拓扑结构路由器的日常维护与管理拓扑 3、实验需求 1)设置主机名，并关闭域名解析、关闭同步、关闭控制台超时 2)使用相关命令查看当前配置信息，并保存当前的配置文件 3)桥接PC到机架路由器，配置路由器接口的IP地址，开启接口并测试路由器与本机的连通性，开启debug观察现象 4)使用TFTP传送文件，分别实现拷贝路由器的配置文件到TFTP服务器和从TFTP

服务器导入配置文件到路由器 a)将当前配置文件保存到本机，并在本机打开并修改所保存的配置文件 b)将当前配置文件保存到同学电脑 c)将保存在本机的配置文件导入所使用的设备 d)将同学保存的配置文件导入所使用的设备 e)注意观察导入配置文件时设备提示信息的变化 5)使用TFTP备份路由器的IOS文件 6)IOS文件的升级和灾难恢复 7)路由器的密码恢复 8)使用CDP发现邻居设备，实现telnet远程登入到邻居设备 9)用主机名绑定IP，实现telnet主机名与telnet IP一致的效果 10)实现GNS3模拟器与本机之间的桥接，并将模拟器的配置文件保存到本机4、参考配置 1.配置基本命令设置主机名、关闭域名解析、同步、控制台超时 Router>enable Router#config terminal Enter configuration commands, one per line. End with CNTL/Z. Router(config)#hostname r14//命名主机 r14(config)#no ip domain-lookup//关闭域名解析 r14(config)#line console 0 r14(config-line)#logging synchronous //关闭日志同步 r14(config-line)#exec-timeout 0 0//关闭控制台超时 r14(config-line)#end r14# 2.查看当前配置信息，并保存当前的配置文件 r14#show running-config //查看当前运行的配置文件 Building configuration... Current configuration : 420 bytes ! version 12.2 service timestamps debug uptime service timestamps log uptime no service password-encryption ! hostname r14 ! ! ip subnet-zero ! ! no ip domain-lookup !

小米路由器端口映射该怎么设置才好

小米路由器端口映射该怎么设置才好路由器系统构成了基于TCP/IP 的国际互联网络Internet 的主体脉络，也可以说，路由器构成了Internet的骨架。小米路由器上不了网，该怎么设置，在网上说要设置端口映射，但是不知道怎么设置，于是在学着设置的时候写了这篇文章，需要的朋友可以参考下方法步骤 1、打开小米路由器的登陆界面，地址可以查看小米路由器背面的贴标。 2、查看IP地址：登陆进去之后，找到网络高级设置查看网络结构。一般就一级 3、找到上网设置---上网信息。查看公网地址是固定的还是动态的(动态的要借助花生壳做解析)

4、端口映射：点击高级设置---端口转发。添加规则开始设置映射 5、添加如图，端口信息和映射的端口(一般是11对应) 6、依次添加端口映射，比如邮件常见是25 110 143等端口。可以点击删除进行删除配置 7、应用规则：特别要注意提醒一下，设置好端口转发规则或者修改。都要点击一下【应用规则】 8、配置好端口映射之后，可以通过telnet或者站长工具扫描端口查看是否正常相关阅读：路由器安全特性关键点由于路由器是网络中比较关键的设备，针对网络存在的各种安全隐患，路由器必须具有如下的安全特性： (1)可靠性与线路安全可靠性要求是针对故障恢复和负载能力而提出来的。对于路由器来说，可靠性主要体现在接口故障和网络流量增大两种情况下，为此，备份是路由器不可或缺的手段之一。

当主接口出现故障时，备份接口自动投入工作，保证网络的正常运行。当网络流量增大时，备份接口又可承当负载分担的任务。 (2)身份认证路由器中的身份认证主要包括访问路由器时的身份认证、对端路由器的身份认证和路由信息的身份认证。 (3)访问控制对于路由器的访问控制，需要进行口令的分级保护。有基于IP地址的访问控制和基于用户的访问控制。 (4)信息隐藏与对端通信时，不一定需要用真实身份进行通信。通过地址转换，可以做到隐藏网内地址，只以公共地址的方式访问外部网络。除了由内部网络首先发起的连接，网外用户不能通过地址转换直接访问网内资源。 (5)数据加密为了避免因为数据窃听而造成的信息泄漏，有必要对所传输的信息进行加密，只有与之通信的对端才能对此密文进行解密。通过对路由器所发送的报文进行加密，即使在Internet上进行传输，也能保证数据的私有性、完整性以及报文内容的真实性。 (6)攻击探测和防范

路由基础配置命令

路由基础配置命令UTP双绞线类型：直通线：交叉线：广域网线缆连接类型： DTE： DCE: 时钟速率：路由器的接口：（作用） 1. 广域网接口: serial 2. 内网接口： fastethernet 3. 管理接口： console （控制台） AUX(备份) 连接两个设备： 1. 物理连接 2. 逻辑连接配置网络设备： 1. 管理属性：用户名密码描述警告 2. 协议地址：IP IPX 3. 协议策略：vlan 静态访问控制交换机直接可以应用路由器需要初始配置才能应用设备启动过程： 1. 加电自检 2. 查找加载操作系统 3. 查找加载配置文件配置网络设备方式： 1. 初始配置：console 2. 初始配置后通过interface（拥有ip地址的接口）： 1）telnet 2）TFTP 3）WEB 4）网络管理工具： SNA SDM 配置直接应用到内存

登陆路由器： Router> Router>enable /*进入特权模式 Router# Router#disable /*退出特权模式 Router> Router>logout /*退出路由器 Router>exit /*退出路由器路由器IOS帮助功能：？的三个用法 1/ 直接问号 2/ Router#cl? clear clock Router#cl 3/ Router#clock % Incomplete command. Router#clock ? set Set the time and date Router#clock 问号的帮助功能：（查找路由器设置时间的命令并设置时间） Router#cl? clear clock Router#clock % Incomplete command. Router#clock ? set Set the time and date Router#clock set % Incomplete command. Router#clock set ? hh:mm:ss Current Time Router#clock set 11:04:50 % Incomplete command. Router#clock set 11:04:50 ? <1-31> Day of the month MONTH Month of the year Router#clock set 11:04:50 15

网工作业之交换机和路由器配置过程总结(1)

网工大作业学院：专业: 姓名：学号：

交换机和路由器配置过程总结第一部分交换机配置一、概述一层、二层交换机工作在数据链路层，三层交换机工作在网络层，最常见的是以太网交换机。交换机一般具有用户模式、配置模式、特权模式、全局配置模式等模式。二、基本配置命令(CISCO) Switch >enable 进入特权模式 Switch #config terminal 进入全局配置模式 Switch (config)#hostname 设置交换机的主机名 Switch(config)#enable password 进入特权模式的密码（明文形式保存） Switch(config)#enable secret 加密密码（加密形式保存）（优先） Switch(config)#ip default-gateway 配置交换机网关 Switch(config)#show mac-address-table 查看MAC地址 Switch(config)logging synchronous 阻止控制台信息覆盖命令行上的输入 Switch(config)no ip domain-lookup 关闭DNS查找功能 Switch(config)exec-timeout 0 0 阻止会话退出使用Telnet远程式管理 Switch (config)#line vty 0 4 进入虚拟终端 Switch (config-line)# password 设置登录口令 Switch (config-line)# login 要求口令验证控制台口令 switch(config)#line console 0 进入控制台口 switch(config-line)# password xx switch(config-line)# 设置登录口令login 允许登录恢复出厂配置 Switch(config)#erase startup-config Switch(config)delete vlan.dat Vlan基本配置 Switch#vlan database 进去vlan配置模式 Switch(vlan)#vlan 号码name 名称创建vlan及vlan名 Switch(vlan)#vlan号码mtu数值修改MTU大小 Switch(vlan)#exit 更新vlan数据并推出 Switch#show vlan 查看\验证 Switch#copy running-config startup-config 保存配置 VLAN 中添加删除端口 Switch#config terminal 进入全局配置 Switch(config)#interface fastethernet0/1 进入要分配的端口 Switch(config-if)#Switchport mode access 定义二层端口 Switch(config-if)#Switchport acces vlan 号把端口分给一个vlan Switch(config-if)#switchport mode trunk 设置为干线 Switch(config-if)#switchport trunk encapsulation dot1q 设置vlan 中继协议

内网通过路由器端口映射实现外网的远程访问

宽带路由器端口映射远程控制电脑利用宽带路由器端口映射或者是DMZ主机，再配合微软自带的终端服务实现在家里远程操作办公室电脑。图解宽带路由器端口映射远程控制电脑，关于很多的宽带路由器端口映射问题，都是涉及到远程控制电脑的，所以要先了解远程控制电脑的一些基本知识才能为以后的日志配置做好准备。宽带路由器端口映射方案一：由于小的电脑在局域网，只有利用反弹型远程控制软件（如：灰鸽子等）；但是，考虑到它属于后门程序，使用也比较复杂，便放弃这个方案。宽带路由器端口映射方案二：利用宽带路由器端口映射或者是DMZ 主机，再配合微软自带的终端服务实现在家里远程操作办公室电脑。宽带路由器端口映射实现步骤： 1、查看小办公室电脑IP网关地址（一般情况下网关地址就是路由器LAN口地址，这个就是要访问的网电脑），在浏览器里面输入192.168.1.1。输入相应的用户以及密码进入路由器（默认情况下都是admin），如下图所示：

2、依次点击网络参数——WAN口设置。一般情况下WAN口连接类型是PPPoE，请注意选定“自动连接，在开机和断线后自动进行连接”，再点击保存，如下图所示：

3、为了方便，不使用第二步，直接点击设置向导：如下图点击下一步，出现：如果是宽带拨号上网选择第二个，PPPoE（ADSL虚拟拨号），下一步输入上网账号和密码：

输入完成，点击下一步，设置无线上网账号和密码：完成： 4、接下来设置路由器映射：依次点击转发规则——虚拟服务器，添加端口和IP地址

继续添加远程桌面端口3389。同时，还可以通过DMZ主机实现；但是，这个方法有很大的风险，容易受到来自外部的攻击。打开DMZ主机的方法如下；依次点击转发规则——DMZ主机，填写相应的IP地址，并勾选启用，如下图所示：

路由器实验1 路由器基本配置

实验1 路由器的基本配置实验目标掌握路由器几种常用配置方法和基本配置命令；掌握采用 Console线缆配置路由器的方法；掌握采用 Telnet方式配置路由器的方法；熟悉路由器不同的命令行操作模式以及各种模式之间的切换；实验背景作为网络管理员，须对新购进的路由器设备进行了初次配置后，希望以后在办公室或出差时可以对设备进行远程管理，现要在路由器上做适当配置。实验拓扑: (见本实验中所附的图R1.JPG) 实验设备 Router_2621 1 台；PC 1 台；交叉线；配置线说明：交叉线：路由器与计算机相连路由器与交换机相连直连线：计算机与交换机相连实验要求: 按照图R1.JPG创建packet tracer 拓扑图后进行以下各项操作: （1）在计算机上启用超级终端，并配置超级终端的参数(模拟器最好按默认的)，是计算机（RS 232）与路由器通过console 接口建立连接；更改路由器的主机名为R1；计算机（FastEthernet）与路由器FastEthernet0/0接口建立连接. （2）配置路由器的管理的 IP地址为192.168.1.254，并为 Telnet 用户配置用户密码5ijsj和enable 登录口令123456。配置计算机的 IP地址192.168.1.1（与路由器管理 IP地址在同一个网段），通过网线将计算机和路由器相连，通过计算机 Telnet到路由器上对交换机进行验证；（3）先为路由器添加有串行口的模块,然后配置路由器的串行口S0/0为DCE端,并设定其时钟频率是64000 ,接口IP地址和网络掩码为 192.168.10.1,255.255.255.0. （4）保存配置信息，显示配置信息；（5）显示历史命令。 1、按上图连线，更改路由器的主机名为R1； Router#conf terminal Router(config)#hostname R1 2、（1）配置Int F0/0 IP Router(config)#interface f0/0 Router(config-if)#ip address 192.168.1.254 255.255.255.0 Router(config-if)#no shutdown %LINK-5-CHANGED: Interface FastEthernet0/0, changed state to up