--- 1/draft-ietf-idmr-traceroute-ipm-05.txt 2006-02-04 23:29:02.000000000 +0100
+++ 2/draft-ietf-idmr-traceroute-ipm-06.txt 2006-02-04 23:29:02.000000000 +0100
@@ -1,17 +1,17 @@
 Internet Engineering Task Force  Inter-Domain Multicast Routing Working Group
 INTERNET-DRAFT                                                      W. Fenner
-draft-ietf-idmr-traceroute-ipm-05.txt                           AT&T Research
+draft-ietf-idmr-traceroute-ipm-06.txt                           AT&T Research
                                                                     S. Casner
                                                                 Cisco Systems
-                                                                June 25, 1999
-                                                        Expires December 1999
+                                                               March 10, 2000
+                                                          Expires August 2000

               A "traceroute" facility for IP Multicast.

 Status of this Memo

 This document is an Internet Draft and is in full conformance with all
 provisions of Section 10 of RFC2026. Internet Drafts are working docu-
 ments of the Internet Engineering Task Force (IETF), its areas, and its
 working groups. Note that other groups may also distribute working doc-
 uments as Internet-Drafts.
@@ -24,27 +24,26 @@
 The list of current Internet-Drafts can be accessed at
 http://www.ietf.org/ietf/1id-abstracts.txt.

 The list of Internet-Draft Shadow Directories can be accessed at
 http://www.ietf.org/shadow.html.

 Distribution of this document is unlimited.

                                 Abstract

-     This draft describes the IGMP multicast traceroute facility. As
-     the deployment of IP multicast has spread, it has become clear that
-     a method for tracing the route that a multicast IP packet takes
-     from a source to a particular receiver is absolutely required.
+     This draft describes the IGMP multicast traceroute facility.
      Unlike unicast traceroute, multicast traceroute requires a special
      packet type and implementation on the part of routers. This speci-
-     fication describes the required functionality.
+     fication describes the required functionality in multicast routers,
+     as well as how management applications can use the new router func-
+     tionality.

 This document is a product of the Inter-Domain Multicast Routing
 working group within the Internet Engineering Task Force.
 Comments are solicited and should be addressed to the working group's
 mailing list at idmr@cs.ucl.ac.uk and/or the author(s).

 Key Words

 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
@@ -276,126 +275,125 @@
      if unknown.

 5.4. Previous-Hop Router Address

      This field specifies the router from which this router expects
      packets from this source. This may be a multicast group if the
      previous hop is not known because of the workings of the multicast
      routing protocol. However, it should be 0 if the incoming inter-
      face address is unknown.

-5.5. Input packet count on incoming interface
+5.5. Packet counts
+
+     Note that these packet counts SHOULD be as up to date as possible.
+     If packet counts are not being maintained on the processor that
+     handles the traceroute request in a multi-processor router archi-
+     tecture, the packet SHOULD be delayed while the counters are gath-
+     ered from the remote processor(s). If this occurs, the Query
+     Arrival Time should be updated to reflect the time at which the
+     packet counts were learned.
+
+5.6. Input packet count on incoming interface

      This field contains the number of multicast packets received for
      all groups and sources on the incoming interface, or 0xffffffff if
-     no count can be reported.
+     no count can be reported. This counter should have the same value
+     as ifInMulticastPkts from the IF-MIB for this interface.

-5.6. Output packet count on outgoing interface
+5.7. Output packet count on outgoing interface

      This field contains the number of multicast packets that have been
-     transmitted for all groups and sources on the outgoing interface,
-     or 0xffffffff if no count can be reported.
+     transmitted or queued for transmission for all groups and sources
+     on the outgoing interface, or 0xffffffff if no count can be
+     reported. This counter should have the same value as ifOutMulti-
+     castPkts from the IF-MIB for this interface.

-5.7. Total number of packets for this source-group pair
+5.8. Total number of packets for this source-group pair

      This field counts the number of packets from the specified source
      forwarded by this router to the specified group, or 0xffffffff if
      no count can be reported. If the S bit is set, the count is for
      the source network, as specified by the Src Mask field. If the S
      bit is set and the Src Mask field is 63, indicating no source-spe-
      cific state, the count is for all sources sending to this group.
+     This counter should have the same value as ipMRoutePkts from the
+     IPMROUTE-STD-MIB for this forwarding entry.

-5.8. Rtg Protocol: 8 bits
+5.9. Rtg Protocol: 8 bits

      This field describes the routing protocol in use between this
      router and the previous-hop router. Specified values include:

-     1    DVMRP
-     2    MOSPF
-     3    PIM
-     4    CBT
-     5    PIM using special routing table
-     6    PIM using a static route
-     7    DVMRP using a static route
-     8    PIM using MBGP (aka BGP4+) route
-     9    CBT using special routing table
-     10   CBT using a static route
-     11   PIM using state created by Assert processing
+     l l. 1 DVMRP 2 MOSPF 3 PIM 4 CBT 5 PIM using spe-
+     cial routing table 6 PIM using a static route 7 DVMRP using a
+     static route 8 PIM using MBGP (aka BGP4+) route 9 CBT using
+     special routing table 10 CBT using a static route 11 PIM using
+     state created by Assert processing

-5.9. FwdTTL: 8 bits
+5.10. FwdTTL: 8 bits

      This field contains the TTL that a packet is required to have
      before it will be forwarded over the outgoing interface.

-5.10. MBZ: 1 bit
+5.11. MBZ: 1 bit

      Must be zeroed on transmission and ignored on reception.

-5.11. S: 1 bit
+5.12. S: 1 bit

      If this bit is set, it indicates that the packet count for the
      source-group pair is for the source network, as determined by mask-
      ing the source address with the Src Mask field.

-5.12. Src Mask: 6 bits
+5.13. Src Mask: 6 bits

      This field contains the number of 1's in the netmask this router
      has for the source (i.e. a value of 24 means the netmask is
      0xffffff00). If the router is forwarding solely on group state,
      this field is set to 63 (0x3f).

-5.13. Forwarding Code: 8 bits
+5.14. Forwarding Code: 8 bits

      This field contains a forwarding information/error code. Defined
      values include:

-     Value  Name            Description
-     --------------------------------------------------------------------
-     0x00   NO_ERROR        No error
-     0x01   WRONG_IF        Traceroute request arrived on an interface to
-                            which this router would not forward for this
-                            source,group,destination.
-
-     0x02   PRUNE_SENT      This router has sent a prune upstream which
-                            applies to the source and group in the tracer-
-                            oute request.
-     0x03   PRUNE_RCVD      This router has stopped forwarding for this
-                            source and group in response to a request from
-                            the next hop router.
-     0x04   SCOPED          The group is subject to administrative scoping
-                            at this hop.
-     0x05   NO_ROUTE        This router has no route for the source or
-                            group and no way to determine a potential
-                            route.
+     expand; l l lw(3i) . Value Name Description _
+     0x00 NO_ERROR No error 0x01 WRONG_IF T{ Traceroute request
+     arrived on an interface to which this router would not forward for
+     this source,group,destination. T} 0x02 PRUNE_SENT T{ This
+     router has sent a prune upstream which applies to the source and
+     group in the traceroute request. T} 0x03 PRUNE_RCVD T{ This
+     router has stopped forwarding for this source and group in response
+     to a request from the next hop router. T} 0x04 SCOPED T{ The
+     group is subject to administrative scoping at this hop. T}
+     0x05 NO_ROUTE T{ This router has no route for the source or group
+     and no way to determine a potential route. T}
      0x06   WRONG_LAST_HOP  This router is not the proper last-hop router.
-     0x07   NOT_FORWARDING  This router is not forwarding this
-                            source,group out the outgoing interface for an
-                            unspecified reason.
-     0x08   REACHED_RP      Reached Rendez-vous Point or Core
-     0x09   RPF_IF          Traceroute request arrived on the expected RPF
-                            interface for this source,group.
-     0x0A   NO_MULTICAST    Traceroute request arrived on an interface
-                            which is not enabled for multicast.
-     0x0B   INFO_HIDDEN     One or more hops have been hidden from this
-                            trace.
-     0x81   NO_SPACE        There was not enough room to insert another
-                            response data block in the packet.
-     0x82   OLD_ROUTER      The previous hop router does not understand
-                            traceroute requests.
-     0x83   ADMIN_PROHIB    Traceroute is administratively prohibited.
+     0x07 NOT_FORWARDING T{ This router is not forwarding this
+     source,group out the outgoing interface for an unspecified reason.
+     T} 0x08 REACHED_RP Reached Rendez-vous Point or Core
+     0x09 RPF_IF T{ Traceroute request arrived on the expected RPF
+     interface for this source,group. T} 0x0A NO_MULTICAST T{ Tracer-
+     oute request arrived on an interface which is not enabled for mul-
+     ticast. T} 0x0B INFO_HIDDEN T{ One or more hops have been hid-
+     den from this trace. T} 0x81 NO_SPACE T{ There was not enough
+     room to insert another response data block in the packet. T}
+     0x82 OLD_ROUTER T{ The previous hop router does not understand
+     traceroute requests. T} 0x83 ADMIN_PROHIB Traceroute is adminis-
+     tratively prohibited.

      Note that if a router discovers there is not enough room in a
      packet to insert its response, it puts the 0x81 error code in the
      previous router's Forwarding Code field, overwriting any error the
-     previous router placed there. It is expected that a multicast
-     traceroute client, upon receiving this error, will restart the
-     trace at the last hop listed in the packet.
+     previous router placed there. A multicast traceroute client, upon
+     receiving this error, MAY restart the trace at the last hop listed
+     in the packet.

      The 0x80 bit of the Forwarding Code is used to indicate a fatal
      error. A fatal error is one where the router may know the previous
      hop but cannot forward the message to it.

 6. Router Behavior

 All of these actions are performed in addition to (NOT instead of) for-
 warding the packet, if applicable. E.g. a multicast packet that has TTL
 remaining MUST be forwarded normally, as MUST a unicast packet that has
@@ -416,21 +414,25 @@
      the given source onto that subnet.

      If the router determines that it is not the proper last-hop router,
      or it cannot make that determination, it does one of two things
      depending if the Query was received via multicast or unicast. If
      the Query was received via multicast, then it MUST be silently
      dropped. If it was received via unicast, a forwarding code of
      WRONG_LAST_HOP is noted and processing continues as in section 6.2.

      Duplicate Query messages as identified by the tuple (IP Source,
-     Query ID) SHOULD be ignored.
+     Query ID) SHOULD be ignored. This MAY be implemented using a sim-
+     ple 1-back cache (i.e. remembering the IP source and Query ID of
+     the previous Query message that was processed, and ignoring future
+     messages with the same IP Source and Query ID). Duplicate Request
+     messages MUST NOT be ignored in this manner.

 6.1.2. Normal Processing

      When a router receives a traceroute Query and it determines that it
      is the proper last-hop router, it treats it like a traceroute
      Request and performs the steps listed in section 6.2.

 6.2. Traceroute Request

      A traceroute Request is a traceroute message with some number of
@@ -446,23 +448,27 @@

 6.2.2. Normal Processing

      When a router receives a traceroute Request, it performs the fol-
      lowing steps. Note that it is possible to have multiple situations
      covered by the Forwarding Codes. The first one encountered is the
      one that is reported, i.e. all "note forwarding code N" should be
      interpreted as "if forwarding code is not already set, set forward-
      ing code to N".

-     1. Insert a new response block into the packet and fill in the
-        Query Arrival Time, Outgoing Interface Address, Output Packet
-        Count, and FwdTTL.
+     1. If there is room in the current buffer (or the router can effi-
+        ciently allocate more space to use), insert a new response
+        block into the packet and fill in the Query Arrival Time, Out-
+        going Interface Address, Output Packet Count, and FwdTTL. If
+        there was no room, fill in the response code "NO_SPACE" in the
+        *previous* hop's response block, and forward the packet to the
+        requester as described in "Forwarding Traceroute Requests".

      2. Attempt to determine the forwarding information for the source
         and group specified, using the same mechanisms as would be used
         when a packet is received from the source destined for the
         group. State need not be instantiated, it can be "phantom"
         state created only for the purpose of the trace.

         If using a shared-tree protocol and there is no source-specific
         state, or if the source is specified as 0xFFFFFFFF, group state
         should be used. If there is no group state or the group is
@@ -518,34 +524,35 @@

 6.3. Traceroute response

      A router must forward all traceroute response packets normally,
      with no special processing. If a router has initiated a traceroute
      with a Query or Request message, it may listen for Responses to
      that traceroute but MUST still forward them as well.

 6.4. Forwarding Traceroute Requests

-     If the Previous-hop router is known for the source and group (or,
-     if no group is specified, the previous-hop router for the source,
-     or if no source is specified, the previous-hop router for the
-     group) and the number of response blocks is less than the number
-     requested, the packet is sent to that router. If the Incoming
-     Interface is known but the Previous-hop router is not known, the
-     packet is sent to an appropriate multicast address on the Incoming
-     Interface. The appropriate multicast address may depend on the
-     routing protocol in use, MUST be a link-scoped group (i.e.
-     224.0.0.x), MUST NOT be ALL-SYSTEMS.MCAST.NET (224.0.0.1) and may
-     be ALL-ROUTERS.MCAST.NET (224.0.0.2) if the routing protocol in use
-     does not define a more appropriate group. Otherwise, it is sent to
-     the Response Address in the header, as described in "Sending
-     Traceroute Responses".
+     If the Previous-hop router is known for this request and the number
+     of response blocks is less than the number requested, the packet is
+     sent to that router. If the Incoming Interface is known but the
+     Previous-hop router is not known, the packet is sent to an appro-
+     priate multicast address on the Incoming Interface. The appropri-
+     ate multicast address may depend on the routing protocol in use,
+     MUST be a link-scoped group (i.e. 224.0.0.x), MUST NOT be ALL-SYS-
+     TEMS.MCAST.NET (224.0.0.1) and MAY be ALL-ROUTERS.MCAST.NET
+     (224.0.0.2) if the routing protocol in use does not define a more
+     appropriate group. Otherwise, it is sent to the Response Address
+     in the header, as described in "Sending Traceroute Responses".
+     Note that it is not an error for the number of response blocks to
+     be greater than the number requested; such a packet should simply
+     be forwarded to the requester as described in "Sending Traceroute
+     Responses".

 6.5. Sending Traceroute Responses

 6.5.1. Destination Address

      A traceroute response must be sent to the Response Address in the
      traceroute header.

 6.5.2. TTL

@@ -560,21 +567,21 @@
      interface addresses as the source address. Since some multicast
      routing protocols forward based on source address, if the Response
      Address is multicast, the router MUST use an address that is known
      in the multicast routing table if it can make that determination.

 6.5.4. Sourcing Multicast Responses

      When a router sources a multicast response, the response packet
      MUST be sent on a single interface, then forwarded as if it were
      received on that interface. It MUST NOT source the response packet
-     individually on each interface, since that causes duplicate pack-
+     individually on each interface, in order to avoid duplicate pack-
      ets.

 6.6. Hiding information

      Information about a domain's topology and connectivity may be hid-
      den from multicast traceroute requests. The exact mechanism is not
      specified here; however, the INFO_HIDDEN forwarding code may be
      used to note that, for example, the incoming interface address and
      packet count are for the entrance to the domain and the outgoing
      interface address and packet count are the exit from the domain.
@@ -628,23 +635,23 @@
 Details of performing a multicast traceroute:

 7.2. Last hop router

      The traceroute querier may not know which is the last hop router,
      or that router may be behind a firewall that blocks unicast packets
      but passes multicast packets. In these cases, the traceroute
      request should be multicasted to the group being traced (since the
      last hop router listens to that group). All routers except the
      correct last hop router should ignore any multicast traceroute
-     request received via multicast. Traceroute requests which are mul-
-     ticasted to the group being traced must include the Router Alert IP
-     option [Katz97].
+     request received via multicast. Traceroute requests which are
+     multicasted to the group being traced must include the Router Alert
+     IP option [Katz97].

      Another alternative is to unicast to the trace destination.
      Traceroute requests which are unicasted to the trace destination
      must include the Router Alert IP option [Katz97], in order that the
      last-hop router is aware of the packet.

      If the traceroute querier is attached to the same router as the
      destination of the request, the traceroute request may be multicas-
      ted to 224.0.0.2 (ALL-ROUTERS.MCAST.NET) if the last-hop router is
      not known.
@@ -725,26 +732,27 @@
 7.7. Multicast Traceroute and shared-tree routing protocols

      When using shared-tree routing protocols like PIM-SM and CBT, a
      more advanced client may use multicast traceroute to determine
      paths or potential paths.

 7.7.1. PIM-SM

 When a multicast traceroute reaches a PIM-SM RP and the RP does not for-
-ward the trace on, it means that the RP has not performed a source-spe-
-cific join so there is no more state to trace. However, the path that
-traffic would use if the RP did perform a source-specific join can be
-traced by setting the trace destination to the RP, the trace source to
-the traffic source, and the trace group to 0. This trace Query may be
-unicasted to the RP.
+ward the trace on, it means that the RP has not performed a source-
+
+specific join so there is no more state to trace. However, the path
+that traffic would use if the RP did perform a source-specific join can
+be traced by setting the trace destination to the RP, the trace source
+to the traffic source, and the trace group to 0. This trace Query may
+be unicasted to the RP.

 7.7.2. CBT

 When a multicast traceroute reaches a CBT Core, it must simply stop since
 CBT does not have source-specific state. However, a second trace can be
 performed, setting the trace destination to the traffic source, the
 trace group to the group being traced, and the trace source to the Core
 (or to 0, since CBT does not have source-specific state). This trace
 Query may be unicasted to the Core.

 There are two possibilities when combining the two traces:
@@ -796,21 +804,21 @@
      The forwarding error code can tell if a group is unexpectedly
      pruned or administratively scoped.

 8.2. TTL problems

      By taking the maximum of (hops from source + forwarding TTL thresh-
      old) over all hops, you can discover the TTL required for the
      source to reach the destination.

-8.3. Congestion
+8.3. Packet Loss

      By taking two traces, you can find packet loss information by com-
      paring the difference in input packet counts to the difference in
      output packet counts at the previous hop. On a point-to-point
      link, any difference in these numbers implies packet loss. Since
      the packet counts may be changing as the trace query is propagat-
      ing, there may be small errors (off by 1 or 2) in these statistics.
      However, these errors will not accumulate if multiple traces are
      taken to expand the measurement period. On a shared link, the
      count of input packets can be larger than the number of output
@@ -827,143 +835,178 @@
      the specified receiver via the specified group. This measure is
      not affected by shared links.

      On a point-to-point link that is a multicast tunnel, packet loss is
      usually due to congestion in unicast routers along the path of that
      tunnel. On native multicast links, loss is more likely in the out-
      put queue of one hop, perhaps due to priority dropping, or in the
      input queue at the next hop. The counters in the response data do
      not allow these cases to be distinguished. Differences in packet
      counts between the incoming and outgoing interfaces on one node
-     cannot generally be used to measure queue overflow in the node
-     because some packets may be routed only to or from other interfaces
-     on that node.
-
-     In the multicast extensions for SunOS 4.1.x from Xerox PARC, both
-     the output packet count and the packet forwarding count for the
-     source-group pair are incremented before priority dropping for rate
-     limiting occurs and before the packets are put onto the interface
-     output queue which may overflow. These drops will appear as (posi-
-     tive) loss on the link even though they occur within the router.
-
-     In release 3.3/3.4 of the UNIX multicast extensions, a multicast
-     packet generated on a router will be counted as having come in an
-     interface even though it did not. This can create the appearance
-     of negative loss even on a point-to-point link.
-
-     In releases up through 3.5/3.6, packets were not counted as input
-     on an interface if the reverse-path forwarding check decided that
-     the packets should be dropped. That causes the packets to appear
-     as lost on the link if they were output by the upstream hop. This
-     situation can arise when two routers on the path for the group
-     being traced are connected by a shared link, and the path for some
-     other group does not flow between those two routers because the
-     downstream router receives packets for the other group on another
-     interface, but the upstream router is the elected forwarder to
-     other routers or hosts on the shared link.
+     cannot generally be used to measure queue overflow in the node.

 8.4. Link Utilization

      Again, with two traces, you can divide the difference in the input
      or output packet counts at some hop by the difference in time
      stamps from the same hop to obtain the packet rate over the link.
      If the average packet size is known, then the link utilization can
      also be estimated to see whether packet loss may be due to the rate
      limit or the physical capacity on a particular link being exceeded.

 8.5. Time delay

      If the routers have synchronized clocks, it is possible to estimate
-     propagation and queueing delay from the differences between the
-     timestamps at successive hops.
+     propagation and queuing delay from the differences between the
+     timestamps at successive hops. However, this delay includes con-
+     trol processing overhead, so is not necessarily indicative of the
+     delay that data traffic would experience.

-9. Acknowledgments
+9. Implementation-specific Caveats
+
+Some routers with distributed forwarding architectures may not update
+the main processor's packet counts often enough for the packet counters
+to be meaningful on a small time scale. This can be recognized during a
+periodic trace by seeing positive loss in one trace and negative loss in
+the next, with no (or small) net loss over a longer interval. The sug-
+gested solution to this problem is to simply collect statistics over a
+longer interval.
+
+In the multicast extensions for SunOS 4.1.x from Xerox PARC, which are
+the basis for many UNIX-based multicast routers, both the output packet
+count and the packet forwarding count for the source-group pair are
+incremented before priority dropping for rate limiting occurs and before
+the packets are put onto the interface output queue which may overflow.
+These drops will appear as (positive) loss on the link even though they
+occur within the router.
+
+In release 3.3/3.4 of the UNIX multicast extensions, a multicast packet
+generated on a router will be counted as having come in an interface
+even though it did not. This can create the appearance of negative loss
+even on a point-to-point link.
+
+In releases up through 3.5/3.6, packets were not counted as input on an
+interface if the reverse-path forwarding check decided that the packets
+should be dropped. That causes the packets to appear as lost on the
+link if they were output by the upstream hop. This situation can arise
+when two routers on the path for the group being traced are connected by
+a shared link, and the path for some other group does not flow between
+those two routers because the downstream router receives packets for the
+other group on another interface, but the upstream router is the elected
+forwarder to other routers or hosts on the shared link.
+
+10. Acknowledgments

 This specification started largely as a transcription of Van Jacobson's
 slides from the 30th IETF, and the implementation in mrouted 3.3 by Ajit
 Thyagarajan. Van's original slides credit Steve Casner, Steve Deering,
 Dino Farinacci and Deb Agrawal. A multicast traceroute client, mtrace,
 has been implemented by Ajit Thyagarajan, Steve Casner and Bill Fenner.
 The idea of unicasting a multicast traceroute Query to the destination
 of the trace with Router Alert set is due to Tony Ballardie. The idea of
 the "S" bit to allow statistics for a source subnet is due to Tom
 Pusateri.

-10. IANA Considerations
+11. IANA Considerations

-10.1. Routing Protocols
+11.1. Routing Protocols

      The IANA is responsible for allocating new Routing Protocol codes.
      The Routing Protocol code is somewhat problematic, since in the
      case of protocols like CBT and PIM it must encode both a unicast
      routing algorithm and a multicast tree-building protocol. The
      space was not divided into two fields because it was already small
      and some combinations (e.g. DVMRP) would be wasted.

      Routing Protocol codes should be allocated for any combination of
      protocols that are in common use in the Internet.

-10.2. Forwarding Codes
+11.2. Forwarding Codes

      New Forwarding codes must only be created by an RFC that modifies
      this document's section 7, fully describing the conditions under
      which the new forwarding code is used. The IANA may act as a cen-
      tral repository so that there is a single place to look up forward-
      ing codes and the document in which they are defined.

-11. Security Considerations
+12. Security Considerations

-11.1. Topology discovery
+12.1. Topology discovery

      mtrace can be used to discover any actively-used topology. If your
      network topology is a secret, mtrace may be restricted at the bor-
      der of your domain, using the ADMIN_PROHIB forwarding code.

-11.2. Traffic rates
+12.2. Traffic rates

      mtrace can be used to discover what sources are sending to what
      groups and at what rates. If this information is a secret, mtrace
      may be restricted at the border of your domain, using the
      ADMIN_PROHIB forwarding code.

-11.3. Unicast replies
+12.3. Unicast replies

      The "Response address" field may be used to send a single packet
      (the traceroute Reply packet) to an arbitrary unicast address. It
      is possible to use this facility as a packet amplifier, as a small
      multicast traceroute Query may turn into a large Reply packet.

-12. References
+13. References

 Brad88       Braden, B., D. Borman, C. Partridge, "Computing the Internet
              Checksum", RFC 1071, ISI, September 1988.

 Brad97       Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", RFC 2119/BCP 14, Harvard University,
              March 1997.

 Katz97       Katz, D., "IP Router Alert Option," RFC 2113, Cisco Sys-
              tems, February 1997.

 Pusa99       Pusateri, T., "DVMRP Version 3", work in progress, June
              1999.

 Thal99a      Thaler, D., "PIM MIB", work in progress, June 1999.

 Thal99b      Thaler, D., "DVMRP MIB", work in progress, May 1998.

-13. Authors' Addresses
+14. Authors' Addresses

      William C. Fenner
      AT&T Labs -- Research
      75 Willow Rd.
      Menlo Park, CA 94025
      United States
      Email: fenner@research.att.com

      Stephen L. Casner
      Cisco Systems, Inc.
      170 West Tasman Drive
      San Jose, CA 95134
      United States
      Email: casner@cisco.com
+
+15. Changes from the last revision:
+
+- Changes section added.
+
+- Updated abstract
+
+- Added mention of up-to-date packet counts, in particular allowing
+  the delay of an mtrace packet while the counts are fetched in a
+  distributed architecture.
+
+- Added mention of ifInMulticastPkts, ifOutMulticastPkts, and ipM-
+  RoutePkts for clarification of what counts should be used.
+
+- Note that the dropping of duplicate Queries MAY be a 1-back cache
+  and that duplicate Requests MUST NOT be dropped
+
+- Add no-space processing rule
+
+- Note that it's not an error for there to be more blocks than
+  requested, just send it back after adding yours.
+
+- Clean up some of section 8 - move implementation-specific stuff to
+  a separate section, rename "Congestion" to "Packet Loss", note that
+  time delay isn't actually that useful.
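
Reviewer's illustration (not part of either draft revision): the 1-back
duplicate-Query cache introduced in this revision might be sketched as
below. The names OneBackQueryCache and should_process, and the use of a
string and an integer for the IP source and Query ID, are hypothetical.
The sketch assumes only what the added text states: a duplicate Query is
identified by the tuple (IP Source, Query ID), only the previously
processed Query is remembered, and Request messages are never suppressed
as duplicates.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class OneBackQueryCache:
    """Remembers only the most recently processed (IP source, Query ID)."""

    last: Optional[Tuple[str, int]] = None

    def is_duplicate_query(self, ip_source: str, query_id: int) -> bool:
        """Return True if this Query matches the previous one; else record it."""
        key = (ip_source, query_id)
        if key == self.last:
            return True
        self.last = key
        return False


def should_process(cache: OneBackQueryCache, is_query: bool,
                   ip_source: str, query_id: int) -> bool:
    """Decide whether a traceroute message should be processed.

    Requests bypass the cache entirely, since duplicate Requests
    MUST NOT be ignored; only Queries are checked against it.
    """
    if not is_query:
        return True
    return not cache.is_duplicate_query(ip_source, query_id)
```

Because the cache holds a single entry, a Query from a different source
(or with a different Query ID) displaces the remembered one, so an
earlier Query that arrives again afterwards is processed once more. That
is the trade-off the "MAY be a 1-back cache" wording accepts.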