Inter-Domain Multicast Routing (IDMR)                       A. Ballardie
INTERNET-DRAFT                                 University College London
                                                     S.  Reeve & N. Jain
                                                      Bay Networks, Inc.

                                                          September 1996                                                Consultant

                                                              March 1997

                Core Based Trees (CBT) Multicast Routing

                      -- Protocol Specification --

Status of this Memo

   This document is an Internet Draft.  Internet Drafts are working doc-
   uments of the Internet Engineering Task Force (IETF), its Areas, and
   its Working Groups. Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months. Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time.  It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a "working
   draft" or "work in progress."

   Please check the I-D abstract listing contained in each Internet
   Draft directory to learn the current status of this or any other
   Internet Draft.

Abstract

   This document describes the Core Based Tree (CBT) network layer
   multicast routing protocol. CBT builds a shared multicast
   distribution tree per group, and is suited to inter- and intra-domain
   multicast routing.

   CBT is protocol independent in that it makes use of unicast routing
   to establish paths between senders and receivers.  The CBT
   architecture is described in [1].

   This specification includes an optimization whereby unencapsulated
   (native) IP-style multicasts are forwarded by CBT routers, resulting
   in very good forwarding performance.  This mode of operation is
   called CBT "native mode".  Native mode can only be used in CBT-only
   domains (footnote 1).

   This revision contains two appendices; Appendix A describes simple
   CBT add-on mechanisms for dynamically migrating a CBT tree to one
   whose core is directly attached to a source's subnetwork, thereby
   allowing CBT to emulate shortest-path trees.  Appendix B describes a
   group state aggregation scheme.

   This document is progressing through the IDMR working group of the
   IETF.  CBT related documents include [1, 5, 6]. For all IDMR-related
   documents, see http://www.cs.ucl.ac.uk/ietf/idmr.

   NOTE that core placement and management is not discussed in this
   document.

TABLE OF CONTENTS

  1. Changes Since Previous Revision

  2. Introduction & Terminology

  3. CBT Functional Overview

  4. CBT Protocol Specification Details

     4.1 CBT HELLO Protocol
         4.1.1 Sending HELLOs
         4.1.2 Receiving HELLOs

     4.2 JOIN_REQUEST Processing
         4.2.1 Sending JOIN_REQUESTs
         4.2.2 Receiving JOIN_REQUESTs

     4.3 JOIN_ACK Processing
         4.3.1 Sending JOIN_ACKs
         4.3.2 Receiving JOIN_ACKs

     4.4 QUIT_NOTIFICATION Processing
         4.4.1 Sending QUIT_NOTIFICATIONs
         4.4.2 Receiving QUIT_NOTIFICATIONs

     4.5 CBT ECHO_REQUEST Processing
         4.5.1 Sending ECHO_REQUESTs
         4.5.2 Receiving ECHO_REQUESTs

     4.6 ECHO_REPLY Processing
         4.6.1 Sending ECHO_REPLYs
         4.6.2 Receiving ECHO_REPLYs

     4.7 FLUSH_TREE Processing
         4.7.1 Sending FLUSH_TREE Messages
         4.7.2 Receiving FLUSH_TREE Messages

  5. Timers and Default Values

  6. CBT Packet Formats and Message Types

     6.1 CBT Common Control Packet Header
     6.2 HELLO Packet Format
     6.3 JOIN_REQUEST Packet Format
     6.4 JOIN_ACK Packet Format
     6.5 QUIT_NOTIFICATION Packet Format
     6.6 ECHO_REQUEST Packet Format
     6.7 ECHO_REPLY Packet Format
     6.8 FLUSH_TREE Packet Format

  7. Core Router Discovery

     7.1  Bootstrap Message Format
     7.2  Candidate Core Advertisement Message Format

  8. Interoperability Issues

  Acknowledgements

  References

  Author Information

1.  Changes since Previous Revision (05)

   This revision of the CBT protocol specification differs significantly
   from the previously released revision (05). Consequently, this
   revision represents version 2 of the CBT protocol.  CBT version 2 is
   not, and was not, intended to be backwards compatible with version 1;
   we do not expect this to cause extensive compatibility problems
   because we do not believe CBT is at all widely deployed at this
   stage. However, any future versions of CBT can be expected to be
   backwards compatible with this version.

   The most significant changes to version 2 compared to version 1
   include:

   +o    new LAN mechanisms, including the incorporation of a HELLO
        protocol.

   +o    new simplified packet formats, with the definition of a common
        CBT control packet header.

   +o    a generic intra-domain core discovery ("bootstrap") mechanism,
        to be specified separately, and published soon.

   This specification revision is a complete re-write of the previous
   revision.

_________________________
  1 The term "domain" should be considered synonymous with "routing
domain" throughout, as are the terms "region" and "cloud".

2.  Introduction & Terminology

   The "core router" (or just "core") is a router which is configured
   to act as a "meeting point" between a sender and group receivers.
   The term "rendezvous point (RP)" is used equivalently in some
   contexts [2]. Each core router is configured to know it is a core
   router.

   A router that is part of a CBT distribution tree is known as an
   "on-tree" router. An on-tree router maintains active state for the
   group.

   We refer to a broadcast interface as any interface that supports
   multicast transmission.

   An "upstream" interface (or router) is one which is on the path
   towards the group's core router with respect to this router. A
   "downstream" interface (or router) is one which is on the path away
   from the group's core router with respect to this router.

   Other terminology is introduced in its context throughout the text.

3.  CBT Functional Overview

   The CBT protocol is designed to build and maintain a shared multicast
   distribution tree that spans only those networks and links leading to
   interested receivers.

   To achieve this, a host first expresses its interest in joining a
   group by multicasting an IGMP host membership report [3] across its
   attached link. On receiving this report, a local CBT aware router
   invokes the tree joining process (unless it has already) by
   generating a JOIN_REQUEST message, which is sent to the next hop on
   the path towards the group's core router (how the local router
   discovers which core to join is discussed in section 7). This join
   message must be explicitly acknowledged (JOIN_ACK) either by the core
   router itself, or by another router that is on the unicast path
   between the sending router and the core, which itself has already
   successfully joined the tree.

   The join message sets up transient join state in the routers it
   traverses, and this state consists of <group, incoming interface,
   outgoing interface>. "Incoming interface" and "outgoing interface"
   may be "previous hop" and "next hop", respectively, if the
   corresponding links do not support multicast transmission. "Previous
   hop" is taken from the incoming control packet's IP source address,
   and "next hop" is gleaned from the routing table - the next hop to
   the specified core address. This transient state eventually times
   out unless it is "confirmed" with a join acknowledgement (JOIN_ACK)
   from upstream. The JOIN_ACK traverses the reverse path of the
   corresponding join message, which is possible due to the presence of
   the transient join state.  Once the acknowledgement reaches the
   router that originated the join message, the new receiver can receive
   traffic sent to the group.
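
   As an informal illustration only (not part of this specification),
   the handling of transient join state can be sketched in Python. A
   join records <group, incoming interface, outgoing interface>; a
   JOIN_ACK confirms that state and is passed back over the incoming
   interface; unconfirmed state times out. The names JoinStateTable,
   on_join_request, on_join_ack and expire, the 10 second timeout, and
   the restriction to one pending join per group are all assumptions
   made for the sketch.

```python
import time
from dataclasses import dataclass


@dataclass
class TransientJoin:
    group: str
    incoming: str   # interface (or previous hop) the join arrived on
    outgoing: str   # interface (or next hop) towards the core
    created: float


class JoinStateTable:
    def __init__(self, timeout=10.0):  # timeout value is an assumption
        self.timeout = timeout
        self.pending = {}     # group -> TransientJoin
        self.confirmed = {}   # group -> {"parent": iface, "children": set}

    def on_join_request(self, group, incoming, outgoing, now=None):
        """Record transient state for a join being forwarded upstream."""
        now = time.monotonic() if now is None else now
        self.pending[group] = TransientJoin(group, incoming, outgoing, now)

    def on_join_ack(self, group):
        """Confirm transient state; return the interface over which the
        acknowledgement continues along the reverse path, or None."""
        join = self.pending.pop(group, None)
        if join is None:
            return None
        entry = self.confirmed.setdefault(
            group, {"parent": join.outgoing, "children": set()})
        entry["children"].add(join.incoming)
        return join.incoming

    def expire(self, now=None):
        """Drop pending joins that were never acknowledged."""
        now = time.monotonic() if now is None else now
        stale = [g for g, j in self.pending.items()
                 if now - j.created >= self.timeout]
        for g in stale:
            del self.pending[g]
        return stale
```

   A join that loops back on itself is never acknowledged, so in this
   sketch it is simply removed by expire().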

   Loops cannot be created in a CBT tree because a) there is only one
   active core per group, and b) tree building/maintenance scenarios
   which may lead to the creation of tree loops are avoided.  For
   example, if a router's upstream neighbour becomes unreachable, the
   router immediately "flushes" all of its downstream branches, allowing
   them to individually rejoin if necessary.  Transient unicast loops do
   not pose a threat because a new join message that loops back on
   itself will never get acknowledged, and thus eventually times out.

   The state created in routers by the sending or receiving of a
   JOIN_ACK is bi-directional - data can flow either way along a tree
   "branch", and the state is group specific - it consists of the group
   address and a list of local interfaces over which join messages for
   the group have previously been acknowledged. There is no concept of
   "incoming" or "outgoing" interfaces, though it is necessary to be
   able to distinguish the upstream interface from any downstream
   interfaces. In CBT, these interfaces are known as the "parent" and
   "child" interfaces, respectively. We recommend the parent be
   distinguished as such by a single bit in each multicast forwarding
   cache entry.

   With regard to the information contained in the multicast forwarding
   cache, on link types not supporting native multicast transmission an
   on-tree router must store the address of a parent and any children.
   On links supporting multicast however, parent and any child
   information is represented with local interface addresses (or similar
   identifying information, such as an interface "index") over which the
   parent or child is reachable.

   When a multicast data packet arrives at a router, the router uses the
   group address as an index into the multicast forwarding cache. A copy
   of the incoming multicast data packet is forwarded over each
   interface (or to each address) listed in the entry except the
   incoming interface.
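
   The forwarding rule can be sketched as follows. This is a minimal
   illustration in Python, not a normative algorithm: the cache is
   modelled as a dictionary keyed by group address, the function name
   forward and the interface names are hypothetical, and returning an
   empty list when no entry exists is an assumption of the sketch.

```python
def forward(cache, group, arrival_iface):
    """Return the interfaces over which a copy of the packet is sent."""
    entry = cache.get(group)
    if entry is None:
        return []  # assumption: no state for the group, nothing to do
    # Parent and child interfaces are treated alike; only the interface
    # the packet arrived on is excluded.
    return [iface for iface in entry if iface != arrival_iface]


cache = {"224.1.2.3": ["eth0", "eth1", "eth2"]}
copies = forward(cache, "224.1.2.3", "eth1")
```

   Because tree state is bi-directional, the same lookup serves packets
   arriving from the parent and packets arriving from a child.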

   Each router that comprises a CBT multicast tree, except the core
   router, is responsible for maintaining its upstream link, provided it
   has interested downstream receivers, i.e. the child interface list is
   non-NULL. A child interface is one over which a member host is
   directly attached, or one over which a downstream on-tree router is
   attached.  This "tree maintenance" is achieved by each downstream
   router periodically sending a CBT "keepalive" message (ECHO_REQUEST)
   to its upstream neighbour, i.e. its parent router on the tree. One
   keepalive message is sent to represent entries with the same parent,
   thereby improving scalability on links which are shared by many
   groups.  On multicast capable links, a keepalive is multicast to the
   "all-cbt-routers" group (IANA assigned as 224.0.0.15); this has a
   suppressing effect on any other router for which the link is its
   parent link.  If a parent link does not support multicast
   transmission, keepalives are unicast.

   The receipt of a keepalive message over a valid child interface
   immediately prompts a response (ECHO_REPLY), which is either unicast
   or multicast, as appropriate.

   The ECHO_REQUEST does not contain any group information; the
   ECHO_REPLY does, but only periodically. To maintain consistent
   information between parent and child, the parent periodically
   reports, in an ECHO_REPLY, all groups for which it has state, over
   each of its child interfaces for those groups. This group-carrying
   echo reply is not prompted explicitly by the receipt of an echo
   request message.  A child is notified of the time to expect the next
   echo reply message containing group information in an echo reply
   prompted by a child's echo request. The frequency of parent group
   reporting is at the granularity of minutes.

   It cannot be assumed all of the routers on a multi-access link have a
   uniform view of unicast routing; this is particularly the case when a
   multi-access link spans two or more unicast routing domains. This
   could lead to multiple upstream tree branches being formed (an error
   condition) unless steps are taken to ensure all routers on the link
   agree which is the upstream router for a particular group. CBT
   routers attached to a multi-access link participate in an explicit
   election mechanism that elects a single router, the designated router
   (DR), as the link's upstream router for all groups. Since the DR
   might not be the link's best next-hop for a particular core router,
   this may result in join messages being re-directed back across a
   multi-access link. If this happens, the re-directed join message is
   unicast across the link by the DR to the best next-hop, thereby
   preventing a looping scenario. This re-direction only ever applies to
   join messages.  Whilst this is suboptimal for join messages, which
   are generated infrequently, multicast data never traverses a link
   more than once (either natively, or encapsulated).

           A                               B
           |   S1              S4          |
   -------------------      -----------------------------------------------
             |                     |               |               |
           ------                 ------           ------           ------
           | R1 |                 | R2 |           | R5 |           | R6 |
           ------                 ------           ------           ------
      C     |  |                    |                |                 |
      |     |  |                    |    S2          |            S8   |
   ----------  ------------------------------------------        -------------
        S3                 |
                         ------
                         | R3 |
                 |       ------                       D
   | S9          |         |               S5         |
   |             |      ---------------------------------------------
   |  |----|     |                    |
   ---| R7 |-----|                  ------
   |  |----|     |------------------| R4 |
   |          S7 |                  ------            F
   |             |                    |         S6    |
   |-E           |            ---------------------------------
                      |                       |
                      |                     ------
             |---|    |---------------------| R8 |
             |R12 ----|                     ------      G
             |---|    |                       |         |  S10
                      | S14                ----------------------------
                      |                         |
                  I --|                       ------
                      |                       | R9 |
                                              ------
                                                |         S12
                     |             ----------------------------
                 S15 |                        |
                     |                      ------
                     |----------------------|R10 |
                J ---|                      ------      H
                     |                        |         |
                     |             ----------------------------
                     |                           S13

                    Figure 1. Example Network Topology

   In all but the exception case described above, all CBT control
   messages are multicast over multicast supporting links to the
   "all-cbt-routers" group, with IP TTL 1. The IP source address of CBT
   control messages is the outgoing interface of the sending router. The
   IP destination address of CBT control messages is either the
   "all-cbt-routers" group address, or the IP address of a router
   reachable over one of the sending router's interfaces, depending on
   whether the sender's outgoing link supports multicast transmission.
   All the necessary addressing information is obtained as part of tree
   set up.

   If CBT is implemented over a tunnelled topology, when sending a CBT
   control packet over a tunnel interface, the sending router uses as
   the packet's IP source address the local tunnel end point address,
   and the remote tunnel end point address as the packet's IP
   destination address.

4.  Protocol Specification Details

   Details of the CBT protocol are presented in the context of a single
   router implementation.

4.1.  CBT HELLO Protocol

   The HELLO protocol is used to elect a designated router (DR) on
   broadcast-type links. It is also used to elect a designated border
   router (BR) when interconnecting a CBT domain with other domains (see
   [5]).

   A router represents its status as a link's DR by setting the DR-flag
   on that interface; a DR flag is associated with each of a router's
   broadcast interfaces. This flag can only assume one of two values:
   TRUE or FALSE. By default, this flag is FALSE.

   HELLO messages are multicast periodically to the all-cbt-routers
   group, 224.0.0.15, using IP TTL 1. The advertisement period is
   [HELLO_TIMER] seconds. [HELLO_TIMER] comprises a configured
   [HELLO_INTERVAL], to which is added [RND_RSP] seconds - a random
   response interval.  This random response additive is required to
   avoid the potential problem of synchronisation between HELLO
   advertisements (or other control messages) from different routers.
   The HELLO protocol's convergence time is set at [HELLO_CONV] seconds
   - the time after which no further HELLOs are expected in any one
   round of the protocol.

   Each HELLO advertising router includes the upper bound of its
   [RND_RSP] timer in its HELLO advertisements. This is necessary so
   that all routers attached to the link can agree on a common HELLO
   convergence time [HELLO_CONV]; in any one round of the HELLO
   protocol, a router assumes the minimum of the upper bound of its
   configured [RND_RSP] and that of any received advertisement.  The
   minimum upper bound is then used as the router's [RND_RSP] upper
   bound in the next round of the protocol. [HELLO_CONV] is set to be
   this minimum upper bound + 2 seconds (the 2 seconds being a response
   "safety margin") for the next round of the protocol.
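
   The timer arithmetic can be illustrated with a short Python sketch.
   It is informal: the function names are hypothetical, and drawing the
   random response interval uniformly between zero and the [RND_RSP]
   upper bound is an assumption, since the text above only states that
   the interval is random and bounded.

```python
import random

SAFETY_MARGIN = 2  # seconds; the response "safety margin"


def next_round_bounds(configured_upper, received_uppers):
    """Return (RND_RSP upper bound, HELLO_CONV) for the next round."""
    upper = min([configured_upper, *received_uppers])
    return upper, upper + SAFETY_MARGIN


def hello_timer(hello_interval, rnd_rsp_upper, rng=random):
    """HELLO_TIMER = HELLO_INTERVAL + a random response interval.

    Assumption: the interval is uniform on [0, rnd_rsp_upper].
    """
    return hello_interval + rng.uniform(0, rnd_rsp_upper)


# A router configured with an upper bound of 5 seconds that hears
# advertisements carrying upper bounds of 3 and 4 seconds.
upper, conv = next_round_bounds(5, [3, 4])
```

   With these inputs the router uses an upper bound of 3 seconds and a
   convergence time of 5 seconds in the next round.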

   A network manager can preference a router's DR eligibility by
   optionally configuring a HELLO preference. Valid configuration values
   range from 1 to 254 (decimal), 1 representing the "most eligible"
   value. In the absence of explicit configuration, a router assumes the
   default HELLO preference value of 255. The elected DR uses HELLO
   preference zero (0) in HELLO advertisements, irrespective of any
   configured preference.  The DR continues to use preference zero for
   as long as it is running.

   The DR election winner is that which advertises the lowest HELLO
   preference, or the lowest-addressed in the event of a tie.
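
   The election rule (lowest HELLO preference wins, lowest address
   breaks a tie) can be expressed compactly. The following Python
   sketch is illustrative only; the function name elect_dr and the
   representation of advertisements as (preference, address) pairs are
   assumptions.

```python
import ipaddress


def elect_dr(adverts):
    """Return the address of the winning router.

    adverts: iterable of (preference, address) pairs heard on a link,
    including this router's own advertisement.
    """
    winner = min(adverts,
                 key=lambda a: (a[0], ipaddress.ip_address(a[1])))
    return winner[1]


dr = elect_dr([(255, "10.0.0.3"), (10, "10.0.0.9"), (10, "10.0.0.2")])
```

   Addresses are compared numerically rather than as strings, so that
   10.0.0.9 is treated as lower than 10.0.0.10.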

   The situation where two or more routers attached to the same
   broadcast link are advertising HELLO preference 0 should never arise.
   However, should this situation arise, all but the lowest addressed
   zero-advertising router relinquish their claim as DR immediately by
   unsetting the DR flag on the corresponding interface. The
   relinquishing router(s) subsequently advertise their previously used
   preference value in HELLO advertisements.

4.1.1.  Sending HELLOs

   When a router starts up, it multicasts two HELLO messages over each
   of its broadcast interfaces in succession. The DR flag is initially
   unset (FALSE) on each broadcast interface.

   A router sends a HELLO message whenever its [HELLO_TIMER] expires.

   Whenever a router sends a HELLO message, it resets its [HELLO_TIMER].

4.1.2.  Receiving HELLOs

   On receipt of any HELLO message, a router adjusts its [RND_RSP] upper
   bound to the minimum of this router's configured [RND_RSP] upper
   bound and that received in the HELLO. The router also adjusts its
   [HELLO_CONV] as described above.

   A router need not respond to a HELLO message if the received HELLO is
   "better" than its own. Thus, in steady state, the HELLO protocol
   incurs very little traffic overhead.

   If the received HELLO message is "better" (lower preferenced, or
   equally preferenced but lower addressed) than it would send itself,
   it immediately unsets its DR flag on the arriving interface if the DR
   flag is set on that interface. It also resets its [HELLO_TIMER].

   If the received HELLO message is not "better" than this router would
   send itself, it sets its [RND_RSP] random response timer; on expiry,
   the router responds with its own HELLO message. If no "better" HELLO
   message is received within the current [HELLO_CONV], the router sets
   the DR flag on the corresponding interface.

4.2.  JOIN_REQUEST Processing

   A JOIN_REQUEST is the CBT children, and no other directly attached subnets
   with group G presence, it immediately follows on by sending a
   QUIT_REQUEST control message used to R2, its parent on register a member
   host's interest in joining the distribution tree for group G. R2 responds
   with a QUIT-ACK, unicast to R6; R2 removes the corresponding child
   information. R2 in turn sends group.

4.2.1.  Sending JOIN_REQUESTs

   A JOIN_REQUEST can only ever be originated by a QUIT upstream to R3 (since it has no
   other children or subnet(s) with group presence).

      NOTE: immediately subsequent to sending leaf router, i.e. a QUIT-REQUEST, the sender
      removes
   router with directly attached member hosts. This join message is sent
   hop-by-hop towards the corresponding parent information, i.e. it does not
      wait core router for the receipt group (see section 7).
   The originating router caches <group, NULL, upstream interface> state
   for each join it originates. This state is known as "transient join
   state".  The absence of a QUIT-ACK.

   R3 responds to "downstream interface" (NULL) indicates
   that this router is the QUIT by unicasting a QUIT-ACK to R2. R3 subse-
   quently checks whether it in turn can send a quit by checking group G
   presence on its directly attached subnets, join message originator, and is therefore
   responsible for any group G children. retransmissions of this message if a response is
   not received within [JOIN_RTX_INTERVAL].  It has the latter (R1 is its child on an error if no
   response is received after [JOIN_TIMEOUT] seconds.  If this error
   condition occurs, the group G tree), and so R3
   cannot itself send a quit. However, joining process may be re-invoked by the branch R3-R2-R6 has been
   removed from
   receipt of the tree.

4.  Tree Maintenance

   Once a tree branch has been created, i.e. a CBT router has received next IGMP host membership report from a
   JOIN_ACK for locally
   attached member host.

   Note that if the interface over which a JOIN_REQUEST previously sent (or forwarded), a child
   router is required to monitor be sent
   supports multicast, the status of its parent/parent link at
   fixed intervals by means of a "keepalive" mechanism operating between
   them.  The "keepalive" protocol JOIN_REQUEST is simple, and implemented by means
   of two CBT control messages: CBT_ECHO_REQUEST and CBT_ECHO_REPLY; a
   child unicasts a CBT-ECHO-REQUEST multicast to its parent, which unicasts a
   CBT-ECHO-REPLY in response.

   Adjacent CBT the all-cbt-
   routers only need to send one keepalive representing all
   children having group, using IP TTL 1.  If the same parent, reachable over a particular link,
   regardless of group.  This aggregation strategy link does not support multi-
   cast, the JOIN_REQUEST is expected unicast to con-
   serve considerable bandwidth the next hop on "busy" links, such as transit net-
   work, or backbone network, links.

   For any CBT router, if its parent router, or the unicast path
   to the parent,
   fails, group's core.
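
   The origination rules of section 4.2.1 can be sketched as follows.
   This is a minimal illustration only: the class and function names
   are hypothetical, the timer values are placeholders rather than the
   protocol's defaults, and "all-cbt-routers" is used as a symbolic
   name rather than an address.

```python
from dataclasses import dataclass
from typing import List, Optional

# Placeholder timer values in seconds. They are assumptions made for
# this sketch and are NOT the defaults defined by the specification.
JOIN_RTX_INTERVAL = 5.0
JOIN_TIMEOUT = 17.5

ALL_CBT_ROUTERS = "all-cbt-routers"  # symbolic group name only


@dataclass
class TransientJoin:
    """<group, downstream interface, upstream interface> join state."""
    group: str
    downstream: Optional[str]  # None models NULL (join originator)
    upstream: str

    def is_originator(self) -> bool:
        # A NULL downstream interface marks the originating router,
        # which is the one responsible for retransmissions.
        return self.downstream is None


def join_destination(link_supports_multicast: bool, next_hop: str) -> str:
    """Multicast (IP TTL 1) on multicast-capable links, else unicast
    to the next hop on the unicast path to the group's core."""
    return ALL_CBT_ROUTERS if link_supports_multicast else next_hop


def retransmit_times(rtx: float = JOIN_RTX_INTERVAL,
                     timeout: float = JOIN_TIMEOUT) -> List[float]:
    """Offsets from the first send at which an unanswered join is
    resent; giving up at `timeout` is the error condition."""
    times: List[float] = []
    t = rtx
    while t < timeout:
        times.append(t)
        t += rtx
    return times
```

   With the placeholder values above, an unanswered join would be
   resent three times before the error condition is reached; joining
   may then be re-invoked by the next IGMP host membership report.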

4.2.2.  Receiving JOIN_REQUESTs

   On broadcast links, JOIN_REQUESTs which are multicast may only be
   forwarded by the link's DR. Other routers attached to the link may
   process the join (see below). JOIN_REQUESTs which are multicast over
   a point-to-point link are only processed by the router on the link
   which does not have a local interface corresponding to the join's
   network layer (IP) source address. Unicast JOIN_REQUESTs may only be
   processed by the router which has a local interface corresponding to
   the join's network layer (IP) destination address.

   With regard to forwarding a received JOIN_REQUEST, if the receiving
   router is not on-tree for the group, and is not the group's core
   router, the join is forwarded to the next hop on the path towards the
   core. The join is multicast, or unicast, according to whether the
   outgoing interface supports multicast.  The router caches the
   following information with respect to the forwarded join: <group,
   downstream interface, upstream interface>.

   If this transient join state is not "confirmed" with a join
   acknowledgement (JOIN_ACK) message from upstream, the state is timed
   out after 1.5 times [JOIN_RTX_INTERVAL].

   If the receiving router is the group's core router, the join is
   "terminated" and acknowledged by means of a JOIN_ACK. Similarly, if
   the router is on-tree and the JOIN_REQUEST arrives over an interface
   that is not the upstream interface for the group, the join is
   acknowledged.

   If [RND_RSP] pertaining to a JOIN_REQUEST is active (i.e. running),
   and a JOIN_REQUEST is received for the same group over that group's
   parent interface, cancel [RND_RSP] for the impending JOIN_REQUEST.

   If this router has a cache-deletion-timer [CACHE_DEL_TIMER] running
   on the arrival interface for the group specified in a multicast join,
   the timer is cancelled.

   If a multicast JOIN_REQUEST is received and the QUIT_TIME bit (see
   section 4.4.1) is set on the arrival interface for the specified
   group, unset the QUIT_TIME bit.

4.1.  Router Failure

   An on-tree router can detect a failure from the following two cases:

   o    if the child responsible for sending keepalives across a
        particular link stops receiving CBT_ECHO_REPLY messages. In this
        case the child realises that its parent has become unreachable
        and must therefore try and re-connect to the tree for all groups
        represented on the parent/child link. For all groups sharing a
        common core set (corelist), provided those groups can be
        specified as a CIDR-like aggregate, an aggregated join can be
        sent representing the range of groups.  Aggregated joins are
        made possible by the presence of a "group mask" field in the CBT
        control packet header (footnote 3).

        If a range of groups cannot be represented by a mask, then each
        group must be re-joined individually.

        CBT's re-join strategy is as follows: the rejoining router which
        is immediately subordinate to the failure sends a JOIN_REQUEST
        (subcode ACTIVE_JOIN if it has no children attached, and subcode
        ACTIVE_REJOIN if at least one child is attached) to the best
        next-hop router on the path to the elected core. If no JOIN_ACK
        is received after three retransmissions, each transmission being
        at PEND-JOIN-INTERVAL (5 secs) intervals, the next-highest
        priority core is elected from the core list, and the process
        repeated.  If all cores have been tried unsuccessfully, the DR
        has no option but to give up.

   o    if a parent stops receiving CBT_ECHO_REQUESTs from a child. In
        this case, if the parent has not received an expected keepalive
        after CHILD_ASSERT_EXPIRE_TIME, all children reachable across
        that link are removed from the parent's forwarding database.

_________________________
  3 There are situations where it is advantageous to send a single
join-request that represents potentially many groups. One such example
is provided in [11], whereby a designated border router is required to
join all groups inside a CBT domain.

4.2.  Router Re-Starts

   There are two cases to consider here:

   o    Core re-start. All JOIN_REQUESTs (all types) carry the
        identities (i.e. IP addresses) of each of the cores for a group.
        If a router is a core for a group, but has only recently
        re-started, it will not be aware that it is a core for any
        group(s). In such circumstances, a core only becomes aware that
        it is such by receiving a JOIN_REQUEST.  Subsequent to a core
        learning its status in this way, if it is not the primary core
        it acknowledges the received join, then sends a JOIN_REQUEST
        (subcode ACTIVE_REJOIN) to the primary core. If the re-started
        router is the primary core, it need take no action, i.e. in all
        circumstances, the primary core simply waits to be joined by
        other routers.

   o    Non-core re-start. In this case, the router can only join the
        tree again if a downstream router sends a JOIN_REQUEST through
        it, or it is elected DR for one of its directly attached
        subnets, and subsequently receives an IGMP membership report.

4.3.  Route Loops

   Routing loops are only a concern when a router with at least one
   child is attempting to re-join a CBT tree. In this case the
   re-joining router sends a JOIN_REQUEST (subcode ACTIVE_REJOIN) to the
   best next-hop on the path to an elected core. This join is forwarded
   as normal until it reaches either the specified core, another core,
   or an on-tree router that is already part of the tree. If the rejoin
   reaches the primary core, loop detection is not necessary because the
   primary never has a parent. The primary core acks an active-rejoin by
   means of a JOIN_ACK, subcode PRIMARY_REJOIN_ACK. This ack must be
   processed by each router on the reverse-path of the active-rejoin;
   this ack creates tree state, just like a normal join-ack.

   If an active-rejoin is terminated by any router on the tree other
   than the primary core, loop detection must take place, as we now
   describe.

   If, in response to an active-rejoin, a JOIN_ACK is returned, subcode
   NORMAL (as opposed to an ack with subcode PRIMARY_REJOIN_ACK), the
   router receiving the ack subsequently generates a JOIN_REQUEST,
   subcode NACTIVE_REJOIN (non-active rejoin). This packet serves only
   to detect loops; it does not create any transient state in the
   routers it traverses, other than the originating router (in case
   retransmissions are necessary). Any on-tree router receiving a
   non-active rejoin is required to forward it over its parent interface
   for the specified group. In this way, it will either reach the
   primary core, which unicasts, directly to the sender, a join ack with
   subcode PRIMARY_NACTIVE_ACK (so the sender knows no loop is present),
   or the sender receives the non-active rejoin it sent, via one of its
   child interfaces, in which case the rejoin obviously formed a loop.

   If a loop is present, the non-active join originator immediately
   sends a QUIT_REQUEST to its newly-established parent and the loop is
   broken.

   Using figure 2 to demonstrate this, if R3 is attempting to re-join
   the tree (R1 is the core in figure 2) and R3 believes its best
   next-hop to R1 is R6, and R6 believes R5 is its best next-hop to R1,
   which sees R4 as its best next-hop to R1 -- a loop is formed. R3
   begins by sending a JOIN_REQUEST (subcode ACTIVE_REJOIN, since R4 is
   its child) to R6.  R6 forwards the join to R5. R5 is on-tree for the
   group, so responds to the active-rejoin with a JOIN_ACK, subcode
   NORMAL (the ack traverses R6 on its way to R3).

   R3 now generates a JOIN_REQUEST, subcode NACTIVE_REJOIN, and forwards
   this to its parent, R6.  R6 forwards the non-active rejoin to R5, its
   parent. R5 does similarly, as does R4. Now, the non-active rejoin has
   reached R3, which originated it, so R3 concludes a loop is present on
   the parent interface for the specified group. It immediately sends a
   QUIT_REQUEST to R6, which in turn sends a quit if it has not received
   an ACK from R5 already AND has itself a child or subnets with member
   presence. If so it does not send a quit -- the loop has been broken
   by R3 sending the first quit.

   QUIT_REQUESTs are typically acknowledged by means of a QUIT_ACK. A
   child removes its parent information immediately subsequent to
   sending its first QUIT_REQUEST. The ack here serves to notify the
   (old) child that it (the parent) has in fact removed its child
   information.  However, there might be cases where, due to failure,
   the parent cannot respond.  The child sends a QUIT_REQUEST a maximum
   of three times, at PEND-QUIT-INTERVAL (5 sec) intervals.

                   ------
                   | R1 |
                   ------
                     |
           ---------------------------
                     |
                   ------
                   | R2 |
                   ------
                     |
           ---------------------------
                     |                             |
                   ------                          |
                   | R3 |--------------------------|
                   ------                          |
                     |                             |
           ---------------------------             |
                     |                             |       ------
                   ------                          |       |    |
                   | R4 |                          |-------| R6 |
                   ------                          |       |----|
                     |                             |
           ---------------------------             |
                     |                             |
                   ------                          |
                   | R5 |--------------------------|
                   ------                          |
                                                   |

                      Figure 2: Example Loop Topology

   In another scenario the rejoin travels over a loop-free path, and the
   first on-tree router encountered is the primary core, R1. In figure
   2, R3 sends a join, subcode ACTIVE_REJOIN, to R2, the next-hop on the
   path to core R1. R2 forwards the re-join to R1, the primary core,
   which returns a JOIN_ACK, subcode PRIMARY_REJOIN_ACK, over the
   reverse-path of the active-rejoin. Whenever a router receives a
   PRIMARY_REJOIN_ACK no loop detection is necessary.

   If we assume R2 is on tree for the corresponding group, R3 sends a
   join, subcode ACTIVE_REJOIN, to R2, which replies with a join ack,
   subcode NORMAL. R3 must then generate a loop detection packet (join
   request, subcode NACTIVE_REJOIN) which is forwarded to its parent,
   R2, which does similarly. On receipt of the non-active rejoin, the
   primary core unicasts a join ack back directly to R3, with subcode
   PRIMARY_NACTIVE_ACK.  This confirms to R3 that its rejoin does not
   form a loop.

4.3.  JOIN_ACK Processing

   A JOIN_ACK is the mechanism by which an interface is added to a
   router's multicast forwarding cache; thus, the interface becomes part
   of the group distribution tree.

4.3.1.  Sending JOIN_ACKs

   The JOIN_ACK is sent over the same interface as that on which the
   corresponding JOIN_REQUEST was received. The sending of the
   acknowledgement causes the router to add the interface to its child
   interface list in its forwarding cache for the group, if it is not
   already present. If the router does not yet have active state for
   this group, this router must be the core router for the group; the
   core creates a forwarding cache entry and includes the interface in
   its child interface list, and sends the JOIN_ACK downstream.

   A JOIN_ACK is multicast or unicast, according to whether the outgoing
   interface supports multicast transmission or not.

4.3.2.  Receiving JOIN_ACKs

   The group and arrival interface must be matched to a <group, ....,
   upstream interface> from the router's cached transient state. If no
   match is found, the JOIN_ACK is discarded.  If a match is found, a
   CBT forwarding cache entry for the group is created, with "upstream
   interface" marked as the group's parent interface.

   If "downstream interface" in the cached transient state is NULL, the
   JOIN_ACK has reached the originator of the corresponding
   JOIN_REQUEST; the JOIN_ACK is not forwarded downstream.  If
   "downstream interface" is non-NULL, a JOIN_ACK for the group is sent
   over the "downstream interface" (multicast or unicast, accordingly).
   This interface is installed in the child interface list of the
   group's forwarding cache entry.

   Once transient state has been confirmed by transferring it to the
   forwarding cache, the transient state is deleted.

4.4.  QUIT_NOTIFICATION Processing

   A CBT tree is "pruned" in the direction downstream-to-upstream
   whenever a CBT router's child interface list for a group becomes
   NULL.

4.4.1.  Sending QUIT_NOTIFICATIONs

   A QUIT_NOTIFICATION is sent to a router's parent router on the tree
   whenever the router's child interface list becomes NULL.

   A QUIT_NOTIFICATION is not acknowledged; once sent, all information
   pertaining to the group it represents is deleted from the forwarding
   cache after a short interval.

   To ensure consistency between a child and parent router given the
   potential for loss of a QUIT_NOTIFICATION, there is a QUIT_TIME bit
   associated with the parent of each group entry; whenever a
   QUIT_NOTIFICATION is sent for a group, the QUIT_TIME bit for that
   group entry is set for [QUIT_TIME] seconds before the entry is
   deleted and the QUIT_TIME bit unset. By default, this bit is unset.

   When the QUIT_TIME bit is set, if the router detects multicast
   traffic for the group arriving over a to-be-deleted parent interface
   (one over which a quit has recently been sent), the router sends
   another QUIT_NOTIFICATION over that interface. This is multicast, or
   unicast, as appropriate for the outgoing link. It continues to do so
   at [QUIT_RATE] second intervals so long as data continues to arrive,
   and provided [QUIT_TIME] has not yet expired.

   If, after sending a QUIT_NOTIFICATION, a multicast JOIN_REQUEST for
   the specified group arrives over the interface over which the quit
   was sent, the QUIT_TIME bit is immediately unset if it is set (any
   traffic arriving over the interface will be for/from another child
   router attached to the same link).
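
   The QUIT_TIME behaviour of section 4.4.1 can be sketched as a small
   state holder. This is an illustration under stated assumptions, not
   a definitive implementation: the class, its field names and the two
   timer values are hypothetical, and the sketch assumes that the entry
   is still deleted [QUIT_TIME] seconds after the first quit even if a
   multicast join has unset the bit in the meantime.

```python
from dataclasses import dataclass, field
from typing import List

QUIT_TIME = 30.0  # placeholder seconds, assumed for illustration
QUIT_RATE = 5.0   # placeholder seconds, assumed for illustration


@dataclass
class ParentEntry:
    """Hypothetical per-group state kept for a parent interface."""
    group: str
    quit_time_bit: bool = False   # unset by default
    pending_delete: bool = False
    deadline: float = 0.0
    last_quit: float = 0.0
    deleted: bool = False
    quits_sent: List[float] = field(default_factory=list)

    def _emit_quit(self, now: float) -> None:
        # Stands in for sending a QUIT_NOTIFICATION upstream.
        self.last_quit = now
        self.quits_sent.append(now)

    def send_quit(self, now: float) -> None:
        # First quit for the group: set the bit for QUIT_TIME seconds.
        self.quit_time_bit = True
        self.pending_delete = True
        self.deadline = now + QUIT_TIME
        self._emit_quit(now)

    def tick(self, now: float) -> None:
        # On expiry the entry is deleted and the bit unset.
        if self.pending_delete and now >= self.deadline:
            self.deleted = True
            self.pending_delete = False
            self.quit_time_bit = False

    def on_data(self, now: float) -> None:
        # Group traffic still arriving over the to-be-deleted parent
        # interface: repeat the quit, at most every QUIT_RATE seconds.
        self.tick(now)
        if self.quit_time_bit and now - self.last_quit >= QUIT_RATE:
            self._emit_quit(now)

    def on_multicast_join(self) -> None:
        # Another child on the link re-joined; stop repeating quits.
        self.quit_time_bit = False
```

   In this sketch, data arriving at 2, 5, 7 and 10 seconds after the
   first quit triggers repeat quits only at 5 and 10 seconds, and a
   multicast join then silences further quits until the entry expires.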

4.4.2.  Receiving QUIT_NOTIFICATIONs

   The group reported in the QUIT_NOTIFICATION must be matched with a
   forwarding cache entry. If no match is found, the QUIT_NOTIFICATION
   is ignored and discarded.  If a match is found, and the arrival
   interface is a valid child interface in the group entry, how the
   router proceeds depends on whether the QUIT_NOTIFICATION was
   multicast or unicast.

   If the QUIT_NOTIFICATION was unicast, the corresponding child
   interface is deleted from the group's forwarding cache entry, and no
   further processing is required.

   If the QUIT_NOTIFICATION was multicast, and the arrival interface is
   a valid child interface for the specified group, the router sets a
   cache-deletion-timer [CACHE_DEL_TIMER].

   Because this router might be acting as a parent router for multiple
   downstream routers attached to the arrival link, the
   [CACHE_DEL_TIMER] interval gives those routers that did not send the
   QUIT_NOTIFICATION, but received it over their parent interface, the
   opportunity to ensure that the parent router does not remove the link
   from its child interface list.

   Therefore, on receipt of a multicast QUIT_NOTIFICATION over a parent
   interface, a receiving router starts a random response interval timer
   which is set to [RND_RSP] seconds.

   If a multicast JOIN_REQUEST is received over the same (parent)
   interface for the same group before this router's [RND_RSP] timer
   expires, it suppresses the multicasting of its own similar
   JOIN_REQUEST.

   If a multicast JOIN_REQUEST is not received via the router's parent
   link before [RND_RSP] expires, a JOIN_REQUEST is multicast over the
   link for the previously quit group, with IP TTL value 1.

5.  Data Packet Loops

   The CBT protocol builds a loop-free distribution tree. If all routers
   that comprise a particular tree function correctly, data packets
   should never traverse a tree branch more than once (footnote 4).

   CBT mode data packets from a non-member sender must arrive on a tree
   via an "off-tree" interface. The CBT mode data packet's header
   includes an "on-tree" field, which contains the value 0x00 until the
   data packet reaches an on-tree router. The first on-tree router must
   convert this value to 0xff.  This value remains unchanged, and from
   here on the packet should traverse only on-tree interfaces. If an
   encapsulated packet happens to "wander" off-tree and back on again,
   an on-tree router will receive the CBT encapsulated packet via an
   off-tree interface. However, this router will recognise that the
   "on-tree" field of the encapsulating CBT header is set to 0xff, and
   so immediately discards the packet.

_________________________
  4 The exception to this is when CBT mode is operating between CBT
routers connected to a multi-access link; a data packet may traverse the
link in native mode (if group members are present on the link), as well
as CBT mode for sending the data between CBT routers on the tree.

6.  Data Packet Forwarding Rules

6.1.  Native Mode

   In native mode, when a CBT router receives a data packet, the packet
   may only be forwarded over outgoing tree interfaces (member subnets
   and interfaces leading to outgoing on-tree neighbours) iff it has
   been received via a valid on-tree interface (or the packet has
   arrived encapsulated from a non-member, i.e. off-tree, sender).
   Otherwise, the packet is discarded.

   Before a packet is forwarded by a subnet's DR, provided the packet's
   TTL is greater than 1, the packet's TTL is decremented.

6.2.  CBT Mode

   In CBT mode, routers ignore all non-locally originated native mode
   multicast data packets. Locally-originated multicast data is only
   processed by a subnet's DR; in this case, the DR forwards the native
   multicast data packet, TTL 1, over any outgoing member subnets for
   which that router is DR. Additionally, the DR encapsulates the
   locally-originated multicast and forwards it, CBT mode, over all tree
   interfaces, as dictated by the CBT forwarding database.

   When a router, operating in CBT mode, receives a CBT-mode
   encapsulated data packet, it decapsulates one copy to send, native
   mode and TTL 1, over any directly attached member subnets for which
   it is DR.  Additionally, an encapsulated copy is forwarded over all
   outgoing tree interfaces, as dictated by its CBT forwarding database.

   Like the outer encapsulating IP header, the TTL value of the
   encapsulating CBT header is decremented each time it is processed by
   a CBT router.

   An example of CBT mode forwarding is provided towards the end of the
   next section.
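
   The "on-tree" field rule of section 5 (Data Packet Loops) can be
   sketched as a single check applied by an on-tree router to a CBT
   mode packet that arrives over an off-tree interface. The function
   name and the use of None to mean "discard" are hypothetical choices
   made for this illustration; only the values 0x00 and 0xff come from
   the text.

```python
from typing import Optional

OFF_TREE = 0x00  # field value before the packet first reaches the tree
ON_TREE = 0xFF   # field value once the first on-tree router has seen it


def off_tree_arrival(on_tree_field: int) -> Optional[int]:
    """Handle a CBT mode packet received via an off-tree interface.

    Returns the field value to forward the packet with, or None if
    the packet must be discarded.
    """
    if on_tree_field == ON_TREE:
        # The packet was already on the tree, wandered off, and has
        # come back on again: discard it.
        return None
    # First on-tree router encountered: mark the packet as on-tree.
    return ON_TREE
```

   A packet from a non-member sender therefore passes this check
   exactly once, at the first on-tree router it meets.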

7.  CBT Mode -- Encapsulation Details

   In a multi-protocol environment, whose infrastructure may include
   non-multicast-capable routers, it is necessary to tunnel data packets
   between CBT-capable routers. This is called "CBT mode".  Data packets
   are de-capsulated by CBT routers (such that they become native mode
   data packets) before being forwarded over subnets with member hosts.
   When multicasting (native mode) to member hosts, the TTL value of the
   original IP header is set to one. CBT mode encapsulation is as
   follows:

           ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
           | encaps IP hdr | CBT hdr | original IP hdr | data ....|
           ++++++++++++++++++++++++++++++++++++++++++++++++++++++++

                   Figure 3. Encapsulation for CBT mode

   The TTL value of the CBT header is set by the encapsulating CBT
   router directly attached to the origin of a data packet.  This value
   is decremented each time it is processed by a CBT router.  An
   encapsulated data packet is discarded when the CBT header TTL value
   reaches zero.

   The purpose of the (outer) encapsulating IP header is to "tunnel"
   data packets between CBT-capable routers (or "islands"). The outer IP
   header's TTL value is set to the "length" of the corresponding
   tunnel, or MAX_TTL (255) if this is not known, or subject to change.

   It is worth pointing out here the distinction between subnetworks and
   tree branches (especially apparent in CBT mode), although they can be
   one and the same. For example, a multi-access subnetwork containing
   routers and end-systems could potentially be both a CBT tree branch
   and a subnetwork with group member presence. A tree branch which is
   not simultaneously a subnetwork is either a "tunnel" or a
   point-to-point link.

   In CBT mode there are three forwarding methods used by CBT routers:

   o    IP multicasting. This method sends an unaltered (unencapsulated)
        data packet across a directly-connected subnetwork with group
        member presence.  Any host originating multicast data does so in
        this form.

   o    CBT unicasting. This method is used for sending data packets
        encapsulated (as illustrated above) across a tunnel or
        point-to-point link; the IP destination address of the
        encapsulating IP header is a unicast address. En/de-capsulation
        takes place in CBT routers.

   o    CBT multicasting. A CBT router on a multi-access link can take
        advantage of multicast in the case where multiple on-tree
        neighbours are reachable across a single physical link; the
        outer encapsulating IP header contains a multicast address as
        its destination address. The IP module of end-systems on the
        same link subscribed to the same group will discard these
        multicasts since the CBT payload type (protocol id) of the outer
        IP header is not recognizable by hosts.

   CBT routers create forwarding database (db) entries whenever they
   send or receive a JOIN_ACK. The forwarding database describes the
   parent-child relationships on a per-group basis. A forwarding
   database entry dictates over which tree interfaces, and how (unicast
   or multicast) a data packet is to be sent.

   Note that a CBT forwarding db is required for both CBT-mode and
   native-mode multicasting.

   Using our example topology in figure 1, let's assume the CBT routers
   are operating in CBT mode.

   Member G originates an IP multicast (native mode) packet. R8 is the
   DR for subnet S10. R8 therefore sends a (native mode, TTL 1) copy
   over any member subnets for which it is DR - S14 and S10 (the copy
   over S10 is not sent, since the packet was originally received from
   S10).  The multicast packet is CBT mode encapsulated by R8, and
   unicast to each of its children, R9 and R12; these children are not
   reachable over the same interface, otherwise R8 could have sent a CBT
   mode multicast.  R9, the DR for S12, need not IP multicast (native
   mode) onto S12 since there are no members present there. R9 unicasts
   the packet in CBT mode to R10, which is the DR for S13 and S15. R10
   decapsulates the CBT mode packet and IP multicasts (native mode, TTL
   1) to each of S13 and S15.

   Going upstream from R8, R8 CBT mode unicasts to R4. It is DR for all
   directly connected subnets and therefore IP multicasts (native mode)
   the data packet onto S5, S6 and S7, all of which have member
   presence. R4 unicasts, CBT mode, the packet to all outgoing children,
   R3 and R7 (NOTE: R4 does not have a parent since it is the primary
   core router for the group). R7 IP multicasts (native mode) onto S9.
   R3 CBT mode unicasts to R1 and R2, its children. Finally, R1 IP
   multicasts (native mode) onto S1 and S3, and R2 IP multicasts (native
   mode) onto S4.

4.5.  ECHO_REQUEST Processing

   The ECHO_REQUEST message allows a child to monitor reachability to
   its parent router for a group (or range of groups if the parent
   router is the parent for multiple groups). Group information is not
   carried in ECHO_REQUEST messages.

4.5.1.  Sending ECHO_REQUESTs

   Whenever a router creates a forwarding cache entry due to the receipt
   of a JOIN_ACK, the router begins the periodic sending of ECHO_REQUEST
   messages over its parent interface. The ECHO_REQUEST is multicast to
   the "all-cbt-routers" group over multicast-capable interfaces, and
   unicast to the parent router otherwise.

   ECHO_REQUEST messages are sent at [ECHO_INTERVAL] second intervals.
   Whenever an ECHO_REQUEST is sent, [ECHO_INTERVAL] is reset.

   If, for any echo-request sent to the parent, the expected response
   (ECHO_REPLY) is not forthcoming within [ECHO_RTX_INTERVAL], the echo
   request message is retransmitted. If no response is forthcoming
   within [ECHO_TIMEOUT] seconds, the router sends a FLUSH_TREE message
   over each of its child interfaces for the group, then removes all
   forwarding cache state for the group.

4.5.2.  Receiving ECHO_REQUESTs

   If an ECHO_REQUEST is received over any valid child interface, the
   receiving router responds with an ECHO_REPLY message over the same
   interface. This message is multicast to the "all-cbt-routers" group
   over multicast-capable interfaces, and unicast otherwise.

   If an ECHO_REQUEST message arrives via any valid parent interface,
   the router resets its [ECHO_INTERVAL] timer for that upstream
   interface, thereby suppressing the sending of its own ECHO_REQUEST
   over that upstream interface.

4.6.  ECHO_REPLY Processing

   ECHO_REPLY messages allow a child to monitor the reachability of its
   parent, and ensure the group state information is consistent between
   them.

4.6.1.  Sending ECHO_REPLY messages

   An ECHO_REPLY message is sent in direct response to receiving an
   ECHO_REQUEST message, provided the ECHO_REQUEST is received over any
   one of this router's valid child interfaces. Additionally, an
   ECHO_REPLY is sent periodically by a parent router over each of its
   child links, reporting all groups for which the link is its child.

   ECHO_REPLY messages are unicast or multicast, as appropriate.

4.6.2.  Receiving ECHO_REPLY messages

   An ECHO_REPLY message must be received via a valid parent interface.
   When received, the child router resets its [ECHO_INTERVAL] timer for
   this upstream interface.  The child router also caches the reported
   "group report interval" (seconds), i.e. the time at which the next
   group-carrying ECHO_REPLY will be sent by the parent router.  Like
   [ECHO_INTERVAL], this is cached per upstream interface. If the
   group-carrying ECHO_REPLY does not arrive shortly after "group report
   interval" has expired, a QUIT_NOTIFICATION is sent for each group for
   which the non-reporting router is the parent.

   If this echo reply carries a list of groups, the child router must
   match all those of its forwarding cache entries for which the arrival
   interface is the upstream interface.  If the parent router does not
   consider itself the parent router for group(s) for which the child
   believes it to be the parent, the child sends a FLUSH_TREE message
   downstream for each such group. If this router has directly attached
   members for any of the flushed groups, the receipt of an IGMP host
   membership report for any of those groups will prompt this router to
   rejoin the corresponding tree(s).

   If the upstream router considers itself the parent for more groups
   than does the receiving router, this router sends a QUIT_NOTIFICATION
   for each of those groups for which the QUIT_TIME bit is set in the
   forwarding cache. Otherwise, the router takes no action.

4.7.  FLUSH_TREE Processing

   The DR for FLUSH_TREE (flush) message is the group on mechanism by which a router
   invokes the subnetwork must encap-
   sulate tearing down of all its downstream branches for a partic-
   ular group. The flush message is multicast to the (native) IP-style packet "all-cbt-routers"
   group when sent over multicast-capable interfaces, and unicast it to other-
   wise.

4.7.1.  Sending FLUSH_TREE messages

   A FLUSH_TREE message is sent over each downstream (child) interface
   when a core router has lost reachability with its parent router for the
   group (footnote 5).  The encapsulation required (detected via ECHO_REQUEST and ECHO_REPLY messages). All group
   state is shown in figure 3;
   CBT mode encapsulation removed from an interface over which a flush message is necessary so the receiving CBT router can
   demultiplex the packet accordingly.

   If
   sent.

4.7.2.  Receiving FLUSH_TREE messages

   A FLUSH_TREE message must be received over the encapsulated packet hits parent interface for
   the tree at an on-tree router, specified group, otherwise the
   packet message is discarded.

   The flush message must be forwarded according to the forwarding rules of section 6.1
   or 6.2, depending on whether the receiving router is operating in
   native- or CBT mode. Note that it is possible over each child interface for the different
   interfaces of a router to operate in different (and independent)
   modes.

   If the first on-tree router encountered is
   specified group.

   Once the target core, various
   scenarios define what happens next:

   +o    if flush message has been forwarded, all state for the target core group is not
   removed from the primary, router's forwarding cache.

5.  Timers and Default Values

   This section provides a summary of the target core has
        not yet joined the tree (because it has not yet itself received
        any join-requests), timers described above,
   together with their default values.

   +o    [HELLO_INTERVAL]: a base value making up the target core simply forwards bulk of the encapsu-
        lated packet to the primary core; the primary core IP address is
        included in the encapsulating CBT data packet header.

        if the target core is not the primary, but has children, the
        target core forwards the data according to the rules of section
        6.
_________________________
  5 It is assumed  that  CBT-capable  routers  discover
<core,  group> mappings by means of some discovery pro-
tocol. Such inter-
        val between sending a protocol is outside  the  scope  of  this
document. HELLO message. Default: 60 seconds.

   +o    if the target core is the primary, the primary forwards the data
        according to the rules of section 6.2.

9.  Eliminating the Topology-Discovery Protocol in the Presence of Tun-
nels

   Traditionally, multicast protocols operating within a virtual topol-
   ogy, i.e. an overlay    [RND_RSP]: router's random response interval. Default: 2 sec-
        onds.

   +o    [HELLO_TIMER]: (variable) interval between sending HELLO mes-
        sages.  [HELLO_TIMER] = [HELLO_INTERVAL + RND_RSP]

   +o    [HELLO_CONV]: convergence time of the physical topology, have required the
   assistance one round of a multicast topology discovery protocol, such as that
   present in DVMRP [1]. However, it is possible to have a multicast
   protocol operate within a virtual topology without the need HELLO proto-
        col.  [HELLO_CONV] = [min(RND_RSP) + 2 seconds].

   +o    [JOIN_RTX_INTERVAL]: retransmission time for a
   multicast topology discovery protocol. One way JOIN_REQUESTs.
        Default: 5 seconds.

   +o    [JOIN_TIMEOUT]: time to achieve this is by
   having a router configure all its tunnels raise exception due to its virtual neighbours
   in advance. A tunnel is identified by a local interface address and a
   remote tree join fail-
        ure. Default: 3.5 times [JOIN_RTX_INTERVAL].

   +o    [CACHE_DEL_TIMER]:  time to remove child interface address. Routing is replaced by "ranking" each such
   tunnel from forward-
        ing cache. Default: 2 seconds.

   +o    [QUIT_TIME]: time to remove parent interface associated with a particular core address; from forwarding
        cache entry.  Unset QUIT_TIME bit. Default: 60 seconds.

   +o    [QUIT_RATE]: period for sending QUIT_NOTIFICATION if the
   highest-ranked route is unavailable (tunnel end-points are required
   to run an Hello-like protocol traffic
        persists. Default: 15 seconds.

   +o    [ECHO_INTERVAL]: interval between themselves) then the next-
   highest ranked available route is selected, sending ECHO_REQUEST to parent
        routers.  Default: 60 seconds.

   +o    [ECHO_RTX_INTERVAL]: retransmission time for ECHO_REQUESTs.
        Default 2 seconds.

   +o    [ECHO_TIMEOUT]: time to consider parent unreachable. Default:
        3.5 times [ECHO_RTX_INTERVAL].

6.  CBT Packet Formats and so on. The exact
   specification of the Hello protocol is outside the scope of this doc-
   ument. Message Types

   CBT trees are built using the same join/join-ack mechanisms as
   before, only now some branches of a delivery tree run in native mode,
   whilst others (tunnels) run in CBT mode. Underlying unicast routing
   dictates which interface a packet should be forwarded over. Each
   interface is configured as either native mode or CBT mode, so a
   packet can be encapsulated (decapsulated) accordingly.

   As an example, router R's configuration would be as follows:

   intf    type    mode    remote addr
   -----------------------------------
   #1      phys    native  -
   #2      tunnel  cbt     128.16.8.117
   #3      phys    native  -
   #4      tunnel  cbt     128.16.6.8
   #5      tunnel  cbt     128.96.41.1

   core    backup-intfs
   --------------------
   A         #5, #2
   B         #3, #5
   C         #2, #4

   The CBT forwarding database needs to be slightly modified to accommo-
   date an extra field, "backup-intfs" (backup interfaces). The entry in
   this field specifies a backup interface whenever a tunnel interface
   specified in the forwarding db is down.  Additional backups (should
   the first-listed backup be down) are specified for each core in the
   core backup table. For example, if interface (tunnel) #2 were down,
   and the target core of a CBT control packet were core A, the core
   backup table suggests using interface #5 as a replacement. If inter-
   face #5 happened to be down also, then the same table recommends
   interface #2 as a backup for core A.
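
   The backup lookup described above can be sketched as follows. This
   is only an illustration under stated assumptions: the function name
   select_interface, the CORE_BACKUPS dictionary and the "up" set of
   operational interfaces are hypothetical and not part of the
   specification; the table contents are copied from the example
   configuration of router R.

```python
from typing import Optional, Set

# Core backup table from the example configuration of router R:
# core -> backup interfaces, in order of preference.
CORE_BACKUPS = {"A": [5, 2], "B": [3, 5], "C": [2, 4]}


def select_interface(preferred: int, core: str, up: Set[int]) -> Optional[int]:
    """Return the preferred interface if it is up; otherwise return the
    first listed backup for the core that is up, or None if none is."""
    if preferred in up:
        return preferred
    for intf in CORE_BACKUPS.get(core, []):
        if intf in up:
            return intf
    return None


# Tunnel #2 is down; a control packet targeted at core A falls back to #5.
print(select_interface(2, "A", up={1, 3, 4, 5}))  # 5
```

   Returning None when every candidate is down leaves the failure
   handling to the caller, which the text above does not specify.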

6.  CBT Packet Formats and Message Types

   CBT control packets are encapsulated in IP. CBT has been assigned IP
   protocol number 7 by IANA [4].

6.1.  CBT Common Control Packet Header

   All CBT control messages have a common fixed length header.

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  vers | type  |  addr len     |         checksum              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 1. CBT Common Control Packet Header

   This CBT specification is version 2.

   CBT packet types are:

   +o    type 0: HELLO

   +o    type 1: JOIN_REQUEST

   +o    type 2: JOIN_ACK

   +o    type 3: QUIT_NOTIFICATION

   +o    type 4: ECHO_REQUEST

   +o    type 5: ECHO_REPLY

   +o    type 6: FLUSH_TREE

   +o    type 7: Bootstrap Message

   +o    type 8: Candidate Core Advertisement

   +o    Addr Length: address length in bytes of unicast or multicast
        addresses carried in the control packet.

   +o    Checksum: the 16-bit one's complement of the one's complement
        sum of the entire CBT control packet.

6.2.  HELLO Packet Format

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    CBT Control Packet Header                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | rnd response  |  Preference   |  option type  |  option len   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         option value                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 2. HELLO Packet Format

   HELLO Packet Field Definitions:

   +o    rnd response: random response interval in seconds.

   +o    preference: sender's HELLO preference.

   +o    option type: the type of option present in the "option value"
        field.  One option type is currently defined: option type 0
        (zero) = BR_HELLO; option value 0 (zero); option length 0
        (zero). This option type is used with HELLO messages sent by a
        border router (BR) as part of designated BR election (see [5]).

   +o    option len: length of the "option value" field in bytes.

   +o    option value: variable length field carrying the option value.

6.3.  JOIN_REQUEST Packet Format

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    CBT Control Packet Header                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          group address                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        originating router                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           target router                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  option type  |  option len   |         option value          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 3. JOIN_REQUEST Packet Format

   JOIN_REQUEST Field Definitions

   +o    group address: multicast group address of the group being
        joined.  For a "wildcard" join (see [5]), this field contains
        the value of INADDR_ANY.

   +o    originating router: router that originated this JOIN_REQUEST.

   +o    target router: target (core) router for the group.

   +o    option type: allows the specification of a variety of
        JOIN_REQUEST options.  One option is currently defined: option
        type 0 (zero) = BR_JOIN; option length 0 (zero); option value 0
        (zero). This option is used by a CBT domain border router to
        join an internal core for all groups that map to that core. The
        state instantiated by a JOIN_REQUEST with this option set
        represents (*, core). For further details, see [5].

6.4.  JOIN_ACK Packet Format

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    CBT Control Packet Header                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          group address                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           target router                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  option type  |  option len   |         option value          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 4. JOIN_ACK Packet Format

   JOIN_ACK Field Definitions

   +o    group address: multicast group address of the group being
        joined.

   +o    target router: router (DR) that originated the corresponding
        JOIN_REQUEST.

6.5.  QUIT_NOTIFICATION Packet Format

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    CBT Control Packet Header                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          group address                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    originating child router                   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 5. QUIT_NOTIFICATION Packet Format

   QUIT_NOTIFICATION Field Definitions

   +o    group address: multicast group address of the group being
        joined.

   +o    originating child router: address of the router that originates
        the QUIT_NOTIFICATION.

6.6.  ECHO_REQUEST Packet Format

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    CBT Control Packet Header                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    originating child router                   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 6. ECHO_REQUEST Packet Format

   ECHO_REQUEST Field Definitions

   +o    originating child router: address of the router that originates
        the ECHO_REQUEST.

6.7.  ECHO_REPLY Packet Format

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    CBT Control Packet Header                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    originating parent router                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     group report interval     |        num groups             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       group address #1                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       group address #2                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           ......                              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       group address #n                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 7. ECHO_REPLY Packet Format

   ECHO_REPLY Field Definitions

   +o    originating parent router: address of the router that originates
        the ECHO_REPLY.

10.  CBT Packet Formats and Message Types

   We distinguish between two types of CBT packet: CBT mode data
   packets, and CBT control packets. CBT control packets carry a CBT
   control packet header.

   CBT control packets are encapsulated in IP, as illustrated below:

           +++++++++++++++++++++++++++++++
           | IP header | CBT control pkt |
           +++++++++++++++++++++++++++++++

   In CBT mode, the original data packet is encapsulated in a CBT header
   and an IP header, as illustrated below:

           ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
           | IP header | CBT header | original IP hdr | data .... |
           ++++++++++++++++++++++++++++++++++++++++++++++++++++++++

   The IP protocol field of the inner (original) IP header is used to
   demultiplex a packet correctly; CBT has been assigned IP protocol
   number 7.  The CBT module then demultiplexes based on the
   encapsulating CBT header's "type" field, thereby distinguishing
   between CBT control packets and CBT mode data packets.

   The CBT data packet header is illustrated below.

10.1.  CBT Header Format (for CBT Mode data)

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  vers |unused |      type     |   hdr length  | on-tree|unused|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         checksum              |      IP TTL   |     unused    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        group identifier                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        first-hop router                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          primary core                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  reserved     | reserved  |T|S|      Type     |     Length    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    .....Flow-id value.....                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     unused    |     unused    |      Type     |     Length    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    .....Security data.....                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 4. CBT Header

   Each of the fields is described below:

      +o    Vers: Version number -- this release specifies version 1.

      +o    type: indicates CBT payload; values are defined for control
           (0x00), and data (0xff). For the value 0x00 (control), a CBT
           control header is assumed present rather than a CBT header.

      +o    hdr length: length of the header, for purpose of checksum
           calculation.

      +o    on-tree: indicates whether the packet is on-tree (0xff) or
           off-tree (0x00).

      +o    checksum: the 16-bit one's complement of the one's complement
           of the CBT header, calculated across all fields.

      +o    IP TTL: the IP TTL value of the original multicast packet,
           set in the CBT header by the DR directly attached to the
           origin host (and decremented by CBT routers visited).

      +o    group identifier: multicast group address.

      +o    first-hop router: identifies the encapsulating router
           directly attached to the origin of a multicast packet. This
           field is relevant to source-migration of a core to the source
           (see Appendix A). It is set to NULL when core migration is
           disabled.

      +o    primary core: the primary core for the group, as identified
           by "group-id".  This field is necessary for the case where
           non-member senders happen to send to a secondary core, which
           may not yet be joined to the primary core. This field allows
           the secondary to know which is the primary for the group, so
           that the secondary can forward the (encapsulated) data
           onwards to the primary.

      +o    T bit: indicates the presence (1) or absence (0) of Type of
           Service/flow-id value ("type", "length", "type of
           service/flow-id").

      +o    S bit: indicates the presence (1) or absence (0) of a
           security value ("type", "length", "security data").

10.2.  Control Packet Header Format

   The individual fields are described below.

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  vers |unused |      type     |      code     |   # cores     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         hdr length            |            checksum           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        group identifier                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          group mask                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          packet origin                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       primary core address                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                   target core address (core #1)               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                             Core #2                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                             Core #3                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               ....                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  reserved     | reserved  |T|S|      Type     |     Length    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    type of service/flow-id                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     unused    |     unused    |      Type     |     Length    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    .....Security data.....                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 5. CBT Control Packet Header

      +o    Vers: Version number -- this release specifies version 1.

      +o    type: indicates control message type (see section 10.3).

      +o    code: indicates subcode of control message type.

      +o    # cores: number of core addresses carried by this control
           packet.

      +o    header length: length of the header, for purpose of checksum
           calculation.

      +o    checksum: the 16-bit one's complement of the one's complement
           of the CBT control header, calculated across all fields.

      +o    group identifier: multicast group address.

      +o    group mask: mask value for aggregated CBT joins/join-acks.
           Zero for non-aggregated joins/join-acks.

      +o    packet origin: address of the CBT router that originated the
           control packet.

      +o    primary core address: the address of the primary core for the
           group.

      +o    target core address: desired core affiliation of control
           message.

      +o    Core #N: IP address for each of a group's cores.

      +o    T bit: indicates the presence (1) or absence (0) of Type of
           Service/flow-id value ("type", "length", "type of
           service/flow-id").

      +o    S bit: indicates the presence (1) or absence (0) of a
           security value ("type", "length", "security data").

10.3.  CBT Control Message Types

   There are ten types of CBT message. All are encoded in the CBT
   control header, shown in figure 5.

      +o    JOIN-REQUEST (type 1): generated by a router and unicast to
           the specified core address. It is processed hop-by-hop on its
           way to the specified core. Its purpose is to establish the
           CBT router, and all intermediate CBT routers, as part of the
           corresponding delivery tree. Note that all cores for the
           corresponding group are carried in join-requests.

      +o    JOIN-ACK (type 2): an acknowledgement to the above. The full
           list of core addresses is carried in a JOIN-ACK, together
           with the actual core affiliation (the join may have been
           terminated by an on-tree router on its journey to the
           specified core, and the terminating router may or may not be
           affiliated to the core specified in the original join). A
           JOIN-ACK traverses the reverse of the path taken by the
           corresponding JOIN-REQUEST, with each CBT router on the path
           processing the ack. It is the receipt of a JOIN-ACK that
           actually "fixes" tree state.

      +o    JOIN-NACK (type 3): a negative acknowledgement, indicating
           that the tree join process has not been successful.

      +o    QUIT-REQUEST (type 4): a request, sent from a child to a par-
           ent, to be removed as a child of that parent.

      +o    QUIT-ACK (type 5): acknowledgement to the above. If the par-
           ent, or the path to it is down, no acknowledgement will be
           received within the timeout period.  This results in the
           child nevertheless removing its parent information.

      +o    FLUSH-TREE (type 6): a message sent from parent to all chil-
           dren, which traverses a complete branch. This message results
           in all tree interface information being removed from each
           router on the branch, possibly because of a re-configuration
           scenario.

      +o    CBT-ECHO-REQUEST (type 7): once a tree branch is established,
           this message acts as a "keepalive", and is unicast from
           child to parent (it can be aggregated from one per group to
           one per link; see section 4).

      +o    CBT-ECHO-REPLY (type 8): positive reply to the above.

      +o    CBT-BR-KEEPALIVE (type 9): applicable to border routers only.
           See [11] for more information.

      +o    CBT-BR-KEEPALIVE-ACK (type 10): acknowledgement to the above.

10.3.1.  CBT Control Message Subcodes

   The JOIN-REQUEST has three valid subcodes:

      +o    ACTIVE-JOIN (code 0) - sent from a CBT router that has no
           children for the specified group.

      +o    REJOIN-ACTIVE (code 1) - sent from a CBT router that has at
           least one child for the specified group.

      +o    REJOIN-NACTIVE (code 2) - generated by a router subsequent to
           receiving a join ack, subcode NORMAL, in response to an
           active-rejoin.

   A JOIN-ACK has three valid subcodes:

      +o    NORMAL (code 0) - sent by a core router, or on-tree router,
           acknowledging joins with subcodes ACTIVE-JOIN and REJOIN-
           ACTIVE.

      +o    PRIMARY-REJOIN-ACK (code 1) - sent by a primary core to
           acknowledge the receipt of a join-request received with sub-
           code REJOIN-ACTIVE. This message traverses the reverse-path
           of the corresponding re-join, and is processed by each router
           on that path.

      +o    PRIMARY-NACTIVE-ACK (code 2) - sent by a primary core to
           acknowledge the receipt of a join-request received with
           subcode REJOIN-NACTIVE.  This ack is unicast directly to the
           router that generated the rejoin-Nactive, i.e. the ack is not
           processed hop-by-hop.
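
   The subcode pairings above can be summarised in a small sketch. This
   is one reading of the text, not a definitive state machine: the names
   JoinCode, AckCode and ack_code_for are hypothetical, and two choices
   are assumptions rather than statements of the specification, namely
   that a primary core answers a REJOIN-ACTIVE with PRIMARY-REJOIN-ACK
   rather than NORMAL, and that a router other than the primary core
   does not acknowledge a REJOIN-NACTIVE itself.

```python
from enum import IntEnum
from typing import Optional


class JoinCode(IntEnum):
    ACTIVE_JOIN = 0      # sender has no children for the group
    REJOIN_ACTIVE = 1    # sender has at least one child for the group
    REJOIN_NACTIVE = 2   # follows a NORMAL ack to an active rejoin


class AckCode(IntEnum):
    NORMAL = 0
    PRIMARY_REJOIN_ACK = 1
    PRIMARY_NACTIVE_ACK = 2


def ack_code_for(join: JoinCode, at_primary_core: bool) -> Optional[AckCode]:
    """Pick the JOIN-ACK subcode for a received join, or None if this
    router is assumed not to acknowledge it."""
    if join == JoinCode.ACTIVE_JOIN:
        return AckCode.NORMAL
    if join == JoinCode.REJOIN_ACTIVE:
        # Assumption: the primary core prefers its dedicated subcode.
        return AckCode.PRIMARY_REJOIN_ACK if at_primary_core else AckCode.NORMAL
    if at_primary_core:
        return AckCode.PRIMARY_NACTIVE_ACK
    # Assumption: only the primary core acknowledges a REJOIN-NACTIVE.
    return None


print(ack_code_for(JoinCode.REJOIN_ACTIVE, at_primary_core=True).name)
```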

11.  CBT Protocol Number

   CBT has been assigned IP protocol number 7. CBT control messages are
   carried directly over IP.

12.  Default Timer Values

   There are several CBT control messages which are transmitted at fixed
   intervals. These values, retransmission times, and timeout values,
   are given below. Note these are recommended default values only, and
   are configurable with each implementation (all times are in seconds):

   +o    CBT-ECHO-INTERVAL 30 (time between sending successive CBT-ECHO-
        REQUESTs to parent).

   +o    PEND-JOIN-INTERVAL 5 (retransmission time for join-request if no
        ack rec'd)

   +o    PEND-JOIN-TIMEOUT 30 (time to try joining a different core, or
        give up)

   +o    EXPIRE-PENDING-JOIN 90 (remove transient state for join that has
        not been ack'd)

   +o    PEND_QUIT_INTERVAL 5 (retransmission time for quit-request if no
        ack rec'd)

   +o    CBT-ECHO-TIMEOUT 90 (time to consider parent unreachable)

   +o    CHILD-ASSERT-INTERVAL 90 (increment child timeout if no ECHO
        rec'd from a child)

   +o    CHILD-ASSERT-EXPIRE-TIME 180 (time to consider child gone)

   +o    IFF-SCAN-INTERVAL 300 (scan all interfaces for group presence.
        If none, send QUIT)

   +o    BR-KEEPALIVE-INTERVAL 200 (backup designated BR to designated BR
        keepalive interval)

   +o    BR-KEEPALIVE-RETRY-INTERVAL 30 (keepalive interval if BR fails
        to respond)
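
   As a rough illustration of how these defaults relate to one another,
   the sketch below derives two quantities from the table. The
   dictionary DEFAULTS and the helper intervals_within are hypothetical
   names; the arithmetic simply counts how many whole intervals fit into
   a timeout and is not a behaviour mandated by this document.

```python
# Recommended defaults from the list above (all values in seconds).
DEFAULTS = {
    "CBT-ECHO-INTERVAL": 30,
    "PEND-JOIN-INTERVAL": 5,
    "PEND-JOIN-TIMEOUT": 30,
    "CBT-ECHO-TIMEOUT": 90,
}


def intervals_within(timeout: int, interval: int) -> int:
    """Number of whole intervals that fit into the timeout."""
    return timeout // interval


# Join-request transmission slots before trying another core or giving up.
join_slots = intervals_within(DEFAULTS["PEND-JOIN-TIMEOUT"],
                              DEFAULTS["PEND-JOIN-INTERVAL"])

# Echo intervals that may elapse before the parent is considered unreachable.
echo_slots = intervals_within(DEFAULTS["CBT-ECHO-TIMEOUT"],
                              DEFAULTS["CBT-ECHO-INTERVAL"])

print(join_slots, echo_slots)  # 6 3
```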

13.  Interoperability Issues

   Interoperability between CBT and DVMRP has recently been defined in
   [11].

   Interoperability with other multicast protocols will be fully speci-
   fied as the need arises.

14.  CBT Security Architecture

   See [4].

Acknowledgements

   Special thanks go to Paul Francis, NTT Japan, for the original
   brainstorming sessions that brought about this work.

   Thanks too to Sue Thompson (Bellcore). Her detailed reviews led to
   the identification of some subtle protocol flaws, and she suggested
   several simplifications.

   Thanks also to the networking team at Bay Networks for their comments
   and suggestions, in particular Steve Ostrowski for his suggestion of
   using "native mode" as a router optimization, and Eric Crawley.

   Thanks also to Ken Carlberg (SAIC) for reviewing the text, and gener-
   ally providing constructive comments throughout.

   I would also like to thank the participants of the IETF IDMR working
   group meetings for their general constructive comments and sugges-
   tions since the inception of CBT.

APPENDICES

DISCLAIMER: As of writing, the mechanisms described in Appendices A and
B have not been tested, simulated, or demonstrated.

APPENDIX A

                   Dynamic Source-Migration of Cores

A.0 Abstract

   This appendix describes CBT protocol mechanisms that allow a CBT
   multicast tree, initially constructed around a randomly-placed set of
   core routers, to dynamically reconfigure itself in response to an
   active source, such that the CBT tree becomes rooted at the source's
   local CBT router. Henceforth, CBT emulates a shortest-path tree.

   For clarity, the mechanisms are described in the context of "flat"
   multicasting, but are transferable to a hierarchical model with only
   minor changes.

A.1 Motivation

   One of the criticisms levelled against shared tree multicast schemes
   is that they potentially result in sub-optimal routes between
   receivers. Another criticism is that shared trees incur a high traf-
   fic concentration effect on the core routers. Given that any shared
   tree is likely to have two, three, or more cores which can be strate-
   gically placed in the network, as well as the fact that any on-tree
   router can act as a "branch point" (or "exploder point"), shared tree
   traffic concentration can be significantly reduced.  This note never-
   theless addresses both of these criticisms by describing new mecha-
   nisms that

   +o    allow a CBT to dynamically transition from a random configura-
        tion to one where any CBT router can become a core - more pre-
        cisely, that which is local to a source, and...

   +o    remove the traffic concentration issue completely, as a result
        of the above; traffic concentration is not an issue with source-
        rooted trees.

   The mechanisms described here are relevant to non-concurrent sources;
   the concurrent-sender case is not addressed here, although experience
   with MBONE applications for the past several years suggests that most
   multicast applications are of the single, infrequently-changing
   sender type.  Also, it is not necessarily implied that the initial
   CBT tree must be transitioned. Any transition is an "all-or-nothing"
   transition, meaning that either all the tree transitions, or none of
   it does (footnote 6).

A.2 Goals & Requirements

   By means of the mechanisms described, this Appendix sets out to
   achieve the following:

   +o    provide mechanisms that allow the dynamic transition from an
        initial CBT, constructed around a pre-configured set of cores,
        to a CBT that is rooted at a core attached to a sender's local
        subnetwork. This is source-rooted tree emulation.

   +o    ensure that these mechanisms do not impact CBT's simplicity or
        scalability.

   +o    eliminate completely the traffic concentration issue from CBT.

   +o    eliminate the core placement/core advertisement problems.

   +o    ensure that the scheme is robust, such that if a source's local
        router (or link to it) should fail, the CBT reorganises itself
        and returns to its original configuration.

   +o    the mechanisms should provide the same even to non-member
        senders.

   The above incurs a few additional requirements on existing baseline
   CBT mechanisms described in this specification:

   +o    a new JOIN-REQUEST subcode, REVERSE-JOIN

   +o    a new JOIN-ACK subcode, REVERSE-ACK
_________________________
  6 This is the expected behaviour of PIM Sparse Mode; on receipt of
high-bandwidth traffic, most receivers' local routers will be
configured to transition to source trees.

   +o    a new JOIN-ACK subcode, CORE-MIGRATE

   +o    a "first-hop router" field needs to be included in the CBT data
        packet header.

   +o    a new message type:

        - SOURCE-NOTIFICATION

   +o    CBT-mode data encapsulation is required until the local CBT
        router connected to an active source receives a JOIN-REQUEST,
        whose "target core address" field is one of its own IP
        addresses.

   These new additions are explained in the next section.

A.3 Source-Tree Emulation Criteria

   CBT routers are configured with a lower-bound data-rate threshold
   that marks the expected boundary between low- and high-bandwidth
   traffic. CBT routers also monitor how long each sender has been
   sending. If this duration exceeds a pre-configured value (global
   across CBT), say 3 minutes, AND the data-rate threshold is exceeded,
   the CBT tree transitions such that receivers become joined to the
   "core" local to the source's subnet, i.e. the CBT tree becomes
   source-rooted, but nevertheless remains a CBT.
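
   The two-part test above can be sketched as follows. This is only an
   illustration: the names SenderStats and should_transition are
   hypothetical, and averaging the data rate over the whole sending
   period is an assumption, since this Appendix does not say how the
   rate is measured.

```python
from dataclasses import dataclass


@dataclass
class SenderStats:
    bytes_sent: int      # bytes observed from this sender so far
    active_secs: float   # how long the sender has been sending


def should_transition(stats, rate_threshold_bps, min_duration_secs=180.0):
    """Return True only when BOTH criteria hold: the sender has been
    active longer than the configured duration (3 minutes by default)
    AND its data rate exceeds the configured threshold."""
    if stats.active_secs <= 0:
        return False
    # Assumption: average rate over the whole sending period, in bits/s.
    avg_rate_bps = 8 * stats.bytes_sent / stats.active_secs
    return (stats.active_secs > min_duration_secs
            and avg_rate_bps > rate_threshold_bps)
```

   A sender that exceeds the rate threshold for only 100 seconds, or
   one that sends for 200 seconds below the threshold, would leave the
   tree unchanged under this sketch.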

A.4 Source-Migration Mechanisms
                                                    E o        o D
                                                       \     /
                                                        \   /
            L o                                          \ /
               \                                           o C
                \                  N                      /
                 \                                       /
                  \A(2)                            (1)B /
                   O===================================O
                   |                                   |
          M        |                                   |
                   |                                   |
                 K o                                   o H
                  /\                                  /\
                 /  \                                /  \
                /    \                              /    \
        s    J o      o I                        G o      o F
       ----------

   Key:  B = primary core
         A = secondary core
         s = sending host
         J = sending host's local DR
         M & N = network nodes not on original CBT tree

                       Figure A1: Original CBT Tree

   In figure A1, host s starts sending native mode multicast data. CBT
   router J encapsulates it as CBT mode, inserting its own IP address in
   the "first-hop router" field of the CBT mode data packet header. This
   data packet flows over the CBT tree.

   Note that tree migration can be disabled either by sending all
   packets in native mode, or by inserting a NULL value into the
   "first-hop router" field. Since the first-hop router is the original
   encapsulating router (data packets are always originated from hosts
   in native mode), the first-hop router knows whether the sender's data
   rate warrants activating the "first-hop router" field; for the
   purpose of the ensuing protocol description, we assume that it does.

   Any router on the tree that receives the CBT mode data packet
   inspects the "first-hop router" field of the CBT header and compiles
   a join-request to send to that router. To fully specify the join, it
   must consult its underlying unicast routing table(s) to find the best
   next-hop to the source's first hop router. That next hop will be
   either on or off the existing CBT tree for the group. If the next hop
   is off-tree, the join generated is given a subcode of ACTIVE-JOIN (as
   per CBT spec), and a "target core address" of the source's first hop
   router. The join is then forwarded and processed according to the CBT
   specification. The primary core, and the original core list, remain
   specified in their respective fields of the CBT control packet
   header.

   Using figure A1 to illustrate an example, node L's routing tables
   suggest that the best next-hop to J, the source's first hop router,
   is via node M, not yet on the tree. So, node L generates a join and
   forwards it to M, which forwards it to J. The join-ack (subcode NOR-
   MAL) returns to L via M on the reverse-path of the join. When the
   join-ack reaches L, L sends a QUIT-REQUEST to A, its old parent.  The
   shortest-path branch now exists, L-M-J.

   If the best next hop to the source's first hop router is via an
   existing on-tree interface, and that interface is the node's parent
   on the current tree, no further action need be taken, and no join
   need be sent towards the source's first hop router, J.

   However, the join's best next hop may be via an existing child
   interface - this is where the new join type, subcode REVERSE-JOIN,
   comes in. The purpose of this join type is simply to reverse the
   existing parent-child relationship between two adjacent on-tree
   routers; each end of the link between the two routers is re-labelled.
   This join must be acknowledged by means of a JOIN-ACK, subcode
   REVERSE-ACK.  A reverse-join is only ever sent over an existing child
   interface, i.e. from the current parent to the child that is about to
   become its parent.
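
   The three cases described so far (next hop off-tree, next hop is the
   current parent, next hop is a current child) can be summarised in a
   short sketch. The names Action and classify_next_hop are
   hypothetical, and neighbours are identified by node label rather
   than by interface, purely to keep the illustration small.

```python
from enum import Enum


class Action(Enum):
    ACTIVE_JOIN = "send JOIN-REQUEST, subcode ACTIVE-JOIN"
    NO_ACTION = "parent is already the best next hop"
    REVERSE_JOIN = "send JOIN-REQUEST, subcode REVERSE-JOIN"


def classify_next_hop(next_hop, parent, children):
    """Decide what an on-tree router does, given its unicast next hop
    towards the source's first hop router and its current tree state."""
    if next_hop == parent:
        return Action.NO_ACTION
    if next_hop in children:
        return Action.REVERSE_JOIN
    # The next hop is not on the tree for this group.
    return Action.ACTIVE_JOIN


# Nodes taken from figure A1; the next hops are those given in the text.
print(classify_next_hop("M", parent="A", children=set()))       # node L
print(classify_next_hop("J", parent="A", children={"J", "I"}))  # node K
print(classify_next_hop("B", parent="B", children={"F", "G"}))  # node H
```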

   Immediately after its reverse-join has been acknowledged, the node
   that sent the reverse-join labels its old parent interface as
   "pending child" and sets a timer on that interface. This is a delay
   timer, set at a default of 5 seconds, during which time a
   reverse-join is expected over that interface from the node's old
   parent. Should this timer expire, a REVERSE-ASSERT message is sent to
   the old parent (new child) to cause it to agree to the change in the
   parent-child relationship.  A REVERSE-ASSERT must be acknowledged
   (REVERSE-ASSERT-ACK). If, after (say) three retransmissions (at 5 sec
   intervals) no reverse-assert-ack has been received, a QUIT-REQUEST is
   sent to the old parent and the corresponding interface is removed
   from this node's current forwarding database.

   Of course, if a node has already received a reverse-join during the
   period one of its other interfaces was changing its parent-child
   relationship with another of its neighbours, then the pending-child
   delay timer need not be activated.
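
   A minimal sketch of the pending-child timer follows. It models one
   reading of the text: the first REVERSE-ASSERT is sent when the
   5 second delay timer expires, up to three retransmissions follow at
   5 second intervals, and the node gives up one interval after the
   last retransmission. The function name, its arguments, and the
   outcome labels are hypothetical.

```python
def pending_child_outcome(reverse_join_at=None, assert_ack_at=None,
                          delay=5.0, retransmissions=3):
    """Return (outcome, reverse_asserts_sent) for one pending-child
    interface. Times are seconds since the timer was started; None
    means the message never arrives."""
    # A reverse-join from the old parent before expiry settles things.
    if reverse_join_at is not None and reverse_join_at <= delay:
        return ("child-confirmed", 0)
    # Timer expired: one REVERSE-ASSERT, then the retransmissions.
    send_times = [delay * (i + 1) for i in range(retransmissions + 1)]
    give_up_at = send_times[-1] + delay
    if assert_ack_at is not None and delay < assert_ack_at <= give_up_at:
        sent = sum(1 for t in send_times if t < assert_ack_at)
        return ("child-confirmed", sent)
    # No acknowledgement at all: send a QUIT-REQUEST to the old parent
    # and drop the interface from the forwarding database.
    return ("quit-sent", len(send_times))
```

   With the defaults, an acknowledgement arriving 7 seconds after the
   timer was started is preceded by a single REVERSE-ASSERT, whereas a
   silent old parent receives four in total before the QUIT-REQUEST.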

   Looking at figure A1 again, here is how the parent-child
   relationships change on the tree when an active source, s, starts
   sending. Of course, nodes E, I, and L do not reverse any existing
   relationship, because they forge completely new paths towards the
   source's local router, J.

   K sends a reverse-join to J. J acks this with a join-ack, subcode
   REVERSE-ACK. At this point, J is K's parent, and I is still K's
   child.  K now sets the pending-child delay timer on its interface to
   A (K's old parent), and expects a reverse-join from A. If none has
   arrived by the time the delay timer has expired and several
   retransmissions of a reverse-assert control message have gone
   unacknowledged, K sends a quit to A (it sends a quit because A still
   believes K is its child) and removes the K-A interface from its CBT
   forwarding database.  However, assuming a reverse-join does arrive at
   K from A before the delay timer expires, K acks the reverse-join and
   cancels the delay timer on that interface.

   Next, let's consider CBT router (node) I. I's unicast routing table
   suggests it can reach J directly (next-hop) via a different interface
   than the I-K interface, so I sends a join-request, subcode active-
   join, to J, which acks it as normal. On receipt of the ack, I sends a
   quit to K and removes K as its parent from its database.

   Now let's consider node L. Like I, it finds a new path to J, via M,
   so simply sends a new join to J, via M, and on receipt of the join-
   ack, sends a quit to A, and removes A from its forwarding database.
   A new, shortest-path, branch now exists, J-M-L.

   Next let's consider A-B, the link between the cores. A is the
   secondary, and B is the primary, so A originally joined towards B.
   So, B sends a reverse-join to A. A sends a reverse-ack to B, so A is
   now B's parent, and B has children H and C. Note that the roles of
   primary and secondary are not affected - the target of B's join to A
   is the source's local router, J.

   The existing branches D-C-B, F-H-B, and G-H-B need not change any of
   their parent-child relationships, since each of these nodes' unicast
   routing tables indicates that the best next-hop for a join-request
   targeted at the source's local router, J, is via the corresponding
   existing parent.

   E sends a new join via N to J. On receipt of the join-ack, it sends a
   quit to C. A new branch has been created, E-N-J.

   Each node on the tree now has a shortest-path to J, the source's
   local CBT router. Hence, J is the root ("core") of a shortest-path
   multicast tree.
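
   The outcome of the walk-through can be checked mechanically. The
   sketch below records each node's parent after the transition, as
   stated in the preceding paragraphs, and follows parent pointers to
   confirm that every node reaches J without revisiting a node. It
   checks rootedness and loop-freedom only, not path optimality; the
   names PARENT_AFTER and path_to_root are hypothetical.

```python
# Parent of each router once the tree of figure A1 has transitioned.
PARENT_AFTER = {
    "K": "J", "I": "J", "M": "J", "N": "J",
    "A": "K", "L": "M", "E": "N",
    "B": "A", "C": "B", "H": "B",
    "D": "C", "F": "H", "G": "H",
}


def path_to_root(node, parent, root):
    """Follow parent pointers from node to root; raise on a loop or on
    a node that has no parent entry."""
    path, seen = [node], {node}
    while path[-1] != root:
        nxt = parent.get(path[-1])
        if nxt is None or nxt in seen:
            raise ValueError("no loop-free path from %s to %s" % (node, root))
        path.append(nxt)
        seen.add(nxt)
    return path


for router in sorted(PARENT_AFTER):
    print(router, "->", "-".join(path_to_root(router, PARENT_AFTER, "J")))
```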

   Note that these new mechanisms augment the CBT protocol, and the
   baseline CBT protocol engine is not affected in any way by this add-
   on mechanism.

A.5 Robustness Issues

   Some immediate questions might be:

   +o    what happens to the source-rooted tree if the source's local CBT
        router fails?

   +o    what happens if the source's local CBT router fails whilst the
        initial tree is transitioning?

   +o    what happens if the tree is partitioned, or not yet fully con-
        nected, when a source starts sending?

   +o    how do new receivers join an already-transitioned tree?

   All of these questions are now addressed:

   +o    What happens to the source-rooted tree if the source's local CBT
        router fails?

        A source-rooted CBT has a single point of failure - the root of
        the tree.

        In spite of a source being joined, the corelist (primary & sec-
        ondaries) is carried in CBT control packets, as per the CBT
        spec. However, the contents of the "target core address" field
        identifies the IP address of the source's local CBT router. So,
        in the event of a failure, the CBT routers still have all the
        information they need to rejoin the original tree, constructed
        around the corelist. Rejoining then, proceeds according to the
        rules of the CBT specification.

        Of course, rejoining the original tree happens only after sev-
        eral attempts have been made to rejoin the source's "core".

   +o    What happens if the source's local CBT router fails whilst the
        initial tree is transitioning?

        This really is no different to the above case. The parts of the
        tree that have transitioned will rejoin the original tree
        according to their corresponding corelist. Those parts of the
        tree in the process of transitioning may temporarily transition,
        but eventually those nodes will receive a FLUSH from a CBT
        router adjacent to the failed source router ("core"). They then
        rejoin the original tree.

   +o    What happens if the tree is partitioned, or not yet fully con-
        nected, when a source starts sending?

        The problem here is that some parts of the network (CBT tree)
        may not receive CBT encapsulated mode data packets before the
        source's local DR starts forwarding data in native mode, and so
        those receivers will not know the IP address of the local DR to
        join to.

        For example, assume a secondary core with downstream members
        cannot reach the primary. If the routers adjacent to the secon-
        daries are all functioning correctly, the secondaries themselves
        may not be aware that a partition has occurred somewhere further
        upstream. So, what if a source downstream from a secondary,
        starts sending data after the partition has happened?

        A new control message, the SOURCE-NOTIFICATION, is used to solve
        this problem. As soon as any core receives CBT mode encapsulated
        data, it caches the source "core" IP address, and starts
        multicasting (to the group) SOURCE-NOTIFICATION messages, one
        every minute. Source-notifications contain the IP address of the
        source's local DR. A core continues to multicast
        source-notifications at 1 minute intervals until the source has
        ceased transmitting data for more than 20 seconds.
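
        The timing rule can be illustrated with a small sketch. It
        assumes that the first SOURCE-NOTIFICATION is sent at the
        moment the first CBT mode data packet arrives, and that the
        20 second silence test is evaluated only when a notification
        is due; neither detail is stated in the text. The function
        name notification_times is hypothetical.

```python
def notification_times(data_times, interval=60.0, silence_limit=20.0):
    """Given the arrival times (seconds) of CBT mode data packets at a
    core, return the times at which it multicasts SOURCE-NOTIFICATIONs."""
    if not data_times:
        return []
    times = sorted(data_times)
    sent = []
    t = times[0]                      # assumption: notify immediately
    while True:
        last_data = max(d for d in times if d <= t)
        if t - last_data > silence_limit:
            break                     # source silent for too long
        sent.append(t)
        t += interval
    return sent


# A source sending one packet every 10 seconds from t=0 to t=130.
print(notification_times(range(0, 131, 10)))
```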

        Obviously, if a CBT is fully connected, the larger proportion of
        source-notifications will be redundant. However, the robustness
        the scheme provides justifies this cost.

        If an off-tree source begins sending data, which first hits the
        tree at a secondary core with no receivers attached, the
        secondary does not trigger a join towards the primary, but
        instead just unicasts the data, in CBT mode, to the primary (as
        per CBT spec). The primary then forwards the data over any con-
        nected tree branches. Receivers can then begin transitioning. In
        this way, a transitioned CBT tree extends to the first hop
        router of a non-member sender.

        Note that cores and on-tree routers react to active sources
        only if they have an existing CBT forwarding database for the
        said group. For example, a primary core would not establish a
        shortest-path branch to a non-member sender unless it has at
        least one existing child registered for the corresponding group.

   +o    How do new receivers join an already-transitioned CBT?

        New receivers will always attempt to join one of the cores in
        the corelist for a group. Two things can happen here. In the
        first scenario, a new join, targeted at one of the cores in the
        corelist, eventually reaches that target core. In the second,
        the new join hits a router already established on-tree, but the
        router encountered is now joined to the source tree (source
        "core").

        For the first scenario, all on-tree routers and all core routers
        maintain the address of which upstream core their CBT branch
        actually emanates from (as per CBT spec). When a new join
        arrives at one of the original cores, the core checks whether
        its own current core affiliation is to a core outside the
        corelist set. If so, that core is a source "core", so the core
        responds to the new join with a JOIN-ACK, subcode CORE-MIGRATE.
        This join-ack contains the address of the active source "core".
        This join-ack causes a join-request to be issued by one of the
        routers that receives it - the router whose path to the core
        (just joined) diverges from that to the source "core"; this can
        easily be gleaned from unicast routing.  The router then simply
        directs its new join at the source "core", and on receipt of the
        join-ack, sends a quit to its now "old" parent.

        For the second case, the solution is trivial; any on-tree router
        receiving a join targeted either at one of the original cores
        for the group, or the active source "core", simply acks (subcode
        NORMAL) the join and includes in the ack the source "core"
        affiliation (as per CBT spec).
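
        The first scenario reduces to a membership test on the core's
        current affiliation. The sketch below is illustrative only:
        the function name, the dictionary layout of the
        acknowledgement, and the example addresses are all
        hypothetical and do not reflect the CBT packet format.

```python
def ack_for_new_join(core_affiliation, corelist):
    """Build the JOIN-ACK an original core returns for a new join.

    core_affiliation: address of the core this core's own branch
                      currently emanates from.
    corelist:         addresses of the group's configured cores.
    """
    if core_affiliation not in corelist:
        # Affiliated to a core outside the corelist: that core must be
        # a source "core", so tell the joiner where to migrate.
        return {"type": "JOIN-ACK", "subcode": "CORE-MIGRATE",
                "source_core": core_affiliation}
    return {"type": "JOIN-ACK", "subcode": "NORMAL"}


corelist = ["192.0.2.1", "192.0.2.2"]     # example addresses only
print(ack_for_new_join("198.51.100.7", corelist))
print(ack_for_new_join("192.0.2.1", corelist))
```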

A.6 Loops

   It may seem that the potential for a transitioning tree to form
   loops, especially in the presence of reverse-joins, is greatly
   increased.  This is probably NOT the case; "reversed branches" are
   those that are already part of a loop-free tree that CBT constructs
   around the original set of cores. Transitioned trees are just CBTs
   whose core is located at the source's local router. Loops are no more
   likely with these mechanisms than they are with baseline CBT. Note
   that these are assertions - formal proofs may be more appropriate.

   APPENDIX B

                          Group State Aggregation

B.1 Introduction

   Although the scalability of shared tree multicast schemes is attrac-
   tive now, to scale over the longer-term, a combination of hierarchy
   (support mechanisms that facilitate domain-oriented multicasting),
   and group aggregation strategies, is required.  If IP multicast is to
   have a long-term future in the Internet as a global transport mecha-
   nism, by far the most serious challenge is to address the issue of
   group state aggregation.

   Shared trees were developed partly to address scalability with
   regard to multicast state maintained in the network, and they reduce
   that state by a factor of the number of active sources (a source
   being a subnetwork aggregate).  However, it is perceived that the
   number of sources sending to any one group will not grow as fast as
   the number of groups; indeed, the latter will probably grow several
   orders of magnitude faster [12]. Therefore, it is essential to
   contain this potential problem, particularly for the benefit of
   routers on wide-area links, by designing an effective group state
   aggregation mechanism, capable of collapsing group state.

   Unlike unicast addresses, multicast addresses cannot be aggregated
   according to topological locality; multicast addresses are truly
   location-independent. Thus, it would not seem obvious how the problem
   can be addressed - clearly, it must be looked at in a different way.

   To be effective, group aggregation must be both flexible and
   efficient; an aggregation scheme must be able to accommodate groups
   with wide-ranging characteristics in the least constraining way
   possible.  For example, there is a trend towards small, non-local
   groups (e.g. 4 or 5 person audio/video conferences between different
   user groups spread over different countries/continents); it is these
   types of groups that are likely to result in an explosive growth in
   state.  Also, these groups will, in all likelihood, utilize multicast
   addresses that are randomly spread across the multicast address
   space, making aggregation seemingly more difficult. An aggregation
   scheme must therefore account for this.

B.2 Design Overview

   This scheme involves replacing a subset of individual tree state
   present on inter-domain links, and aggregating it over a single
   shared tree. The scheme does not yet specify how candidate groups for
   aggregation are arrived at, but an obvious scheme would be to
   aggregate already-overlapping distribution trees. The pivotal idea
   behind this approach encompasses two inter-dependent strategies:

   +o    administratively defining a portion of the multicast address
        space for aggregate groups. For brevity, an example might be the
        range 238.0.0.0 - 238.255.255.255.

   +o    associated with each aggregate group address is a mask,
        specifying the portion of the address that is used to identify
        the aggregate group itself (the portion covered by the mask);
        with the remaining address space used as an index to an ordered
        list of groups with which the aggregate address is associated.
        The ordered list and its association with a group aggregate
        address is conveyed by means of a protocol message (TBD). The
        index is used to de-aggregate at region boundaries (border
        routers).

   The scheme subscribes to the notion of aggregation-on-demand; a
   border router (BR) is configured with a threshold number of groups on
   a BR's external interface, above which it begins to solicit
   aggregations periodically, say once every hour.

   As an example, say BR 123 wishes to aggregate 200 groups. BR 123
   randomly chooses (or by some address allocation algorithm) a group
   aggregate address.  It has been established that the number of groups
   for which aggregation is desired is 200. The nearest power of 2 value
   above 200 is 256 (2^8), and so the aggregate mask covers 24 bits,
   leaving 8 bits to specify each individual group's traffic flowing
   over the aggregate tree.

   So we have:

         Group aggregate address: 238.10.12.0

         Group aggregate mask:    238.10.12/24

   A data packet for the 30th listed group (listed in a protocol message
   (TBD) as described above) would be addressed to: 238.10.12.30.

   Similarly, a data packet pertaining to the 150th listed group would
   be addressed to: 238.10.12.150, and so on.

   All routers comprising the aggregate tree need only maintain the
   group aggregate address and mask, together with the aggregate tree's
   associated interfaces. If a number of individual shared trees have
   been replaced by an aggregate tree, then the core routers (RPs) of
   each of those shared trees must additionally maintain the complete
   list of groups associated with an <aggregate address/mask-len> so as
   to be able to "re-direct" any incoming joins for already aggregated
   groups.  Similarly, border routers (BRs) incur the storage cost of
   maintaining the individual groups associated with an <aggregate
   address/mask-len>, so as to be able to aggregate and de-aggregate as
   data packets flow across a (sub)region's border.

B.3 Scaling Further

   The scheme described can be applied recursively (to border routers)
   to accommodate a hierarchy containing an arbitrary number of levels.

   The scheme described imposes two general requirements (or
   assumptions):

   +o    a well defined aggregate group address space for each level of
        hierarchy (or scope levels).

   +o    the ability to arbitrarily create boundaries in multicast
        routers, thereby separating different hierarchical levels.

   The former will require consensus within the IETF and approval from
   the IANA. The latter capability is already available in multicast
   routers; boundaries are specified in a multicast router's
   configuration file.  This capability is currently available in the
   best known multicast routing protocols: DVMRP, M-OSPF, PIM, and CBT.

   Defining boundaries may require some degree of coordination; whenever
   a particular scoped level (boundary) is introduced which has multiple
   entry/exit routers, these must all be configured such that their
   boundary definitions are identical, i.e. they must each be configured
   with the same boundary-address/mask (the range 239.0.0.0 -
   239.255.255.255 is the IANA-defined multicast boundary address
   range).

        the router originating this ECHO_REPLY.

   +o    group report interval: number of seconds until the sending
        router will send its next ECHO_REPLY containing a list of group
        addresses.

   +o    num groups: the number of groups being reported by this
        ECHO_REPLY.

   +o    group address: a list of group addresses for which this router
        considers itself a parent router w.r.t. the link over which this
        message is sent.

6.8.  FLUSH_TREE Packet Format

       0               1               2               3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    CBT Control Packet Header                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         group address                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 8. FLUSH_TREE Packet Format

   FLUSH_TREE Field Definitions

   +o    group address: multicast group address being "flushed".

7.  Core Router Discovery

   For intra-domain core discovery, CBT has decided to adopt the
   "bootstrap" mechanism currently specified with the PIM sparse mode
   protocol [2]. This bootstrap mechanism is scalable, robust, and does
   not rely on underlying multicast routing support to deliver core
   router information; this information is distributed via traditional
   unicast hop-by-hop forwarding.

   It is expected that the bootstrap mechanism will be specified
   independently as a "generic" RP/Core discovery mechanism in its own
   separate document. It is unlikely at this stage that the bootstrap
   mechanism will be appended to a well-known network layer protocol,
   such as IGMP [3], though this would facilitate its ubiquitous
   (intra-domain) deployment. Therefore, each multicast routing protocol
   requiring the bootstrap mechanism must implement it as part of the
   multicast routing protocol itself.

   A summary of the operation of the bootstrap mechanism follows
   (details are provided in [7]). It is assumed that all routers within
   the domain implement the "bootstrap" protocol, or at least forward
   bootstrap protocol messages.

   A subset of the domain's routers are configured to be CBT candidate
   core routers. Each candidate core router periodically (default every
   60 secs) advertises itself to the domain's Bootstrap Router (BSR),
   using "Core Advertisement" messages.  The BSR is itself elected
   dynamically from all (or participating) routers in the domain.  The
   domain's elected BSR collects "Core Advertisement" messages from
   candidate core routers and periodically advertises a candidate core
   set (CC-set) to each other router in the domain, using traditional
   hop-by-hop unicast forwarding. The BSR uses "Bootstrap Messages" to
   advertise the CC-set. Together, "Core Advertisements" and "Bootstrap
   Messages" comprise the "bootstrap" protocol.

   When a router receives an IGMP host membership report from one of its
   directly attached hosts, the local router uses a hash function on the
   reported group address, the result of which is used as an index into
   the CC-set. This is how local routers discover which core to use for
   a particular group.

   Note the hash function is specifically tailored such that a small
   number of consecutive groups always hash to the same core.
   Furthermore, bootstrap messages can carry a "group mask", potentially
   limiting a CC-set to a particular range of groups. This can help
   reduce traffic concentration at the core.

   If a BSR detects a particular core as being unreachable (it has not
   announced its availability within some period), it deletes the
   relevant core from the CC-set sent in its next bootstrap message.
   This is how a local router discovers a group's core is unreachable;
   the router must re-hash for each affected group and join the new core
   after removing the old state. The removal of the "old" state follows
   the sending of a QUIT_NOTIFICATION upstream, and a FLUSH_TREE message
   downstream.

7.1.  Bootstrap Message Format

        0               1               2               3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |             CBT common control packet header                  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |      For full Bootstrap Message specification, see [7]        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 9. Bootstrap Message Format

7.2.  Candidate Core Advertisement Message Format

        0               1               2               3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              CBT common control packet header                 |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |   For full Candidate Core Adv. Message specification, see [7] |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 10. Candidate Core Advertisement Message Format

8.  Interoperability Issues

   Interoperability between CBT and DVMRP is specified in [5].

   Interoperability with other multicast protocols will be fully
   specified as the need arises.

Acknowledgements

   Special thanks go to Paul Francis, NTT Japan, for the original
   brainstorming sessions that brought about this work.

   Others that have contributed to the progress of CBT include Ken
   Carlberg, Eric Crawley, Nitin Jain, Steven Ostrowsksi, Radia Perlman,
   Scott Reeve, Clay Shields, Sue Thompson, Paul White.

   The participants of the IETF IDMR working group have provided useful
   feedback since the inception of CBT.

Author Information:

   Tony Ballardie,
   Department of Computer Science,
   University College London,
   Gower Street,
   London, WC1E 6BT,
   ENGLAND, U.K.

   Tel: ++44 (0)71 419 3462
   e-mail: A.Ballardie@cs.ucl.ac.uk

   Scott Reeve, Nitin Jain,
   Bay Networks, Inc.
   3, Federal Street,
   Billerica, MA 01821,
   USA.

   Tel: ++1 508 670 8888
   e-mail: {sreeve, njain}@BayNetworks.com

   References

  [1] T. Pusateri. Distance Vector Multicast Routing Protocol. Working
  draft, June 1996. (draft-ietf-idmr-dvmrp-v3-01.{ps,txt}).

  [2] J. Moy. Multicast Routing Extensions to OSPF. Communications of
  the ACM, 37(8): 61-66, August 1994. Also RFC 1584, March 1994.

  [3] D. Farinacci, S. Deering, D. Estrin, and V. Jacobson. Protocol
  Independent Multicast (PIM) Dense-Mode Specification. Working draft,
  July 1996.  (draft-ietf-idmr-pim-dm-spec-02.{ps,txt}).

  [4a] A. Ballardie. Core Based Tree (CBT) Multicast Architecture.
  Working draft, July 1996. (draft-ietf-idmr-cbt-arch-04.txt)

  [4] A. J. Ballardie. Scalable Multicast Key Distribution; RFC 1949,
  SRI Network Information Center, 1996.

  [5] A. J. Ballardie. "A New Approach to Multicast Communication in a
  Datagram Internetwork", PhD Thesis, 1995. Available via anonymous ftp
  from: cs.ucl.ac.uk:darpa/IDMR/ballardie-thesis.ps.Z.

  [6] W. Fenner. Internet Group Management Protocol, version 2 (IGMPv2).
  Working draft, May 1996. (draft-idmr-igmp-v2-03.txt).

  [7] B. Cain, S. Deering, A. Thyagarajan. Internet Group Management
  Protocol Version 3 (IGMPv3) (draft-cain-igmp-00.txt).

  [8] M. Handley, J. Crowcroft, I. Wakeman. Hierarchical Rendezvous
  Point proposal, work in progress.
  (http://www.cs.ucl.ac.uk/staff/M.Handley/hpim.ps) and
  (ftp://cs.ucl.ac.uk/darpa/IDMR/IETF-DEC95/hpim-slides.ps).

  [9] D. Estrin et al. USC/ISI, Work in progress.
  (http://netweb.usc.edu/pim/).

  [10] D. Estrin et al. PIM Sparse Mode Specification.  Working draft,
  July 1996. (draft-ietf-idmr-pim-sparse-spec-04.{ps,txt}).

  [11] A. Ballardie. CBT - Dense Mode Interoperability: Border Router
  Specification. Also available from:
  ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-cbt-dm-interop-XX.txt

  [12] S. Deering. Private communication, August 1996.

  [1] Core Based Trees (CBT) Multicast Routing Architecture;
  A. Ballardie; ftp://ds.internic.net/internet-drafts/draft-ietf-idmr-
  cbt-arch-**.txt.  Working draft, 1997.

  [2] Protocol Independent Multicast (PIM) Sparse Mode/Dense Mode; D.
  Estrin et al; ftp://netweb.usc.edu/pim   Working drafts, 1996.

  [3] Internet Group Management Protocol, version 2 (IGMPv2); W. Fenner;
  ftp://ds.internic.net/internet-drafts/draft-ietf-idmr-igmp-v2-**.txt.
  Working draft, 1996.

  [4] Assigned Numbers; J. Reynolds and J. Postel; RFC 1700, October
  1994.

  [5] CBT Border Router Specification for Interconnecting a CBT Stub
  Region to a DVMRP Backbone; A. Ballardie;
  ftp://ds.internic.net/internet-drafts/draft-ietf-idmr-cbt-
  dvmrp-**.txt.  Working draft, March 1997.

  [6] Scalable Multicast Key Distribution; A. Ballardie; RFC 1949, July
  1996.

  [7] A Dynamic Bootstrap Mechanism for Rendezvous-based Multicast
  Routing; D. Estrin et al.; Technical Report; ftp://catarina.usc.edu/pim

Author Information:

   Tony Ballardie,
   Research Consultant,

   e-mail: ABallardie@acm.org