draft-ietf-idmr-cbt-spec-04.txt   draft-ietf-idmr-cbt-spec-05.txt 
Inter-Domain Multicast Routing (IDMR) A. J. Ballardie Inter-Domain Multicast Routing (IDMR) A. J. Ballardie
INTERNET-DRAFT University College London INTERNET-DRAFT University College London
S. Reeve S. Reeve
Bay Networks, Inc. Bay Networks, Inc.
N. Jain N. Jain
Bay Networks, Inc. Bay Networks, Inc.
February 9th, 1996 April, 1996
Core Based Trees (CBT) Multicast Core Based Trees (CBT) Multicast
-- Protocol Specification -- -- Protocol Specification --
<draft-ietf-idmr-cbt-spec-04.txt> <draft-ietf-idmr-cbt-spec-05.txt>
Status of this Memo Status of this Memo
This document is an Internet Draft. Internet Drafts are working do- This document is an Internet Draft. Internet Drafts are working do-
cuments of the Internet Engineering Task Force (IETF), its Areas, and cuments of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups. Note that other groups may also distribute its Working Groups. Note that other groups may also distribute
working documents as Internet Drafts. working documents as Internet Drafts.
Internet Drafts are draft documents valid for a maximum of six Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by months. Internet Drafts may be updated, replaced, or obsoleted by
skipping to change at page 1, line 37 skipping to change at page 1, line 38
Drafts as reference material or to cite them other than as a "working Drafts as reference material or to cite them other than as a "working
draft" or "work in progress." draft" or "work in progress."
Please check the I-D abstract listing contained in each Internet Please check the I-D abstract listing contained in each Internet
Draft directory to learn the current status of this or any other Draft directory to learn the current status of this or any other
Internet Draft. Internet Draft.
Abstract Abstract
This document describes the Core Based Tree (CBT) network layer mul- This document describes the Core Based Tree (CBT) network layer mul-
ticast protocol specification. CBT is a next-generation multicast ticast protocol. CBT is a next-generation multicast protocol that
protocol that makes use of a shared delivery tree rather than makes use of a shared delivery tree rather than separate per-sender
separate per-sender trees utilized by most other multicast schemes trees utilized by most other multicast schemes [1, 2, 3].
[1, 2, 3].
This specification includes a description of an optimization whereby
native IP-style multicasts are forwarded over tree branches as well
as subnetworks with group member presence. This mode of operation
will be called CBT "native mode" and obviates the need to encapsulate
data packets before forwarding over CBT tree interfaces. Native mode
is only relevant to CBT-only domains or ``clouds''. Also included are
some new "data-driven" features.
A special authors' note is included explaining the latest updates to This specification includes an optimization whereby unencapsulated
the CBT specification, together with some nomenclature, and miscel- (native) IP-style multicasts are forwarded by CBT, resulting in very
laneous items. good forwarding performance. This mode of operation is called CBT
"native mode". Native mode can only be used in CBT-only domains or
"clouds".
This document is progressing through the IDMR working group of the This document is progressing through the IDMR working group of the
IETF. The CBT architecture is described in an accompanying document: IETF. The CBT architecture is described in an accompanying document:
ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-arch-00.txt. Other ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-arch-03.txt. Other
related documents include [4, 5]. For all IDMR-related documents, see related documents include [4, 5]. For all IDMR-related documents, see
http://www.cs.ucl.ac.uk/ietf/idmr. http://www.cs.ucl.ac.uk/ietf/idmr.
1. Authors' Note 1. Changes since Previous Revision (04)
The purpose of this note is to explain how the CBT protocol has
evolved since the previous version (November 1995).
Since the previous release, CBT has been assigned official IP proto-
col and UDP port numbers (section 8).
The CBT designers have constantly been seeking to streamline the pro-
tocol and seek new mechanisms to simplify the group initiation pro-
cedure. Especially, it has been a high priority to ensure that join
latency be kept to an absolute minimum. The November '95 draft intro-
duced the re-invented subnet designated router (DR) election pro-
cedure, described here in section 2.3.
The concept of proxy-ACKs was introduced in the November '95 draft,
but these have been removed since the extra message overhead does not
warrant the negligible gain they provide.
The CBT loop detection mechanism (comprising rejoin-active and This note summarizes the changes to this document since the previous
rejoin-nactive) has been slightly modified, and is now simpler and revision (revision 04).
more straightforward. The revised mechanism incorporates a new join
ack subcode, and is explained in section 5.3.
Core selection, placement, and management, which have prevented sim- + inclusion of a "group mask" field for aggregated joins/join-acks
ple group initiation/joining, apparent in data-driven schemes (like (sections 10.2, 8.1, and Appendix A).
DVMRP), have been separated out from the protocol itself. Core
management is not a problem unique to CBT; it also affects PIM-Sparse Mode.
Separate, protocol-independent core management mechanisms are
currently being proposed/developed [8, 9]. In the absence of core
management/distribution protocol, the task could be manually handled
by network management facilities.
In CBT, the core routers for a particular group are categorised into + removal of the term "Group DR (G-DR)", which was only a "token"
PRIMARY CORE, and NON-PRIMARY (secondary) CORES. identity.
The core tree, the part of a tree linking all core routers together, + more complete explanation of the use of CBT's IP protocol and
is built on-demand (section 2.4). That is, the core tree is only UDP port numbers (section 11).
built subsequent to a non-primary core receiving a join-request
(non-primary core routers join the primary core router -- the primary
need do nothing). Join-requests carry an ordered list of core routers
(and the identity of the primary core in its own separate field),
making it possible for the non-primary cores to know where to join.
On-demand core tree building is explained as part of section 2.4.
CBT now supports the aggregation of neighbour keepalives, which pre- + more complete explanation of non-member sender case (section 6).
viously were sent on a per group basis. Any two adjacent CBT routers
need only send a single keepalive between each other, rather than one
per group. Additional aggregation strategies are currently being
worked on, and we present some ideas on aggregated rejoins in Appen-
dix A. An updated draft fully specifying CBT aggregation strategy
should appear soon.
The end result of these developments is that the CBT protocol is much + the term FIB (forwarding information base) has been replaced
simplified and more efficient. throughout with the term "forwarding database (db)".
2. Protocol Specification + editorial changes throughout for extra clarity.
2.1. CBT Group Initiation Finally, in keeping with CBT's tradition of simplicity, this revision
is 1 page less than the previous revision :-) .
The requirement of hosts to discover the identity of candidate core 2. Some Terminology
routers (or RPs) differentiates the role of hosts in shared tree mul-
ticast protocols and shortest-path tree multicast protocols; the
latter need only announce their desire to join a group by means of an
IGMP membership report. It is highly desirable that hosts wishing to
join a shared tree need only do the same, leaving local multicast
routers to discover <core, group> mappings, or have local routers
configured with the identity of core(s) in the next level of a
hierarchy, as suggested by Hierarchical PIM [8].
If the latter approach is eventually adopted by the IETF, then host In CBT, the core routers for a particular group are categorised into
operations need not differ due to the type of multicast tree being PRIMARY CORE, and NON-PRIMARY (secondary) CORES.
joined, and indeed, the type of tree being joined for a particular
group can remain transparent to the host.
If the latter approach is not adopted, then hosts need to inform The "core tree" is the part of a tree linking all core routers of a
their local multicast router of a <core, group> mapping for each particular group together.
group joined. This requires hosts to discover <core, group> mappings,
which in turn requires the existence of a (global) core advertisement
protocol. Hosts subsequently need a means of advertising <core,
group> mappings to the local multicast router so it can initiate a
join. This requires an extension to IGMP, for example, the presence
of IGMP RP/Core Reports, as suggested in IGMP version 3 [7], or the
protocol itself must provide a means (message) for advertising cores
to the local router. In the absence of H-PIM, some similar mechanism,
or IGMPv3, CBT implementors may wish to extend CBT to include a core
reporting message for group initiators/joiners (for example, whenever
a group is initiated/joined, a configuration file is read which holds
<core, group> mappings).
Alternatively, <core, group> mappings can be downloaded to local mul- 3. Protocol Specification
ticast routers by means of network management tools.
2.2. Tree Joining Process -- Overview 3.1. Tree Joining Process -- Overview
A local CBT router is notified, by IGMP, of a host's desire to join a A CBT router is notified of a local host's desire to join a group via
group. If more than one CBT router is present on the subnetwork, each IGMP [6]. We refer to a CBT router with directly attached hosts as a
will receive the IGMP membership report. However, only one, the "leaf CBT router", or just "leaf" router.
default subnet designated router (DEFAULT DR) will act upon the
receipt of a report by initiating a CBT join. Note, a CBT join is
only initiated if the subnetwork is not yet part of the delivery
tree. Also, we assume that the local CBT default DR discovers <core,
group> mappings by one of the mechanisms described in the previous
section. DR election is described in section 2.3.
The following CBT control messages come into play subsequent to the The following CBT control messages come into play subsequent to a
host sending an IGMP join (host membership report): subnet's CBT leaf router receiving an IGMP membership report (also
termed "IGMP join"):
+ JOIN_REQUEST + JOIN_REQUEST
+ JOIN_ACK + JOIN_ACK
A join-request is generated by a locally-elected DR (see next If the CBT leaf router is the subnet's default designated router (see
section) in response to receiving an IGMP group membership report next section), it generates a CBT join-request in response to receiv-
from a directly connected host. The join is sent to the next-hop on ing an IGMP group membership report from a directly connected host.
the path to the target core, as specified in the join packet. The The CBT join is sent to the next-hop on the unicast path to a target
join is processed by each such hop on the path to the core, until core, specified in the join packet; a router elects a "target core"
based on a static configuration. If, on receipt of an IGMP-join, the
locally-elected DR has already joined the corresponding tree, then it
need do nothing more with respect to joining.
The join is processed by each such hop on the path to the core, until
either the join reaches the target core itself, or hits a router that either the join reaches the target core itself, or hits a router that
is already part of the corresponding distribution tree (as identified is already part of the corresponding distribution tree (as identified
by the group address). In both cases, the router concerned terminates by the group address). In both cases, the router concerned terminates
the join, and responds with a join-ack, which traverses the reverse- the join, and responds with a join-ack, which traverses the reverse-
path of the corresponding join. This is possible due to the transient path of the corresponding join. This is possible due to the transient
path state created by a join traversing a CBT router. The ack fixes path state created by a join traversing a CBT router. The ack fixes
that state. that state.
2.3. DR Election 3.2. DR Election
Multiple CBT routers may be connected to a multi-access subnetwork. Multiple CBT routers may be connected to a multi-access subnetwork.
In such cases it is necessary to elect a (sub)network designated In such cases it is necessary to elect a subnetwork designated router
router (DR) that is responsible for sending IGMP host membership (D-DR) that is responsible for generating and sending CBT joins
queries, and generating join-requests in response to receiving IGMP upstream, on behalf of the subnetwork.
group membership reports. Such joins are forwarded upstream by the
DR.
The IGMP querier election is as follows (note, here we talk about CBT DR election happens "on the back" of IGMP [6]; on a subnet with
"CBT routers", but the described mechanism also applies to the gen- multiple multicast routers, an IGMP "querier" is elected as part of
eral case). At start-up, a CBT router assumes it is the only CBT- IGMP; at start-up, a multicast router assumes no other multicast
capable router on its subnetwork. It therefore sends two IGMP-HOST- routers are present on its subnetwork, and so begins by believing it
MEMBERSHIP-QUERYs in short succession (within 5 secs) (for is the subnet's IGMP querier. It sends a small number of IGMP-HOST-
robustness) in order to quickly learn about any group memberships on the MEMBERSHIP-QUERYs in short succession in order to quickly learn about
subnet. If other CBT routers are present on the same subnet, they any group memberships on the subnet. If other multicast routers are
will receive these IGMP queries, and depending on which router was present on the same subnet, they will receive these IGMP queries; a
already the elected querier, yield querier duty to the new router iff multicast router yields querier duty as soon as it hears an IGMP
the new router is lower-addressed. If it is not, then the newly- query from a lower-addressed router on the same subnetwork.
started CBT router will yield when it hears a query from the already
established querier.
The CBT DEFAULT DR (D-DR) is always (footnote 1) the subnet's IGMP- The CBT default DR (D-DR) is always (footnote 1) the subnet's IGMP-
querier. As a result, there is no protocol overhead whatsoever asso-
ciated with electing a CBT D-DR.
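The election rule above can be sketched as follows. This is only an illustrative model, not part of the specification: the class and function names and the addresses are assumptions, and a single round of queries stands in for the real IGMP timers.

```python
import ipaddress


class MulticastRouter:
    """One multicast router on a shared subnet (addresses are made up)."""

    def __init__(self, address):
        self.address = ipaddress.IPv4Address(address)
        # At start-up a router believes it is the subnet's IGMP querier.
        self.is_querier = True

    def hear_query(self, sender_address):
        # Yield querier duty on hearing a query from a lower-addressed router.
        if ipaddress.IPv4Address(sender_address) < self.address:
            self.is_querier = False


def run_election(routers):
    """Each router that still believes it is querier sends one query,
    which every other router on the subnet hears."""
    for sender in routers:
        if not sender.is_querier:
            continue
        for other in routers:
            if other is not sender:
                other.hear_query(str(sender.address))
    return [r for r in routers if r.is_querier]


subnet = [MulticastRouter(a) for a in ("10.0.0.3", "10.0.0.1", "10.0.0.2")]
winners = run_election(subnet)
d_dr = winners[0]  # the elected querier doubles as the CBT D-DR
```

Because the lowest-addressed router never yields and every other router yields once it hears that router's query, exactly one querier (and hence one D-DR) remains.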
3.3. Tree Joining Process -- Details
The receipt of an IGMP group membership report by a CBT D-DR for a
CBT group not previously heard from triggers the tree joining pro-
cess; the D-DR unicasts a JOIN-REQUEST to the first hop on the (uni-
cast) path to the target core specified in the CBT join packet.
Each CBT-capable router traversed on the path between the sending DR
and the core processes the join. However, if a join hits a CBT router
that is already on-tree (footnote 2), the join is not propagated
further, but ACK'd downstream from that point.
JOIN-REQUESTs carry the identity of all the cores associated with the
group. Assuming there are no on-tree routers in between, once the
join (subcode ACTIVE_JOIN) reaches the target core, if the target
core is not the primary core (as indicated in a separate field of the
join packet) it first acknowledges the received join by means of a
_________________________ _________________________
1 This document does not address the case where some 1 This document does not address the case where some
routers on a multi-access subnet may be running multi- routers on a multi-access subnet may be running multi-
cast routing protocols other than CBT. In such cases, cast routing protocols other than CBT. In such cases,
IGMP querier may be a non-CBT router, in which case the IGMP querier may be a non-CBT router, in which case the
CBT DR election breaks. This will be discussed in a CBT CBT DR election breaks. This will be discussed in a CBT
interoperability document, to appear shortly. interoperability document, to appear shortly.
2 "on-tree" refers to whether a router has a forward-
ing db entry for the corresponding group.
querier; in CBT these two roles go hand-in-hand. As a result, there JOIN-ACK, then sends a JOIN-REQUEST, subcode REJOIN-ACTIVE, to the
is no protocol overhead whatsoever associated with electing the CBT primary core router.
D-DR.
2.4. Tree Joining Process -- Details
The receipt of an IGMP group membership report by a CBT D-DR for a
CBT group not previously heard from triggers the tree joining pro-
cess.
Immediately subsequent to receiving an IGMP group membership report
for a CBT group not previously heard from, the D-DR unicasts a JOIN-
REQUEST to the first hop on the (unicast) path to the target core
specified in the CBT join packet.
Each CBT-capable router traversed on the path between the sending DR If the rejoin-active reaches the primary core, it responds by sending
and the core processes the join. However, if a join hits a CBT router a JOIN-ACK, subcode PRIMARY-REJOIN-ACK, which traverses the reverse-
that is already on-tree (footnote), the join is not propagated path of the join. The primary-rejoin-ack serves to confirm no loop is
further, but ACK'd downstream from that point. present without requiring explicit loop detection.
JOIN-REQUESTs carry the identity of all cores for the group. Assuming If some other on-tree router is encountered before the rejoin-active
there are no on-tree routers in between, once the join (subcode reaches the primary, that router responds with a JOIN-ACK, subcode
ACTIVE_JOIN) reaches the target core, if the target core is not the NORMAL. On receipt of the ack, subcode normal, the router sends a
primary core (as indicated in a separate field of the join packet) it join, subcode REJOIN-NACTIVE, which acts as a loop detection packet
first acknowledges the received join by means of a JOIN-ACK, then (see section 8.3). Note that loop detection is not necessary subse-
sends a JOIN-REQUEST, subcode REJOIN-ACTIVE, to the primary core quent to receiving a join-ack with subcode PRIMARY-REJOIN-ACK.
router. Either the primary core, or the first on-tree router encoun-
tered, acknowledges the received rejoin by means of a JOIN-ACK. In
the former case, the primary core responds by sending a join-ack,
subcode PRIMARY-REJOIN-ACK, which traverses the reverse-path of the
join. In the latter case, the join-ack is returned with subcode NOR-
MAL; the receiving router responds to this with a rejoin-Nactive, for
loop detection. Note that loop detection is not necessary subsequent
to receiving a join-ack with subcode PRIMARY-REJOIN-ACK. Loop detec-
tion is described further in section 5.3.
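The handling of a rejoin-active and its acknowledgement can be summarised in a short sketch. The enum and function names below are illustrative assumptions; the subcodes are represented by name only, since their numeric values are not given in this part of the document.

```python
from enum import Enum


class AckSubcode(Enum):
    NORMAL = "NORMAL"
    PRIMARY_REJOIN_ACK = "PRIMARY-REJOIN-ACK"


def answer_rejoin_active(router, primary_core):
    """Ack subcode chosen by the router that terminates a REJOIN-ACTIVE:
    the primary core, or the first on-tree router met on the way to it."""
    if router == primary_core:
        return AckSubcode.PRIMARY_REJOIN_ACK
    return AckSubcode.NORMAL


def needs_loop_detection(ack_subcode):
    """A REJOIN-NACTIVE (loop detection) follows only an ack with subcode
    NORMAL; a PRIMARY-REJOIN-ACK already confirms that no loop is present."""
    return ack_subcode is AckSubcode.NORMAL
```

For example, with R4 as primary core, `answer_rejoin_active("R4", "R4")` yields PRIMARY-REJOIN-ACK and no loop detection packet is sent.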
To facilitate detailed protocol description, we use a sample topol- To facilitate detailed protocol description, we use a sample topol-
ogy, illustrated in Figure 1 (shown over). Member hosts are shown as ogy, illustrated in Figure 1 (shown over). Member hosts are shown as
individual capital letters, routers are prefixed with R, and subnets individual capital letters, routers are prefixed with R, and subnets
_________________________
"on-tree" describes whether a router has a FIB entry
for the corresponding group.
are prefixed with S. are prefixed with S.
A B A B
| S1 S4 | | S1 S4 |
------------------- ----------------------------------------------- ------------------- -----------------------------------------------
| | | | | | | |
------ ------ ------ ------ ------ ------ ------ ------
| R1 | | R2 | | R5 | | R6 | | R1 | | R2 | | R5 | | R6 |
------ ------ ------ ------ ------ ------ ------ ------
C | | | | | C | | | | |
skipping to change at page 8, line 30 skipping to change at page 6, line 30
| | --------------------------------------------- | | ---------------------------------------------
| |----| | | | |----| | |
---| R7 |-----| ------ ---| R7 |-----| ------
| |----| |------------------| R4 | | |----| |------------------| R4 |
| S7 | ------ F | S7 | ------ F
| | | S6 | | | | S6 |
|-E | --------------------------------- |-E | ---------------------------------
| | | |
| ------ | ------
|---| |---------------------| R8 | |---| |---------------------| R8 |
|R12 -----| ------ G |R12 ----| ------ G
|---| | | | S10 |---| | | | S10
| S14 ---------------------------- | S14 ----------------------------
| | | |
I --| ------ I --| ------
| | R9 | | | R9 |
------ ------
| S12 | S12
| ---------------------------- | ----------------------------
S15 | | S15 | |
| ------ | ------
|----------------------|R10 | |----------------------|R10 |
J ---| ------ H J ---| ------ H
| | | | | |
| ---------------------------- | ----------------------------
| S13 | S13
Figure 1. Example Network Topology Figure 1. Example Network Topology
Taking the example topology in figure 1, host A is the group initia- Taking the example topology in figure 1, host A is the group initia-
tor, and has elected core routers R4 (primary core) and R9 (secondary tor, and has configured core routers R4 (primary core) and R9 (secon-
core) by some external protocol. We assume the local CBT DR discovers dary core).
<core,group> mappings by "some means", possibly one of the mechanisms
described in section 2.1.
Router R1 receives an IGMP host membership report, and proceeds to Router R1 receives an IGMP host membership report, and proceeds to
unicast a JOIN-REQUEST, subcode ACTIVE-JOIN to the next-hop on the unicast a JOIN-REQUEST, subcode ACTIVE-JOIN to the next-hop on the
path to R4 (R3), the target core. R3 receives the join, caches the path to R4 (R3), the target core. R3 receives the join, caches the
necessary group information, and forwards it to R4 -- the target of necessary group information, and forwards it to R4 -- the target of
the join. the join.
R4, being the target of the join, sends a JOIN_ACK back out of the R4, being the target of the join, sends a JOIN_ACK (subcode NORMAL)
receiving interface to the previous-hop sender of the join, R3. A back out of the receiving interface to the previous-hop sender of the
JOIN-ACK, like a JOIN-REQUEST, is processed hop-by-hop by each router join, R3. A JOIN-ACK, like a JOIN-REQUEST, is processed hop-by-hop by
on the reverse-path of the corresponding join. The receipt of a each router on the reverse-path of the corresponding join. The
join-ack establishes the receiving router on the corresponding CBT receipt of a join-ack establishes the receiving router on the
tree, i.e. the router becomes part of a branch on the delivery tree. corresponding CBT tree, i.e. the router becomes part of a branch on
Finally, R3 sends a join-ack to R1. A new CBT branch has been the delivery tree. Finally, R3 sends a join-ack to R1. A new CBT
created, attaching subnet S1 to the CBT delivery tree for the branch has been created, attaching subnet S1 to the CBT delivery tree
corresponding group (footnote 2). for the corresponding group.
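The walk-through above can be modelled in a few lines. This is a simplified sketch under stated assumptions: the function name is hypothetical, the unicast path is supplied by the caller, and timers, retransmission and message formats are ignored.

```python
def send_join(path, on_tree):
    """Model one join travelling hop-by-hop towards its target core.

    path    -- routers from the originating DR to the target core,
               in unicast next-hop order (the last entry is the core)
    on_tree -- set of routers already on the group's delivery tree
    Returns (terminator, ack_order): the router that terminates the join,
    and the routers that receive the join-ack, in reverse-path order.
    """
    transient = []  # routers holding transient join state
    terminator = path[-1]
    for router in path:
        if router in on_tree or router == path[-1]:
            terminator = router  # core or on-tree router ends the join
            break
        transient.append(router)
    # The ack retraces the join's path using the transient state and
    # fixes that state: each receiving router becomes on-tree.
    ack_order = list(reversed(transient))
    on_tree.update(ack_order)
    on_tree.add(terminator)
    return terminator, ack_order


tree = set()
first = send_join(["R1", "R3", "R4"], tree)         # R1's join reaches core R4
second = send_join(["R6", "R2", "R3", "R4"], tree)  # R6's join stops at R3
```

In the first call the join reaches R4 and the ack visits R3, then R1. In the second, R3 is already on-tree from R1's join, so it terminates the join and the ack visits R2, then R6.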
For the period between any CBT-capable router forwarding (or ori- For the period between any CBT-capable router forwarding (or ori-
ginating) a JOIN_REQUEST and receiving a JOIN_ACK the corresponding ginating) a JOIN_REQUEST and receiving a JOIN_ACK the corresponding
router is not permitted to acknowledge any subsequent joins received router is not permitted to acknowledge any subsequent joins received
for the same group; rather, the router caches such joins till such for the same group; rather, the router caches such joins till such
time as it has itself received a JOIN_ACK for the original join. Only time as it has itself received a JOIN_ACK for the original join. Only
then can it acknowledge any cached joins. A router is said to be in a then can it acknowledge any cached joins. A router is said to be in a
pending-join state if it is awaiting a JOIN_ACK itself. "pending-join" state if it is awaiting a JOIN_ACK itself.
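The pending-join rule can be sketched as a small per-group state holder. The class, method and neighbour names are illustrative assumptions; the sketch tracks a single group and omits timers.

```python
class GroupJoinState:
    """Join state kept by one router for one group (illustrative only)."""

    def __init__(self):
        self.on_tree = False
        self.pending = False  # True while awaiting a JOIN_ACK ourselves
        self.waiting = []     # downstream routers whose joins are cached

    def receive_join(self, downstream):
        """Return the list of routers to acknowledge right now."""
        if self.on_tree:
            return [downstream]  # already on-tree: ack immediately
        self.waiting.append(downstream)
        if not self.pending:
            self.pending = True  # first join: forwarded upstream
        return []                # no acks while in pending-join state

    def receive_ack(self):
        """Our own JOIN_ACK arrived: ack every cached join."""
        self.pending = False
        self.on_tree = True
        acked, self.waiting = self.waiting, []
        return acked


router = GroupJoinState()
before = router.receive_join("N1") + router.receive_join("N2")
released = router.receive_ack()
after = router.receive_join("N3")
```

Here N1 and N2 are hypothetical downstream neighbours: neither is acknowledged until the router's own ack arrives, after which a later join from N3 is acknowledged at once.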
Note that the presence of underlying transient asymmetric routes is
irrelevant to the tree-building process; CBT tree branches are sym-
metric by the nature in which they are built. Joins set up transient
state (incoming and outgoing interface state) in all routers along a
path to a particular core. The corresponding join-ack traverses the
reverse-path of the join as dictated by the transient state, and not
the path that underlying routing would dictate. Whilst permanent
asymmetric routes could pose a problem for CBT, transient
_________________________
2 At this point, it is proposed that IGMP (v3) group
multicasts a notification across the subnet indicating
to member hosts that the delivery tree has been joined
successfully. Such a message would greatly benefit mul-
ticast protocols requiring explicit joins [5, 10].
Note that the presence of asymmetric routes in the underlying unicast
routing, does not affect the tree-building process; CBT tree branches
are symmetric by the nature in which they are built. Joins set up
transient state (incoming and outgoing interface state) in all
routers along a path to a particular core. The corresponding join-ack
traverses the reverse-path of the join as dictated by the transient
state, and not the path that underlying routing would dictate. Whilst
permanent asymmetric routes could pose a problem for CBT, transient
asymmetry is detected by the CBT protocol. asymmetry is detected by the CBT protocol.
2.5. Default DRs and Group DRs 3.4. Forwarding Joins on Multi-Access Subnets
The DR election mechanism does not guarantee that the DR will be the The DR election mechanism does not guarantee that the DR will be the
router that actually forwards a join off a multi-access network; the router that actually forwards a join off a multi-access network; the
first hop on the path to a particular core might be via another first hop on the path to a particular core might be via another
router on the same (sub)network, which actually forwards off-subnet. router on the same subnetwork, which actually forwards off-subnet.
The CBT router that becomes the interface between the subnet and the
rest of the CBT tree, i.e. the CBT router at which a join-ack arrives
on the subnet, becomes the CBT GROUP DR. This group-specific DR (G-
DR) is a token (implicit) identity. In the normal case where there is
no subnet extra hop, the receipt of a JOIN-ACK means that the D-DR
becomes the G-DR for the specified group.
Although very much the same, let's see another example using our Although very much the same, let's see another example using our
example topology of figure 1 of a host joining a CBT tree for the example topology of figure 1 of a host joining a CBT tree for the
case where more than one CBT router exists on the host subnetwork. case where more than one CBT router exists on the host subnetwork.
B's subnet, S4, has 3 CBT routers attached. Assume also that R6 has B's subnet, S4, has 3 CBT routers attached. Assume also that R6 has
been elected IGMP-querier and CBT D-DR. been elected IGMP-querier and CBT D-DR.
R6 (S4's D-DR) receives an IGMP group membership report. By some R6 (S4's D-DR) receives an IGMP group membership report. R6's config-
means, R6 discovers the <core, group> mapping for the group specified ured information suggests R4 as the target core for this group. R6
in the report; R4 is the target core for the group. R6 generates a thus generates a join-request for target core R4, subcode
join-request for target core R4, subcode ACTIVE_JOIN. R6's routing ACTIVE_JOIN. R6's routing table says the next-hop on the path to R4
table says the next-hop on the path to R4 is R2, which is on the same is R2, which is on the same subnet as R6. This is irrelevant to R6,
subnet as R6. This is irrelevant to R6, which unicasts it to R2. R2 which unicasts it to R2. R2 unicasts it to R3, which happens to be
unicasts it to R3, which happens to be already on-tree for the speci- already on-tree for the specified group (from R1's join). R3 there-
fied group (from R1's join). R3 therefore can acknowledge the arrived fore can acknowledge the arrived join and unicast the ack back to R2.
join and unicast it back to R2. R2 realises it is not the origin of R2 forwards it to R6, the origin of the join-request.
the corresponding join-request, but sees that the origin (R6) is on
the same subnet as itself, and that over which the join-ack should be
forwarded to the origin, R6. R2 unicasts the join-ack on its final
hop. R2 has thus become the group's G-DR, with R6 remaining the D-DR
for all groups.
If an IGMP membership report is received by a D-DR with a join for If an IGMP membership report is received by a D-DR with a join for
the same group already pending, or if the D-DR is already on-tree for the same group already pending, or if the D-DR is already on-tree for
the group, it takes no action. the group, it takes no action.
2.6. Tree Teardown 3.5. On-Demand "Core Tree" Building
The "core tree", the part of a CBT tree linking all of its cores
together, is built on-demand. That is, the core tree is only built
subsequent to a non-primary (secondary) core receiving a join-
request. This triggers the secondary core to join the primary core;
the primary need never join anything.
Join-requests carry an ordered list of core routers (and the identity
of the primary core in its own separate field), making it possible
for the secondary cores to know where to join when they themselves
receive a join. Hence, the primary core must be uniquely identified
as such across a whole group. A secondary joins the primary subse-
quent to sending an ack for the join just received.
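A core's reaction to a received join can be sketched as below. The function name, the tuple layout, and the `joined_primary` flag (used here so a secondary joins the primary only once) are assumptions made for illustration; they are not taken from the specification.

```python
def core_receives_join(me, primary_core, previous_hop, joined_primary=False):
    """Messages a core sends, in order, on receiving a join-request.

    Returns a list of (message, destination) tuples.
    """
    # The received join is always acknowledged first.
    out = [("JOIN-ACK", previous_hop)]
    # A secondary core then joins the primary, building the core tree
    # on demand; the primary core never joins anything.
    if me != primary_core and not joined_primary:
        out.append(("JOIN-REQUEST/REJOIN-ACTIVE", primary_core))
    return out


secondary = core_receives_join("R9", "R4", previous_hop="Rx")
primary = core_receives_join("R4", "R4", previous_hop="Rx")
```

With R4 as primary and R9 as secondary (as in the example topology), R9 acknowledges the join and then sends a rejoin-active towards R4, while R4 only acknowledges. "Rx" stands for whichever router forwarded the join.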
3.6. Tree Teardown
There are two scenarios whereby a tree branch may be torn down: There are two scenarios whereby a tree branch may be torn down:
+ During a re-configuration. If a router's best next-hop to the + During a re-configuration. If a router's best next-hop to the
specified core is one of its existing children, then before specified core is one of its existing children, then before
sending the join it must tear down that particular downstream sending the join it must tear down that particular downstream
branch. It does so by sending a FLUSH_TREE message which is pro- branch. It does so by sending a FLUSH_TREE message which is pro-
cessed hop-by-hop down the branch. All routers receiving this cessed hop-by-hop down the branch. All routers receiving this
message must process it and forward it to all their children. message must process it and forward it to all their children.
Routers that have received a flush message will re-establish Routers that have received a flush message will re-establish
themselves on the delivery tree if they have directly connected themselves on the delivery tree if they have directly connected
subnets with group presence. subnets with group presence.
+ If a CBT router has no children it periodically checks all its + If a CBT router has no children it periodically checks all its
directly connected subnets for group member presence. If no directly connected subnets for group member presence. If no
member presence is ascertained on any of its subnets it sends a member presence is ascertained on any of its subnets it sends a
QUIT_REQUEST upstream to remove itself from the tree. QUIT_REQUEST upstream to remove itself from the tree.
The receipt of a quit-request triggers the receiving parent
router to immediately query its forwarding database, and estab-
lish whether there remains any directly connected group member-
ship, or any children, for the said group. If not, the router
itself sends a quit-request upstream.
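The parent's decision on receiving a quit-request can be sketched as follows. The function name and the set-based forwarding database are assumptions for illustration only.

```python
def process_quit(children, member_subnets, quitting_child):
    """Handle a QUIT_REQUEST for one group at the parent router.

    children       -- current child routers for the group
    member_subnets -- directly connected subnets with group members
    Returns (remaining_children, must_quit): must_quit is True when the
    parent has neither children nor local members left, so it should in
    turn send a quit-request upstream.
    """
    remaining = set(children) - {quitting_child}
    must_quit = not remaining and not member_subnets
    return remaining, must_quit


# R2 loses its only child R6 and has no members: it quits in turn.
r2_result = process_quit({"R6"}, set(), "R6")
# R3 loses R2 but still has child R1: it stays on the tree.
r3_result = process_quit({"R1", "R2"}, set(), "R2")
```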
The following example, using the example topology of figure 1, shows The following example, using the example topology of figure 1, shows
how a tree branch is gracefully torn down using a QUIT_REQUEST. how a tree branch is gracefully torn down using a QUIT_REQUEST.
Assume group member B leaves group G on subnet S4. B issues an IGMP Assume group member B leaves group G on subnet S4. B issues an IGMP
HOST-MEMBERSHIP-LEAVE (relevant only to IGMPv2 and later versions) HOST-MEMBERSHIP-LEAVE (relevant only to IGMPv2 and later versions)
message which is multicast to the "all-routers" group (224.0.0.2). message which is multicast to the "all-routers" group (224.0.0.2).
R6, the subnet's D-DR and IGMP-querier, responds with a group- R6, the subnet's D-DR and IGMP-querier, responds with a group-
specific-QUERY. No hosts respond within the required response inter- specific-QUERY. No hosts respond within the required response inter-
val, so D-DR assumes group G traffic is no longer wanted on subnet val, so D-DR assumes group G traffic is no longer wanted on subnet
S4. S4.
skipping to change at page 12, line 7 skipping to change at page 10, line 13
removes the corresponding parent information, i.e. it does not removes the corresponding parent information, i.e. it does not
wait for the receipt of a QUIT-ACK. wait for the receipt of a QUIT-ACK.
R3 responds to the QUIT by unicasting a QUIT-ACK to R2. R3 subse- R3 responds to the QUIT by unicasting a QUIT-ACK to R2. R3 subse-
quently checks whether it in turn can send a quit by checking group G quently checks whether it in turn can send a quit by checking group G
presence on its directly attached subnets, and any group G children. presence on its directly attached subnets, and any group G children.
It has the latter (R1 is its child on the group G tree), and so R3 It has the latter (R1 is its child on the group G tree), and so R3
cannot itself send a quit. However, the branch R3-R2-R6 has been cannot itself send a quit. However, the branch R3-R2-R6 has been
removed from the tree. removed from the tree.
3. Data Packet Forwarding Rules 4. Data Packet Forwarding Rules
When a router receives (non-locally originated) data packets for for- 4.1. Native Mode
warding over directly attached member subnets, it only does so over
the set of outgoing member subnets (interfaces) for which that router
is DR, irrespective of whether group membership is registered on
other local interfaces. In addition, in native mode, packets are for-
warded over any remaining interfaces specified by the FIB entry for
the group that are not in the above set (excluding the incoming
interface). In CBT mode, encapsulated data packets are forwarded over
the full set of interfaces specified by the FIB entry, except the
incoming interface.
A router only forwards data packets originated by directly attached In native mode, when a router receives a data packet, it decrements
hosts iff the router is the DR on the interface over which those the packet's TTL and, provided the TTL remains greater than or equal
packets were received. to 1, forwards the data packet over all outgoing interfaces that
are part of the corresponding CBT tree.
4. Data Packet Forwarding -- Encapsulation Details 4.2. CBT Mode
In "native mode" all data packets are forwarded over CBT tree inter- In CBT mode, routers ignore all non-locally originated native mode
faces as native IP multicasts, i.e. there are no encapsulations multicast data packets. Locally-originated multicast data is only
required. This assumes that CBT is the multicast routing protocol in processed by a subnet's D-DR; in this case, the D-DR forwards the
operation within the domain (or "cloud") in question, and that all native multicast data packet, TTL 1, over any outgoing member subnets
routers within the domain of operation are CBT-capable, i.e. there for which that router is D-DR. Additionally, the D-DR encapsulates
are no "tunnels". the locally-originated multicast and forwards it, CBT mode, over all
tree interfaces, as dictated by the CBT forwarding database.
When a router, operating in CBT mode, receives an encapsulated multi-
cast data packet, it decapsulates one copy to send, native mode and
TTL 1, over any directly attached member subnets for which it is D-
DR. Additionally, an encapsulated copy is forwarded over all outgoing
tree interfaces, as dictated by the CBT forwarding database.
Like the outer encapsulating IP header, the TTL value of the encapsu-
lating CBT header is decremented each time it is processed by a CBT
router.
An example of CBT mode forwarding is provided towards the end of the
next section.
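
The CBT mode forwarding rules of the -05 text can be sketched as
below. This is a sketch under stated assumptions, not a definitive
implementation: the Packet class, the forward_cbt_mode function and
the ("native" | "cbt", interface, ttl) action tuples are hypothetical,
and the discard of an encapsulated packet whose CBT header TTL reaches
zero is taken from the encapsulation details section.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Action = Tuple[str, str, int]  # (kind, outgoing interface, TTL)


@dataclass
class Packet:
    encapsulated: bool
    locally_originated: bool = False
    cbt_ttl: int = 0  # CBT header TTL (initial value for local data)


def forward_cbt_mode(pkt: Packet, in_if: str,
                     ddr_member_subnets: Set[str], tree_ifs: Set[str],
                     is_ddr_on_in_if: bool = False) -> List[Action]:
    if pkt.encapsulated:
        # The CBT header TTL is decremented at each CBT router.
        ttl = pkt.cbt_ttl - 1
        if ttl <= 0:
            return []
    else:
        # Native packets are processed only if locally originated
        # and only by the subnet's D-DR.
        if not (pkt.locally_originated and is_ddr_on_in_if):
            return []
        ttl = pkt.cbt_ttl
    # One native copy, TTL 1, per member subnet for which we are D-DR.
    actions = [("native", s, 1) for s in sorted(ddr_member_subnets)
               if s != in_if]
    # One encapsulated copy per outgoing tree interface.
    actions += [("cbt", i, ttl) for i in sorted(tree_ifs) if i != in_if]
    return actions
```

For instance, calling forward_cbt_mode for R8 with a locally
originated packet arriving on S10 yields a native copy for S14 and
encapsulated copies towards each tree neighbour.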
5. CBT Mode -- Encapsulation Details
In a multi-protocol environment, whose infrastructure may include In a multi-protocol environment, whose infrastructure may include
non-multicast-capable routers, it is necessary to tunnel data packets non-multicast-capable routers, it is necessary to tunnel data packets
between CBT-capable routers. This is called "CBT mode". Data packets between CBT-capable routers. This is called "CBT mode". Data packets
are de-capsulated by CBT routers (such that they become native mode are de-capsulated by CBT routers (such that they become native mode
data packets) before being forwarded over subnets with member hosts. data packets) before being forwarded over subnets with member hosts.
When multicasting (native mode) to member hosts, the TTL value of the When multicasting (native mode) to member hosts, the TTL value of the
original IP header is set to one. CBT mode encapsulation is as fol- original IP header is set to one. CBT mode encapsulation is as fol-
lows: lows:
skipping to change at page 13, line 22 skipping to change at page 11, line 34
router directly attached to the origin of a data packet. This value router directly attached to the origin of a data packet. This value
is decremented each time it is processed by a CBT router. An encap- is decremented each time it is processed by a CBT router. An encap-
sulated data packet is discarded when the CBT header TTL value sulated data packet is discarded when the CBT header TTL value
reaches zero. reaches zero.
The purpose of the (outer) encapsulating IP header is to "tunnel" The purpose of the (outer) encapsulating IP header is to "tunnel"
data packets between CBT-capable routers (or "islands"). The outer IP data packets between CBT-capable routers (or "islands"). The outer IP
header's TTL value is set to the "length" of the corresponding tunnel, header's TTL value is set to the "length" of the corresponding tunnel,
or MAX_TTL (255) if this is not known, or subject to change. or MAX_TTL (255) if this is not known, or subject to change.
For native mode IP multicasts, i.e. those without any extra encapsu-
lation, the TTL value of the IP header is decremented each time the
packet is received by a multicast router.
It is worth pointing out here the distinction between subnetworks and It is worth pointing out here the distinction between subnetworks and
tree branches, although they can be one and the same. For example, a tree branches (especially apparent in CBT mode), although they can be
multi-access subnetwork containing routers and end-systems could one and the same. For example, a multi-access subnetwork containing
potentially be both a CBT tree branch and a subnetwork with group routers and end-systems could potentially be both a CBT tree branch
member presence. A tree branch which is not simultaneously a subnet- and a subnetwork with group member presence. A tree branch which is
work is either a "tunnel" or a point-to-point link. not simultaneously a subnetwork is either a "tunnel" or a point-to-
point link.
In CBT mode there are three forwarding methods used by CBT routers: In CBT mode there are three forwarding methods used by CBT routers:
+ IP multicasting. This method is used to send a data packet + IP multicasting. This method sends an unaltered (unencapsulated)
across a directly-connected subnetwork with group member presence. data packet across a directly-connected subnetwork with group
System host changes are not required for CBT. Similarly, member presence. Any host originating multicast data does so
end-systems originating multicast data do so in traditional IP- in this form.
style.
+ CBT unicasting. This method is used for sending data packets + CBT unicasting. This method is used for sending data packets
encapsulated (as illustrated above) across a tunnel or point- encapsulated (as illustrated above) across a tunnel or point-
to-point link. En/de-capsulation takes place in CBT routers. to-point link. En/de-capsulation takes place in CBT routers.
+ CBT multicasting. Routers on multi-access links use this method + CBT multicasting. Routers on multi-access links use this method
to send data packets encapsulated (as illustrated above) but the to send data packets encapsulated (as illustrated above) but the
outer encapsulating IP header contains a multicast address. This outer encapsulating IP header contains a multicast address. This
method is used when a parent or multiple children are reachable method is used when a parent or multiple children are reachable
over a single physical interface, as could be the case on a over a single physical interface, as could be the case on a
multi-access Ethernet. The IP module of end-systems subscribed multi-access Ethernet. The IP module of end-systems subscribed
to the same group will discard these multicasts since the CBT to the same group will discard these multicasts since the CBT
payload type (protocol id) of the outer IP header is not recog- payload type (protocol id) of the outer IP header is not recog-
nizable by hosts. nizable by hosts.
CBT routers create Forwarding Information Base (FIB) entries whenever CBT routers create forwarding database (db) entries whenever they
they send or receive a JOIN_ACK. The FIB describes the parent-child send or receive a JOIN_ACK. The forwarding database describes the
relationships on a per-group basis. A FIB entry dictates over which parent-child relationships on a per-group basis. A forwarding data-
tree interfaces, and how (unicast or multicast) a data packet is to base entry dictates over which tree interfaces, and how (unicast or
be sent. Additionally, a data packet is IP multicast over any multicast) a data packet is to be sent. A forwarding db entry is
directly-connected subnetworks with group member presence. Such shown below:
interfaces are kept in a separate table relating to IGMP. A FIB entry
is shown below: Note that a CBT forwarding db is required for both CBT-mode and
native-mode multicasting.
The field lengths shown above assume a maximum of 16 directly con-
nected neighbouring routers.
Using our example topology in figure 1, let's assume the CBT routers
are operating in CBT mode.
Member G originates an IP multicast (native mode) packet. R8 is the
DR for subnet S10. R8 therefore sends a (native mode) copy over any
member subnets for which it is DR - S14 and S10 (the copy over S10 is
not sent, since the packet was originally received from S10). The
multicast packet is CBT mode encapsulated by R8, and unicast to each
of its children, R9 and R12; these children are not reachable over
the same interface, otherwise R8 could have sent a CBT mode multi-
cast. R9, the DR for S12, need not IP multicast (native mode) onto
32-bits 4 4 4 8 32-bits 4 4 4 8
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| group-id | parent addr | parent vif | No. of | | | group-id | parent addr | parent vif | No. of | |
| | index | index |children | children | | | index | index |children | children |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+-+-+-+-+-++-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+--+-+-+-+-+-++-+-+-+-+-+-+-+-+-+
|chld addr |chld vif | |chld addr |chld vif |
| index | index | | index | index |
|+-+-+-+-+-+-+-+-+-+-+ |+-+-+-+-+-+-+-+-+-+-+
|chld addr |chld vif | |chld addr |chld vif |
| index | index | | index | index |
|+-+-+-+-+-+-+-+-+-+-+ |+-+-+-+-+-+-+-+-+-+-+
|chld addr |chld vif | |chld addr |chld vif |
| index | index | | index | index |
|+-+-+-+-+-+-+-+-+-+-+ |+-+-+-+-+-+-+-+-+-+-+
| | | |
| etc. | | etc. |
|+-+-+-+-+-+-+-+-+-+-| |+-+-+-+-+-+-+-+-+-+-|
Figure 3. CBT FIB entry Figure 3. CBT forwarding database entry
Note that a CBT FIB is required for both CBT-mode and native-mode
multicasting.
The field lengths shown above assume a maximum of 16 directly con-
nected neighbouring routers.
When a data packet arrives at a CBT router, the following rules S12 since there are no members present there. R9, in CBT mode, uni-
apply: casts the packet to R10, which is the DR for S13 and S15. R10 decap-
sulates the CBT mode packet and IP multicasts (native mode) to each
of S13 and S15.
+ if the packet is an IP-style multicast, it is checked to see if Going upstream from R8, R8 CBT mode unicasts to R4. It is DR for all
it originated locally (i.e. if the arrival interface subnetmask directly connected subnets and therefore IP multicasts (native mode)
bitwise ANDed with the packet's source IP address equals the the data packet onto S5, S6 and S7, all of which have member pres-
arrival interface's subnet number, then the packet was sourced ence. R4 unicasts, CBT mode, the packet to all outgoing children, R3
locally). If the packet is not of local origin, it is discarded. and R7 (NOTE: R4 does not have a parent since it is the primary core
router for the group). R7 IP multicasts (native mode) onto S9. R3 CBT
mode unicasts to R1 and R2, its children. Finally, R1 IP multicasts
(native mode) onto S1 and S3, and R2 IP multicasts (native mode) onto
S4.
+ the packet is IP multicast to all directly connected subnets 6. Non-Member Sending
with group member presence. The packet is sent with an IP TTL
value of 1 in this case.
+ the packet is encapsulated for CBT forwarding (see figure 2) and For a multicast data packet to span beyond the scope of the originat-
unicast to parent and children. However, if more than one child ing subnetwork at least one CBT-capable router must be present on
is reachable over the same interface the packet will be CBT mul- that subnetwork. The default DR (D-DR) for the group on the
ticast. Therefore, it is possible that an IP-style multicast and subnetwork must encapsulate the (native) IP-style packet and unicast
a CBT multicast will be forwarded over a particular subnetwork. it to a core for the group. The encapsulation required is shown in
figure 2; CBT mode encapsulation is necessary so the receiving CBT
router can demultiplex the packet accordingly.
NOTE: the TTL value of encapsulated data packets is manipulated as If the encapsulated packet hits the tree at a non-core router, the
described at the beginning of this section. packet is forwarded according to the forwarding rules of section 4.2.
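
The local-origin test given in the -04 rules (arrival interface
subnet mask ANDed with the packet's source address, compared with the
interface's subnet number) can be sketched as follows. The function
name is hypothetical and the addresses and mask in the usage note are
purely illustrative.

```python
import ipaddress


def is_locally_originated(src: str, if_addr: str, netmask: str) -> bool:
    """True if src lies on the subnet of the arrival interface."""
    # strict=False lets us pass the interface's own host address.
    net = ipaddress.ip_network(f"{if_addr}/{netmask}", strict=False)
    mask = int(net.netmask)
    # (source address AND subnet mask) == subnet number
    return (int(ipaddress.ip_address(src)) & mask) == int(net.network_address)
```

For example, with an arrival interface 128.16.6.8 and an assumed mask
of 255.255.255.0, a source of 128.16.6.20 is treated as local, while
128.96.41.1 is not.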
Using our example topology in figure 1, let's assume member G ori- If the first on-tree router encountered is the target core, various
ginates an IP multicast packet. R8 is the DR for subnet S10. R8 CBT scenarios define what happens next:
unicasts the packet to each of its children, R9 and R12. These chil-
dren are not reachable over the same interface. R8, being the DR for
subnets S14 and S10 also IP multicasts the packet to S14 (S10
received the IP style packet already from the originator). R9, the DR
for S12, need not IP multicast onto S12 since there are no members
present there. R9 CBT unicasts the packet to R10, which is the DR for
S13 and S15. It IP multicasts to both S13 and S15.
Going upstream from R8, R8 CBT unicasts to R4. It is DR for all + if the target core is not the primary, and the target core has
directly connected subnets and therefore IP multicasts the data not yet joined the tree (because it has not yet itself received
packet onto S5, S6 and S7, all of which have member presence. R4 uni- any join-requests), the target core simply forwards the encapsu-
casts the packet to all outgoing children, R3 and R7 (NOTE: R4 does lated packet to the primary core.
not have a parent since it is the primary core router for the group).
R7 IP multicasts onto S9. R3 CBT unicasts to R1 and R2, its children.
Finally, R1 IP multicasts onto S1 and S3, and R2 IP multicasts onto
S4.
4.1. Non-Member Sending + if the target core is not the primary, but has children, the
target core forwards the data according to the rules of section
4.2.
For a multicast data packet to span beyond the scope of the originat- + if the target core is the primary, the primary forwards the data
ing subnetwork at least one CBT-capable router must be present on according to the rules of section 4.2.
that subnetwork. The default DR (D-DR) for the group on the subnet-
work must encapsulate the (native) IP-style packet and unicast it to
a core for the group. In native mode this encapsulation constitutes
IP-in-IP. In CBT mode, the encapsulation required is shown in figure
2. In both cases, CBT routers are required to know <core, group>
mappings. The alternatives for discovering these are discussed in
section 2.1. Further treatment of this topic is beyond the scope of
this document.
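
The three target-core scenarios listed in the -05 text for non-member
sending can be summarised in a small decision function. This is a
sketch only: the function name and the returned strings are
hypothetical labels. The source does not cover a non-primary core
that has joined the tree but has no children, so that case is left
open here.

```python
from typing import Optional


def target_core_action(is_primary: bool, has_joined: bool,
                       has_children: bool) -> Optional[str]:
    """Decide what a target core does with an encapsulated data packet."""
    if is_primary:
        # The primary forwards according to the CBT mode rules.
        return "forward_per_cbt_mode_rules"
    if not has_joined:
        # A non-primary core not yet on the tree relays to the primary.
        return "forward_to_primary_core"
    if has_children:
        # A non-primary core with children forwards as any on-tree router.
        return "forward_per_cbt_mode_rules"
    # Case not covered by the listed scenarios (left open in this sketch).
    return None
```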
5. Eliminating the Topology-Discovery Protocol in the Presence of Tun- 7. Eliminating the Topology-Discovery Protocol in the Presence of Tun-
nels nels
Traditionally, multicast protocols operating within a virtual topol- Traditionally, multicast protocols operating within a virtual topol-
ogy, i.e. an overlay of the physical topology, have required the ogy, i.e. an overlay of the physical topology, have required the
assistance of a multicast topology discovery protocol, such as that assistance of a multicast topology discovery protocol, such as that
present in DVMRP. However, it is possible to have a multicast proto- present in DVMRP [1]. However, it is possible to have a multicast
col operate within a virtual topology without the need for a multi- protocol operate within a virtual topology without the need for a
cast topology discovery protocol. One way to achieve this is by hav- multicast topology discovery protocol. One way to achieve this is by
ing a router configure all its tunnels to its virtual neighbours in having a router configure all its tunnels to its virtual neighbours
advance. A tunnel is identified by a local interface address and a in advance. A tunnel is identified by a local interface address and a
remote interface address. Routing is replaced by "ranking" each such remote interface address. Routing is replaced by "ranking" each such
tunnel interface associated with a particular core address; if the tunnel interface associated with a particular core address; if the
highest-ranked route is unavailable (tunnel end-points are required highest-ranked route is unavailable (tunnel end-points are required
to run a Hello-like protocol between themselves) then the next-highest to run a Hello-like protocol between themselves) then the next-highest
ranked available route is selected, and so on. The exact ranked available route is selected, and so on. The exact
specification of the Hello protocol is outside the scope of this specification of the Hello protocol is outside the scope of this
document. document.
CBT trees are built using the same join/join-ack mechanisms as CBT trees are built using the same join/join-ack mechanisms as
before, only now some branches of a delivery tree run in native mode, before, only now some branches of a delivery tree run in native mode,
skipping to change at page 17, line 19 skipping to change at page 15, line 26
#3 phys native - #3 phys native -
#4 tunnel cbt 128.16.6.8 #4 tunnel cbt 128.16.6.8
#5 tunnel cbt 128.96.41.1 #5 tunnel cbt 128.96.41.1
core backup-intfs core backup-intfs
-------------------- --------------------
A #5, #2 A #5, #2
B #3, #5 B #3, #5
C #2, #4 C #2, #4
The CBT FIB needs to be slightly modified to accommodate an extra The CBT forwarding database needs to be slightly modified to accommo-
field, "backup-intfs" (backup interfaces). The entry in this field date an extra field, "backup-intfs" (backup interfaces). The entry in
specifies a backup interface whenever a tunnel interface specified in this field specifies a backup interface whenever a tunnel interface
the FIB is down. Additional backups (should the first-listed backup specified in the forwarding db is down. Additional backups (should
be down) are specified for each core in the core backup table. For the first-listed backup be down) are specified for each core in the
example, if interface (tunnel) #2 were down, and the target core of a core backup table. For example, if interface (tunnel) #2 were down,
CBT control packet were core A, the core backup table suggests using and the target core of a CBT control packet were core A, the core
interface #5 as a replacement. If interface #5 happened to be down backup table suggests using interface #5 as a replacement. If inter-
also, then the same table recommends interface #2 as a backup for face #5 happened to be down also, then the same table recommends
core A. interface #2 as a backup for core A.
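
A lookup in the core backup table might be sketched as below. The
table contents are those shown above. The function name is
hypothetical, and the sketch assumes that a listed backup already
known to be down is skipped, which the text does not state
explicitly.

```python
from typing import Dict, List, Optional, Set

# core -> ranked backup interfaces, as in the core backup table
CORE_BACKUPS: Dict[str, List[int]] = {
    "A": [5, 2],
    "B": [3, 5],
    "C": [2, 4],
}


def pick_backup(core: str, down: Set[int]) -> Optional[int]:
    """Return the first listed backup interface that is not down."""
    for intf in CORE_BACKUPS[core]:
        if intf not in down:
            return intf
    # Every listed backup is down (assumed outcome: no usable interface).
    return None
```

With interface #2 down and core A as the target, the lookup yields
interface #5, matching the example in the text.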
6. Tree Maintenance 8. Tree Maintenance
Once a tree branch has been created, i.e. a CBT router has received a Once a tree branch has been created, i.e. a CBT router has received a
JOIN_ACK for a JOIN_REQUEST previously sent (forwarded), a child JOIN_ACK for a JOIN_REQUEST previously sent (or forwarded), a child
router is required to monitor the status of its parent/parent link at router is required to monitor the status of its parent/parent link at
fixed intervals by means of a ``keepalive'' mechanism operating fixed intervals by means of a "keepalive" mechanism operating between
between them. The ``keepalive'' mechanism is implemented by means of them. The "keepalive" mechanism is implemented by means of two CBT
two CBT control messages: CBT_ECHO_REQUEST and CBT_ECHO_REPLY. Adja- control messages: CBT_ECHO_REQUEST and CBT_ECHO_REPLY. Adjacent CBT
cent CBT routers only need to send one keepalive per link, regardless routers only need to send one keepalive per link, regardless of how
of how many groups are present on that link. This aggregation stra- many groups are present on that link. This aggregation strategy is
tegy is expected to conserve considerable bandwidth on "busy" links, expected to conserve considerable bandwidth on "busy" links, such as
such as those nearer the "centre" of the network. transit network, or backbone network, links.
The keepalive protocol is simple, as follows: a child unicasts a The keepalive protocol is simple, as follows: a child unicasts a
CBT_ECHO_REQUEST to its parent, which unicasts a CBT_ECHO_REPLY in CBT_ECHO_REQUEST to its parent, which unicasts a CBT_ECHO_REPLY in
response. response.
For any CBT router, if its parent router, or path to the parent, For any CBT router, if its parent router, or path to the parent,
fails, the child is initially responsible for re-attaching itself, fails, the child is initially responsible for re-attaching itself,
and therefore all routers subordinate to it on the same branch, to and therefore all routers subordinate to it on the same branch, to
the tree. the tree.
6.1. Router Failure CBT echo requests and replies can be aggregated and sent on a per
link basis, rather than individually for each group; the CBT control
packet header (section 10.2) accommodates such aggregation.
8.1. Router Failure
An on-tree router can detect a failure from the following two cases: An on-tree router can detect a failure from the following two cases:
+ if the child responsible for sending keepalives across a partic- + if the child responsible for sending keepalives across a partic-
ular link stops receiving CBT_ECHO_REPLY messages. In this case ular link stops receiving CBT_ECHO_REPLY messages. In this case
the child realises that its parent has become unreachable and the child realises that its parent has become unreachable and
must therefore try and re-connect to the tree for all groups must therefore try and re-connect to the tree for all groups
represented on the parent/child link. Until an aggregation stra- represented on the parent/child link. For all groups sharing a
tegy is fully worked out, a (re)join must be sent for each group common core set (corelist), provided those groups can be speci-
individually. (We present some ideas on rejoin aggregation in fied as a CIDR-like aggregate, an aggregated join can be sent
Appendix A). representing a range of groups. Aggregated joins are made pos-
sible by the presence of a "group mask" field in the CBT control
packet header. Aggregated joins are also discussed in Appendix
A.
The rejoining router (that which is immediately subordinate to If a range of groups cannot be represented by a mask, then each
the failure) sends a JOIN_REQUEST (subcode ACTIVE_JOIN if it has group must be re-joined individually.
no children attached, and subcode ACTIVE_REJOIN if at least one
child is attached) to the best next-hop router on the path to CBT's re-join strategy is as follows: the rejoining router which
the elected core. If no JOIN-ACK is received after three is immediately subordinate to the failure sends a JOIN_REQUEST
retransmissions, each transmission being at PEND-JOIN-INTERVAL (subcode ACTIVE_JOIN if it has no children attached, and subcode
(10 secs), an alternate core is elected from the core list, and ACTIVE_REJOIN if at least one child is attached) to the best
the process repeated. If all cores have been tried unsuccess- next-hop router on the path to the elected core. If no JOIN-ACK
fully, the D-DR has no option but to give up. is received after three retransmissions, each transmission being
at PEND-JOIN-INTERVAL (10 secs), the next-highest priority core
is elected from the core list, and the process repeated. If all
cores have been tried unsuccessfully, the D-DR has no option but
to give up.
+ if a parent stops receiving CBT_ECHO_REQUESTs from a child. In + if a parent stops receiving CBT_ECHO_REQUESTs from a child. In
this case the parent simply removes the child interface from FIB this case, if the parent has not received an expected keepalive
entries that are represented by that parent/child link. after CHILD_ASSERT_EXPIRE_TIME, all children reachable across
that link are removed from the parent's forwarding database.
6.2. Router Re-Starts 8.2. Router Re-Starts
There are two cases to consider here: There are two cases to consider here:
+ Core re-start. All JOIN-REQUESTs (all types) carry the identi- + Core re-start. All JOIN-REQUESTs (all types) carry the identi-
ties (i.e. addresses) of each of the cores for a group. If a ties (i.e. IP addresses) of each of the cores for a group. If a
router is a core for a group, but has only recently re-started, router is a core for a group, but has only recently re-started,
it will not be aware that it is a core for any group(s). In such it will not be aware that it is a core for any group(s). In such
circumstances, a core only becomes aware that it is such by circumstances, a core only becomes aware that it is such by
receiving a JOIN-REQUEST. Subsequent to a core learning its receiving a JOIN-REQUEST. Subsequent to a core learning its
status in this way, if it is not the primary core it ack- status in this way, if it is not the primary core it ack-
nowledges the received join, then sends a JOIN_REQUEST (subcode nowledges the received join, then sends a JOIN_REQUEST (subcode
ACTIVE_REJOIN) to the primary core. If the re-started router is ACTIVE_REJOIN) to the primary core. If the re-started router is
the primary core, it need take no action, i.e. in all cir- the primary core, it need take no action, i.e. in all cir-
cumstances, the primary core simply waits to be joined by other cumstances, the primary core simply waits to be joined by other
routers. routers.
+ Non-core re-start. In this case, the router can only join the + Non-core re-start. In this case, the router can only join the
tree again if a downstream router sends a JOIN_REQUEST through tree again if a downstream router sends a JOIN_REQUEST through
it, or it is elected DR for one of its directly attached sub- it, or it is elected DR for one of its directly attached sub-
nets, and subsequently receives an IGMP membership report. nets, and subsequently receives an IGMP membership report.
6.3. Route Loops 8.3. Route Loops
Routing loops are only a concern when a router with at least one Routing loops are only a concern when a router with at least one
child is attempting to re-join a CBT tree. In this case the re-joining child is attempting to re-join a CBT tree. In this case the re-joining
router sends a JOIN_REQUEST (subcode ACTIVE_REJOIN) to the router sends a JOIN_REQUEST (subcode ACTIVE_REJOIN) to the
best next-hop on the path to an elected core. This join is forwarded best next-hop on the path to an elected core. This join is forwarded
as normal until it reaches either the specified core, another core, as normal until it reaches either the specified core, another core,
or a non-core router that is already part of the tree. If the rejoin or a non-core router that is already part of the tree. If the rejoin
reaches the primary core, loop detection is not necessary. The pri- reaches the primary core, loop detection is not necessary because the
mary core acks an active-rejoin by means of a JOIN-ACK, subcode primary never has a parent. The primary core acks an active-rejoin by
PRIMARY-REJOIN-ACK. This ack must be processed by each router on the means of a JOIN-ACK, subcode PRIMARY-REJOIN-ACK. This ack must be
reverse-path of the active-rejoin. If an active-rejoin is terminated processed by each router on the reverse-path of the active-rejoin;
by any router on the tree other than the primary core, loop detection this ack creates tree state, just like a normal join-ack.
must take place, as we now describe.
If an active-rejoin is terminated by any router on the tree other
than the primary core, loop detection must take place, as we now
describe.
If, in response to an active-rejoin, a JOIN-ACK is returned, subcode If, in response to an active-rejoin, a JOIN-ACK is returned, subcode
NORMAL (as opposed to an ack with subcode PRIMARY-REJOIN-ACK), the NORMAL (as opposed to an ack with subcode PRIMARY-REJOIN-ACK), the
router receiving the ack subsequently generates a JOIN-REQUEST, sub- router receiving the ack subsequently generates a JOIN-REQUEST, sub-
code NACTIVE-REJOIN (non-active rejoin). This packet serves only to code NACTIVE-REJOIN (non-active rejoin). This packet serves only to
detect loops; it does not create any transient state in the routers detect loops; it does not create any transient state in the routers
it traverses, other than the originating router. Any on-tree router it traverses, other than the originating router. Any on-tree router
receiving a non-active rejoin is required to forward it over its receiving a non-active rejoin is required to forward it over its
parent interface for the specified group. In this way, it will either parent interface for the specified group. In this way, it will either
reach the primary core, which returns, directly to the sender, a join reach the primary core, which returns, directly to the sender, a join
skipping to change at page 20, line 14 skipping to change at page 18, line 38
sends a QUIT_REQUEST to its newly-established parent and the loop is sends a QUIT_REQUEST to its newly-established parent and the loop is
broken. broken.
Using figure 4 (over) to demonstrate this, if R3 is attempting to Using figure 4 (over) to demonstrate this, if R3 is attempting to
re-join the tree (R1 is the core in figure 4) and R3 believes its re-join the tree (R1 is the core in figure 4) and R3 believes its
best next-hop to R1 is R6, and R6 believes R5 is its best next-hop to best next-hop to R1 is R6, and R6 believes R5 is its best next-hop to
R1, which sees R4 as its best next-hop to R1 -- a loop is formed. R3 R1, which sees R4 as its best next-hop to R1 -- a loop is formed. R3
begins by sending a JOIN_REQUEST (subcode ACTIVE_REJOIN, since R4 is begins by sending a JOIN_REQUEST (subcode ACTIVE_REJOIN, since R4 is
its child) to R6. R6 forwards the join to R5. R5 is on-tree for the its child) to R6. R6 forwards the join to R5. R5 is on-tree for the
group, so responds to the active-rejoin with a JOIN-ACK, subcode NOR- group, so responds to the active-rejoin with a JOIN-ACK, subcode NOR-
MAL (the ack traverses R6 on its way to R3). R3 now generates a MAL (the ack traverses R6 on its way to R3).
JOIN-REQUEST, subcode REJOIN-NACTIVE, and forwards this to its
parent, R6. R6 forwards the non-active rejoin to R5, its parent. R5 R3 now generates a JOIN-REQUEST, subcode REJOIN-NACTIVE, and forwards
does similarly, as does R4. Now, the non-active rejoin has reached this to its parent, R6. R6 forwards the non-active rejoin to R5, its
R3, which originated it, so R3 concludes a loop is present on the parent. R5 does similarly, as does R4. Now, the non-active rejoin has
parent interface for the specified group. It immediately sends a reached R3, which originated it, so R3 concludes a loop is present on
the parent interface for the specified group. It immediately sends a
QUIT_REQUEST to R6, which in turn sends a quit if it has not received QUIT_REQUEST to R6, which in turn sends a quit if it has not received
an ACK from R5 already AND has itself a child or subnets with member an ACK from R5 already AND has itself a child or subnets with member
presence. If so it does not send a quit -- the loop has been broken presence. If so it does not send a quit -- the loop has been broken
by R3 sending the first quit. by R3 sending the first quit.
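The loop check in the figure 4 walk-through can be sketched as follows. This is an illustrative model only, not part of the specification: the function name, the parent_of mapping and the return strings are hypothetical, and the hop limit is an added safety guard that the text does not describe.

```python
def trace_nactive_rejoin(origin, parent_of, primary_core):
    """Follow a non-active rejoin hop by hop over parent interfaces.

    parent_of maps each router to its current parent for the group
    (a hypothetical stand-in for per-router tree state).
    Returns "loop" if the packet arrives back at its originator,
    "primary-core" if it reaches the primary core first.
    """
    hop = parent_of[origin]
    # Hop limit is an assumption of this sketch, not a protocol rule.
    for _ in range(len(parent_of) + 1):
        if hop == origin:
            return "loop"          # originator would now send a quit
        if hop == primary_core:
            return "primary-core"  # core would unicast an ack to origin
        hop = parent_of[hop]
    return "undetermined"


# Figure 4 scenario: R3 -> R6 -> R5 -> R4 -> R3, with R1 as the core.
looped = {"R3": "R6", "R6": "R5", "R5": "R4", "R4": "R3"}
print(trace_nactive_rejoin("R3", looped, "R1"))
```

In the looped topology the rejoin returns to R3, which corresponds to the point at which R3 sends its quit to R6.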
QUIT_REQUESTs are typically acknowledged by means of a QUIT_ACK. A QUIT_REQUESTs are typically acknowledged by means of a QUIT_ACK. A
child removes its parent information immediately subsequent to send- child removes its parent information immediately subsequent to send-
ing its first QUIT-REQUEST. The ack here serves to notify the (old) ing its first QUIT-REQUEST. The ack here serves to notify the (old)
child that it (the parent) has in fact removed its child information. child that it (the parent) has in fact removed its child information.
However, there might be cases where, due to failure, the parent can- However, there might be cases where, due to failure, the parent can-
skipping to change at page 22, line 11 skipping to change at page 20, line 18
If we assume R2 is on tree for the corresponding group, R3 sends a If we assume R2 is on tree for the corresponding group, R3 sends a
join, subcode REJOIN-ACTIVE to R2, which replies with a join ack, join, subcode REJOIN-ACTIVE to R2, which replies with a join ack,
subcode NORMAL. R3 must then generate a loop detection packet (join subcode NORMAL. R3 must then generate a loop detection packet (join
request, subcode REJOIN-NACTIVE) which is forwarded to its parent, request, subcode REJOIN-NACTIVE) which is forwarded to its parent,
R2, which does similarly. On receipt of the rejoin-Nactive, the pri- R2, which does similarly. On receipt of the rejoin-Nactive, the pri-
mary core unicasts a join ack back directly to R3, with subcode mary core unicasts a join ack back directly to R3, with subcode
PRIMARY-NACTIVE-ACK. This confirms to R3 that its rejoin does not PRIMARY-NACTIVE-ACK. This confirms to R3 that its rejoin does not
form a loop. form a loop.
7. Data Packet Loops 9. Data Packet Loops
The CBT protocol builds a loop-free distribution tree. If all routers The CBT protocol builds a loop-free distribution tree. If all routers
that comprise a particular tree function correctly, data packets that comprise a particular tree function correctly, data packets
should never traverse a tree branch more than once. should never traverse a tree branch more than once.
CBT routers will only forward native-style data packets if they are CBT mode data packets from a non-member sender must arrive on a tree
received over a valid on-tree interface. A native-style data packet via an "off-tree" interface. The CBT mode data packet's header
that is not received over such an interface is discarded. includes an "on-tree" field, which contains the value 0x00 until the
data packet reaches an on-tree router. The first on-tree router must
Encapsulated CBT data packets from a non-member sender can arrive via convert this value to 0xff. This value remains unchanged, and from
an "off-tree" interface (this is how CBT-mode sends data across tun- here on the packet should traverse only on-tree interfaces. If an
nels, and how data from non-member senders in native-mode or CBT-mode encapsulated packet happens to "wander" off-tree and back on again,
reaches a tree). The encapsulating CBT data packet header includes an on-tree router will receive the CBT encapsulated packet via an
an "on-tree" field, which contains the value 0x00 until the data off-tree interface. However, this router will recognise that the
packet reaches an on-tree router. At this point, the router must con- "on-tree" field of the encapsulating CBT header is set to 0xff, and
vert this value to 0xff to indicate the data packet is now on-tree. so immediately discards the packet.
This value remains unchanged, and from here on the packet should
traverse only on-tree interfaces. If an encapsulated packet happens
to "wander" off-tree and back on again, the latter on-tree router
will receive the CBT encapsulated packet via an off-tree interface.
However, this router will recognise that the "on-tree" field of the
encapsulating CBT header is set to 0xff, and so immediately discards
the packet.
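A minimal sketch of the "on-tree" field check described above, as an on-tree router might apply it to an encapsulated packet received over an off-tree interface. The function name and the use of None to mean "discard" are hypothetical; rejecting values other than 0x00 and 0xff is an assumption of the sketch, since the text defines only those two values.

```python
OFF_TREE = 0x00
ON_TREE = 0xFF


def receive_via_off_tree_interface(on_tree_field):
    """Return the new "on-tree" field value, or None to discard.

    Models an on-tree router receiving a CBT mode data packet over an
    off-tree interface.
    """
    if on_tree_field == ON_TREE:
        # The packet was already on-tree and has wandered off and back.
        return None
    if on_tree_field == OFF_TREE:
        # First on-tree router: mark the packet as on-tree.
        return ON_TREE
    # Assumption: any other value is treated as malformed.
    raise ValueError("unexpected on-tree field value")


print(receive_via_off_tree_interface(OFF_TREE) == ON_TREE)
```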
8. CBT Packet Formats and Message Types 10. CBT Packet Formats and Message Types
CBT packets travel in IP datagrams. We distinguish between two types We distinguish between two types of CBT packet: CBT mode data pack-
of CBT packet: CBT data packets, and CBT control packets. CBT con- ets, and CBT control packets. CBT control packets carry a CBT control
trol packets carry a CBT control header. All CBT control messages are
implemented over UDP. CBT mode data (figure 2) requires a CBT data
packet header. packet header.
8.1. CBT Header Format (for CBT Mode data) For "conventional router" implementations, it is recommended CBT con-
trol packets be encapsulated in IP, as illustrated below:
+++++++++++++++++++++++++++++++
| IP header | CBT control pkt |
+++++++++++++++++++++++++++++++
In CBT mode, the original data packet is encapsulated in a CBT header
and an IP header, as illustrated below:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| IP header | CBT header | original IP hdr | data .... |
++++++++++++++++++++++++++++++++++++++++++++++++++++++++
The IP protocol field of the IP header is used to demultiplex a
packet correctly; CBT has been assigned IP protocol number 7. The
CBT module then demultiplexes based on the encapsulating CBT header's
"type" field, thereby distinguishing between CBT control packets and
CBT mode data packets (the first 16 bits of both the CBT control and
CBT data packet headers are identical).
Some implementations of CBT encapsulate CBT control packets in UDP
(like the workstation router version). In these implementations, the
encapsulation of CBT control packets is as follows:
++++++++++++++++++++++++++++++++++++++++++++
| IP header | UDP header | CBT control pkt |
++++++++++++++++++++++++++++++++++++++++++++
CBT has been assigned UDP port number 7777 for this purpose.
It is recommended for performance reasons that conventional router
implementations implement the IP encapsulation for control packets,
not the UDP encapsulation.
The CBT data packet header is illustrated below:
10.1. CBT Header Format (for CBT Mode data)
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| vers |unused | type | hdr length | on-tree|unused| | vers |unused | type | hdr length | on-tree|unused|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| checksum | IP TTL | unused | | checksum | IP TTL | unused |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| group identifier | | group identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| reserved | reserved | Type | Length | | reserved | reserved | Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| .....VALUE.... | | .....Flow-id value..... |
| (for flow-id and/or security options) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unused | unused | Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| .....Security Information..... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5. CBT Header Figure 5. CBT Header
Each of the fields is described below: Each of the fields is described below:
+ Vers: Version number -- this release specifies version 1. + Vers: Version number -- this release specifies version 1.
+ type: indicates CBT payload is data. The only value defined + type: indicates CBT payload; values are defined for control
for this field is 255 (0xff). (0x00), and data (0xff). For the value 0x00 (control), a CBT
control header is assumed present rather than a CBT header.
+ hdr length: length of the header, for purpose of checksum + hdr length: length of the header, for purpose of checksum
calculation. calculation.
+ on-tree: indicates whether the packet is on-tree (0xff) or + on-tree: indicates whether the packet is on-tree (0xff) or
off-tree (0x00). Once this field is set (i.e. on-tree), it off-tree (0x00).
is non-changing. This field can only be set by a router that
has a FIB entry for the corresponding group, i.e. a router
that has received a join-ack for a join-request previously
sent/forwarded.
+ checksum: the 16-bit one's complement of the one's complement + checksum: the 16-bit one's complement of the one's complement
sum of the CBT header, calculated across all fields. sum of the CBT header, calculated across all fields.
+ IP TTL: TTL value gleaned from the IP header where the packet + IP TTL: TTL value gleaned from the IP header where the packet
originated. It is decremented each time it traverses a CBT originated.
router.
+ group identifier: multicast group address. + group identifier: multicast group address.
+ The TLV fields at the end of the header are for a flow- + The TLV fields at the end of the header are for a flow-
identifier, and/or security options, if and when implemented. identifier, and/or security options, if and when implemented.
A "type" value of zero implies a "length" of zero, implying A "type" value of zero implies a "length" of zero, implying
there is no "value" field. there is no "value" field.
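The checksum field can be illustrated with the sketch below. The text does not name the algorithm, so this assumes the conventional Internet checksum (one's complement of the 16-bit one's complement sum, as in RFC 1071), computed with the checksum field set to zero. The function name is hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"            # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:             # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample bytes from the worked example in RFC 1071.
print(hex(internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))))
```

A receiver using the same assumption can verify a header by summing it with the checksum field in place; the result of internet_checksum over the whole header is then zero.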
8.2. Control Packet Header Format 10.2. Control Packet Header Format
The individual fields are described below. It should be noted that only The individual fields are described below.
certain fields beyond ``group identifier'' are processed for the dif-
ferent control messages.
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| vers |unused | type | code | # cores | | vers |unused | type | code | # cores |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| hdr length | checksum | | hdr length | checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| group identifier | | group identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| group mask |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| packet origin | | packet origin |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| primary core address | | primary core address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| target core address (core #1) | | target core address (core #1) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Core #2 | | Core #2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Core #3 | | Core #3 |
| .... | | .... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| reserved | reserved | Type | Length | | reserved | reserved | Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| .....VALUE.... | | .....Flow-id value..... |
| (for flow-id and/or security options) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unused | unused | Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| .....Security data..... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6. CBT Control Packet Header Figure 6. CBT Control Packet Header
+ Vers: Version number -- this release specifies version 1. + Vers: Version number -- this release specifies version 1.
+ type: indicates control message type (see sections 7.3, + type: indicates control message type (see section 10.3).
7.3.1).
+ code: indicates subcode of control message type. + code: indicates subcode of control message type.
+ # cores: number of core addresses carried by this control + # cores: number of core addresses carried by this control
packet (does not include "primary core address" field). packet.
+ header length: length of the header, for purpose of checksum + header length: length of the header, for purpose of checksum
calculation. calculation.
+ checksum: the 16-bit one's complement of the one's complement + checksum: the 16-bit one's complement of the one's complement
sum of the CBT control header, calculated across all fields. sum of the CBT control header, calculated across all fields.
+ group identifier: multicast group address. + group identifier: multicast group address.
+ group mask: mask value for aggregated CBT joins/join-acks.
Zero for non-aggregated joins/join-acks.
+ packet origin: address of the CBT router that originated the + packet origin: address of the CBT router that originated the
control packet. control packet.
+ primary core address: the address of the primary core for the + primary core address: the address of the primary core for the
group. group.
+ target core address: desired core affiliation of control mes- + target core address: desired core affiliation of control mes-
sage. sage.
+ Core #Z: Z refers to some arbitrary IP address representing a + Core #1, #2, #3 etc.: IP address for each of a group's cores.
core.
+ The TLV fields at the end of the header are for a flow- + The TLV fields at the end of the header are for a flow-
identifier, and/or security options, if implemented. A "type" identifier, and/or security options, if implemented. A "type"
value of zero implies a "length" of zero, implying there is value of zero implies a "length" of zero, implying there is
no "value" field. no "value" field.
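The figure 6 layout, in its newer form that includes the "group mask" field, can be sketched as a packing routine. Several points are assumptions of the sketch rather than statements of the specification: the header length is expressed in bytes, "# cores" counts the entries of the core list starting with the target core, the checksum is left as zero for the caller to fill in, and no TLV options are appended. The function and parameter names are hypothetical.

```python
import socket
import struct


def pack_control_header(msg_type, code, group, origin, primary_core,
                        cores, group_mask="0.0.0.0", version=1):
    """Pack a CBT control header without TLV options.

    cores[0] is taken to be the target core (core #1 in figure 6).
    """
    fields = [group, group_mask, origin, primary_core] + list(cores)
    addresses = b"".join(socket.inet_aton(a) for a in fields)
    hdr_length = 8 + len(addresses)          # assumption: length in bytes
    fixed = struct.pack(
        "!BBBBHH",
        (version & 0x0F) << 4,               # vers in high nibble, unused low
        msg_type,
        code,
        len(cores),                          # assumption: counts core list
        hdr_length,
        0,                                   # checksum placeholder
    )
    return fixed + addresses


hdr = pack_control_header(1, 0, "239.1.2.3", "192.0.2.1", "192.0.2.10",
                          ["192.0.2.10", "192.0.2.20"])
print(len(hdr))
```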
8.3. CBT Control Message Types 10.3. CBT Control Message Types
There are eight types of CBT message. All are encoded in the CBT con- There are ten types of CBT message. All are encoded in the CBT con-
trol header, shown in figure 6. trol header, shown in figure 6.
+ JOIN-REQUEST (type 1): generated by a router and unicast to + JOIN-REQUEST (type 1): generated by a router and unicast to
the specified core address. It is processed hop-by-hop on its the specified core address. It is processed hop-by-hop on its
way to the specified core. Its purpose is to establish the way to the specified core. Its purpose is to establish the
sending CBT router, and all intermediate CBT routers, as part originating CBT router, and all intermediate CBT routers, as
of the corresponding delivery tree. Note that all cores are part of the corresponding delivery tree. Note that all cores
carried in join-requests. are carried in join-requests.
+ JOIN-ACK (type 2): an acknowledgement to the above. The full + JOIN-ACK (type 2): an acknowledgement to the above. The full
list of core addresses is carried in a JOIN-ACK, together list of core addresses is carried in a JOIN-ACK, together
with the actual core affiliation (the join may have been ter- with the actual core affiliation (the join may have been ter-
minated by an on-tree router on its journey to the specified minated by an on-tree router on its journey to the specified
core, and the terminating router may or may not be affiliated core, and the terminating router may or may not be affiliated
to the core specified in the original join). A JOIN-ACK to the core specified in the original join). A JOIN-ACK
traverses the same path as the corresponding JOIN-REQUEST, traverses the reverse path of the corresponding JOIN-REQUEST,
with each CBT router on the path processing the ack. It is with each CBT router on the path processing the ack. It is
the receipt of a JOIN-ACK that actually creates a tree the receipt of a JOIN-ACK that actually "fixes" tree state.
branch.
+ JOIN-NACK (type 3): a negative acknowledgement, indicating + JOIN-NACK (type 3): a negative acknowledgement, indicating
that the tree join process has not been successful. that the tree join process has not been successful.
+ QUIT-REQUEST (type 4): a request, sent from a child to a + QUIT-REQUEST (type 4): a request, sent from a child to a
parent, to be removed as a child to that parent. parent, to be removed as a child to that parent.
+ QUIT-ACK (type 5): acknowledgement to the above. If the + QUIT-ACK (type 5): acknowledgement to the above. If the
parent, or the path to it is down, no acknowledgement will be parent, or the path to it is down, no acknowledgement will be
received within the timeout period. This results in the received within the timeout period. This results in the
child nevertheless removing its parent information. child nevertheless removing its parent information.
+ FLUSH-TREE (type 6): a message sent from parent to all chil- + FLUSH-TREE (type 6): a message sent from parent to all chil-
dren, which traverses a complete branch. This message results dren, which traverses a complete branch. This message results
in all tree interface information being removed from each in all tree interface information being removed from each
router on the branch, possibly because of a re-configuration router on the branch, possibly because of a re-configuration
scenario. scenario.
+ CBT-ECHO-REQUEST (type 7): once a tree branch is established, + CBT-ECHO-REQUEST (type 7): once a tree branch is established,
this message acts as a ``keepalive'', and is unicast from this message acts as a "keepalive", and is unicast from
child to parent (one per link, NOT one per group). child to parent (can be aggregated from one per group to one
per link).
+ CBT-ECHO-REPLY (type 8): positive reply to the above. + CBT-ECHO-REPLY (type 8): positive reply to the above.
8.3.1. CBT Control Message Subcodes + CBT-BR-KEEPALIVE (type 9): applicable to border routers only,
when attaching a CBT domain to some other domain. See [11]
for more information.
+ CBT-BR-KEEPALIVE-ACK (type 10): acknowledgement to the above.
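The ten message types listed above can be collected into an enumeration, as in the sketch below. The type numbers come from the list; the class name, the member spellings (underscores in place of hyphens) and the POSITIVE_REPLY table are hypothetical conveniences, not part of the specification.

```python
from enum import IntEnum


class CbtMessageType(IntEnum):
    JOIN_REQUEST = 1
    JOIN_ACK = 2
    JOIN_NACK = 3
    QUIT_REQUEST = 4
    QUIT_ACK = 5
    FLUSH_TREE = 6
    CBT_ECHO_REQUEST = 7
    CBT_ECHO_REPLY = 8
    CBT_BR_KEEPALIVE = 9
    CBT_BR_KEEPALIVE_ACK = 10


# Request types paired with the positive reply described for each.
POSITIVE_REPLY = {
    CbtMessageType.JOIN_REQUEST: CbtMessageType.JOIN_ACK,
    CbtMessageType.QUIT_REQUEST: CbtMessageType.QUIT_ACK,
    CbtMessageType.CBT_ECHO_REQUEST: CbtMessageType.CBT_ECHO_REPLY,
    CbtMessageType.CBT_BR_KEEPALIVE: CbtMessageType.CBT_BR_KEEPALIVE_ACK,
}

print(CbtMessageType(7).name)
```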
10.3.1. CBT Control Message Subcodes
The JOIN-REQUEST has three valid subcodes: The JOIN-REQUEST has three valid subcodes:
+ ACTIVE-JOIN (code 0) - sent from a CBT router that has no + ACTIVE-JOIN (code 0) - sent from a CBT router that has no
children for the specified group. children for the specified group.
+ REJOIN-ACTIVE (code 1) - sent from a CBT router that has at + REJOIN-ACTIVE (code 1) - sent from a CBT router that has at
least one child for the specified group. least one child for the specified group.
+ REJOIN-NACTIVE (code 2) - generated by a router subsequent to + REJOIN-NACTIVE (code 2) - generated by a router subsequent to
skipping to change at page 27, line 37 skipping to change at page 27, line 5
REJOIN-ACTIVE. This message traverses the reverse-path of the REJOIN-ACTIVE. This message traverses the reverse-path of the
corresponding re-join, and is processed by each router on corresponding re-join, and is processed by each router on
that path. that path.
+ PRIMARY-NACTIVE-ACK (code 2) - sent by a primary core to ack- + PRIMARY-NACTIVE-ACK (code 2) - sent by a primary core to ack-
nowledge the receipt of a join-request received with subcode nowledge the receipt of a join-request received with subcode
REJOIN-NACTIVE. This ack is unicast directly to the router REJOIN-NACTIVE. This ack is unicast directly to the router
that generated the rejoin-Nactive, i.e. the ack is not that generated the rejoin-Nactive, i.e. the ack is not
processed hop-by-hop. processed hop-by-hop.
9. CBT Protocol and Port Numbers 11. CBT Protocol and Port Numbers
CBT mode (data) encapsulation (figure 2) requires an IP protocol
number assignment for CBT. An official protocol number has recently
been approved by the IANA; CBT has IP protocol number 7.
CBT control packets travel inside UDP datagrams, as the following
diagram illustrates:
++++++++++++++++++++++++++++++++++++++++++++
| IP header | UDP header | CBT control pkt |
++++++++++++++++++++++++++++++++++++++++++++
Figure 7. Encapsulation for CBT control messages
CBT therefore requires a UDP port assignment for control messages. CBT has been assigned IP protocol number 7, and UDP port number 7777.
An official UDP port number has recently been approved by the IANA; The UDP port number is only required for certain CBT implementations,
CBT control messages are received on UDP port 7777. as described at the beginning of section 10.
10. Default Timer Values 12. Default Timer Values
There are several CBT control messages which are transmitted at fixed There are several CBT control messages which are transmitted at fixed
intervals. These values, retransmission times, and timeout values, intervals. These values, retransmission times, and timeout values,
are given below. Note these are recommended default values only, and are given below. Note these are recommended default values only, and
are configurable with each implementation (all times are in seconds): are configurable with each implementation (all times are in seconds):
+ CBT-ECHO-INTERVAL 30 (time between sending successive CBT-ECHO- + CBT-ECHO-INTERVAL 30 (time between sending successive CBT-ECHO-
REQUESTs to parent). REQUESTs to parent).
+ PEND-JOIN-INTERVAL 10 (retransmission time for join-request if + PEND-JOIN-INTERVAL 10 (retransmission time for join-request if
skipping to change at page 29, line 5 skipping to change at page 27, line 43
+ CBT-ECHO-TIMEOUT 90 (time to consider parent unreachable) + CBT-ECHO-TIMEOUT 90 (time to consider parent unreachable)
+ CHILD-ASSERT-INTERVAL 90 (increment child timeout if no ECHO + CHILD-ASSERT-INTERVAL 90 (increment child timeout if no ECHO
rec'd from a child) rec'd from a child)
+ CHILD-ASSERT-EXPIRE-TIME 180 (time to consider child gone) + CHILD-ASSERT-EXPIRE-TIME 180 (time to consider child gone)
+ IFF-SCAN-INTERVAL 300 (scan all interfaces for group presence. + IFF-SCAN-INTERVAL 300 (scan all interfaces for group presence.
If none, send QUIT) If none, send QUIT)
11. Interoperability Issues + BR-KEEPALIVE-INTERVAL 200 (backup designated BR to designated BR
keepalive interval)
One of the design goals of CBT is for it to fully interwork with + BR-KEEPALIVE-RETRY-INTERVAL 30 (keepalive interval if BR fails
other IP multicast schemes. We have already described how CBT-style to respond)
packets are transformed into IP-style multicasts, and vice-versa.
In order for CBT to fully interwork with other schemes, it is neces- 13. Interoperability Issues
sary to define the interface(s) between a ``CBT cloud'' and the cloud
of another scheme. The CBT authors are currently working out the
details of interoperability, and we expect an interoperability docu-
ment to be available shortly.
12. CBT Security Architecture Interoperability between CBT and DVMRP has recently been defined in
ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-cbt-dvmrp-00.txt.
see current I-D: ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-mkd- Interoperability with other multicast protocols will be fully speci-
01.{ps,txt} fied shortly.
14. CBT Security Architecture
see [4].
Acknowledgements Acknowledgements
Special thanks go to Paul Francis, NTT Japan, for the original Special thanks go to Paul Francis, NTT Japan, for the original
brainstorming sessions that brought about this work. brainstorming sessions that brought about this work.
Thanks too to Sue Thompson (Bellcore). Her detailed reviews led to Thanks too to Sue Thompson (Bellcore). Her detailed reviews led to
the identification of some subtle protocol flaws, and she suggested the identification of some subtle protocol flaws, and she suggested
several simplifications. several simplifications.
skipping to change at page 30, line 7 skipping to change at page 29, line 7
Thanks also to Ken Carlberg (SAIC) for reviewing the text, and gen- Thanks also to Ken Carlberg (SAIC) for reviewing the text, and gen-
erally providing constructive comments throughout. erally providing constructive comments throughout.
I would also like to thank the participants of the IETF IDMR working I would also like to thank the participants of the IETF IDMR working
group meetings for their general constructive comments and sugges- group meetings for their general constructive comments and sugges-
tions since the inception of CBT. tions since the inception of CBT.
APPENDIX A APPENDIX A
A single rejoin could be sent for all the groups the keepalive There are situations where it is advantageous to send a single join-
represents. This constitutes an aggregated rejoin strategy; a single request that represents potentially many groups. One such example is
rejoin message can serve to rejoin multiple groups to their respec- provided in [11], whereby a designated border router is required to
tive trees, provided those groups share a common core (that which is join all groups inside a CBT domain.
being rejoined). Therefore, it may be that several rejoins need to be
sent to re-connect all groups traversing the router after a failure.
Similarly, the corresponding join-ack would represent an aggregate.
NOTE: it remains to be worked out how the new parent establishes from Such aggregated joining is only possible if each of the groups the
the aggregated rejoin all those groups which the rejoin represents join represents shares a common corelist. Furthermore, aggregation is
(so the new parent can create/modify the necessary FIB entries). A only efficient over contiguous ranges of group addresses; the "group
"group aggregate" field may be necessary in the control packet. mask" field in the CBT control packet header is used to specify a
Alternatively, when the ack is received in response to the rejoin, CIDR-like group address mask.
each group represented by the rejoin sends a group-specific echo
until an ack is received for each.
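How a router might decide whether an aggregated join covers a given group can be sketched as below, reading the "group mask" field as a CIDR-like mask over the group identifier. The function name is hypothetical, and treating a zero mask as an exact match on the group identifier is an interpretation of "zero for non-aggregated joins/join-acks", not a rule stated in the text.

```python
import ipaddress


def join_covers_group(group_id, group_mask, candidate):
    """True if candidate falls within the range (group_id, group_mask)."""
    gid = int(ipaddress.IPv4Address(group_id))
    mask = int(ipaddress.IPv4Address(group_mask))
    cand = int(ipaddress.IPv4Address(candidate))
    if mask == 0:
        # Assumption: a zero mask marks a non-aggregated join.
        return cand == gid
    return (cand & mask) == (gid & mask)


print(join_covers_group("239.1.0.0", "255.255.0.0", "239.1.2.3"))
```

The common-corelist requirement is not modelled here; a caller would still need to confirm that every covered group shares the corelist carried in the join.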
Authors' Addresses: Authors' Addresses:
Tony Ballardie, Tony Ballardie,
Department of Computer Science, Department of Computer Science,
University College London, University College London,
Gower Street, Gower Street,
London, WC1E 6BT, London, WC1E 6BT,
ENGLAND, U.K. ENGLAND, U.K.
skipping to change at page 31, line 14 skipping to change at page 30, line 9
Billerica, MA 01821, Billerica, MA 01821,
USA. USA.
Tel: ++1 508 670 8888 Tel: ++1 508 670 8888
e-mail: njain@BayNetworks.com e-mail: njain@BayNetworks.com
References References
[1] DVMRP. Described in "Multicast Routing in a Datagram Internet- [1] DVMRP. Described in "Multicast Routing in a Datagram Internet-
work", S. Deering, PhD Thesis, 1990. Available via anonymous ftp from: work", S. Deering, PhD Thesis, 1990. Available via anonymous ftp from:
gregorio.stanford.edu:vmtp/sd-thesis.ps. gregorio.stanford.edu:vmtp/sd-thesis.ps. NOTE: DVMRP version 3 is
specified as a working draft.
[2] J. Moy. Multicast Routing Extensions to OSPF. Communications of [2] J. Moy. Multicast Routing Extensions to OSPF. Communications of
the ACM, 37(8): 61-66, August 1994. the ACM, 37(8): 61-66, August 1994.
[3] D. Farinacci, S. Deering, D. Estrin, and V. Jacobson. Protocol [3] D. Farinacci, S. Deering, D. Estrin, and V. Jacobson. Protocol
Independent Multicast (PIM) Dense-Mode Specification (draft-ietf- Independent Multicast (PIM) Dense-Mode Specification (draft-ietf-
idmr-pim-spec-01.ps). Working draft, 1994. idmr-pim-spec-01.ps). Working draft, 1994.
[4] A. J. Ballardie. Scalable Multicast Key Distribution [4] A. J. Ballardie. Scalable Multicast Key Distribution; RFC XXXX,
(ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-mkd-01.{ps,txt}). Work- SRI Network Information Center, 1996.
ing draft, 1995.
[5] A. J. Ballardie. "A New Approach to Multicast Communication in a [5] A. J. Ballardie. "A New Approach to Multicast Communication in a
Datagram Internetwork", PhD Thesis, 1995. Available via anonymous ftp Datagram Internetwork", PhD Thesis, 1995. Available via anonymous ftp
from: cs.ucl.ac.uk:darpa/IDMR/ballardie-thesis.ps.Z. from: cs.ucl.ac.uk:darpa/IDMR/ballardie-thesis.ps.Z.
[6] W. Fenner. Internet Group Management Protocol, version 2 (IGMPv2), [6] W. Fenner. Internet Group Management Protocol, version 2 (IGMPv2),
(draft-idmr-igmp-v2-01.txt). (draft-idmr-igmp-v2-01.txt).
[7] B. Cain, S. Deering, A. Thyagarajan. Internet Group Management [7] B. Cain, S. Deering, A. Thyagarajan. Internet Group Management
Protocol Version 3 (IGMPv3) (draft-cain-igmp-00.txt). Protocol Version 3 (IGMPv3) (draft-cain-igmp-00.txt).
skipping to change at line 1246 skipping to change at page 30, line 42
[8] M. Handley, J. Crowcroft, I. Wakeman. Hierarchical Rendezvous [8] M. Handley, J. Crowcroft, I. Wakeman. Hierarchical Rendezvous
Point proposal, work in progress. Point proposal, work in progress.
(http://www.cs.ucl.ac.uk/staff/M.Handley/hpim.ps) and (http://www.cs.ucl.ac.uk/staff/M.Handley/hpim.ps) and
(ftp://cs.ucl.ac.uk/darpa/IDMR/IETF-DEC95/hpim-slides.ps). (ftp://cs.ucl.ac.uk/darpa/IDMR/IETF-DEC95/hpim-slides.ps).
[9] D. Estrin et al. USC/ISI, Work in progress. [9] D. Estrin et al. USC/ISI, Work in progress.
(http://netweb.usc.edu/pim/). (http://netweb.usc.edu/pim/).
[10] D. Estrin et al. PIM Sparse Mode Specification. (draft-ietf- [10] D. Estrin et al. PIM Sparse Mode Specification. (draft-ietf-
idmr-pim-sparse-spec-00.txt). idmr-pim-sparse-spec-00.txt).
[11] A. Ballardie. CBT Multicast Interoperability - Stage 1; Working
draft, April 1996. Also available from:
ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-cbt-dvmrp-00.txt
 End of changes. 

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/