draft-ietf-pim-bidir-00.txt   draft-ietf-pim-bidir-01.txt 
PIM Working Group Mark Handley Internet Engineering Task Force PIM WG
Internet Draft ACIRI INTERNET-DRAFT Mark Handley/ACIRI
Expiration Date: September, 2000 Isidor Kouvelas draft-ietf-pim-bidir-01.txt Isidor Kouvelas/Cisco
cisco Systems Lorenzo Vicisano/Cisco
Lorenzo Vicisano 23 November 2000
cisco Systems Expires: May 2001
March 1, 2000
Bi-directional Protocol Independent Multicast Bi-directional Protocol Independent Multicast (BIDIR-PIM)
<draft-ietf-pim-bidir-00.txt>
Status of this Memo Status of this Document
This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This document is a product of the IETF PIM WG. Comments should be
addressed to the authors, or the WG's mailing list at
pim@catarina.usc.edu.
Abstract
This document discusses Bi-directional PIM, a variant of PIM
Sparse-Mode [9] that builds bi-directional shared trees
connecting multicast sources and receivers. Bi-directional
trees are built using a fail-safe Designated Forwarder (DF)
election mechanism operating on each link of a multicast
topology. With the assistance of the DF, multicast data is
natively forwarded from sources to the Rendezvous-Point
without requiring source-specific state. The DF election
takes place at RP discovery time and provides a default route
to the RP, thus eliminating the requirement for data-driven
protocol events.
1 Introduction Note on BIDIR-PIM status
This document discusses Bi-directional PIM, a variant of PIM Sparse
Mode [1] that builds bi-directional shared trees connecting multicast
sources and receivers.
PIM Sparse-Mode (PIM-SM) version 1 and version 2 construct uni- The differences between this version of the BIDIR-PIM specification and
directional shared trees that are used to forward data from senders draft-ietf-pim-bidir-new-00.txt are mostly in the format of the
to receivers of a multicast group. PIM-SM also allows the construc- information presented. As BIDIR-PIM has many similarities in operation
tion of source specific trees, but this capability is not related to to Sparse-Mode PIM, the earlier version of this spec relied heavily on
the proposal described in this document. the now obsolete PIM-SM [11] specification. This revision removes this
dependency and instead references the new Sparse-Mode documentation [9]
where necessary. In addition the method in which the protocol
specification is presented has been updated to follow the format of [9].
The shared tree for each multicast group is rooted at a multicast Table of Contents
router called the Rendezvous Point (RP). Different multicast group
ranges can use separate RPs within a PIM domain.
In unidirectional sparse-mode PIM, there are two possible methods for 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2. Terminology
2.1. Definitions
2.2. Pseudocode Notation
3. Protocol Specification
3.1. BIDIR-PIM Protocol State
3.1.1. General Purpose State
3.1.2. RP State
3.1.3. Group State
3.1.4. State Summarization Macros
3.2. PIM Neighbor Discovery
3.3. Data Packet Forwarding Rules
3.3.1. Source-Only Branches
3.4. PIM Join/Prune Messages
3.4.1. Receiving (*,G) Join/Prune Messages
3.4.2. Sending Join/Prune Messages
3.5. Designated Forwarder (DF) Election
3.5.1. DF Requirements
3.5.2. DF Election description
3.5.2.1. Bootstrap Election
3.5.2.2. Loser Metric Changes
3.5.2.3. Winner Metric Changes
3.5.2.4. Winner Loses Path
3.5.2.5. Late Router Starting Up
3.5.2.6. Winner Dies
3.5.3. Election Protocol Specification
3.5.3.1. Election State
3.5.3.2. Election Messages
3.5.3.3. Election Events
3.5.3.4. Election State Transitions
3.6. Timers and Constants
3.7. PIM DF Election Packet Formats
3.7.1. Backoff Message
3.7.2. Pass Message
4. RP Discovery
5. Security Considerations
5.1. Appendix A: Election Reliability Enhancements
5.1.1. A.1 Missing Pass
5.1.2. A.2 Periodic Winner Announcement
5.2. Appendix B: Interoperability with legacy code
5.3. Appendix C: Comparison with PIM-SM
6. Authors' Addresses
7. Acknowledgments
8. References
9. Index
1. Introduction
This document discusses Bi-directional PIM, a variant of PIM Sparse-Mode
(PIM-SM) [9] that builds bi-directional shared trees connecting
multicast sources and receivers.
PIM-SM constructs uni-directional shared trees that are used to forward
data from senders to receivers of a multicast group. PIM-SM also allows
the construction of source specific trees, but this capability is not
related to the protocol described in this document.
The shared tree for each multicast group is rooted at a multicast router
called the Rendezvous Point (RP). Different multicast group ranges can
use separate RPs within a PIM domain.
In unidirectional PIM-SM, there are two possible methods for
distributing data packets on the shared tree. These differ in the way
packets are forwarded from a source to the RP:
o Initially when a source starts transmitting, its first-hop router
  encapsulates data packets in special control messages (Registers)
  which are unicast to the RP. After reaching the RP the packets are
  decapsulated and distributed on the shared tree.
o A transition from the above distribution mode can be made at a later
  stage. This is achieved by building source-specific state on all
  routers along the path between the source and the RP. This state is
  then used to natively forward packets from that source.
Both these mechanisms suffer from problems. Encapsulation results in
significant processing, bandwidth and delay overheads. Forwarding using
source-specific state has additional protocol and memory requirements.
Bi-directional PIM dispenses with both encapsulation and source state by
allowing packets to be natively forwarded from a source to the RP using
shared tree state. For a complete discussion of the pros and cons of
Bi-directional PIM consult appendix C.
The ideas presented in this document are similar to those described in
[10]. The main difference between the two proposals is in the method
used to forward packets traveling upstream from a source to the RP. In
particular [10] uses an IP option (UMP option) on data packets to assist
with upstream forwarding. The UMP option identifies the next hop router
responsible for forwarding the packet upstream. In contrast, this
proposal does not alter data packets to embed control information.
Instead the identification of the next hop upstream forwarder is
performed at RP discovery time using a fail-safe election mechanism.
In contrast, this proposal does not alter data packets to embed con- 2. Terminology
trol information. Instead the identification of the next hop upstream
forwarder is performed at RP discovery time using a fail-safe elec-
tion mechanism. This significantly simplifies forwarding procedures
and eliminates forwarding loops and packet duplication problems that
exist in [2]. Appendix D presents a comparison between the proposal
in this document and [2].
The rest of this document is structured as follows. Section 2 defines In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
basic terms. Section 3 describes bidirectional tree formation and "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
forwarding. The new forwarding rules rely heavily on an election "OPTIONAL" are to be interpreted as described in RFC 2119 and indicate
mechanism described in section 4. requirement levels for compliant PIM-SM implementations.
2 Definitions 2.1. Definitions
In the discussion below, the terms upstream, downstream and RPF This specification uses a number of terms to refer to the roles of
interface are always referring to the shared tree rooted at the Ren- routers participating in BIDIR-PIM. The following terms have special
dezvous Point. Downstream indicates the direction on which packets significance for BIDIR-PIM:
travel from the RP to receivers along the shared tree. Upstream indi-
cates the opposite direction used by packets traveling from sources
to the RP. The RPF interface for a group is the interface unicast
routing uses to reach the RP.
We assume that the reader is familiar with the unidirectional PIM-SM MRIB Multicast Routing Information Base. This is the multicast
protocol [1], as much of the functionality is common to the version topology table, which is typically derived from the unicast
of bidir PIM described below. In particular in the rest of this docu- routing table, or routing protocols such as MBGP that carry
ment we will use the concepts of (*,G), (S,G) and (*,*,RP) state and multicast-specific topology information. In PIM-SM this is used
their component fields (olist, iif, ...). We will also reference Join to make decisions regarding where to forward Join/Prune messages
and Prune messages whose semantics and packet formats are defined in and in BIDIR-PIM is used as a source for routing metrics for the
[1]. In the context of this document, entries in Join and Prune mes- DF election process.
sages always have the RP and WC bits set. Also, default timer values
are the ones given in [1].
The protocol presented in this document is largely based on the con- Rendezvous Point (RP):
cept of a Designated Forwarder (DF). A single DF exists for each RP An RP is a router that has been configured to be used as the root
on every link within a PIM domain (this includes both multi-access of the distribution tree for a range of multicast groups. Join
and point-to-point links). The DF is the router on the link with the messages from receivers for a group are sent towards the RP.
best unicast route to the RP. A DF for a given RP is in charge of
forwarding traffic downstream onto the link, and forwarding upstream
traffic from the link towards the RP. It does this for all the bi-
directional groups served by the RP. For those familiar with the DR
in PIM-SM, the Bidir DF provides the same support for local
receivers. The DF election procedures are described in section 4.
3 Tree Building and Forwarding Upstream
Towards the root (Rendezvous-Point) of the tree. The direction
used by packets traveling from sources to the RP.
This section describes how bi-directional tree building procedures Downstream
and forwarding rules vary from normal PIM-SM operation. Away from the root of the tree. The direction on which packets
travel from the RP to receivers.
A router learns which multicast addresses will be used for sparse- Designated Forwarder (DF):
mode PIM and which will be for bidirectional groups along with the The protocol presented in this document is largely based on the
candidate RP information through PIM-SM bootstrap messages. Thus concept of a Designated Forwarder (DF). A single DF exists for
unidirectional and bidirectional groups can coexist in the same each RP on every link within a BIDIR-PIM domain (this includes
domain. both multi-access and point-to-point links). The DF is the router
on the link with the best unicast route to the RP. A DF for a
given RP is in charge of forwarding downstream traffic onto the
link, and forwarding upstream traffic from the link towards the
RP. It does this for all the bi-directional groups served by the
RP. The DF on a link is also responsible for interpreting IGMP
information from local receivers and originating Join messages
towards the RP.
Throughout the section it is assumed that on each link, all the RPF Interface
routers have a consistent view on which router has the best path to RPF stands for "Reverse Path Forwarding". The RPF Interface of a
the RP. This router is called the DF for that RP on the link. This router with respect to an address is the interface that the MRIB
assumption rests on the DF election procedures described in section indicates should be used to forward packets to that address. In
4. the case of a BIDIR-PIM multicast group, the RPF interface is the
interface that would be used to send packets to the RP for the
group.
In the procedures described in the rest of this section, if DF infor- RPF Neighbor
mation is required but not available (election is incomplete), then The RPF Neighbor of a router with respect to an address is the
no tree building or forwarding action is taken. neighbor that the MRIB indicates should be used to forward packets
to that address. Note that in BIDIR-PIM, the RPF neighbor for a
group is not necessarily the router on the RPF interface that Join
messages for that group would be directed to (Join messages are
directed to the DF on the RPF interface for the group).
3.1 Tree Building TIB Tree Information Base. This is the collection of state at a PIM
router that has been created by receiving PIM Join/Prune messages,
PIM DF election messages and IGMP information from local hosts.
It essentially stores the state of all multicast distribution
trees at that router.
3.1.1 Joining the (Shared) Tree MFIB Multicast Forwarding Information Base. The TIB holds all the
state that is necessary to forward multicast packets at a router.
However, although this specification defines forwarding in terms
of the TIB, to actually forward packets using the TIB is very
inefficient. Instead a real router implementation will normally
build an efficient MFIB from the TIB state to perform forwarding.
How this is done is implementation-specific, and is not discussed
in this document.
The procedures for joining the (*,G) shared tree, are almost identi- 2.2. Pseudocode Notation
cal to those used in PIM-SM with the difference that the tasks of the
DR are handled by the DF.
When a router receives a membership indication from IGMP for a We use set notation in several places in this specification.
bidirectional group G with rendezvous point RP, and it is the DF for
the RP on the link on which the report was received, the following
steps are taken:
o If no (*,G) state exists for the group, then a (*,G) entry is A (+) B
created and populated with the RP DF information. is the union of two sets A and B.
o If the interface on which the report was received is not in the A (-) B
olist of the entry, then the interface is added to the olist. is the elements of set A that are not in set B.
o According to standard PIM-SM procedures [1], if the olist transi- NULL
tioned from null to non-null, a Join message for the group is trig- is the empty set or list.
gered upstream. The Join is directed to the DF for the (*,G) incom-
ing interface.
When a router receives a Join message addressed to it for a bidir In addition we use C-like syntax:
group G with rendezvous point RP, it must determine if it is the DF
on the link for this RP. If the router is not the DF, it must ignore
the Join message. If it is the DF, then the following steps are
taken:
o If no (*,G) state exists for the group, then a (*,G) entry is = denotes assignment of a variable.
created and populated with the RP DF information.
o If the interface on which the Join was received is not in the olist == denotes a comparison for equality.
of the entry, then the interface is added to the olist.
o According to standard PIM-SM procedures [1], the expiration timer != denotes a comparison for inequality.
of the olist interface is updated and if the olist transitioned
from null to non-null, a Join message for the group is triggered
upstream. The Join is directed to the DF for the (*,G) incoming
interface.
3.1.2 Leaving the (shared) tree Braces { and } are used for grouping.
When the DF for a link receives notification that an interface is no 3. Protocol Specification
longer required in the olist of a group (either through IGMP or by
receiving a Prune), it follows standard PIM-SM procedures except any
originated prunes are addressed to the DF on the (*,G) iif.
3.1.3 Designated Forwarder Change The specification of BIDIR-PIM is broken into several parts:
When the DF for a RP on a link changes to a different router, tree o Section 3.1 details the protocol state stored.
maintenance has to take place to ensure that traffic is still
delivered for all affected groups.
3.1.3.1 Old DF Actions o Section 3.3 specifies the data packet forwarding rules.
On losing its status as acting DF on a link, the old DF has to take o Section 3.4 specifies the BIDIR-PIM Join/Prune generation and
the following actions for existing groups that are affected. processing rules.
o If there were downstream receivers (discovered through IGMP or o Designated Forwarder (DF) election is specified in Section 3.5.
downstream Joins), the router has to delete the interface to the
link from its olist.
o If the interface deletion results in a null olist for the (*,G) o PIM packet formats are specified in Section 3.7.
then the usual actions are taken to propagate a Prune upstream.
3.1.3.2 New DF Actions o A summary of BIDIR-PIM timers and their default values is given in
Section 3.6.
On assuming the role of the DF for a given link, a router has to take 3.1. BIDIR-PIM Protocol State
the following actions for each existing group that is affected. If
the router has IGMP information from local receivers for a group, the
interface to the link must be added to the olist for the (*,G). If
the (*,G) entry did not exist then it must be created and populated
with the RP DF information. If the (*,G) olist was previously null
then the usual actions are taken to propagate a Join upstream.
3.1.3.3 Downstream Router Actions This section specifies all the protocol state that a BIDIR-PIM
When learning about a switch to a new DF on the RPF interface, down- implementation should maintain in order to function correctly. We term
stream routers must take the following actions for all affected this state the Tree Information Base or TIB, as it holds the state of
groups. all the multicast distribution trees at this router. In this
specification we define PIM mechanisms in terms of the TIB. However,
only a very simple implementation would actually implement packet
forwarding operations in terms of this state. Most implementations will
use this state to build a multicast forwarding table, which would then
be updated when the relevant state in the TIB changes.
o If the router has a (*,G) entry with a non-null olist, it must send Although we specify precisely the state to be kept, this does not mean
a Join for the group towards the new DF. that an implementation of PIM-SM needs to hold the state in this form.
This is actually an abstract state definition, which is needed in order
to specify the router's behavior. A BIDIR-PIM implementation is free to
hold whatever internal state it requires, and will still be conformant
with this specification so long as it results in the same externally
visible protocol behavior as an abstract router that holds the following
state.
o The router may also send a Prune for the group towards the old DF. We divide TIB state into two sections:
3.2 Forwarding Data RP state
State that maintains the DF election information for each RP.
The following responsibilities are uniquely assigned to the DF of a Group state
link: State that maintains a group-specific tree for groups that map to a
given RP.
o The DF is the only router that forwards packets traveling down- The state that should be kept is described below. Of course,
stream onto the link. implementations will only maintain state when it is relevant to
forwarding operations - for example, the "NoInfo" state might be assumed
from the lack of other state information, rather than being held
explicitly.
o The DF is the only router that picks-up upstream traveling packets 3.1.1. General Purpose State
off the link to forward towards the RP.
Non-DF routers on a link, that use that link to reach the RP, may A router holds the following state that is not specific to a RP or
perform the following forwarding actions for bidirectional groups: group:
Neighbor State:
For each neighbor:
o Information from neighbor's Hello
o Neighbor's Gen ID.
o Neighbor liveness timer (NLT)
3.1.2. RP State
A router maintains a multicast-group to RP mapping which is built using
the BSR mechanism described in section 4. For each BIDIR-PIM RP a router
holds the following state:
o RP address
Designated Forwarder (DF) State:
For each router interface:
Acting DF information:
o DF IP Address
o DF metric
Election information:
o DF Election-Timer (DFT)
o Offer-Count (OC)
Current best offer:
o IP address of best offering router
o Best offering router metric
Designated Forwarder state is described in section 3.5.
3.1.3. Group State
For every group G a router keeps the following state:
Group state:
For each interface:
Local Membership:
o State: One of {"NoInfo", "Include"}
PIM Join/Prune State:
o State: One of {"NoInfo" (NI), "Join" (J),
"PrunePending" (PP)}
o Prune Pending Timer (PPT)
o Join/Prune Expiry Timer (ET)
Not interface specific:
o Upstream Join/Prune Timer (JT)
o Last RP Used
Local membership is the result of the local membership mechanism (such
as IGMP) running on that interface. This information is used by the
pim_include(G) macro described in section 3.1.4.
PIM Join/Prune state is the result of receiving PIM (*,G) Join/Prune
messages on this interface, and is specified in section 3.4.1. The state
is used by the macros that calculate the outgoing interface list in
section 3.1.4, and in the JoinDesired(G) macro (defined in section
3.4.2) that is used in deciding whether a Join(*,G) should be sent
upstream.
The upstream Join/Prune timer is used to send out periodic Join(*,G)
messages, and to override Prune(*,G) messages from peers on an upstream
LAN interface.
The last RP used must be stored because if the RP Set changes [9] then
state must be torn down and rebuilt for groups whose RP changes.
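
The group state above, and the reason for storing the last RP used, can
be sketched as follows. This is only an illustrative sketch, not part
of the specification: the class and function names are hypothetical,
timers (PPT, ET, JT) are omitted, and the group-to-RP mapping is
assumed to be available as a simple lookup function.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class JPState(Enum):
    """Per-interface PIM Join/Prune state names from section 3.1.3."""
    NO_INFO = "NoInfo"
    JOIN = "Join"
    PRUNE_PENDING = "PrunePending"


@dataclass
class InterfaceGroupState:
    # Local Membership: True models "Include", False models "NoInfo".
    local_membership_include: bool = False
    jp_state: JPState = JPState.NO_INFO


@dataclass
class GroupState:
    # Not interface specific: the RP this group's state was built for.
    last_rp_used: str
    # Per-interface state, keyed by a hypothetical interface name.
    interfaces: Dict[str, InterfaceGroupState] = field(default_factory=dict)


def groups_needing_rebuild(tib: Dict[str, GroupState],
                           rp_for_group: Callable[[str], str]) -> List[str]:
    """Return the groups whose current RP differs from the last RP used.

    These are the groups whose state would be torn down and rebuilt
    after an RP Set change.
    """
    return sorted(g for g, st in tib.items()
                  if rp_for_group(g) != st.last_rp_used)
```

Comparing the stored RP against the current mapping is what makes the
"Last RP Used" field necessary: without it a router could not tell
which groups are affected by an RP Set change.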
3.1.4. State Summarization Macros
Using this state, we define the following "macro" definitions which we
will use in the descriptions of the state machines and pseudocode in the
following sections.
olist(G) =
RPF_interface(RP(G)) (+) joins(G) (+) pim_include(G)
RPF_interface(RP) is the interface the MRIB indicates would be used to
route packets to RP. The olist(G) is the list of interfaces on which
packets to group G must be forwarded.
The macro pim_include(G) indicates the interfaces to which traffic might
be forwarded because of hosts that are local members on that interface.
pim_include(G) =
{ all interfaces I such that:
I_am_DF(RP(G),I) AND local_receiver_include(G,I) }
The clause "I_am_DF(RP,I)" is TRUE if the router is in the Win or
Backoff states in the DF election state machine in section 3.5.
The clause "local_receiver_include(G,I)" is true if the IGMP module or
other local membership mechanism has determined that there are local
members on interface I that desire to receive traffic sent to group G.
The set "joins(G)" is the set of all interfaces on which the router has
received (*,G) Joins:
joins(G) =
{ all interfaces I such that
I_am_DF(RP(G),I) AND
DownstreamJPState(G,I) is either Join or PrunePending }
DownstreamJPState(G,I) is the state of the finite state machine in
section 3.4.1.
RPF'(RP) is the neighbor that Join messages must be sent to in order to
reach the RP. This is the Designated-Forwarder on the RPF_interface(RP).
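
As an illustration, the macros of this section map directly onto set
operations. The following is a minimal sketch under stated
assumptions: RouterState and its field names are hypothetical, DF
election results are assumed to be already known, and the Join/Prune
state names are those listed in section 3.1.3.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class RouterState:
    """Hypothetical container for the TIB fields the macros read."""
    interfaces: Set[str]
    rp_of_group: Dict[str, str]      # group -> RP address, i.e. RP(G)
    rpf_interface: Dict[str, str]    # RP address -> RPF_interface(RP)
    # (RP, I) pairs for which this router is the acting DF.
    df_on: Set[Tuple[str, str]] = field(default_factory=set)
    # (G, I) pairs with local members, i.e. local_receiver_include(G,I).
    local_members: Set[Tuple[str, str]] = field(default_factory=set)
    # (G, I) -> downstream Join/Prune state name.
    jp_state: Dict[Tuple[str, str], str] = field(default_factory=dict)


def i_am_df(r: RouterState, rp: str, i: str) -> bool:
    return (rp, i) in r.df_on


def pim_include(r: RouterState, g: str) -> Set[str]:
    rp = r.rp_of_group[g]
    return {i for i in r.interfaces
            if i_am_df(r, rp, i) and (g, i) in r.local_members}


def joins(r: RouterState, g: str) -> Set[str]:
    rp = r.rp_of_group[g]
    return {i for i in r.interfaces
            if i_am_df(r, rp, i)
            and r.jp_state.get((g, i), "NoInfo") in ("Join", "PrunePending")}


def olist(r: RouterState, g: str) -> Set[str]:
    # olist(G) = RPF_interface(RP(G)) (+) joins(G) (+) pim_include(G)
    rp = r.rp_of_group[g]
    return {r.rpf_interface[rp]} | joins(r, g) | pim_include(r, g)
```

Note that an interface with local members or Join state contributes
to olist(G) only if the router is the DF on that interface; the RPF
interface is always included.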
3.2. PIM Neighbor Discovery
PIM routers exchange PIM Hello messages with their neighboring PIM
routers. These messages are used to update the Neighbor State described
in section 3.1.1. The procedures for generating and processing received
Hello messages as well as maintaining Neighbor State are specified in
the PIM-SM [9] documentation.
3.3. Data Packet Forwarding Rules
For groups mapping to a given RP, the following responsibilities are
uniquely assigned to the DF for that RP on each link:
o The DF is the only router that forwards packets traveling downstream
  onto the link.
o The DF is the only router that picks up upstream-traveling packets
  off the link to forward towards the RP.
Non-DF routers on a link that use that link as their RPF interface to
reach the RP may perform the following forwarding actions for
bidirectional groups:
o Forward packets from the link towards downstream receivers.
o Forward packets from downstream sources onto the link (provided they
  are the DF for the downstream link from which the packet was picked
  up).
When a router receives a multicast packet sent to a bidir group G, it The BIDIR-PIM packet forwarding rules are defined below in pseudocode.
first looks for a (*,G) matching entry. If this is not found, then
the matching (*,*,RP) state may be used. Alternatively (*,G) state
may be created with a null olist and populated with the RP DF infor-
mation.
The router must forward the packet if either: iif is the incoming interface of the packet.
G is the destination address of the packet (group address).
RP is the address of the Rendezvous Point for this group.
o It was received on the RPF interface of the entry (always forward First we check to see whether the packet should be accepted based on TIB
downstream traveling packets) state and the interface that the packet arrived on. A packet is accepted
if it arrives on the RPF_interface to reach the RP (downstream traveling
packet) or if the router is the DF on the interface the packet arrives
(upstream traveling packet).
o The router is the Designated Forwarder (DF) for the RP on the If the packet should be forwarded we build an outgoing interface list
interface the packet was received (only the DF forwards upstream). for the packet.
If a decision to forward the packet is made, then it is forwarded on Finally we remove the incoming interface from the outgoing interface
all the interfaces in the olist including the entry's RPF interface list we've created, and if the resulting outgoing interface list is not
but excluding the interface the packet was received on. Otherwise the empty, we forward the packet out of those interfaces.
packet is discarded.
Note: A major advantage of using a Designated Forwarder in
bi-directional PIM is that special treatment is no longer required for
sources that are directly connected to a router. Data from such
sources does not need to be differentiated from other multicast
traffic and will automatically be picked up by the DF. This removes
the need for performing a directly connected check for data to groups
that do not have existing state.
On receipt of data to G on interface iif:
if( iif == RPF_interface(RP) || I_am_DF(RP,iif) ) {
    oiflist = olist(G)
    oiflist = oiflist (-) iif
    forward packet on all interfaces in oiflist
}
3.2.1 Source-only Branches
Source-only branches of the distribution tree for a group are Note: A major advantage of using a Designated Forwarder in BIDIR-PIM
branches which do not lead to any receivers, but which are used to compared to PIM-SM is that special treatment is no longer required for
forward packets traveling upstream from sources towards the RP. sources that are directly connected to a router. Data from such sources
Routers along source-only branches do not have an olist for the group does not need to be differentiated from other multicast traffic and will
and hence do not need to maintain (*,G) state. Upstream forwarding automatically be picked up by the DF. This removes the need for
can be performed using (*,*,RP) state. An implementation may decide performing a directly-connected-source check for data to groups that do
to maintain (*,G) state for accounting or performance reasons. not have existing state.
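The forwarding rule described above can be sketched in Python as follows. This is an illustration only: the function name forward_interfaces, its parameters, and the use of strings and sets to model interfaces are assumptions made for the example and are not defined by the specification.

```python
def forward_interfaces(iif, rpf_interface, df_interfaces, olist):
    """Return the set of interfaces a data packet for G is forwarded on.

    iif           -- interface the packet arrived on
    rpf_interface -- RPF_interface(RP) for the RP of the group
    df_interfaces -- interfaces on which this router is the DF for that RP
    olist         -- olist(G), the outgoing interface list for the group
    """
    # Accept downstream packets (arriving on the RPF interface) and
    # upstream packets (arriving where this router is the DF).
    if iif != rpf_interface and iif not in df_interfaces:
        return set()  # packet is discarded
    # Never send the packet back out of the interface it arrived on.
    return set(olist) - {iif}
```

An empty result means the packet is not forwarded, either because it was not accepted or because no interface other than iif is in the olist.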
4 Designated Forwarder 3.3.1. Source-Only Branches
Source-only branches of the distribution tree for a group G are branches
which do not lead to any receivers, but which are used to forward
packets traveling upstream from sources towards the RP. Routers along
source-only branches only have the RPF_interface to the RP in their
olist for G and hence do not need to maintain any group specific state.
Upstream forwarding can be performed using RP state. An implementation
may decide to maintain group state for source-only branches for
accounting or performance reasons.
3.4. PIM Join/Prune Messages
A BIDIR-PIM Join/Prune message consists of a list of Joined and Pruned
Groups. When processing a received Join/Prune message, each Joined or
Pruned Group is effectively considered individually by applying the
following state machines. When considering a Join/Prune message whose
PIM Destination field addresses this router, (*,G) Joins and Prunes can
affect the downstream state machine. When considering a Join/Prune
message whose PIM Destination field addresses another router, most Join
or Prune entries could affect the upstream state machine.
3.4.1. Receiving (*,G) Join/Prune Messages
When a router receives a Join(*,G) or Prune(*,G) it must first check to
see whether the RP in the message matches RP(G) (the router's idea of
who the RP is). If the RP in the message does not match RP(G) the Join
or Prune MUST be silently dropped. In addition, a router MUST NOT process
Join(*,G) messages targeted to itself if it is not the DF for RP(G) on
the interface on which the message was received.
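The two acceptance checks above might be expressed as a single predicate. The sketch below is hypothetical: the function name and its boolean parameters are invented for illustration, and a real implementation would derive them from the message and from the DF election state.

```python
def may_process_join_prune(kind, msg_rp, rp_of_group, targeted_to_me, i_am_df):
    """Decide whether a received (*,G) Join or Prune may be processed.

    kind           -- "join" or "prune"
    msg_rp         -- RP address carried in the message
    rp_of_group    -- RP(G), this router's idea of the RP for the group
    targeted_to_me -- True if the message addresses this router
    i_am_df        -- True if this router is the DF for RP(G) on the
                      interface the message was received on
    """
    # A mismatching RP means the entry is silently dropped.
    if msg_rp != rp_of_group:
        return False
    # Only the DF may act on Joins that are targeted to it.
    if kind == "join" and targeted_to_me and not i_am_df:
        return False
    return True
```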
The per-interface state-machine for receiving (*,G) Join/Prune Messages
is given below. There are three states:
NoInfo (NI)
The interface has no (*,G) Join state and no timers running.
Join (J)
The interface has (*,G) Join state which will cause us to
forward packets destined for G from this interface.
PrunePending (PP)
The router has received a Prune(*,G) on this interface from a
downstream neighbor and is waiting to see whether the prune
will be overridden by another downstream router. For
forwarding purposes, the PrunePending state functions exactly
like the Join state.
In addition the state-machine uses two timers:
ExpiryTimer (ET)
This timer is restarted when a valid Join(*,G) is received.
Expiry of the ExpiryTimer causes the interface state to revert
to NoInfo for this group.
PrunePendingTimer (PPT)
This timer is set when a valid Prune(*,G) is received. Expiry
of the PrunePendingTimer causes the interface state to revert
to NoInfo for this group.
+-----------------------------------+
| Figures omitted from text version |
+-----------------------------------+
Figure 1: Downstream group per-interface state-machine
In tabular form, the group per-interface state-machine is:
+----------+------------------------------------------------------------+
|          |                           Event                            |
|          +----------+------------+-----------+------------+-----------+
|Prev State|Receive   |Receive     |Prune      |Expiry      |Stop Being |
|          |Join(*,G) |Prune(*,G)  |Pending    |Timer       |DF on I    |
|          |          |            |Timer      |Expires     |           |
|          |          |            |Expires    |            |           |
+----------+----------+------------+-----------+------------+-----------+
|          |-> J state|-> NI state |-          |-           |-          |
|NoInfo    |start     |            |           |            |           |
|(NI)      |Expiry    |            |           |            |           |
|          |Timer     |            |           |            |           |
+----------+----------+------------+-----------+------------+-----------+
|          |-> J state|-> PP state |-          |-> NI state |-> NI state|
|Join (J)  |restart   |start Prune |           |            |           |
|          |Expiry    |Pending     |           |            |           |
|          |Timer     |Timer       |           |            |           |
+----------+----------+------------+-----------+------------+-----------+
|          |-> J state|-> PP state |-> NI state|-> NI state |-> NI state|
|          |restart   |            |Send Prune-|            |           |
|Prune     |Expiry    |            |Echo(*,G)  |            |           |
|Pending   |Timer;    |            |           |            |           |
|(PP)      |stop Prune|            |           |            |           |
|          |Pending   |            |           |            |           |
|          |Timer     |            |           |            |           |
+----------+----------+------------+-----------+------------+-----------+
The transition events "Receive Join(*,G)" and "Receive Prune(*,G)" imply
receiving a Join or Prune targeted to this router's address on the
received interface. If the destination address is not correct, these
state transitions in this state machine must not occur, although seeing
such a packet may cause state transitions in other state machines.
On unnumbered interfaces on point-to-point links, the router's address
should be the same as the source address it chose for the hello packet
it sent over that interface. However, on point-to-point links we also
recommend that PIM messages with a 0.0.0.0 destination address be
accepted.
The transition event "Stop being DF" implies a DF re-election taking
place on this router interface and the router changing status from being
the active DF to being a non-DF router (the value of the I_am_DF macro
changing to FALSE).
When ExpiryTimer is started or restarted, it is set to the HoldTime from
the triggering Join/Prune message.
When PrunePendingTimer is started, it is set to the
J/P_Override_Interval if the router has more than one neighbor on that
interface; otherwise it is set to zero causing it to expire immediately.
The action "Send PruneEcho(*,G)" is triggered when the router stops
forwarding on an interface as a result of a prune. A PruneEcho(*,G) is
simply a Prune(*,G) message sent by the upstream router to itself on a
LAN. Its purpose is to add additional reliability so that if a Prune
that should have been overridden by another router is lost locally on
the LAN, then the PruneEcho may be received and cause the override to
happen. A PruneEcho(*,G) need not be sent on a point-to-point
interface.
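The downstream per-interface state machine described in this section could be modeled roughly as follows. This is a sketch under simplifying assumptions: the class and method names are invented, timers are stored as plain numbers rather than being driven by a clock, and a PruneEcho is reported even on point-to-point interfaces, where the text says it need not be sent.

```python
NI, J, PP = "NoInfo", "Join", "PrunePending"


class DownstreamState:
    """(*,G) downstream state for one interface (illustrative only)."""

    def __init__(self):
        self.state = NI
        self.expiry_timer = None          # ET; None means not running
        self.prune_pending_timer = None   # PPT; None means not running

    def _to_noinfo(self):
        self.state = NI
        self.expiry_timer = None
        self.prune_pending_timer = None

    def receive_join(self, holdtime):
        # From any state: go to Join, (re)start ET, stop PPT.
        self.state = J
        self.expiry_timer = holdtime
        self.prune_pending_timer = None

    def receive_prune(self, neighbor_count, override_interval):
        # Only the Join state reacts; NI stays NI and PP stays PP.
        if self.state == J:
            self.state = PP
            # With a single neighbor nobody can override, so expire at once.
            self.prune_pending_timer = (
                override_interval if neighbor_count > 1 else 0
            )

    def prune_pending_timer_expires(self):
        if self.state != PP:
            return []
        self._to_noinfo()
        return ["Send PruneEcho(*,G)"]

    def expiry_timer_expires(self):
        if self.state in (J, PP):
            self._to_noinfo()

    def stop_being_df(self):
        if self.state in (J, PP):
            self._to_noinfo()
```

Each method corresponds to one event column of the table, and the returned list names any action the router would take in addition to the state change.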
3.4.2. Sending Join/Prune Messages
The downstream per-interface state-machines described above hold join
state from downstream PIM routers. This state then determines whether a
router needs to propagate a Join(*,G) upstream towards the RP. Such
Join(*,G) messages are sent on the RPF_interface towards the RP and are
targeted at the DF on that interface.
If a router wishes to propagate a Join(*,G) upstream, it must also watch
for messages on its upstream interface from other routers on that
subnet, and these may modify its behavior. If it sees a Join(*,G) to
the correct upstream neighbor, it should suppress its own Join(*,G). If
it sees a Prune(*,G) to the correct upstream neighbor, it should be
prepared to override that prune by sending a Join(*,G) almost
immediately. Finally, if it sees the Generation ID (see PIM-SM
specification [9]) of the correct upstream neighbor change, it knows
that the upstream neighbor has lost state, and it should be prepared to
refresh the state by sending a Join(*,G) almost immediately.
In addition changes in the next hop towards the RP trigger a prune off
from the old next hop, and join towards the new next hop. Such a change
can be caused by either of the following two reasons:
o The MRIB indicates that the RPF_interface towards the RP has
changed.
o There is a DF re-election on the RPF_interface and a new router
emerges as the DF.
The upstream (*,G) state-machine only contains two states:
Not Joined
The downstream state-machines indicate that the router does not
need to join the RP tree for this group.
Joined
The downstream state-machines indicate that the router would like
to join the RP tree for this group.
In addition, one timer JT(G) is kept which is used to trigger the
sending of a Join(*,G) to the upstream next hop towards the RP (the DF
on the RPF_interface for RP(G)).
+-----------------------------------+
| Figures omitted from text version |
+-----------------------------------+
Figure 2: Upstream group state-machine
In tabular form, the state machine is:
+----------------------+------------------------------------------------+
| | Event |
| Prev State +------------------------+-----------------------+
| | JoinDesired(G) | JoinDesired(G) |
| | ->True | ->False |
+----------------------+------------------------+-----------------------+
| | -> J state | - |
| NotJoined (NJ) | Send Join(*,G); | |
| | Set Timer to | |
| | t_periodic | |
+----------------------+------------------------+-----------------------+
| Joined (J) | - | -> NJ state |
| | | Send Prune(*,G) |
+----------------------+------------------------+-----------------------+
In addition, we have the following transitions which occur within the
Joined state:
+-----------------------------------------------------------------------+
| In Joined (J) State |
+-----------------+-----------------+-----------------+-----------------+
|Timer Expires | See Join(*,G) | See Prune(*,G) | RPF'(*,G) |
| | to RPF'(*,G) | to RPF'(*,G) | changes |
+-----------------+-----------------+-----------------+-----------------+
|Send | Increase Timer | Decrease Timer | Decrease Timer |
|Join(*,G); Set | to | to t_override | to t_override |
|Timer to | t_suppressed | | |
|t_periodic | | | |
+-----------------+-----------------+-----------------+-----------------+
+-----------------------------------------------------------------------+
| In Joined (J) State |
+-------------------------------------+---------------------------------+
| topology change wrt | RPF'(RP(G)) GenID |
| RPF'(RP(G)) | changes |
+-------------------------------------+---------------------------------+
| Send Join(*,G) to new | Decrease Timer to |
| next hop; Send | t_override |
| Prune(*,G) to old next | |
| hop; set Timer to | |
| t_periodic | |
+-------------------------------------+---------------------------------+
This state machine uses the following macro:
bool JoinDesired(G) {
    if ((olist(G) (-) RPF_interface(RP(G))) != NULL)
        return TRUE
    else
        return FALSE
}
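As a rough illustration, the JoinDesired macro and the two-state upstream machine could be written in Python as below. The names join_desired and UpstreamState are invented for this sketch, the timer is kept as a plain number, and cancelling the timer on leaving the Joined state is an assumption that the tables above do not state explicitly.

```python
def join_desired(olist, rpf_interface):
    """True if olist(G) contains any interface other than the RPF interface."""
    return len(set(olist) - {rpf_interface}) > 0


class UpstreamState:
    """Upstream (*,G) state: NotJoined or Joined (illustrative only)."""

    def __init__(self, t_periodic):
        self.joined = False
        self.timer = None            # JT(G); None means not running
        self.t_periodic = t_periodic

    def evaluate(self, olist, rpf_interface):
        """Re-evaluate JoinDesired(G) and return the messages to send."""
        desired = join_desired(olist, rpf_interface)
        if desired and not self.joined:
            self.joined = True
            self.timer = self.t_periodic
            return ["Join(*,G)"]
        if not desired and self.joined:
            self.joined = False
            self.timer = None        # assumption: timer is cancelled
            return ["Prune(*,G)"]
        return []

    def timer_expires(self):
        """Periodic refresh; only meaningful in the Joined state."""
        if not self.joined:
            return []
        self.timer = self.t_periodic
        return ["Join(*,G)"]
```

Note that a source-only branch, whose olist holds nothing but the RPF interface, yields join_desired() == False and therefore never triggers a Join.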
3.5. Designated Forwarder (DF) Election
This section presents a fail-safe mechanism for electing a per-RP This section presents a fail-safe mechanism for electing a per-RP
designated router on each link in a PIM domain. We call this router designated router on each link in a BIDIR-PIM domain. We call this
the Designated Forwarder (DF). router the Designated Forwarder (DF).
4.1 DF Requirements 3.5.1. DF Requirements
The DF election chooses the best router on a link to assume the The DF election chooses the best router on a link to assume the
responsibility of forwarding traffic between the RP and the link for responsibility of forwarding traffic between the RP and the link for the
the range of multicast groups served by the RP. Different multicast range of multicast groups served by the RP. Different multicast groups
groups that share a common RP must use the same bi-directional tree that share a common RP must use the same bi-directional tree for data
for data forwarding. Hence, the election of an upstream forwarder on forwarding. Hence, the election of an upstream forwarder on each link
each link does not have to be a group specific decision but instead does not have to be a group specific decision but instead can be RP-
can be RP-specific. As the number of RPs is typically small, the specific. As the number of RPs is typically small, the number of
number of elections that have to be performed is significantly elections that have to be performed is significantly reduced by this
reduced by this observation. observation.
To optimise tree creation, it is desirable that the winner of the To optimise tree creation, it is desirable that the winner of the
election process should be the router on the link with the "best" election process should be the router on the link with the "best"
unicast routing metric to the RP. When comparing metrics from dif- unicast routing metric to the RP (as reported by the MRIB). When
ferent unicast routing protocols, we use the same comparison rules comparing metrics from different unicast routing protocols, we use the
used in the PIM assert process [1]. same comparison rules used by the PIM-SM assert process [9].
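A possible way to compare two election metrics is sketched below. It assumes the ordering commonly described for PIM-SM asserts: lower metric preference wins, then lower metric, then the higher IP address as the final tie-break. That ordering, the tuple layout, and the function names are assumptions of this example and should be checked against the PIM-SM specification.

```python
import ipaddress


def offer_key(preference, metric, address):
    """Sort key for an offer; a smaller key is a better offer."""
    # The address is negated so that the numerically higher address
    # sorts first when preference and metric are equal.
    return (preference, metric, -int(ipaddress.ip_address(address)))


def is_better(a, b):
    """True if offer a beats offer b; each is (preference, metric, address)."""
    return offer_key(*a) < offer_key(*b)
```

An infinite metric, as used by a router whose path to the RP lies through the link itself, can be modeled with float("inf") in the metric position.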
The election process needs to take place when information on a new RP The election process needs to take place when information on a new RP
initially becomes available, and can be re-used as new bidir groups initially becomes available, and can be re-used as new bidir groups for
for the same RP are encountered. There are however some conditions the same RP are encountered. There are however some conditions where an
where an update to the election is required: update to the election is required:
o There is a change in unicast metric to reach the RP for any of the o There is a change in unicast metric to reach the RP for any of
routers on the link. the routers on the link.
o The interface on which the RP is reachable changes to an interface o The interface on which the RP is reachable changes to an
for which the router was previously the DF. interface for which the router was previously the DF.
o A new PIM neighbor starts up on a link. o A new PIM neighbor starts up on a link.
o The elected DF dies. o The elected DF dies.
The election process has to be robust enough to ensure with very high The election process has to be robust enough to ensure with very high
probability that all routers on the link have a consistent view of probability that all routers on the link have a consistent view of the
the DF. This is because with the forwarding rules described in sec- DF. This is because with the forwarding rules described in section 3.3
tion 3.2, if multiple routers end-up thinking that they should be if multiple routers end-up thinking that they should be responsible for
responsible for forwarding, loops may result. To reduce the possibil- forwarding, loops may result. To reduce the possibility of this
ity of this occurrence to a minimum, the election algorithm has been occurrence to a minimum, the election algorithm has been biased towards
biased towards discarding DF information and suspending forwarding discarding DF information and suspending forwarding during periods of
during periods of ambiguity. ambiguity.
4.2 DF Election Description 3.5.2. DF Election description
To perform the election of the DF for a particular RP, routers on a This section does not provide the definitive specification for the DF
link need to exchange their unicast routing metric information for election process. If any discrepancy exists between section 3.5.3 and
reaching the RP. this section, the specification in section 3.5.3 is to be assumed
correct.
In the election protocol described below, many message exchanges are To perform the election of the DF for a particular RP, routers on a link
repeated 3 times for reliability. In all those cases the message need to exchange their unicast routing metric information (as reported
retransmissions are spaced in time by a small random interval. by the MRIB) for reaching the RP.
For the purposes of the election, interface specific counters and In the election protocol described below, many message exchanges are
timers need to be maintained for each RP. When (*,G) entries are repeated Election_Robustness times for reliability. In all those cases
created, they inherit information on the elected DF from the the message retransmissions are spaced in time by a small random
corresponding RP database entry. Subsequent changes in the winner of interval.
the DF election for a RP are propagated to all dependent (*,G)
entries.
4.2.1 Bootstrap Election 3.5.2.1. Bootstrap Election
Initially when no DF has been elected, routers finding out about a Initially when no DF has been elected, routers finding out about a new
new RP start participating in the election by sending Offer messages. RP start participating in the election by sending Offer messages. Offer
Offer messages include the router's metric to reach the RP. Offers messages include the router's metric to reach the RP. Offers are
are periodically retransmitted with a period randomly chosen in the periodically retransmitted with a period of Offer_Interval.
interval [0.5 * Offer-Interval, Offer-Interval].
If a router hears a better offer from a neighbor, it stops If a router hears a better offer from a neighbor, it stops participating
participating in the election for a period of [3 * Offer-Interval]. in the election for a period of Election_Robustness * Offer_Interval. If
If during this period no winner is elected, then it restarts the during this period no winner is elected, then the router restarts the
election from the beginning. If a router receives an offer with election from the beginning. If a router receives an offer with worse
worse metrics, then it restarts the election from the beginning. metrics, then it restarts the election from the beginning.
The result should be that all routers except the best candidate stop The result should be that all routers except the best candidate stop
advertising. advertising their offers.
A router assumes the role of the DF after having advertised its A router assumes the role of the DF after having advertised its metrics
metrics 3 times without receiving any offer from any other neighbor. Election_Robustness times without receiving any offer from any other
At that point it transmits a Winner message which declares to every neighbor. At that point it transmits a Winner message which declares to
other router on the link the identity of the winner and the metrics
it is using.
Routers hearing a winner message stop participating in the election every other router on the link the identity of the winner and the
and record the identity and metrics of the winner. If the local metrics it is using.
metrics are better than those of the winner then the router records
the identity of the winner but reinitiates the election.
4.2.2 Loser Metric Changes Routers hearing a winner message stop participating in the election and
record the identity and metrics of the winner. If the local metrics are
better than those of the winner then the router records the identity of
the winner but reinitiates the election.
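The bootstrap behaviour of a single candidate can be sketched as follows. The constants are hypothetical values chosen for the example (the -00 text uses 3 retransmissions), metrics are modeled as single numbers where lower is better, and the class and method names are not part of the specification.

```python
ELECTION_ROBUSTNESS = 3    # assumed value for illustration
OFFER_INTERVAL_MS = 100    # hypothetical interval in milliseconds


class BootstrapCandidate:
    """One router taking part in the initial DF election (sketch)."""

    def __init__(self, metric):
        self.metric = metric       # lower is better in this sketch
        self.offers_sent = 0
        self.is_df = False
        self.silent_for_ms = 0     # how long to refrain from offering

    def offer_timer_fires(self):
        """Return the message to transmit when the offer timer expires."""
        if self.offers_sent >= ELECTION_ROBUSTNESS:
            # Enough unanswered offers: assume the DF role and announce it.
            self.is_df = True
            return ("Winner", self.metric)
        self.offers_sent += 1
        return ("Offer", self.metric)

    def offer_received(self, metric):
        """React to an Offer heard from a neighbor."""
        self.offers_sent = 0       # the election restarts in both cases
        if metric < self.metric:
            # Better offer heard: stay quiet for a while.
            self.silent_for_ms = ELECTION_ROBUSTNESS * OFFER_INTERVAL_MS
        else:
            # Worse offer heard: resume offering straight away.
            self.silent_for_ms = 0
```

With these rules, only the best candidate keeps offering long enough to reach the Winner announcement, since every other router is silenced each time it hears a better offer.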
3.5.2.2. Loser Metric Changes
Whenever the unicast metric to a RP changes for a non-DF router to a Whenever the unicast metric to a RP changes for a non-DF router to a
value that is better than that previously advertised by the DF, the value that is better than that previously advertised by the acting DF,
router with the new metric should take action to eventually assume the router with the new metric should take action to eventually assume
forwarding responsibility. After the metric change is detected, the forwarding responsibility. After the metric change is detected, the new
new candidate restarts participating in the election. If no response candidate restarts participating in the election. If no response is
is received after 3 retransmissions, the router assumes the role of received after Election_Robustness retransmissions, the router assumes
the DF following the usual Winner announcement procedure. the role of the DF following the usual Winner announcement procedure.
Upon receipt of an offer that is worse than its current metric, the Upon receipt of an offer that is worse than its current metric, the DF
DF will respond with a Winner message declaring its status and will respond with a Winner message declaring its status and advertising
advertising its metric. Upon receiving this message, the originator its metric. Upon receiving this message, the originator of the Offer
of the Offer records the identity of the DF and aborts the election. records the identity of the DF and aborts the election.
Upon receipt of an offer that is better than its current metric, the Upon receipt of an offer that is better than its current metric, the DF
DF records the identity and metrics of the offering router and records the identity and metrics of the offering router and responds
responds with a Backoff message. This instructs the offering router with a Backoff message. This instructs the offering router to hold off
to hold off for a short period of time while the unicast routing sta- for a short period of time while the unicast routing stabilises. The
bilises. The Backoff message includes the offering router's new Backoff message includes the offering router's new metric and address.
metric and address. All routers on the link who have pending offers All routers on the link who have pending offers with metrics worse than
with metrics worse than those in the backoff message (including the those in the backoff message (including the original offering router)
original offering router) will hold further offers for a period of will hold further offers for a period of time defined in the Backoff
time defined in the Backoff message. message.
If during the backoff period, a third router sends a new better If during the Backoff_Period, a third router sends a new better offer,
offer, the Backoff message is repeated for the new offer and the the Backoff message is repeated for the new offer and the Backoff_Period
backoff period restarted. restarted.
Before the backoff period expires, the acting DF nominates the router Before the Backoff_Period expires, the acting DF nominates the router
having made the best offer as the new DF using a Pass message. This having made the best offer as the new DF using a Pass message. This
message includes the IDs and metrics of both the old and new DFs. message includes the IDs and metrics of both the old and new DFs. The
The old DF stops performing its tasks as soon as the transmission is old DF stops performing its tasks as soon as the transmission is made.
made. The new DF assumes the role of the DF as soon as it receives The new DF assumes the role of the DF as soon as it receives the Pass
the Pass message. All other routers on the link take note of the new message. All other routers on the link take note of the new DF and its
DF and its metric. metric.
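The acting DF's handling of Offers and the eventual handoff might look like the sketch below. Metrics are again single numbers where lower is better, messages are plain tuples without the BackoffInterval field, and repeating the Backoff for the existing best offer when a less attractive offer arrives is an assumption of this example.

```python
class ActingDF:
    """Acting DF reacting to Offers on a link (illustrative only)."""

    def __init__(self, my_id, metric):
        self.my_id = my_id
        self.metric = metric       # lower is better in this sketch
        self.best_offer = None     # (router_id, metric) of best offer so far
        self.is_df = True

    def handle_offer(self, sender, metric):
        """Return the message sent in response to a received Offer."""
        if metric >= self.metric:
            # Worse (or equal) offer: re-assert DF status.
            return ("Winner", self.my_id, self.metric)
        # Better offer: remember it if it beats any earlier one.
        if self.best_offer is None or metric < self.best_offer[1]:
            self.best_offer = (sender, metric)
        offer_id, offer_metric = self.best_offer
        return ("Backoff", self.my_id, self.metric, offer_id, offer_metric)

    def backoff_period_ending(self):
        """Hand over to the best offering router, if there is one."""
        if self.best_offer is None:
            return None
        new_id, new_metric = self.best_offer
        self.is_df = False         # old DF stops as soon as Pass is sent
        return ("Pass", self.my_id, self.metric, new_id, new_metric)
```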
4.2.3 Winner Metric Changes 3.5.2.3. Winner Metric Changes
If the DF's routing metric to reach the RP changes to a worse value, If the DF's routing metric to reach the RP changes to a worse value, it
it sends a set of 3 randomly spaced Offer messages on the link, sends a set of Election_Robustness randomly spaced Offer messages on the
advertising the new metric. Routers who receive this announcement but link, advertising the new metric. Routers who receive this announcement
have a better metric may respond with an Offer message which results but have a better metric may respond with an Offer message which results
in the same handoff procedure described above. All routers assume in the same handoff procedure described above. All routers assume the
the DF has not changed until they see a Pass or Winner message indi- DF has not changed until they see a Pass or Winner message indicating
cating the change. the change.
There is no pressure to make this handoff quickly if the acting DF There is no pressure to make this handoff quickly if the acting DF still
still has a path to the RP. The old path may now be suboptimal but it has a path to the RP. The old path may now be suboptimal but it will
will still work while the re-election is in progress. still work while the re-election is in progress.
If the routing metric at the DF changes to a better value, a single If the routing metric at the DF changes to a better value, a single
Winner message is sent advertising the new metric. Winner message is sent advertising the new metric.
4.2.4 Winner Loses Path 3.5.2.4. Winner Loses Path
If a router's path to the RP switches to be through a link for which If a router's RPF_interface to the RP switches to be on a link for which
it is acting as the DF, then it can no longer provide forwarding ser- it is acting as the DF, then it can no longer provide forwarding
vices for that link. It therefore immediately stops being the DF and services for that link. It therefore immediately stops being the DF and
restarts the election. As its path to the RP is through the link, an restarts the election. As its path to the RP is through the link, an
infinite metric is used in the Offer message it sends. infinite metric is used in the Offer message it sends.
Note: At this stage the old DF will have a new RPF neighbor on the Note: At this stage the old DF will have a new RPF neighbor on the link
link (indicated by unicast routing) which it could use in a Pass mes- (indicated by unicast routing) which it could use in a Pass message but
sage but this adds unnecessary complication to the election process. this adds unnecessary complication to the election process.
4.2.5 Late Router Starting Up 3.5.2.5. Late Router Starting Up
A late router starting up will have no knowledge of a previous elec- A late router starting up will have no knowledge of a previous election
tion outcome. As a result it will start advertising its metric in outcome. As a result it will start advertising its metric in Offer
Offer messages. As soon as this happens, the Winner will respond messages. As soon as this happens, the Winner will respond either with a
either with a Winner or with a Backoff message. Winner or with a Backoff message.
4.2.6 Winner Dies 3.5.2.6. Winner Dies
Whenever the DF dies, a new DF has to be elected. The speed at which Whenever the DF dies, a new DF has to be elected. The speed at which
this can be achieved depends on whether there are any downstream this can be achieved depends on whether there are any downstream routers
routers on the link. on the link.
If there are downstream routers, typically their RPF neighbor If there are downstream routers, typically their RPF_neighbor as
reported by unicast routing will be the DF. They will therefore reported by the MRIB before the DF dies will be the DF itself. They will
notice a change in RPF neighbor away from the DF and will restart the therefore notice a change in RPF_neighbor away from the DF and will
election by transmitting Offer messages. If the RP is now reachable
through the link via another upstream router, an infinite metric will restart the election by transmitting Offer messages. If according to
be used in the Offer. the MRIB the RP is now reachable through the same link via another
upstream router, an infinite metric will be used in the Offer.
If no downstream routers are present, the only way for other upstream If no downstream routers are present, the only way for other upstream
routers to detect a DF failure is by the timeout of the PIM neighbor routers to detect a DF failure is by the timeout of the PIM neighbor
information, which will take significantly longer. information, which will take significantly longer.
4.3 Election Protocol Specification 3.5.3. Election Protocol Specification
4.3.1 Protocol State This section provides the definitive specification for the DF election
process. If any discrepancy exists between section 3.5.2 and this
section, the specification in this section is to be assumed correct.
3.5.3.1. Election State
The operation of the election protocol makes use of the variables and The operation of the election protocol makes use of the variables and
timers described below. These are maintained per RP for each multi- timers described below. These are maintained per RP for each multicast
cast enabled interface on the router. enabled interface on the router as introduced in section 3.1:
Offer-Count (O-count) Acting DF information
Used to maintain the number of times an Offer or Winner mes- Used to store the election winner who is the currently acting
sage has been transmitted. DF.
Election-Timer (DFT)
Used to schedule transmission of Offer, Winner and Pass
messages.
Offer-Count (OC)
Used to maintain the number of times an Offer or Winner
message has been transmitted.
Best-Offer Best-Offer
Used by the DF to record who has made the last offer for Used by the DF to record who has made the last offer for
sending the Pass message. sending the Pass message.
Offer-Timer (O-timer) 3.5.3.2. Election Messages
Used to schedule transmission of Offer and Winner messages.
Pass-Timer (P-timer)
Used on the DF to schedule transmission of a Pass message.
4.3.2 Message Summary
The election uses the following control messages: The election process uses the following PIM control messages the packet
format of which is described in section 3.7:
Offer (OfferingID, Metric) Offer (OfferingID, Metric)
Sent by routers that believe they have a better metric to Sent by routers that believe they have a better metric to the
the RP than the metric that has been on offer so far. RP than the metric that has been on offer so far.
Winner (DF-ID, DF-Metric) Winner (DF-ID, DF-Metric)
Sent by a router when assuming the role of the DF or when Sent by a router when assuming the role of the DF or when re-
re-asserting in response to worse offers. asserting in response to worse offers.
Backoff (DF-ID, DF-Metric, OfferingID, OfferMetric, Backoff (DF-ID, DF-Metric, OfferingID, OfferMetric,
BackoffInterval) BackoffInterval)
Used by the DF to acknowledge better offers. It instructs Used by the DF to acknowledge better offers. It instructs
other routers with equal or worse offers to wait till the DF other routers with equal or worse offers to wait till the DF
passes responsibility to the sender of the offer. passes responsibility to the sender of the offer.
Pass (Old-DF-ID, Old-DF-Metric, New-DF-ID, New-DF-Metric) Pass (Old-DF-ID, Old-DF-Metric, New-DF-ID, New-DF-Metric)
Used by the old DF to pass forwarding responsibility to a Used by the old DF to pass forwarding responsibility to a
router that has previously made an offer. The Old-DF-Metric router that has previously made an offer. The Old-DF-Metric
is the current metric of the DF at the time the pass is is the current metric of the DF at the time the pass is sent.
sent.
4.3.3 Protocol Events 3.5.3.3. Election Events
During protocol operation, in addition to the expiration of the two During protocol operation, in addition to the expiration of the
timers and reception of the four messages, the following events can Election-Timer and the reception of the four control messages, the
take place: following events can take place:
o Discovery of new RP o Discovery of new RP
o Metric change o Metric reported by the MRIB to reach the RP changes
o DF loses path o DF loses path to RP
o Detection of DF failure (unicast routing changed for downstream or o Detection of DF failure
Hello expired)
4.3.4 Protocol Operation 3.5.3.4. Election State Transitions
In the two tables below the following rules and notation apply: In the state machine presented below a router is considered to be an
acting DF if it is in the Win or Backoff states.
o Whenever the notation "?=" is used to assign a value to a timer, When an action of "set DF to Sender or Dest" is encountered during
the value is assigned only if the timer is not running or the time receipt of a Winner, Pass or Backoff message it means the following:
left running is longer than the new value.
o When a new DF is discovered through the receipt of a Winner or Pass o On receipt of a Winner message set the DF to be the originator of
message, if it is not already a PIM neighbor, a neighbor entry is the message and record its metrics.
created with the default expiration interval.
o Whenever the DF is set, the associated metrics are also recorded. o On receipt of a Pass message set the DF to be the target of the
message and record its metrics.
o Timers in square brackets are randomly chosen between 0.5 and 1 o On receipt of a Backoff message set the DF to be the originator
times the supplied value. of the message and record its metrics.
o When a router has a path to the RP through the link on which the +-----------------------------------+
election is taking place, an infinite metric is used in Offer mes- | Figures omitted from text version |
sages. +-----------------------------------+
Event Condition Non-DF action DF action Figure 3: Designated Forwarder election state-machine
=======================================================================
Offer |Local metric |O-count = 0 |Send Backoff
rcvd |worse |O-timer = [Offer-Int] |Best-Offer = sender
| | + 3 * Offer-Int |P-timer = Backoff-Int
| | |Stop O-timer
|--------------------------------------------------------------
|Local metric |O-count = 0 |Send Winner
|better |O-timer ?= [Offer-Int] |Stop P-timer
-----------------------------------------------------------------------
Winner |Local metric |O-count = 0
rcvd |worse |Stop O-timer
| |DF = sender
| |Stop P-timer
|--------------------------------------------------------------
|Local metric |O-count = 0
|better |O-timer ?= [Offer-Int]
| |DF = sender
| |Stop P-timer
-----------------------------------------------------------------------
Backoff |Local metric |O-count = 0
rcvd |worse or to us |O-timer = Backoff-Int + [Offer-Int]
| |DF = sender
| |Stop P-timer
|--------------------------------------------------------------
|Local metric |O-count = 0
|better |O-timer ?= [Offer-Int]
| |DF = sender
| |Stop P-timer
-----------------------------------------------------------------------
Pass |Local metric |O-count = 0
rcvd |worse or to us |Stop O-timer
| |DF = destination
| |Stop P-timer
|--------------------------------------------------------------
|Local metric |O-count = 0
|better |O-timer ?= [Offer-Int]
| |DF = destination
| |Stop P-timer
-----------------------------------------------------------------------
Event Condition Non-DF action DF action
=======================================================================
New RP | |O-count = 0 |N/A
| |O-timer ?= [Offer-Int] |
-----------------------------------------------------------------------
Metric |DF metric |nop |O-count = 0
change |better (*) | |O-timer ?= [Offer-Int]
| | |Stop P-timer
|--------------------------------------------------------------
|DF metric |O-count = 0 |Send Winner
|worse (*) |O-timer ?= [Offer-Int] |Stop P-timer
-----------------------------------------------------------------------
No path | |nop |Send Offer (**)
to RP | | |O-count = 1
| | |O-timer ?= [Offer-Int]
| | |DF = unknown
| | |Stop P-timer
-----------------------------------------------------------------------
DF | |O-count = 0 |N/A
failure | |O-timer ?= [Offer-Int] |
-----------------------------------------------------------------------
O-timer |O-count <= 3 |Send Offer
expires | |O-count++
| |O-timer ?= [Offer-Int]
|---------------|----------------------------------------------
|else |O-count = 0
| |Send Winner (***)
| |DF = us
-----------------------------------------------------------------------
P-timer | |DF = Best-Offer
expires | |Send Pass
-----------------------------------------------------------------------
(*) These comparisons are made against the previously stored DF In tabular form, the state machine is:
metrics. In the case of the DF, the old local metrics are used to
compare against. So "DF metric better" means that the metric has
actually become worse.
(**) As the path to the RP is now through the link an infinite metric +-------------++--------------------------------------------------------+
is used in the offer. | || Event |
| Prev State ++------------------+------------------+------------------+
| || Recv better | Recv better | Recv better |
| || Pass / Win | Backoff | Offer |
+-------------++------------------+------------------+------------------+
| || -> Lose | - | - |
| || DF = Sender | Set Timer to | Set Timer to |
| Offer || | Robustness * | Robustness * |
| || | offer_int; Set | offer_int; Set |
| || | Count to 0 | Count to 0 |
+-------------++------------------+------------------+------------------+
| || - | - | -> Offer |
| || DF = Sender or | DF = Sender or | Set Timer to |
| Lose || Dest | Dest | Robustness * |
| || | | offer_int; Set |
| || | | Count to 0 |
+-------------++------------------+------------------+------------------+
| || -> Lose | -> Lose | -> Backoff |
| || DF = Sender or | DF = Sender or | Set Best to |
| Win || Dest; Stop | Dest; Stop | Sender; Send |
| || Timer | Timer | Backoff; Set |
| || | | Timer to |
| || | | backoff_int |
+-------------++------------------+------------------+------------------+
| || -> Lose | -> Lose | - |
| || DF = Sender or | DF = Sender or | Set Best to |
| Backoff || Dest | Dest | Sender; Send |
| || | | Backoff; Set |
| || | | Timer to |
| || | | backoff_int |
+-------------++------------------+------------------+------------------+
(***) We only become the DF and send a Winner message if we have a +-----------++----------------------------------------------------------+
path to the RP (which is not through the link with the ongoing elec- | || Event |
tion). | ++-------------+--------------+--------------+--------------+
|Prev State ||Recv Backoff | Recv Pass | Recv Worse | Recv worse |
| ||for us | for us | Pass / Win / | Offer |
| || | | Backoff | |
+-----------++-------------+--------------+--------------+--------------+
| ||- | -> Win | - | - |
| ||Set Timer to | Stop Timer | Set DF to | Set/Lower |
| ||Hi; Set | | Sender or | Timer to |
|Offer ||Count to 0 | | Dest; | Low; Set |
| || | | Set/Lower | Count to 0 |
| || | | Timer to | |
| || | | Low; Set | |
| || | | Count to 0 | |
+-----------++-------------+--------------+--------------+--------------+
| ||-> Offer | -> Offer | -> Offer | -> Offer |
| ||DF = Sender | DF = Sender | DF = Sender | Set Timer to |
| ||or Dest; Set | or Dest; Set | or Dest; Set | offer_int; |
|Lose ||Timer to | Timer to | Timer to | Set Count to |
| ||offer_int; | offer_int; | offer_int; | 0 |
| ||Set Count to | Set Count to | Set Count to | |
| ||0 | 0 | 0 | |
+-----------++-------------+--------------+--------------+--------------+
| ||-> Offer | -> Offer | -> Offer | - |
| ||DF = Sender | DF = Sender | DF = Sender | Send Winner |
| ||or Dest; Set | or Dest; Set | or Dest; Set | |
|Win ||Timer to | Timer to | Timer to | |
| ||offer_int; | offer_int; | offer_int; | |
| ||Set Count to | Set Count to | Set Count to | |
| ||0 | 0 | 0 | |
+-----------++-------------+--------------+--------------+--------------+
| ||-> Offer | -> Offer | -> Offer | -> Win |
| ||DF = Sender | DF = Sender | DF = Sender | Send Winner; |
| ||or Dest; Set | or Dest; Set | or Dest; Set | Stop Timer |
|Backoff ||Timer to | Timer to | Timer to | |
| ||offer_int; | offer_int; | offer_int; | |
| ||Set Count to | Set Count to | Set Count to | |
| ||0 | 0 | 0 | |
+-----------++-------------+--------------+--------------+--------------+
4.4 Election Message Formats +-----------------------------------------------------------------------+
| In Offer State |
+-----------------------+------------------------+----------------------+
| Timer Expires and | Timer Expires and | Timer Expires and |
| Count is less than | Count is equal to | Count is equal to |
| Robustness | Robustness and we | Robustness and |
| | have path to RP | there is no path |
| | | to RP |
+-----------------------+------------------------+----------------------+
| - | -> Win | -> Lose |
| Send Offer; Set | Send Winner | Set DF to None |
| Timer to | | |
| offer_int; | | |
| Increase Count by | | |
| 1 | | |
+-----------------------+------------------------+----------------------+
All election messages are sent with a TTL of 1 and are multicast to +-----------------------------------------------------------------------+
the ALL-PIM-ROUTERS group. The structure of Encoded-Unicast addresses | In Lose State |
is described in [1]. +-----------------------------------+-----------------------------------+
| Detect DF Failure | Metric changes and now |
| | is better than DF |
+-----------------------------------+-----------------------------------+
| -> Offer | -> Offer |
| DF = None; Set timer to | Set timer to offer_int; |
| offer_int; Set Count to | Set Count to 0 |
| 0 | |
+-----------------------------------+-----------------------------------+
4.4.1 Common Header +-----------------------------------------------------------------------+
| In Win State |
+------------------------+------------------------+---------------------+
| Metric changes and | Timer Expires and | No path to RP |
| is now worse | Count is less than | |
| | Robustness | |
+------------------------+------------------------+---------------------+
| - | - | -> Offer |
| Set Timer to | Send Winner; Set | Set DF to None; |
| offer_int; Set | Timer to | Set Timer to |
| Count to 0 | offer_int; | offer_int; Set |
| | Increment Count by | Count to 0 |
| | 1 | |
+------------------------+------------------------+---------------------+
The header below is common to all election messages. +-----------------------------------------------------------------------+
| In Backoff State |
+-----------------------------------+-----------------------------------+
| Metric changes and is | Timer Expires |
| now better than Best | |
+-----------------------------------+-----------------------------------+
| -> Win | -> Lose |
| Stop Timer | Send Pass; Set DF to |
| | stored Best |
+-----------------------------------+-----------------------------------+
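
The message-driven part of the tabular state machine can be sketched as a lookup table. This is a minimal illustration and not a normative implementation: the names State, Event, TRANSITIONS, step and is_acting_df are hypothetical, only the three "Recv better" columns of the first table are encoded, and actions are kept as descriptive strings rather than executed.

```python
from enum import Enum


class State(Enum):
    OFFER = "Offer"
    LOSE = "Lose"
    WIN = "Win"
    BACKOFF = "Backoff"


class Event(Enum):
    BETTER_PASS_OR_WIN = "Recv better Pass / Win"
    BETTER_BACKOFF = "Recv better Backoff"
    BETTER_OFFER = "Recv better Offer"


# Shared action lists, copied from the table cells.
WAIT_HIGH = ["Set Timer to Robustness * offer_int", "Set Count to 0"]
SEND_BACKOFF = ["Set Best to Sender", "Send Backoff", "Set Timer to backoff_int"]

# (previous state, event) -> (next state, actions). A "-" in the table
# means the state is unchanged, so the same state is repeated here.
TRANSITIONS = {
    (State.OFFER, Event.BETTER_PASS_OR_WIN): (State.LOSE, ["DF = Sender"]),
    (State.OFFER, Event.BETTER_BACKOFF): (State.OFFER, WAIT_HIGH),
    (State.OFFER, Event.BETTER_OFFER): (State.OFFER, WAIT_HIGH),
    (State.LOSE, Event.BETTER_PASS_OR_WIN): (State.LOSE, ["DF = Sender or Dest"]),
    (State.LOSE, Event.BETTER_BACKOFF): (State.LOSE, ["DF = Sender or Dest"]),
    (State.LOSE, Event.BETTER_OFFER): (State.OFFER, WAIT_HIGH),
    (State.WIN, Event.BETTER_PASS_OR_WIN): (
        State.LOSE, ["DF = Sender or Dest", "Stop Timer"]),
    (State.WIN, Event.BETTER_BACKOFF): (
        State.LOSE, ["DF = Sender or Dest", "Stop Timer"]),
    (State.WIN, Event.BETTER_OFFER): (State.BACKOFF, SEND_BACKOFF),
    (State.BACKOFF, Event.BETTER_PASS_OR_WIN): (State.LOSE, ["DF = Sender or Dest"]),
    (State.BACKOFF, Event.BETTER_BACKOFF): (State.LOSE, ["DF = Sender or Dest"]),
    (State.BACKOFF, Event.BETTER_OFFER): (State.BACKOFF, SEND_BACKOFF),
}


def step(state, event):
    """Return (next_state, actions) for one received-message event."""
    return TRANSITIONS[(state, event)]


def is_acting_df(state):
    """A router is an acting DF while in the Win or Backoff states."""
    return state in (State.WIN, State.BACKOFF)
```

For example, step(State.WIN, Event.BETTER_OFFER) yields the Backoff state, in which the router is still an acting DF until its timer expires and it sends the Pass.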
3.6. Timers and Constants
BIDIR-PIM maintains the following timers, as discussed in section 3.1.
All timers are countdown timers - they are set to a value and count down
to zero, at which point they typically trigger an action. Of course
they can just as easily be implemented as count-up timers, where the
absolute expiry time is stored and compared against a real-time clock,
but the language in this specification assumes that they count downwards
to zero.
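
As a rough sketch of the count-up alternative mentioned above, a timer can store its absolute expiry time and still expose countdown semantics. The class name CountdownTimer and the injectable clock argument are assumptions made for illustration only.

```python
import time


class CountdownTimer:
    """Countdown semantics implemented by storing an absolute expiry time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # callable returning the current time in seconds
        self._expiry = None      # None means the timer is not running

    def set(self, seconds):
        """Start or restart the timer so that it expires `seconds` from now."""
        self._expiry = self._clock() + seconds

    def stop(self):
        """Cancel the timer without triggering its action."""
        self._expiry = None

    def remaining(self):
        """Time left before expiry; 0.0 if stopped or already expired."""
        if self._expiry is None:
            return 0.0
        return max(0.0, self._expiry - self._clock())

    def expired(self):
        """True once a running timer has reached zero."""
        return self._expiry is not None and self._clock() >= self._expiry
```

Passing a fake clock makes the behaviour easy to test without sleeping.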
Per Rendezvous-Point (RP):
Per interface (I):
DF Election Timer: DFT(RP,I)
Per Group (G):
Upstream Join Timer: JT(G)
Per interface (I):
Join Expiry Timer: ET(G,I)
PrunePending Timer: PPT(G,I)
When timers are started or restarted, they are set to default values.
This section summarizes those default values.
Timer Name: DF Election Timer (DFT)
+--------------------+-------------------------+------------------------+
| Value Name | Value | Explanation |
+--------------------+-------------------------+------------------------+
| Offer_Period | 100 ms | Interval to wait |
| | | between repeated |
| | | Offer and Winner |
| | | messages. |
+--------------------+-------------------------+------------------------+
| Backoff_Period | 1 sec | Period that acting |
| | | DF waits between |
| | | receiving a better |
| | | Offer and sending |
| | | the Pass message |
| | | to transfer DF |
| | | responsibility. |
+--------------------+-------------------------+------------------------+
| OPLow | rand(0.5, 1) * | Range of actual |
| | Offer_Period | randomised value |
| | | used between |
| | | repeated messages. |
+--------------------+-------------------------+------------------------+
| OPHigh | Election_Robustness | Interval to wait |
| | * Offer_Period | in order to give a |
| | | chance to a router |
| | | with a better |
| | | Offer to become |
| | | the DF. |
+--------------------+-------------------------+------------------------+
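
The derived DF Election Timer values can be computed as follows. This is a sketch using the defaults listed in this section; the constant and function names are illustrative, and times are expressed in seconds.

```python
import random

OFFER_PERIOD = 0.100        # 100 ms between repeated Offer and Winner messages
BACKOFF_PERIOD = 1.0        # 1 sec between a better Offer and the Pass
ELECTION_ROBUSTNESS = 3     # default from the DF Election Robustness table


def op_low(rng=random):
    """OPLow: randomised interval, rand(0.5, 1) * Offer_Period."""
    return rng.uniform(0.5, 1.0) * OFFER_PERIOD


def op_high():
    """OPHigh: Election_Robustness * Offer_Period."""
    return ELECTION_ROBUSTNESS * OFFER_PERIOD
```

With the defaults, OPLow falls between 50 ms and 100 ms and OPHigh is 300 ms.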
Timer Names: Join Expiry Timer (ET(G,I))
+----------------+----------------+-------------------------------------+
| Value Name | Value | Explanation |
+----------------+----------------+-------------------------------------+
| J/P HoldTime | from message | Hold Time from Join/Prune Message |
+----------------+----------------+-------------------------------------+
Timer Names: Prune Pending Timer (PPT(G,I))
+--------------------------+--------------------+-----------------------+
| Value Name | Value | Explanation |
+--------------------------+--------------------+-----------------------+
| J/P Override Interval | Default: 3 secs | Short period after |
| | | a join or prune to |
| | | allow other |
| | | routers on the LAN |
| | | to override the |
| | | join or prune |
+--------------------------+--------------------+-----------------------+
Timer Names: Upstream Join Timer (JT(G))
+-------------+--------------------+------------------------------------+
|Value Name |Value |Explanation |
+-------------+--------------------+------------------------------------+
|t_periodic |Default: 60 secs |Period between Join/Prune Messages |
+-------------+--------------------+------------------------------------+
|t_suppressed |rand(1.1 * |Suppression period when someone |
| |t_periodic, 1.4 * |else sends a J/P message so we |
| |t_periodic) |don't need to do so. |
+-------------+--------------------+------------------------------------+
|t_override |rand(0, 0.9 * J/P |Randomized delay to prevent |
| |Override Interval) |response implosion when sending a |
| | |join message to override someone |
| | |else's prune message. |
+-------------+--------------------+------------------------------------+
For more information about these values refer to the PIM-SM [9]
documentation.
Constant Name: DF Election Robustness
+--------------------------+-------------------+------------------------+
| Constant Name | Value | Explanation |
+--------------------------+-------------------+------------------------+
| Election_Robustness | Default: 3 | Minimum number of |
| | | election messages |
| | | that must be lost |
| | | in order for |
| | | election to fail. |
+--------------------------+-------------------+------------------------+
3.7. PIM DF Election Packet Formats
This section describes the details of the packet formats for BIDIR-PIM
control messages. BIDIR-PIM shares a number of control messages in
common with PIM-SM [9], as well as the format for the Encoded-Unicast
address. For details on the format of these packets please refer to the
PIM-SM documentation. Here we will only define the additional packets
that are introduced by BIDIR-PIM. These are the packets used in the DF
election process.
All PIM control messages have IP protocol number 103.
BIDIR-PIM messages are multicast with TTL 1 to the `ALL-PIM-ROUTERS'
group `224.0.0.13'.
All DF election BIDIR-PIM control messages share the common header
below:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type |Subtype| Rsvd | Checksum | |PIM Ver| Type |Subtype| Rsvd | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-RP-Address | | Encoded-Unicast-RP-Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender Metric Preference | | Sender Metric Preference |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sender Metric | | Sender Metric |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type PIM Ver
TBD PIM Version number is 2.
Type All DF-Election PIM control messages share the PIM message Type of
X.
Subtype Subtype
Used to distinguish between different election messages and Subtypes for DF election messages are:
is set according to the table below:
Message Subtype 1 = Offer
----------------------- 2 = Winner
Offer 1 3 = Backoff
Winner 2 4 = Pass
Backoff 3
Pass 4
Rsvd Rsvd Set to zero on transmission. Ignored upon receipt.
Set to zero by senders and ignored by receivers.
Checksum Checksum
Calculated as specified in [1]. The checksum is the standard IP checksum, i.e. the 16-bit one's
complement of the one's complement sum of the entire PIM message.
For computing the checksum, the checksum field is zeroed.
RP-Address RP-Address
The address of the bidir RP for which the election is taking The address of the bidir RP for which the election is taking place
place. (note that the length of this field is more than 32 bits).
Sender Metric Preference Sender Metric Preference
Preference value assigned to the unicast routing protocol Preference value assigned to the unicast routing protocol that the
that the message sender used to obtain the route to the RP- message sender used to obtain the route to the RP-address.
address.
Sender Metric Sender Metric
The unicast routing table metric used by the message sender The unicast routing table metric used by the message sender to
to reach the RP. The metric is in units applicable to the reach the RP. The metric is in units applicable to the unicast
unicast routing protocol used. routing protocol used.
The Backoff and Pass messages have the additional fields described In addition to the fields defined above the Backoff and Pass messages
below. have the extra fields described below.
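
The common header layout and checksum rule can be illustrated with a short sketch. Several points here are assumptions and not taken from this document: the function names are hypothetical, the message Type is left as a caller-supplied parameter because the draft has not assigned it, and the Encoded-Unicast address is built for IPv4 only, using the family/encoding-type/address layout of PIM-SM with address family 1 and encoding type 0.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pack_election_message(msg_type, subtype, rp_addr, metric_pref, metric,
                          extra=b""):
    """Build a DF election message; `extra` carries Backoff/Pass fields."""
    first = (2 << 4) | (msg_type & 0xF)      # PIM Ver = 2, then Type
    second = (subtype & 0xF) << 4            # Subtype, then Rsvd = 0
    # Assumed IPv4 Encoded-Unicast form: family 1, encoding type 0, address.
    enc_rp = struct.pack("!BB", 1, 0) + socket.inet_aton(rp_addr)
    body = enc_rp + struct.pack("!II", metric_pref, metric) + extra
    # The checksum field is zeroed while the checksum is computed.
    zeroed = struct.pack("!BBH", first, second, 0) + body
    csum = internet_checksum(zeroed)
    return struct.pack("!BBH", first, second, csum) + body
```

A receiver can verify a message by running internet_checksum over the whole message as received; a correct checksum yields zero.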
4.4.2 Backoff Message 3.7.1. Backoff Message
The Backoff message uses the following fields in addition to the com- The Backoff message uses the following fields in addition to the common
mon ones described above. election message format described above.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-Offering-Address | | Encoded-Unicast-Offering-Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Offering Metric Preference | | Offering Metric Preference |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Offering Metric | | Offering Metric |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Interval | | Interval |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Offering Address Offering Address
The address of the router that made the last (best) Offer. The address of the router that made the last (best) Offer (note
that the length of this field is more than 32 bits).
Offering Metric Preference Offering Metric Preference
Preference value assigned to the unicast routing protocol Preference value assigned to the unicast routing protocol that the
that the offering router used to obtain the route to RP- offering router used to obtain the route to RP-address.
address.
Offering Metric Offering Metric
The unicast routing table metric used by the offering router The unicast routing table metric used by the offering router to
to reach the RP. The metric is in units applicable to the reach the RP. The metric is in units applicable to the unicast
unicast routing protocol used. routing protocol used.
Interval Interval
The backoff interval in milliseconds to be used by routers The backoff interval in milliseconds to be used by routers with
with worse metrics than the offering router. worse metrics than the offering router.
4.4.3 Pass Message 3.7.2. Pass Message
The Pass message uses the following fields in addition to the common The Pass message uses the following fields in addition to the common
ones described above. election fields described above.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-New-Winner-Address | | Encoded-Unicast-New-Winner-Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| New Winner Metric Preference | | New Winner Metric Preference |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| New Winner Metric | | New Winner Metric |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
New Winner Address New Winner Address
The address of the router that made the last (best) Offer. The address of the router that made the last (best) Offer (note
that the length of this field is more than 32 bits).
New Winner Metric Preference New Winner Metric Preference
Preference value assigned to the unicast routing protocol Preference value assigned to the unicast routing protocol that the
that the offering router used to obtain the route to RP- offering router used to obtain the route to RP-address.
address.
New Winner Metric New Winner Metric
The unicast routing table metric used by the offering router The unicast routing table metric used by the offering router to
to reach the RP. The metric is in units applicable to the reach the RP. The metric is in units applicable to the unicast
unicast routing protocol used. routing protocol used.
4.5 Timer Values
The Offer-Interval is 100 ms. 4. RP Discovery
The default Backoff-Interval used in Backoff messages is 1 sec. Routers discover that a range of multicast group addresses operates in
bi-directional mode, and learn the address of the Rendezvous-Point serving the group
range, through the PIM Bootstrap mechanism. For a description of the PIM
BSR RP-distribution protocol refer to PIM-SM [9].
5 Advertising Bi-directional Groups By default the BSR protocol advertises RPs that operate the PIM-SM
protocol. In order to identify an RP as operating in BIDIR mode, the
Routers discover that a group operates in bi-directional mode from Encoded-Group Address field in Bootstrap and Candidate-RP Advertisement
the Encoded-Group Address fields in PIM Bootstrap and Candidate-RP messages has been extended by adding the BIDIR bit (B-bit) as specified
Advertisement messages. The Encoded-Group Address field is modified below:
to include the Bidir-bit (B bit) as specified below:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Addr Family | Encoding Type |B| Reserved | Mask Len | | Addr Family | Encoding Type |B| Reserved | Mask Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Group Multicast Address | | Group Multicast Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
When the Bidir-bit is set, all upgraded bi-directional PIM routers B-bit
will follow the forwarding rules described in this specification. When the Bidir-bit is set, all BIDIR capable PIM routers will
operate the protocol described in this document for the specified
6 Security Considerations group range.
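
Reading the B-bit from an Encoded-Group Address might look like the sketch below. It assumes an IPv4 group address (4 octets following the first word) and uses a hypothetical function name; the bit position follows the diagram above, where B is the most significant bit of the third octet.

```python
import socket
import struct

BIDIR_BIT = 0x80  # B is the top bit of the octet that also holds Reserved


def parse_encoded_group_ipv4(buf: bytes) -> dict:
    """Decode an IPv4 Encoded-Group Address and report the Bidir-bit."""
    family, enc_type, flags, mask_len = struct.unpack("!BBBB", buf[:4])
    return {
        "family": family,
        "encoding_type": enc_type,
        "bidir": bool(flags & BIDIR_BIT),
        "mask_len": mask_len,
        "group": socket.inet_ntoa(buf[4:8]),
    }
```

A router would apply the BIDIR rules to the group range only when the returned "bidir" value is true.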
All PIM control messages MAY use IPsec to address security concerns.
7 References
[1] Estrin, et al., "Protocol Independent Multicast-Sparse Mode (PIM-
SM): Protocol Specification", RFC 2362, June 1998.
[2] D. Estrin, D. Farinacci, "Bi-directional Shared Trees in PIM-SM",
Work In Progress, <draft-farinacci-bidir-pim-01.txt>, May 1999.
[3] Wei, L., Farinacci, D., "PIM Version 2 DR Election Priority Option",
INTERNET-DRAFT, March 1998.
8 Acknowledgments
The bidir proposal in this draft is heavily based on the ideas and
text presented by Estrin and Farinacci in [2]. The main difference
between the two proposals is in the method chosen for upstream for-
warding.
We would also like to thank Deborah Estrin at ISI/USC as well as
Nidhi Bhaskar, Yiqun Cai, Tony Speakman and Rajitha Sumanasakera at
cisco for their contributions and comments to this draft.
9 Author Information
Mark Handley
mjh@aciri.org
AT&T Center for Internet Research at ICSI
Isidor Kouvelas 5. Security Considerations
kouvelas@cisco.com
cisco Systems
Lorenzo Vicisano All PIM control messages MAY use IPsec [6] to address security concerns.
lorenzo@cisco.com The authentication methods are addressed in a companion document [7].
cisco Systems Keys may be distributed as described in [8].
Appendix A: Election Reliability Enhancements 5.1. Appendix A: Election Reliability Enhancements
For the correct operation of bi-directional PIM it is very important For the correct operation of bi-directional PIM it is very important to
to avoid situations where two routers consider themselves to be avoid situations where two routers consider themselves to be Designated
Designated Forwarders for the same link. The two precautions below Forwarders for the same link. The two precautions below are not required
are not required for correct operation but can help diagnose for correct operation but can help diagnose anomalies and correct them.
anomalies and correct them.
A.1 Missing Pass 5.1.1. A.1 Missing Pass
After a DF has been elected, a router whose metrics change to become After a DF has been elected, a router whose metrics change to become
better than the DF will attempt to take over. If during the re- better than the DF will attempt to take over. If during the re-election
election the acting DF has a condition that causes it to lose all of the acting DF has a condition that causes it to lose all of the election
the election messages (like a CPU overload), the new candidate will messages (like a CPU overload), the new candidate will transmit three
transmit three offers and assume the role of the forwarder resulting offers and assume the role of the forwarder resulting in two DFs on the
in two DFs on the link. This situation is pathological and should be link. This situation is pathological and should be corrected by fixing
corrected by fixing the overloaded router. It is desirable that such the overloaded router. It is desirable that such an event can be
an event can be detected by a network administrator. detected by a network administrator.
When a router becomes the DF for a link without receiving a Pass mes- When a router becomes the DF for a link without receiving a Pass message
sage from the known old DF, the PIM neighbor information for the old from the known old DF, the PIM neighbor information for the old DF can
DF can be marked to this effect. Upon receiving the next PIM Hello be marked to this effect. Upon receiving the next PIM Hello message from
message from the old DF, the router can retransmit Winner messages the old DF, the router can retransmit Winner messages for all the RPs
for all the RPs for which it is acting as the DF. The anomaly may also for which it is acting as the DF. The anomaly may also be logged by the
be logged by the router to alert the operator. router to alert the operator.
A.2 Periodic Winner Announcement 5.1.2. A.2 Periodic Winner Announcement
An additional degree of safety can be achieved by having the DF for An additional degree of safety can be achieved by having the DF for each
each RP periodically announce its status in a Winner message. RP periodically announce its status in a Winner message. Transmission
Transmission of the periodic Winner message can be restricted to of the periodic Winner message can be restricted to occur only for RPs
occur only for RPs which have active groups, thus avoiding the which have active groups, thus avoiding the periodic control traffic in
periodic control traffic in areas of the network without senders or areas of the network without senders or receivers for a particular RP.
receivers for a particular RP.
Appendix B: Interoperability with legacy code 5.2. Appendix B: Interoperability with legacy code
The rules provided in [2] for interoperating between legacy PIM-SM The rules provided in [10] for interoperating between legacy PIM-SM
routers and new bi-directional capable routers change only slightly routers and new bi-directional capable routers change only slightly to
to support this new proposal. The only difference is in the defini- support this new proposal. The only difference is in the definition of a
tion of a boundary between a bi-directional capable area and a legacy boundary between a bi-directional capable area and a legacy area of the
area of the network. In [2], a bidir capable router forwarding network. In [10], a bidir capable router forwarding upstream, register
upstream, register encapsulates the data packet to the RP if its RPF encapsulates the data packet to the RP if its RPF neighbor is not bidir
neighbor is not bidir capable. capable.
In our proposal, since all the routers on a link need to co-operate In our proposal, since all the routers on a link need to co-operate to
to elect the Designated Forwarder, if even one of the routers on the elect the Designated Forwarder, if even one of the routers on the link
link is a legacy router, the election cannot take place. As a result is a legacy router, the election cannot take place. As a result register
register encapsulation is necessary if one or more routers on the RPF encapsulation is necessary if one or more routers on the RPF interface
interface are not bi-directional capable. are not bi-directional capable.
As in [2], a Hello option must be used to differentiate between bi- As in [10], a Hello option must be used to differentiate between bi-
directional capable and legacy routers, and (S,G) state must be directional capable and legacy routers, and (S,G) state must be created
created on the router doing the register encapsulation to prevent on the router doing the register encapsulation to prevent loops.
loops.
Appendix C: Comparison with PIM-SM 5.3. Appendix C: Comparison with PIM-SM
This section describes the main differences between Bidir PIM and This section describes the main differences between Bidir PIM and
sparse-mode PIM: sparse-mode PIM:
o Bidir PIM uses a single shared tree for distributing the data for o Bidir PIM uses a single shared tree for distributing the data for
all the sources of a multicast group. The use of a signle tree sig- all the sources of a multicast group. The use of a single tree
nificantly reduces state requirements on a router. The drawback is significantly reduces state requirements on a router. The
that it may produce suboptimal paths from sources to receivers pos- drawback is that it may produce suboptimal paths from sources to
sibly resulting in higher network latency and less efficient receivers possibly resulting in higher network latency and less
bandwidth utilisation. efficient bandwidth utilisation.
o In Bidir PIM, packets traveling from a source to the RP, are o In Bidir PIM, packets traveling from a source to the RP, are
natively forwarded on the shared tree. In contrast sparse-mode PIM natively forwarded on the shared tree. In contrast sparse-mode
uses unicast encapsulation or source-specific state. PIM uses unicast encapsulation or source-specific state.
o In Bidir PIM, sender-only branches do not need to keep group state. o In Bidir PIM, sender-only branches do not need to keep group
Data from the source can be natively forwarded towards the RP using state. Data from the source can be natively forwarded towards the
RP-specific forwarding state. RP using RP-specific forwarding state.
o The Bidir Designated Forwarder (DF) assumes all the responsibili- o The Bidir Designated Forwarder (DF) assumes all the
ties of the sparse-mode DR. In a multi-access link, the DF responds responsibilities of the sparse-mode DR. In a multi-access link,
to IGMP notifications. Downstream routers on the link use the DF as the DF responds to IGMP notifications. Downstream routers on the
their upstream neighbor and direct all Join/Prune messages towards link use the DF as their upstream neighbor and direct all
it. Join/Prune messages towards it.
o To enforce a single forwarder on multi-access links, sparse-mode
  PIM uses the Assert mechanism, which requires data packets to
  trigger protocol events. In Bidir PIM, data-driven events are
  completely eliminated, as a correct route is always available at
  packet forwarding time.

  The DF election problem is easier than the assert problem because
  there is a small number of RPs and the per-RP DF election can be
  done in advance. With the assert mechanism, in addition to each RP,
  a forwarder has to be elected for each possible source to a group.
  This cannot be done before data is available.
o With sparse-mode PIM, when forwarding packets using shared-tree
  (*,G) state, a directly-connected-source check has to be made on
  every packet. This is done to determine if the packet was
  originated by a source which is directly connected to the router.
  For a connected source, source-specific state has to be created
  to register packets to the RP and prune the source off the shared
  tree.

  With Bidir PIM, directly connected sources do not need any special
  handling. The DF for the RP of the group the source is sending to
  seamlessly picks up and forwards upstream traveling packets.
draft-ietf-pim-bidir-00.txt

Appendix D: Comparison with UMP based bidirectional PIM

Using an UMP option for upstream forwarding has the following
disadvantages:

o Using the DF election, only routers willing to be forwarders can be
  elected. In contrast in [2], the downstream router designates the
  upstream neighbor responsible for forwarding (using Joins and UMP
  packets).

o Using the UMP option, regular data packets are overloaded with
  control information for the routing protocol.

o Inserting the extra option in multicast packets transmitted from a
  source may result in a packet size exceeding the MTU which will
  result in fragmentation.

o The use of an option complicates the router forwarding mechanism.
  Additional code to process the new special packet type needs to be
  written.

o The contents of the UMP option have to be rewritten and the packet
  checksum adjusted on each hop towards the RP at data forwarding
  time. This introduces additional per packet processing overhead.

In bidir PIM [2], if the router elected as the DR is different from
that chosen by downstream neighbors for joining the tree, loops can
occur. The main shortcoming of the DR is that its election does not
take into consideration the location of the RP. To resolve this
problem the DR priority draft [3] provides a method for manually
configuring the DR election winner. Although this provides a solution
it has two drawbacks:

o It requires a case by case manual configuration.

o It cannot solve the problem if there are different RPs in a domain
  serving separate multicast group ranges. In this scenario the
  requirements of each RP for the DR positioning on a particular link
  may differ.

draft-ietf-pim-bidir-01.txt

6. Authors' Addresses

Mark Handley
ACIRI/ICSI
1947 Center St, Suite 600
Berkeley, CA 94708
mjh@aciri.org

Isidor Kouvelas
Cisco Systems
kouvelas@cisco.com

Lorenzo Vicisano
Cisco Systems
lorenzo@cisco.com

7. Acknowledgments

The bidir proposal in this draft is heavily based on the ideas and text
presented by Estrin and Farinacci in [10]. The main difference between
the two proposals is in the method chosen for upstream forwarding.

We would also like to thank Deborah Estrin at ISI/USC as well as Nidhi
Bhaskar, Yiqun Cai, Tony Speakman and Rajitha Sumanasakera at cisco for
their contributions and comments to this draft.

8. References

[1] T. Bates, R. Chandra, D. Katz, Y. Rekhter, "Multiprotocol
Extensions for BGP-4", RFC 2283.
[2] S.E. Deering, "Host extensions for IP multicasting", RFC 1112, Aug
1989.
[3] W. Fenner, "Internet Group Management Protocol, Version 2", RFC
2236.
[4] IANA, "Address Family Numbers", linked from
http://www.iana.org/numbers.html
[5] T. Narten, H. Alvestrand, "Guidelines for Writing an IANA
Considerations Section in RFCs", RFC 2434.
[6] S. Kent, R. Atkinson, "Security Architecture for the Internet
Protocol", RFC 2401.
[7] L. Wei, "Authenticating PIM version 2 messages",
draft-ietf-pim-v2-auth-01.txt, work in progress.
[8] T. Hardjono, B. Cain, "Simple Key Management Protocol for PIM",
draft-ietf-pim-simplekmp-01.txt, work in progress.
[9] B. Fenner, M. Handley, H. Holbrook, I. Kouvelas, "Protocol
Independent Multicast - Sparse Mode (PIM-SM): Protocol
Specification (Revised)", Work In Progress,
<draft-ietf-pim-sm-v2-new-01.txt>, 2000.
[10] D. Estrin, D. Farinacci, "Bi-directional Shared Trees in PIM-SM",
Work In Progress, <draft-farinacci-bidir-pim-01.txt>, May 1999.
[11] D. Estrin et al, "Protocol Independent Multicast-Sparse Mode
(PIM-SM): Protocol Specification", RFC 2362, Nov 1999.
9. Index

DownstreamJPState(G,I)
ET(G,I)
ET(RP,I)
I_am_DF(RP,I)
J/P_HoldTime
J/P_Override_Interval
JoinDesired(G)
joins(G)
JT(*,G)
JT(G)
local_receiver_include(G,I)
NLT(N,I)
Offer_Period
olist(G)
OT
pim_include(G)
PPT(G,I)
RPF_interface(RP)
t_override
t_periodic
t_suppressed

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/