```
policy_parent_selection_optimized_spf_run(vertex root,
PolicyFunction PF,
Tree PreviousTree)
/* representation of the
* same tree as computed
* in the previous iteration,
* as a reference for the
* policy tie-breaker
* function below.
*/
{
/* L, LL are lists of tuples of the following form */
List<(vertex, parent, link(parent, vertex),
link-cost(parent, vertex))> L, LL;
/* Heap is ordered in ascending order of link-costs */
HEAP tmpHEAP;
/*
* map_vertex2parents_heap maps a vertex to a HEAP of tuples, each
* identifying a parent relationship. The heap is ordered in
* ascending order of link-cost. It also serves as a representation
* of the set TENT.
*/
Mapping map_vertex2parents_heap{vertex -> HEAP of tuples of type
(vertex, parent,
link(parent,vertex)
link-cost(parent, vertex)) };
initialize_to_empty(L);
initialize_to_empty(LL);
initialize_to_empty(map_vertex2parents_heap);
set distance_from_root(root) = 0;
root.in_PATH = TRUE;
for each vertex in graph {
if vertex is not root then {
set distance_from_root(vertex) = infinity;
vertex.in_PATH = FALSE;
}
}
for each link (root,w) that is attached to root {
if (distance_from_root(w) > link-cost(root, w)) {
map_vertex2parents_heap{w}.add_to_heap(
(w, root, link(root,w), link-cost(root,w)));
}
}
/*
* Iterate while there are vertices (tuples) in TENT - TENT is
* realized via map_vertex2parents_heap which is a mapping table
* with each entry of the table representing a (per-vertex) HEAP
* of tuples. Each tuple represents a link i.e. a parent-child
* relationship along with the identity of the link and its
* associated cost.
*/
while (not_empty(map_vertex2parents_heap)) {
/* There may be more than one vertex in TENT at
* minimal cost, and this modified SPF algorithm considers all
* of them simultaneously, processing them as a list, in terms
* of their addition to PATH, and in terms of equally
* considering them parents of as-yet undiscovered vertices
* that may be reachable at the same total cost from members
* of the list.
* priority_heap_minimum_multiple_dequeue_as_list takes the
* heap and returns a list of all the vertices that are at
* minimum cost from v. The list is returned as a list of
* tuples, where each tuple is of the form:
*
* (vertex, parent, link(parent, vertex),
* link-cost(vertex, parent))
*/
initialize_to_empty(tmpHEAP);
/*
* Iterate over the vertices comprising the key-space of
* map_vertex2parents_heap. The minimum from TENT is
* derived in two steps - find the minimum cost for each vertex
* from its potential parents (find the tuples), and then find
* the minimum of the above minimums using a temporary HEAP.
*/
foreach (vertex v in valid-keys(map_vertex2parents_heap)) {
/* tuples of the type
* (vertex, parent,link(vertex, parent),
* link-cost(vertex, parent)) at a minimum cost per hash
* table vertex entry are inserted to tmpHEAP. Each entry
* in tmpHEAP is a tuple.
*/
if (is-empty(map_vertex2parents_heap{v})
continue; /* Move on to the next vertex */
/*
* heap_dequeue_multiple_minimum_as_list() is assumed to
* dequeue multiple entries at the top of the heap, if
* they are all at the same minimum value of the heap
* metric (cumulative reachability cost for this
* application).
*/
tmpHEAP.add_to_heap(heap_dequeue_multiple_minimum_as_list(
map_vertex2parents_heap{v}));
}
L = heap_dequeue_multiple_minimum_as_list(tmpHEAP);
/*
* L is a list of tuples.
* L could potentially have only a single tuple. Iterate over
* tuples in L. Tuples in L are optimally reachable from nodes
* already in PATH.
*/
foreach tuple (y, y_parent, link(y_parent, y),
link-cost(y_parent, y)) in L {
/*
* Identify tuples relating to y's potential parents,
* all leading to optimal reachability for y, collect
* them in LL.
*/
LL =
heap_dequeue_multiple_minimum_as_list(
map_vertex2parents_heap{y});
/*
* Policy function below takes a vertex, and its list of
* eligible parents, and chooses one parent for the vertex.
* It can be simply thought of as a tie-breaker that
* chooses one parent out of the list of eligible parents
* of y, but based on the consideration specified by the
* policy, and being able to do so in a way that
* minimizes tree churn.
*
* ASSUMPTION: Policy Function needs to be able
* to deal with trivial case of the list of eligible
* parents having only one parent, it must select that one
* parent unconditionally in that case for the given child
* node.
*
* PolicyFunction is not explicitly defined here, it may
* take other parameters such as a representation of the
* previous computation of the tree to help it make a
* selection that helps preserve links in the tree as they
* existed prior to this computation.
*/
if (y_parent ==
PolicyFunction(y, LL, PreviousTree, TreeNumber)) {
/*
* if y is allowed by the policy function, it can get
* pulled into PATH here. However, if y is excluded
* at this stage, then it must have another potential
* instance in TENT (but see assumption above), which
* will pull it in thru a different parent -
* correctness of the SPF algorithm guarantees this,
* and this other instance with other parent must be
* in the list L.
*/
y.in_PATH = TRUE;
distance_from_root(y) = distance_from_root(y_parent)
+ link-cost(y_parent,y);
if (y_parent == root)
/* directly connected */
nexthop_link_at_root(y) = link (root ,y);
else
nexthop_link_at_root(y) =
nexthop_link_at_root(y_parent);
/* y is now in PATH.
* identify_vertex_as_parent() is defined further
* down below, it marks y as parent for its
* downstream children, by updating the
* map_vertex2parents_heap entry for nodes
* immediately downstream of y. Note
* that we do not need to iterate over all instances
* of y in LL - this can be done by simply examining
* all links on y and pulling in those not in PATH.
*/
identify_vertex_as_parent(y, map_vertex2parents_heap);
}
}
}
for each vertex w in graph {
if (w is not root) {
print (distance_from_root(w), nexthop_link_at_root(w));
}
}
}
identify_vertex_as_parent(vertex y, Mapping map_vertex2parents_heap)
{
for each link (y,x) that is attached to y {
if (x.in_PATH == FALSE) // Ignore nodes already in PATH.
{
/*
* Routine is only called on vertices y that are already
* in PATH, so child x will be reachable optimally from y.
*
* y is a potential parent of x, because x is optimally
* reachable from y. But we need to see if there are
* other potential parents of x at the same optimal cost,
* the policy function will help pick the right parent,
* during its invocation in the SPF run for x's selection
* into path.
*
*/
if (distance_from_root(x) > (distance_from_root(y)
+ cost(y,x)) {
map_vertex2parents_heap{x}.add_to_heap(x, y, (y,x),
distance_from_root(y)+cost(y,x));
}
}
}
}
``````
5.1 Choice of the policy function.
In order to solve the discussed problem, the policy function must
preserve parent-child links as they existed in previous stable
(non-transient) computations of the tree and it can do this by
consulting a a stable snapshot of the previous tree computation,
to identify and preserve parent-child relationships in the prior
SPF tree. Text further below discusses how a stable snap-shot of
the tree may be collected.
The policy is applied on a best-effort basis and is consulted
during the normal SPF tree construction mechanism if a given child
node can be pulled in from a choice of multiple parents, and if one
of those parents had pulled in that child node in a previous
stable snap-shot of the tree.
When used in this manner, if a parent node in the previous
tree computation is no longer alive/reachable, it should not be used
as a reference in preserving any parent-child relationships. A
secondary policy or fall-back tie-break may be needed to identify
a parent for a given child in this situation, and this is detailed
below. Also, as noted in the code section, in the case where a child
node has only one parent, the policy function must unconditionally
select that parent node to pull in the child to the SPF tree. The
policy function may also take the tree number as an input
parameter to determine a fallback tie-break.
When the policy function is set up in the manner described above,
it helps maintain parent affinity by explicitly breaking ties
preferentially so that prior parent relationships are maintained in a
new computation of the tree. However, the challenge of synchronizing
the application of the policy across rBridges in the network needs to
be addressed. In order to do this, the following rules may need to be
adhered to when using this approach:
a. The first tree computation is carried out using the Trill
specified tree construction mechanism.
b. So long as the root node for that tree number does not
change physically, the second and subsequent tree computations
for the same tree can use this algorithm and try to feed in a
representation of the previous steady-state tree computation as
a guide to the policy function, subject to the following caveats:
i. if a non-root node that was a parent in a previous computation
is no longer reachable/alive, then the computation should fall
back to the default Trill parent selection rule for the set of
remaining parent candidates and the associated children. Note
that if a sibling of the parent node goes away or changes its
link connectivity, it does not impact the concerned parent's
child relationships.
ii. If a link between a parent node and child node (in the previous
computation of that tree) goes down, then that child alone should
be pulled in to the SPF tree using vertices that are upstream of
it during tree computation, subject to the Trill specified
tie-break rule. Note that this may result in some children being
pulled in through the old parent and some pulled in through a
different node, subject to the Trill tie-break rule.
c. If the root node for that tree number changes to a physically
different node, then the first tree computation for that tree number
with that new tree root should be carried out using the default
Trill parent-selection rules. The second and subsequent computations
can leverage this approach, each feeding in a representation of the
previous stable snap-shot as a reference for the current computation.
d. There may be cases during tree construction with this approach
where more than one parent finds a match in the representation
of the previous tree - in this case the tie should be broken
according to the default Trill parent selection rule. This can
happen, when a node that was a parent in a previous computation
becomes a child node of its former child in the current tree,
due to a link going down, for example. In this case, the
directional nature of the (parent, child) link in the prior
snapshot may need to be ignored while trying to find a match in
the prior snapshot.
e. The policy function must always be given the representation of the
most recent stable snapshot of the prior tree computation for that
tree number. Ideally, a stable snap-shot should reflect a converged
tree state after all recent network churn events have been absorbed.
A timer may be started when any tree computation starts. The timer
would need to be long enough to capture a converged tree state.
Any interim network events received while the timer is running
extend the timer. When the timer expires the tree is deemed to have
converged, and a snap-shot of the tree may be collected.
f. Alternatively, snap-shot collection may be triggered when the tree
routes are being programmed in the RIB, since most implementations
would download routes to the RIB only when the tree calculation has
completed successfully. An additional stabilization timer may be
used, after the RIB trigger, to capture cases where multiple tree
computations run back to back. The snapshot must collect all links
in the tree, reflecting parent child relationships in the tree.
g. To handle situations where a link or node is flapping constantly,
nodes in the network should turn off snap-shot matching on the
child nodes connected to the affected link or node
(if the node happens to be a parent node, then snapshot matching
should be turned of for all the children of that node) and fallback
to default Trill parent selection rules for the affected children.
Nodes in the network may examine parent-child relationships in the
successive collected snapshots to see if a specific sub-tree is
toggling within a pre-configured interval of time.
6. Network wide selection of computation algorithm.
Alternative A does not need any operational change to the
Trill protocol, beyond the usage of the affinity sub-TLV (which is
already in the proposed standard) for the use case identified in
this draft.
For alternative B, this draft takes no position on how rBridges in
the network may negotiate and select the modified algorithm as the
preferred tree computation mechanism at this time.
7. Relationship to draft-ietf-trill-resilient-trees.
Given that both draft-ietf-trill-resilient-trees, and
draft-rp-trill-parent-selection-02 drafts use the affinity sub-TLV,
it is worthwhile to examine if there is any functional overlap
between the two drafts.
At a high level, draft-ietf-trill-resilient-trees relates to link
protection, and defines the notion of a primary distribution tree
and a backup distribution tree (DT), where these trees are
intentionally kept link disjoint to the extent possible, and the
backup tree is pre-programmed in the hardware, and activated either
upfront or upon failure of the primary distribution tree.
On the other hand, draft-rp-trill-parent-selection-02 protects
parent-child relationships of interest on the primary DT, and has
no direct notion of a backup DT.
draft-ietf-trill-resilient-trees considers the following algorithmic
approaches to the building the backup distribution tree (section
numbers listed are from that draft):
1. Pure operator configuration for links on the backup DT/manual
generation of affinity sub-TLV - this is very tedious and unlikely
to scale or be implemented in practice, and hence is disregarded
in the analysis here.
2. Section 3.2.1.1a: Use of MRT algorithms (which will produce conjugate
trees - link disjoint trees with roots for primary and backup trees
that are coincident on the same physical rBridge).
3. Section 3.2.1.1b: Once the primary DT is constructed, the links
used in the primary DT are additively cost re-weighted, and a
second SPF is run to derive the links comprising the backup DT.
Affinity sub-TLV is used to mark links on the back-up DT which are
not also on the primary DT. This approach can handle conjugate
trees as well as non-conjugate trees (link disjoint trees that are
rooted at different physical nodes).
4. Section 3.2.2: A variation on the section 3.2.1.1b approach, but
without affinity sub-TLV advertisement. Once the primary DT is
constructed, costs for links on the primary DT are multiplied by a
fixed multiplier to prevent them from being selected in a
subsequent SPF run, unless there is no other choice, and the
subsequent SPF yields links on the backup DT.
All of the approaches above yield maximally link disjoint trees,
when applied as prescribed.
Approach 4 above does not seem to use affinity sub-TLVs and instead
seems to depend upon a network wide agreement on the alternative
tree computation algorithm being used.
Approaches 2 and 3 use affinity sub-TLV on the backup DT, for links
that are not already on the primary DT. The primary DT does not
appear to use affinity sub-TLVs. Additionally, from an end-to-end
perspective the backup DT comes into picture when the primary DT
fails (this is effectively true even in the 1+1 protection mechanism
and in the local protection case), and then again, only until the
primary DT is recalculated. Once the primary DT is recalculated, the
backup DT is recalculated as well, and can change corresponding to
the new primary DT.
draft-ietf-trill-resilient-trees cannot directly prevent/mitigate a
parent node shift on the primary DT at a given parent node, and while
usage of the affinity sub-TLV on the backup DT might confer a parent
affinity on some nodes on the backup DT, these are not necessarily
the nodes on which the network operator may want/prefer an explicit
parent affinity. Further, the backup DT is only used on a transient
basis, from a forwarding perspective, until the primary DT is
recomputed.
In situations involving a node failure, there is no direct functional
overlap between draft-ietf-trill-resilient-trees, and the
draft-rp-trill-parent-selection-02 draft. The two drafts have different
goals and appear to solve unrelated problems, in this situation.
Sometimes, a parent shift can be triggered by link failure. In a
situation where both drafts are active in the implementation, failure
of a specific link may cause the backup DT to kick in, but when the
primary DT is re-calculated, draft-rp-trill-parent-selection-02 can be
used to preserve parent-child relationships on the primary DT, to the
extent possible, during the re-calculation. So, in this case also,
there does not appear to be a direct functional overlap in the
simultaneous usage of these drafts.
8. Security Considerations.
The proposal primarily influences tree construction and tries to
preserve parent-child relationships in the tree from prior computations
of the same tree, without changing any of operational aspects of the
protocol. Hence, no new security considerations for Trill are raised
by this proposal.
9. IANA Considerations.
No new registry entries are requested to be assigned by IANA. The
Affinity Sub-TLV has been defined in [RFC7176], and this proposal
does not change its semantics in any way.
10. Informative References.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
```.
[RFC6325] Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A.
Ghanwani, "Routing Bridges (RBridges): Base Protocol
Specification", RFC 6325, DOI 10.17487/RFC6325, July 2011,
.
[RFC7780] - Eastlake 3rd, D., Zhang, M., Perlman, R., Banerjee, A.,
Ghanwani, A., and S. Gupta, "Transparent Interconnection of
Lots of Links (TRILL): Clarifications, Corrections, and
Updates", RFC 7780, DOI 10.17487/RFC7780, February 2016,
.
[RFC7783] Senevirathne, T., Pathangi, J., Hudson, J., "Coordinated
Multicast Trees (CMT) for Transparent Interconnection of
Lots of Links (TRILL)", RFC 7783, February 2016,
[RFC4971] Vasseur, JP., Shen, N., Aggarwal, R., "Intermediate
System to Intermediate System (IS-IS) Extensions for
Advertising Router Information", RFC 4971, July 2007,
Author's Address:
R. Parameswaran,
Brocade Communications, Inc.
120 Holger Way,
San Jose, CA 95134.
Email: parameswaran.r7@gmail.com
Copyright and IPR Provisions
Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. The definitive version of
an IETF Document is that published by, or under the auspices of, the
IETF. Versions of IETF Documents that are published by third parties,
including those that are translated into other languages, should not
be considered to be definitive versions of IETF Documents. The
definitive version of these Legal Provisions is that published by, or
under the auspices of, the IETF. Versions of these Legal Provisions
that are published by third parties, including those that are
translated into other languages, should not be considered to be
definitive versions of these Legal Provisions. For the avoidance of
doubt, each Contributor to the IETF Standards Process licenses each
Contribution that he or she makes as part of the IETF Standards
Process to the IETF Trust pursuant to the provisions of RFC 5378. No
language to the contrary, or terms, conditions or rights that differ
from or are inconsistent with the rights and licenses granted under
RFC 5378, shall have any effect and shall be null and void, whether
published or posted by such Contributor, or included with or in such
Contribution.