Idr Status Pages

Inter-Domain Routing (Active WG)
Rtg Area: Alvaro Retana, Alia Atlas, Deborah Brungard | 1994-Aug-15 —  

IETF-99 idr minutes

Session 2017-07-20 0930-1200: Congress Hall III


minutes-99-idr-01 minutes

          IDR meeting at IETF 99 (version 06)
          9:30-12:00 am, 7/20/2017
          Congress Hall III
          Sue: The show^Wmeeting is about to start in a few
          minutes, we do not have that full an agenda this time.
          Sue: Mike procedures. I
          am Sue, the chair of IDR.
          JGS: Note Well, the new Note Well.
          Sue: Please
          take a look at the new Note Well. It covers more sessions. Anything you
          say here is under the NW.
          JGS: It references RFC8179 on IPR.
          JGS: Please
          state your name clearly at the mike.
          Chair's administrivia [9:30-9:40]
          0) Agenda bashing and Chair's slides (10)
          JGS: We have been a
          busy WG. No new RFCs issued, one is pretty close. 2 sent to the IESG,
          barring surprises those should move soon. Finished WGLC, waiting for
          shepherd. 2 WGLCs in progress, please provide feedback. WGLC for tunnel
          encapsulations, active discussion that called for a revision. This will
          be discussed during the meeting today, WGLC to follow soon. Plus one
          is better than silence; we encourage you to provide better feedback. Anything
          but silence. 5 new WG documents. Discussion on whether SR related ones
          need to be merged. There was no enthusiasm to merge them, they still
          remain as 3 separate ones. Update on the OPEN policy, waiting for an
          implementation. We suspect there are implementations but that is not
          reported. One thing that is not on the slides which will likely have
          to defer to a later discussion is the question of early allocations. I
          will discuss allocation stuff after the status. It is not allocation,
          it is implementation. IDR has a relatively unusual tradition of
          requiring protocol implementations before sending to the IESG. The question
          came up a few months ago why do we allow things to go to WGLC before
          we have evidence of implementation. That is a fair question. If WGLC
          concluded and you do not have implementation then you end up in this
          state, and if you sit there for long enough then we will ask the drafts
          to fix the found defects and will do another WGLC. Should we change our
          procedures in this group that we want two reported implementations before
          we proceed to WGLC? Pausing now and asking the room.
          Randy, identifying
          himself as John Scudder/Juniper Networks :-) :-): This will make a queue
          that has mixed things in it, now we have two separate queues.
          JGS: This
          is a reordering of the queue.
          Randy: There will be many things waiting
          for implementation.
          Job Snijders: We should reorder the procedure. We
          have fairly large drafts that have tons of codepoints assigned. WGLC time
          would be a much better time to say that we have an implementation that at
          least compiles. If we discover mistakes later we have to deal with that. I
          would see that the quality will improve this way.
          JeffH: I disagree with
          Job. The purpose of this is to have the clear specification that would
          allow for doing an interoperable implementation. If you find issues as
          part of implementation.
          Bruno: If you delay for a long time before you
          get a second implementation, people from the first implementation will not
          accept any technical changes.
          Acee: I'd like to agree with Jeff. If you
          delay you are pushing the whole process further back; how do you manage
          the change? You are raising the barrier for implementations.
          Sue: I was
          at GROW and we missed yet another RFC that was useful for operations
          people - AS4 communities. How does that affect the waiting?
          Any questions on BGP YANG model. What is the status, what are we doing
          with it?
          Sue: We have a challenge. The YANG doctors' new requirement is
          that all models adopt the revised datastore. We will see whether the
          models are deployed, I have spoken to a friendly AD and we will likely
          publish the current one and then the revised one. It is a compromise mode
          to be friendly to other WGs. This will be brought to the list.
          I agree with that. There are implementations. For those implementations
          that are shipping it makes sense to document the existing work.
          This is what versioning in YANG is for. It is fine to ship 2.0 if it fits
          the new thing. It is not a huge challenge to reorganize the code.
          Question for you Jeff. If that versioning does not work and we use
          two distinct models?
          JeffH: Even if we have version, the BGP model
          will not be backward compatible with revised datastores. Therefore,
          we will need 2 different versions.
          Mahesh: If you are asking for two
          different namespaces, then probably it is not a big problem. Do you
          need those models to be backwards compatible?
          Sue: Generally yes. To
          reiterate the question, do we require that two models be backwards
          JeffH: That is fundamentally not backwards compatible.
          Back to the boring status update. We have 2 new IPR disclosures.
          Is my perception correct that we are doing more code points than in the
          past, what has changed? Is there a permanent change that we need to adapt
          to it?
          JGS: That is a great question and I will ask to hold it until
          the next talk.
          JGS: We have a wiki.
          JGS: Hum, if you use a wiki once a
          quarter. Some people do use it. Hum or raise your hand if you never used
          a wiki. Quite a few. It would be interesting to know what we are not
          doing right. Is there information that is not available that you would
          Robert: There is another draft waiting for implementation, ORR,
          both are reported in the implementation report as of 10 minutes ago,
          it is ready to WGLC now.
          JGS: Please add that to the minutes.
          I was wondering before we were discussing YANG model, are we finished
          with that?
          JGS: To summarize what I heard - I heard more people saying
          let's try keeping it roughly the way it is. I do not want to say what the
          WG policy is on the fly; instead I will report to the WG in a few weeks.
          We keep having a topic on codepoints. It is thought to be a good policy
          in the IETF to get codepoints from the registry and not just use
          out-of-band or self-allocated codepoints. As of right now the WG policy is
          not to put codepoints into documents until they are allocated. If you are
          targeting your individual drafts as a WG document, you need to address
          this. People have reasons to do what they are doing, what might those
          reasons be, what could we do differently. Three possible ways forward. I
          started the conversation on the list a week ago. One of the answers is
          that publishing in my draft is better than quietly allocating and using it
          without telling anyone. I hope we have a better option. Code development
          moves and IETF process moves at different timescales, I do not want to
          slow the development process in order to get a codepoint. Lets talk about
          this. I was looking through RFC8126, it has good stuff in it. It captures
          the essence of our problem. [quote from 8126, see slides]. This is what
          we are seeing in a nutshell. There is a zoo of 10 different policies
          outlined in that RFC. We are not required to use those templates. If we
          can reuse some work from there, that would be better than inventing our own. How
          many people have registered something in FCFS registry here? It is easy
          to use. It may be slightly slower than putting a number and sticking into
          the draft, but that is the same time scale. Virtually no overhead. One
          of objections to changing everything to FCFS was no sanity check at all
          and whether we do not need to go all the way of cats and dogs, there
          is also expert review process. It is almost the same as FCFS, there
          is an expert who guards the registry. As long as the expert has the right
          expectations of what the job is, it is rather lightweight. All of those
          processes basically say that you have to have an RFC number to get a
          codepoint. This is a circular dependency. To break that we have an early
          allocation. The criteria for early allocation are that you have to have
          a spec that people can read and understand, the spec has to be stable,
          there has to be interest from community in deploying this thing. If
          you cannot get your document adopted you haven't demonstrated that those
          baseline expectations are met. Once you have done that in parallel with
          WG adoption, you can request early allocation, we poll the list, if
          there are no objections we ask for a codepoint and AD generally says ok,
          then we send to IANA and we are done. This may take 3 weeks-ish. What
          can we do about getting people the codepoint? Three options: beatings
          will continue until morale improves, we should yell louder at those
          putting codepoints into drafts. We can embrace the anarchy and say do
          whatever you have to do. We will stop controlling what you put into your
          drafts. Speaking as a WG member that does not make sense to me. Maybe we
          should stop maintaining the registries at all in this case. We can also
          say that we have a process and people find that the process is not working
          maybe we should change the process. Should we reclassify some or all of
          our registration policies? It is an administrative work to do that, you
          then write a short RFC to cover IANA considerations. I do not intend to
          come to a final conclusion in this room.
          Keyur: Typically these days when
          someone implements an extension it is a bit more formal. 3 weeks is a long
          time. If there is a process that can get us a codepoint in 2 days that is
          Bill Fenner: When we went through this exercise 8 years ago,
          there were a lot of concerns about converting 200-point registries to FCFS. What
          if someone came and registered 200 codepoints? That question needs to be
          JGS: I thought about that. RFC8126 talks about that. My own
          answer is that if we are concerned we can choose expert review instead
          of FCFS. This gives a gatekeeper layer. Another option is to divide the
          space into parts for FCFS and other allocations.
          Randy: Not that I am
          uncomfortable with this approach, but what is missing is the option to ignore the
          squatting. We are taking the cost, the blame is over there.
          JGS: That
          is the anarchy option.
          Alexander Azimov: The proper process is idea -
          draft - allocation - RFC. What if I have an idea, then implementation,
          then draft in order to find out whether my idea is working? This way I
          have squatted a codepoint for testing purposes. What is the proper process
          JGS: It is a good question. Some registries have experimental use
          points, they are supposed to be used for this. The problem is what if
          you start with idea, you get an implementation, your implementation is
          fielded, and then you need to get a properly allocated one. Then you need
          a flag date to replace that code. Another answer is that you make your codepoint
          configurable and force your user to configure it. We should never
          ship another spec that has code fields smaller than 16 bits. That would
          allow for use of permissive policies.
          Alexander: If I am not aware of any
          processes? When I got an idea I was unaware how IETF process works.
          I have no good answer for that.
          Sue: Take Randy's answer for
          Alexander: My suggestion is to have a pool of experimental
          codepoints in each registry.
          Job: I'd like to respond to what Randy said -
          to railroad the squatter.
          Job: The developers and the operators that
          were experimenting with codepoints. RFC8093 was written for exactly
          that problem. If we observe squatted codepoints we need to obsolete
          them. Let's make getting codepoints easier.
          Bruno: Plus one. We may
          have very large registries in future with very permissive policy. We
          deprecate a squatted codepoint and therefore it is very permissive. We
          lose the deprecated code point.
          JGS: Excellent discussion. I hope that
          in next few weeks we will come to the decision. What I hear is the
          request to reorganize registries to fit people's needs better. Would
          someone write a draft that clarifies what that policy would be? We need
          someone stepping up and writing the draft.
          Keyur: For me the problem
          is not the process, you can put any process as long as it takes less than
          2 days. That is the source of squatting.
          JGS: The fact that we have a
          problem does not mean that people are bad. We need to fix the process
          and the problem will go away.
          Robert: Why are we discussing this at
          IDR? All of RTG area has this problem, it is not unique to BGP.
          My answer is because I am IDR co-chair and not the RTG AD. The way the
          IETF is structured is that most of the work is done in the WGs. We need
          to pick a process that works well for the community.
          JeffH: This is a
          problem for other WGs too. BGP tends to have a global scope. The person
          that suffers from the squatted codepoint is the one that tries to get
          the codepoint legitimately.
          Updates on existing IDR drafts [9:40-10:40]
          1) Dissemination of Flow Specification Rules [Christoph Loibl]
          Requesting WGLC.
          JeffH: This
          is very useful for dealing with both internal defects and also other
          implementations. Extended communities of this type have this problem
          in the IETF, it is littered with the problems of magic communities
          and what to do when you have more than one of it. We probably need
          to write a general draft on what to do with it.
          JGS: My reaction is
          that the semantics of magic communities depend on the use case.
          One of the possibilities is to say that for these classes of
          things there needs to be exactly one and there is a value associated
          with it.
          Sue: Work in this draft is good and it can go forward -
          is my understanding right, and we need to drain the swamp?
          2) The BGP Tunnel Encapsulation Attribute [Keyur Patel]
          JGS: We can start WGLC next
          3) Making Route Servers Aware of Data Link Failures at IXPs [Jeff
          Haas] (10)
          Jeff Haas
          Keyur: Back when we did ORR
          we had NH-SAFI to simplify ORR computations. Can we leverage that SAFI
          here with the idea that we can use a generic NH-SAFI that requires per
          nexthop computations?
          JeffH: Previous draft used that, and we decided
          that it was not a good fit for the purpose of this solution.
          Acee: Given
          that this is a lot of new mechanisms that need to be implemented on the
          client and RS, why cannot you run OSPF among all the clients and BFD and
          reuse the existing technology?
          JeffH: Two things - route server has a
          little more work to do than a router does, and the RS has policy that
          is key for an IX; it is not only plumbing reachability. Speaking
          of OSPF, using a shared instance for thousands of peers.
          Acee: That
          could be done?
          Joel Jaeggli: What if one of the peers gets out of sync -
          that happens all the time.
          JeffH: Not with our implementation.
          Good maybe you should talk to Cisco.
          Joel: Blackholes exist in the
          infrastructure all the time even when things are working normally. The
          way how you solve that is not to use route reflectors. You cannot see
          which neighbor is blackholing through you. I think this work is super
          JeffH: This problem has existed since frame relay times.
          The case of a partition is a very clean cut, but operational practice
          is that if an IX becomes partitioned, things are not failing in a
          clean way. I would prefer to shut down everything that goes through
          that exchange. If participants have very fast detection that is a good
          solution. Otherwise you need to wait a long time, a few hours, to get routes
          back. There are other use cases where RSes can be useful other than
          the IX; it would be good to document that in the draft.
          JeffH: It can be a
          telemetry for a route server to see that something has gone awry.
          This all came from DE-CIX. This is an attempt to fix a problem in their
          environment. The reason I like this idea despite Jeff trying to complicate
          it :-) is that this allows measuring things. The distinct problem is that a
          fair number of participants in the IX are on routers with insufficient
          resources. Two reasons to use RS - one, less configuration, and two - I cannot
          hold all the routes. And cannot handle addpath. The small routers will
          take a while before they have BFD. The route servers if you read the RFC,
          they are designed for this, they are keeping a separate RIB for each
          customer. If that customer says that they cannot get to /32.
          Chris Loibl:
          Thank you for this draft, we have hit this problem a few times. I think
          it is a good idea to use some form of BFD autoprovisioning. I really like
          route server to give me routes. Not sure that putting this state on the
          routers that are connected to IX and making more overhead and making
          the decision on RS more complex, I would vote for addpath. And I want
          to select what routes I want to put in my table.
          JeffH: This proposal
          does not preclude addpath.
          Chris: It adds much to the complexity. If
          you just have BFD autoprovisioning in there it may be much easier. No
          blackholing at IXes.
          JeffH: Every single BGP implementation needs to
          implement NH reachability. Most of them do proxying from the route
          server. This is a question of how do you get state to a RS,
          You made this complex for current hardware.
          JeffH: I did not say
          complex, I said unsupported.
          Robert: Unsupported?
          JeffH: BFD requires
          endpoints to be provisioned. That is a simple change to protocol.
          That changes the establishment moment. The runtime for dataplane is the
          JeffH: You are mistaken on how BFD operates on dataplane.
          Scudder: Due to time, we'll need to take this to the mailing list.
          Route Leak Prevention using Roles in Update and Open messages [Alexander
          Azimov] (10)
          Alexander presenting.
          Alexander: Please share your
          thoughts on this.
          Sriram/NIST: Since complex is removed now, do we need
          to have a recommendation in the document on what to do when OTC is not
          Alexander: IOTC is a path attribute. If the roles are configured the
          attribute is filled in correctly. In your case you would not need to do
          Sriram: Should your draft say that it is possible to derive IOTC
          from the configuration?
          Alexander: It is not done automatically. You need
          to configure it manually. If you have another complex partner you can set
          IOTC on a per prefix policy, but you need to do that yourself.
          You can automate from configuration, can't you?
          Alexander: What do you
          mean automatically?
          Sriram: You can take from the configuration what
          the role is and you can set IOTC from that,
          Alexander: My meaning of
          automation is that you are not configuring IOTC per peer, you derive that
          from existing configuration.
          Sriram: Something in the configuration does
          not automatically mean what the role is. We have intra-AS messaging,
          and that messaging is IOTC in this draft, someone else may be using
          communities. Those are per prefix roles and I will send IOTC or community,
          that can happen automatically and it does not need to be at OPEN time.
          Alexander Lyamin: Don't you see that the complex case should be moved to a
          separate document?
          Alexander Azimov: I do not think so. It shows that the
          prefix was learned from a peer.
          AL: You are providing instructions without
          hurting the other peer?
          AA: Maybe we need to clarify that in a separate
          Sue: You may want to talk offline about the specifics.
          Somewhere in the draft you say that if OPEN policy mechanism is used
          it is consistent with the reject policy. Reject RFC applies to all
          AFIs. Exchanging nothing unless you do configure otherwise is a safe
          thing to do. OPEN policy applies to 1/1 (IPv4) and 2/1 (IPv6) and not
          other AFIs, the authors may consider to limit that explicitly. Are you
          saying this should be mandatory?
          AA: Not any more.
          Job: Thank you for
          Keyur: I am in favor of making it generic and making it the default
          for all address families, particularly for VPN AFIs. In VPN it can be
          used in a different manner than at the intra-AS level.
          Job: Internet use and
          VPN use are similar, but we do not know what future AFIs will mean. It is
          unsafe to open that door and go against the BGP reject RFC.
          AA: Reject RFC
          does not specify which AFI it applies to?
          Job: It applies to all AFIs.
          We do not add anything here, if there is no policy you need to configure
          Job: If you configure a role, you also open other AFIs.
          AA: Same
          as BGP reject. You are able to configure it in a way that you want.
          Time check, the conversation is useful.
          Ben Madison: On multiple AFIs -
          I have come across instances where people have multiple AFIs active on the
          same session. In this case it is not clear how the peer role for the internet
          AFI will work with the VPN AFI. It may be pretty hard. What I like with this
          - route distribution policy lives in AFs, and we are trying to enforce
          it on a session level.
          AA: I will try to discuss this during lunch with
          AA: We do not have a consensus here. First one is minor - we have
          two scenarios to send notifications. If we have conflict in roles, we
          are sending a notification. If we have one side having a strict role and
          other no role, we also send notification. Do we need separate subcodes
          for notification code? Do whatever we want? No best practice?
          Ruediger: I
          do not know about the BCP or precedent, but looking at the question these
          are different scenarios where the signaling to the potentially failing
          end should be distinct, so use two codes.
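The two notification scenarios described above (conflicting roles, and a strict side facing a peer with no role) can be sketched as follows. This is a minimal sketch under stated assumptions: the role names, the table of valid pairs, and the two subcode labels are hypothetical and are not quoted from the draft; no subcode values are implied.

```python
from typing import Optional

# Assumed valid (local, remote) role pairs; illustrative, not quoted from the draft.
VALID_PAIRS = {
    ("provider", "customer"),
    ("customer", "provider"),
    ("peer", "peer"),
}

# Hypothetical labels for two distinct notification subcodes.
ROLE_MISMATCH = "role-mismatch"
ROLE_MISSING = "role-missing-in-strict-mode"


def notification_for(local_role: str,
                     remote_role: Optional[str],
                     strict: bool) -> Optional[str]:
    """Return the subcode label to send, or None if the session may proceed."""
    if remote_role is None:
        # The peer did not announce a role: only an error when we are strict.
        return ROLE_MISSING if strict else None
    if (local_role, remote_role) not in VALID_PAIRS:
        # Both sides announced roles, but the pair is inconsistent.
        return ROLE_MISMATCH
    return None
```

Keeping the two outcomes distinct lets the receiving end tell a misconfigured role apart from a missing one, which is the argument made for using two codes.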
          AA: Will do if there is no
          Job: On the flip side the subcode within its context will mean
          something and strict/no strict context may have a different meaning.
          I may have a different opinion.
          Randy: It is 128 in the registry but you
          can use the first 64 only :-)
          AA: If the roles are globally deployed,
          do we need route leak detection and mitigation then? There are two
          drafts, they give out some hint about the peering relationship between
          SPs. Route leak detection and mitigation is also important. Route leak
          detection and mitigation could be helpful in partial deployments to
          limit the damage. Do we need it?
          JGS: Please keep brief.
          It is important to have detection and mitigation, there will be partial
          deployments for a long time. SPs would like to have a mechanism that works
          in case the customer AS is not using it.
          Sriram: [].
          AA: The problem is
          that this mechanism will give a hint about peering relationship. Are
          those SPs willing to reveal that relationship in trade for the benefits
          of this mechanism?
          AL: Given the publicly available information, can
          you say with a good probability what is the relationship between Job and
          Job: Lovers. :-)
          AA: Complex. :-) There is such a possibility.
          We are out of time, but we encourage taking this topic a few layers deeper. Is
          there an interest in having an interim? Hum please. I hear some hums.
          Updates on existing individual drafts [10:40-11:10 am]
          1) Carry congestion status in BGP extended community [Zhenqiang Li] (10)
          Zhenqiang presenting.
          JGS: We
          can have a discussion at the adoption call on the list.
          I missed in the presentation on what the dynamics of this attribute
          will be. That should be well understood. The other comment - you
          are limited to links at 256Gbps. In this we are beyond that scope
          in actual deployments.
          Zhenqiang: The new unit is 10Gbps.
          Still this gives a limited lifetime.
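The range concern can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that the bandwidth value is carried in an 8-bit field; the minutes do not state the field width, and the helper name `max_encodable_gbps` is hypothetical.

```python
def max_encodable_gbps(field_bits: int, unit_gbps: int) -> int:
    """Largest bandwidth an unsigned field of field_bits bits can express."""
    return (2 ** field_bits - 1) * unit_gbps


# Assumed 8-bit field: 1 Gbps units top out near the 256 Gbps limit raised
# in the comment, while 10 Gbps units extend the ceiling by a factor of ten.
limit_at_1g = max_encodable_gbps(8, 1)
limit_at_10g = max_encodable_gbps(8, 10)
```

Under that assumption a coarser unit raises the ceiling only linearly, which is consistent with the remark that the encoding still has a limited lifetime.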
          Zhenqiang: I want the WG to adopt
          the document and optimize the solution.
          Sue: Jie, could you sit with
          Zhenqiang to give him a few pointers?
          JGS: The bar for accepting the
          document should be to check whether there is a problem, how to address
          it, and then we should discuss accepting it.
          ?/Cisco: Why cannot you use
          MED or local preference instead of the extended community?
          Large communities can do everything. :-)
          Randy: Just ask Job. :-)
          Zhenqiang: Maybe I should choose another community to deliver this. The
          community container (wide community) is a possible solution too.
          Encourage to have a discussion at the back of the room.
          JGS: This is
          deal time - the adoption poll time - then engage with questions like
          2) Populate to FIB Action for FlowSpec [Zhenqiang Li] (10)
          Zhenqiang presenting.
          Zhenqiang: Asking for adoption.
          Robert: Who sets the L bit? If the guy who sets the L
          bit injects the route then you do not inject the route at all.
          Robert: It makes no sense at all.
          JGS: There are no other
          comments. We can take to the list.
          Sue: Robert, perhaps you
          should ask this question again on the list.
          3) BGP Logical
          Link Discovery Protocol (LLDP) Peer Discovery [Acee Lindem] (10)
          David Lamparter: We are
          one of those implementations that have ICMP based discovery. Regardless
          of having that feature, my view is that LLDP is completely wrong place
          for this. I do not see why we are going for L2 protocol that may not be
          ported on a GRE tunnel that may not even support L2. Why don't you take
          all of the protocol mechanism and run it over IPv6 with a new protocol
          Acee: Do you realize I may retire in next 5 years? :-)
          Are you saying that it is easier to get IEEE OUIs than ND?
          Acee: It is
          easier to implement and get.
          David: It is multiple times more complex
          to implement.
          Sue: We need very brief comments, we are running out of
          Keyur: Do not implement it, we will get you a free code? There is
          no generic discovery protocol that applies to many of the networks. It
          is a compromise.
          Robert: Error handling - if you mess up and BGP fails
          it is even worse. You can get a peering address in the OPEN. You do
          not need any additional TLVs.
          Donald: This is RFC7177.
          New Work
          1) BGP Support for Fast Link Status Notification [Marcus
          Sun] (10)
          Marcus presenting.
          Acee: Did you
          think about BGP-LS?
          Marcus: Yes, we did think something.
          Burjiz via
          Meetecho: To Acee's comment - we can use BGP-LS but it is too heavy,
          we are configuring link detection via community that makes detection
          a bit better. You can do a proprietary protocol also.
          Sue: Is there
          interest in having a virtual interim on link availability and BGP? Hum
          please? No hum. We will take that to the list.
          [End of meeting]
