I have five threads discussing LC:

Gen-ART Review of draft-ietf-behave-nat-behavior-discovery-06
draft-ietf-behave-nat-behavior-discovery changes
FW: Last Call: draft-ietf-behave-nat-behavior-discovery (NATBehavior Discovery Using STUN) to Experimental RFC
Last Call: draft-ietf-behave-nat-behavior-discovery (NAT Behavior Discovery Using STUN) to Experimental RFC

---------------

From Peter McCann's Gen-ART review

The Abstract incorrectly defines STUN to be:
    Simple Traversal Underneath Network Address Translators
Later in the draft, STUN is correctly expanded to:
    Session Traversal Utilities for NAT

fixed

End of Section 2.2:
    apocryphal evidence
Did you mean "anecdotal evidence"?

apocryphal -> anecdotal (honestly, in the ietf world, the "uncanonical" definition of apocryphal works in some sense)

Section 6.1 states:
    If the Request contained a PADDING attribute, PADDING MUST be included in the Binding Response. The server SHOULD use a length of PADDING equal to the MTU on the outgoing interface.
Section 7.7 states:
    PADDING MUST NOT be longer than the length that brings the total IP datagram size to 64K and SHOULD be an even multiple of four bytes. Because STUN messages with PADDING are intended to test the behavior of UDP fragments, they are an exception to the usual rule that STUN messages be less than the MTU of the path.
These two statements are not incompatible; however, I wonder if you wanted to mandate literally "length of PADDING equal to the MTU". Together with other fields, this would bring the total length up to something above the MTU, so fragmentation would happen (as desired). It might help the reader to repeat the SHOULD from 6.1 in the text in 7.7.

I updated the text so both sections say SHOULD be MTU rounded up to nearest multiple of 4 bytes.

Section 8:
    This draft defines new STUN response code.
SHOULD BE:
    This draft defines two new STUN response codes.

fixed

------------------

possible issues for minor clarification from discussion thread:

    More generally, one of the important differences between 3489 and ICE is that ICE ensures there is always a fallback to TURN, and thus avoids the problem experienced by 3489-based applications that tried to determine in advance whether they would need a relay and what their peer reflexive address would be, which are both impossible. behavior-discovery requires an application using it to have a fallback, but unlike ICE's focus on the problems inherent in VoIP sessions, doesn't assume that it will only be used to establish a connection between a single pair of machines, and so alternative fallback mechanisms may make sense. For example, in a P2P application, it may be possible to simply switch out of the role where such connections need to be established, or to select an alternative indirect route if the peer discovers that in practice, 10% of its connection attempts fail.

text was included in applicability section

-- I suppose we can mention this:

    > > Section 4.2 - Can't really separate the topic from if UDP is blocked from if the STUN server is down.

    The draft recommends multiple STUN servers for redundancy, but do we really want to engage in a reduction to the absurd of "it's impossible to diagnose network behavior because you can never differentiate between host failure vs network failure in the absence of responses"?

    True. But not interesting.

I renamed this test to checking for UDP connectivity with the STUN server. I assume someone who really wants to check for blocked UDP can work out how to combine multiple STUN servers, TCP checks, etc. to accomplish their goals.

-- from Apr 5 response to Bernard, and following discussion, add the following description (or something like it) for a description of experiment/how it might be used.

When P2P node A starts up, it evaluates its NAT(s) relative to other nodes already in the overlay.
Let's say that its testing indicates it's behind a good NAT, with endpoint-independent mapping and filtering. In this case, the peer will join the overlay and establish connections with appropriate peers in the overlay, but it will advertise to any node in the overlay that wants to reach it that they don't need to route through the overlay network formed by the P2P nodes to reach it (which is the normal routing mode in a P2P overlay); they can just send directly to its IP address.

So when node B wants to send a message to A, it sends the message directly to A's IP address and starts a timer. If it doesn't receive a response within a certain amount of time, then it routes the message to A across the overlay instead. (Alternatively, B could simultaneously send the message to A's IP address and across the overlay, which guarantees minimum response latency, but can waste bandwidth.)

Over time, A observes what percentage of the messages it receives arrive directly rather than across the overlay. If the percentage of direct connections is below some threshold (say 66%, picking a random number), then it may stop advertising for direct connections. But if the percentage is high enough, it continues to advertise because it may be helping performance. If at some point the NAT changes its behavior, A will notice a change in its direct connection percentage and may re-evaluate its decision to advertise a public address. (There are a lot of other details of how this might work, how it would deal with multiple levels of NATs, and what the actual cost benefits are. I don't want to get into all of the details of how it would work here.)

This is a good example because behavior-discovery is used for initial operating mode selection, but the actual decision for whether to continue advertising that public IP/port pair is made based on actual operating data.
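
A rough sketch of the bookkeeping node A does in the description above, in Python. The class name, the min_samples parameter, and the 0.66 default are made up for illustration (the 66% is the same arbitrary number as in the text); none of this comes from the draft.

```python
from dataclasses import dataclass


@dataclass
class DirectReachabilityTracker:
    """Counts how messages reach node A and decides whether A keeps
    advertising that it can be reached directly (hypothetical helper)."""

    threshold: float = 0.66  # arbitrary illustrative threshold
    min_samples: int = 20    # assumed: wait for some data before overriding
    direct: int = 0
    overlay: int = 0

    def record(self, via_direct: bool) -> None:
        # Called once per received message: direct to A's address, or
        # routed across the overlay after B's timer expired.
        if via_direct:
            self.direct += 1
        else:
            self.overlay += 1

    def direct_fraction(self) -> float:
        total = self.direct + self.overlay
        return self.direct / total if total else 0.0

    def should_advertise(self, nat_looks_endpoint_independent: bool) -> bool:
        # Initial operating mode comes from the behavior-discovery tests.
        if not nat_looks_endpoint_independent:
            return False
        # Too little operating data so far: keep the initial decision.
        if self.direct + self.overlay < self.min_samples:
            return True
        # Afterwards the decision rests on observed operating data.
        return self.direct_fraction() >= self.threshold
```

For instance, with min_samples=4, three direct messages and one overlay message keep A advertising (75% direct); two more overlay messages drop it to 50% and A stops.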
The example also uses the result of the behavior-discovery work as an optimization, not in a manner where the application will fail if a percentage of the nodes in the overlay are unable to make a connection.

edited and incorporated into S2.2

------------------

possibly address Dave Thaler's comment:

    I think one point is that "Example Use with P2P Overlays" is not necessarily the best example. Perhaps a better example, which is mentioned elsewhere in the doc, is a diagnostic tool to fingerprint the NAT and maybe alert the user if it observes any behavior that is against the recommendations. But if the results aren't reliable, a P2P Overlay developer may not be able to reliably use it. I do understand part of the point of the document is to suggest an experiment in this area. I might suggest using a diagnostic tool as an "example", and then in a section after that, have "Experiment with P2P Overlays".

I think section 2.1 already tried to express this. I added a bit more text and changed the title so that 2.1 is "Example Diagnostic Use" and 2.2 is "Example Use with P2P Overlays".

------------------------

also from Thaler: can't change state issue, but need to clarify what protects RESPONSE-TARGET (keeps coming up)

    On the topic of RESPONSE-TARGET, I completely agree with Cullen here... I couldn't follow from the current text, what the RESPONSE-TARGET was useful for that couldn't be done without it. I would prefer that it be either removed, or barring that, that it be better explained what it's needed for that can't be done without it. The requirement this introduces to keep state on the STUN server is undesirable in my view.

Numerous questions were brought up about what XOR-RESPONSE-TARGET is used for. I have ensured that every mention of it is coupled with binding lifetime discovery through Section 5.
The security requirements surrounding XOR-RESPONSE-TARGET have also been clarified (and made consistent, since there were several inconsistencies in the text because these requirements were debated and changed in the evolution of the draft). The Security discussion section sums this up:

    The server MUST NOT respond to requests with XOR-RESPONSE-TARGET unless they have cached state that a binding request with CACHE-TIMEOUT has previously been received from the target address.
    The server MUST either authenticate all requests using XOR-RESPONSE-TARGET or rate-limit its responses to such requests. Rate-limiting is RECOMMENDED even if authenticating requests, unless the server is deployed for an application requiring more frequent responses.
    Requests containing both XOR-RESPONSE-TARGET and PADDING are rejected by the server.
    Implementing XOR-RESPONSE-TARGET is optional, allowing servers that cannot store the required state and/or deployed for applications that don't require its use to automatically reject any requests containing it.

The decision of the working group was that rate limiting was sufficient to address security concerns since there is no amplification. The caching feature is additional protection, at the expense of extra state. There has been no concern about the extra state. (I agree that, in general, extra state is undesirable, but the memory requirements are only a few bytes per client transaction in progress.)

Interestingly, 3489 had a shared-secret mechanism where the server could essentially generate a magic cookie that the client had to use in subsequent requests. That mechanism could be used by a server to implement this security without state, since it could generate a cookie combining a secret key with the client's source address and then verify that the cookie matches when an XOR-RESPONSE-TARGET request comes in.
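
To make the stateless idea concrete, here is a minimal sketch in Python of a cookie that binds a server-side secret to the client's source address. Using HMAC-SHA256 is my assumption (3489 does not define the cookie this way), and the function names and the secret handling are hypothetical. How the cookie would be carried in a STUN message and how it would expire are deliberately left out.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-server secret; a real deployment would rotate it.
SERVER_SECRET = secrets.token_bytes(32)


def make_cookie(source_ip: str, source_port: int,
                secret: bytes = SERVER_SECRET) -> bytes:
    """Cookie handed back on the first request, derived only from the
    server secret and the address the request arrived from."""
    msg = f"{source_ip}:{source_port}".encode()
    return hmac.new(secret, msg, hashlib.sha256).digest()


def cookie_matches(cookie: bytes, target_ip: str, target_port: int,
                   secret: bytes = SERVER_SECRET) -> bool:
    """On a later XOR-RESPONSE-TARGET request, recompute the cookie for the
    requested target address and compare in constant time. No per-client
    state is kept on the server."""
    expected = make_cookie(target_ip, target_port, secret)
    return hmac.compare_digest(cookie, expected)
```

A cookie issued for 192.0.2.1:40000 verifies only when the XOR-RESPONSE-TARGET names that same address and port. Because this sketch carries no timestamp, the cookie stays valid for as long as the secret does, which is one of the details a real design would have to settle.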
We could introduce a similar mechanism triggered by CACHE-TIMEOUT, etc., but I'm not convinced that the state saved here is worth the additional protocol complexity. However, I'm certainly happy to bring the issue up for wg list discussion if you believe we should.

---------------------

from Apr 14, maybe add something along these lines to Section 4

    I've been thinking about whether there is any more accurate way to describe what the tests indicate. 4.3 currently uses the phrase "the NAT currently has Endpoint-Independent Mapping". It might be slightly more accurate to say that "the test produces no evidence to indicate that the NAT is not currently using Endpoint-Independent Mapping." Of course, given all of the requests to shorten the text, that's a lot of verbiage to update all relevant statements. It's also going to be very hard to read. I can add another explanation to the beginning of Section 4, though.

The intro to 4 was rewritten to address this, as well as a number of other comments.

--------------------------

Magnus summary of LC change requirements:

- Make the applicability statement clearer that any determination is transient and may contain errors, requiring a user to have a fallback.

reworked to clarify more carefully

- Make the proposed experiments clearer on how they can utilize the mechanism. Especially the P2P needs to make clear why this could be applicable given the limitations.

section 2 on the experiments has been significantly expanded

- Make clear that the algorithms are proposals of what can be determined using the STUN attributes and their behaviors with a single source address. Additional determinations are possible. Multiple interior source addresses could reveal additional behaviors.

Section 4 intro had significant changes, including addressing this point.

(I would add that we definitely need to clarify RESPONSE-TARGET more carefully)