Internet Draft                                                   B. Ford
Document: draft-ford-behave-gen-01.txt                            M.I.T.
Expires: November 8, 2005                                   P. Srisuresh
                                                          Caymas Systems
                                                            S. Sivakumar
                                                           Cisco Systems
                                                                May 2005


        Operating Principles and General Behavioral Requirements
               for Network Address Translators (BEH-GEN)


Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   or will be disclosed, and any of which I become aware will be
   disclosed, in accordance with RFC 3668.

   This document is an Internet-Draft and is subject to all provisions
   of Section 10 of RFC2026. Internet-Drafts are working documents of
   the Internet Engineering Task Force (IETF), its areas, and its
   working groups. Note that other groups may also distribute working
   documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   Distribution of this document is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2005). All Rights Reserved.

Abstract

   This document discusses the operating principles of Network Address
   Translator (NAT) devices and the behavioral properties required to

Ford, Srisuresh & Sivakumar                                     [Page 1]

Internet-Draft  NAT Behavioral requirements for IP & ICMP      May 2005

   make NAT more predictable and compatible with diverse application
   protocols. First, this document presents an architectural model for
   NAT devices and defines important terms used in conjunction with NAT
   operation. The architectural model sets the stage for a set of
   concrete recommendations for NAT implementers.
   The recommendations made by this document are independent of
   transport protocol. A set of companion documents provides behavioral
   recommendations specific to particular transport protocols.

Table of Contents

   1.  Introduction & Scope
   2.  Operating principles and terminology
       2.1.  Address/Port Maps
       2.2.  Address/Port Bindings
       2.3.  NAT Session
       2.4.  Cone/Symmetric NAT behaviors
             2.4.1.  Symmetric NAT
             2.4.2.  Port Restricted Cone NAT
             2.4.3.  Address Restricted Cone NAT
             2.4.4.  Full Cone NAT
       2.5.  Multi Level NAT topology
       2.6.  Hairpin NAT Session & Hairpin NAT translation
   3.  General Behavioral Requirements for NATs
       3.1.  Transport Protocol support
       3.2.  Address Binding and/or Port Binding support
       3.3.  Fragment support for inbound IP packets
       3.4.  Fragment processing on the outbound
       3.5.  Hairpin NAT translation
       3.6.  DHCP-Configured NATs
       3.7.  Honor the DF bit in IP header
       3.8.  ICMP Error packet handling
       3.9.  Rejection of IP packets not permitted by NAT
       3.10. ALG support
       3.11. Denial-of-Service Protection
   4.  Hints to implementers
       4.1.  Inbound fragmented packet processing
       4.2.  Port reservation
       4.3.  DHCP Configured NATs
   5.  Summary of Requirements
   6.  Security Considerations

1. Introduction & Scope

   Due to various technical and market pressures, Network Address
   Translators (NATs) have become a ubiquitous part of today's
   Internet. NATs cause well-known problems for applications,
   especially those that carry IP addresses in their message payloads
   [NAT-PROT]. RFC 3235 [NAT-APPL] provides some recommendations for
   making application protocols compatible with NAT. However, these
   recommendations do not adequately address applications with
   "peer-to-peer" (P2P) communication patterns, which by their nature
   carry IP addresses in message payloads. Peer-to-peer applications
   that suffer from this problem include Voice over IP and Multimedia
   over IP [SIP, H.323], as well as online games. Given the prevalence
   of NAT, applications are forced to use ad hoc techniques in an
   attempt to function reliably across NATs. A companion document
   [BEH-STATE] describes the current "state of the art" in NAT
   traversal techniques adopted by peer-to-peer applications.

   As stated in RFC 3424 [UNSAF], there is a wide degree of variability
   in how NAT devices behave. This document defines a set of
   requirements for NAT behavior that will reduce the unpredictability
   and brittleness of NAT devices and enable applications to traverse
   them reliably. The requirements specified here apply generally to
   all NAT variations described in RFC 2663 [NAT-TERM], including
   Traditional NAT (i.e., Basic NAT and NAPT), Bi-directional NAT, and
   Twice NAT. However, the primary focus of this document is NAPT, the
   variant of Traditional NAT that is most widely deployed today.

   Traditional NAT inherently mandates a certain level of firewall-like
   functionality. However, firewall functionality in general is out of
   the scope of this specification.
   NAT traversal strategies that involve explicit signaling between the
   application and the NAT [SOCKS, RSIP, MIDCOM, UPNP] are out of the
   scope of this document. This document focuses strictly on the
   behavior of the NAT itself, and not on the behavior of applications
   that traverse NATs. A separate companion document [BEH-APP] provides
   recommendations for application designers on how to make
   applications work robustly over NATs that follow the behavioral
   requirements specified here and in the companion Behave documents.

   Section 2 describes the principles of NAT operation and the terms
   used throughout the Behave working group documents. It provides the
   technical background for the recommendations made in Section 3.
   Section 3 lists the general recommendations to NAT vendors,
   independent of the transport protocol specific recommendations.
   Section 4 provides some hints on how to implement some of the
   requirements listed in Section 3. Lastly, Section 5 summarizes all
   the requirements in one place.

2. Operating principles and terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [KEYWORDS].

   Readers are urged to refer to RFC 2663 [NAT-TERM] for information on
   basic NAT taxonomy and terminology. Most NAT devices deployed today
   fall under the category of Traditional NAT, which implements an
   asymmetric translation scheme that multiplexes many hosts on a
   "private" network onto one or a few "public" IP addresses. Readers
   may refer to RFC 3022 [NAT-TRAD] for a detailed description of
   Traditional NAT.

   This section uses the combination of RFC 2663 [NAT-TERM] and RFC
   4008 [NAT-MIB] to outline the basic principles of NAT operation and
   to define the technical terms used throughout.
   Newer terms not found in either of those documents are also defined
   in this section.

   The following diagram presents a general architectural model
   describing NAT operation. The model does not prescribe a method of
   implementing NAT, but serves as a framework for understanding the
   rationale behind the behavioral requirements described later in the
   document.

                                NAT Device
            +------------------------------------------------+
            |                                                |
            | +------------+   +----------+   +----------+   |
            | | Address/   |   | Address/ |   |          |   |
            | | Port       |==>| Port     |==>| NAT      |   |
            | | Maps       |   | Bindings |   | Sessions |   |
            | | (Admin     |   | (Static &|   | (Dynamic)|   |
            | | Configured)|   | Dynamic) |   |          |   |
            | +------------+   +----------+   +----------+   |
            |                                     ^          |
            |                                     |          |
            |              +----------------------+          |
            |              |                                 |
   Incoming |      +-------+------+     +-------------+      | Outgoing
   Packets  |      | NAT Packet   |     | IP routing/ |      | Packets
   -------->+ ---->| Translation  |---->| Forwarding  |----> |-------->
            |      |              |     |             |      |
            |      +--------------+     +-------------+      |
            |                                                |
            +------------------------------------------------+

         Figure 1: Architectural model describing NAT operation

   Figure 1 above is a pictorial overview of how a NAT operates and
   encompasses the building blocks of a NAT device. Central to NAT
   operation are three tables: "Address/Port Maps", "Address/Port
   Bindings", and "NAT Sessions". The tables are described in detail in
   the following subsections.

   Implementation of these tables varies across vendors. All vendors
   implement Address/Port Maps and NAT Sessions in some form. Vendors
   supporting Symmetric NAT behavior do not implement Address/Port
   Bindings, but derive the NAT Sessions directly from the Address/Port
   Maps.

   Packet processing within a NAT device works roughly as follows. The
   NAT device first looks up the incoming packet against the known set
   of NAT Sessions.
   If a match is not found and this is likely the first packet of a new
   session, a new NAT Session may be created based on either the
   existing Bindings or the Address Maps. The NAT then translates the
   packet as specified by the matching NAT Session. Finally, the
   translated packet is looked up in the routing table for forwarding
   to the next hop.

2.1. Address/Port Maps

   This document uses the term Address Map as defined in RFC 4008
   [NAT-MIB]. The Address/Port Map table represents the configuration
   database set up by the NAT administrator. The entries of the table
   determine the type of NAT the device implements: for example, Basic
   NAT, NAPT, Bidirectional NAT, Twice-NAT, or a combination thereof.
   The Address Map table may also contain static (i.e., permanent)
   mappings between IP addresses and/or port endpoints; for example, a
   mapping between port 80 on the NAT's public IP address and a
   particular Web server located on the internal network.

2.2. Address/Port Bindings

   This document uses the term "Address Binding" as defined in RFC 3022
   and "Port Binding" as defined in RFC 4008. A NAT device may
   implement both Address and Port Bindings.

   An Address Binding is a persistent association between a particular
   IP address in the NAT's internal address realm and an IP address in
   its external realm, which the NAT assigns as the internal node's
   "public address" for the purpose of communicating across the NAT. A
   Port Binding similarly represents a persistent association between
   an internal (IP address, TCP/UDP port) endpoint and a corresponding
   external (IP address, TCP/UDP port) endpoint.

   A NAPT generally establishes a Port Binding while setting up the
   first outgoing NAT Session originating from a particular internal
   (IP address, TCP/UDP port) endpoint.
   Once that Binding has been established, the NAT reuses the same Port
   Binding whenever it subsequently establishes a new outgoing NAT
   Session originating from the same internal endpoint (but possibly
   targeted at a different external endpoint).

   Not all existing NATs use Address or Port Bindings to determine
   their IP address and port translation behavior, however. Some
   existing NATs set up NAT Sessions directly from the Address/Port
   Maps without creating any Bindings.

   The notion of Address/Port Binding is crucial to making NATs behave
   in a predictable fashion for a number of application communication
   patterns. A Binding represents the association between internal and
   external entities (addresses or endpoints) that the NAT has set up.
   The NAT can create Bindings in the Binding table either statically,
   to reflect the static one-to-one configuration entries in the NAT
   Address Map table, or dynamically, prior to setting up NAT Session
   flows across the NAT. A Binding can be between a pair of IP
   addresses or a pair of endpoints representing the same entity across
   two realms. A Binding between a pair of addresses is called an
   Address Binding. A Binding between a pair of TCP/UDP endpoints is
   called a TCP/UDP Port Binding, or simply a Port Binding for short.
   Bindings are directional.

2.3. NAT Session

   The term "NAT Session" was first defined in RFC 4008 [NAT-MIB] to
   represent the dynamic state the NAT uses to translate the individual
   packets comprising a particular communication session flowing across
   the NAT.

   A NAT Session is conceptually defined by a tuple of the following
   form:

      (origin session, origin side, target session, target side)

   The "origin side" and "target side" components of this tuple are
   simply the tags "Internal" or "External". An actual NAT
   implementation might use a one-bit flag, for example, or a physical
   network interface name or number, to represent these "side" tags.
   The "origin side" indicates which side of the NAT, and thus which IP
   address realm, the logical session originated from: that is, on
   which side the NAT received the packet that first initiated the
   session. The "target side" indicates which side of the NAT and which
   address realm the logical session is targeted toward. A NAT
   Session's origin and target endpoints are usually on opposite sides
   of the NAT, but not always.

   The "origin session" and "target session" components are ordinary
   session tuples as described above, describing the session's identity
   within the IP address realm indicated by the corresponding "side"
   component. For example, the "origin session" is the complete (source
   IP, source port, dest IP, dest port) tuple describing the session's
   identity on the side of the NAT from which the session was
   initiated, and the "target session" is the complete (source IP,
   source port, dest IP, dest port) tuple describing the session's
   identity as it appears on the side of the NAT to which the session
   was targeted.

2.4. Cone/Symmetric NAT behaviors

   RFC 3489 [STUN] defines terminology for several different NAT
   variations. In particular, it uses the terms "Full Cone",
   "Restricted Cone", "Port Restricted Cone" and "Symmetric" to refer
   to different variations of a NAT's IP address and port assignment
   behavior. These terms have historically been used only to describe a
   NAT's behavior with respect to the UDP transport, although the
   issues these terms were intended to address are also important for
   TCP.

   Unfortunately, besides being historically attached to the UDP
   transport protocol, these categories have proven insufficient to
   represent the full range of NAT behaviors that have been observed to
   exist. This specification therefore defines specific NAT behaviors
   individually instead of using the broad Cone/Symmetric terminology.
   The specific relationship between the historical Cone/Symmetric
   terminology and the individual NAT behaviors is defined below.

2.4.1. Symmetric NAT

   Symmetric NAT behavior is exhibited by a NAT device that does NOT
   use Address Binding or Port Binding for TCP/UDP based NAT Sessions.
   If a TCP/UDP endpoint on a private host, denoted by the tuple (IP
   address, port number), originates multiple sessions, a new public
   TCP/UDP port is assigned to translate the private endpoint in each
   new session.

2.4.2. Port Restricted Cone NAT

   Port Restricted Cone NAT behavior is exhibited by a NAT device that
   uses Address/Port Binding and behaves as follows. If a TCP/UDP
   endpoint on a private host, denoted by the tuple (IP address, port
   number), originates multiple sessions, the same public TCP/UDP port
   is assigned to translate the private endpoint in each new session.
   Further, only packets that belong to sessions initiated by the
   private host are permitted inbound.

2.4.3. Address Restricted Cone NAT

   Address Restricted Cone NAT behavior is exhibited by a NAT device
   that uses Address/Port Binding and behaves as follows. Once a UDP
   endpoint on a private host, denoted by the tuple (IP address, port
   number), has originated a UDP session, an Address Restricted Cone
   NAT will accept incoming UDP packets to the corresponding public
   port only from those external endpoints whose IP address matches the
   address of an external endpoint to which the private endpoint has
   initiated outgoing sessions.

2.4.4. Full Cone NAT

   Full Cone NAT behavior is exhibited by a NAT device that uses
   Address/Port Binding and behaves as follows. Once a UDP endpoint on
   a private host, denoted by the tuple (IP address, port number), has
   originated a UDP session, a Full Cone NAT will accept incoming UDP
   packets to the corresponding public port from any external endpoint.

2.5. Multi Level NAT topology

   The term "Multi Level NAT" is not defined in any earlier documents.
   The term is being defined for the first time in this document.

   Multi Level NAT topology is a network topology in which NAT devices
   can be found at two or more levels between communicating endpoints.
   NATs are increasingly, and often unintentionally, used to create
   hierarchically interconnected clusters of private networks in which
   some hosts are separated from the Internet by more than one level of
   Traditional NAT. The following diagram illustrates this situation.

                            Internet
                      (public IP addresses)
              ------------------+---------------+--
                                |               |
                                |               |
                         +-------------+     Host S
                         |    NAT-1    |
                         +-------------+
                                |
                                |
                        Private Network 1
                     (private IP addresses)
              ----+---------------------------+----
                  |                           |
                  |                           |
              +-------+                   +-------+
              | NAT-2 |                   | NAT-3 |
              +-------+                   +-------+
                  |                           |
                  |                           |
          Private Network 2           Private Network 3
       (private IP addresses)      (private IP addresses)
        ----+-----------+----       ----+-----------+----
            |           |               |           |
            |           |               |           |
         Host A      Host B          Host C      Host D

                  Figure 2. Multi Level NAT topology

   NAT-1 may for example be a large enterprise NAT deployed by an ISP
   that does not have enough IP addresses to assign one to each of its
   customers, an increasingly common situation especially in developing
   countries. NAT-2 and NAT-3 are consumer-level NATs deployed
   independently by the ISP's customers to multiplex their small home
   or business networks onto the single IP address their ISP gives them
   via DHCP.

   Neither the ISP nor the customers necessarily intend to create this
   hierarchical Multi Level NAT topology. Multi Level NAT topologies
   arise merely as a consequence of the same technical and economic
   factors that drove the wide deployment of NATs in the first place.

2.6. Hairpin NAT Session & Hairpin NAT translation

   The terms "Hairpin NAT Session" and "Hairpin NAT translation" are
   not defined in any earlier documents.
   The terms are being defined for the first time in this document.

   A Hairpin NAT Session is a NAT Session having the form (origin
   session, Internal, target session, Internal). It represents a
   logical communication session whose endpoints are both on the
   internal network, but which nevertheless flows "through" the NAT and
   requires address translation.

   The translation of packets subject to a Hairpin NAT Session is
   called Hairpin NAT translation, or simply Hairpin NAT. Packets
   subject to Hairpin NAT translation undergo translation of both
   source and destination endpoints within the same NAT device. The
   need for Hairpin NAT arises from the necessity to support
   application traversal through Multi Level NATs.

   Hairpin NAT translation refers to the ability of a NAT device to
   allow multiple private endpoints behind the NAT to communicate with
   each other using each other's *public* (translated) endpoints. When
   two hosts reside on different private networks but nevertheless have
   at least one NAT in common, it is not possible for the two hosts to
   establish direct peer-to-peer communication with each other unless
   the common NAT(s) support hairpin translation. The following diagram
   illustrates this situation.
                              Server S
                                (S:s)
                                  |
                 ^ Session A-S ^  |  ^ Session B-S ^
                 | (A1:a1,S:s) |  |  | (B1:b1,S:s) |
                                  |
                           +-------------+
                           |    NAT-1    |
                           +-------------+
                                  |
                                  |   Private Network 1
         +------------------------+------------------------+
         |                                                 |
         |  ^ Session A-S ^               ^ Session B-S ^  |
         |  | (A2:a2,S:s) |               | (B3:b3,S:s) |  |
         |                                                 |
         |  ^  Session A-B  |           ^  Session B-A  ^  |
         |  | (A2:a2,B1:b1) |           | (B3:b3,A1:a1) |  |
         |                                                 |
  +-------------+                                   +-------------+
  |    NAT-2    |                                   |    NAT-3    |
  +-------------+                                   +-------------+
         |                                                 |
         |  Private Network 2           Private Network 3  |
  ---+---+-------------                  ------------------+---+-----
     |                                                         |
     |  ^ Session A-S ^                       ^ Session B-S ^  |
     |  |  (A:a,S:s)  |                       |  (B:b,S:s)  |  |
     |                                                         |
     |  ^ Session A-B ^                       ^ Session B-A ^  |
     |  | (A:a,B1:b1) |                       | (B:b,A1:a1) |  |
     |                                                         |
  Host A                                                    Host B
   (A:a)                                                     (B:b)

     Figure 3. Hairpin NAT translation in a Multi Level NAT scenario

   Suppose Host A in the topology above initiates an outgoing session
   A-S from private endpoint A:a to public endpoint S:s on Host S, a
   "well-known" server on the Internet. In setting up this outgoing
   session, NAT-2 first creates an outgoing NAT Session that translates
   the session tuple (A:a, S:s) on Private Network 2 to a corresponding
   session tuple (A2:a2, S:s) on Private Network 1. This outgoing
   session then traverses NAT-1, which creates a new NAT Session
   mapping the session tuple (A2:a2, S:s) on Private Network 1 to the
   session tuple (A1:a1, S:s) on the main Internet.

   Host B similarly initiates an outgoing session from B:b to S:s,
   causing NAT-3 to assign "intermediate" source endpoint B3:b3 to this
   session as it appears on Private Network 1, and in turn causing
   NAT-1 to assign public source endpoint B1:b1 to this session as it
   appears on the Internet.

   Client hosts A and B now obtain from S each other's public source
   endpoints as known to S, namely B1:b1 and A1:a1, respectively.
   Each client then attempts to open a peer-to-peer communication
   session targeting the other host's public endpoint, as described
   fully in the companion document [BEH-APP].

   To NAT-1, the common NAT, Host A's attempt to open a peer-to-peer
   connection to B appears as an attempt received from private endpoint
   A2:a2 and directed to "public" endpoint B1:b1. This "public"
   endpoint, however, is merely one of the temporary public endpoints
   that NAT-1 itself previously assigned to represent B's
   "intermediate" private endpoint B3:b3! In order to handle this
   communication attempt properly, NAT-1 needs to set up a Hairpin NAT
   Session for packets traveling from A to B. NAT-1 then translates A's
   "intermediate" private source endpoint A2:a2 into A's corresponding
   public source endpoint A1:a1, and simultaneously translates B's
   public destination endpoint B1:b1 into B's corresponding
   "intermediate" private endpoint B3:b3, before forwarding the
   translated packet on to B3:b3 on its private network. This packet
   will then traverse NAT-3 and reach Host B with a destination
   endpoint of B:b and a source endpoint of A1:a1.

   Conversely, for packets flowing from B to A, NAT-1 translates B's
   intermediate private source endpoint B3:b3 into its corresponding
   public source endpoint B1:b1, and simultaneously translates A's
   public destination endpoint A1:a1 into A's intermediate private
   endpoint A2:a2, before forwarding the translated packet on to A2:a2
   and eventually to A:a.

3. General Behavioral Requirements for NATs

   This section lists the general behavioral requirements for a NAT
   device when processing IP packets. Even though ICMP is a transport
   protocol on top of IP, ICMP packet processing is considered an
   integral part of IP processing itself. With the exception of ICMP,
   the behavioral requirements laid out below are independent of the
   transport protocol. The rationale behind each requirement is
   discussed in detail alongside it.
3.1. Transport Protocol support

   TCP and UDP are by far the most common and widely deployed IP
   transport protocols, although other transports exist as well. Of the
   various NAT types, NAPT is the most restrictive in terms of
   transport protocol support. For widespread application
   compatibility, therefore, it is crucial that any NAT support at
   least the TCP and UDP transports. NATs are encouraged to support
   other transport protocols as well, as they become standardized and
   deployed.

   REQ-1: A NAT MUST support the traversal of TCP based applications
          and unicast UDP based applications.

3.2. Address Binding and/or Port Binding support

   Several applications use the same endpoint within a realm to
   establish multiple simultaneous sessions. Many peer-to-peer
   applications initiate sessions to the registered public endpoints of
   peering hosts. In order to support peer-to-peer applications and
   applications that maintain multiple simultaneous sessions using the
   same TCP/UDP endpoint, a NAT MUST retain the association it assigned
   to an endpoint between realms and reuse the same endpoint
   association when multiple sessions using the same endpoint are
   routed through the NAT device.

   This issue is of general relevance for any transport protocol that
   uses port numbers to represent communication endpoints, including
   TCP and UDP. Such a binding between endpoints can occur when a NAT
   device maintains Address Bindings or Port Bindings.

   REQ-2: A NAT device MUST support Address and/or Port Bindings.
          Specifically, Symmetric NAT type behavior MUST be deprecated.

3.3. Fragment support for inbound IP packets

   Routers in the network are able to forward fragmented IP packets
   just as they do any other non-fragmented IP packets, because packet
   forwarding is based solely on looking up the destination IP address
   in the routing table and finding the longest prefix match to
   identify the next hop to forward to. Routers do not need to retain
   any state pertaining to fragmented packets traversing them.

   A NAT device operates differently from a router in that the NAT
   device must find the matching NAT Session for an IP packet and
   perform NAT translation on the packet prior to forwarding. NAT
   Session lookup requires the full 5-tuple of the IP datagram. Only
   the first fragment of the IP datagram contains the full 5-tuple.
   Subsequent fragments contain the fragment ID, but do not contain
   transport protocol specific details such as source and destination
   port numbers. The NAT device must be able to associate the same
   session tuple with all fragments by virtue of the fragment ID and
   use that information to locate the NAT Session the packets belong
   to.

   Note, however, that IP fragments cannot be assumed to arrive in
   order. Some operating systems transmit the fragments of an IP
   datagram out of their logical order as a matter of course. In
   addition, network conditions can also cause dynamic packet
   reordering in transit.

   A NAT device MUST be capable of processing all fragments of an IP
   datagram inbound to the NAT device. The NAT device MUST retain the
   assembly state pertaining to a fragment ID until all fragments of
   the IP datagram are processed. By doing this, the NAT is able to
   process all fragments pertaining to an IP datagram using the same
   NAT Session the IP datagram belongs to.

   Applications such as NFSv2 over UDP assume a default read/write
   buffer size of 8192 bytes and rely on IP fragmentation support in
   the network.
   A fully assembled IP datagram will be about 8300 bytes long. NAT
   devices MUST support the traversal of common applications such as
   this.

   REQ-3: A NAT device MUST be able to process (i.e., receive,
          translate, and forward) all fragments of an IP datagram,
          whether they arrive in order or out of order. A NAT device
          MUST process IP fragments that assemble to IP datagrams of up
          to 8300 bytes.

3.4. Fragment processing on the outbound

   Suppose two private hosts originate TCP/UDP packets (fragmented or
   not) to the same destination host, and both packets transit the same
   NAPT device and use the same fragmentation identifier. Suppose
   further that the NAPT device assembles the IP packets (in case they
   were fragmented) and translates them using the appropriate NAT
   Sessions. When the NAPT translates IP datagrams, it assigns all
   outbound IP datagrams the same public IP address, but different
   TCP/UDP port numbers. While being forwarded, an IP datagram may be
   fragmented on the way out. Only the first fragment contains the
   TCP/UDP header that would be necessary to associate the packet with
   a specific session. Subsequent fragments do not contain TCP/UDP port
   information, but simply carry the same fragmentation identifier
   specified in the first fragment.

   When the target host receives the two unrelated datagrams, carrying
   the same fragment ID and coming from the same assigned host address,
   the target host is unable to determine which of the two sessions the
   fragments belong to and might corrupt both sessions.

   In order to avoid problems of this kind, the NAPT device SHOULD
   further translate the fragment ID in outgoing packets such that the
   tuples of (SrcIP, DestIP, fragment ID, Protocol) are unique and
   distinguishable across all outgoing packets from the NAT device.
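   The fragment ID translation suggested above can be sketched as
   follows. This is a minimal Python illustration, not part of the
   specification: the class name FragmentIdRewriter, its method and
   field names, and the sample addresses are all hypothetical. It
   assumes the NAPT sees the original fragment ID of each outbound
   datagram, and it omits state expiry and the handling of collisions
   after the 16-bit counter wraps.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FragmentIdRewriter:
    """Hypothetical helper: hands out outbound IP Identification values
    that are distinct per (public SrcIP, DestIP, Protocol)."""

    # Next free identifier for each (public_src, dst, proto) tuple.
    next_id: Dict[Tuple[str, str, int], int] = field(default_factory=dict)
    # Remembered choice per original datagram, so that every fragment of
    # one datagram is rewritten to the same identifier.
    assigned: Dict[Tuple[str, str, str, int, int], int] = field(
        default_factory=dict
    )

    def rewrite(self, private_src: str, public_src: str, dst: str,
                proto: int, orig_id: int) -> int:
        datagram = (private_src, public_src, dst, proto, orig_id)
        if datagram in self.assigned:
            return self.assigned[datagram]
        key = (public_src, dst, proto)
        new_id = self.next_id.get(key, 0)
        # The Identification field is 16 bits wide; this sketch simply
        # wraps and does not check for reuse of a still-active value.
        self.next_id[key] = (new_id + 1) & 0xFFFF
        self.assigned[datagram] = new_id
        return new_id


# Two private hosts pick the same fragment ID toward the same target.
rewriter = FragmentIdRewriter()
id_a = rewriter.rewrite("10.0.0.1", "192.0.2.1", "198.51.100.5", 17, 7)
id_b = rewriter.rewrite("10.0.0.2", "192.0.2.1", "198.51.100.5", 17, 7)
```

   In this sketch the two datagrams leave the NAPT with different
   fragment IDs even though both hosts chose the value 7, while a later
   fragment of either datagram maps to the identifier already chosen
   for it.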
   REQ-4: When fragmenting packets on the outbound, a NAT device SHOULD
          ensure that the tuples of (SrcIP, DestIP, fragment ID,
          Protocol) are unique across all outgoing packets. This
          requirement pertains specifically to NAPT devices.

3.5. Hairpin NAT translation

   Multi Level NATs are commonly deployed. Private hosts behind Multi
   Level NATs use their public endpoint identities to communicate with
   each other. Hairpin NAT translation MUST be supported in NAT devices
   to allow applications on the private hosts to communicate with each
   other.

   REQ-5: All NATs MUST support hairpin NAT translation.

3.6. DHCP-Configured NATs

   Many NATs, particularly consumer-level devices designed to be
   deployed by nontechnical users, also act as DHCP clients. In its
   default configuration, a consumer NAT typically obtains its public
   IP address, default router, and other IP configuration information
   via DHCP from an ISP or other "upstream" network. On its internal
   network side, the NAT then automatically sets up its own private
   "downstream" subnet in one of the private IP address regions
   assigned to this purpose in RFC 1918 [PRIV-ADDR]. The NAT typically
   acts as a DHCP server for its private downstream network, managing
   its pool of private IP addresses automatically and handing them out
   to the hosts (and perhaps other NATs) on the private network on
   demand.

   This auto-configuration of private networks can be problematic,
   however, if the NAT's upstream network is also in RFC 1918 private
   address space. In the Multi Level NAT scenario described in section
   2.5, NAT-2 and NAT-3 are likely to be consumer-level NATs that
   obtain their "external" IP addresses on Private Network 1 from
   NAT-1's DHCP server.
   Thus, from the viewpoint of NAT-2 or NAT-3, both their "internal"
   and "external" networks are probably in the private RFC 1918 address
   regions, and may even use numerically overlapping IP addresses.
   Vendors of DHCP-configured NATs must carefully design their NATs to
   ensure that they function correctly and robustly even in such
   problematic scenarios.

   REQ-6: A NAT device whose external IP interface can be configured
          via DHCP MUST operate correctly even if its external
          interface's IP address and subnet configuration numerically
          conflicts with the IP address and subnet configuration of the
          NAT's internal interface(s).

3.7. Honor the DF bit in IP header

   A NAT device MUST honor the DF (Don't Fragment) bit in the IP header
   of the packets that transit the NAT device. The majority of TCP
   sessions have the DF bit set and expect the devices en route not to
   fragment the TCP segments. If the MTU on the forwarding interface of
   the NAT device is such that the IP datagram cannot be forwarded
   without fragmentation, the NAT MUST send a destination unreachable
   ICMP message (ICMP type 3, Code 4) with a suggested MTU back to the
   sender and drop the IP packet. The sender will resend after taking
   appropriate corrective action.

   REQ-7: If the DF bit is set on an inbound IP packet and the NAT
          device cannot forward the packet without fragmentation, the
          NAT device MUST send a destination unreachable ICMP message
          (ICMP type 3, Code 4) with a suggested MTU back to the sender
          prior to dropping the IP packet.

3.8. ICMP Error packet handling

   A NAT device MUST transparently forward ICMP error messages ([ICMP])
   it receives from intermediate or end nodes in either realm to the
   intended end node.
Unlike other IP packets, the basis for translation of an ICMP error
packet is the NAT Session to which the packet embedded within the
ICMP error message payload belongs, not the IP and ICMP headers in
the outer layer.

Consider the following scenario in figure 4. Say, NAT-xy is a
traditional NAT device connecting hosts in private and external
networks. Router-x and Host-x are in the external network. Router-y
and Host-y are in the private network. The subnets in the external
network are routable from the private as well as the external
domains, whereas the subnets in the private network are only routable
within the private domain. When Host-y initiated a session to Host-x,
let us say that the NAT device assigned an IP address of Host-y' to
associate with Host-y in the external network.

                       Host-x
                          |
                          |
           ---------------+-------------------
                          |
                          |
                   +-------------+
                   |  Router-x   |
                   +-------------+
                          |
        External Network  |
      --------------------+--------+-------------------
                                   |
                                   |  ^
                                   |  |
                                   |  (Host-y', Host-x)
                                   |  |
                                   |  v
                                   |
                            +-------------+
                            |   NAT-xy    |
                            +-------------+
                                   |
                                   |  Private Network
      ----------------+------------+----------------
                      |
                      |
                      |
               +-------------+
               |  Router-y   |
               +-------------+
                      |
                      |
      ----------------+-------+--------
                              |
                              |  ^
                              |  |
                              |  (Host-y, Host-x)
                              |  |
                              |  v
                              |
                           Host-y

Figure 4. NAT topology with intermediate routers in both realms

Say, a packet from Host-y to Host-x triggered an ICMP error message
from one of Router-x or Host-x (both of which are in the external
domain). Such an ICMP error packet will have one of Router-x or
Host-x as the source IP address and Host-y' as the destination IP
address.
When the NAT device receives the ICMP error packet, the NAT device
must use the packet embedded within the ICMP error message (i.e., the
IP packet from Host-y to Host-x) to look up the NAT Session the
embedded packet belongs to and use the NAT Session to translate the
embedded payload. The NAT device must also use the NAT Session to
translate the outer IP header. In the outer header, the source IP
address will remain unchanged because the originator of the ICMP
error message (Host-x or Router-x) is in the external domain and
routable from the private domain. The destination IP address Host-y'
must however be translated to Host-y using the NAT Session
parameters.

Now, say, a packet from Host-x to Host-y triggered an ICMP error
message from one of Router-y or Host-y (both of which are in the
private domain). Such an ICMP error packet will have one of Router-y
or Host-y as the source IP address and Host-x as the destination IP
address. When the NAT device receives the ICMP error packet, the NAT
device must use the packet embedded within the ICMP error message
(i.e., the IP packet from Host-x to Host-y) to look up the NAT
Session the embedded packet belongs to and use the NAT Session to
translate the embedded payload. The NAT device must also use the NAT
Session to translate the outer IP header. In the outer header, the
destination IP address will remain unchanged, as the IP address of
Host-x is already in the external domain. If the ICMP error message
is generated by Host-y, the NAT device must simply use the NAT
Session to translate the source IP address Host-y to Host-y'.
However, if the ICMP error message is generated by the intermediate
node Router-y, the NAT device will not have a translation entry for
Router-y within the NAT Session. The NAT may also not have an Address
Binding in place for Router-y. In such a case, the NAT device must
simply use its own IP address in the external domain to translate the
source IP address.
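The two cases described above can be sketched in Python as follows.
This is an illustrative sketch only, under simplifying assumptions:
sessions are matched on IP addresses alone (no ports or ICMP query
identifiers), checksums are not recomputed, and the names Session,
IcmpError, translate_inbound and translate_outbound are hypothetical
and do not come from this document or from any NAT implementation.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Session:
    private_ip: str  # Host-y, as seen in the private realm
    mapped_ip: str   # Host-y', as seen in the external realm
    remote_ip: str   # Host-x


@dataclass(frozen=True)
class IcmpError:
    outer_src: str   # node that generated the ICMP error
    outer_dst: str   # node the ICMP error is addressed to
    inner_src: str   # source of the embedded (offending) packet
    inner_dst: str   # destination of the embedded packet


def translate_inbound(err: IcmpError,
                      sessions: List[Session]) -> Optional[IcmpError]:
    """ICMP error arriving from the external realm (Router-x or Host-x)."""
    for s in sessions:
        # The embedded packet travelled as Host-y' -> Host-x.
        if err.inner_src == s.mapped_ip and err.inner_dst == s.remote_ip:
            # Outer source stays as is; Host-y' becomes Host-y in both
            # the outer header and the embedded packet.
            return replace(err, outer_dst=s.private_ip,
                           inner_src=s.private_ip)
    return None  # no session: the error cannot be forwarded


def translate_outbound(err: IcmpError, sessions: List[Session],
                       nat_external_ip: str) -> Optional[IcmpError]:
    """ICMP error arriving from the private realm (Router-y or Host-y)."""
    for s in sessions:
        # The embedded packet was delivered as Host-x -> Host-y.
        if err.inner_src == s.remote_ip and err.inner_dst == s.private_ip:
            if err.outer_src == s.private_ip:
                outer_src = s.mapped_ip      # generated by Host-y
            else:
                outer_src = nat_external_ip  # intermediate node, no binding
            return replace(err, outer_src=outer_src, inner_dst=s.mapped_ip)
    return None
```

In this sketch, an error from Router-y leaves the NAT with the NAT's
own external address as its source, while an error from Host-y
carries Host-y'. A real device would look up an Address Binding for
the intermediate node before falling back to its own address.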
Changes to an ICMP error message ([ICMP]) MUST include changes to the
IP and ICMP headers on the outer layer as well as changes to the
relevant IP and transport headers of the packet embedded within the
ICMP error message payload. Section 4.3 of RFC 3022 describes the
various items within the ICMP error message that MUST be translated
by the NAT device.

REQ-8: A NAT device MUST transparently forward the ICMP error
messages it receives from the intermediate or end nodes in either
realm to the intended endnode. The NAT device MUST use the packet
embedded within the ICMP error message payload as the basis to
translate not only the embedded payload, but also the IP and ICMP
headers in the outer layer. In the case the ICMP error packet is
generated by an intermediate node for which the NAT has no Binding
translation, the NAT device MUST use its own IP address in the realm
of the recipient node to translate the intermediate node IP address.

3.9. Rejection of IP packets not permitted by NAT

Unlike a router, a NAT device is session oriented and permits
sessions from/to specific endpoints in either the external or
internal realm based on how the NAT device is configured with
Address/Port Maps. For example, a TCP packet is not permitted across
a NAT device unless the specific TCP session is already in progress
and known to the NAT device. Further, inbound sessions are not
permitted into a traditional NAT device by default. In addition, a
new session may not be able to transit a NAT device because the NAT
device has run out of addresses in the address pool or ports in the
TCP/UDP port pool, or because of an administrative policy.

In each of these scenarios, where a NAT device prohibits an inbound
packet from traversing it for resource/authorization considerations,
the NAT device SHOULD NOT simply drop the packet silently.
Instead, the NAT device SHOULD send an ICMP destination unreachable
message, with a code of 10 (Communication with destination host
administratively prohibited), to the sender prior to dropping the
packet. Unfortunately, no other ICMP code is currently defined to
indicate "Communication with destination host port administratively
prohibited". So, the same code should be used for host as well as
port filtering. Lastly, it is also advisable for the NAT device to
log the error or record the event in a statistic counter.

REQ-9: When an inbound packet is prohibited by a NAT device due to
resource/authorization considerations, the NAT device SHOULD send
an ICMP destination unreachable message, with a code of 10
(Communication with destination host administratively prohibited),
to the sender prior to dropping the packet.

3.10. ALG support

Strictly speaking, NAT devices are not required to include ALGs.
However, the vast majority of the NAT devices in deployment do
support Application Level Gateways (ALGs) for FTP and DNS
applications. The ALG for FTP is discussed briefly in RFC 3022 and
RFC 2766. The ALG for DNS is described in detail in [DNS-ALG]. Given
that the majority of applications assume these ALGs to be part of
NAT devices, and the majority of NAT devices support them anyway as
an integral part, we recommend that NAT devices support the ALGs for
these two applications by default.

REQ-10: A NAT device SHOULD support ALGs for FTP and DNS, so as to
enable traversal of these applications across NAT. In addition, NAT
devices SHOULD explicitly identify the ALGs supported, and make ALG
support configurable (enable/disable).

3.11. Denial-of-Service Protection

Since NAT devices are Internet hosts, they can be the target of a
number of different attacks. NAT devices should employ the same sort
of protection techniques as Internet-based servers do.
For example, storing incomplete IP packet fragments unfortunately
creates a well-known vulnerability to denial-of-service attacks,
against which NATs should protect themselves. NATs typically do so by
limiting the length of time they retain an incomplete IP packet
before discarding it, or alternatively by limiting the amount of
internal buffer space such incomplete IP packets may consume before
the oldest fragments are discarded. The appropriate values of these
limits vary depending on the size and purpose of the NAT, however,
and therefore are left to be determined by the NAT designer or
network administrator. Further, the NAT device should impose a rate
limit on the ICMP error messages it generates for whatever reason.

REQ-11: A NAT device SHOULD protect itself against Denial of Service
attacks arising out of fragment processing and generating ICMP error
responses to unauthorized packets.

4. Hints to implementers

The following subsections provide hints to implementers on how to go
about the requirements outlined in the previous section. Note, these
are merely hints, not requirements. Implementers may choose to ignore
the hints.

4.1. Inbound fragmented packet processing

Large IP datagrams (sometimes, even small IP datagrams) may arrive
as fragmented IP packets into a NAT device. The tuples of (SrcIP,
DestIP, Fragment Id, Protocol) will be unique for these packets.
However, the fragments pertaining to any of the tuples may not arrive
in order (i.e., first fragment, followed by subsequent fragments in
increasing offset). Further, due to a problem in some host Operating
System IP stacks, the fragments originating from the host may have
overlapping payload segments.
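As a rough illustration of these conditions, the Python sketch below
tracks fragments by the (SrcIP, DestIP, Fragment Id, Protocol) tuple,
tolerates out-of-order arrival, flags overlapping payload, and applies
the retention-time and buffer-count limits discussed in section 3.11.
The class name FragmentTracker, the return strings, and the default
limits (30 seconds, 1024 datagrams) are assumptions made purely for
this example. Offsets are taken to be byte offsets.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str, int, int]  # (SrcIP, DestIP, Fragment Id, Protocol)


@dataclass
class FragmentState:
    first_seen: float
    ranges: List[Tuple[int, int]] = field(default_factory=list)
    total_len: Optional[int] = None  # known once the last fragment arrives


class FragmentTracker:
    def __init__(self, timeout: float = 30.0, max_datagrams: int = 1024):
        self.timeout = timeout              # assumed limit, in seconds
        self.max_datagrams = max_datagrams  # assumed buffer limit
        self.states: Dict[Key, FragmentState] = {}

    def add(self, key: Key, offset: int, length: int,
            more_fragments: bool, now: float) -> str:
        """Return 'pending', 'complete', 'overlap' or 'dropped'."""
        self._expire(now)
        st = self.states.get(key)
        if st is None:
            if len(self.states) >= self.max_datagrams:
                return "dropped"  # buffer limit reached
            st = self.states[key] = FragmentState(first_seen=now)
        start, end = offset, offset + length
        # Reject a fragment whose payload overlaps one already held.
        if any(start < e and s < end for s, e in st.ranges):
            return "overlap"
        st.ranges.append((start, end))
        if not more_fragments:
            st.total_len = end
        if st.total_len is not None and self._covered(st):
            del self.states[key]
            return "complete"
        return "pending"

    @staticmethod
    def _covered(st: FragmentState) -> bool:
        # Ranges never overlap, so sorted ranges must simply be contiguous.
        pos = 0
        for s, e in sorted(st.ranges):
            if s != pos:
                return False
            pos = e
        return pos == st.total_len

    def _expire(self, now: float) -> None:
        # Discard incomplete datagrams held longer than the timeout.
        stale = [k for k, st in self.states.items()
                 if now - st.first_seen > self.timeout]
        for k in stale:
            del self.states[k]
```

The sketch only records which byte ranges have arrived. A device that
fully reassembles datagrams would also need to store the payload.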
In order to meet REQ-3 in all of these scenarios, NAT devices often
fully re-assemble the incoming fragments into a complete IP datagram
first and use the assembled datagram to look up the NAT Session
table. However, there often are limitations on how long a state can
be retained, the size of the largest assembled IP datagram a NAT
device can support, and how many simultaneous states it can retain at
once. Implementers adopting the fragment reassembly approach should
refer to Section 3.3.2 of RFC 1122 for the various design options
they might need to consider.

4.2. Port reservation

A NAPT device often shares the source ports for its public IP address
with nodes in the private realm. In order for the NAT device to do
any of its own TCP/UDP session initiations, it MUST ensure that the
ports it uses for itself are not shared with private nodes. This may
be accomplished either through explicit address/port mapping for NAT
use at configuration time, or by reserving a block of ports
explicitly within the device for local use vs. NAT use. Reserving
port blocks explicitly for local use vs. NAT use is valuable for
several reasons.

Consider the following scenario. Say, an application on the NAPT
runs on port 5060 (SIP Server), but is not enabled. A host in the
private domain uses 5060 at this time and, say, gets the port 5060
from the NAT device. While this Port Binding is active, say, the
application running on the NAT is activated. Several things can go
wrong now, depending on the implementation.

1. The application is totally unaware of NAT's existence (maybe
   because NAT never does a bind on the ports it is using). So it
   starts using 5060, and the subsequent packets directed to this
   server could end up in the end host within the private domain, or
   the packets meant for the application on the end host could end up
   in the NAT box's TCP/UDP stack.
   Both are bad and can cause unpredictable behavior.

2. The application on the NAT box is aware that someone is using
   5060, so the Bind fails and the application fails to come up. The
   administrator would have to clear the NAT session and restart the
   application.

3. The application on the NAT box is intelligent enough to tell NAT
   to clean up any sessions on the ports that it plans to use, and
   NAT cleans up its session(s). The application on the end host is
   affected as a result.

Clearly, there can be unpredictable behavior when ports are not
reserved explicitly for local use vs. NAT use.

4.3. DHCP Configured NATs

Many residential NAT devices act as a DHCP client on the external
interface and as a DHCP server on the internal interface. When doing
so, there is a possibility that the IP subnets on the external and
internal interfaces could overlap, especially in the case of a
Multi-level NAT setup.

One way to avoid problems due to private IP address conflicts is for
the NAT to support multiple RFC 1918 address ranges for its private
network. The NAT's DHCP server might, for example, hand out IP
addresses in the 10.0.0.0/24 range to downstream hosts by default as
long as its own DHCP-assigned "external" IP address is not in this
region, and otherwise hand out addresses in the 172.16.0.0/12 private
region.

5. Summary of Requirements

This section summarizes the requirements specified and discussed at
length in the preceding sections.

A NAT that supports all of the mandatory requirements of this
specification (the "MUST" requirements) is "compliant with this
specification." A NAT that supports all of the requirements of this
specification, including the optional "RECOMMENDED" requirements, is
"fully compliant with all the mandatory and recommended requirements
of this specification."

REQ-1: A NAT MUST support the traversal of TCP based applications and
unicast UDP based applications.
REQ-2: A NAT device MUST support Address and/or Port Bindings.
Specifically, Symmetric NAT type behavior MUST be deprecated.

REQ-3: A NAT device MUST be able to process (i.e., receive,
translate, and forward) all fragments of an IP datagram, whether they
arrive in order or out of order. A NAT device MUST process IP
fragments that assemble to a maximum of 8300 byte IP datagrams.

REQ-4: When fragmenting packets on the outbound, a NAT device SHOULD
ensure that the tuples of (SrcIP, DestIP, fragment Id, Protocol) are
unique across all outgoing packets. This requirement pertains
specifically to NAPT devices.

REQ-5: All NATs MUST support hairpin NAT translation.

REQ-6: A NAT device whose external IP interface can be configured via
DHCP MUST operate correctly even if its external interface's IP
address and subnet configuration numerically conflicts with the IP
address and subnet configuration of the NAT's internal interface(s).

REQ-7: If DF bit is set on an inbound IP packet and the NAT device
cannot forward the packet without fragmentation, the NAT device MUST
send a destination unreachable ICMP message (ICMP type 3, Code 4)
with a suggested MTU back to the sender prior to dropping the IP
packet.

REQ-8: A NAT device MUST transparently forward the ICMP error
messages it receives from the intermediate or end nodes in either
realm to the intended endnode. The NAT device MUST use the packet
embedded within the ICMP error message payload as the basis to
translate not only the embedded payload, but also the IP and ICMP
headers in the outer layer. In the case the ICMP error packet is
generated by an intermediate node for which the NAT has no Binding
translation, the NAT device MUST use its own IP address in the realm
of the recipient node to translate the intermediate node IP address.
REQ-9: When an inbound packet is prohibited by a NAT device due to
resource/authorization considerations, the NAT device SHOULD send
ICMP destination unreachable message, with a code of 10
(Communication with destination host administratively prohibited) to
the sender prior to dropping the packet.

REQ-10: A NAT device SHOULD support ALGs for FTP and DNS, so as to
enable traversal of these applications across NAT. In addition, NAT
devices SHOULD explicitly identify the ALGs supported, and make ALG
support configurable (enable/disable).

REQ-11: A NAT device SHOULD protect itself against Denial of Service
attacks arising out of fragment processing and generating ICMP error
responses to unauthorized packets.

6. Security Considerations

None yet.

Normative References

[ARP] D. C. Plummer, "An Ethernet Address Resolution Protocol",
   RFC 826, November 1982.

[KEYWORDS] S. Bradner, "Key words for use in RFCs to Indicate
   Requirement Levels", RFC 2119, March 1997.

[NAT-APPL] D. Senie, "Network Address Translator (NAT)-Friendly
   Application Design Guidelines", RFC 3235, January 2002.

[NAT-MIB] R. Rohit, P. Srisuresh, R. Raghunarayan, N. Pai, and
   C. Wang, "Definitions of Managed Objects for Network Address
   Translators (NAT)", RFC 4008, February 2005.

[NAT-PROT] M. Holdrege and P. Srisuresh, "Protocol Complications
   with the IP Network Address Translator", RFC 3027, January 2001.

[NAT-PT] G. Tsirtsis and P. Srisuresh, "Network Address Translation -
   Protocol Translation (NAT-PT)", RFC 2766, February 2000.

[NAT-TERM] P. Srisuresh and M. Holdrege, "IP Network Address
   Translator (NAT) Terminology and Considerations", RFC 2663,
   August 1999.

[NAT-TRAD] P. Srisuresh and K. Egevang, "Traditional IP Network
   Address Translator (Traditional NAT)", RFC 3022, January 2001.

[PRIV-ADDR] Y. Rekhter, B. Moskowitz, D. Karrenberg, G. J. de Groot,
   and E. Lear, "Address Allocation for Private Internets", RFC 1918,
   February 1996.

[RFC 1122] R. Braden, "Requirements for Internet Hosts --
   Communication Layers", STD 3, RFC 1122, October 1989.

[RFC 1123] R. Braden, "Requirements for Internet Hosts -- Application
   and Support", STD 3, RFC 1123, October 1989.

[RFC 1812] F. Baker, "Requirements for IP Version 4 Routers",
   RFC 1812, June 1995.

[FTP] J. Postel and J. Reynolds, "FILE TRANSFER PROTOCOL (FTP)",
   STD 9, RFC 959, October 1985.

[ICMP] J. Postel, "INTERNET CONTROL MESSAGE (ICMP) SPECIFICATION",
   STD 5, RFC 792, September 1981.

[DNS-ALG] P. Srisuresh, G. Tsirtsis, P. Akkiraju, and A. Heffernan,
   "DNS extensions to Network Address Translators (DNS_ALG)",
   RFC 2694, September 1999.

[FTP-EXT] M. Allman, S. Ostermann, and C. Metz, "FTP Extensions for
   IPv6 and NATs", RFC 2428, September 1998.

Informative References

[BEH-APP] B. Ford, P. Srisuresh, and D. Kegel, "Application Design
   Guidelines for Traversal of Network Address Translators",
   Internet-Draft (Work In Progress), February 2005.

[BEH-IGMP] D. Wing, "IGMP Proxy Behavior", Internet-Draft (Work In
   Progress), October 2004.

[BEH-STATE] P. Srisuresh, B. Ford, and D. Kegel, "State of
   Peer-to-Peer (P2P) communication across Network Address
   Translators (NATs)", Internet-Draft (Work In Progress),
   December 2004.

[BEH-TCP] P. Srisuresh, S. Sivakumar, K. Biswas, and B. Ford, "NAT
   Behavioral Requirements for TCP", Internet-Draft (Work In
   Progress), January 2005.

[BEH-TOP] B. Ford and P. Srisuresh, "Topological Complications from
   Network Address Translation (NAT-TOP)", Internet-Draft (Work In
   Progress), February 2005.

[BEH-UDP] F. Audet and C. Jennings, "NAT Behavioral Requirements for
   Unicast UDP", Internet-Draft (Work In Progress), January 2005.

[MIDCOM] P. Srisuresh, J. Kuthan, J. Rosenberg, A. Molitor, and
   A. Rayhan, "Middlebox communication architecture and framework",
   RFC 3303, August 2002.

[H.323] "Packet-based Multimedia Communications Systems", ITU-T
   Recommendation H.323, July 2003.

[RSIP] M. Borella, J. Lo, D. Grabelsky, and G. Montenegro, "Realm
   Specific IP: Framework", RFC 3102, October 2001.

[SIP] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston,
   J. Peterson, R. Sparks, M. Handley, and E. Schooler, "SIP: Session
   Initiation Protocol", RFC 3261, June 2002.

[SOCKS] M. Leech, M. Ganis, Y. Lee, R. Kuris, D. Koblas, and
   L. Jones, "SOCKS Protocol Version 5", RFC 1928, March 1996.

[STUN] J. Rosenberg, J. Weinberger, C. Huitema, and R. Mahy, "STUN -
   Simple Traversal of User Datagram Protocol (UDP) Through Network
   Address Translators (NATs)", RFC 3489, March 2003.

[TURN] J. Rosenberg, J. Weinberger, R. Mahy, and C. Huitema,
   "Traversal Using Relay NAT (TURN)", Internet-Draft (Work In
   Progress), March 2003.

[UNSAF] L. Daigle and IAB, "IAB Considerations for UNilateral
   Self-Address Fixing (UNSAF) Across Network Address Translation",
   RFC 3424, November 2002.

[UPNP] UPnP Forum, "Internet Gateway Device (IGD) Standardized Device
   Control Protocol V 1.0", November 2001.
   http://www.upnp.org/standardizeddcps/igd.asp

Authors' Addresses:

Bryan Ford
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
77 Massachusetts Ave.
Cambridge, MA 02139
U.S.A.
Phone: (617) 253-5261
E-mail: baford@mit.edu
Web: http://www.brynosaurus.com/

Pyda Srisuresh
Caymas Systems, Inc.
1179-A North McDowell Blvd.
Petaluma, CA 94954
U.S.A.
Phone: (707)283-5063
E-mail: srisuresh@yahoo.com

Senthil Sivakumar
Cisco Systems, Inc.
170 West Tasman Dr.
San Jose, CA 95134
U.S.A.
Phone:
Email: ssenthil@cisco.com

Copyright Statement

Copyright (C) The Internet Society (2005).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.