AnalogX

Bridging the Gap, NAT Traversal and Peer Communications



Introduction


    Before you read this, make sure to get the Powerpoint file linked at the bottom of this article and ideally read it in conjunction with this - this is really more of a transcription of the talk I gave at the European Game Developer Conference in December of 2007. At the end you will also find example source code for an application which does both TCP and UDP hole punching - please keep in mind that I tried to keep this as simple and small as possible while still illustrating everything that needs to happen.


Terms


    For convenience I'm going to define a few terms I'll be using throughout this paper - these aren't necessarily industry standards or anything like that, I just picked them for simplicity's sake, since they're how I typically describe a network.

Private:
This refers to the machine that is behind the NAT device, so in the case of your home network, your computer's IP address would be called your "Private IP".
Public:
Next down the list is your NAT box, but specifically the interface that's connected to the Internet - this is what I typically call your "Public IP".
Remote:
Since you are probably trying to connect to someone else, we need a way to differentiate them. So, any machine that is not inside your NAT network is considered a "Remote" host.
Rendezvous:
Finally, the last link in the NAT traversal chain is the server that helps everyone get talking. Typically this will be a server that you operate, but not necessarily (if using a public SIP server, etc). In this talk, it will be called the "Rendezvous" host.

NAT Overview


    Since this talk is all about getting through a NAT, let's talk a bit about what exactly a NAT device is, and what it does. Right now pretty much every home user on the Internet is using a NAT device of some sort. While a NAT device isn't strictly necessary, without one every connected computer needs its own unique public IP address. What a NAT device does is translate, in real time, between internally issued IPs and the one public Internet IP - ultimately allowing the user to have multiple computers share the same IP seamlessly.
    Another benefit of using a NAT device is that it gives some limited protection, similar to a firewall. By default, an inbound request is stopped by the NAT device: typically the packet is discarded, and it's difficult if not impossible to get any sense of what kind of network exists behind it.
    All of this does come at a price, and there is a slight performance penalty for using a NAT. We'll talk about this in more detail later on, but in most typical scenarios the user isn't affected.


NAT vs Firewall


    NATs and Firewalls are similar, but they're not the same - there are some key differences that distinguish them. Firewalls are much more common in corporate settings, whereas NAT devices are more typical in home deployments or small businesses. One of the reasons for this is that a NAT is trivial to configure; in many cases the end user merely plugs in the box and they're off and running. A firewall, on the other hand, normally has a very complex ruleset which gives it an amazing amount of flexibility, but at the cost of ease of use. Another contributing factor is that a firewall's main purpose (by default) is security, whereas a NAT's goal in life is allowing multiple computers to share a single IP. Finally, because of the security goals of a firewall, its default behavior is most commonly to deny any activity vs. a NAT device which normally has a default behavior to allow any activity.


NAT Flavors


    Now that you've got a better sense of the differences between NATs and firewalls, let's take a look at some of the different NAT flavors that exist.
    In the real world, 99.9% of what you'll encounter is what's called an Outbound NAT, and what that basically means is that sessions can only be initiated from a private IP - no connections can be initiated from the outside in.
    Next is a Bi-Directional NAT, and as the name implies, it's like an Outbound NAT but with one distinction - you guessed it, sessions now can be initiated from the outside in. It's worth noting that Bi-Directional NATs also generally require the use of DNS_ALG in order to operate.
    Twice NAT can sometimes appear to be a misconfigured Outbound NAT because of its key difference, that Remote and Private IPs can overlap.
    Finally there is the Multihomed NAT. This flavor is the same as an Outbound NAT, but instead of only having a single Internet link, it may have several (such as a DSL and a cable modem connection). If one of the links fails, it would automatically fail over to the second link - or it might also alternate between sending data over different links (load balancing).


NAT Performance Issues


    This functionality does come at a price, and it manifests in several different ways. At the lowest level, every IP packet that flows through the NAT box requires it to rewrite the IP header (source or destination address), and then recalculate the header checksum. If it's a TCP or UDP packet, then it usually needs to modify the packet further (source or destination port), and the transport checksum needs to be recalculated as well, since it covers the IP addresses in addition to the ports. Finally, the more active connections running on the system, the more memory and state tracking the device needs to perform. While none of these individually sounds like too burdensome a task, when combined and running on a very low speed processor, the results can be pronounced.
    It's also important to point out another performance issue you might encounter, and it has to do with what many routers call 'Gamer Mode'. This is really a hack that NAT device makers put in several years ago to allow games that weren't designed to work behind a NAT to still be playable on the Internet. To do this, they inspect the contents of the UDP packet for the four bytes that match the Private IP address of the machine sending the packet - if they find this, they then replace it with the four bytes of the Public IP address. The problem with this is obvious: if you happen to generate a packet that contains this pattern, it's never going to arrive at its destination without being corrupted by the router. Encryption is a simple way of getting around this issue, since a retransmit of the same packet (encrypted differently) will more than likely not have the byte pattern again.
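    To make the header rewrite concrete, here is a minimal Python sketch of what a NAT has to do to an outbound IPv4 header: swap the source address and recompute the header checksum. The function names (ipv4_checksum, rewrite_source) are made up for this illustration, and it assumes a plain 20-byte header with no options - a real device does this in firmware and also has to fix up the TCP or UDP checksum.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, inverted."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_source(header: bytes, new_src: bytes) -> bytes:
    """Replace the source address (bytes 12-15) and refresh the checksum (bytes 10-11)."""
    # Zero the checksum field first, because it is not part of its own sum.
    patched = header[:10] + b"\x00\x00" + new_src + header[16:]
    checksum = struct.pack("!H", ipv4_checksum(patched))
    return patched[:10] + checksum + patched[12:]
```

    The work per packet is small, but it is paid on every packet in both directions, which is why it adds up on a slow processor.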
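    The 'Gamer Mode' behavior described above can be imitated in a couple of lines. This is only an illustration of the description in this article, not code from any actual router, and gamer_mode_rewrite is a hypothetical name; the addresses are documentation examples.

```python
import socket


def gamer_mode_rewrite(payload: bytes, private_ip: str, public_ip: str) -> bytes:
    """Blindly replace the 4 bytes of the private IP wherever they occur in the payload."""
    return payload.replace(socket.inet_aton(private_ip), socket.inet_aton(public_ip))


# A payload that just happens to contain 0xC0 0xA8 0x01 0x0A (192.168.1.10)
# is changed in transit, whether or not those bytes were meant as an address.
packet = b"\x01" + bytes([192, 168, 1, 10]) + b"\x02"
mangled = gamer_mode_rewrite(packet, "192.168.1.10", "203.0.113.5")
```

    Any payload without that exact four byte run passes through untouched, which is why the problem shows up only occasionally and is hard to track down.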


Types of NATs


    NATs are further broken down into four different types: Full Cone, Restricted Cone, Port Restricted Cone and Symmetrical. These terms have to do with how they handle the mapping of a connection from a Private IP/port to a Public IP/port. It's important to point out that just because a router works one way with UDP does not mean that it necessarily will perform the same way with TCP - you need to analyze it to verify its behavior.
    There's also one more term called 'Hair-Pinning' that should be pointed out here as well. Hair-Pinning simply refers to what the NAT device does when it receives a packet from a Private host with the Public IP as its destination (effectively contacting itself). If the NAT device sends the packet back to the Private host, then it is considered to support Hair-Pinning - if it discards the packet, then it doesn't.


Full Cone


    Full Cone is the least restrictive type of NAT (of the four), and in operation is very similar to how port mapping or port triggering works in most routers. All requests from the same Private IP/port are mapped to the same Public IP/port (you'll notice that this is the same for all types except Symmetrical). The difference comes into play when determining who can use the map once it's been established. In this instance, any Remote host can contact the Private IP/port by contacting the mapped Public IP/port.
    Here's a simple example of how a hole punch would be performed on a Full Cone NAT (assuming a mapping had already been established):
Private host contacts Rendezvous server, reports available
Remote host contacts Rendezvous server, requests IP/port of Private host
Rendezvous server returns Public IP/port to Remote host
Remote host attempts to contact Public IP/port
Remote IP/port mapped to Private IP/port
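    The mapping behavior behind the steps above can be modeled with a small table. This is a toy model for illustration only: the class name FullConeNat, the starting port of 40000 and the sequential port allocation are all assumptions, not how any particular router works.

```python
class FullConeNat:
    """Toy model: one public port per private IP/port, usable by any remote host."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """Return the public IP/port used for traffic from this private IP/port."""
        key = (private_ip, private_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out[key])

    def inbound(self, public_port):
        """Full Cone: no check on who is sending, only whether a mapping exists."""
        return self.back.get(public_port)
```

    Note that inbound() never looks at the sender, which is exactly what makes this type the easiest to traverse.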


Restricted Cone


    Restricted Cone is slightly more restrictive than Full Cone (what a surprise with a name like that). Everything is almost the same as Full Cone, the only difference is that now only Remote hosts that the Private IP has contacted before can use the Public mapping.
    Here's a simple example of how a hole punch would be performed with this type of NAT - notice that now the Private host needs to be instructed to contact the Remote host as well:
Private host contacts Rendezvous server, reports available
Remote host contacts Rendezvous server, requests IP/port of Private host
Rendezvous server returns Public IP/port to Remote host
Rendezvous server directs Private host to contact Remote IP/port
Remote host attempts to contact Public IP/port
Private host attempts to contact Remote IP/port
Remote IP/port mapped to Private IP/port


Port Restricted Cone


    This is the most restrictive type of NAT that can still be trivially traversed, so more than likely your journey to punch through a NAT will end here. As Restricted Cone was to Full Cone, this is slightly more picky - have an idea what it's added to the mix? Something having to do with the Port perhaps? That's right, this is effectively the same as a Restricted Cone, except now it not only matters that the Private host has contacted the Remote host before, but the Remote host needs to be using the same Port that the Private host contacted.
    As with the other types, here's how this type would typically be traversed. In a real world development scenario you'll only be doing the Port Restricted type of traversal for UDP hole punching. Why? Because the process of doing a Port Restricted Cone contains all the steps necessary for any of the previous types. Here it goes:
Private host contacts Rendezvous server, reports available
Remote host contacts Rendezvous server, requests IP/port of Private host
Rendezvous server returns Public IP/port to Remote host
Rendezvous server directs Private host to contact Remote IP/port
Remote host attempts to contact Public IP/port
Private host attempts to contact Remote IP/port
Remote IP/port mapped to Private IP/port
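    The three cone types differ only in the check applied to an inbound packet, which can be written down as a single function. This is a sketch of the rules as described in this article; allows_inbound and the type names are hypothetical, and contacted stands for the set of Remote IP/port pairs the Private host has already sent to through the mapping.

```python
def allows_inbound(nat_type, contacted, remote):
    """Decide whether a packet from remote (ip, port) may use an existing mapping."""
    remote_ip, _remote_port = remote
    if nat_type == "full":
        return True  # anyone may use the mapping
    if nat_type == "restricted":
        # the private host must have contacted this IP, on any port
        return any(ip == remote_ip for ip, _port in contacted)
    if nat_type == "port_restricted":
        # the private host must have contacted this exact IP and port
        return remote in contacted
    raise ValueError("unknown NAT type: %r" % (nat_type,))
```

    Because the Port Restricted check is the strictest of the three, a traversal that makes the Private host send to the Remote host's exact IP/port first satisfies all of them.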


Symmetrical


    Symmetrical NATs are the most restrictive of all and for all practical purposes can't be traversed. They have all of the same limits that the previous types do, but add a new piece to the puzzle - the mapping is now tied to the destination as well, so requests from the same Private IP/port to the same Remote IP/port share one Public IP/port, while a request to any other Remote IP/port gets a different one. This means that if anything changes, it gets a new mapping, and the Public IP/port the Rendezvous server sees is not the one the Remote host would need to contact.
    There is research being done on traversing Symmetrical NATs. The main thrust is STUNT (originally NUTSS), and what it basically does is profile the NAT device and determine how the next map is being chosen, so that it can predict what the mapping would most likely be.
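    The idea of profiling a Symmetrical NAT can be shown with a toy model. This is not the STUNT algorithm itself, just a sketch under a strong assumption: the NAT hands out Public ports with a constant step. SymmetricNat and predict_next_port are hypothetical names, and a NAT that allocates ports randomly defeats this completely.

```python
class SymmetricNat:
    """Toy model: a new public port for every (private endpoint, remote endpoint) pair."""

    def __init__(self, public_ip, first_port=50000, step=1):
        self.public_ip = public_ip
        self.next_port = first_port
        self.step = step
        self.maps = {}  # (private, remote) -> public_port

    def outbound(self, private, remote):
        key = (private, remote)
        if key not in self.maps:
            self.maps[key] = self.next_port
            self.next_port += self.step
        return (self.public_ip, self.maps[key])


def predict_next_port(observed):
    """Guess the next public port, or return None if the step is not constant."""
    deltas = {b - a for a, b in zip(observed, observed[1:])}
    if len(deltas) != 1:
        return None
    return observed[-1] + deltas.pop()
```

    In practice the observations would come from contacting several different server ports in a row, and the prediction is only good until some other connection on the network takes the next port first.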


STUN


    STUN is a very cool, simple and super useful protocol that was developed to help negotiate this tangled web of NAT devices. What STUN does is contact a server, and through a series of UDP requests it identifies the Public IP address of the NAT as well as the type of NAT that it is. It's also possible with some additional testing to determine the lifetime of a mapping as well as whether or not Hair-Pinning is supported. Keep in mind that if you want all the bells and whistles the test could take upwards of 5 minutes, but it's only a trickle of packets so the user won't even see its effects.
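    For a feel of how small the protocol is, here is a sketch that builds a STUN Binding request and pulls the Public IP/port out of a response. It uses the message layout from RFC 5389, which is newer than the version of STUN described in this talk and only reports the mapped address (it does not classify the NAT type). The function names are hypothetical, only IPv4 is handled, and no server address is assumed - sending the request over UDP is left out.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte header: type, attribute length (0), magic cookie, 12-byte transaction ID."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_xor_mapped_address(message):
    """Return (ip, port) from a Binding response, or None if it isn't there."""
    if len(message) < 20:
        return None
    _msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, 20 + length
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[pos:pos + 4])
        value = message[pos + 4:pos + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            # Port and address are XORed with the magic cookie.
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None
```

    The address is XORed in the response for the same reason 'Gamer Mode' is a problem: some NAT devices rewrite anything in the payload that looks like an IP address.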


SIP


    SIP is the workhorse of the VOIP world, and is primarily tasked with brokering communications between clients and exchanging data amongst them. The protocol is fairly straightforward, being a derivative of HTTP, but it does have some differences so be on the lookout for them. Since it's so widespread, there are loads of public servers and 3rd party libraries available.


TURN


    TURN to my mind is really just like a SOCKS server designed to be running on the Internet somewhere, instead of on the edge of a Private network's transition to Public space. It's a relay server that is used in situations where clients are unable to establish direct connections to one another for some reason.


ICE


    ICE is more of an approach to handling the issues of NAT than a protocol per se - it builds on a variety of existing protocols such as SIP, STUN, TURN, etc and combines them into a methodology for traversing a NAT. ICE is designed to be transparent to the NAT, so unlike UPnP, which interacts with the NAT device to make connection establishment possible, ICE analyzes and infers the best way to make it possible. It's an IETF draft and constantly being updated, so if you're looking for a good source for the current thinking on how to handle NAT issues, it's excellent.


Connection Reversal


    Ok, so now we're on to the meat and potatoes of NAT traversal, how it's actually done. The first and absolute most basic approach is what is called Connection Reversal. The idea of how this works is very simple - if Remote host A wants to contact Private host B, it normally wouldn't be possible because we know Private host B is behind a NAT. Now, it is possible for Private host B to instead contact Remote host A - and this is connection reversal. Instead of the person initiating communications making the connection, the destination initiates the connection back.
    The big limitation here is that the Remote host needs to be publicly accessible - and this really curtails its effectiveness - but even with this limitation, it's a good first step, since it essentially gives you a second chance to do trivial traversal.


Relay


    Next is using a Relay, and really this should be considered an essential part of any NAT traversal network. The reason for this is that it allows the two peers to communicate in virtually any scenario, short of one where a firewall is restricting communications in some way. The big disadvantage to using a Relay (and I do mean big) is the cost of doing so - not only the money for the bandwidth that is now needlessly moving through your network, but the infrastructure necessary to allow large scaling.


Proxy


    Proxies have been around forever, but are by far more common in corporate settings than in end user deployments. Traversing through a proxy requires you to mimic an HTTP session - the typical way is to make a GET request on one connection and a POST request on another, then you send data on the POST link and get it back on the GET. It can be a bit of a pain to implement, and detecting whether authentication is used is another adventure, but if corporate environments are important to your network strategy (or high-bandwidth peers), then it's worth the time.


SOCKS


    SOCKS is very much like TURN and comes in a couple of flavors, 4, 4a, 5 - but for all practical purposes, SOCKS5 is the implementation to go for. It's a pretty simple and straight-forward protocol, and supports many desirable features - but there's a problem. It's not that common - I mean it's REALLY not that common, so even though it has many pluses, it may not be worth the time to implement.
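    To show how simple the protocol is, here are the first two messages a SOCKS5 client sends, following RFC 1928. This sketch covers only the no-authentication case with an IPv4 destination; the function names are hypothetical, and reading the server's replies is left out.

```python
import socket
import struct


def socks5_greeting():
    """Version 5, one auth method offered, method 0x00 (no authentication)."""
    return b"\x05\x01\x00"


def socks5_connect_request(ip, port):
    """Version 5, command 0x01 (CONNECT), reserved byte, address type 0x01 (IPv4)."""
    return b"\x05\x01\x00\x01" + socket.inet_aton(ip) + struct.pack("!H", port)
```

    The server answers the greeting with two bytes (version and chosen method), and the connect request with a reply in the same layout as the request, after which the socket carries the relayed data.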


Port Forwarding


    Now this is the be-all and end-all, short of the user just plugging their cable modem into the back of their computer. The problem with this is that it requires the user to add the rule to their router, and chances are they may have never even logged into it - and if they had, it was probably right around the time they bought it. To add to the confusion, every router implements things differently, some need to reboot before the changes take effect, etc etc. It's certainly worth trying to get users to do it, but in practice you're probably going to see a VERY (and I do mean VERY) small percentage who do.


Universal Plug and Play (UPnP)


    UPnP is Microsoft's (and a few others') answer to the problem of device enumeration and control. One of the manifestations of this is in controlling Gateway devices, and in particular creating temporary port mappings and tweaking other settings. UPnP is widely supported and pretty much every new NAT device that ships supports it to some degree, at the very least for port mapping and Public IP disclosure. Of all the techniques it is the most complicated, requiring a whole host of different protocols, and relying on manufacturers that spend more time shipping products than reading implementation guides.
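    The first of those protocols is SSDP, which is how a client finds the gateway in the first place. The sketch below only builds the discovery message that gets sent by UDP to the multicast address 239.255.255.250 on port 1900; ssdp_search_request is a hypothetical name, and the later steps (fetching the device description over HTTP and calling the port mapping action over SOAP) are where most of the complexity lives and are not shown.

```python
def ssdp_search_request(mx=2):
    """Build an SSDP M-SEARCH for an Internet Gateway Device."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        "HOST: 239.255.255.250:1900",
        'MAN: "ssdp:discover"',
        "MX: %d" % mx,  # maximum seconds a device may wait before answering
        "ST: urn:schemas-upnp-org:device:InternetGatewayDevice:1",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

    A gateway that supports UPnP replies with an HTTP-style response whose LOCATION header points at its device description.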


NAT-PMP


    Ah, NAT-PMP, I want to like you so badly - you're a simple protocol, easily implemented in an hour or two. You do everything I need you to, with no fuss or muss with anything else. But you're the brainchild of Apple, who didn't bother to get anyone else to support their standard. Oh, and then they left you off by default. Most users have no idea what this is and therefore you're almost guaranteed not to see this too often - but it's so easy to implement, it's worth spending the day to bang it out.
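    Here is roughly what that day of work looks like: the whole client side is a couple of fixed-size UDP packets sent to the gateway on port 5351. This sketch follows the packet layouts in RFC 6886 (version 0); the function names are hypothetical, and the sending, retransmission timing and the mapping response are left out.

```python
import socket
import struct

NAT_PMP_PORT = 5351  # UDP port on the gateway


def external_address_request():
    """Version 0, opcode 0: ask the gateway for its public IP."""
    return struct.pack("!BB", 0, 0)


def mapping_request(internal_port, external_port, lifetime=3600, udp=True):
    """Version 0, opcode 1 (UDP) or 2 (TCP), reserved, ports, lifetime in seconds."""
    opcode = 1 if udp else 2
    return struct.pack("!BBHHHI", 0, opcode, 0, internal_port, external_port, lifetime)


def parse_external_address(response):
    """Return the public IP from a response to opcode 0, or None on failure."""
    if len(response) < 12:
        return None
    _version, opcode, result, _epoch = struct.unpack("!BBHI", response[:8])
    if opcode != 128 or result != 0:  # responses use opcode 128 + request opcode
        return None
    return socket.inet_ntoa(response[8:12])
```

    A mapping request with a lifetime of 0 removes the mapping again, so cleanup uses the same packet.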


UDP Hole-Punch


    Now we're on to the big dog of NAT traversal - the undisputed reigning champ. This is supported by more than 80% of the NAT routers out there, and to make the proposition even sweeter, only one of the sides of the connection needs to support it to allow it to happen. It requires no user interaction to make it happen, and it will pretty much drop in to any UDP protocol.
    One important thing to keep in mind is that you are now entering territory where a Rendezvous server is necessary (the previous methods do not generally require one). Also, if you need reliable delivery, you're going to need to do all the TCP goodness yourself.
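    The core of the punch is just both sides sending to each other from the same socket they used to talk to the Rendezvous server. The sketch below runs both peers on the loopback interface so it can be tried without a NAT or a server; the function names are hypothetical, and the exchange of addresses that the Rendezvous server would normally perform is replaced by reading the socket addresses directly. It is separate from the example source that accompanies this article.

```python
import socket


def open_udp(host="127.0.0.1"):
    """One socket per peer; the same socket is used for sending and receiving."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))
    sock.settimeout(2.0)
    return sock


def punch(sock, peer_addr, attempts=3):
    """Send a few packets; behind a NAT the first ones open the mapping and may be dropped."""
    for _ in range(attempts):
        sock.sendto(b"punch", peer_addr)


def demo():
    a, b = open_udp(), open_udp()
    try:
        # In a real deployment each side learns the other's Public IP/port
        # from the Rendezvous server instead of from getsockname().
        punch(a, b.getsockname())
        punch(b, a.getsockname())
        data_a, sender_a = a.recvfrom(64)  # what A received, and from whom
        data_b, sender_b = b.recvfrom(64)
        return (data_a, sender_a == b.getsockname(),
                data_b, sender_b == a.getsockname())
    finally:
        a.close()
        b.close()
```

    Once each side has received a packet from the other, the same sockets carry the actual game or application traffic, usually with a periodic keep-alive so the mapping doesn't expire.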


TCP Hole-Punch


    While not as ubiquitous as UDP hole-punch, TCP is the next best thing. Just like UDP, TCP hole-punch only needs one of the two to be able to punch in order to establish the connection, and the user doesn't need to do anything to make the connection possible. Depending on how much time you have available to make your own reliable-UDP layer, this can be an excellent option.
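    The part of TCP hole punching that tends to surprise people is the socket setup: the outbound connection has to leave from the same local port that is also listening, so the NAT reuses one mapping for both. The sketch below shows only that setup, using SO_REUSEADDR (and SO_REUSEPORT where the platform has it). The function names are hypothetical, and whether two sockets may share a port this way depends on the operating system, so treat it as a starting point to test rather than a guarantee.

```python
import socket


def reusable_tcp_socket(local_addr):
    """TCP socket that is allowed to share its local IP/port with another socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if hasattr(socket, "SO_REUSEPORT"):  # not available on every platform
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(local_addr)
    return sock


def prepare_tcp_punch(host="127.0.0.1"):
    """Return (listener, connector), both bound to the same local IP/port."""
    listener = reusable_tcp_socket((host, 0))  # let the OS pick a free port
    local_addr = listener.getsockname()
    connector = reusable_tcp_socket(local_addr)
    listener.listen(1)
    return listener, connector
```

    From here each peer would call connect() on the connector toward the other side's Public IP/port while also accepting on the listener, and keep whichever connection comes up first.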


Multi-Traverse TCP


    The best solution to NAT traversal isn't just one method - ideally you use some combination of methods, testing before trying the next method to determine which is the best option. If you need TCP, then the following would give you the best shot without requiring development on methods that won't yield substantive results.
    One thing to consider, depending on how you make UDP reliable, it may not be worth it to attempt the TCP hole-punch, and instead just go straight into UDP.
Public
Port mapping
UPnP
NAT-PMP
TCP Hole-Punch
UDP Hole-Punch w/ Reliable UDP
Relay


Multi-Traverse UDP


    For UDP, the method is almost identical, the big difference is that there's no reason to perform TCP penetration, and obviously you don't need reliability. As always, the method of last resort is Relay to ensure that communications can be established even in the worst scenario.
Public
Port mapping
UPnP
NAT-PMP
UDP Hole-Punch
Relay
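    The two orderings above boil down to "try each method in turn and stop at the first one that works", which can be sketched as follows. The method names and the first_working function are hypothetical, and each probe stands in for whatever test you write for that method (returning True when a connection can be made that way).

```python
TCP_ORDER = ["public", "port_mapping", "upnp", "nat_pmp",
             "tcp_hole_punch", "udp_hole_punch_reliable", "relay"]
UDP_ORDER = ["public", "port_mapping", "upnp", "nat_pmp",
             "udp_hole_punch", "relay"]


def first_working(order, probes):
    """Return the name of the first method whose probe succeeds, or None."""
    for name in order:
        probe = probes.get(name)
        if probe is not None and probe():  # methods you haven't implemented are skipped
            return name
    return None


# Example: a client where UPnP fails but UDP hole punching works.
probes = {
    "upnp": lambda: False,
    "udp_hole_punch": lambda: True,
    "relay": lambda: True,
}
chosen = first_working(UDP_ORDER, probes)
```

    Keeping the order in a plain list makes it easy to drop the TCP hole punch step if your reliable UDP layer turns out to be good enough.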


Superpeers


    This is included just to give everyone a heads up about one of the techniques to control scaling in the P2P world. Depending on what you're planning on doing, and what can be handled by an untrusted source, it allows your network to scale beyond what typical infrastructure would allow. The key to a superpeer is that it needs to be accessible without the use of a Rendezvous - in this way you can disclose superpeers to other clients without needing to play a role in their further communications.


UDT


    There are several projects on the Net that are attempting to put more of a reliable face on UDP, and one of the more notable is the UDT project. This protocol is geared more towards data transfer than communications, but nothing in its design precludes it from such use. It's currently being actively developed on many platforms (with a focus on Unix), and is available in both source and library form. Unlike some of the other reliable UDP projects out there, UDT has an easy interface for experimenting with different windowing, flow control, etc.


SCTP


    SCTP is best described as what TCP should have been - it's a transport protocol in its own right, but it can also be run on top of UDP, which is how it can be paired with the holepunching methods detailed above. While it's not as advanced as UDT or as actively developed, it has been around much longer and has been used in more projects. The source is available from their website for most platforms, but if you're planning to make your own SCTP, be aware that it's more complicated than UDT.


Roll-Your-Own


    Ah yes, the game industry - why use someone else's wheel when there's an opportunity to make yet another wheel that's probably almost exactly the same. :) I'll confess I'm just as bad as (if not worse than) most people when it comes to NIH (not invented here) syndrome, but sometimes it just isn't worth it. If you have to do it, personally I would implement one of the previous protocols - but if you simply must do your own, have at it.
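    If you do go down this road, the receiving half of in-order delivery is one of the first pieces you'll write. Here is a minimal sketch of it, assuming every packet carries a sequence number starting at 0; ReorderBuffer is a hypothetical name, and acknowledgements, retransmission, sequence wrap-around and flow control (the hard parts) are all missing.

```python
class ReorderBuffer:
    """Hold out-of-order packets until the gap before them has been filled."""

    def __init__(self):
        self.expected = 0   # next sequence number to hand to the application
        self.pending = {}   # seq -> payload, received but not yet deliverable

    def receive(self, seq, payload):
        """Return the list of payloads that can now be delivered, in order."""
        if seq < self.expected or seq in self.pending:
            return []  # duplicate, already delivered or already waiting
        self.pending[seq] = payload
        ready = []
        while self.expected in self.pending:
            ready.append(self.pending.pop(self.expected))
            self.expected += 1
        return ready
```

    Notice that one lost packet holds up everything behind it, which is the same head-of-line blocking you get with TCP.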


Pairing reliable with unreliable


    Before you go any further, really take a look at your requirements and protocol plans, and determine if you really need reliable or in-order delivery. If you don't, or you can adapt your protocol to seamlessly handle it, it will save you headaches in the future.
    If you need reliable, consider pairing it with unreliable - this way you can just use plain old vanilla TCP for things you NEED to get there in the right order, and UDP for everything else. Since you're not using TCP for loads of stuff, things like going through a Relay are going to matter a whole lot less.


Conclusion


    I hope you found this article useful - keep in mind that this was written at 2am a few nights before I gave the talk so that I could give attendees a general transcript of the talk - so if I missed anything, please let me know! The goodies from the talk can be downloaded here:

Powerpoint slides from the talk at GDC (545kb PPT file)
Source code for simple client and server (59kb zip file)

    If you have any questions or would like clarification on anything that I've covered, please don't hesitate to contact me. Good luck!