The paper "Peer-to-Peer Communication Across Network Address Translators" does a good job of describing how hole punching works. Newcomers should read the UDP version of hole punching first, as it is much easier to understand. In general, the technique also requires an external server that is used only to establish the initial communication; the rest of the conversation between the nodes then happens without that "proxy".
The main difference from UDP hole punching is that TCP sockets usually have a one-to-one correspondence to TCP port numbers on the local host: after the application binds one socket to a particular local TCP port, attempts to bind a second socket to the same TCP port fail. For TCP hole punching to work, however, a single local TCP port has to listen for incoming TCP connections and initiate multiple outgoing TCP connections concurrently. The magic lies in a special TCP socket option, commonly named SO_REUSEADDR, which allows the application to bind multiple sockets to the same local endpoint as long as the option is set on all of them.
- This approach is mostly reliable; however, not all routers support it. Section 5 of the paper mentioned above lists the characteristics a router needs in order to be considered "P2P-friendly".
- This approach requires that the operating system supports SO_REUSEADDR. It is implemented in Linux, most BSD Unix variants, and every Windows version since Windows 2000 (https://docs.microsoft.com/en-us/windows/desktop/winsock/using-so-reuseaddr-and-so-exclusiveaddruse).
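
The port-sharing requirement described above can be sketched in Python. This is only an illustration under a few assumptions, not a full hole-punching implementation: the helper names `bind_reusable` and `open_punching_sockets` are hypothetical, the rendezvous server and the actual connection attempts are left out, and `SO_REUSEPORT` is set in addition to `SO_REUSEADDR` wherever the platform exposes it, because Linux and BSD-style systems generally need it before a second socket may bind to a port that is already listening.

```python
import socket


def bind_reusable(local_addr):
    """Create a TCP socket bound to local_addr with address reuse enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set before bind(), and on every socket sharing the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Assumption: where SO_REUSEPORT exists (Linux, BSD, macOS), set it as well.
    # Windows does not define it; SO_REUSEADDR alone is used there.
    if hasattr(socket, "SO_REUSEPORT"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(local_addr)
    return sock


def open_punching_sockets(host="0.0.0.0", local_port=0):
    """Return (listener, outgoing): two TCP sockets sharing one local port.

    The listener accepts incoming connections from the peer; the outgoing
    socket is meant for connect() calls to the rendezvous server or the peer.
    """
    listener = bind_reusable((host, local_port))
    listener.listen(1)
    # With local_port=0 the OS picks a free port; reuse whatever it chose.
    port = listener.getsockname()[1]
    outgoing = bind_reusable((host, port))
    return listener, outgoing


if __name__ == "__main__":
    listener, outgoing = open_punching_sockets()
    print("listener bound to", listener.getsockname())
    print("outgoing bound to", outgoing.getsockname())
    listener.close()
    outgoing.close()
```

Without the reuse options, the second `bind()` would normally fail with an "address already in use" error, which is exactly the one-to-one socket-to-port behaviour that TCP hole punching has to work around.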
TCP Hole Punching vs gRPC