Add tcpquic.md

SoniEx2 2019-02-09 15:40:38 -02:00
commit 6ef6ccecaa
1 changed files with 228 additions and 0 deletions


@ -0,0 +1,228 @@
# Connection Latency
Hi! Let's say you go to visit your favorite website, and it takes 3 seconds to show up. You run a speed test, and your
connection is doing fine at around 100 Mbps. So how come it took 3 seconds to show up? The answer is connection latency:
the time packets spend traveling back and forth, no matter how fast your connection is.
Let's say our network has 3 nodes:
```
your phone ---------> our router ---------> the server
           <---------            <---------
```
(This is a rather simplistic model; real networks are a lot more complicated. But it'll work for this demonstration.)
Let's say sending a packet between two adjacent nodes takes n seconds. This means that for a packet to go from you to
the server, it takes 2\*n seconds (n seconds to go from you to our router, then another n to go from our router to the
server). As such, we want to keep this n very low. Ideally, it'd be 0, but in practice we're limited by things like the
speed of light.
```
your phone ---------> our router ---------> the server
           <---------            <--DATA---
your phone ---------> our router ---------> the server
           <--DATA---            <---------
```
So you might be thinking, "it takes 3 seconds for the server to send me the page?! n is 1.5s?!" No, not quite.
Before the server can send you anything, it first needs to know what you want, so you need to send it a request first.
That's 2n for the request plus 2n for the response, so 4n = 3s and our n is no longer 1.5s but 0.75s instead.
```
your phone ---HTTP--> our router ---------> the server
           <---------            <---------
your phone ---------> our router ---HTTP--> the server
           <---------            <---------
your phone ---------> our router ---------> the server
           <---------            <--DATA---
your phone ---------> our router ---------> the server
           <--DATA---            <---------
```
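The arithmetic so far can be written down as a tiny sketch. The function name `per_hop_latency` is made up for this example, and it assumes the same toy model as above: every hop takes the same n, and only one packet is in flight at a time.

```python
def per_hop_latency(total_seconds: float, hops: int) -> float:
    """Return n, the time a single hop takes, given the total time and hop count."""
    if hops <= 0:
        raise ValueError("hops must be positive")
    return total_seconds / hops


# Response only: server -> router -> phone is 2 hops.
print(per_hop_latency(3.0, 2))  # 1.5

# Request plus response: 2 hops there, 2 hops back.
print(per_hop_latency(3.0, 4))  # 0.75
```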
But that's not all! You can't just ask the server to send you stuff and have it send you stuff! There are things like
spoofing that we need to be concerned about, otherwise we'd get massive DDoS amplification attacks! Instead, meet TCP:
## TCP Connection
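Transmission Control Protocol, or TCP, is the protocol that gets your data delivered reliably and in order. It's also what
keeps evil hackers from bringing down the internet with spoofed requests: before any data flows, it checks that both sides
can really hear each other by employing a 3-way handshake. So, how does it work? Well, first, you ask for a connection.
This is called a SYN in TCP: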
```
your phone ---SYN---> our router ---------> the server
           <---------            <---------
your phone ---------> our router ---SYN---> the server
           <---------            <---------
```
This lets the server know you want to send data.
When the server receives the SYN, it then tells you that it got the SYN, and asks *you* for a connection. This is called
a SYN-ACK:
```
your phone ---------> our router ---------> the server
           <---------            <-SYN-ACK-
your phone ---------> our router ---------> the server
           <-SYN-ACK-            <---------
```
This lets you know the server wants to send data, and acknowledges that you want to send data. But we're not quite done yet.
We still need to acknowledge that the server wants to send data. So, we send an ACK:
```
your phone ---ACK---> our router ---------> the server
           <---------            <---------
your phone ---------> our router ---ACK---> the server
           <---------            <---------
```
*Now you can get your data.*
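As a rough sketch, the whole exchange can be listed message by message and the hops counted. This is not how a real TCP stack works; it only mirrors the toy model, where every message crosses 2 hops and nothing overlaps. The names `EXCHANGE`, `total_hops` and `solve_n` are invented for this example.

```python
HOPS_PER_MESSAGE = 2  # phone <-> router <-> server in the toy model

# (sender, message), in the order the text describes them.
EXCHANGE = [
    ("phone", "SYN"),
    ("server", "SYN-ACK"),
    ("phone", "ACK"),
    ("phone", "HTTP request"),
    ("server", "DATA"),
]


def total_hops(exchange, hops_per_message=HOPS_PER_MESSAGE):
    """Count hops when each message must fully arrive before the next is sent."""
    return len(exchange) * hops_per_message


def solve_n(total_seconds, exchange):
    """Work out n from the observed total time."""
    return total_seconds / total_hops(exchange)


print(total_hops(EXCHANGE[:3]))  # 6  (the handshake alone)
print(total_hops(EXCHANGE))      # 10 (handshake + request + data)
print(solve_n(3.0, EXCHANGE))    # 0.3
```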
We took 6n to get a connection, and 2n to get our data... and another 2n to request the data. As such, 10n = 3s, or
n = 0.3s... So if we were to simply send a packet to the server and get a similar packet back, it'd take about 1.2s. However,
we're not quite done yet. Before your phone can talk to the server, it needs to know where the server is. When you type an
address into the browser's address bar, that's only the name of the server - we need instructions to get the packets there.
This is where DNS comes in:
## DNS Queries
Domain Name System, or DNS, is the protocol that takes a domain name and converts it into an IP address - the latter is
the address the network uses to work out how to get the packets to the destination.
Thankfully, DNS answers are usually cached in the router. Additionally, DNS lookups normally use UDP rather than TCP, so
there's no 3-way handshake.
```
your phone ---NAME--> our router ---------> the server
           <---------            <---------
your phone ---------> our router ---------> the server
           <---IP----            <---------
```
If the router doesn't know a name, it has to ask another DNS server about it. However, for any given name this generally
only happens once every few hours, so it's not something we have to worry about.
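Here's a minimal sketch of what such a cache in the router might look like. Everything in it is an assumption made for illustration: the `DnsCache` class, the 3-hour lifetime, and the `upstream` callable standing in for "ask someone else". Real resolvers use the lifetime that comes with each DNS answer.

```python
import time


class DnsCache:
    """Toy name -> IP cache with a time-to-live, as a router might keep."""

    def __init__(self, upstream, ttl_seconds=3 * 60 * 60, clock=time.monotonic):
        self._upstream = upstream   # hypothetical callable: name -> IP string
        self._ttl = ttl_seconds     # assumed lifetime of an answer
        self._clock = clock
        self._entries = {}          # name -> (ip, expiry time)
        self.upstream_queries = 0

    def resolve(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]        # answered by the router itself
        self.upstream_queries += 1  # cache miss: we have to ask elsewhere
        ip = self._upstream(name)
        self._entries[name] = (ip, now + self._ttl)
        return ip


# Usage with a made-up upstream table (192.0.2.0/24 is reserved for documentation).
fake_upstream = {"example.com": "192.0.2.10"}.__getitem__
cache = DnsCache(fake_upstream)
cache.resolve("example.com")
cache.resolve("example.com")
print(cache.upstream_queries)  # 1
```

Only the first lookup leaves the router; the second one is the cheap 2n case from the diagram.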
This adds another 2n to our time. We're up to 12n = 3s, or n = 0.25s. It only takes 1 second to send a packet to the server
and get it back! TCP is awful! ... Not so fast, tho. You might've noticed that the network is busy with only one packet at a
time. Maybe we can do something to improve this. Okay, we can't improve the DNS query, as it's required to happen before we
can do anything. But can we improve the TCP? What if we terminate the TCP at the router?
## Terminating the TCP at the router
While not strictly allowed by the internet specifications, it's not strictly disallowed either. If implemented, our flow can
look like this:
```
your phone ---NAME--> our router ---------> the server
           <---------            <---------
your phone ---------> our router ---------> the server
           <---IP----            <---------
your phone ---SYN---> our router ---------> the server
           <---------            <---------
your phone ---------> our router ---SYN---> the server
           <-SYN-ACK-            <---------
your phone ---ACK---> our router ---------> the server
           <---------            <-SYN-ACK-
your phone ---HTTP--> our router ---ACK---> the server
           <---------            <---------
your phone ---------> our router ---HTTP--> the server
           <---------            <---------
your phone ---------> our router ---------> the server
           <---------            <--DATA---
your phone ---------> our router ---------> the server
           <--DATA---            <---------
```
We're down from 12n to only 9n! With our n = 0.25s, we've shaved off 0.75s from our original 3s! This is a noticeable
improvement.
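To double-check the 12n and 9n counts, here's a small sketch that treats every single hop as a step taking 1n and lists which earlier hops it has to wait for. The total time is then the longest chain of waits. The dictionaries are read off the diagrams; the names (`finish_time`, `P>R` for "phone to router", and so on) are made up for this example.

```python
def finish_time(hops):
    """hops maps a hop name to the hops it must wait for. Each hop takes 1n.

    Returns how many n pass until the last hop has arrived.
    """
    done_at = {}

    def arrival(name):
        if name not in done_at:
            start = max((arrival(dep) for dep in hops[name]), default=0)
            done_at[name] = start + 1
        return done_at[name]

    return max(arrival(name) for name in hops)


# P = your phone, R = our router, S = the server.
END_TO_END = {
    "NAME P>R": [],
    "IP R>P": ["NAME P>R"],
    "SYN P>R": ["IP R>P"],
    "SYN R>S": ["SYN P>R"],
    "SYN-ACK S>R": ["SYN R>S"],
    "SYN-ACK R>P": ["SYN-ACK S>R"],
    "ACK P>R": ["SYN-ACK R>P"],
    "ACK R>S": ["ACK P>R"],
    "HTTP P>R": ["ACK R>S"],
    "HTTP R>S": ["HTTP P>R"],
    "DATA S>R": ["HTTP R>S"],
    "DATA R>P": ["DATA S>R"],
}

ROUTER_TERMINATED = {
    "NAME P>R": [],
    "IP R>P": ["NAME P>R"],
    "SYN P>R": ["IP R>P"],
    "SYN-ACK R>P": ["SYN P>R"],   # the router answers right away
    "SYN R>S": ["SYN P>R"],       # ...while opening its own connection
    "ACK P>R": ["SYN-ACK R>P"],
    "SYN-ACK S>R": ["SYN R>S"],
    "HTTP P>R": ["ACK P>R"],
    "ACK R>S": ["SYN-ACK S>R"],
    "HTTP R>S": ["HTTP P>R", "ACK R>S"],
    "DATA S>R": ["HTTP R>S"],
    "DATA R>P": ["DATA S>R"],
}

print(finish_time(END_TO_END))         # 12
print(finish_time(ROUTER_TERMINATED))  # 9
```

Both flows send the same 12 hops; the router-terminated one is faster only because several of them happen at the same time.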
Additionally, you can have both an ACK and an HTTP request in transit at the same time. This shaves off 1n from both our
original 12n and our 9n, so we have 11n = 3s (n ≈ 0.27s) for the original and 8n ≈ 2.18s for the optimized flow, an
improvement of approximately 0.82s. So it's even slightly better.
However, you might've noticed I've been talking about `HTTP` so far.
HTTPS, on the other hand, also has its own handshake after TCP's. I don't wanna get into this, because you can probably see
how ridiculous it's getting by now. This handshake can also be partially terminated by the router, so *it* can also be
optimized slightly, and we can shave off more n's.
But let's look at QUIC real quick:
## QUIC
(QUIC originally stood for "Quick UDP Internet Connections", though these days it's treated as just a name.)
QUIC is a protocol that does something similar to TCP, with one major difference: it uses UDP.
User Datagram Protocol, or UDP, is the same protocol DNS uses (see above). UDP has no handshake of its own, so QUIC
implements its own handshake on top of it. This means QUIC is basically like TCP, but it comes with a serious caveat:
being UDP-based, it DOESN'T benefit from our TCP optimization from earlier!
As such, going QUIC over existing networks has one serious drawback: it adds back those 3n that we were able to shave off!
And if we optimize for QUIC in addition to TCP, we still only manage to shave off those 3n again.
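To see what "no handshake" means in practice, here's a small sketch that sends a UDP packet over the loopback interface and gets a reply, with no connection setup at all. It only shows plain UDP, not QUIC itself (QUIC's own handshake would sit on top of this). The function name `udp_round_trip` is made up, and "server" and "client" are just two sockets in the same process.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a local 'server' socket and return its reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.settimeout(2.0)
        client.settimeout(2.0)

        # No SYN, no SYN-ACK, no ACK: the very first packet already carries data.
        client.sendto(payload, server.getsockname())

        data, client_address = server.recvfrom(2048)
        server.sendto(data.upper(), client_address)

        reply, _ = client.recvfrom(2048)
        return reply
    finally:
        client.close()
        server.close()


print(udp_round_trip(b"ping"))  # b'PING'
```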
So, is there any room for improvement? Can we shave off more n's?
... Maybe. It would require some changes to the web. More specifically, what if the router could serve some of the content
directly, without ever reaching the server?
That's where we need to change the protocols slightly:
## Terminating "HTTP" at the router
Rather than terminating just TCP at the router, can we go one step further?
Can we create a protocol such that the great majority of the connections look more like this:
```
your phone ---NAME--> our router ---------> the server
           <---------            <---------
your phone ---------> our router ---------> the server
           <---IP----            <---------
your phone ---SYN---> our router ---------> the server
           <---------            <---------
your phone ---------> our router ---SYN---> the server
           <-SYN-ACK-            <---------
your phone ---ACK---> our router ---------> the server
           <---------            <-SYN-ACK-
your phone --NHTTP--> our router ---ACK---> the server
           <---------            <---------
your phone ---------> our router ---------> the server
           <--DATA---            <---------
```
(shave off another n if you combine the ACK and the NHTTP)
We just managed to shave off another 2n! While this requires extensive changes to the existing infrastructure, the load
times go from the 2.25s/2.18s from our "Terminating TCP at the router" to an even lower 1.75s/1.64s! The latter is almost
half the original 3s! However, this improvement doesn't apply as broadly as our "Terminating TCP at the router" and
"Partially terminating HTTPS at the router" - you want your private data to go encrypted all the way to the server, so
anything dealing with private data would be back to the original 3s/2.25s/2.18s depending on optimizations. This is okay
tho, as most data on the web - images, videos, HTML (page layout/behaviour), CSS (also page layout), JavaScript (also page
behaviour) - is generally not private. For example, your neighbor probably watches the same videos as you - thus the videos
are not private - but your bank statement is exclusive to you - and as such, private.
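To keep all these numbers straight, here's a small sketch that recomputes every load time from the hop counts in the diagrams. The scenario labels and the `load_times` function are made up for this example. Each pair holds the hop count with the ACK and the request sent one after the other, then with the two overlapped; n is worked out from the 3s baseline in each case, and results are rounded to the nearest hundredth of a second.

```python
BASELINE_SECONDS = 3.0  # the page load time assumed throughout the text

# (hops with separate ACK and request, hops with ACK and request overlapped)
SCENARIOS = {
    "end-to-end TCP": (12, 11),
    "TCP terminated at the router": (9, 8),
    "content served by the router": (7, 6),
}


def load_times(baseline_seconds=BASELINE_SECONDS):
    """Return {scenario: (seconds, seconds with ACK and request overlapped)}."""
    base_separate, base_overlapped = SCENARIOS["end-to-end TCP"]
    rows = {}
    for name, (separate, overlapped) in SCENARIOS.items():
        rows[name] = (
            round(separate * baseline_seconds / base_separate, 2),
            round(overlapped * baseline_seconds / base_overlapped, 2),
        )
    return rows


for name, (separate, overlapped) in load_times().items():
    print(f"{name}: {separate}s / {overlapped}s")
# end-to-end TCP: 3.0s / 3.0s
# TCP terminated at the router: 2.25s / 2.18s
# content served by the router: 1.75s / 1.64s
```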