TCP and related protocols
TCP
Below is a visual example of a TCP/IP packet and the information contained within it. Each section of the packet is filled with information that helps route the packet to its proper destination.
As the name implies, TCP/IP is a combination of two separate protocols: TCP (Transmission Control Protocol) and IP (Internet Protocol). The Internet Protocol standard dictates the logistics of packets sent out over networks; it tells packets where to go and how to get there. IP allows any computer on the Internet to forward a packet to another computer that's one or more hops closer to the packet's recipient. You can think of it like workers in a line passing boulders from a quarry to a mining cart.
The Transmission Control Protocol is responsible for ensuring the reliable transmission of data across Internet-connected networks. TCP checks packets for errors and submits requests for re-transmissions if any are found.
TCP three-way handshake
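# SYN is the handshake signal used to establish a connection. In TCP, the side that sends the first SYN is the client and the side that receives it is the server.
# When data from the sender reaches the receiver, the receiver returns a notification that it has arrived. This message is the acknowledgment, ACK.
# Suppose we have a client A and a server B and want to set up reliable data transfer.
        SYN(=j)              // SYN: A requests a connection
   A ------------> B
                   |
        ACK(=j+1)  |         // ACK: B acknowledges A's SYN
        SYN(=k)    |         // SYN: B sends its own SYN
   A <-------------
   |
   |    ACK(=k+1)
   -------------> B          // ACK: A acknowledges B's packet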
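- The client sends a SYN packet (seq = j) to the server and enters the SYN_SENT state, waiting for the server's acknowledgment.
- The server receives the SYN and must acknowledge it (ACK = j + 1) while also sending a SYN of its own (seq = k), i.e. a SYN+ACK packet. The server then enters the SYN_RECV state.
- The client receives the server's SYN+ACK and replies with an ACK (ACK = k + 1). Once this packet has been sent, the client and the server enter the ESTABLISHED state and the three-way handshake is complete.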
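The sequence and acknowledgment numbers in the three steps above can be sketched in a few lines of Python. This is only a model of the numbering, not a real TCP stack; the `Segment` type and the `three_way_handshake` function are names made up for this illustration.

```python
from typing import List, NamedTuple, Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


class Segment(NamedTuple):
    """Hypothetical, simplified view of a TCP segment (illustration only)."""
    sender: str
    flags: str
    seq: int
    ack: Optional[int]


def three_way_handshake(j: int, k: int) -> List[Segment]:
    """Return the three segments exchanged when client A connects to server B.

    j is A's initial sequence number, k is B's. A SYN occupies one
    sequence number, so each side acknowledges the peer's ISN + 1.
    """
    return [
        Segment("A", "SYN", j, None),                     # A -> B
        Segment("B", "SYN+ACK", k, (j + 1) % SEQ_SPACE),  # B -> A
        Segment("A", "ACK", (j + 1) % SEQ_SPACE, (k + 1) % SEQ_SPACE),  # A -> B
    ]


if __name__ == "__main__":
    for segment in three_way_handshake(j=100, k=300):
        print(segment)
```

Calling `three_way_handshake(100, 300)` yields acknowledgment numbers 101 and 301, matching ACK = j + 1 and ACK = k + 1 in the list above.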
TCP four-way teardown
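- The active closer sends a FIN to shut down data transfer from itself to the passive closer. In other words, it tells the passive closer: I will not send you any more data. (Data sent before the FIN that has not yet been acknowledged will still be retransmitted by the active closer.) At this point the active closer can still receive data.
- On receiving the FIN, the passive closer sends an ACK whose acknowledgment number is the received sequence number + 1 (as with SYN, a FIN occupies one sequence number).
- The passive closer sends a FIN of its own to shut down data transfer in the other direction, telling the active closer: I have finished sending too and will not send you any more data.
- On receiving that FIN, the active closer sends an ACK to the passive closer with acknowledgment number equal to the received sequence number + 1. This completes the four-way teardown.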
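The half-closed state in the first step (the active closer has sent its FIN but can still receive) can be observed with ordinary sockets. The sketch below is a minimal demo over loopback, assuming Python 3.8+ for `socket.create_server`; the function name `half_close_demo` and the payloads are invented for the example. `shutdown(SHUT_WR)` is what asks the kernel to send the FIN.

```python
import socket


def half_close_demo():
    """Show that the active closer can still receive after sending its FIN."""
    # Loopback listener on an OS-chosen port (demo assumption).
    with socket.create_server(("127.0.0.1", 0)) as listener:
        active = socket.create_connection(listener.getsockname(), timeout=5)
        passive, _ = listener.accept()
        passive.settimeout(5)

    with active, passive:
        active.sendall(b"last request")
        active.shutdown(socket.SHUT_WR)    # active closer: "no more data from me"

        request = passive.recv(1024)       # data sent before the FIN still arrives
        eof = passive.recv(1024)           # b"" signals that the FIN was received

        passive.sendall(b"late reply")     # passive side may keep sending
        passive.shutdown(socket.SHUT_WR)   # passive closer sends its own FIN

        reply = active.recv(1024)          # active closer can still receive
        return request, eof, reply


if __name__ == "__main__":
    print(half_close_demo())
```

Running this alongside a Wireshark capture on the loopback interface should show the FIN and ACK segments in both directions.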
HTTP
HTTP (HyperText Transfer Protocol). Capturing packets with Wireshark is a good way to see the exchange in detail.
HTTP 1.0
Under HTTP 1.0, connections should always be closed by the server after sending the response.
Since late 1996, developers of popular products (browsers, web servers, etc.) using HTTP/1.0 started to add an unofficial extension to the protocol, named "keep-alive", in order to allow the reuse of a connection for multiple requests/responses.
If the client supports keep-alive, it adds an additional header to the request:
Connection: keep-alive
When the server receives this request and generates a response, if it supports keep-alive then it also adds the same above header to the response. Following this, the connection is not dropped, but is instead kept open. When the client sends another request, it uses the same connection.
This will continue until either the client or the server decides that the conversation is over; in that case they omit the "Connection:" header from the last message sent or, better, they add the keyword "close" to it:
Connection: close
After that the connection is closed following specified rules.
Since 1997, the various versions of HTTP/1.1 specifications acknowledged the usage of this unofficial extension and included a few caveats regarding the interoperability between HTTP/1.0 (keep-alive) and HTTP/1.1 clients / servers.
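In REST services with very high TPS/QPS, using short-lived connections (that is, without keep-alive) can easily exhaust the client's local ports, because every closed connection keeps its port tied up in TIME_WAIT for a while.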
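A rough back-of-the-envelope estimate shows why the ports run out. The defaults below are assumptions: the common Linux ephemeral port range (32768 to 60999) and a 60-second TIME_WAIT. Both vary by system and configuration, and the bound applies to one client IP talking to one server IP and port. The function name is made up for this sketch.

```python
def max_short_connection_rate(port_low=32768, port_high=60999,
                              time_wait_seconds=60.0):
    """Upper bound on new short-lived connections per second.

    Each closed connection keeps its local port in TIME_WAIT for
    time_wait_seconds, so at most (number of ports) connections can be
    opened per TIME_WAIT window. All defaults are assumptions.
    """
    usable_ports = port_high - port_low + 1
    return usable_ports / time_wait_seconds


if __name__ == "__main__":
    rate = max_short_connection_rate()
    print(f"about {rate:.0f} new connections per second")
```

Under these assumptions the ceiling is roughly 470 new connections per second, which a busy REST client can exceed; reusing connections with keep-alive avoids the limit.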
HTTP 1.1
In HTTP/1.1, keep-alive is enabled by default unless it is explicitly turned off:
Connection: close
In HTTP 1.1, all connections are considered persistent unless declared otherwise. HTTP persistent connections do not use separate keepalive messages; they just allow multiple requests to use a single connection. However, the default connection timeout of Apache httpd 1.3 and 2.0 is as little as 15 seconds and just 5 seconds for Apache httpd 2.2 and above. The advantage of a short timeout is the ability to deliver multiple components of a web page quickly while not consuming resources to run multiple server processes or threads for too long.
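One way to see persistent connections in action is to watch the client's local port: if it stays the same across requests, the TCP connection was reused. The sketch below uses only the Python standard library with a throwaway local server; `OkHandler` and `client_ports` are names invented for this demo, and the loopback address is an assumption.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class OkHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # makes connections persistent by default

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def client_ports(reuse, requests=3):
    """Return the client-side local port used for each request."""
    server = HTTPServer(("127.0.0.1", 0), OkHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    ports = []
    try:
        for _ in range(requests):
            conn.request("GET", "/")  # connects automatically when needed
            ports.append(conn.sock.getsockname()[1])
            conn.getresponse().read()
            if not reuse:
                conn.close()  # short-lived: one request per TCP connection
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    print("persistent :", client_ports(reuse=True))
    print("short-lived:", client_ports(reuse=False))
```

With `reuse=True` all three requests share one local port. With `reuse=False` each request opens a fresh connection, typically from a different ephemeral port, which is the pattern that can use up client ports under heavy load.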
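Browser request queueing: why requests get blocked
Taking Chrome as an example, at most 6 resources are loaded concurrently from the same domain.
Maximum concurrent connections to the same domain for browsers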
stalled
Time the request spent waiting before it could be sent. This time is inclusive of any time spent in proxy negotiation. Additionally, this time will include when the browser is waiting for an already established connection to become available for re-use, obeying Chrome’s maximum six TCP connections per origin rule.
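The effect of the six-connections-per-origin rule on waiting time can be illustrated with a small simulation. It is a deliberately simplified model, not Chrome's actual scheduler: all requests are assumed to be queued at time 0, each occupies one connection for its whole duration, and the function name `stalled_times` is made up.

```python
import heapq


def stalled_times(durations, max_connections=6):
    """Return how long each request waits for a free connection slot.

    durations: time each request occupies a connection, in queue order.
    Assumes every request is queued at t=0 (simplification).
    """
    free_at = [0.0] * max_connections  # when each connection becomes free
    heapq.heapify(free_at)
    waits = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available connection
        waits.append(start)             # waiting time before the request is sent
        heapq.heappush(free_at, start + duration)
    return waits


if __name__ == "__main__":
    # Eight requests of one second each against the same origin.
    print(stalled_times([1.0] * 8))
```

In this model the first six requests start immediately, while the seventh and eighth wait a full second for a connection to free up.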
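Stalled also covers the gap between the TCP connection being established and the moment data can actually be transferred. So why would a TCP connection have to wait this long before it can be used? A Wireshark capture (figure below) showed several retransmissions on the connection, until the maximum retransmission count was reached and the client reset the connection.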
The sender waits for an ACK for the byte-range sent to the client and when not received, resends the packets, after a particular interval. After a certain number of retries, the host is considered to be “down” and the sender gives up and tears down the TCP connection.
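After the TCP three-way handshake, if the sender transmits data and receives no ACK from the server within a certain time (the period differs between operating systems), it retransmits at intervals that generally grow exponentially. The time from the start of retransmission until the receiver responds correctly is the stalled phase. Once the number of retransmissions exceeds a limit (5 on Windows), the sender considers the TCP connection down and a new connection has to be established. For comparison, the figure below shows an HTTP request with no retransmissions:
To summarize: the stalled phase is when the TCP connection is being checked. If the check succeeds, the connection continues to be used for sending data; if it fails, a new TCP connection is established. A long stalled phase is therefore usually caused by packet loss, which points to a problem in the network or on the server.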
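To get a feel for how quickly exponentially growing retransmission intervals add up, here is a small calculation. The 1-second initial timeout and the doubling factor are assumptions for illustration; real stacks derive the timeout from measured round-trip times and apply their own limits. The function name is hypothetical.

```python
def retransmission_schedule(initial_rto=1.0, max_retries=5):
    """Return the elapsed time (seconds) at which each retransmission is sent.

    Assumes the timeout doubles after every unanswered attempt.
    """
    times = []
    elapsed, rto = 0.0, initial_rto
    for _ in range(max_retries):
        elapsed += rto   # wait one full timeout without seeing an ACK
        times.append(elapsed)
        rto *= 2         # exponential backoff
    return times


if __name__ == "__main__":
    print(retransmission_schedule())
```

With five retries and a 1-second starting timeout, the last retransmission goes out 31 seconds after the original send, so even a handful of lost packets can produce a very long stalled phase.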
Keepalive with chunked transfer encoding
Keepalive makes it difficult for the client to determine where one response ends and the next response begins, particularly during pipelined HTTP operation. This is a serious problem when Content-Length cannot be used due to streaming. To solve this problem, HTTP 1.1 introduced a chunked transfer coding that defines a last-chunk: a chunk of size zero sent at the end of each response so that the client knows where the next response begins.
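A minimal decoder makes the framing concrete: each chunk is a hexadecimal size line followed by that many bytes, and a zero-size chunk ends the body. The sketch below assumes the whole chunked body is already in memory and well formed; it ignores chunk extensions and skips trailer fields. The name `decode_chunked` is invented for the example.

```python
def decode_chunked(data: bytes):
    """Decode one chunked body; return (body, remaining bytes on the connection)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Chunk size is hexadecimal; drop any ";extension" part.
        size = int(data[pos:line_end].split(b";", 1)[0], 16)
        pos = line_end + 2
        if size == 0:          # last chunk: end of this response body
            break
        body += data[pos:pos + size]
        pos += size + 2        # skip the chunk data and its trailing CRLF
    # Skip optional trailer fields up to the terminating blank line.
    while True:
        line_end = data.index(b"\r\n", pos)
        if line_end == pos:
            pos += 2
            break
        pos = line_end + 2
    return bytes(body), data[pos:]


if __name__ == "__main__":
    stream = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\nHTTP/1.1 200 OK\r\n"
    print(decode_chunked(stream))
```

The second element of the result is whatever follows on the same connection, which is where the next response begins.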
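In HTTP, how long a Keep-Alive connection stays open is decided by the server and is usually configured to a few tens of seconds. In nginx it is set by the keepalive_timeout directive in the http scope (75s by default).
- HTTP (layer 7) Keep-Alive is about connection reuse: it aims to carry several requests/responses over the same connection within a short period. The key point is short duration and high speed.
- The TCP (layer 4) KeepAlive mechanism is about liveness and heartbeats: it detects broken connections. The key point is that probes are infrequent but the connection is long-lived.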
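For contrast with the HTTP header, TCP keepalive is switched on per socket through socket options. The sketch below shows one way to do it in Python; the timing values are arbitrary examples, and the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` constants are platform-dependent, so they are applied only where the `socket` module exposes them. The helper name `enable_tcp_keepalive` is made up.

```python
import socket


def enable_tcp_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for sock.

    idle:     seconds of silence before the first probe
    interval: seconds between probes
    count:    unanswered probes before the connection is considered dead
    The three tuning options are skipped on platforms that lack them.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        if hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        enable_tcp_keepalive(s)
        print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
```

Nothing here touches HTTP: the probes are empty TCP segments sent by the kernel, which is why the two mechanisms are independent despite the similar names.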
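In HTTP/1.1, keep-alive is on by default, so requests for multiple resources can reuse the same TCP connection, which reduces the cost of setting up and tearing down connections. Sending several requests back-to-back on such a connection without waiting for each response is called pipelining. The problem with pipelining is that, although the TCP connection is reused, responses are still returned serially in request order: if a request near the front blocks, every resource behind it is delayed. This is known as HOL (head-of-line) blocking. Working around it by opening multiple TCP connections in parallel raises the overhead for both client and server. For the server in particular, when the number of concurrent connections is capped, the number of clients it can serve at once drops sharply.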
HTTP2
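HTTP/2 introduces an underlying stream layer. It does not change the higher-level HTTP semantics; it only changes how the data is carried, which is no longer plain text. When a page requires multiple resources, each resource is mapped to its own binary stream. Every stream has a unique id, and a parent field describes the dependencies between streams. Each stream can also be given a priority; the larger the number, the sooner it is answered. When a resource is sent over a stream, its data is further divided into smaller units called frames. A single connection (built on top of TCP) can carry streams with different ids at the same time, so multiple resources are transferred in parallel, and the application can tune the order in which resources are delivered by assigning priorities.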
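The framing idea can be sketched with the 9-byte HTTP/2 frame header: a 24-bit payload length, an 8-bit type, 8-bit flags, and a reserved bit followed by a 31-bit stream id. The code below only packs and unpacks DATA frames to show how frames from different streams interleave on one connection. It is not a usable HTTP/2 implementation (no HEADERS, SETTINGS, flow control or priorities), and the helper names are invented.

```python
import struct

FRAME_DATA = 0x0        # DATA frame type
FLAG_END_STREAM = 0x1   # last frame of a stream


def encode_frame(frame_type, flags, stream_id, payload):
    """Pack one frame: 24-bit length, type, flags, 31-bit stream id, payload."""
    length = len(payload)
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload


def decode_frames(buf):
    """Yield (type, flags, stream_id, payload) for each frame in buf."""
    pos = 0
    while pos < len(buf):
        hi, lo, frame_type, flags, stream_id = struct.unpack_from(">BHBBI", buf, pos)
        length = (hi << 16) | lo
        pos += 9
        yield frame_type, flags, stream_id & 0x7FFFFFFF, buf[pos:pos + length]
        pos += length


def reassemble(buf):
    """Group DATA payloads by stream id, as a receiver would."""
    streams = {}
    for frame_type, _flags, stream_id, payload in decode_frames(buf):
        if frame_type == FRAME_DATA:
            streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams


if __name__ == "__main__":
    # Frames of two resources interleaved on a single connection.
    wire = (encode_frame(FRAME_DATA, 0, 1, b"<html>")
            + encode_frame(FRAME_DATA, 0, 3, b"body{")
            + encode_frame(FRAME_DATA, FLAG_END_STREAM, 1, b"</html>")
            + encode_frame(FRAME_DATA, FLAG_END_STREAM, 3, b"}"))
    print(reassemble(wire))
```

Because every frame carries its stream id, the receiver can rebuild each resource independently, so a slow resource no longer holds up the others at the HTTP layer.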
- HTTP/2 is binary instead of textual like HTTP1.x – this makes transfer and parsing of data over HTTP/2 inherently more machine-friendly, thus faster, more efficient and less error prone.
- HTTP/2 is fully multiplexed allowing multiple files and requests to be transferred at the same time, as opposed to HTTP1.x which only accepted one single request / connection at a time.
- HTTP/2 uses the same connection for transferring different files and requests, avoiding the heavy operation of opening a new connection for every file which needs to be transferred between a client and a server.
- HTTP/2 has header compression built-in which is another way of removing several of the overheads associated with HTTP1.x having to retrieve several different resources from the same or multiple web servers.
- HTTP/2 allows servers to push required resources proactively rather than waiting for the client browser to request files when it thinks it needs them.
Differences
- Transmission model
- Flow control
- Predicting Resource Requests
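HTTP/2 introduces a mechanism called server push. If the server predicts that certain resources are likely to be requested next, it first sends the client a PUSH_PROMISE frame describing the metadata of the content it is about to push. If the client does not need a resource, it can reply with RST_STREAM to cancel it, which avoids wasting resources. The client can also send a SETTINGS frame to change server push behavior.
- Compression
HTTP/1.1 compresses only the message body, not the HTTP headers, because headers are usually small. At high request volumes, though, headers take up a growing share of network bandwidth. HTTP/2 defines HPACK to compress headers as well; in particular, when the headers of two consecutive requests or responses differ only partially, only the differing part is transmitted.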
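The "only send what changed" idea can be illustrated with a toy table shared by both ends of a connection: a header field seen before is sent as a small index, and a new one is sent literally and remembered. This is a deliberate simplification and not the real HPACK wire format (no static table, no Huffman coding, no eviction); the class name `ToyHeaderTable` is made up.

```python
class ToyHeaderTable:
    """Toy model of an indexed header table (illustration only, not HPACK)."""

    def __init__(self):
        self.table = []  # (name, value) pairs seen so far on this connection

    def encode(self, headers):
        """Replace already-seen fields with their table index."""
        out = []
        for field in headers:
            if field in self.table:
                out.append(self.table.index(field))  # short index reference
            else:
                self.table.append(field)
                out.append(field)                    # literal, remembered for later
        return out

    def decode(self, encoded):
        """Rebuild the header list, learning new literals along the way."""
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.table[item])
            else:
                self.table.append(item)
                headers.append(item)
        return headers


if __name__ == "__main__":
    sender, receiver = ToyHeaderTable(), ToyHeaderTable()
    first = [(":method", "GET"), (":path", "/a"), ("user-agent", "demo")]
    second = [(":method", "GET"), (":path", "/b"), ("user-agent", "demo")]
    for request in (first, second):
        wire = sender.encode(request)
        print(wire, "->", receiver.decode(wire))
```

On the second request only the changed `:path` field travels as a literal; the unchanged fields shrink to indexes.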
https://stackoverflow.com/questions/53488601/what-is-difference-between-https-and-http-2