QUIC: QPACK, Loss Detection and Congestion Control, HTTP/3

Just my random, feeling-based thoughts and opinions while going through the documentation.


QPACK

I like the concept of the Insert Count, a simple solution for knowing what data is in the dynamic table. Yet it's more complexity, which can be optimized a lot. Same goes for the optimization of not emitting an Insert Count Increment for every insert, etc., a bit like TCP SACK.
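
A minimal sketch of what I mean, with invented names (this is not the draft's pseudocode): the decoder can let inserts accumulate and acknowledge them all with a single Insert Count Increment carrying the delta.

    # Sketch of batching Insert Count Increments; names are my own.
    class DecoderInsertCount:
        def __init__(self):
            self.insert_count = 0        # entries received into the dynamic table
            self.acknowledged_count = 0  # what the encoder knows we have

        def on_insert(self):
            self.insert_count += 1

        def maybe_emit_increment(self):
            """Emit one Insert Count Increment covering all unacknowledged inserts."""
            delta = self.insert_count - self.acknowledged_count
            if delta > 0:
                self.acknowledged_count = self.insert_count
                return ("INSERT_COUNT_INCREMENT", delta)
            return None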

Interestingly, the QPACK draft 10 Appendix A static header list has an inconsistency: index 54, content-type: "text/plain;charset=utf-8", is without a space after the semicolon. So you have to know that with text/html you're supposed to use a space after the semicolon, but with text/plain you're not. If you don't follow these rules, the static table can't be utilized for compression. Ref index 52, content-type: "text/html; charset=utf-8", yes, that's with a space. In all the other static fields there's always a space after the semicolon. Sounds like the deliberate mistakes in passports, there to check if you know every irregularity in the document. Yes, there's always a space, except with index 54. ;) And whenever someone says there's a space, you'll immediately know whether they're a pro or not, by whether they mention the exception.
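
A toy illustration of why this matters: static table matching is an exact byte comparison, so getting the space wrong means no match and no compression win. Table abbreviated to the two entries in question.

    # Indices per the draft's Appendix A; only two entries shown.
    STATIC = {
        52: ("content-type", "text/html; charset=utf-8"),   # with space
        54: ("content-type", "text/plain;charset=utf-8"),   # no space!
    }

    def static_index(name, value):
        for idx, (n, v) in STATIC.items():
            if n == name and v == value:
                return idx
        return None

    print(static_index("content-type", "text/plain; charset=utf-8"))  # None, no match
    print(static_index("content-type", "text/plain;charset=utf-8"))   # 54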

It's nice that it's allowed to add a new entry which is a reference to an existing entry (even if that entry would get evicted before this insert), to get a new ID so entries needed in the future won't get evicted. That's an interesting addition. Basically the rule is that the table must never exceed size X and stuff must be evicted before that. But there's an exception where the entry being evicted can be added to the table as a new entry, and that doesn't break the size rule. Of course that's clear when using the Duplicate instruction. Also the use of optional Huffman encoding for string literals is a really nice approach. These features also create interesting extra caching and state requirements for browsers (clients) and servers. I don't even want to think about all the different race conditions slightly bad code could cause with these complexities.
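
Here's a rough sketch of how I understand the eviction rule plus the Duplicate instruction, with invented names and the real referencing/blocking rules left out: duplicating an entry that's about to be evicted gives it a fresh index without breaking the size budget.

    class DynamicTable:
        ENTRY_OVERHEAD = 32  # per-entry overhead from the spec

        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = []  # oldest first
            self.size = 0

        def _entry_size(self, name, value):
            return len(name) + len(value) + self.ENTRY_OVERHEAD

        def insert(self, name, value):
            need = self._entry_size(name, value)
            # Evict oldest entries until the new entry fits. (Real
            # implementations also reject entries larger than the capacity.)
            while self.size + need > self.capacity and self.entries:
                old_name, old_value = self.entries.pop(0)
                self.size -= self._entry_size(old_name, old_value)
            self.entries.append((name, value))
            self.size += need

        def duplicate(self, index):
            # Re-insert an existing entry; it may even be the very entry
            # that gets evicted to make room, which is explicitly allowed.
            name, value = self.entries[index]
            self.insert(name, value)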

Personally I would have used absolute references to the dynamic table, for clarity. But when efficiency is in mind, relative references are more space efficient, yet more complex to interpret. Classic trade-off. My experience is that some technical trickery to make things more efficient actually causes so much hassle that it would have been more efficient to go with the less efficient, naive KISS solution. I've shot myself in the foot several times before learning this. Of course the level of the developer teams greatly affects this.
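
The index arithmetic, as I read the draft (so treat this as my interpretation): in encoder instructions, relative index 0 is the most recent insert, while field line representations count relative to a Base, with a post-Base form for newer entries.

    def abs_from_encoder_relative(insert_count, relative):
        # Encoder instructions: relative 0 = newest entry.
        return insert_count - 1 - relative

    def abs_from_base_relative(base, relative):
        # Field line representations: counted backwards from the Base.
        return base - 1 - relative

    def abs_from_post_base(base, post_base_index):
        # Entries inserted after the Base.
        return base + post_base_index

    # With 10 inserts so far, relative 0 in an encoder instruction
    # means absolute index 9 (the newest entry):
    assert abs_from_encoder_relative(10, 0) == 9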

Security Considerations - Nice. As I wrote in the previous post, I'm pretty sure there will be all kinds of interesting timing, resource consumption, and exception/crash denial of service attacks on this kind of extremely complex new protocol. And this comment of course isn't about QPACK only; it's a generic statement about HTTP/3.

Recovery - Loss Detection and Congestion Control

Ok, it was no surprise that the algorithms would be very TCP-like. It's also nice these aspects are summarized at the very beginning. The packet order and loss detection are also reasons why I'm using timestamps, even if some people are against those. I've also enabled ECN on all systems. The ACK Delay information is a very nice approach as well. RTT estimation looks pretty standard. I'm curious about pacing to avoid spurious re-transmits; if I remember correctly that was one of the important aspects of BBR. The rttvar value, aka network jitter, naturally affects how quickly lost packets are resent unless there's an explicit loss notification / re-transmission request.
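
The estimator itself, roughly as the recovery draft describes it (EWMA weights 7/8 and 3/4, ACK delay subtracted only when it can't push the sample below min_rtt). A sketch, not a verified implementation:

    class RttEstimator:
        def __init__(self):
            self.min_rtt = None
            self.smoothed_rtt = None
            self.rttvar = None

        def update(self, latest_rtt, ack_delay):
            if self.smoothed_rtt is None:  # first sample
                self.min_rtt = latest_rtt
                self.smoothed_rtt = latest_rtt
                self.rttvar = latest_rtt / 2
                return
            self.min_rtt = min(self.min_rtt, latest_rtt)
            adjusted = latest_rtt
            # Only subtract the peer-reported ACK delay if the result
            # would still be at least min_rtt.
            if latest_rtt >= self.min_rtt + ack_delay:
                adjusted = latest_rtt - ack_delay
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.smoothed_rtt - adjusted)
            self.smoothed_rtt = 0.875 * self.smoothed_rtt + 0.125 * adjusted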

This is exactly why mobile data is so annoyingly slow, and why, even with slight packet loss, fiber connections with around 1 ms latency feel so snappy. Even if the advertised speed is lower on the fiber, for example 100 Mbit/s fiber vs 300 Mbit/s mobile data.

Had to check out the TCP robustness to Non-Congestion Events (TCP-NCR) document as well.

Acknowledgement-based Detection - It was simpler than I thought. Increasing the reordering threshold dynamically is nice, too.
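
A sketch of the packet-threshold part: a packet is declared lost once packets sent sufficiently far after it have been acknowledged. The adaptive raising of the threshold is my own simplification of the TCP-NCR idea, not the draft's algorithm.

    K_PACKET_THRESHOLD = 3  # the draft's recommended initial value

    def detect_lost(unacked_packet_numbers, largest_acked,
                    threshold=K_PACKET_THRESHOLD):
        # Anything at least `threshold` packets behind the largest
        # acknowledged packet number is declared lost.
        return [pn for pn in unacked_packet_numbers
                if largest_acked - pn >= threshold]

    def raise_threshold_on_reordering(threshold, observed_reorder_distance):
        # If a "lost" packet later arrives, remember how far it was
        # reordered and tolerate that much next time.
        return max(threshold, observed_reorder_distance + 1)

    print(detect_lost([2, 6, 7], largest_acked=5))  # [2] -> packet 2 lost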

One of the things I've always wondered about with fast retransmits etc. is why send multiple acknowledgements; why not just send a message requesting retransmission? I thought this was because no such message exists in TCP, but with a new protocol it should be trivial to fix: packets 1,3,4,5,7 received, retransmit 2. Of course the reordering threshold affects this. There's also the fact that requesting retransmission requires a full RTT; if the sender decides to send the packet again after some time has passed, it doesn't cause full RTT latency, depending on the situation. These are just my initial thoughts, and I'm sure they've thought about this a bit more than I have. Some systems are so complex that initial observations might be highly misleading. Going through theory and simulations will give you true insight into what's the best approach, as will using real-world data for the simulations. Using made-up data based on expectations can be just as misleading as the initial impression, because every decision is a kind of trade-off.

The example of the problem with duplicate (or more) ACKs causing rate reduction is excellent, as are the comments about link-layer retransmission / reordering effects. I had actually never thought that the indirect effect of using n ACKs to trigger retransmission would so seriously limit reordering. I was kind of assuming that the reordering capability would match the transmission window. Which it kind of does, but with the inadvertent slowdown effect on the transmission rate of that specific TCP stream, as described in the document. Ok, this makes it very clear I have never written a TCP stack / run detailed simulations.

Back to the HTTP/3 QUIC Recovery

Time Threshold and Probe Timeout seem pretty normal. Yet the value of the PTO and detecting tail loss is one of the reasons why chatty protocols suffer so badly from packet loss: with a small-packet chatty protocol, packet loss almost always leads to tail loss. Reading specs also nicely explains some of the things you've seen in the real world but haven't known exactly why they happen. Another problem with chatty protocols is that the congestion window doesn't grow, because the connection is more or less constantly under-utilized for most packet exchanges.
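
The two timers, as I understand them from the draft (constants may differ between draft versions):

    K_TIME_THRESHOLD = 9 / 8   # loss delay multiplier
    K_GRANULARITY = 0.001      # 1 ms timer granularity, in seconds

    def loss_delay(smoothed_rtt, latest_rtt):
        # Packets older than this (relative to an acked packet) are
        # declared lost by the time threshold.
        return max(K_TIME_THRESHOLD * max(smoothed_rtt, latest_rtt),
                   K_GRANULARITY)

    def probe_timeout(smoothed_rtt, rttvar, max_ack_delay, pto_count=0):
        # When the PTO fires without an ACK, send a probe and back off
        # exponentially; this is what catches tail loss.
        pto = smoothed_rtt + max(4 * rttvar, K_GRANULARITY) + max_ack_delay
        return pto * (2 ** pto_count)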

The initial 500 ms RTT value for a new path, ok, that should be enough in most cases. More than enough. I guess that's why they chose this value. It's also very good that the RTT data is saved between connections (if the client / server implementation does that).

It's also an interesting question whether exponential or quadratic increase should be used. Depending on the situation, I've seen both being used for the back-off interval in one of my projects. I first used exponential but then replaced it with the quadratic approach.
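
Not from the spec, just illustrating the difference I mean; quadratic back-off grows much more gently than exponential after a few attempts.

    def exponential_backoff(base, attempt):
        return base * (2 ** attempt)

    def quadratic_backoff(base, attempt):
        return base * (attempt + 1) ** 2

    for n in range(6):
        print(n, exponential_backoff(1.0, n), quadratic_backoff(1.0, n))
    # At attempt 5 it's 32.0 vs 36.0, but by attempt 10 it's 1024.0 vs
    # 121.0, so the quadratic schedule stays far more responsive.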

Including data when sending probe packets is a nice idea. If data loss was the reason there was no ACK, then the probe already delivers the data to the receiver, and there's no extra RTT delay.

0-RTT packets arriving before Initial packets, good observation. It remains to be seen whether practical server implementations buffer those or not.

Congestion Control is based on TCP NewReno, ok, that's why it seemed so familiar. Cubic (the default on Linux) is also mentioned as an alternative. No mention of TCP BBR (default at Google(?)), which I would have guessed to be the default, nor of Compound TCP, which is the default on Windows. - On top of this there are of course countless different "traffic shaping" methods which mess with packets in transit.

Slow Start & Congestion Avoidance - Window control (AIMD) seems to be the classic sawtooth approach. Yet the Recovery Period statements change that. Nope, they just slow the lowering of the congestion window down to at most once per round trip.
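
A simplified sketch of that NewReno-style control, loosely following the draft's pseudocode: additive increase, and halving at most once per recovery period.

    K_LOSS_REDUCTION_FACTOR = 0.5
    MAX_DATAGRAM_SIZE = 1200
    MIN_WINDOW = 2 * MAX_DATAGRAM_SIZE

    class NewRenoSketch:
        def __init__(self):
            self.cwnd = 10 * MAX_DATAGRAM_SIZE
            self.ssthresh = float("inf")
            self.recovery_start = None  # send time marking the recovery period

        def on_ack(self, acked_bytes, sent_time):
            if self.recovery_start is not None and sent_time <= self.recovery_start:
                return  # packet sent before recovery started; don't grow
            if self.cwnd < self.ssthresh:
                self.cwnd += acked_bytes  # slow start
            else:
                # Congestion avoidance: about one datagram per RTT.
                self.cwnd += MAX_DATAGRAM_SIZE * acked_bytes / self.cwnd

        def on_congestion_event(self, sent_time, now):
            if self.recovery_start is not None and sent_time <= self.recovery_start:
                return  # already reduced once this recovery period
            self.recovery_start = now
            self.cwnd = max(self.cwnd * K_LOSS_REDUCTION_FACTOR, MIN_WINDOW)
            self.ssthresh = self.cwnd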

Probe Timeout - It's not mentioned whether it's ok to discard the probe data if it contains new data and the congestion window is exceeded. Yet my conclusion is that it is, because that's just what happens in that situation if the stack is running on a limited device which simply can't hold more than the window's worth of data.

More historical prattle about the window - I've written earlier about devices which advertise a max window of 16 bytes. Those have been the cheapest possible SERIAL-TCP-ETHERNET adapters, which probably have really limited RAM. Of course HTTP/3 is a huge problem for such IoT devices to begin with. Well, I guess we're going to see a lot fewer of such extreme low-end devices in the future.

Persistent Congestion - Interesting, yet I don't really know what to say about it. Depending on rttvar and the general latency of the network, wireless and mobile networks could in theory trigger this when roaming. Depending on the network solution, this either causes packet loss or it doesn't. For traditional TCP it's kind of a problem that the connection is lost for, let's say, 200x RTT and then resumes as it was. Depending on the case, no data is lost, it's just delivered significantly late. But I don't know if this actually affects that in any way. Probably not, because if the packets aren't lost then there won't be a gap in the ACKs. And if data was lost, well, then it works just as designed.
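
The check itself is simple, roughly per the draft (kPersistentCongestionThreshold = 3): if the span between the first and last lost packet exceeds the threshold times the PTO, the window collapses back to the minimum. A sketch:

    K_PERSISTENT_CONGESTION_THRESHOLD = 3

    def in_persistent_congestion(first_lost_time, last_lost_time,
                                 smoothed_rtt, rttvar, max_ack_delay,
                                 granularity=0.001):
        # Same PTO formula as the probe timeout, times the threshold.
        pto = smoothed_rtt + max(4 * rttvar, granularity) + max_ack_delay
        duration = pto * K_PERSISTENT_CONGESTION_THRESHOLD
        return last_lost_time - first_lost_time > duration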

Pacing - Ok, just suggestions for different educated approaches.
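
For reference, the draft's suggested pacing rate boils down to spreading the congestion window over one smoothed RTT with a small headroom factor N; a sketch with assumed example values:

    def packet_send_interval(packet_size, cwnd, smoothed_rtt, n=1.25):
        rate = n * cwnd / smoothed_rtt  # bytes per second
        return packet_size / rate       # seconds between packets

    # e.g. a 1200-byte packet with a 60 kB window over a 50 ms RTT:
    print(packet_send_interval(1200, 60_000, 0.05))  # ~0.0008 s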

Under-utilizing the Congestion Window - gotta read that one through properly. Everything else was pretty obvious.

Security Considerations - Not surprising that an on-path attacker can mess with traffic or prevent it completely. In some sense HTTP/3 reduces vulnerability to traffic analysis. But sure, as usual, traffic patterns can reveal a lot of information. ECN marking modification, so what, if it leads to loss, and that leads to reduction of window & transmission speed. On quick thought that sounds like a non-issue. ECN allows reducing the rate without packet loss, ok; if that doesn't happen, packet loss will take care of it. I don't see any serious harm here? What am I missing?

References - Didn't bother to read all of those RFCs. But it goes without saying that this makes it obvious this is a really complex matter. That's why TCP is so good; replacing it with a simple naive UDP "reliable" connection isn't trivial. Or of course it is, but it won't perform as well, or it won't behave well alongside other TCP streams. Again reminds me of old thoughts about a "bandwidth hog" protocol. Oh nice, there are some pseudocode examples at the end of the RFC. Based on past examples, these will probably get copy-pasted into production, even if there were a warning that these are naive samples and shouldn't be used in production.

Ref: IETF QUIC WG

Generic historical prattle about TCP stacks - Just a history reminder. I remember that we used to use a telnet client which had parameters to configure packet size, RTT time, and window size. There was no logic to compute those values; you just gave the values as parameters when the client was started, or used the defaults (though I don't remember what those were). But I think the default window was very small and the RTT time was high, so it was easy to greatly improve performance by increasing the window and lowering the RTT. Before you ask, no, it wasn't the initial window; it was a static window for the whole session. That was a real KISS TCP stack implementation: just make it so that it works, and skip all the complex stuff.

kw: HTTP3, QUIC, HTTP/3, QPACK, IETF, Specification, Document, Working Group

2019-10-06