Docker, Bugs, Cloudflare, QUIC, Duplicati, Thunderbird
th IPv6 and dynamic IPv6 network prefixes... Yay, got it done and it's now working reliably. Yeah, it took a load of custom shell scripting to deal with daemon.json and restart Docker as necessary. Duh! It was nice to notice that the NDP (@ Wikipedia) proxy works well and traffic gets forwarded as needed to the hosts using the operator's /56 6rd prefix on the network.
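A minimal sketch of the "rewrite daemon.json and restart only when needed" idea. This is not the shell script mentioned above, just an illustration in Python. It assumes the prefix lives in Docker's `fixed-cidr-v6` daemon option and that Docker runs under systemd; the function names and the example prefix are made up for the illustration.

```python
import json
import subprocess
from pathlib import Path


def update_fixed_cidr_v6(config_path, subnet):
    """Write the IPv6 subnet into daemon.json; return True if the file changed.

    Hypothetical helper: assumes the prefix is configured via the
    "ipv6" and "fixed-cidr-v6" daemon options.
    """
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}

    # Nothing to do if the currently configured prefix is still valid.
    if config.get("ipv6") is True and config.get("fixed-cidr-v6") == subnet:
        return False

    # Other keys in daemon.json are left untouched.
    config["ipv6"] = True
    config["fixed-cidr-v6"] = subnet
    path.write_text(json.dumps(config, indent=2) + "\n")
    return True


def restart_docker():
    """Restart the daemon (assumes a systemd-managed Docker service)."""
    subprocess.run(["systemctl", "restart", "docker"], check=True)


# Typical use after the operator hands out a new prefix
# (2001:db8::/32 is the documentation range, not a real prefix):
#
#   if update_fixed_cidr_v6("/etc/docker/daemon.json", "2001:db8:1::/64"):
#       restart_docker()
```

Returning a changed / unchanged flag keeps the restart conditional, so containers aren't bounced every time the check runs.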
Had a massive week-long bug hunt. The bug caused some analytic data to be invalid, and the issue was well hidden deep in the black box of the analysis module. Yet we found it by running lots of automated tests. Yahoo! I thought it would be harder than this. Sometimes things get really hairy. - After confirming that there really is an issue, lots of separate tests were made to isolate the type of the error and its key aspects: what exactly it takes to break a working system and, on the other hand, what it takes to fix the broken system. Found several ways to break and fix it. After all, it seems to be some hidden internal state in the black box. I would say a very, very classic problem. I've done something similar countless times: a variable is updated when it shouldn't be, is not updated when it should be, or is initialized outside the loop when it should be reset inside the loop, or something similar. It just happens. Of course the architecture, design and approach to how the code is constructed could make these things less likely to happen. I.e., always use the same initialization function for the process state, keep the longer-running counters in some other object / part of the program, and so on. Yet this wasn't my code. But I've said it several times: I personally hate very complex internal state in programs, because it's easy to mess up when making quick changes to the code.
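To make the "initialized outside the loop" pattern concrete, here is a small made-up example. It is not the actual analysis module (that code isn't shown here); all names are hypothetical. The buggy version lets state leak from one batch to the next, the fixed version resets it with one shared initialization function.

```python
def new_batch_state():
    """Single place where per-batch state is created (hypothetical helper)."""
    return {"total": 0, "count": 0}


def batch_totals_buggy(batches):
    # Bug: state is created once, so every batch inherits the previous totals.
    state = new_batch_state()
    results = []
    for batch in batches:
        for value in batch:
            state["total"] += value
            state["count"] += 1
        results.append(state["total"])
    return results


def batch_totals_fixed(batches):
    results = []
    for batch in batches:
        # Fix: state is reset inside the loop, once per batch.
        state = new_batch_state()
        for value in batch:
            state["total"] += value
            state["count"] += 1
        results.append(state["total"])
    return results


print(batch_totals_buggy([[1, 2], [3]]))  # [3, 6]
print(batch_totals_fixed([[1, 2], [3]]))  # [3, 3]
```

Note that the first batch gives the same result in both versions, which is exactly why this kind of bug can hide for a long time and only shows up when tests run several inputs through the same instance.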
Cloudflare outage 21.6.2022 (@ blog.cloudflare.com) - Good reading as always. I especially like the part where they mess things up by reverting reverts, causing the problem to reappear sporadically. That's the part which made me smile. Yet I love the honesty of the reports, even if the fails are as lame as everyone's fails usually are. But as we know, sometimes it's hard to know how complex systems react to different failure modes, especially if the systems are under production load and there's no proper testing environment with simulations to run all the tests + staff creative enough to focus on breaking things. Yet after a failure, it's usually easy to come up with several recommendations / solutions which would have helped, if not to completely prevent the failure, then at least to limit the mess which followed. - Business as usual - kw: BGP, fail, networking
I've got a curious view towards Mullvad (@ Wikipedia) security: everyone says it's so secure. Yet their login / pass combination is around 53 bits in strength and the keyspace is shared between all customers. Is that enough? Just wondering, because there's a lot of different talk on the net.
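The arithmetic behind that number, as a rough sketch. It assumes the account number is 16 random decimal digits (that would give the ~53 bits mentioned above), and the customer count used below is a purely hypothetical figure, not a real statistic.

```python
import math

ACCOUNT_DIGITS = 16                # assumption: 16 random decimal digits
KEYSPACE = 10 ** ACCOUNT_DIGITS    # number of possible account numbers


def keyspace_bits(keyspace):
    """Strength of one specific account number, in bits."""
    return math.log2(keyspace)


def effective_bits(keyspace, active_accounts):
    """Bits of work to hit *any* valid account when the keyspace is shared.

    Each random guess succeeds with probability active_accounts / keyspace,
    so the expected number of guesses is keyspace / active_accounts.
    """
    return math.log2(keyspace / active_accounts)


print(round(keyspace_bits(KEYSPACE), 2))               # 53.15
# Hypothetical customer count, for illustration only:
print(round(effective_bits(KEYSPACE, 1_000_000), 2))   # 33.22
```

So targeting one specific customer costs about 53 bits of guessing, but finding some valid account gets cheaper the more customers share the same keyspace. In practice online guessing would also run into whatever rate limiting the service applies, which this sketch ignores.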
Absolutely marvelous site: The Illustrated QUIC Connection - Every byte explained and reproduced (@ quic.ulfheim.net) - I might have mentioned this and other similar sites earlier. But it's always a joy to refresh my memory with excellent work like this. Note: The site also contains similar explanations for X25519, TLS 1.3 and TLS 1.2. kw: HTTP/3, H3
Rechecked whether the latest version of Duplicati would be working. But nope, unfortunately not. It's still absolutely and totally flawed. It claims that the backup is completed and verified (!), even if the backup data is actually corrupted and unrestorable. It's totally insane that this kind of problem still exists. - Usually if my programs have that kind of critical flaw, I'm ashamed, and then I'll work until the issue is fixed, because I can't live with such bad code. For some reason they keep adding all kinds of BS cruft features on top, even if the core logic of the project and program is totally rotten and mortally flawed. So classic. kw: fail, logic, bad, FUBAR
I can see why the Thunderbird (@ Wikipedia) team did this: "SMTP client will now ignore socket errors after QUIT command is sent". Some implementations just close the TCP socket or even directly RST it after that, without properly terminating the TLS/SSL (@ Wikipedia) session. As mentioned, I "fixed" that for FTPS in the Python library for one of my projects suffering from exactly the same problem.
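Roughly what such a "fix" can look like in Python. This is only a sketch of the idea, not Thunderbird's change and not the actual patch from my project. It assumes a client object with `quit()` and `close()` methods, which both `ftplib.FTP_TLS` and `smtplib.SMTP` provide; the helper name is made up.

```python
def quit_ignoring_socket_errors(client):
    """Send QUIT and treat a dropped connection afterwards as a normal close.

    Returns True if the server answered QUIT cleanly, False if the
    connection was torn down instead (e.g. RST or a missing TLS close_notify).
    """
    try:
        client.quit()
        return True
    except (OSError, EOFError):
        # OSError covers ConnectionResetError and ssl.SSLError,
        # EOFError is what ftplib raises on an unexpected close.
        # All the real work is already done, so the failure is harmless here.
        return False
    finally:
        # Make sure the local socket is released in every case.
        client.close()
```

The important detail is that errors are swallowed only around QUIT. Anything that fails before that point, while data is still being transferred, should still be raised normally.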