
CurveCP, CoDel, µTP, IMAP, Investing, Wlan power saving, Privacy, Podcasts, Google power searching

posted Jul 23, 2012, 5:18 AM by Sami Lehtinen   [ updated Feb 24, 2015, 7:52 AM ]
This is what our summers here are like: raining, and then raining some more. So it's a good time to be inside and do some basic stuff. When it's sunny, I'll be out cycling.
  • Anomos and CurveCP (thoroughly studied all the articles available on their site), plus Extreme TCP, Compound TCP, CUBIC and Vegas TCP. It's important to notice that CurveCP is much more than a congestion control algorithm: it's a complete secure communication protocol which doesn't use TCP at all, avoiding many known TCP problems. There's also DNSCurve, which is a competitor to the DNSSEC standard. Some other solutions I have studied earlier: FAST TCP, CUBIC (Linux), BIC TCP, NewReno (FreeBSD), Compound (Microsoft), TCP Vegas, Chicago (CurveCP)
  • Studied bufferbloat issues, the CoDel queue management algorithm, Active Queue Management (AQM) in general, and how to fight router / TCP / OS buffering problems.
  • Studied µTP (Micro Transport Protocol) + LEDBAT (Low Extra Delay Background Transport) in detail.
  • Studied IMAPv4, IMAPX, IMAP+ and offline client synchronization using several different push methods.
  • Studied even more about value investing. I have been spending months studying investing. Currently I have opted mostly for ETFs, because I don't want to use my time, money and energy to actively manage my investments. Many studies show that without an excellent methodology, an active approach would only do harm. I know one guy who has an excellent methodology and seems to be making good profits by utilizing a value investing methodology which he personally perfected. The key difference? He's great with statistics and also has a historical database of stock trade information, which he used to run massive simulations. Some other topics I have spent nights studying: ETF / ETF 2.0 (synthetic, derivative-based with a substitute basket, counterparty risk, replication techniques, UCITS synthetic ETF substitute basket rules and the ESMA ETF draft), the db x-trackers prospectus, mutual funds, taxation, tax optimization (minimization), global investing, diversification, spread cost, cost optimization, value investing, market making, and high frequency trading basics.
  • Studied the HTTP/2.0 expressions of interest written by Facebook and Twitter.
  • Studied several Wlan (Wi-Fi) power saving optimization documents; read four extensive papers about it and about how it could be improved even further beyond what current standards offer. I personally liked the document which described how broadcasts could be rescheduled so that there won't be interference with other colliding transmissions on the same channel. It was simple and beautiful and didn't break any existing standards. How did we ever live without it? It doesn't make any sense to have several base stations transmitting simultaneously: it just causes retransmissions and keeps clients awake waiting for the reception of a non-garbled broadcast. (PDF)
  • Studied the multi-armed bandit problem. (PDF) MAB, A/B testing, epsilon-greedy, UCB1
  • A few articles about privacy: What Facebook knows, 'I've Got Nothing to Hide' and Other Misunderstandings of Privacy, Why privacy matters
  • Checked out D10 KPCB Internet Trends 2012 - Keynotes.
  • One night I got bored, so I checked out something different: AESA / SAR / radar and Fifth & Sixth Generation fighters, UAVs and many other weapon systems (mostly supersonic anti-ship missiles). After that I had to check what's the latest tech in missile defence, like Skyshield.
  • I'm finally happy with the Security Now situation: I don't have any queue right now.
  • Listened to Danger in Download, a BBC World Service series about internet threats: high and low level threats to individual people, corporations and governments.
  • Listened to Social-Engineer Podcast episodes about social engineering and attacking companies and people using traditional social trickery: how to get users to open your scam emails and infect their machines. It's actually quite easy. Most users just click OK on whatever prompt, as long as it lets them proceed with something that is meaningful to them.
  • Completed and passed Google's Power Searching course. Google is the most powerful search tool, and it's even more powerful if you know how to use it properly! Well, nothing special really, but I'm sure there are some things most of us didn't know, even though we use Google daily.
  • Checked out what's new in the latest Linux kernel, 3.5. Liked: Ext4 metadata checksums, TCP Early Retransmit, CoDel queue management (yay), reduced Btrfs latencies.
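The multi-armed bandit strategies mentioned above are simple enough to sketch. Here is a minimal epsilon-greedy selector in Python applied to a simulated A/B test; the arm count, conversion rates and trial count are all made-up illustration values, not from any of the papers.

```python
import random

def epsilon_greedy(values, epsilon=0.1):
    """Pick an arm index: explore with probability epsilon, otherwise exploit."""
    if random.random() < epsilon:
        return random.randrange(len(values))              # explore: random arm
    return max(range(len(values)), key=values.__getitem__)  # exploit: best mean so far

def update(counts, values, arm, reward):
    """Update the running mean reward of the pulled arm."""
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]

# Simulated A/B test: two page variants with hypothetical conversion rates.
random.seed(42)
true_rates = [0.05, 0.12]            # variant B converts better
counts, values = [0, 0], [0.0, 0.0]
for _ in range(10000):
    arm = epsilon_greedy(values)
    reward = 1.0 if random.random() < true_rates[arm] else 0.0
    update(counts, values, arm, reward)

print(counts)  # the better variant should dominate the pull counts
```

UCB1 would replace the random exploration with a confidence bonus per arm, which removes the epsilon tuning knob; the update step stays the same.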

Well well, phew. I think that's enough for now. More stuff coming soon. Now I have to study Google Compute Engine, because I got approved to the limited preview trial. 

Extra: some thoughts about one issue which has bugged my mind lately.

Fixed one app with a bad flaw. The app assumed that certain entry fields would be filled with correct data, so it only added padding when needed. Because the output was fixed-length data, everything got messed up when one field happened to contain one character more than expected. Yes, a very basic assumption-based programmer mistake. Yeah, input data is 1-5 chars and output needs to be 8, so let's pad it. But what happens when the input data is 9 chars long and the output should be 8 chars long? What kind of error handling process is there? Well, there wasn't any. Actually, it wasn't treated as an error at all: the padding routine simply concluded that there was no need for padding, because the input was already long enough. There was no warning whatsoever about the fact that the input was too long!

After a little fixing, the process is now gracefully aborted in case of this error. The error message also clearly states the data source, which record in that source, and exactly which field caused the problem. Now it should be trivial to find the issue, fix it, and start the process again.
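A minimal sketch of such a fail-fast padding routine in Python; the exception name, field names and widths are illustrative, not from the actual app.

```python
class FieldTooLongError(ValueError):
    """Raised when input exceeds the fixed output width: abort, don't pass it through silently."""

def pad_field(value, width, source, record_no, field_name):
    """Pad to a fixed width, but refuse over-long input instead of skipping padding."""
    if len(value) > width:
        # State exactly where the bad data came from, so it's trivial to locate and fix.
        raise FieldTooLongError(
            f"{source}, record {record_no}, field '{field_name}': "
            f"value {value!r} is {len(value)} chars, max is {width}")
    return value.ljust(width)   # normal case: pad with spaces to the fixed length

padded = pad_field("abc", 8, "orders.csv", 17, "customer_id")  # 'abc' plus 5 spaces
```

A 9-char value in an 8-char field now raises immediately with the source, record and field in the message, instead of silently producing a misaligned fixed-length record.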

Some fields aren't so important that they should cause the process to be aborted with an error message. In that case a warning is issued and the data is truncated accordingly. For some key number fields there is also an additional check on truncation: is the result still acceptable? For example, if 10000,00 becomes 10000 due to the field length limit, that's OK, as long as it doesn't contain decimals that would get skewed in the process. It's yet another question why the field is so short, and why there are numbers so large they won't fit in the field. Those are separate but strictly linked issues which should be fixed too. If the value assertion fails, an error is issued as stated earlier.
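The warn-and-truncate path for those less critical number fields could look roughly like this sketch; the function name is made up, and the acceptance rule is my reading of the 10000,00 example: the truncation is allowed only if it doesn't change the numeric value.

```python
import warnings

def truncate_number_field(value, width):
    """Truncate an over-long numeric string with a warning,
    but only if the numeric value survives intact."""
    if len(value) <= width:
        return value
    truncated = value[:width]
    def as_number(s):
        return float(s.replace(",", "."))   # decimal comma, as in "10000,00"
    # "10000,00" -> "10000" keeps the value; "10000,55" -> "10000" would skew it.
    if as_number(truncated) != as_number(value):
        raise ValueError(
            f"cannot truncate {value!r} to {width} chars without skewing the value")
    warnings.warn(f"field value {value!r} truncated to {truncated!r}")
    return truncated

print(truncate_number_field("10000,00", 5))  # prints 10000
```

The warning keeps the process running for harmless truncations, while a skewed value still escalates to the hard-error path described above.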

What can we learn from this? There should be enough checks in the code, and it's really unwise to trust or assume anything about user input. Another thing is that, due to organizational processes, this "small problem" became a larger problem, because some important data wasn't passing this conversion task properly. If this check had been implemented initially, it would have required much fewer resources than fixing the problem later.

Being lean is complicated: premature optimization, unnecessary checks, and so on. But who said they're unnecessary? The lack of some tests can lead to much bigger problems and require much more work time to fix than implementing those checks would have required in the first place. Knowing where those checks are needed and where they aren't simply requires a lot of experience. In this case it was even more complex than that, because of course the programmer expected that the application which originally received the data from the user would check it. But it didn't either. So when this app received the data, it was quite reasonable to assume that the field in question would contain only 1-5 chars. A discussion about overlapping checks could also be quite interesting. Is it reasonable to check data over and over again, or should there be some initial checkpoint? After that, should we be able to trust that the data stored in the database is valid? Or should we? Maybe several apps write to the same database directly, and therefore there is no single API that would check all the data.

So what is the best approach: when to validate everything, and when to skip it? When there are issues, how should users and administrators be informed? We all know there are best practices out there, but are they being followed?