Scaling Micro Services, Jugaad, Performance, Test Mode, Failure Testing, Old Code, Build vs Buy

posted Nov 26, 2016, 2:56 AM by Sami Lehtinen   [ updated Nov 26, 2016, 2:57 AM ]
  • GOTO 2016 - What I Wish I Had Known Before Scaling Uber to 1000 Services. The Uber engineering talk said it well: everything is a trade-off. It's like building around problems. Haha, I've been doing that over and over again. It's horrible. Technical debt: throwing in more semi-bad code to fix really bad code. Ouch!
  • Another great point was one I've mentioned many times. When you've got experts, they're experts in their own field and don't want to know about anything else. That causes serious issues in "understanding your thing in the larger context". Yep, seeing the big picture. It also means understanding that the small change we're asking you to make, which you claim is impossible because it isn't the right thing to do, could still save 100x the work, because nothing else would need to change. I've been through this over and over again. Thing X does 99.8% of what's needed. Can you add the missing 0.2%? No, it can't be done. OK, then I'll rewrite that 99.8% badly so that it does the 49.4% I needed for that project, re-implementing most of the code we already had, adding the 0.2%, and leaving many things out because they're outside my implementation scope. Now if anyone uses my code and needs any of the features I left out, they're going to have a really bad time. Because the code I wrote is focused on getting that key 0.2% done, the rest of it might be bad, and, as said, it might fail completely on features I never planned or implemented. Boom. Great, just great. And all this because the 0.2% was impossible to add where it should have been added. The only way to stay sane with those tasks is to think of them as good generic learning: I'll learn in detail how the system works and can say I've implemented feature X, yet of course in the big picture this kind of coding doesn't make any sense at all. - To take a car analogy: can we add optional connectors for a roof rack to the car? No, it's impossible. OK, let's hack something together using an old scooter, bicycle parts, some iron pipes and an old bed. Now it's done. Now we can transport things with this which could have been transported using the car's roof rack. Task done. - Horrible, yes, it's really that bad.
But at least I got it done jugaad style, and it works and does what's required. It also added a ton of technical debt and multiple potential future failure points. If you Google for jugaad technology you'll find many great examples. Yet I often do the key parts so well that it actually works much better than the car example would suggest. It's still a solid scooter engine module, and there's nothing wrong with using old bicycle parts for wheels and a bed frame as the rack. It just works. But it ain't pretty.
  • The performance discussion and points were also great. But I guess this is universal in software: nobody wants to fix code which seems to work but wastes huge amounts of resources. It works, I don't want to do anything. So what if it consumes tons of disk space, taxes the network, overloads memory and wastes CPU cycles. A linked list works just as well as a dictionary, you'll just need to add more cores. Or something like that. Performance fanout was a great example in the talk. Distributed tracing. Repeated calls to semi-slow code. Just way too familiar.
  • An option for 'test' mode in production. That should have been obvious. I also often do logging so that if something bad happens, I can replay the data. It has proven to be very useful in solving rarely occurring issues. It's also great for testing, because a test case can be 'replayed' to the system.
  • Failure testing, ouch! That one hurts. Sure, for critical parts it's required. I've had my share of 'surprising failures', which usually aren't technically that surprising at all. Like execution ending at any random point. It can just happen; there's nothing special about that. If your code misbehaves in such a case, that's too bad.
  • They talk about old code being 6 months old. Ouch, in my case it's more like 15+ years old. And I still remember well how it works, so it isn't that old yet. ;)
  • Build vs buy trade-off. That's always an awesome question. Also, do we co-operate or compete with company X? - Generally a good point: keep a strict focus, because time spent on something which isn't strictly your product just consumes resources from building your product.
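
To make the "linked list works just as well as dictionary" joke from the performance point concrete, here is a minimal sketch comparing a linear scan with a hash lookup. The function names and data sizes are made up for illustration, and the timings will vary by machine, so none are quoted here.

```python
import timeit


def find_in_list(pairs, key):
    """Linear scan: cost grows with the number of stored pairs."""
    for k, v in pairs:
        if k == key:
            return v
    return None


def build(n):
    """Build the same n key/value pairs as a list and as a dict."""
    pairs = [(i, str(i)) for i in range(n)]
    return pairs, dict(pairs)


if __name__ == "__main__":
    pairs, table = build(100_000)
    worst_key = 99_999  # last element, so the scan walks the whole list

    t_list = timeit.timeit(lambda: find_in_list(pairs, worst_key), number=20)
    t_dict = timeit.timeit(lambda: table.get(worst_key), number=20)
    print(f"list scan: {t_list:.4f}s, dict lookup: {t_dict:.6f}s")
```

Both versions "work", which is exactly why nobody wants to touch the slow one. The difference only shows up once the data grows or the lookup is called repeatedly.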
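
The replay logging mentioned in the 'test' mode point could look roughly like this. It's only a sketch under assumptions: the message format, `handle_message` and the JSON-lines log are hypothetical stand-ins, not how any particular system does it. The idea is just that every raw input is written to a log before it's processed, so the same inputs can be fed back in later.

```python
import io
import json


def handle_message(state, message):
    """Hypothetical business logic: keep a running total per account."""
    account = message["account"]
    state[account] = state.get(account, 0) + message["amount"]
    return state


def record_and_handle(state, message, log):
    """Append the raw input to the log first, then process it."""
    log.write(json.dumps(message) + "\n")
    return handle_message(state, message)


def replay(log_lines, handler=handle_message):
    """Rebuild state from scratch by re-feeding the logged inputs."""
    state = {}
    for line in log_lines:
        if line.strip():
            handler(state, json.loads(line))
    return state


if __name__ == "__main__":
    log = io.StringIO()  # stands in for a real log file
    live = {}
    for msg in [
        {"account": "a", "amount": 5},
        {"account": "b", "amount": 2},
        {"account": "a", "amount": -1},
    ]:
        record_and_handle(live, msg, log)

    replayed = replay(log.getvalue().splitlines())
    print(replayed == live)
```

The same `replay` function doubles as a test harness: a captured log from a rare production failure becomes a repeatable test case.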
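
For the failure testing point, "execution ending at any random point" can be simulated in a small way. The sketch below is my own illustration, not something from the talk: it writes a file via a temporary file plus `os.replace`, and a hypothetical `crash_before_rename` flag fakes the process dying at the worst moment. The old file should survive intact, assuming the rename is atomic, as it is on POSIX filesystems.

```python
import os
import tempfile


class SimulatedCrash(Exception):
    """Stands in for the process dying at an arbitrary point."""


def save_atomically(path, data, crash_before_rename=False):
    """Write to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        tmp.write(data)
        tmp.flush()
        os.fsync(tmp.fileno())  # make sure the bytes are on disk first
    if crash_before_rename:
        # The temp file is left behind, just as after a real crash.
        raise SimulatedCrash("died before the rename")
    os.replace(tmp_path, path)


def load(path):
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "state.txt")
        save_atomically(target, "version 1")
        try:
            save_atomically(target, "version 2", crash_before_rename=True)
        except SimulatedCrash:
            pass
        print(load(target))  # still the complete old content
```

A real failure test would kill the process for real, but even this kind of fake crash catches code that leaves half-written files behind.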