Blog

My personal blog is about stuff I do, like and dislike. If you have any questions, feel free to contact me. My views and opinions are naturally my own personal thoughts and do not represent my employer or any other organizations.


Data Center Security, BackBlaze B2, Crypto Monoculture, Thread Pools

posted May 28, 2016, 9:19 PM by Sami Lehtinen   [ updated May 28, 2016, 9:20 PM ]

  • Data center security and design best practices by Google - Really nice article. Using machine learning for data center optimization is awesome.
  • Lack of proper all-around hydraulic or otherwise heavy-duty vehicle entry barriers is something I've seen missing at many sites in Finland (not referring to Google, just the article, and as a general observation about sites other than data centers too). They say the site is secure, but you can still drive a truck with a trailer next to its wall, or in some worst cases even under the building. - I'm still laughing at the Donald Duck cartoon where the Beagle Boys try to break into Scrooge's money bin. They try absolutely everything against the massive impenetrable door. At the end, when the 'camera zooms out', the only thing remaining is the massive door. The walls around the door collapsed a good while ago. This seems to be the norm in many cases. The primary route might be extremely well protected in show-off style, but what about the alternate routes? Who would think about those? Guests arrive using the primary route, and then the cleaning, maintenance and security staff use a backdoor for 'service people' which lacks all the security measures massively implemented for the main entrance in a show-off way.
  • BackBlaze B2 integration requirements. Pretty obvious I would say. But as I've mentioned in my blog several times, obvious things might not be obvious to everyone.
  • Crypto Monoculture - A great post at cryptome.
  • Now I've done one integration to a dedicated Financial Management Software (FMS) too. Yeah, there are just too many of these to count. Budgeting, Billing, Invoicing, etc. Basically and technically it's all the same stuff: BI, CRM, ERP, ECM, SCM. Just some data and stuff being done with it. From an integration point of view, it doesn't really make any difference. Here's the data, you'll just do whatever you want with it, I really don't care. But as required I'll make it happen and work for you. No problem.
  • Thread Pools by Julia Evans - Nice timing. I've just been spending Easter implementing multiprocessing and thread pools for one project. She didn't say it, but if you're using Python it makes perfect sense to mix processes & threads, because multiprocessing doesn't suffer from the GIL as multithreading does. - https://wiki.python.org/moin/GlobalInterpreterLock -. With Python there's a funny thing: the queue for apply_async execution is unbounded. If you want to fix that, you'll need to subclass the Pool, or use an additional bounded semaphore which is released from the callback to make space for new items. If you don't do that, you'll end up pushing everything into the in-memory processing queue and running out of memory, just like Julia said. Most annoyingly, the queue in a thread / process pool is also processed further, so you can't even check the queue length before submitting more stuff, because unless you're submitting really fast, it's likely going to be empty. So the fact that the pool queue is empty doesn't really mean that there wouldn't be any queue. Yep, easily confusing. Using multiprocessing.BoundedSemaphore(queue_len) helps you to limit the depth of the queue. Just acquire when submitting and release when getting results back. Annoying, yes, but works very well. I'm not the only or the first one cursing this, but it's not a problem once you get used to it. Yet sometimes exceptions and stuff like that can wreak havoc if the release doesn't get executed. That's why it would be much better to have an internal queue length limit, but there isn't one. So this trick doesn't help: pool._taskqueue.qsize() <- it doesn't reveal the actual internal queue length. It only reveals the very briefly kept "input" queue. It's like thinking that at the airport you can get straight to the plane from the taxi drop-off, if you didn't see any queue when jumping out of the taxi. The actual internal queues are still hidden from you.
An alternative is not using Pool at all and writing your own implementation with the required features. Yet apply_async reports a finished task through callback and, in Python 3, a task that raised an unhandled exception through error_callback, so releasing the semaphore in both is a pretty safe way to do it.
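
The semaphore trick described above could look roughly like this. It is a minimal sketch, not the code from the project mentioned: run_bounded, workers and max_pending are made-up names, and it uses ThreadPool with threading.BoundedSemaphore for brevity. The callbacks run in a thread of the submitting process, so the same pattern should also work with multiprocessing.Pool.

```python
import threading
from multiprocessing.pool import ThreadPool


def run_bounded(func, items, workers=4, max_pending=8):
    """Run func over items, keeping at most max_pending tasks submitted but unfinished.

    run_bounded, workers and max_pending are illustrative names, not library API.
    """
    slots = threading.BoundedSemaphore(max_pending)
    results = []
    errors = []

    # Both callbacks run in the pool's result-handler thread of this process.
    def on_done(value):
        results.append(value)
        slots.release()  # free a slot for the next submission

    def on_error(exc):
        errors.append(exc)
        slots.release()  # release on failure too, otherwise slots leak

    with ThreadPool(workers) as pool:
        for item in items:
            slots.acquire()  # blocks the producer while the backlog is full
            pool.apply_async(func, (item,), callback=on_done, error_callback=on_error)
        pool.close()
        pool.join()  # returns after all tasks and their callbacks have run
    return results, errors
```

Results arrive in completion order, not submission order. If order matters, pass an index along with each item and sort afterwards.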

Project management, SMTP STS, HTTP PKP, Cloud Storage, Personnel Security, Rubezh

posted May 25, 2016, 8:00 PM by Sami Lehtinen   [ updated May 25, 2016, 8:01 PM ]

  • More projects with tight schedules, but nobody knows in detail what should be done. We need the system to work by the end of the month. Well, what does 'work' mean? Well, that we don't have any problems and it works. - Ahh, project management and planning as usual. Actually this isn't just one project, there are quite a few of these open right now. Depending on the circumstances, this can be a great or a very bad thing. Anyway, in my experience these things can be managed, but often it means that the schedule has to give, because it's hard to get stuff done if nobody can tell what needs to be done. It also sets great ground for arguments about whether the project failed. If we don't know what the goals and requirements for the project are, how can we determine later whether it failed? I know the customer always tries to blame the software supplier.
  • SMTP Strict Transport Security (STS) - Yeah, it makes sense. Basically I've had all that already, but it requires me to manually configure the domains which are strictly checked, like my friends' servers, Gmail and many other services. But the truth is that MOST email servers which DO support TLS actually still use self-signed certificates. That's the reason why I just couldn't set a blanket rule requiring a valid TLS certificate from every server. Also, a self-signed certificate actually isn't a problem, as long as the server administrator has confirmed the public key fingerprint with me. I've written several blog posts about this. Yes, email IS secure, it just requires you to configure your server to work securely with certain domains. This mechanism would automate the process. This is very similar to the HSTS solution, naturally. KW: TLS, SSL, SMTP, DANE, TLSA, DNSSEC, HPKP, TOFU, CA, DNS, TXT, DMARC, webpki, sts-uri, sts-record, CN, MX, RFC.
  • Public Key Pinning Extension for HTTP (PKP) - Sure, that's logical extension to what HSTS was. KW: Public-Key-Pins, max-age, pin-sha256, includeSubDomains, Security Considerations, IANA, Super Cookies, Google.
  • Duplicati is an open source backup application. It could support BackBlaze B2 Cloud Storage. It would be nice. Personally I would opt for a European service provider like hubiC anyway. But I'm sure there are people who would love to have B2 support.
  • Watched one video where they talked about security etc. They had 'passive' security personnel everywhere. I don't know if it's just me, but if access to spaces is highly controlled, what's the point of having security staff everywhere? Security staff is probably the lowest paid group in that building and they might not have any interest at all in what they're doing. I mean, it's highly unlikely that people who are very keen on what they're doing end up doing something bad, especially if they don't have any economic incentive because they're paid enough and they've got what they need and more. Also, the key personnel are very busy doing what they're supposed to do. They simply don't have the slack time to do something they're not supposed to do. But the security and cleaning staff probably aren't in the same group after all. They're also kind of 'invisible staff' for the people who do matter or manage the operations. Not going to mention sources, but my experience is that cleaning & security staff might be one of the largest risks in certain environments. - Yet it might not be an optimal situation if you get caught stealing as a security guy, hah. But the career loss from losing your guard status isn't nearly as high as in some other occupations which might require tens of years of experience. I'm sure there are places where this risk is well acknowledged, but I guess in many places it isn't. Work might also be subcontracted on many levels, so nobody's even sure who the actual person having the access and doing the stuff is. Why? Well, because it just doesn't matter, until it does. - A very important part of a proper personnel security management (PSM) process, as mentioned in ITIL. - In the retail market it's an old known fact that customers steal roughly 50% of the stuff lost and official staff aka employees steal the remaining 50%. Of course I don't mean this would always be the case.
I'm sure there are also very vigilant, extremely dedicated and serious security staff, but the risk is out there. - Isn't it so that if you're not very serious about your job, you shouldn't be working there in the first place? That applies to many other things than security too. What if the low paid security guy gets a million just for picking up that hard drive and dropping it off for someone? A small task for him, but still big bucks. I suspect a lot less than a million would get that done easily, especially if you pre-study the employees and pick one with money problems.
  • Something different: checked out the RS-26 Rubezh, which supports MARV and MIRV. Just wondering when some more aggressive and hostile countries can develop technology like this. Hopefully not too soon. - Yet it could be a story for an agent movie if someone sold such missiles to the highest bidder. Because I'm pretty sure some entities might want to prevent such a sale from happening.
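
Back to the SMTP note above: the manual "administrator confirmed the fingerprint" setup can be sketched as a per-domain pin check. This is only an illustration. The function names and the pins dictionary are made up, and it hashes the whole DER certificate with SHA-256, whereas a real setup might pin the public key instead. The fetch helper accepts any certificate on purpose, because the trust decision comes from the pin.

```python
import hashlib
import hmac
import smtplib
import ssl


def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def is_trusted(host: str, der_cert: bytes, pins: dict) -> bool:
    """True only if host has a pin and the certificate matches it.

    pins is a hypothetical mapping of host name to hex fingerprint,
    confirmed out of band with the server administrator.
    """
    expected = pins.get(host)
    if expected is None:
        return False  # no pin configured: do not treat the host as verified
    return hmac.compare_digest(cert_fingerprint(der_cert), expected.lower())


def fetch_smtp_cert(host: str, port: int = 25, timeout: float = 10.0) -> bytes:
    """Fetch the certificate an SMTP server presents after STARTTLS."""
    context = ssl.create_default_context()
    # Self-signed certificates are expected here, so CA validation is off.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.starttls(context=context)
        return smtp.sock.getpeercert(binary_form=True)
```

Usage would be along the lines of is_trusted(host, fetch_smtp_cert(host), pins) before handing mail to that host. A mechanism like STS would replace the hand-maintained pins with a published policy.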
 

WDM, Product & Service Design, Customer Experience, Business Processes, DOS, Scope Creep

posted May 23, 2016, 8:32 AM by Sami Lehtinen   [ updated May 23, 2016, 8:32 AM ]

  • A nice article about WDM networking technology. Nothing new, but a good, quite compact summary. It also works as a good primer, so you know what you don't know and what to study more.
  • Read a book about Product & Service design. Service Design for Business: A Practical Guide to Optimizing the Customer Experience - Product & service design. User Interface Design, Usability Testing, Content Planning and Management, User Interaction design. User Experience Expert. User & Customer Centric Design. Service-Dominant Logic. Product & Service value. System Architecture. Business Logic. Business Consulting, Business Analytics & Business Intelligence. Enterprise Architecture. Different perspectives on project, system architect, management, project manager, developer, it department. Experimenting and Experimental software, quick iterations, lean development, lean management, lean project. Versus Completeness, Consistency, Correctness. Customer outcome, service channel, touchpoint, context. Value Proposal and Brand values. Customer Life-Cycle Map, Customer Journey Map, Service Blueprint. Innovations. Customer and Client feedback. Innovation verification process. Service script, service path. Key Performance Indicators (KPI). Cost to Serve. Information, Interaction, Transaction. Client / User story. Customer centric. Customer persona story. Service as a Product. IoT, customer targeting, customer analytics. Service Scalability. Minimum Viable Product (MVP). Building a perfect customer experience. Continuous iterative process. Optimizing customer service path. Process tools and process flow. Do not start with "This is how we've always been doing this". Measuring customer experience. New disruptive market players with new concepts or technologies. Process Visualization. Interactive process and user interface validation from other groups. Systemic approach to product & service development. Offer request, offer, contract, requirements specification, implementation, testing, reviews, changes, deployment, maintenance, bug fixes, updates. The never ending spiral of (analysis, evaluation, development, planning). Business process simulation. 
Information Systems helping in Decision Making. Strategic level, tactical level, operational level. Interaction, communication, collaboration.
  • After this it's funny to notice how badly some businesses manage to fail their sales / customer service processes. You just call them and tell them that now it's time to cancel all the bleeping services. I've had this kind of experience with an electricity company and telecom operators. Most of the time this is related to sales processes which are completely screwed. They fail to provide proper information about services etc. It's not too hard to follow the normal concept, where information about the product and pricing is given and then the customer can negotiate / decide if they want to order the service. These large companies also often outsource their sales, and this is one of the reasons why it's such utter crap. If they can't follow proper procedure, I might also forget the rules of how things should be played. Usually these companies also make it very hard to make any changes or even cancel services, which is just one extra nail in their coffin. Unfortunately I'm busy with other stuff. If I had extra time, I could play some really nice games, reverse engineering their systems and checking if they have proper protections against internal system exploitation. In many cases it's easy to forget that threats can be internal, instead of only focusing on external threats. Actually they should pay me for doing basic system performance & security testing. Many services are outright slow / buggy, which probably means that the quality isn't great, so it's also quite likely that something exploitable turns up when exploring and experimenting with the system. Shouldn't the norm for system testing be that they've had a red team which has tried all possible ways to exploit / break the system? In many cases I've noticed that service portals where the customer is already authenticated and logged in aren't nearly as well protected as 'public' services. If the credentials check process is light, great. But what if the customer starts to change their email address a million times per second?
Or queries past invoices or something else at an extremely high rate? Or generally does something which isn't expected by the system designers, something which is deeply integrated into other systems and is therefore slow, in a service that was never designed to be used as an API under massive load. This is especially true if the service is normally used only through something like hardware devices, which usually limit the performance / actions per second rate. But who says that someone couldn't replicate the protocol and send the same request at a million actions per second instead of the normal rate? Just packet capture the protocol requests, write a custom client, modify the payload and have fun. Replace random bytes or bits in that request and see if it causes any interesting effects etc. AFAIK, if services are properly done, none of this should have any effect. But we all know what the horrible reality of software quality is.
  • Somehow this reminded me of WinNuke, Ping of Death and one of my old cell phones, the Ericsson GH388. It had the 160 character SMS message limit, of course. But if I moved the cursor back from the end of a 160 character long SMS message and inserted a new character in the middle, it always caused the phone to reboot. Fail.
  • One game platform always replicated the player list to all clients when it changed. Yet they didn't realize that I could change my user name in the game millions of times per second, basically causing all of the service's outbound buffers to become totally flooded and finally crashing it. How stupid is that? Writing the code required really little extra work, because as long as I had an authenticated session, I only needed to capture the UDP packet as it was, change the required bytes and start sending it in massive quantities. The most trivial way of exploiting existing protocols with a minimal amount of work.
  • So many projects with Scope Creep again... Aww... The usual case. - No details, but we all know it's the norm anyway. That's why I'm wondering what part of it is worth mentioning here. And it would be nice if it would ...
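
The floods described above (user name changes, email changes or invoice queries repeated at a huge rate inside an authenticated session) are the kind of thing a per-session rate limit is meant to absorb. A minimal token bucket sketch follows. TokenBucket, rate, capacity and the injectable clock are illustrative names and not taken from any of the systems mentioned.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` actions, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True and spend tokens if the action fits the budget, else False."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per authenticated session; the session id is a hypothetical key.
buckets = {}


def allow_rename(session_id, rate=0.2, capacity=3):
    """Permit a short burst of 3 renames, then about one rename per 5 seconds."""
    bucket = buckets.setdefault(session_id, TokenBucket(rate, capacity))
    return bucket.allow()
```

The point is that the expensive work (replicating the player list, calling slow back-end systems) only happens when allow() returns True. Rejected requests cost one dictionary lookup and a few float operations.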

Quick notes about Google I/O 2016 Keynote

posted May 21, 2016, 9:16 AM by Sami Lehtinen   [ updated May 21, 2016, 9:24 AM ]

Google I/O 2016 Keynote - They're processing all your content, queries, email, chats, pictures, videos, etc. Absolutely great and very scary at the same time. Understanding context is the key. Natural language processing (NLP). Context details: who said "playing tonight" would mean movies? It could be the concert hall or the hockey arena for different people. Also, if you ask whether a movie is any good, the response shouldn't be a "generic review", it should be personalized based on personal preferences. If the main actor is one I really can't stand, how about not recommending the movie to me? But it was a good watch; I had some light brain fog and watching this got me thinking well again. People are also great at being very ambiguous in their statements. 'Get the flowers'? What, for a relative's funeral, a new lover, or the wife's anniversary, and which flowers to get? Maybe you bought a house a while ago and the landscaper needs some stuff. Yes, sure, previous discussions and the calendar should provide strong hints about what you meant by that statement. Does it ask for extra details if it isn't clear enough? Does it recognize the tone in which it was said? That could even indicate which of the three options you mean. Sure, it's awesome when it reaches the level of a great secretary. Yay. Font size in the Allo chat app. Sigh, uaah.... Awesome innovation? It's nice that it learns from earlier discussions. Yet that's another problem. One of the ways to detect someone impersonating you is that they don't do it correctly. Of course in this case the system would help the adversary. Processing images in chat is also awesome and extremely scary. Yet another reason to host your own infrastructure and avoid giants like Google, as well as to maintain multiple context-aware separate identities. Waiting for the time when all this stuff is linked to strong Artificial General Intelligence. That's going to be very scary. Saying Hi and asking a bot How are you? Lulz. Made me actually laugh.
Very cool, there's that context which I was asking for earlier. Now they know the answer to "Did my team win". Allo, End-to-End (E2E) Encryption, Message Expiration, Duo Video Calling App. The IM market is getting hot. Wonder if Google has got any chance. The Knock Knock feature is really nice. Well, not that different really. Often friends send a message over Skype if I don't pick up and tell me why they were trying to reach me. I don't like the way your own video is cropped into a circle, because due to the cropping you don't really know what you're sending out. It's also kind of a fail that you need to prepare for a call in such a way. I wish they'd allow an option where you can actually send the "clip" which was recorded while you were waiting for the other end to pick up, because it would actually be handy. So you can start calling, go straight to the topic, and if they don't pick up, the message gets delivered anyway. - That sounds good. Like with Skype: here's my message, if you're interested, call back. I don't know about other people, but I personally rarely react to IM or anything 'requiring' instant attention if it really doesn't. Ah, ok, I'll get back to that when it's time for it. Also the "application embedded features in notifications" are interesting. I mean, usually that kind of stuff adds system resource requirements. You don't need to "launch an app". But technically you probably need to, even if it doesn't seem to launch. This kind of 'neat stuff' can suck a surprising amount of system resources. Just like all of those darn slow and heavy web sites. Either they have to have some kind of light unified script to handle actions on notifications, or if they pass that stuff via some "notification action API" to the app, it basically means that the app has to launch to handle it. So it's doing the same stuff, just more seamlessly in UI terms. In some cases this might encourage usage patterns which heavily tax the system without the user even realizing it.
Daydream-ready smartphones, VR Mode. A lot of VR 3D content seems to be coming. VR YouTube. Finally, smart watch Android Wear 2.0, which doesn't require the phone to be linked. This is actual progress. Those extra displays for a phone are lame. It has been possible to get watch phones from Asia for a long time. Now they've joined those together in Android Wear 2.0. Nice. Android Studio. A lot of very neat Firebase stuff. Firebase Cloud Messaging (FCM) and Notification integrations. Totally awesome, gotta check that out. Firebase crash reporting. Android Instant Apps sounds also cool, and horrible at the same time. It depends on what kind of sandboxing and rights those instantly installed malware, ehh, spyware, ehh... apps need. Yet, in general this sounds really awesome. No need to install "junk" apps. This is the way it should have always been. It's always funny to notice how lazy loading and "pre-loading" stuff is wiggled back and forth over and over again. Why? Pre-loading is bad, lazy loading is bad. Nope, I thought those were good things. Eeh, let's change it again tomorrow. AlphaGo stuff.

Use cases, User Stories, Consulting, Documentation, Stories, SQLite3, Long Tail, Overfitting

posted May 20, 2016, 9:42 AM by Sami Lehtinen   [ updated May 20, 2016, 9:42 AM ]

  • Had a long discussion about whether we should use specifications, use cases or user stories, etc. All of those have their advantages and are more useful in some cases than in others.
  • Had a consulting gig about the usual stuff: documentation requirements, documentation sharing, weekly project checks. Some of this I've written about earlier, but I won't link it here right now. Automated off-line mode and caching with a short timeout, so that systems work even when there are network disruptions, and systems also recover from those states fully automatically. Tasks in a project need to be role based, nothing is 'personal', because that causes all the problems we know way too well. Basic invoicing, accounting, etc. Then the classic question when making an ERP integration: is invoicing sales? Or is it something which becomes sales when the invoices are paid, and so on? Or is sales only something which is paid using card or cash, or some other payment method which provides "instant payment" and carries virtually no risk of credit loss? Credit & debit transactions with own issuer and external issuers. All kinds of discounts, campaigns, etc. Data flow charts, exception handling. What happens if the system is off-line for extended periods? All the usual self-service aspects using web and local terminals and so on. Where is the master data kept for all of the multitude of data types required by the system? Multiple different ways of pricing: using algorithms, discount tables, complex discount tables with tons of rules, aka campaigns, and of course there's the option of price matrix thinking. Basically a price matrix is something which is usually maintained by a 3rd party system, because it's trivial to import a price matrix but it's a huge job to update it, especially if there are multiple dimensions. Sure, the matrix can also be gigabytes in size, but that's not a problem with current tech. It's always a good question whether there's any simpler way to get the required things done and whether all the seeming complexity is actually required. Why is the pricing model such that it requires a price matrix, and does it require one at all? Data Warehousing.
It's seemingly simple, I just send data. But the actual problem is who's going to interpret it and whether it's being done correctly. This is a point which easily makes me laugh and cry at the same time, because it's so easy to fail at it ridiculously badly. Then you're getting dis- and misinformation instead of real data. The worst thing is if the data never gets validated and 'the people looking at the numbers' think that the misinformation is real data. Does anyone know who knows what we should know? The classic question. Is there anyone who actually knows what the requirements are? Nope, well, that's the norm. Actually it's awesome if the customer acknowledges that they're having trouble defining the requirements. The worst fails are served when they're clueless about being clueless. System architecture, pricing models, integration design planning, stock management details. Well. How do you know you're good? When discussions lasting several days and going into detail won't bring anything surprising or new to the table. Been there, done that, can be done, no problem.
  • That reminded me of a system integrator I met a long time ago in a meeting where we planned a new integration. I gave him the documentation beforehand and he studied it. Then we had a meeting to check things up. We went through everything and the guy was like, yep, ok, clear, for everything. After that meeting I asked my colleague: what's your feeling? Did the guy understand everything or nothing at all? Because none of his questions gave any indication that he understood the thing, nor did he say anything to show off his understanding. It's so easy to go to a lecture about quantum physics and say, yeah, sure, I got it all, that's clear, when I didn't actually understand anything at all. But in this case the truth was that the guy knew what he was doing and delivered a perfectly working integration a week later. - That's awesome! - I often get this feeling in integration meetings. If the customer has prepared things well, there's rarely a need for extra questions, which might easily give the impression that I'm totally clueless. Yet, that's kind of fun at the same time. I'll let them think so if they want to. No problem. I'll just get the job done.
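  • With SQLite3, when a query uses a composite index, the column with the inequality filter must be the rightmost index column the query uses: equality filters on the leading columns first, then at most one range filter. SQLite3 indexing & query optimizer.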
  • Reminded myself about long tail and overfitting, which are both quite obvious issues and nothing new. I just wanted to read what the Wikipedia articles say.
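
The SQLite3 composite index note above can be checked with EXPLAIN QUERY PLAN. A small sketch with a made-up table and index (events, idx_customer_created): the first query has equality on the leading column and the range on the next one, the second has the range on the leading column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN rows joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema, only for illustrating the index column order.
conn.execute("CREATE TABLE events (customer INTEGER, created INTEGER, note TEXT)")
conn.execute("CREATE INDEX idx_customer_created ON events (customer, created)")

# Equality on the leading column, inequality on the rightmost used column.
plan_eq_then_range = query_plan(
    conn, "SELECT * FROM events WHERE customer = ? AND created > ?", (1, 100)
)

# Inequality on the leading column: the index cannot narrow on 'created' any more.
plan_range_first = query_plan(
    conn, "SELECT * FROM events WHERE customer > ? AND created = ?", (1, 100)
)

print(plan_eq_then_range)
print(plan_range_first)
```

The exact plan text varies between SQLite versions. In the first plan both columns should appear as index constraints, while in the second the constraint on created is left to be filtered row by row.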

Smart Traffic Control, Insurance, Statistics, Big Data, Bias, Probability, Discrimination, Phishing

posted May 13, 2016, 10:54 PM by Sami Lehtinen   [ updated May 13, 2016, 10:55 PM ]

  • Many smart traffic control systems just take data from a 3rd party without proper authentication. That makes all kinds of interesting attacks very easy if the data isn't properly validated. Data quality can also be very bad if there's no one responsible for it. Yet using integrated data for stuff like logistics would bring enormous changes and cost savings. - Mobility as a Service (MaaS). What kind of centralized control systems are used for in-city UAV traffic etc.? Can the data be trusted? As an example, some systems, like the transponder systems used on airplanes, are vulnerable to a multitude of different attack vectors. Yet the promises, additional value, cost savings and so on are really awesome for these future technologies, if things are just done right. What is right, I guess nobody knows right now.
  • One Finnish insurance company is now using an application to collect data about your driving behavior and reduce your insurance fees based on that. I wonder if it only collects driving data, or something else too. This has been one of the topics discussed around big data for a long time. Insurance companies are very keen to get that data, because even if it wouldn't be 'highly accurate', having any data is better than having no data at all. I think they would love to collect: financial records, criminal history, driver's license history, driving history, travel history, medical history, purchase history, social media posts and likes, social network analysis and so on. The list goes on. It's also easy to forget that these activities are usually highly interlinked, so even if you don't have all the data, you can infer some areas pretty reliably from other data. It's an interesting question whether that counts as discrimination. If 98% of things with feature X are 'flagged', should that be called discrimination or accuracy? Yes, that will probably lead to you getting flagged if you have feature X. But that's totally reasonable. I don't like some things about these talks, they always claim that something can be used against women or minorities. What about straight Caucasian males? Who's going to defend them? What if statistics show that a Tibetan Buddhist monk has a lower probability of engaging in anti-social behavior than drunk Caucasian males? Is that wrong? Or is it again accuracy and just something that normal people could be biased about?
  • Because some features are interlinked (information leak), it might be possible to take all the data collected from your luggage by a security scan and pass it to automated analytics. Even if it didn't reveal any contraband, it could still reveal "some combination of things" which might indicate that it's better to do additional checks. Using machine learning for this kind of analytics would probably be very efficient. Whether it would be ethical is an interesting question. Maybe it would be a good idea to just feed the features in and get back a true or false indication of whether anything bad was found. In this case the process would be a complete black box, but at least it wouldn't discriminate against anyone... I mean based on non-statistical reasons, ahem. But I'm pretty sure this is something which groups much smarter than I am have been thinking about for a long time. Statistical analytics and data processing isn't anything new at all. Actually, it usually makes me wonder more when it's not being used efficiently: don't they really get what they're missing when they're not using data analytics? - Related 32c3 talk: Say hi to your new boss: How algorithms might soon control our lives.
  • I'm just wondering if Nordea Bank's payments network being down is related to the massive phishing campaign which ran a few days before the bank's systems went down. In those phishing #emails they require you to send your login information by email to their #security #department, or they'll block your access to the bank. I guess it's reasonable to first make that threat and then crash the bank. So even if people didn't react to the email immediately, they'll remember it and might still act on it later. The news didn't say anything about the phishing campaign, but I don't think it's a random event, because I received several (5+) scam mails from 'Nordea Bank' asking me to hand over my banking credentials a few days earlier. Afaik, that makes perfect sense. - Similarly, first sending a "security update to be installed" and telling users that the site won't work without it, and then DNS hijacking or DDoSing the site, would make perfect sense. At first people don't install the package, but when things go haywire they're stupid enough to try to fix it by installing the malware package.

CPU load, Crypto Wars, Data Protection, Skype, Deepmind, FLIF, ServerBear, Tempfile

posted May 12, 2016, 9:50 AM by Sami Lehtinen   [ updated May 12, 2016, 9:51 AM ]

  • CPU load measuring fails. I complained about one process being sucky and using way too many CPU cycles, working inefficiently and providing very bad response times for messages arriving via a pub/sub queue system. Guess what the developer's answer was? "I've been checking it out and it seems to only use about 13% of CPU, why is that a problem?" - Thank you for that. Btw, my guess is that the app is single threaded and the developer used an 8 core CPU. One saturated core out of eight shows up as about 12.5% of the whole system. I've seen fail comments like this in anti-virus tool evaluations too. Anti-virus X is great because it uses only 25% or 50% of CPU, but anti-virus software Y caused slowness because it used 98-100%. Excellent job guys! How about writing a 'what went wrong' article about your own article? CPU percentage can be shown differently: some tools show load on the whole system and others show it per core. This means that a program which utilizes threading efficiently can show much more than 100%.
  • All this discussion about privacy and 'crypto wars' reminded me that there are other fundamental rights and regulations which dictate what can and what must be done. Here's a free handbook for European Data Protection. I've read it; it's horribly boring and yet very interesting at times. I guess most people don't know enough about data regulation, because so many laws and rules are seriously conflicting in that sector. I guess in many cases it's just best to do what's sensible and can be easily and sensibly argued. But knowing if it's actually legal is going to be extremely hard, and nobody knows that before an extensive court case. Ouch.
  • Skype direct connections only? Does anyone get this? I don't get it. If direct connections are the only way to connect, then I would assume that the cloud wouldn't be used, and the IP would always be revealed. If the statement were 'disallow direct connections' or 'always use indirect connections', then I would understand that the IP address won't get revealed. Am I missing something? Quote from the UI: "When you enable direct connections only, your IP address will be kept hidden when you call people who aren't on your people list. This may delay your call setup time."
  • Google's Deepmind - A nice article about Artificial General Intelligence (AGI), AI. Yes, it's very shallow not deep at all. But nice on general level because it talks about General AI. Ha.
  • I've posted about the FLIF image format earlier, but here are my latest comments: "FLIF provides excellent lossless image compression. The best part is that it's free and not patent encumbered. Yet in some cases lossy compression would still be more usable. But it's important to have options. I personally expected that JPEG2000 would have made a mark, but nobody seemed to care. Of course today's extensive image use in web browsing might make a difference. Also CPUs are faster, so algorithms which require more CPU power to decompress are more usable than ever before, and I guess that pretty much holds for the future too. I remember the time when GIF decompression was slow and JPEG decompression was ridiculously slow."
  • I ran ServerBear tests for my Scaleway server (Type VC1 x86 VPS). Results are here.
  • Reminded myself about tempfile, which is very useful when working with data which is in 'log form', can be processed in blocks of rows and doesn't fit in system RAM. In such a case performance is much better than using a database for this kind of 'batch job' or temporary storage. The SpooledTemporaryFile is awesome, because it works a bit like /tmp (tmpfs) on Linux: data is kept in RAM while it's small, but gets written out to disk when required, which for SpooledTemporaryFile means when the data grows past the configured max_size. This is optimal for cases where the same program is run in wildly varying environments with large RAM & data set size differences. Some servers have large data but low RAM, and in some cases it's just the other way around. This is the optimal generic solution.
  • Something different? Quickly checked out Mi-28.
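
A minimal sketch of the SpooledTemporaryFile usage described in the tempfile point above. The function name, the row format and the 1 MiB default threshold are my own assumptions for illustration, not taken from any real project. Rows are buffered in memory and Python moves them to a real temporary file on disk only if the buffer grows past max_size:

```python
import tempfile


def sum_row_lengths(rows, max_size=1024 * 1024):
    """Spool rows to temporary storage, then process them in a second pass.

    `rows` is any iterable of str (hypothetical 'log form' data).
    Data stays in RAM until more than `max_size` bytes have been written,
    after which SpooledTemporaryFile rolls over to a file on disk.
    """
    with tempfile.SpooledTemporaryFile(max_size=max_size) as spool:
        # First pass: write everything out, one row per line.
        for row in rows:
            spool.write(row.encode("utf-8") + b"\n")
        # Second pass: rewind and stream the rows back in.
        spool.seek(0)
        return sum(len(line.rstrip(b"\n")) for line in spool)
```

The calling code doesn't need to know whether the data ended up in RAM or on disk, which is the point: the same program works on a small VPS and on a big server by only tuning max_size.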

Work, Fun, Games, Learning, Priorities, Achievements, Python 3, Let's Encrypt, Scaleway, Phone Fail

posted May 9, 2016, 8:20 AM by Sami Lehtinen   [ updated May 9, 2016, 8:20 AM ]

  • Why do we work so hard? - Excellent article, which also raises great questions. But I've been thinking hard about this over and over again earlier. I love working, learning and challenges. I wonder if there would be better uses for time than working? Ok, I confess. I do cycle quite a bit during summer, but even then the Kindle is with me, and if I'm taking a break I'll be reading tech stuff, as I always do on my vacations too. I've often wondered why people hate work so much? What's wrong with it? Ok, sometimes there's excess challenge or time pressure. 100 things I would need to master, but not enough time to master even two of those things properly. Yes, that's too demanding and can be frustrating. But then it's just time to adapt. Pick top priorities and do those adequately, nothing fancy, but it gets the job done. That's what you'll have to live with. So, would watching cat videos and drinking at home be in some way more beneficial than working? I really don't think so. I've got a bunch of great friends who agree with me. Some do demanding jobs like those mentioned in the article, which basically means either working or studying for the work. If the work isn't that challenging they might have two jobs so they can avoid the parts of life where you go and whine that there's nothing to do. Work isn't that bad at all imho, and when I do things, I know I'm good at them often enough that it alone makes me happy. So what if I spend weekends studying the stuff I want to know all about. I acknowledge that if there's too much pressure in this setup, there's a risk of burnout. I have avoided it, but I've been terribly close, so I know the signs to watch for. Yes, there are many things which I would like to do, if I had time. But after all that's all up to priorities and simply being realistic about what you can achieve with limited time and resources. As an example, during last weekend I coded and studied for about 30 hours. 
On top of that I made a few nice meals, had a walk in the sunshine and got some sleep. Even after that I felt I had made less progress with the projects than I would have wanted to. One great example is that if you had free time you could play a game like Pandemic. But the truth is that Pandemic is just like work: it's a co-operative assessment and optimization task, how to clean up the infections on the board in the most efficient way using the skills the players (or play characters) have. So, why is some work (board games, strategy games, or complex card games) more interesting than basically similar other work (program code / SQL query optimization / logic refactoring)? Isn't it a fun game to tinker and think about project tasks and the critical path and figure out ways to optimize it? Sometimes there's no other way than to cut corners, but even then, it's extremely important to know and carefully think which corners can be cut without causing disaster. I would actually find that trend reversed. By spending 6 hours playing a complex board game, I've simply managed to waste a lot of time, which I could have used for something much more productive. Or some other stupid ideas like heading to a bar. First you lose time, next you lose money, then you lose health points. And on top of that, you'll be wasting the efficient work of the next day while suffering a hangover after a night probably not so well slept. I wonder how stupid people are when they head to the bar. Anyway, what's better than waking up in the middle of the night, knowing 'sigh, now I know what the problem with the deadlocking in the code is' and fixing it in the middle of the night?
  • The previous point brings me to something which comes up every time when talking to friends about vacation trips. Either they have time or money, but usually not both. Slackers don't work and don't have money, and professionals who prefer to work really don't have time to waste on a vacation trip.
  • Python 3 is Winning - Finally.
  • Let's Encrypt has issued a Million Certificates - This is great. Finally.
  • It took just two minutes to fire up the server at Scaleway. I ran some very quick basic tests, which do not require installing additional testing software. I was very happy with the performance when comparing it to the cost of the server. Networking was clearly fast, CPU was okish, and the local SSD performed great, as expected. It's not a hard core server obviously, but it's great when you check out the cost and if you just don't need more.
  • Technology as usual: One of my phones often reboots while being shut down. Great job. Then you need to retry the shutdown after it finishes rebooting first.

Security, 2FA, Stock Market Prediction Game, SOC, SecOps, GNSS, My Data, E-receipt

posted May 6, 2016, 10:57 PM by Sami Lehtinen   [ updated May 6, 2016, 11:24 PM ]

  • Very nice generic post about security and how easy it is to fail or to think about things in an inefficient way.
  • Got annoyed with several sites which use their own app for 2FA. I wonder how incompetent their engineers are if they think that implementing 'standard' TOTP 2FA is way too hard. There's no way I'm going to install yet another crap spy app just to log in to their crappy site. I think it's better just to completely opt out of such services. Microsoft, for example.
  • Someone made a stock market prediction game, based on real data. Actually this is something I've been thinking about for years. It's going to be interesting to see if it's economically viable, and if the data is actually valuable. Technically implementing such a game isn't anything especially complex. One of the hard questions I was thinking about for such a game is how to make scoring interesting, so that new and old players get a good chance to compete against each other. Of course it could be something trivial like weekly change charts in percentages or something. Even more interesting than just getting sentiment was the thought that the players would actually trade against each other. This would also bring managing the overall money supply of the in-game economy into play, which is, as we all know, quite a complex issue. A plain change percentage comparison would let me win just by creating new accounts and betting everything on a single stock in each: if it does well, great, if it doesn't, that's life. Of course it wouldn't earn me high relevancy on a single account for long, but it would allow me as a person and holder of multiple virtual portfolios to win weekly (yet using a different account each time). If the amount of money is meaningful, then it probably gives an unfair advantage to players who have been playing the game for long. So that brings in the problem of scoring players 'fairly', so that new players won't immediately find themselves in a situation where they feel like they've lost. One way to run the game would be running it in limited time intervals with resets. Like 3 month cycles or something. But that would also have its own drawbacks. If players trade against each other, and the game doesn't use real world markets as a data source, it could allow watching more interesting market phenomena. Like players trying to manipulate the market and being successful at it. That's where the game discussion forum / real time chat would play a very important role. 
If there's a well known player playing the 'penny stock game', it would be interesting to see what happens when he announces that stock X is going to rocket. If enough players know the player, they might try to pick up on that short term pump & dump scheme, of course thinking that they can exit in time. So that would basically make it a classic pyramid scheme in a way. Except that at least the more experienced players would know exactly how it's going to play out, but still be in the game, making the less experienced players pay the bill. Imho, this would provide more interesting game play than just trying to predict the real world market, where you've got very limited manipulation possibilities as an individual investor.
  • Prepared action plans with the Security Operations Center (SOC) / Security Operations team (SecOps). Unfortunately no further details will be available about this.
  • Checked out GNSS Augmentation, the Quasi-Zenith Satellite System and the European Geostationary Navigation Overlay Service (EGNOS).
  • Read guides by The University of Texas at Austin - Information Security Office - Good basic instructions, as there are on so many other sites. I just wish more people would read and follow these rules.
  • Lot of discussion about My Data and why there's information being collected about me, which I can't access because it's someone else's "proprietary" information.
  • The Finnish e-receipt standard is finally 'out', and the Request For Comments period is now ongoing. I studied the latest version and it looked pretty good. I'm looking forward to seeing what kind of benefits this will provide in the real world and how quick or slow the actual adoption will be. This also means that I'll terminate my e-receipt project pages and dump some parts of the project to my blog. I'm way too busy with other stuff anyway, so I didn't have time to continue it earlier. A national standard is anyway a great leap forward, and there's no point in making any competing and non-compatible suggestions, which would anyway be worse than the standard created by a large board of highly competent people.
  • Microsoft SQL Server on Linux - We're living in interesting times for sure.
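
To back up the 2FA rant above: here's a minimal sketch of 'standard' TOTP (RFC 6238, built on HOTP from RFC 4226) using only the Python standard library. The function names are my own. It assumes the common defaults of HMAC-SHA1, a 30 second time step and 6 digits, and leaves out things a real service still needs, like base32 secret handling, clock drift windows and rate limiting.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HOTP (RFC 4226): HMAC-SHA1 over an 8 byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick the offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, now=None, step=30, digits=6):
    """TOTP (RFC 6238): HOTP where the counter is Unix time // step."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)
```

With the RFC test secret b"12345678901234567890", totp(secret, now=59, digits=8) gives "94287082", which matches the SHA-1 test vector published in RFC 6238. That's about twenty lines of code, so 'too hard' isn't really an excuse.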

IoT, Security, Katakri, OpSec, Slope One, Indexing, DNS, Bugs, Quality, Stack & Heap

posted May 4, 2016, 8:55 PM by Sami Lehtinen   [ updated May 4, 2016, 8:56 PM ]

  • Internet of Things, Internet of Spies, Internet of Targets? What's it going to be? Here's one interesting story. This is why people fear the Internet of Things.
  • More Internet of Things reality. That's funny stuff too. I really don't have anything to add. The article says it all.
  • Wrote a short memo about Security landscape changes for classified information. The most important changes are related to risk assessment. Security measures are targeted based on perceived and assessed risks. Risks are of course based on the target, data classification level, used environment and related threats. Making a risk assessment is now always required for handling classified data. If official certification is being applied for, an important part of that certification is reviewing the risk assessments and confirming that those do not conflict. Covered areas: security management, physical security and software & configuration security, closely related to ISO 27001 requirements. New requirements are also more ambiguous, possibly making audits harder, because there's no clear list saying 'these are the required things'. This is both a good and a bad thing. It allows different measures to be used to reach the required security level. At the same time, this puts the risk assessment itself in a very critical position in ensuring the required (subjective) security level. There's also documentation of recommended processes used to manage security. - "Information security auditing tool for authorities – Katakri 2015" (CIIP)
  • Implemented Slope One Collaborative Filtering and ranking / recommendations for one project.
  • Nice article about Travel OpSec.
  • Indexing basics article "PostgreSQL Indexes First Principles", as you might have guessed it's about SQL (PostgreSQL / Postgres) indexing. Nothing new, but if you're not familiar with indexing it's a good read.
  • If CloudFlare had a presence in Helsinki, it would be able to serve Russian customers / users better; those are currently being served from Stockholm.
  • Great and very comprehensive post about DNS, highly recommended reading. Including history etc. There was legacy stuff which I wasn't aware of.
  • Sometimes I just wonder how many complex and annoying bugs you can fit in a small algorithm and code implementation. The answer seems to be: a lot more than you would assume. Phew. Even if you're not writing anything like ciphers or data validation / authentication (MAC/MIC) or protecting against cache correlation attacks or timing issues. So be happy if the application barely works and does what's required by the operational requirements. You're just being crazy if you're going to ask about security, uncommon exception handling, performance or code quality. Those are nice things, if you've got infinite time and the application is small. But if the app is complex and resources are limited, it's a really nice fantasy. It's a common misconception that if it's working, it has been done correctly. That's not true of course. Because if you think it's working, it probably just means that you're not looking closely enough at the multitude of ways it's seriously broken. Sometimes it's more like, by not looking, you can believe it's ok and then you can honestly tell it's working. - Actually I just proved this myself. But that story is in the post queue and will be out about 15 posts later.
  • Nice post by Julia Evans with the title What is "the stack"? - Of course this is all familiar to programmers and tech geeks. But if you're not familiar with the stack, and you're interested in computer tech, take a look. Also the article about Rust stack & heap itself is really nice.
  • Excellent post curl vs wget, what's the difference? - Btw. I had no idea that curl can be used with such a huge number of SSL libraries. Yet I've found out the hard way that wget doesn't decompress gzipped payload. Which actually seems to be quite a common complaint on forums.
  • Buggy software is just about everywhere all the time. Just noticed that Thunderbird shows UTF-8 correctly... Until you open the message in separate message window. As long as you're using 'preview' or 'reply' mode it's right. But if I just open the message itself, character set gets borked and UTF-8 double byte characters show up like those do, if character set isn't UTF-8. Great work, what a fail, once again. So, they can also fit in several bugs in their app. Just like I asked above.
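
For the Slope One point above, here's a minimal sketch of the weighted Slope One prediction step. This is not the code from the project; the data layout (a dict of user -> item -> rating) and all names are assumptions made for illustration. The idea: for each item the user has rated, take the average rating difference between the target item and that item over all users who rated both, add it to the user's rating, and weight the result by the number of co-ratings.

```python
from collections import defaultdict


def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating for `target` with weighted Slope One.

    ratings: dict of user -> dict of item -> numeric rating (assumed layout).
    Returns None when nobody has co-rated `target` with an item `user` rated.
    """
    diff_sum = defaultdict(float)  # other item -> sum of (target - other)
    count = defaultdict(int)       # other item -> number of co-ratings
    for items in ratings.values():
        if target not in items:
            continue
        for other, value in items.items():
            if other != target:
                diff_sum[other] += items[target] - value
                count[other] += 1

    numerator = 0.0
    denominator = 0
    for other, value in ratings[user].items():
        if other == target or count[other] == 0:
            continue
        average_diff = diff_sum[other] / count[other]
        # Weight each per-item prediction by how many users support it.
        numerator += (value + average_diff) * count[other]
        denominator += count[other]
    return numerator / denominator if denominator else None
```

For ranking / recommendations you would run this for every item the user hasn't rated yet and sort by the predicted value. A production version would precompute the difference tables instead of rescanning all ratings per prediction.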
