
PRP, MarkDown, PCP, Mega, LocalStorage, Blink Protocol, UnitTest, Passwords, Helpdesk, Integration Best Practices, etc.

posted Jan 20, 2013, 6:13 AM by Sami Lehtinen   [ updated Mar 22, 2014, 8:43 AM ]
  • I have been working with my own Personal Resource Planning (PRP) methodology. The most important skill now seems to be not overcommitting yourself, because overcommitting certainly leads to unhappiness. The next most important thing is clear prioritization. It also helps with not overcommitting: you do what's important, and the rest is not important. It's important to be able to happily ignore what's not important, otherwise it'll bog you down. My own GTD system with multiple prioritized queues does the trick. What I still need to improve is being even more selective about what I insert into my task queues. I seem to be a bit too interested in way too many things. Focusing on a much narrower field would make life easier. Telling high energy times from low energy times is also great, because then you can pick tasks that are suitably challenging for your mood. Like watching a movie, purging email, doing investments, programming etc. Each step requires more concentration and is mentally more demanding. Learning my own limits makes turning things down much easier. I have received some very interesting offers lately, but I really don't want to overcommit myself. I have done it often in the past, learned from my mistakes, and I won't do it anymore. It's better to undercommit: then you have free time and can still do something useful with it. Even the same things you could have committed to, and then suffered from if you had run out of resources.
  • Entire nations intercepted online, key turned to totalitarian rule. - No wonder BGPSEC is being pushed forward. (Btw. There isn't BGPSEC topic in Wikipedia yet.)
  • Studied Markdown for my PyClockPro project. Actually quite many services use Markdown. I personally didn't like it too much. Writing technical documentation with it wasn't as straightforward as I would have preferred. Getting Markdown to be formatted correctly on BitBucket wasn't fun at all. It's not complex, it's just annoying. Luckily they provide many other alternatives. If I'm not happy with Markdown, I'll revert back to plain text.
  • Mega Upload uses convergent encryption. Well, it's a fail: TL;DR; It is not secure, deduplication ruins privacy. Not an acceptable solution at all.
    I like Freenet's approach, because it's also complemented with anonymous routing. Without anonymous routing data content based encryption is dangerous.
    The encryption key is based on the payload, so if you don't know what the payload is, you can't decrypt the packet. Of course decryption keys can be delivered using a separate encrypted tree of keys, which is used when you deliver a download link.
    For that reason, whenever I'm sharing anything I usually encrypt files with my recipients' public keys before sending those out. Just to make sure that the data is really private and the keys are known only to my selected peers. In some cases when I want to make stuff even more private, I encrypt the data separately with each recipient's public key, so you can't even see the list of public key IDs which are required to decrypt the data.
    I also have 'secure work station' which is hardened and not connected to internet. That's the workstation I use to decrypt, handle and encrypt data. Only encrypted and signed data is allowed to come and go to that workstation.
    This is exactly the same problem as with Freenet. Because the same plaintext encrypts to the same ciphertext, there is a huge problem. If I really don't want anyone to know that I have this data, that's a failed scenario. It makes things easier for the service provider, who doesn't want to know what they're storing. Just like Freenet's data cache. But if I know what I'm looking for, I can confirm whether my cache contains that data or not. Therefore this approach doesn't remove the need for pre-encrypting sensitive data. Otherwise it's easy to bust you for having the data.
    There's also one interesting solution, which is OFF System. Which is just plain gimmickry.
    Well, my point is that if security is provided, it's better to forget trying to deduplicate data. If data is deduplicated, security has already been weakened way too far. A secure service does not need any deduplication, and if it uses it, it means that the service is fundamentally flawed.
    See (PDF's): Secure data deduplication white paper, Efficient Sharing of Encrypted Data.
    When file sharing, use GnuPG and Tor, or Freenet, GNUnet, whatever. Anyway, using the first encryption layer is also critical. Then you can use a secondary service to make you pseudonymous, so you're reachable even if neither they nor you know who you're communicating with.
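    A minimal sketch of why content-derived keys leak what you store. It assumes a toy stream cipher built from SHA-256 in counter mode; this is not the actual scheme of Mega or Freenet, and all function names are made up for illustration. The point is only that anyone holding a candidate file can reproduce the ciphertext and compare.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || counter) blocks. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def convergent_encrypt(plaintext: bytes) -> bytes:
    """Derive the key from the content itself, then encrypt with it."""
    key = hashlib.sha256(plaintext).digest()
    stream = keystream(key, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def provider_can_confirm(stored_ciphertext: bytes, suspected_plaintext: bytes) -> bool:
    """Anyone with a candidate file can test whether a stored blob matches it."""
    return convergent_encrypt(suspected_plaintext) == stored_ciphertext


# Same plaintext always gives the same ciphertext, which enables deduplication
# and, for the same reason, confirmation of known content.
blob = convergent_encrypt(b"contents of some well known file")
print(provider_can_confirm(blob, b"contents of some well known file"))
```

    Pre-encrypting with a key that is not derived from the content (for example with the recipients' public keys) breaks this comparison, because the stored bytes no longer depend only on the plaintext.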
  • Little browser local storage test using Brython and Google App Engine. Seems to be working well with a mobile browser, but my Firefox is configured to drop all data when I close it, so it doesn't work with my Firefox at all (if I close FF between page visits).
  • Year 2000 problem, year 2038 problem, etc. Do you know what year comes after 1999? - Don't you know? It's of course year 19100. Laugh, our invoicing system started to spew out invoices with due date in the beginning of 19100. Well, it's nice to get little interest free payment time. Let's see what happens when year 2038 gets closer.
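    A short sketch of how a year like 19100 typically comes about. The assumption here (not confirmed for the invoicing system above) is the classic cause: the year is stored as "years since 1900" and the code glues the string "19" in front of it instead of adding 1900. The last line shows where a signed 32-bit second counter ends, which is the year 2038 problem.

```python
from datetime import datetime, timezone


def buggy_year(years_since_1900: int) -> str:
    # The bug: treat the counter as a two-digit suffix and prepend "19".
    return "19" + str(years_since_1900)


def correct_year(years_since_1900: int) -> str:
    # The fix: do arithmetic, not string concatenation.
    return str(1900 + years_since_1900)


print(buggy_year(99), correct_year(99))    # 1999 1999
print(buggy_year(100), correct_year(100))  # 19100 2000

# Largest value of a signed 32-bit counter of seconds since 1970-01-01 UTC.
last_32bit_second = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)
print(last_32bit_second)  # 2038-01-19 03:14:07+00:00
```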
  • Blink protocol: Well, I don't personally see a need for it in my case, because I'm already used to higher level languages and higher level data structures. Which naturally are much less efficient than Blink or binary formats. But in some scenarios I really can see it being beneficial. Because I have to admit that it's ridiculous to have a 5 gigabyte XML file which is less than 2 megabytes after efficient LZ77 compression. It really tells everyone what the "information value and density" of that XML-junk file is.
  • Read a quite short but informative article about Thorium reactors: I have read a lot of stuff about nuclear reactor design and security systems earlier, so this was quite a nice fill-in piece.
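    To get a feeling for the information density of repetitive XML, here is a small sketch using zlib (DEFLATE, which is LZ77 plus Huffman coding). The record layout is invented for the example; real files and real ratios will differ.

```python
import zlib

# Hypothetical record layout, used only to produce repetitive markup.
row = '<record><id>{0}</id><status>OK</status><amount currency="EUR">0.00</amount></record>\n'
xml = ("<records>\n" + "".join(row.format(i) for i in range(50_000)) + "</records>\n").encode()

packed = zlib.compress(xml, 9)
ratio = len(xml) / len(packed)
print(f"{len(xml)} bytes -> {len(packed)} bytes, ratio {ratio:.0f}:1")
```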
  • I'm still having issues with my PyClockPro project. I'm currently doing the third refactoring round already. That's the reason I haven't released it yet. I have been adding many optimizations to save CPU time and discarding higher level Python data types for some things. I think the code is now actually quite ugly, because it's not Pythonic at all. It's more like C or Java code, just written in Python. I'm still wondering whether I should use linked lists with internal processing logic, or external loops accessing data in lists. I benchmarked recursive linked lists and the performance was quite bad. Sigh. Currently I'm wondering whether I should use list.remove(something), or a loop that checks list entries (using an index) and removes the content when found. Why am I asking this? Because I already know the range in the list where the value being looked for is. But if I use the list's remove, it doesn't have the luxury of utilizing the known range when looking for the value. Well, after these questions I know there is a need for this kind of library, because the implementation is non-trivial, and I really understand why simpler alternatives like LRU or CLOCK are so popular.
    I'm currently writing unit tests and proper, complete benchmarks for PyClockPro. I'm also aware that the class decorator wrapper might not be optimal. I assume someone who's actually experienced in writing class decorator wrappers could tell me exactly how it should be done and especially why. It's working well, but as we well know, that really doesn't mean it's done correctly. Some core functions are still pretty broken. I really hate refactoring core parts of a program, because it inevitably leads to breaking most of the code for a while. (Add the refactoring video clip of a cat trying hopelessly to jump out of a slippery bathtub.)
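    On the list.remove question: the built-in list.index accepts optional start and stop arguments, so the search can be limited to the known range without a hand-written loop. A small sketch (the helper name is made up, and this is not PyClockPro code):

```python
def remove_in_range(items: list, value, start: int, stop: int) -> bool:
    """Remove the first occurrence of value found in items[start:stop]."""
    try:
        pos = items.index(value, start, stop)  # scans only the given range
    except ValueError:
        return False
    del items[pos]  # deletion is still O(n): later elements shift left
    return True


data = [5, 1, 2, 3, 1]
print(remove_in_range(data, 1, 2, 5), data)  # True [5, 1, 2, 3]
```

    The search runs in C and only over the range, but the deletion itself still has to shift the tail of the list, so this only removes the scanning overhead.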
  • Studied pydoc3 for generating standardized class and function descriptions for Python.
  • Studied unit testing (with Python) and git hooks, preventing commits until unit tests pass. Read several long articles about Python unit testing, so I can include unittest for PyClockPro. I still think I'll release first version without unittests after I have done other tests and analyzed debug data so I'm sure it works ok.
  • This would be really interesting if I were younger and had time for it. Calling all coders: Hardcode, the secure coding contest for App Engine. "During the qualifying round, teams will be tasked with building an application and describing its security design." 
  • Google Declares war on password - The problem is that if criminals can convince you that you’re visiting Gmail even when you’re not, they can trick you into entering that secret code. In fact, the bad guys can even turn two-step authentication against legitimate users. Site should first be authenticated to user, and then key tied to that authentication, should be used to authenticate user to site, without ever revealing the shared secret or private key which was used as basis of the authentication.
  • I just noticed this week that one high value target had a security flaw. Even though strong 2FA authentication was used, the session cookie they served wasn't restricted to HTTPS. Oops. It means that (Java)Scripts can access it, and it can be sent over a plain http connection. We all know that not all users add the https prefix, so fail, there it goes out to the internet without encryption. Well, I naturally reported the issue to their security team. I'm currently waiting for confirmation. I also suggest that sites should use HSTS, even if it's not nearly a perfect solution. As well as not providing service on port 80 at all. That might make some people understand that the site is https only. Because if there is a redirect, users will use it without understanding that it creates some risks. (Like the possibility of redirection to another 'fake' domain with a valid https certificate.)
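    For reference, a sketch of the two cookie attributes involved, built with Python's standard http.cookies module. Secure keeps the cookie off plain http connections and HttpOnly hides it from scripts. The cookie name, the helper name and the HSTS max-age are example values, not taken from the site in question.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True    # only ever sent over HTTPS
    cookie["session"]["httponly"] = True  # not readable from JavaScript
    cookie["session"]["path"] = "/"
    return cookie.output()  # a complete "Set-Cookie: ..." header line


# HSTS asks the browser to use HTTPS for this host for the next year.
HSTS_HEADER = "Strict-Transport-Security: max-age=31536000; includeSubDomains"

print(session_cookie_header("abc123"))
print(HSTS_HEADER)
```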
  • Some funny stuff for a change: I told my colleagues that we receive at least 5000 "hack attempts" aka failed logins daily to any of our public Internet facing servers. One of my colleagues just said to me: "Well, you're having such a ****** password policy, that maybe those are actually failed login attempts and not hack attempts at all." - It really got me laughing. Yes, passwords, especially long, complex and random ones, are painful for users. Here's the password of the day (opening and closing quotes aren't included in the password):"^j'lb#K-€3,<_úgWJdXå(n_6=41Bµ%cj!" Btw. Good luck guessing the password or finding it out using SHA-1 hashes or so. I know it's possible, it just might take a while. ;) p.s. This password still has less than 256 bits of entropy.
  • I made a minor fail, because I didn't remember that Windows Server 2008 R2 Datacenter edition doesn't refresh scheduled tasks or service lists automatically. Pressing F5 is required. So I restarted one service a few times, wondering why it was always starting, not running. Oops, it happens. 
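    The entropy remark can be checked with simple arithmetic: a password of n characters drawn uniformly at random from an alphabet of k symbols has n * log2(k) bits of entropy. The alphabet sizes below are assumptions (95 printable ASCII characters, and a rough guess of 200 symbols when accented letters and symbols like € are included); the post doesn't say which generator or character set was used.

```python
import math

PASSWORD = "^j'lb#K-€3,<_úgWJdXå(n_6=41Bµ%cj!"


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a password whose characters are drawn uniformly at random."""
    return length * math.log2(alphabet_size)


for alphabet_size in (95, 200):  # assumed alphabet sizes, see above
    bits = entropy_bits(len(PASSWORD), alphabet_size)
    print(f"{len(PASSWORD)} chars, {alphabet_size} symbols: {bits:.1f} bits")
```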
  • Helpdesk processes: Had a long discussion with a fellow IT manager about: internal (intra) & external (public / web) wikis, information & knowledge management, canned helpdesk responses, an automated first response informing about ongoing issues, etc.
    1. Inform customers immediately (automatically) about current known issues
    2. Offer them FAQ / help, but if they ignore it, allow them to enter a ticket subject
    3. Scan the knowledge base for answers to that subject
    4. Suggest a few best matching articles to the user
    5. Allow them to leave the actual ticket
    6. Show it to the helpdesk guy, with pre-selected canned responses if those would be suitable for the question.
    7. Always keep the customer posted about the ticket status
    8. Preferably, especially in the B2B sector, a ticket isn't closed until the customer says so. If a ticket is stale for a while, the system should automatically send a message to the customer asking if the issue has been resolved or not. A ticket really can't be closed "by default" after giving some semi-random answer to the customer.
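    Steps 3 and 4 can be sketched with something as simple as counting shared words between the ticket subject and each article. This is only an illustration with made-up names and data; a real helpdesk would likely use a proper full-text search.

```python
def suggest_articles(subject: str, knowledge_base: dict[str, str], limit: int = 3) -> list[str]:
    """Rank article titles by how many subject words appear in the article."""
    words = {w for w in subject.lower().split() if len(w) > 2}
    scored = []
    for title, body in knowledge_base.items():
        text = set((title + " " + body).lower().split())
        score = len(words & text)
        if score:
            scored.append((score, title))
    # Best score first, title as a stable tie-breaker.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


kb = {
    "Printer offline": "printer does not print receipt",
    "Reset password": "how to reset your password",
}
print(suggest_articles("receipt printer broken", kb))  # ['Printer offline']
```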
  • Checked out a few cloud / mobile POS providers: Vend HQ, Imonggo, AirPos, Kachng! (Torex), POS Lavu, Posterita Cloud POS, SquashPOS, EffortlessE, MerchantOS, ShopKeepPos, LivePOS and wrote a (private) study about those.
  • Added "An Introduction to Programming In Go" by Caleb Doxsey to Kindle.
  • Some very old stuff just for fun. I reverse compiled one PCBoard script and made some funny modifications to it. The administrators never found out about those changes. ;) Maybe people should checksum files, so they know if those files are still the files they think those are. 
  • Game / Java reverse engineering (or more like reverse/de-compiling). Well, I liked that PCBoard reverse engineering. Another nice story was reverse engineering one Java applet game. I decompiled the code and added my own class which contained hooks to key parts of the game. After that I could simply call any function inside the game whenever I wanted to. After adding some additional control code and timers, I could enable my player to receive points at will. My speech bubbles also had scroller features. (Just sent the message with an offset about 5 times / second.) One of the funny parts of this was that the game protocol wasn't too optimal. All buffers of the game server, and especially of users with slower connections, got absolutely flooded. So what if I receive one point every ms? - Well, that was fun as long as it lasted. The game developer added some checks to the protocol to prevent this. Oops, natural fail. He had problems with some users who reverse engineered the messaging protocol. But I didn't do that, because I decompiled the whole program. One of those friends who wrote their own client used something like 15 minutes to add the checksum code to their program, and off they went. I was even quicker, because my own class actually extended the game class. So I didn't need to make any changes, just download the new game class locally and launch the game again using it. Well, I clearly had too much free time back then. 
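    A small sketch of the checksum idea using SHA-256 from the standard library. The helper names are made up; the baseline dictionary is assumed to have been recorded while the files were known to be good.

```python
import hashlib


def file_sha256(path: str) -> str:
    """Hash a file in blocks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def changed_files(baseline: dict[str, str]) -> list[str]:
    """baseline maps path -> digest recorded earlier; returns paths that differ."""
    return [path for path, digest in baseline.items() if file_sha256(path) != digest]
```

    The baseline itself has to be stored somewhere the attacker can't modify, otherwise it can be updated along with the files.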
  • Quora fail: I have been wondering about this trend to app that, app it and actually app everything. I really don't get the point of millions of (more or less useless) mobile apps.
    Today I really got slammed right into my face. Quora doesn't work with mobile browser, they require you to install their freaking Quora-app. This is simply getting ridiculous!
    I remember some sites that worked perfectly with desktop computer, but if you used mobile they required to use their SMS service or something similar. This is at least as bad trend. All the benefits of using browser as platform are totally lost. What do you think about it? Do you love installing App instead of using web browser for every freaking ridiculous site you have ever visited? I don't like it, I really really hate it. I don't want to install any crappy spy/malware apps. I just simply want to use their website. Or in this case, I don't want to use their services anymore at all. Do you know any others sites which are as ridiculous as Quora? No this is not +1 for Quora, this is -1 for Quora and +1 for StackExchange.
    When we think a bit further: It's simply bad for usability, but it's even worse when we start thinking about security! If every person using mobile learns to install whatever app is pushed by whatever website, it's going to be a bad security & privacy problem really soon.
    p.s. Yes, I do know if I change the browser signature to Linux or Windows desktop browser, I can perfectly well use the site. Which makes my point of using crappy (unnecessary) app simply even more valid.
  • Blog FTP/FTPS/SFTP/SCP/NFS/SMB based integration best practices. I try to be very compact:
    1. Use temporary files and paths if possible when generating a file.
    2. Move ready files to the target path / rename files to the final name.
    3. Always check the file size after any transfer. If reasonably possible, check the checksum too.
    4. Retry if required.
    5. Always process independent files / transactions, do not do "random batching".
    6. If you don't have the luxury of using temp paths and file names, use file locking properly. If even that isn't possible, have a proper EOF flag in the file, so incomplete files won't get processed for sure.
    7. If a file wasn't properly transferred (for any reason), resend the data.
    8. Think of this process as a transaction: what has to happen (with positive confirmation) before you can proceed to the next step.
    These rules are very simple, but you guys won't believe how often programmers fail even with these super simple rules.
    Same transactional basics can be and need to be utilized with more modern methods. Do not mark something as completed, before it's really done.
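    The first rules (temp file, verify, then rename to the final name) can be sketched for a local or mounted path like this. The function name is made up, and it assumes the reader ignores *.tmp files and that the temp file and the target are on the same filesystem, so the rename is atomic.

```python
import hashlib
import os
import tempfile


def publish_file(data: bytes, target_path: str) -> None:
    """Write data to a temp file next to the target, verify it, then rename."""
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # Positive confirmation before the file becomes visible to the reader.
        if os.path.getsize(temp_path) != len(data):
            raise IOError("size mismatch after write")
        with open(temp_path, "rb") as handle:
            if hashlib.sha256(handle.read()).digest() != hashlib.sha256(data).digest():
                raise IOError("checksum mismatch after write")
        os.replace(temp_path, target_path)  # atomic on the same filesystem
    except BaseException:
        # Never leave half-written temp files behind; the caller may retry.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

    With FTP or SFTP the same pattern applies: upload under a temporary name, check the remote size, and only then rename to the final name.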
  • An example of how to fail at what I just mentioned above. Generate a list of files to be processed, let's say "files*", then process all of those files. When done, generate the list of files "files*" again and delete those. Well, isn't it reasonable? We processed files* and now we delete files*. Well, fail. What if files were added to that path during your processing? Those get deleted without ever being processed. Yes, I have seen that in live production. What a fail.
    It's funny that engineers fail that test, but small children do not. If I put some gummy bears on the table and let the kid start eating those, and during the process I add 10 more gummy bears on the table, then when he has eaten the original bears I ask him if he has now eaten all the bears. Do you really think that the kid would say yes, I have? Nope.
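    The fix is to list the files once and delete exactly the files that were processed. A minimal sketch, with a made-up function name and a caller-supplied process callback:

```python
import glob
import os


def process_and_delete(pattern: str, process) -> list[str]:
    """Process matching files and delete exactly those files, nothing newer."""
    snapshot = sorted(glob.glob(pattern))  # list the files only once
    for path in snapshot:
        process(path)
        os.remove(path)  # delete only what was actually processed
    return snapshot
```

    Files that arrive while the loop is running are not in the snapshot, so they are left alone and picked up on the next run.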
  • I complained about Filezilla's poor FTPS performance a few months ago. Guess what, they have released fix for that. Excellent! Now everything is working as expected.
  • I did some security auditing for one company. They had installed most of their systems using the default username and password, and the servers were directly accessible from the internet. This is incredible, are we still living in security through obscurity times? I thought this was old news even in year 2000, but it's still happening.
  • Other non-IT tech stuff: space launch using a ram accelerator and non-rocket spacelaunch. Afaik the ram accelerator is a super high tech and cool solution, still being viable when we have well working scramjets.
  • Uber Taxi is an interesting technological development in the taxi sector. Instead of really expensive taxi systems, they replace everything with a modern mobile phone. This is technological revolution at its best.
  • A really nice post about Python dictionary basics.
  • When I read the news about thousands of SCADA devices being directly accessible over the internet, my first thought was: Well, they've got many honeypots. - I really hope I'm right with this thought.
  • Seeding data with fake entries, so it's easy to spot if information has leaked. It's nothing new, Canary Trap is just more advanced method than Mountweazel.
    As example I personally provide unique email address to every service I ever give email address to. It's very easy to see, if they leak it. Most disturbing case I have this far encountered is receiving spam to email address only given to one investment banking company. I'm 100% sure I haven't ever given it to any other site. This means that either my server was exploited, they leaked the addresses on purpose, or someone stole their customer base with email addresses.
    I can recommend studying the field of counterintelligence in general, there are lots of interesting methods and stories.
    See: False document
  • I know my signing key (1024D/274EF626 - 1024 bit DSA) is not up to the recommended strength. I have said earlier that I'm waiting for the ECC upgrade and will then start using it. Otherwise I recommend using 4096 bit RSA encryption and 4096 bit RSA signing keys.
  • How to enable HTTPS secure encrypted SSL connections for LinkedIn:
    Click your Own name -> Setting -> Account -> Manage security settings -> When possible, use a secure connection (https) to browse LinkedIn - Check -> Save changes.
    So if you haven't done it yet, do it now.
  • Well, I still hear people talking about IPv6 NAT. No no no, it will reduce usability of features like power saving. Keepalive traffic is absolutely pointless. When a TCP/IP stream is open, it's open; it shouldn't require any keepalives. Sending keepalive packets every 30 seconds or so just consumes power on mobile devices. When data is sent over the connection and the remote end isn't there, the connection will die and that's it. Most systems default the keepalive time to two hours, while most NAT devices time out idle connections much sooner. Also the benefits of global reachability are lost.
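    For illustration, this is roughly what an application ends up doing to keep a connection alive through a NAT device. SO_KEEPALIVE is portable; TCP_KEEPIDLE is a Linux-specific option, so it's guarded. The helper name and the 30 second value are just examples.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_seconds: int = 30) -> None:
    """Turn on TCP keepalive; tune the idle time where the platform allows it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):  # not available on every platform
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
```

    Every such probe wakes up the radio on a mobile device, which is exactly the power cost that end-to-end reachability without NAT would avoid.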
  • Brython (1.0.20130111-000752)
    print(int(0.5)) -> 0
    print(round(0.5)) -> 1.0
    Python 3.2 (Win64 bit r32:88445)
    print(int(0.5)) -> 0
    print(round(0.5)) -> 0
    Ouch! These kinds of differences can make apps behave in really surprising ways. Python 3's round() rounds ties to the nearest even value, which is why it returns 0 here. Well, as we know, naturally the floating point implementation being used affects this issue too.
    Reminds me of: Write once, test everywhere - the Java approach.
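    If the rounding result matters, one option is to not rely on the runtime's round() at all. A sketch using the standard decimal module in CPython 3 (the helper name is made up; whether Brython supports decimal is not something I have checked):

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value: float) -> int:
    """Round to the nearest integer, ties away from zero, on any runtime."""
    return int(Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


print(round(0.5), round(1.5), round(2.5))  # Python 3: 0 2 2 (ties go to even)
print(round_half_up(0.5), round_half_up(1.5), round_half_up(2.5))  # 1 2 3
```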
  • Spent one day studying Windows PowerShell. Haven't been using it too much. Usually I have cmd file which calls my Python scripts (which might call for os.system for some sub functions), but PS can be really handy directly for many things.
  • Read an article about Dart.
  • Thought more about multi-core vs many-core thinking. Yes, with multi-core something like pipelining process steps could work well, but with many-core it's not a viable option anymore. Like I have said earlier, I often use process pools and simply chunk the data to be processed into slices. Naturally this won't work with all kinds of workloads, but for me it has worked well this far.
  • A temporary email forwarding service lacks a reverse DNS name for their outgoing SMTP mail server. - Fail. This is easy to fix, but they simply haven't done it.
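    A minimal sketch of the slice-and-pool approach using concurrent.futures from the standard library. The workload (sum of squares) and all names are placeholders for the example.

```python
from concurrent.futures import ProcessPoolExecutor


def chunked(items: list, size: int) -> list[list]:
    """Split items into consecutive slices of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def work(chunk: list[int]) -> int:
    # Placeholder workload; replace with the real per-slice processing.
    return sum(n * n for n in chunk)


def process_in_parallel(items: list[int], chunk_size: int, workers: int = 4) -> int:
    """Hand each slice to a worker process and combine the partial results."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(work, chunked(items, chunk_size)))
```

    On platforms that start workers with spawn (Windows, macOS), process_in_parallel should be called from under an `if __name__ == "__main__":` guard. Slices scale with the number of cores without any pipeline stages to rebalance, which is why this works for many-core too.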
  • I would have liked to write a longer article about national relations and the locations of national domains' DNS servers. But I don't have time for that analysis right now. I have been doing quite many checks and I know there are political ties. Just check out where the DNS servers are for .fr, .de, .nl, .ru, .jp, .ch, .tw etc. It's interesting. Why doesn't Taiwan have their top DNS servers in China? Why aren't .ru DNS servers in the US? Why are all .eu DNS servers in the EU area? etc... It could be interesting to make a complete political world analysis based only on this information.
  • Quickly studied basics of GlusterFS.
  • CipherSaber is a very simple but working cipher implementation. Assuming you have pre-existing RC4 cipher code.
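    To show how little code is needed: CipherSaber-1 is RC4 keyed with the passphrase followed by a random 10-byte IV, and the IV is sent in front of the ciphertext. A sketch with made-up function names; RC4 is considered broken by today's standards, so this is for studying the idea, not for protecting real data.

```python
import os


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def ciphersaber_encrypt(passphrase: bytes, plaintext: bytes) -> bytes:
    iv = os.urandom(10)  # fresh 10-byte IV for every message
    return iv + rc4(passphrase + iv, plaintext)


def ciphersaber_decrypt(passphrase: bytes, message: bytes) -> bytes:
    iv, ciphertext = message[:10], message[10:]
    return rc4(passphrase + iv, ciphertext)
```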
  • For a change studied water turbines, wind turbines, and ship propulsion systems designs for a while.
  • WebSockets will allow an easy way to offer VPN services over HTTPS. I guess this will end silly politics to think that using some services can be blocked based on IP or TCP port number.
  • There has been some news about new covert messaging applications. As mentioned earlier, any service which allows you to store key value data can be used as a proxy to deliver data. So all kinds of DHT networks, DNS services etc, whatever, can be used as a data relay. As long as "arbitrary" values can be sent over the connection. Even if the values are very limited, using a large number of keys or updating the value for a key often enough allows data transmission, just at a lower data rate. Some people still do not understand that allowing even plain DNS usage allows me to communicate exactly anything in and out of their network. 
  • I started to write an article about Bandwidth Hog. It's a UDP protocol that is designed for high packet loss / high latency networks. Retransmission is simply super aggressive and there isn't any kind of window limiting data transmission. Actually the HS/Link protocol worked like this. It used an infinite window, and packets that got lost were sent again later. This allowed the protocol to always keep sending data out at a defined data rate. Some data got lost, some got through, so what? The lost parts were sent again later. This would allow the protocol to "steal" bandwidth from applications using TCP connections, which slow transmission down in these situations. For this protocol it doesn't make any difference if there is 30% packet loss and 30s (yes, seconds, not milliseconds) latency on the link. I have been thinking about this for a long time, especially when using congested slow shared connections. Basically the same kind of results could be achieved using a TCP connection splitter, where let's say 300-1000 TCP connections are used in parallel to transmit data over the link. - Well, I think the key point was in this post already, so I won't bother writing any longer story about this. Because some limits are always required, those could be things like packet loss or latency. Let's say that the protocol is tuned so that it transmits data faster until a 20% loss ratio is reached. This would make it way faster than parallel TCP connections using exactly the same link / route.
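    To make the DNS point concrete, here is a sketch of how arbitrary bytes fit into ordinary looking host names. It only does the encoding and decoding; the domain tunnel.example, the name layout and the function names are made up for illustration. Each label stays under the 63 character DNS limit.

```python
import base64


def encode_queries(data: bytes, domain: str = "tunnel.example", chunk: int = 30) -> list[str]:
    """Pack data into DNS names of the form <seq>.<base32 payload>.<domain>."""
    names = []
    for seq, start in enumerate(range(0, len(data), chunk)):
        piece = data[start:start + chunk]
        label = base64.b32encode(piece).decode().rstrip("=").lower()
        names.append(f"{seq}.{label}.{domain}")
    return names


def decode_queries(names: list[str]) -> bytes:
    """Reassemble the payload, tolerating names that arrive out of order."""
    parts = []
    for name in names:
        seq, label = name.split(".")[:2]
        padded = label.upper() + "=" * (-len(label) % 8)
        parts.append((int(seq), base64.b32decode(padded)))
    return b"".join(payload for _, payload in sorted(parts))
```

    Whoever runs the authoritative name server for the domain sees every queried name, so a resolver that forwards these lookups is already relaying the data.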
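    A toy simulation of the "no window, just resend later" idea. It ignores latency and acknowledgement traffic and is not how HS/Link was actually implemented; names are made up. It only shows that with loss rate p the sender needs about n / (1 - p) transmissions for n packets while never pausing, so 30% loss costs roughly 1.43 times the data, not a stalled transfer.

```python
import random
from collections import deque


def simulate_transfer(packet_count: int, loss_rate: float, rng: random.Random) -> int:
    """Return how many transmissions it takes to deliver every packet once."""
    queue = deque(range(packet_count))
    transmissions = 0
    while queue:
        packet = queue.popleft()
        transmissions += 1  # the sender never pauses: one packet per tick
        if rng.random() < loss_rate:
            queue.append(packet)  # lost: simply schedule it again later
    return transmissions


sent = simulate_transfer(10_000, 0.30, random.Random(1))
print(f"{sent} transmissions for 10000 packets ({sent / 10_000:.2f}x)")
```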
  • Well, this was mostly new stuff. I couldn't manage to shorten my backlog at all. Maybe some other week I'll get that done.