Disaster Recovery, Surveillance, Privacy, Data Protection, Database

  • While writing some temporary code I had an issue with readlines() leaving the annoying trailing '\n' on every line in the list. It's simply better to use .read().splitlines(), so you don't have to loop through the lines separately and rstrip('\n') each one. Interestingly, rstrip() also removed whitespace that I didn't expect it to touch: I expected ' A B '.rstrip('B ') to return ' A ', but it actually returns ' A'. The reason is that rstrip() treats its argument as a set of characters to strip, not as a suffix, so it keeps removing trailing 'B' and space characters until it hits something else. Good to know, because in this specific case the trailing whitespace was and is essential due to specification requirements. I could easily have stepped on that mine without doing these checks. Of course the data is validated before it's actually sent to output. But it's yet another good example that you have to KNOW the system; ASSUMPTIONS won't lead to success.
  • Had fun with a massive backup restoration rehearsal and integrity testing. This time there was an enjoyably small number of alarming findings. All good. That's the way I like it.
  • Had a very long discussion with friends about the new surveillance and intelligence law and what kind of effects it might have: "Technical Guideline for the implementation of legal measures for the surveillance of telecommunications and the disclosure of information". Often things like these are kept under wraps with all kinds of gag orders and NDAs, so it's very nice that German friends share this document publicly and everyone can check what the requirements are. As for the NSA documents, it's probably true that the NSA has direct access to many cloud services. Nobody needs to hack the services, because they provide a ready-made API for law enforcement to access any and all information they want. But as stated, that shouldn't really matter too much, because probably nobody puts any highly confidential information on Facebook anyway. I've often been very worried about the fact that many people seem to send highly confidential information by email without proper protection, or without any protection at all. The only good thing is that most people don't need to care about these things at all.
  • Of course, in Finland too, that kind of information is usually classified: the right of possession is restricted, and publicity and publishing are prohibited under a strict non-disclosure obligation / gag order, without expiry.
  • Internet abuse, all kinds of illegal content, etc. are of course a major burden for service providers such as discussion forum hosts. How do you get rid of content which is illegal, and who makes the decisions? How much of the process can be automated? How do you verify that data requests are authoritative? It's very common to get messages from someone posing as an authority on some matter, when in reality they're private investigators or other people trying to claim authority, using all kinds of trickery to get the (incompetent) administration to hand out information they shouldn't actually get. The only sure way to deal with this is to honor only requests from local authorities whose identity and authority are easy to verify. Random requests should simply be ignored, because giving out any information would also break several privacy laws. There are official and secure communication channels that can be used, and of course the documentation related to the request needs to be delivered. One way to get around this is to keep as little information about users as possible. Then there's not much to give out.
  • For some database structure issues, some people suggested that I could use databases which support JSON. Sure, but you'll still have to define which content gets indexed, how, and so on. The generic database / integration issue is that there's no silver bullet for easily turning complex data structures into some other format. Of course you can use a JSON to SQL code generator or things like that, but often that still doesn't work. Logical processing and data extraction with if rules and mappings is usually required. Good for me, because this is my daily job. The good thing is that XML, JSON and object notations in general make building complex data structures quite easy.
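
To make the JSON point a bit more concrete, here is a minimal sketch of what "if rules and mappings" tends to look like in practice. Everything in it is an assumption made up for illustration: the order document, the field names, the two mapping rules and the table layout are hypothetical, and SQLite is used only because it ships with Python. The point is just that someone still has to decide how the nested structure flattens into rows and which column gets an index.

```python
import json
import sqlite3

# Hypothetical input document; all field names are invented for illustration.
RAW = """
{"order_id": 17, "customer": {"name": "Alice", "vat_id": null},
 "lines": [{"sku": "A-1", "qty": 2, "price": "9.90"},
           {"sku": "B-7", "qty": 1, "price": "120.00", "discount_pct": 10}]}
"""

def order_to_rows(doc):
    """Flatten one nested order into (order_row, line_rows) using explicit rules."""
    customer = doc.get("customer") or {}
    # Mapping rule (assumed): a missing or null VAT id means a consumer sale.
    customer_type = "business" if customer.get("vat_id") else "consumer"
    order_row = (doc["order_id"], customer.get("name", ""), customer_type)
    line_rows = []
    for pos, line in enumerate(doc.get("lines", []), start=1):
        # Prices arrive as strings; store integer cents to avoid float rounding.
        whole, _, frac = line["price"].partition(".")
        unit_cents = int(whole) * 100 + int((frac + "00")[:2])
        # Mapping rule (assumed): the discount is optional and defaults to zero.
        discount = line.get("discount_pct", 0)
        line_rows.append(
            (doc["order_id"], pos, line["sku"], line["qty"], unit_cents, discount)
        )
    return order_row, line_rows

def load(conn, doc):
    """Create the (hypothetical) tables if needed and insert one order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, customer_type TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_lines "
        "(order_id INTEGER, pos INTEGER, sku TEXT, qty INTEGER, "
        "unit_cents INTEGER, discount_pct INTEGER)"
    )
    # Which column gets indexed is a deliberate decision, not something
    # the storage format works out on its own.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_lines_sku ON order_lines (sku)")
    order_row, line_rows = order_to_rows(doc)
    conn.execute("INSERT INTO orders VALUES (?, ?, ?)", order_row)
    conn.executemany("INSERT INTO order_lines VALUES (?, ?, ?, ?, ?, ?)", line_rows)

conn = sqlite3.connect(":memory:")
load(conn, json.loads(RAW))
```

A code generator could produce the CREATE TABLE statements, but the two mapping rules and the price conversion are exactly the kind of logic that has to come from knowing the data.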
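
Going back to the readlines() / rstrip() note at the top of this list, the behaviour is easy to check with a few lines. This is only a quick sketch: the sample data and the helper name read_records are made up, and removesuffix() needs Python 3.9 or newer.

```python
import os
import tempfile

# Hypothetical sample data: the trailing spaces are significant padding.
SAMPLE = "A B \nC D  \nlast line without newline"

def read_records(path):
    """Return the lines of a text file without line endings, keeping other whitespace."""
    with open(path, encoding="utf-8") as handle:
        # splitlines() drops the line endings but leaves trailing spaces alone.
        # Note that it also splits on a few other boundary characters
        # (for example form feed); use split("\n") if the data may contain those.
        return handle.read().splitlines()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(SAMPLE)
    with open(path, encoding="utf-8") as handle:
        raw_lines = handle.readlines()  # every line but the last keeps its '\n'
    records = read_records(path)

print(raw_lines)
print(records)

# rstrip() takes a set of characters, not a suffix.
print(repr(" A B ".rstrip("B ")))        # strips trailing 'B' and ' ' characters
print(repr(" A B ".rstrip("\n")))        # only newlines are stripped, spaces stay
print(repr(" A B ".removesuffix("B ")))  # Python 3.9+: removes the exact suffix once
```

If the goal is to remove one exact suffix, removesuffix() does what I originally expected rstrip() to do.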