My personal blog about stuff I do, like and am interested in. If you have any questions, feel free to mail me! My views and opinions are naturally my own and do not represent anyone else or any other organization.
- A few tricks for handling things more efficiently using PostgreSQL and its arrays.
- YouTube starts using the VP9 video codec to improve video quality and save a ton of bandwidth.
- It seems that during holidays there are tons of attacks against public Internet-facing servers. I guess the 'bad guys' know that if they hack the servers during holidays, probably nobody is going to do anything to fix the situation for several days. Especially if they do it so that they don't disturb the servers' normal operation and just run their own additional tasks with low priority. Nobody notices or cares at all.
- Configured ALL servers to use IPv6 and configured the corporate network to use IPv6 as well. The servers, software and firewalls were easy to configure, but the corporate network configuration took some pain, because there was a different router / firewall which I hadn't had to deal with earlier. Had to use a similar test system to test and troubleshoot everything before moving the working configuration into production. It took quite a while to get everything confirmed, cross-checked and so on. Also had fun with DUID fields, SLAAC, DHCPv6 and dhcpv6-slaac-problem by IETF. But it was worth it for the fun I got while doing it.
- All of my servers are now using only SHA-256 based SSL certificates. Yes including the full certificate chain, so there aren't any intermediate SHA-1 hashes. But the funny thing is that Google says that SHA-1 is obsolete cryptography technology. Yet they're using it all the time. Like in case of Gmail's certificates.
- I just can't stop loving projects which are an absolute mess and extremely badly documented. Yeah, you can get things working if you are brave enough to think and go through all possible configuration options as well as read the source code, if it's available. It's just so annoying. But well, things will get done when you just put enough effort into it. Sometimes some key information is assumed, and if you don't know what it is, you're pretty much failing hard for a long time.
- Added my server's IPv6 address to the DNSWL whitelist to ensure email deliverability.
- Studied tons of stuff about DHCPv6 and Router Advertisement flags (M, R, O). I had to, because I'm planning to use it. Yet it doesn't seem to deliver (with ZyWall) some of the key benefits I assumed I would be getting, like knowing who's using which IP address and when. Unless full manual configuration is being used with DHCPv6 DUID.
- Microsoft "Docker" style servers, aka containerization with minimal overhead. They call it Nano Server on the Azure platform.
- Finally, after a weekend, it seems that systems are now getting 100% correct IPv6 configuration, including Windows & Linux systems as well as mobile devices. That's just awesome. Now everything is 100% dual stack, allowing IPv4 and / or IPv6 traffic. Also, many services which earlier used NAT or port redirection are now directly reachable. It's better than using constant port mappings, or a horrible in-protocol ALG in some cases (FTP). Only one workstation is failing, which is the workstation I've been using to test everything. So there's some kind of configuration issue somewhere. Let's see how this plays out.
- It's good to disable ISATAP, 6to4 and Teredo, which are all enabled in the default Windows configuration. Just enter these commands in an elevated shell:
netsh int ipv6 isatap set state disabled
netsh int ipv6 6to4 set state disabled
netsh interface teredo set state disabled
- I liked this approach mentioned in one blog: About Startups - Show don't tell. "I'm going to build this amazing thing" is a LOT less interesting than "I've built this slightly crappy thing that actually does something". EVERYONE is GOING to build something, most people never do...
- Amazon EFS did look interesting, yet it's extremely light on details, which really do matter in these cases. It's just so easy to make bold bogus claims like "low latency" or "high IOPS". Those are very relative terms.
- Tons of configuration work with DNS and IPv6 stuff. But it starts to be pretty much done now. Phew! Now even the visitor WiFi network provides full IPv6 connectivity, allowing both options: SLAAC + DHCPv6.
- Checked out Call for bids - When wondering what kind of new features OpenBazaar could implement.
- China's Great Cannon - New stuff? I think the attacks mentioned here have been known for well over 20 years. Shocking news, right? It was in '95 when I was at the office doing IP networking, and it was already trivial back then to monitor and modify packets, messages and content on the fly. Yet people still don't seem to realize that email and other stuff are just "post cards" whizzing by, which can be modified at will when and if required.
- Also checked out: Data Analytics using Pandas and SQLite and Python Boltons
My weekly web log dump:
- Read about: 802.3bz (NBASE-T, 2.5 / 5 Gbit/s) and 802.3by (25 Gbit/s), which are faster networking standards for Ethernet.
- Checked out a few discussions about how to update components inside Docker containers to patch vulnerabilities. It seems that there's currently no good way of doing it.
- Yet another trap. I was going mad because I couldn't get peewee to make the query I wanted. Or actually it did run my query, but the results weren't exactly what I was expecting to see. There's a table which can contain five different categories for an article. I wanted to join all articles belonging to the selected category and then sum sales for those. I did the usual join stuff, added a where statement and used the sum function. But alas, the results were way too small (as sales amount), and I could clearly see that the number of result rows wasn't what I was expecting. After jumping through different hoops I found the problem. When I used pure SQL, writing the query took me about 30 seconds. When I used peewee it felt hopeless. After some time I decided I needed to debug this deeper. Since a traditional ORM is just an obfuscation layer over working queries, I pulled out the SQL query that Peewee was creating. And there it was, the trap. When I wrote the SQL query I just used a pure join and then where statements. But Peewee friendly, automatically and seamlessly added JOIN ON. And the ON statement nicely referred to only one of the category fields. I added join(articles, on=( cat1 | cat2 | cat3 | cat4 | cat5 )) and the problem was solved. Uaahh. Once again, pure SQL was so easy compared to the hidden traps of an ORM. Of course that automatic join on can be beneficial at times: if there's a shared foreign key, it's enough to join the tables without additional statements.
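To make the trap above concrete, here is a minimal sketch using plain SQL through Python's built-in sqlite3 module instead of Peewee itself. The schema (category, article, sale) and all column names are made up for illustration, and the "narrow" ON clause only imitates what an automatically generated join on a single category field would look like. It is not the actual SQL Peewee produced.

```python
import sqlite3

# Hypothetical schema: table and column names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE article (
    id INTEGER PRIMARY KEY,
    cat1 INTEGER, cat2 INTEGER, cat3 INTEGER, cat4 INTEGER, cat5 INTEGER
);
CREATE TABLE sale (article_id INTEGER, amount REAL);

INSERT INTO category VALUES (3, 'books'), (7, 'music');
INSERT INTO article VALUES (1, 7, NULL, NULL, NULL, NULL);
INSERT INTO article VALUES (2, 3, 7, NULL, NULL, NULL);
INSERT INTO article VALUES (3, 3, NULL, NULL, NULL, 7);
INSERT INTO sale VALUES (1, 10.0), (2, 20.0), (3, 30.0);
""")

# Join condition on one category field only (the trap).
NARROW_ON = "(a.cat1 = c.id)"
# Join condition covering all five category fields (the fix).
WIDE_ON = "(" + " OR ".join(f"a.cat{n} = c.id" for n in range(1, 6)) + ")"


def sales_for_category(category_id, on_clause):
    """Return (joined row count, summed sales) for one category."""
    sql = f"""
        SELECT COUNT(*), SUM(s.amount)
        FROM category AS c
        JOIN article AS a ON {on_clause}
        JOIN sale AS s ON s.article_id = a.id
        WHERE c.id = ?
    """
    return conn.execute(sql, (category_id,)).fetchone()


narrow_rows, narrow_total = sales_for_category(7, NARROW_ON)
wide_rows, wide_total = sales_for_category(7, WIDE_ON)
print("narrow:", narrow_rows, narrow_total)
print("wide:  ", wide_rows, wide_total)
```

With these sample rows the narrow join only sees the one article that has the category in cat1, while the wide join sees all three, which is the "too few rows, too small sum" symptom described above.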
- I actually did get a reply from Charles Leifer. I haven't yet said thank you very much, because I've been busy with other projects.
- Somehow I understand programmers who create useless extra rows in databases just to make querying much easier. In the case of Peewee, all the trouble starts as soon as the tables being joined do not contain the data being joined on. I've always said that programmers who add pointless data to a database aren't really doing a smart thing. But in this case, everything gets just so much easier if you submit to doing that. Suddenly everything works straight out of the box, instead of having continuous problems with queries. Another incredibly simple but a bit funny way is to run two queries: first get the stuff which can be joined, then additionally fetch the stuff which can't be joined, and then merge and sort it. Or use a union and merge two statements where the second part doesn't contain results from the first part. I've seen that happening over and over again. It's sometimes really funny to see tens of millions of totally useless rows in a database. But those are there to make things simpler. You don't need to handle cases and build complex queries and code to work around missing information, even if the data is redundant and useless. I've seen cases where there are tens of gigabytes of useless data stored in tables just to simplify queries. Now I can see why.
- There's also some bad documentation with Peewee. There's a difference between JOIN.LEFT_OUTER and JOIN_LEFT_OUTER, yet the documentation mixes those up. Likewise fn.Count(), fn.COUNT() and fn.count() aren't the same thing at all.
- UpCloud started to offer IPv6 free of charge for their IaaS servers. I've already configured my servers to fully utilize it.
- Some WiFi thoughts: It depends so much on the environment. I would use only WPA2. DTIM can be 3-10x beacon intervals depending on the use case: for a laptop network I would use 3 and for mobile devices 10. RTS / fragmentation is also very site specific. Sometimes smaller values bring better results, but generally RTS can be disabled and fragmentation can be off (maximum frame size as threshold). In congested areas a smaller fragmentation threshold + RTS can bring better results. If that even matters; in most cases it doesn't. Depending on device quality, auto channel can be preferred.
- Tested dedicated cloud SSD ARM servers from ScaleWay. - Liked it, excellent performance / price ratio. Yet the storage is virtual, which means it's shared. So even if the server is dedicated, shared storage can cause "noisy neighbours" performance problems. Their approach is a bit different: "The craziest cloud technology is out! Welcome on board! Say good bye to virtual servers, we have defined the next computing standard: the physical cloud!"
- Tested even more OpenBazaar 0.4 version using several virtual machines. There are some issues, but commits are flowing in at a good pace. 0.3 network seems to be practically dead. I hope release of 0.4 version boosts the network to new heights. Even 0.3 had over 1200 users, mostly testing the network and not actually yet using it for anything. I guess the 0.4 version will reach 10x that easily.
- Something different: T-14 (Armata), Sub-caliber round, Ula Class Submarine
- One test project is currently hosted at OVH. But I do have servers at DigitalOcean, Vultr, Hetzner and UpCloud. I do like OVH for my personal small projects, because it's reliable and dirt cheap. For more business critical stuff I prefer Hetzner. They and soyoustart (OVH) provide crazy performance per buck. Links: www.hetzner.de ovh.com www.soyoustart.com. Also online.net is worth checking out, or if you're looking for plenty of storage space at a cheap price, then kimsufi.com vultr.com. My personal test servers are running at UpCloud. They provide hourly billing and great performance, but at a clearly higher cost (but still considerably cheaper than Amazon AWS, Google Cloud Compute Engine or MS Azure). One pro for services which have active and passive data is UpCloud MaxIOPS storage, which is a combination of RAM, SSD and cheap SATA storage. Data which is updated or often read is cached, and stuff which is rarely accessed rests on SATA. It relieves the developer from dealing with that and still gives an affordable per GB price. Actually, I once built such systems using bcache and dm-cache. But that won't fly when some of the production servers use Windows.
I also love getting good throughput: 2015-04-03 18:34:05 (68.5 MB/s) - ‘1GB-testfile.dat’ saved [1073741824/1073741824]. Just yesterday I tested it out with wget.
I did consider Google App Engine for the front end for a while. But the problem with GAE is that if you get kicked out for some reason, there's no really good alternative platform to run the app on without extensive porting. So for this kind of test project it wasn't really a viable option after all.
- Just checking out GNU Social: what kind of stuff it has that's similar to Local Board (LclBd.com) and what's different. Is this better than Twitter and, if so, how? It's good to learn new stuff all the time.
- Tried GNU Social at Load Average to see how it's different from this and Twitter. Well, there are plenty of similar projects, like CupCake. Users are free to select which ones to use. Others provide better privacy and features than Twitter. With the largest networks there's a big problem that those are being tightly monitored, which doesn't necessarily apply to smaller networks like CupCake, GNU Social or this Local Board (lclbd). My Load Average profile and my CupCake.io profile. Also tried the latest version of Diaspora to see what they've come up with. My Diaspora profile. To be honest, it looks good and again much better than Ello. Finally, my Ello.co profile.
- Finished reading again the latest issues of the Economist (I just love that stuff) as well as Finnish CTO and System Integrator magazines. Long articles about transmedia, where the same product is making money in multiple fields. I guess Angry Birds is quite a good example of that. Kw: concrete experience (feeling), reflective observation (watching), abstract conceptualisation (thinking), active experimentation (doing); diverging (feel and watch), assimilating (think and watch), converging (think and do), accommodating (feel and do); perception continuum, processing continuum. Problem and project based learning. Learning & Awareness. The Economist also had a long article about Data Protection and how the rules differ in the US and Europe.
- Estonia's e-residency program expands abroad; the official strong digital identity can now be applied for from 34 countries.
- Once again wondered how multithreaded par2 can somehow hog the system so badly. I guess it's related to its disk IO somehow. After the actual massive, CPU & memory intensive Reed-Solomon matrix computation starts, the system runs fine again.
- Checked out IPFS - Content Addressable Storage (CAS) / Content Based Addressing (CBA) is nothing new at all. - Content Centric Networking - Named Data Networking - A lot of the IPFS talks are quite off topic; they don't describe the project well, it's just generic promotional marketing blah blah. Many of the related facts are totally hidden under this marketing hype. I made a separate post about this IPFS topic. Sorry, posts are again out of order. I often queue stuff in the backlog and release it out of order. Some stuff can be just logged in the yearly mega dump.
- Got bored with how badly the Thunderbird networking stuff is written. At times it just hangs and requires a restart of the whole system. It really annoys me. I've checked that the claimed server problem is a pure lie and it's just the internal state of the app that sucks, because I can access the same resources using other applications and other networks without any problems. Only Thunderbird fails hard. Of course, after rebooting the workstation, the issues "at the server" miraculously get fixed. Yeah, right.
- Why was it earlier recommended to shard images into multiple pieces, split the site over multiple domains and so on, when now with HTTP/2 a single TCP stream is preferred? That's kind of strange...
- For one project which handles tons of messages asynchronously I've implemented a "replay solution", which is excellent for testing and development as well as for situations where the database needs to be reconstructed. It actually follows the Apache Samza thoughts quite well. All messages are stored as they are when received from the network into data feed storage, and then only local "derived" data is processed from that for end users. When something needs to be fixed, tested, developed or changed, I just make the changes and replay the whole feed storage into the system. At that time it's easy to see if everything goes through well and if any exceptions are raised. And if some messages are incorrectly handled for one reason or another, it doesn't matter. I'll just fix the code and run the replay again. So handy. This also allows fully stateless and asynchronous processing: there's technically no correspondence between other parts of the program and the receiver / handler module. No state needs to be maintained whatsoever, so I'm using a fully stateless message passing implementation.
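The replay idea above can be sketched roughly like this. It's a minimal illustration, not the code from the actual project: the FeedStore class, the JSON message format and the account / amount fields are all assumptions made up for the example. The point is only that raw messages are kept untouched and the derived data can be rebuilt from scratch at any time.

```python
import json


class FeedStore:
    """Append-only storage of raw messages, exactly as received (hypothetical)."""

    def __init__(self):
        self._raw = []

    def append(self, raw):
        self._raw.append(raw)

    def __iter__(self):
        return iter(self._raw)


def handle(derived, raw):
    """Handler under development: updates derived data from one raw message."""
    msg = json.loads(raw)
    derived[msg["account"]] = derived.get(msg["account"], 0) + msg["amount"]


def replay(feed, handler):
    """Rebuild all derived data from the feed, collecting failures."""
    derived = {}
    errors = []
    for position, raw in enumerate(feed):
        try:
            handler(derived, raw)
        except Exception as exc:  # keep going, report what failed
            errors.append((position, repr(exc)))
    return derived, errors


feed = FeedStore()
feed.append('{"account": "a", "amount": 10}')
feed.append('{"account": "b", "amount": 7}')
feed.append('garbage from the network')
feed.append('{"account": "a", "amount": 5}')

derived, errors = replay(feed, handle)
print(derived)
print(errors)
```

If a message turns out to be handled incorrectly, only the handler changes. The feed stays as it is, and running replay again gives the corrected derived data.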
- I don't like Skype at all. Its delivery status information is so archaic. It only lets you know if the message you sent was delivered to the "Skype cloud", but it won't tell you if the recipient has received or read the message. Other, newer IM systems handle these things much better!
- One article said that there will be huge demand in Sweden for ICT guys. Especially the Internet of Things and Big Data will add to the need for competent techies. It's also important to know the whole system architecture and integrations well, as well as project management, and things will work out smoothly.
- Quote from a Finnish ICT mag (translated): "More skilled people will be needed in the coming years, especially in information security. With the growth of the Internet of Things and big data, there will also be demand for system architecture experts, among others." - There will be jobs even in the future for ICT guys who are passionate, ready to learn and work hard.
UpCloud*** Windows 2012 R2 configuration ***
configuration. But they explicitly allow only traffic from specified addresses and therefore privacy addressing / extension must be disabled. They did provide instructions on how to get this done, but I found the instructions to be "non-optimal".
*** sysctl.conf additions for Ubuntu ***
# IPv6 additions from UpCloud documentation
net.ipv6.conf.all.use_tempaddr = 0
net.ipv6.conf.default.use_tempaddr = 0
*** /etc/network/interfaces additions ***
iface eth2 inet6 auto
*** Shell ***
sudo ifdown eth2
sudo ifup eth2
*** Verify output and address ***
eth2 Link encap:Ethernet HWaddr aa:aa:aa:80:47:0d
inet6 addr: 2a04:3540:1000:310:a8aa:aaff:fe80:470d/64 Scope:Global
inet6 addr: fe80::a8aa:aaff:fe80:470d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:36 errors:0 dropped:0 overruns:0 frame:0
TX packets:79 errors:0 dropped:0 overruns:0 carrier:0
RX bytes:3868 (3.8 KB) TX bytes:7270 (7.2 KB)
*** What wasn't optimal ***
Well, these settings didn't matter at all, so I didn't set those values. I also don't know if those are really required, at least if you're not planning to set up a router, which I wasn't.
And the settings about tempaddr weren't mentioned at all. I double checked it: without disabling temporary privacy addressing, things just won't work out.
With Windows 2012 R2 servers the guide was all good. Disabling privacy addressing immediately gave the server a static IPv6 address:
netsh interface ipv6 set global randomizeidentifiers=disabled store=active
netsh interface ipv6 set global randomizeidentifiers=disabled store=persistent
netsh interface ipv6 set privacy state=disabled store=active
netsh interface ipv6 set privacy state=disabled store=persistent
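Why does disabling privacy addressing give a predictable address? In the ifconfig output above, the interface part of the global address is simply derived from the MAC address (modified EUI-64): flip one bit in the first byte and put ff:fe in the middle. Here's a small sketch of that derivation. The function name is my own, and it assumes the host really uses EUI-64 based interface identifiers, which is what the output above shows.

```python
import ipaddress


def slaac_address(prefix, mac):
    """Combine a /64 prefix with the modified EUI-64 interface ID of a MAC."""
    network = ipaddress.IPv6Network(prefix)
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return ipaddress.IPv6Address(
        int(network.network_address) + int.from_bytes(interface_id, "big")
    )


# Prefix and MAC address taken from the ifconfig output above.
global_addr = slaac_address("2a04:3540:1000:310::/64", "aa:aa:aa:80:47:0d")
link_local = slaac_address("fe80::/64", "aa:aa:aa:80:47:0d")
print(global_addr)
print(link_local)
```

Both printed addresses match the inet6 lines in the output. With temporary privacy addresses enabled, the host would additionally pick randomized interface identifiers for outgoing traffic, which doesn't fit a provider that only allows traffic from the specified address.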
A few thoughts about: Discussion forums, Message threading, commenting, derailing, tagging, bumping, and so on
I've been thinking about how to thread discussions on Internet discussion forums. I find weak spots in the solutions used by most forums. I think most current web forums get it a bit wrong, and the original Usenet news reference system did it better: it allowed a free tree structure, forming individual threading where any of the posts works as a tree root, and forking discussions into separate threads happened completely automatically. Tree-like threading might make following long and complex discussions which fork a lot actually easier, but it would make following short discussions harder as well as cause UI design challenges.
After considering several factors, I've come to the conclusion that I actually might prefer it over the current comment / thread model which is now being used by lclbd as well as most other web forums.
+ Technically perfect threading including fully automatic forking at any point, no need for moderator to split discussions or to whine that this is off-topic comment
+ No bump limit, threads can go on forever
+ Makes derailing a discussion thread harder, because the derailing actually forms its own thread in a way and doesn't fill the actual discussion with junk. And if there are people who like that derailed discussion, let it be so
+ Moderator doesn't need to care about derailing
+ Users can modify tags during the discussion
+ Discussions which also diverge geographically would work without problems and would be seen by local people wherever the commenter is
+ Discussions which continue for extended periods won't cause problems, like causing annoying 'bumping' of really old discussions with initial posts
+ A bad initial post won't ruin the whole thread, because the rest of the users can start the actual discussion from good comments instead of starting from the bad initial post
+ No reason to lock old discussions like some bad forum platforms do (not mentioning any)
- It might make it harder for a user to read the "whole" discussion, because there's no such traditional clear thread, where this message came first and was then followed by these messages. Or actually there of course is, but it's not all visible in one view.
- Potential challenges in UI design. But I guess I can deal with those. It's not harder than stuff I've already implemented. Maybe I should show 1 or 2 levels of comments so it's easier to decide which discussion fork to follow.
RFC, requesting comments: What do you think about this? Would it be worth it? I think it would provide a clear benefit compared to most currently used systems.
Each message could be a reply to an existing message, and there wouldn't be "threads", "posts" and "comments" anymore. Just new posts which might be about previous posts.
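A rough sketch of what such a data structure could look like: every message is just a post with an optional parent reference, and any post can be used as the root of a view. The names (Post, Board, subtree) are invented for this example and are not the actual LclBd data model. The max_depth parameter illustrates the "show 1 or 2 levels of comments" idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Post:
    post_id: int
    text: str
    parent_id: Optional[int] = None  # None means the post replies to nothing


class Board:
    """Free tree structure: there are only posts and reply references."""

    def __init__(self):
        self.posts: Dict[int, Post] = {}
        self.children: Dict[int, List[int]] = {}

    def add(self, post_id, text, parent_id=None):
        if parent_id is not None and parent_id not in self.posts:
            raise KeyError(f"unknown parent post {parent_id}")
        self.posts[post_id] = Post(post_id, text, parent_id)
        self.children[post_id] = []
        if parent_id is not None:
            self.children[parent_id].append(post_id)

    def subtree(self, root_id, max_depth=None) -> List[Tuple[int, int]]:
        """Depth-first (depth, post_id) pairs, using any post as the root."""
        result = []
        stack = [(0, root_id)]
        while stack:
            depth, post_id = stack.pop()
            result.append((depth, post_id))
            if max_depth is None or depth < max_depth:
                for child in reversed(self.children[post_id]):
                    stack.append((depth + 1, child))
        return result


board = Board()
board.add(1, "Original topic")
board.add(2, "On-topic reply", parent_id=1)
board.add(3, "Derailing reply", parent_id=1)
board.add(4, "More derailing", parent_id=3)
board.add(5, "On-topic follow-up", parent_id=2)

whole_view = board.subtree(1)
derailed_view = board.subtree(3)
shallow_view = board.subtree(1, max_depth=1)
print(whole_view)
print(derailed_view)
print(shallow_view)
```

The derailed branch (posts 3 and 4) is automatically its own discussion when viewed from post 3, without a moderator splitting anything.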
I've also revisited Local Board (LclBd) forum data structures and data access models, trying to build it as flexible and versatile as possible, and simultaneously solving problems presented by all (?) existing solutions.
Thanks go to the friend who had private chats about this topic and helped with all the considerations. I'll actually rewrite some key parts of this site so that the internal data structure gets improved a lot. There will be "root tags" and "Root locations", but around that everything will be revolving freely. The whole concept has changed a little when I've been trying to make this as versatile and flexible as possible.
A few links to begin with:
IPFS is like running Git which stores all of the objects (files, directories, commits, etc.) in a BitTorrent swarm. The files are accessed using a file hash, so files are automatically integrity verified using a cryptographic hash even when using insecure networks. As a bonus it's possible to browse this network using HTTP (just like Freenet), but even better, you can mount it as a file system. Of course this means that there have to be seeds available for the files. So it's not a hosting or storage solution. Of course popular files will be seeded by multiple people, making those quick and reliable to access. But as we know from eDonkey2000 (ED2K) and other peer to peer (p2p) networks, files die off after a certain period. Sometimes even surprisingly quickly. Of course files remain available as long as clients are still sharing them. So it doesn't matter if the peer / original seed or content publisher goes offline.
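The basic content addressing idea can be shown in a few lines. This is only a toy sketch of the principle, with a made-up ContentStore class and a plain SHA-256 hex digest as the address. Real IPFS uses its own identifier format and splits files into linked blocks, which this doesn't try to imitate.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the address is the hash of the data."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address):
        data = self._blobs[address]
        # Whoever delivered the data, the address itself verifies it.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
address = store.put(b"hello, swarm")
same_address = store.put(b"hello, swarm")
fetched = store.get(address)

# Simulate an untrusted peer handing over modified data.
store._blobs[address] = b"tampered"
try:
    store.get(address)
    tamper_detected = False
except ValueError:
    tamper_detected = True
print(address, tamper_detected)
```

Identical content always gets the same address, and modified content is detected on read, which is why it doesn't matter which peer or network the data came from.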
I think it could be interesting to implement a discussion board like lclbd.com or a social network like Facebook, Twitter, Instagram or 4chan using IPFS. At least delivering all static data would be easy using it, and it would require much less server resources. A completely different question is whether there would be enough peers to support the data structures, or if the server would need to continuously "reseed" the p2p network with data. But that's exactly what happens with BitTorrent whenever there is stuff which is mostly downloaded and rarely seeded. It's possible that there are other seeds than the "original official source", but it's also possible that there aren't. This is one of the reasons why there is support for HTTP seeds. I'm sure this offers interesting possibilities. There have been CDN networks like this already.
This is exactly the kind of direction I assumed The Pirate Bay would have been taking when they announced that they're doing state of the art stuff. But alas, they just published some reverse proxy junk, which they claimed to be the latest cloud stuff, yawn.
It's really worth noting, unless you already acknowledge it, that data added to IPFS is only stored locally until another interested party requests it from you. IPFS itself (the protocol) provides no guarantees of redundancy or remote storage. So the original claims about a "permanent web" are pure bogus, made to make the idea sound important and good.
- Wondered in a meeting how different views can be used to approach same matter. Marketing, Technology, Helpdesk, End Users, Invoicing, Server Management, Software Developers, System Peripherals, Customer Sectors. And so on. Result? Lot of semi great discussion, but very little factual results.
- Something different: G6, PzH-2000, M109 Paladin, B-1 Lancer
- Read: 10 Myths of Enterprise Python
- Checked out: Apache Flink - It surely looks pretty interesting and something I would use if I would need it. I'll keep this one in mind. - Kw: apacheflink, flink, apache, bigdata, hadoop, inmemory, java, cluster, yarn, hdfs, mapreduce, hbase.
- Checked out: Google Cloud Dataflow - Hmm, all github examples in Java, checked out but didn't bother to run. Yet I don't see any use for this in near future.
- Read: Hackers guide to Neural Networks
- Read: Deep Learning vs Machine Learning vs Pattern Recognition - kw: Convolutional Neural Nets, ConvNets Big-data, PaaS, Artificial Intelligence.
- Read: Why does Deep Learning work?
- Plan: I'm going to focus my studies on Data Analytics and Open Data for a while.
- Had some fun, as usual, with different kinds of combination, packaged and recipe articles while doing CRM, BI and ERP integrations. It's one of my favorite topics, because really simple things get so complicated after all. Especially in situations where external stuff gets referenced by parts of the package but must not be visible to the end user and so on. Of course, if everything were done from scratch it wouldn't be so hard. But usually there are pre-existing complex systems and you'll just have to find the simplest possible way to work around the restrictions of these systems while still delivering what's required by the client. That's my daily stuff.
- Took deeper look at Ansible and SaltStack (Salt) to evaluate those again, tried a few things in test environment. Windows Remote Management (WinRM) PowerShell (ps) - kw: configuration management and orchestration tools, playbook, execution module, state module, Python, SLS, YAML, ZeroMQ, SSH, inventory, orchestration, Vagrant, mocking, testing, scripting, modules, events, reactors, JSON, Linux, Windows, Tower, CLI, templates, master, minions, PyDSL, Jinja, desired state configuration (DSC), Powershell.
- Read: Moving away from Puppet, SaltStack or Ansible?
- Read a bit more about Vagrant - Why do they say development environments? Do development and production environments have some kind of difference?
- Lol, one guy said in his blog "I spent my vacation reading computer literature and documentation". That someone is like me. ;)
- Configuration Management (CM), requires quite a bit of server fu, duh. There's just so many dumb ways to fail. Is Ansible or Salt even required, PowerShell is the core of Windows configuration, does DSC do everything required?
- Checked out Azure Regions and Zones as well as Microsoft Azure Regions and Azure RemoteApp. - Again costs are major player with this technology.
- PaaS doesn't relieve you from software maintenance. Technologies which are used in your apps will be shut down and you'll need to migrate your apps to utilize newer solutions. Google App Engine Master/Slave datastore will be shut down on July 6, 2015. Applications must be migrated to use a newer database (NoSQL, HRD or NDB) before that. I think I'll just shut down my old projects because of this. I'm not interested in migrating those. kw: datastore, Paxos, eventually consistent.
- Checked out bunch of HTML5 Front-end frameworks. It's horrible, there are just so many options, in hundreds easily.
- 9ox.net finally closed, because Google deprecated and shut down the database it was using.
- Had a long discussion about different webshop platforms with my friend who is working in the field. - Prestashop, Drupal, Magento, Virtuemart, ZenCart, osCommerce, Wordpress and OpenCart.
- Had some discussion about developing electronic purchase receipt formats further.
- Watched Haaga-Helia Future Forum marketing and system integration videos, including the TARU videos, which are about future digitalized and integrated business systems. WhatsApp marketing. XBRL, Real-Time Economy (RTE).
- Enjoyed wondering about the different Microsoft Windows CAL options.
- A Great slideshare post The Emerging Global Web - how Internet is changing the global trade. There are winners and there are losers.
- Read: Building Two-factor Authentication
- JSON API - A standard for building APIs in JSON. - Nothing new at all, but it's nice that there are good examples for people making their first JSON APIs.
- I was asked if I want to join a SIG developing standards for business message formats for Open Data (sorry, no details at this point) and integration APIs. Well, that's most interesting and yes. I'm naturally interested to do that. I just believe my approach to many problems is really pragmatic. Let's see how that works out with people who got much more theoretical or academic approach. I guess it's good and brings out differences and can lead to valuable outcome.
- Read: Startup advice briefly by Sam Altman - Too short? Maybe it's better to check out the long version.
- Radical Statements about the Mobile Web - More web vs native apps discussion. I personally don't want to install any apps or crap on my phone or on my computer, unless I know it's absolutely vital software. I always prefer web over native apps in most of cases. But that's just me.
- I got involved in a discussion about whether using 2FA will stop hacking or drastically improve security. Well, it will protect from SOME attack scenarios, but it surely won't stop hacking. Usually site hacking is done by exploiting some vulnerability on the site or system. At that point the attackers can usually steal anything they wish from the system, so using 2FA won't actually protect against that. And if they have truly rooted the system, they can do whatever they want: circumvent password protection measures, access all data, source code, and so on. At that point, I really don't care if you got my password or not, that's the least of my worries. My old password to Slack was: Q-CfK4h1H_bB0mN7PPvD I guess neither you nor anybody else cares about that fact at all. More important than Slack passwords is the data in the system, like if people have given credentials to other systems during chat conversations and so on. From the end user's point of view, if they have rooted your device, 2FA won't help either. They can steal an already authenticated cookie, they can route traffic via your device so the IP won't seem strange, and they can basically do whatever they want. So yes, 2FA protects from some threats, but it really won't protect you from hacking at all. I could go on about this for much longer, but I believe I made the point clear.
- Lightly checked out: Apache Kafka, Apache Samza
- Turning the database inside-out with Apache Samza - Aww, so much talk about things which are obvious to everyone. Replication, caching, indexing, transactions, race conditions, locking issues, materialized views, data transformation, replication stream, transaction log, write-ahead log, immutable facts, immutable events, better data, analytics, fully precomputed caches. It's better to skip straight to the "Let's rethink materialized views!" section. Other keywords: HTML DOM, JSON, CSS, React, Angular, Ember, Functional Reactive Programming (FRP), Elm, Publish, Subscribe, Notify, Request, Response, Meteor, Firebase, Subscribers, RethinkDB, Designing Data-Intensive Applications, stateful stream processing.
- Created: Google+ Brython Users Community
- Studied several posts from Open Knowledge Blog
Here's just the about page from the site, kept as a memory.
9ox.net credits, about, info
Do you hate long short urls like http://bit.ly/hUMuOq
or http://tinyurl.com/5wrvrpp which are short in theory,
but totally impossible to remember!
9ox.net solves your problem. We provide really tiny urls.
We can keep urls short because our urls expire
30 days after last visit or 365 days after creation.
Domain 9ox.net by SpamGourmet
Minimalistic URL shortener by Sami Lehtinen.
Powered by Google App Engine.
Free service, no guarantees or warranties whatsoever.
closed down, because Google closed Master/Slave datastore
2013.12.22 Version 4.10 released, service disabled
Service is disabled for non onion urls. .onion urls do still work.
Because I simply don't have time to maintain this
project. Deal with potential abuse etc. Otherwise I have been very
happy with Google App Engine and its reliability and performance.
It was good as long as it lasted. - Thank you.
2013.02.16 Version 4.9 released
Minor internal restructuring & HTML structure improvements.
2012.12.01 Version 4.8 released
Added support for .onion domains.
2012.10.13 Version 4.7 released
Updated DNS proxies (IPv6 support dropped temporarily).
Now two primary DNS servers are hosted on own servers.
Only two backups are located on web hosting services.
Primary servers are in EU & US and backup servers in US.
Failed server information is cached for one hour for
2011.12.18 Version 4.6 released
Google Safe Browsing API implemented.
Internal Memcaching improved.
2011.12.03 Version 4.5 released
Unicode URLs are now working perfectly.
Further improved caching. Now all redirects are public
and cacheable for 1 day.
2011.11.05 Version 4.4 released
Lot of tuning with Unicode urls, solution still isn't
perfect but it's working most of the time.
Added second primary web-DNS service server. It's hosted on
another of my virtual servers, using uWSGI and Nginx.
2011.10.30 Version 4.3 released
Added two more backup DNS servers unless primary server
is unreachable or times out. Now service also supports
sites (domains) which have only AAAA DNS records. (IPv6)
2011.10.06 Version 4.2 released
SURBL checking using secondary Linux server added.
2011.10.01 Version 4.1 released
SURBL based spam checking activated. Now every link submitted
to service will be checked against SURBL reputation data.
2011.09.07 Version 4.0 released
server resources, because if JS is not evaluated from create
page database won't be accessed at all.
Frontpage is now fully cacheable, earlier GAE sent permanent
redirect but with no-cache tag. It's quite pointless afaik.
Now it's fixed. Frontpage is browser cacheable and there is
no additional redirect anymore.
2011.09.05 Version 3.1 released
Spam controls are tightened further. Now all URLs (domains)
in database are daily rechecked.
2011.09.03 Version 3.0 released
Database logic changed, now shorter urls are always preferred
instead of oldest urls. Lot of internal database access related
2011.07.07 Version 2.1 released
Database is now using faster eventually consistent reads.
Memcache is used for caching data in memory.
2011.06.18 Version 2.0 released
Implemented three independent spam protection mechanisms.
One of those is WOT.
WOT should prevent creation of links to spam advertised sites.
Unicode URL handling fixed. URLs like http://ä.Ö/€ do work now.
Exception handling and database transactionality and concurrency
2011.04.11 Version 1.0 released
Database transactionality improvements.
2011.03.24 Version 0.9 released
Let's see if this works out.
9ox.net - URL Shortener
- Something different: Chinese DH-10 Cruise missile, Computer Algorithms @ Khan Academy
- OVH Classic servers, strange lag bursts? I assume that the host system is running out of memory and swapping stuff out. So even if the VM doesn't show that anything is being swapped out, it is actually swapped out by the host OS. This leads to situations where access to memory areas which haven't been accessed lately can be very slow. It's a strange feeling when it seems there's plenty of memory, but in practice it behaves like it's swapped out. On the network side there are also some strange things. I'm not sure if it's directly related to this, or if there's some other kind of network traffic throttling or prioritization being used. In general network connectivity seems to be great, nearly perfect: low latency, no packet loss. But in reality, when transferring data, speeds aren't always what you would expect. Maybe they're limiting rwin or something else. Don't know. But that's what I'm experiencing. Compared to the OVH Cloud servers there's clearly lower priority on these OpenVZ boxes, on the CPU as well as on the networking side.
- Wondered how badly Outlook 365 is developed. It seems to be a horrible mess because it's partially a local email application and partially a webmail application. Some features work only in the desktop client and some things work only in the web app. The most interesting result of this mess is that the desktop app doesn't combine cloud data and local data as it should if it hadn't been designed so badly. Other email clients like Thunderbird work so much better. If you don't have a local copy of some message, it shouldn't mean that the application is unable to show the message. Thunderbird works by caching messages locally; some messages are available locally and others aren't, which is perfect. I can still see all messages. But with Outlook, fail. You can see that a folder has 400 messages, but you can only see 100 of those or something similar. There's no way to see the rest of the messages without synchronizing everything locally, which is simply a really bad implementation as far as I know.
- Studied a bit more about the uWSGI Python Module - Now stuff using it is working perfectly.
- Finally managed to configure uWSGI fastrouter-subscription-server so I can run load balancing and other stuff easily with it. What was my problem with it? I didn't realize that when using ports other than 80 you HAVE to enter the port number, and when using port 80 you MUST NOT enter the port number. Unfortunately there are no messages whatsoever to help with this task, so you don't get any kind of hints. You just have to find the problem via trial and error, or by reading the source code as I did. The source code is good documentation, but it might take a while to digest.
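The port rule above can be written down as a tiny helper. This is only a sketch of the rule as I described it: the function name `subscription_key` and the example domain are made up for illustration, and it's my assumption that the key has to look exactly like the Host header the client sends (which normally omits the port when it's 80).

```python
def subscription_key(domain, port):
    """Build the key a backend would subscribe with (hypothetical helper).

    Rule from the text: on port 80 the port must be left out,
    on any other port it must be included.
    """
    if port == 80:
        return domain
    return "{}:{}".format(domain, port)


print(subscription_key("example.com", 80))
print(subscription_key("example.com", 8080))
```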
- PostgreSQL vs MySQL / MariaDB
- 7 PostgreSQL data migration hacks
- Launched one ug project using Google App Engine (GAE) - Platform as a Service (PaaS) - Seems to be working fine. I would love GAE so much more if it supported Python 3.4. I also like the Jinja2 template engine very much when using 'alternate platforms', like Linux or Windows servers. Currently I'm using App Engine's own template engine with it.
- Fine tuned my PostgreSQL RDBMS database slightly for performance when using it with peewee ORM. Got nice 25% performance gain by just changing a few lines in a query. Now I'm using lateral join.
- I'm also using a few less well known SQL databases via ODBC (pyodbc).
- I just have to say I kind of hate the NoSQL term, because it doesn't actually mean anything at all. There have always been different object storages and solutions without transactional features and so on. Even GQL uses a NoSQL database with SQL-like statement syntax, but it's much more drastically limited in features.
- I think I might need to study more Docker and OpenStack. But summer is coming, maybe next fall.
- I did take a look at Chef and Puppet, but I think I'll prefer Ansible right now. With the current number of servers I'm administering, it's just on the edge whether I should use an advanced configuration management (CM) system or not. Setting up such a system takes considerable effort. It's just smart not to invest heavily in tech that might not be needed, or that does not produce meaningful profits or cost savings. Salt also seems to be pretty interesting.
- Even if big data is in such huge demand, it's always a good question whether the data is reliable and what you use it for. Just having data is utterly meaningless. If data quality is bad, the results can be seriously misleading even if they are technically correct. Being a data scientist or big data specialist also requires a wide set of business and management knowledge. Doing technically correct things without understanding what you're doing can lead to extremely bad results. On the other hand, if you use the right tools and methods, big data isn't any different from any other data. Just the data set itself is larger. Basic analytics and statistics skills are still needed, as well as common logic to verify results: can these even be right, even in theory? I've seen so many times that people generate reports or do something, and say this is the result. When you take a look at it, it's immediately clear that it can't be true or correctly done. But the question is why they didn't realize it when handling the data. Common sense and knowing your data are really important for making basic reality checks.
- Studied the Vuze BitTorrent client, which got a new Swarm Merging feature. After reading the specification carefully, I don't personally believe it's going to be a meaningful feature. It's a nice idea, but in reality it's not that useful. On the other hand, systems like Freenet and GNUnet have 'always' shared data blocks between different downloads, and it's been done much more efficiently than on file level. Not exactly the same, but it reminds me of the eMule Advanced Intelligent Corruption Handling (AICH) feature.
- The Economist, it's just great stuff to read. Even if I linked to web site, I recommend reading the full version.
- Is PaaS a perfect solution? Nope, it isn't. PaaS isn't a silver bullet, nor does it guarantee any portability between platforms. Actually it can tie you to one platform extremely tightly. Of course you can make an application which isn't tied to the platform, but it adds overhead, affects performance and so on. For some tasks PaaS is great, but in some cases working around issues with PaaS can hinder the whole project or just make running the systems very expensive compared to alternatives. It's just like using mobile frameworks which promise write once, cross platform applications. It can be great, but it can also make things hard or nearly impossible, add a lot of overhead and cause total failure to reach the promised goals. Whenever something is "a perfect solution", I instantly get highly skeptical. Either the talker is doing pure marketing, or doesn't know what they're talking about.
- Android email address autocomplete privacy horror. Why are programs so often written by such poor programmers? Why is everything the application sees stored, and why can't you even remove that information?
- For some reason it seems that CloudFlare converts all ETags to weak ETags. So even if I set a strong ETag, when the browser sends it back in a request via CloudFlare it now contains the W/ prefix. If I drop CloudFlare and do the same stuff directly, then the ETag is strong, i.e. without the W/ prefix.
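Why the W/ prefix matters can be shown with the two ETag comparison functions that HTTP defines (RFC 7232). This is a minimal sketch, not anything CloudFlare specific; the helper names and the "abc123" tag value are made up for the example.

```python
def parse_etag(etag):
    """Split an ETag value into (is_weak, opaque_tag)."""
    etag = etag.strip()
    if etag.startswith("W/"):
        return True, etag[2:]
    return False, etag


def strong_match(a, b):
    """Strong comparison: neither tag may be weak and the tags must be identical."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return (not weak_a) and (not weak_b) and tag_a == tag_b


def weak_match(a, b):
    """Weak comparison: only the opaque tags are compared, W/ is ignored."""
    return parse_etag(a)[1] == parse_etag(b)[1]


origin = '"abc123"'        # what the origin server set
via_proxy = 'W/"abc123"'   # what comes back after being weakened

print(strong_match(origin, via_proxy))  # False
print(weak_match(origin, via_proxy))    # True
```

So a conditional GET with If-None-Match, which uses weak comparison, still works with the weakened tag, but anything that requires strong comparison (like If-Range for byte ranges) won't match anymore.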
- My brain hurts, had again a few discussions about what's the difference between GiB and GB, what's Gbit and what's the difference between bit and byte.
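For the next time this discussion comes up, here's the arithmetic as a short sketch. The constants follow the SI (decimal) and IEC (binary) prefix definitions; the function name is just made up for the example.

```python
BITS_PER_BYTE = 8

GB = 10**9     # gigabyte, decimal prefix, in bytes
GIB = 2**30    # gibibyte, binary prefix, in bytes
GBIT = 10**9   # gigabit, in bits


def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal transfer time, ignoring all protocol overhead."""
    return size_bytes * BITS_PER_BYTE / link_bits_per_second


print(GIB - GB)                     # 73741824 bytes of difference
print(GBIT // BITS_PER_BYTE)        # 125000000 bytes per second on a 1 Gbit/s link
print(transfer_seconds(GIB, GBIT))  # about 8.59 seconds for 1 GiB over 1 Gbit/s
```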
- Had a few discussions about cost based vs value based pricing. Cost based model is always bad.
- Reminded myself about TLS 1.3 and AEAD - KW: TLS handshake, TLS session resumption, TLS False Start, TLS record size optimization, Early termination, OCSP stapling, HTTP Strict Transport Security (HSTS), CCM, EAX, OCB, GCM, AE, MAC
- Signs that you're a bad programmer. I really liked this article. - Found many issues in my own work. Smile. Especially being in a hurry, as well as avoiding overhead by putting something where it doesn't really belong. It depends on project size and scope whether that's OK-ish or really bad. Of course, if required, you can come back later and clean it up. - There's also a clear difference between temporary prototype code and something planned to be more permanent. - Especially "do whatever it takes" to make it work right now, without researching the topic, sounded pretty familiar for quick prototype testing code. Why write perfect prototype code? If it won't work out, it's getting discarded anyway. The same goes for adding temporary unrelated features to an existing application to avoid the overhead of starting a new project. Smile. - It's important to know how these should be done properly, for the cases where it's worth it. - Also the Pinball Programming made me smile a lot. But that's one of the symptoms of the previous stuff. If you're writing a program that's probably going to be run only once in production, how much time should you waste documenting and fine tuning it? If it runs once and produces the required results, that's it. - Yet similar code in more widely used programs is 'fatally flawed' and makes everyone else's life hard. - Anyway, the Symptoms list made me smile so hard; there's just tons of stuff there which we all have seen several times, awesome! - I don't want to say anything about the "Unfamiliar with the principles of security" section, because it would be just way too horrible. - But the good thing is that the list was completely familiar and didn't bring surprises. It's just that in some cases there's a decision to knowingly ignore some rules for temporary stuff.
- People who claim this temporary ignorance is bad are doing it wrong themselves. Why make a mold and cast it from bronze, if quickly cutting it from styrofoam with a knife does practically the same job? Cutting corners when it's suitable is also perfection. Over engineering can be really expensive. Actually Mythbusters are also pretty good at cutting corners: how to make something that works with really limited resources, even if they have lately been using pretty large resources.
- Something different: BrahMos, Durandal, NRO, DigitalGlobe, Gravitational Wave, BICEP and Keck Array, Disc tumbler lock, Lockwiki, KW: Keyhole, Hexagon, Topaz, Crystal
- Reviewed source code of one OpenSource project and immediately found two serious bugs. Well, it's good that open source code can be reviewed by anyone. kw: code review, bugs, fixed, python, reviews, commit, patch, fix, bug, git, github, pythonista.
- Google Cloud Storage Nearline -
- Checked my SSD wear leveling data, block erasure information, total amount of data written and health. Now that the SSD has been in use for 1.5 years, its life status is about 99% left, which means I don't need to be particularly worried about 'burning out' my SSD. At this rate it would take on the order of 100 years to wear out, and I'm quite sure current SSD hardware will be obsolete long before that. Of course there's the little problem of me personally expiring before that happens too. ;)
- Found a nice trap in one Python project (not my own this time): they used the 'is' operator to compare two values. There's a big trap with that in Python. 1 is 1? OK. 1 is 1+1-1? That works out too. But the code used 'is' in totally wrong places. Using 'is' instead of == is a really bad habit, because 'is' compares object identity, not value, and when values get large enough it's going to fail. And 'large' doesn't mean large on Python's int scale at all.
- There's no now for distributed systems KW: Google Spanner, FLP, CAP theorem, GPS, NTP, Paxos, ACID, Strong Eventual Consistency, Apache Zookeeper
- Goodbye MongoDB hello PostgreSQL - Key Value storage, JSON indexing, performance, reliability, correctness, consistency, sql, nosql, schemaless, replication, sharding, sharded, distributed.
- Lol, one unnamed organization got Cryptowall on their server and it also encrypted the backups. So backing up to media which is connected to the system all the time isn't a great idea either. Like I have said, there should be no option to delete or access earlier backups from the system being backed up, just to send more delta data.
- Yet another SSD endurance test - I'm a heavy user and I've been writing about 1 TB / year to my SSD. So again, I think the drive will become obsolete in less than 100 years, so the actual endurance doesn't matter. Some of the tested drives lived up to over 1 PB of writes.
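The back-of-the-envelope math behind that claim, as a sketch. It uses only the figures mentioned in these notes (about 1 TB written per year, drives surviving about 1 PB of writes, about 1% of life used in 1.5 years) and assumes wear is linear, which is a simplification. The function names are made up.

```python
def years_until_endurance_limit(endurance_tb, written_tb_per_year):
    """Years of writing before the tested endurance figure is reached."""
    return endurance_tb / written_tb_per_year


def years_until_worn_out(years_in_use, percent_of_life_used):
    """Linear extrapolation from the drive's own wear indicator."""
    return years_in_use * 100 / percent_of_life_used


# 1 PB = 1000 TB of endurance, at about 1 TB of writes per year
print(years_until_endurance_limit(1000, 1))  # 1000.0

# about 1% of life used after 1.5 years
print(years_until_worn_out(1.5, 1))          # 150.0
```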
- Checked out payment & identification solutions: RuPay, Aadhaar, China UnionPay, JCB, American Express, Diners Club
- Re-read: ext4 and btrfs man pages, studied Bluetooth 4.2 smart.
- Several BI articles, kw: data virtualization, etl, web services, soa, esb, information as a service, CIO, CDO, nosql, hadoop, sql, Gartner, SAS Institute Federation Server, SAP Hana, SQL Server Information Services, IBM InfoSphere, JBoss, Composite Software, Informatica, Cisco Data Virtualization Platform, Denodo, Denodo Express. Thoughts: Maybe I should try Denodo Express to see what it really can do.
- I've been wondering why payments and identity are considered to be separate businesses. Basically payments are just an application of identification. Technically all this stuff is so simple when you've got the primitives right. By primitives I mean public key stuff, which already exists in easy to use libraries like NaCl. - http://nacl.cr.yp.to/ - When you can identify the user using public keys, users can sign tokens using their private keys, and you can verify those using their public key, what's so hard? It should all be technically trivial. The whole problem comes from the ecosystem. Are the solutions supported? How easy are they to use? Are there any transaction fees, who's managing the trust network, and so on? Is it easy to use without a mobile phone, easy to use without a computer, can it be used without the user's authorization, and so on? So after all it turns out there's no simple, universal, easy to use or cheap solution. That's the reason why the market is so extremely fragmented. The worst part is getting national laws to accept an authentication solution. If they used this solution to make a contract, can it be enforced legally?
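To show how small the sign and verify primitive really is, here's a toy sketch using only the Python standard library. Big warning: this is textbook RSA with a tiny, well-known key and no padding, which is my own stand-in for illustration and completely insecure. It is NOT how NaCl works (NaCl uses Ed25519 signatures), and for anything real you'd use a proper library. The key values, function names and the example messages are all made up.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. Illustration only.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                                  # public modulus
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (needs Python 3.8+)


def digest_as_int(message):
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message):
    """Done by the user, with the private exponent D."""
    return pow(digest_as_int(message), D, N)


def verify(message, signature):
    """Done by anyone, with only the public values N and E."""
    return pow(signature, E, N) == digest_as_int(message)


token = b"pay 5 EUR to charity"
signature = sign(token)
print(verify(token, signature))                                  # True
print(verify(b"pay 500000 EUR to someone else", signature))      # False
```

The point stands: the math part is a few lines. Everything hard is in key distribution, trust, usability and legal acceptance.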
- Checked out Hypersecu FIDO U2F Security Key - And their blog
- Python internals and things you just need to know.
>>> 256 is 256-1+1 # This will match
>>> 257 is 257-1+1 # But this won't anymore.
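Note that the two lines above are sensitive to the Python version: newer CPython compilers may fold 257-1+1 into a constant at compile time and share it, so the second comparison can unexpectedly come out True (and recent versions print a SyntaxWarning about using 'is' with a literal). A sketch that avoids this by creating the numbers at run time, assuming CPython with its cache of small ints from -5 to 256 (an implementation detail, not a language guarantee):

```python
def make_int(text):
    """Create the int at run time so the compiler cannot fold or share constants."""
    return int(text)


a, b = make_int("256"), make_int("256")
c, d = make_int("257"), make_int("257")

print(a == b, a is b)  # True True on CPython, 256 comes from the small int cache
print(c == d, c is d)  # True False on CPython, two separate int objects
```

Moral of the story stays the same: use == for values, and keep 'is' for identity checks like 'is None'.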
- If you don't know the environment you're developing for, your code can contain very serious bugs which are hard to spot, because you don't understand the mechanism causing them. Just like the case where ASP int() worked like floor in most languages, always rounding "down" even with negative numbers, so -1.1 becomes -2. As well as my fail with Peewee SQL, where I didn't realize that a 'not True' condition doesn't match None (NULL).
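Both traps are easy to reproduce. This sketch uses Python's math module for the rounding part and the standard library sqlite3 module as a stand-in for my actual PostgreSQL and Peewee setup, so the table name `task` and column `done` are made up, but NULL behaves the same way in the WHERE clause.

```python
import math
import sqlite3

# Rounding: floor goes towards minus infinity, int() / trunc go towards zero.
print(math.floor(-1.1))  # -2
print(math.trunc(-1.1))  # -1
print(int(-1.1))         # -1

# SQL: NOT NULL is NULL, so rows with NULL are silently left out.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (done INTEGER)")
conn.executemany("INSERT INTO task (done) VALUES (?)", [(1,), (0,), (None,)])

not_done = conn.execute(
    "SELECT COUNT(*) FROM task WHERE NOT done").fetchone()[0]
not_done_or_unknown = conn.execute(
    "SELECT COUNT(*) FROM task WHERE NOT done OR done IS NULL").fetchone()[0]

print(not_done)             # 1, the NULL row is missing
print(not_done_or_unknown)  # 2, NULL has to be handled explicitly
conn.close()
```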
- Let's see, there's updated version of Subspace documentation - I also had chat with the author 'ctpacia' about this topic.
- Had a training about CMMI project management, product managing web sales channel, Kanban, Lean, Scrum, Business Model canvas, KanScrum, analyzing program usage situations and documentation and using this information to improve software products.
I'm thinking about a credit card sized device which would have a full surface touch screen and a slide-out USB connector for charging and connecting to a computer. Wireless charging would be nice too. It would naturally also feature Bluetooth and NFC connectivity, as well as a very small CCD camera to read QR codes in cases where no other communication solution is available.
A few use cases:
- Pure NFC authentication to open doors, or to log in to web sites or whatever mobile applications
- PIN code to activate a high security identity and then NFC authentication to open high security doors or to log in to a bank, tax authorities and so on
- Bluetooth authentication for low security seamless proximity locking like car doors or to login generic low security web sites where you usually use crappy passwords
- Challenge response using number keyboard or QR codes for medium security identification like over telephone or any other situation. Also if high security identity is used, PIN code is required before getting the response code.
- Receiving signing request over Bluetooth, NFC, USB or QR code from web site or application to sign high security transactions. In this case the card will display information about the transaction you're going to sign. This prevents scams by malware where you think you're paying 5€ to charity, but in reality you're transferring half a million to some random Nigerian bank.
- Biometric identification could also be used for low security purposes instead of a PIN code. But currently it's a technical problem. Also, if true physical presence is required, it's better to use on location sensors; the card could provide the information required to identify the user with the sensor. Of course all these measures can be combined with other things. Like the card plus an on door biometric detector, which makes sure you're really there. If it's only NFC identification which tells that yes, you're here, there could be relay attacks, and it doesn't mean that you're really there. It just means that your identification information is available right now. Of course this could also be used as a feature in some cases. It makes sure someone authorized you right now. And the card can display (once again) exactly what you're authorizing the other person to do.
So the card can have multiple identities. Those identities can be 'remotely readable', or require NFC or USB mode, as well as a PIN before activation for higher security purposes. The display can be used to show signing requests as well as to confirm the identity of the service which is requesting authentication. For the tinfoil hat guys the display can naturally show key fingerprints, so it's possible to confirm that nobody's playing with the keys.
Naturally the authentication database and service should be open for anyone, so it can actually be used. One of the show stoppers is that using an authentication solution has been made such a hard process that it's virtually impossible for anyone other than companies who specialize in authentication, or very large players who spend huge amounts of money on this kind of stuff. Using electronic authentication should be trivial, just as easy as it is to check a password or any official identification document.
I know, I know, mobile phones can do all this, at least in theory. The problem is that mobile phones are nowadays full computers, and therefore it's probably possible to mess with them using malware. And for sure it's possible if the device is rooted.
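The challenge response use case from the list above (type a number on the card, read back a short code) could be sketched like this. It's only an illustration under my own assumptions: it uses a shared secret with HMAC-SHA256 truncated to six digits, which is a simpler stand-in for the public key approach I'd really want on such a card, and all names and the example challenge are made up.

```python
import hashlib
import hmac
import secrets


def response_code(card_secret, challenge, digits=6):
    """Code the card would display for a typed-in challenge (hypothetical scheme)."""
    mac = hmac.new(card_secret, challenge.encode("utf-8"), hashlib.sha256).digest()
    number = int.from_bytes(mac[:8], "big") % (10 ** digits)
    return str(number).zfill(digits)


def check_response(card_secret, challenge, response):
    """Service side check, using a constant time comparison."""
    expected = response_code(card_secret, challenge)
    return hmac.compare_digest(expected, response)


card_secret = secrets.token_bytes(32)  # provisioned to both card and service
challenge = "493817"                   # e.g. read out over the telephone

code = response_code(card_secret, challenge)
print(len(code), check_response(card_secret, challenge, code))  # 6 True
```

A six digit code is easy to read over the phone, but it also means a one in a million guess chance per try, so the service would need to limit attempts.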
- The text above is not an exact translation of my original Finnish post. The original, translated more directly here, contains a lot of generic blah blah.
I once wrote about this topic at length. Most of these authentication methods share the flaw that identification is based on the device. FIDO U2F tokens, for example, have no locking. Anyone who gets hold of the device can misuse it. This is quite a significant risk. Of course that can be fixed on the software side by also requiring other identification information. Another problem is that, in banking use for example, you authenticate with the device, but after that the playing field is wide open again. So it would be insanely great to have a solution which also verifies what you want to do with the token, and not only whether you have the token. As far as I remember such solutions have been rare, but someone has implemented that too. As I understand it, a few German commercial banks have something like this in use. Nowadays this could be implemented with a mobile application, but then the problem is what happens if the service is already being used on mobile. Then the reliability drops again. Of course you could make a fully dedicated device for this, but that substantially raises the costs. Still, in my opinion a dedicated device is the most secure option when the use case justifies the cost. Nowadays Bluetooth 4.2 or BLE, for example, also make it possible for such a dedicated device not to be terribly expensive. Likewise CCD cameras etc. are cheap, so the device could also be built so that it can be used without a mobile phone. On the other hand, if you're after strong identification with long keys, then data transfer can be a challenge. I guess that as technology develops, this kind of solution will become available too.
My vision? A device built to credit card size, with a touch screen and keypad covering the whole card. For connecting, you can pop a USB connector out from the edge, or use Bluetooth, NFC or the camera in the device to transfer data from QR codes. Depending on the application, it could work as follows:
- With the card alone, e.g. NFC (doors in buildings). For a door with a higher security level the card's keypad can be used in addition, so that the card alone isn't enough.
- With the card alone, e.g. Bluetooth (say, car proximity locking)
- Entering a challenge code with the keypad -> gives a response -> for example telephone authentication
- High security login: connect with USB, see which service you're logging in to, enter the PIN code -> login.
- Authorizing a mobile application over Bluetooth, with approval on the device.
- Logging in to some random web service while travelling, for example from a 'web kiosk'. Read the QR code challenge from the login page and enter the response.
- Approving a larger payment: the device shows the payment details on the screen, you enter the PIN code, after which the signed transaction is returned to the bank.
Naturally there's also support for multiple fully independent identities, which can be removed and generated as needed. Yeah, the list isn't completely airtight, but the concept probably became clear. Naturally this also includes a so-called national identity service, through which anyone can do the identification. Often things have just been made difficult, and authentication can't be utilized by anyone except those parties who specifically start laboriously fiddling with it. Things are deliberately made difficult and expensive, and then people wonder why nobody uses them, VETUMA being an example.
Btw. I don't know if the operators have realized this, but actually the mobile certificate (mobiilivarmenne) can currently be used by anyone without any contracts. Huh? How so? Well, they offer a so-called test service, through which it works. It's a really handy thing if you just realize it. Thanks to the fact that the acknowledgement message for a successful test shows all the necessary identification information.