OVH, uWSGI, PostgreSQL, NoSQL, GQL, CM, Chef, Puppet, Ansible, Salt, PaaS, BitTorrent

Post date: Mar 22, 2015 1:50:16 PM

  • Something different: Chinese DH-10 Cruise missile, Computer Algorithms @ Khan Academy
  • OVH Classic servers: strange lag bursts? I assume the host system is running out of memory and swapping things out. So even if the VM doesn't show anything being swapped out, it is actually swapped out by the host OS. This leads to situations where access to memory areas which haven't been accessed lately can be very slow. It's a strange feeling when there seems to be plenty of memory, but in practice it behaves as if it's swapped out. On the network side there are also some strange things. I'm not sure if it's directly related to this, or if some other kind of network traffic throttling or prioritization is being used. In general network connectivity seems to be great, nearly perfect: low latency, no packet loss. But in reality, when transferring data, speeds aren't always what you would expect. Maybe they're limiting RWIN (the TCP receive window) or something else. I don't know, but that's what I'm experiencing. Compared to the OVH Cloud servers, these OpenVZ boxes clearly get lower priority, on the CPU as well as on the networking side.
  • Wondered how badly Outlook 365 is developed. It seems to be a horrible mess, because it's partially a local email application and partially a webmail application. Some features work only in the desktop client and some only in the web app. The most interesting result of this mess is that the desktop app doesn't combine cloud data and local data as it should. Other email clients like Thunderbird work so much better. Not having a local copy of a message shouldn't mean that the application is unable to show it. Thunderbird caches messages locally; some messages are available locally and some aren't, which is perfect, because I can still see all messages. With Outlook, fail. You can see that a folder has 400 messages, but you can only see 100 of those or something similar. There's no way to see the rest of the messages without synchronizing everything locally, which is simply a really bad implementation as far as I know.
  • Studied a bit more about the uWSGI Python module. Now stuff using it is working perfectly.
  • Finally managed to configure the uWSGI fastrouter-subscription-server so I can run load balancing and other stuff easily with it. What was my problem with it? I didn't realize that when using ports other than 80 you HAVE to enter the port number, and when using port 80 you MUST NOT enter the port number. Unfortunately there are no messages whatsoever to help with this, so you don't get any kind of hints. You just have to find the problem via trial and error or by reading the source code, as I did. The documentation is good, but it might take a while to digest.
  • PostgreSQL vs MySQL / MariaDB
  • 7 PostgreSQL data migration hacks
  • Launched one ug project using Google App Engine (GAE) - Platform as a Service (PaaS) - Seems to be working fine. I would love GAE so much more if it supported Python 3.4. I also like the Jinja2 template engine very much when using 'alternate platforms', like Linux or Windows servers. Currently I'm using App Engine's own template engine with it.
  • Fine-tuned my PostgreSQL RDBMS slightly for performance when using it with the peewee ORM. Got a nice 25% performance gain by changing just a few lines in a query. Now I'm using a lateral join.
  • I'm also using a few lesser-known SQL databases via ODBC (pyodbc).
  • I just have to say I kind of hate the term NoSQL, because it doesn't actually mean anything at all. There have always been different object stores and solutions without transactional features and so on. Even GQL uses a SQL-like statement syntax on top of a NoSQL database, but it's much more drastically limited in features.
  • I think I might need to study Docker and OpenStack more. But summer is coming, so maybe next fall.
  • I did take a look at Chef and Puppet, but I think I prefer Ansible right now. With the number of servers I'm currently administering, it's just on the edge whether I should use an advanced configuration management (CM) system or whether it's better not to use one. Setting up such a system takes considerable effort. It's just smart not to invest heavily in tech that might not be needed, or that doesn't produce meaningful profits or cost savings. Salt also seems pretty interesting.
  • Even if big data is in such huge demand, it's always a good question whether the data is reliable and what you use it for. Just having data is utterly meaningless. If data quality is bad, the results can be seriously misleading even if they're technically correct. Being a data scientist or big data specialist also requires a wide set of business and management knowledge. Doing technically correct things without understanding what you're doing can lead to extremely bad results. On the other hand, if you use the right tools and methods, big data isn't any different from any other data. The data set itself is just larger. Basic analytics and statistics skills are still needed, as well as common logic to verify the results: can these be right, even in theory? I've seen so many times that people generate reports or do something, and say this is the result. When you take a look at it, it's immediately clear that it can't be true or done correctly. The question is why they didn't realize it when handling the data. Common sense and knowing your data are really important for making basic reality checks.
  • Studied the Vuze BitTorrent client, which got a new Swarm Merging feature. After reading the specification carefully, I personally don't believe it's going to be a meaningful feature. It's a nice idea, but in reality it's not that useful. On the other hand, systems like Freenet and GNUnet have 'always' shared data blocks between different downloads, and they do it much more efficiently than on the file level. Not exactly the same, but it reminds me of eMule's Advanced Intelligent Corruption Handling (AICH) feature.
  • The Economist, it's just great stuff to read. Even though I linked to the web site, I recommend reading the full version.
  • Is PaaS a perfect solution? Nope, it isn't. PaaS isn't a silver bullet, nor does it guarantee any portability between platforms. Actually it can tie you to one platform extremely tightly. Of course you can make an application which isn't tied to the platform, but that adds overhead, affects performance and so on. For some tasks PaaS is great, but in some cases working around issues with PaaS can hinder the whole project or just make running the systems very expensive compared to alternatives. It's just like using mobile frameworks which promise write-once cross-platform applications. It can be great, but it can also make things hard or nearly impossible, add a lot of overhead and cause total failure to reach the promised goals. Whenever something is "a perfect solution", I instantly get highly skeptical. Either the talker is doing pure marketing, or doesn't know what they're talking about.
  • Users asked if I can relaunch. Of course I can, but I need to fix a few things with it first. I'll also launch it on a new server which will provide vastly superior performance. Still got a few JavaScript (ugh) kinks to figure out. Everything on the uWSGI, PostgreSQL, Python and server side (Ubuntu Server 64-bit) is already working perfectly. This is hobby stuff. I'll only code it at home, when I'm in the right mood, so things might not happen as quickly as they otherwise could. My main goal now is to vastly improve the user experience and clarify a few things, even if everything is already technically working. I hate it when techies say it's OK even if the user experience is absolutely horrible. As an example, I don't validate forms right now; I just tell the user FAIL if there's anything wrong with the content. It might technically be a working solution, but it surely annoys users.
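
Back to the uWSGI fastrouter port gotcha above: here is a minimal sketch of the rule as I described it, written as a small Python helper. The function name `subscription_key`, the example host name and the example addresses are all made up for illustration. My understanding (an assumption, not something the uWSGI docs spell out this way) is that the key has to match what the client sends as the host, and clients leave out the port when it's 80.

```python
def subscription_key(host: str, port: int = 80) -> str:
    """Return the key an instance should subscribe with (hypothetical helper).

    Rule from the note above: port 80 must NOT appear in the key,
    any other port MUST appear in it.
    """
    if not host:
        raise ValueError("host must not be empty")
    if not 0 < port < 65536:
        raise ValueError("port out of range: %r" % (port,))
    # Only non-default ports are appended to the host name.
    return host if port == 80 else "%s:%d" % (host, port)


if __name__ == "__main__":
    # Assumed address of the subscription server, purely an example.
    server = "127.0.0.1:2626"
    for port in (80, 8080):
        key = subscription_key("example.com", port)
        # Value in the "server:key" shape used by the subscribe-to option.
        print("%s:%s" % (server, key))
```

If an instance subscribes fine but requests never reach it, comparing the key against the host the client really sends is the first thing I'd check.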
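
About the lateral join mentioned above: I'm not showing my actual query here, so this is only a sketch of the kind of query where `LATERAL` (PostgreSQL 9.3 and later) tends to help, namely "the newest N rows per parent row". The table names `author` and `post`, their columns and the function name are hypothetical. The SQL is built as a plain string with a `%s` placeholder, so it could be passed to psycopg2 or to peewee's `execute_sql` together with the parameters.

```python
from typing import Tuple


def latest_posts_per_author_sql(limit: int = 3) -> Tuple[str, tuple]:
    """Build a top-N-per-group query using a lateral subquery.

    Table and column names are invented for this example.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    sql = (
        "SELECT a.id AS author_id, a.name, p.id AS post_id, p.title, p.created\n"
        "FROM author AS a\n"
        # The subquery is evaluated once per author row and may refer to a.id.
        "CROSS JOIN LATERAL (\n"
        "    SELECT id, title, created\n"
        "    FROM post\n"
        "    WHERE post.author_id = a.id\n"
        "    ORDER BY created DESC\n"
        "    LIMIT %s\n"
        ") AS p\n"
        "ORDER BY a.id, p.created DESC"
    )
    return sql, (limit,)


if __name__ == "__main__":
    query, params = latest_posts_per_author_sql(3)
    print(query)
    print(params)
```

Whether this beats a window function or a plain join depends on the data and indexes, so measure before and after, like I did.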
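
And about the form validation complaint above: a rough sketch of what "better than FAIL" could look like on the server side. The field names (`name`, `email`), the length limit and the messages are made up, and the email check is deliberately loose. The point is only that each field gets its own message the user can act on.

```python
import re
from typing import Dict, Mapping

# Deliberately loose check: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_form(form: Mapping[str, str]) -> Dict[str, str]:
    """Return a dict of field name -> error message; empty dict means OK."""
    errors: Dict[str, str] = {}

    name = form.get("name", "").strip()
    if not name:
        errors["name"] = "Please enter your name."
    elif len(name) > 80:
        errors["name"] = "Name is too long (max 80 characters)."

    email = form.get("email", "").strip()
    if not email:
        errors["email"] = "Please enter your email address."
    elif not EMAIL_RE.match(email):
        errors["email"] = "That doesn't look like an email address."

    return errors


if __name__ == "__main__":
    print(validate_form({"name": "", "email": "nope"}))
    print(validate_form({"name": "Ann", "email": "ann@example.com"}))
```

The template can then show each message next to its field instead of one big FAIL for the whole form.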