
JS, HTML, CSS, FIXME, UI, BI, Data Sets, Jit, Real-time enterprise

posted May 1, 2014, 12:44 AM by Sami Lehtinen   [ updated May 1, 2014, 12:46 AM ]
  • Just to remind myself: I read all the tutorials from HTML Dog, and they are actually very good. Short enough, but they go through the most important things. The problem with many sources is that they go through an endless debate about what the right solution is and offer you 100 different solutions to try, which is a very inefficient but unfortunately common approach to many problems. And then you find out that 90% of those solutions were for an old version and don't even theoretically work anymore.
    Here's the question I posted to Google+:
    Do your projects or software products suffer from bit rot or technical debt? Especially when doing a very early technical proof of concept, I often encounter catastrophic and massive early generation of technical debt. Why? Well, because I'm not sure if it's going to work, I just try to implement the key part of the task in the fastest and dirtiest way possible. But when it happens to work out after all (as it usually does after some time), there's no interest anymore in code refactoring (or, in this case, basically rewriting it completely the correct way). How do you personally deal with this problem? That's what I'm asking.
  • I'm trying to challenge myself with my side project's user interface. I'm hoping that it will be really fast and easy to use, with minimal cognitive load. Let's see if I'm successful. I've seen so many badly designed web sites; some are downright enragingly bad, when you can't get a simple task done even if you know very well what you want to get done. I also want to minimize information overload through efficient profiling and analysis of what the user wants to see.
  • Picked from a skills-in-demand list: 4. Python, 6. SQL, 7. HTML5 / CSS3, 10. iOS / Android. An excellent match when I think about my side joy study project.
  • Once again laughed at code tags and technical debt: "#FIXME - This code works well in testing setting. But when system is used in production, it will lead to performance meltdown." It's also a good idea not to forget about bit rot, aka software rot. Also see code refactoring, which is a notoriously unforgiving task, but at times it's just so important to do it.
  • Some claims about software price and BI price make me wonder whether software that costs 100k€ is expensive or not. If it saves you 100k€ / year after all running expenses, so that it pays for itself in one year, I would say it's dirt cheap, even if some customers won't agree with me. I think they should check their calculations of realistic ROI and payback time.
  • Reminded myself about Total quality management (TQM) and Just In Time, and no, this time not about the JIT compiler.
    Current ERP/BI/DW systems and real-time integrations provide great possibilities for JIT business processes. I've been developing those for one shoe store chain: daily automatic stock replenishment, orders, deliveries, delivery confirmations, e-invoice integration, payments, etc. These are based on generic sales volumes, per store volumes, the natural distribution of product sizes, article group (per store analysis), life cycle analysis, etc., as well as on updating the existing forecast models based on incoming data. Enhancing, improving and automating the whole business process. The whole point of this was basically to automate everything as far as possible and deliver daily data to management so they can control manufacturing of goods. Well, actually I did this stuff ten years ago.
  • A few thoughts about databases and dataset sizes.
It's a great question what counts as big. For us, tables like that are quite common. We don't yet have tables with billions of records, but tens of millions is very common. Some of my BI SQLite3 databases even contain tens of millions of rows, and it's not a problem at all for daily batch runs. Most customers aren't ready for the real-time enterprise / economy approach yet, even though my BI integration solutions fully support completely dynamic update intervals, from real-time to once / year or whatever.
The real problem arises when you need to derive real-time data from tables with billions of rows, including joins with other tables that have similar numbers of rows. I'm actually building one globally scaling side project just for my own fun. If it starts going well, the datasets it needs to handle can be quite interesting in size.
I don't like forums like LinkedIn etc., which do not properly track last read pointers for each user, calculate post modifiers to rank posts individually for each user, and so on. Of course geolocation also plays a role. So every item I'm processing from the main table gets several dynamic scoring factors, calculated individually per user. I don't have any commercial goals with it; it's just to keep my knowledge updated and to build something that can really be used, instead of making only minimal test modules and then deciding yeah, seems to work, that's it, let's try something else. The project is internally built so that I can easily utilize SQLite3 (for testing), MongoDB (small scale production) and Google App Engine (huge global scale), which allows me to test and see actual performance differences on those platforms.
With SQLite3 the real problem begins (as with any database) when the index which is accessed all the time doesn't fit in the system memory anymore. But of course that simply means that the dataset / index is just way too large for the low end server doing the data crunching. Another problem with SQLite3 is lock contention, but usually that problem only hits you after you start making at least hundreds of write transactions per second. Of course, as in general, it's very important to keep the locking time of tables short. I've seen so many systems grinding to a halt because processes are holding data locked for way too long.
Most often these problems are caused by developers who aren't familiar with the importance of short locking times or who completely ignore it. That's why I often recommend doing BI analysis so that connections are in read only, read uncommitted or transaction isolation off mode, and therefore won't cause any locking on the actual database being read. Of course some databases like PostgreSQL aren't affected by these long standing read locks.
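As a rough illustration of the two habits above (short write transactions and read only connections for analysis), here is a minimal sketch using Python's standard sqlite3 module. The `sales` table, the function names and the demo data are all made up for this example; they are not from any real system of mine. Note also that a read only connection only guarantees that the reporting side can never take a write lock. In SQLite's default journal mode a running read query still holds a shared lock, so the reads should be kept short too.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def make_demo_db(path):
    """Create a tiny demo database. The 'sales' table is a made-up example."""
    conn = sqlite3.connect(path)
    try:
        # 'with conn' commits on success (or rolls back on error) when the
        # block exits, so the write lock is held only for this short batch.
        with conn:
            conn.execute("CREATE TABLE IF NOT EXISTS sales (store TEXT, amount REAL)")
            conn.executemany(
                "INSERT INTO sales (store, amount) VALUES (?, ?)",
                [("A", 10.0), ("B", 20.0), ("A", 5.0)],
            )
    finally:
        conn.close()


def open_read_only(path):
    """Open the database through a file: URI with mode=ro (read only)."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def totals_per_store(path):
    """Run a BI style aggregate on a read only connection and close it at once."""
    conn = open_read_only(path)
    try:
        return conn.execute(
            "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
        ).fetchall()
    finally:
        conn.close()


def try_write_read_only(path):
    """Return True if a write through the read only connection succeeds."""
    conn = open_read_only(path)
    try:
        conn.execute("INSERT INTO sales (store, amount) VALUES ('C', 1.0)")
        return True
    except sqlite3.OperationalError:
        # SQLite refuses the write, so this connection cannot lock out writers.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        make_demo_db(db_path)
        print(totals_per_store(db_path))
        print(try_write_read_only(db_path))
```

The point of the design is simply that the analysis code never gets a chance to hold a write lock, and the writer commits in small batches instead of keeping one long transaction open.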
  • Attended Tableau Helsinki Experience. Well well, actually I don't really see the need for these kinds of shows. Any reasonably competent data handler can learn everything they told at the meeting in a few hours. But two things are obvious with Tableau: it's fast (technically), and it's fast and easy to use. I did all kinds of analysis on one production database in just a few hours and also located a few problems, also known as getting information based insights and knowledge about what has happened.
  • Something to wow about: is the time of URLs ending? Yes, I went wt* too, but that's true. It's very important to make site navigation extremely sane. With some badly designed sites, you need to manually modify the URL to get where you want to go. And that won't be possible in the future. So navigation must be done right, in every case.
  • Just as I wanted to do my taxes electronically and tried to log in, I got an announcement that they're upgrading their authentication system and I should try logging in again tomorrow. Why does this always happen just when I decide that now I'll get the taxes over and done with?
  • Something completely different: AW609, a nice civil version of the V-22 Osprey.