Blog‎ > ‎

DB performance, Kalyna, Disk dead, Kludge, MS OTP, Python, ETL

posted Feb 11, 2018, 1:35 AM by Sami Lehtinen   [ updated Feb 11, 2018, 1:35 AM ]
  • Wrote a small transaction & performance testing application, which works on Raima SQL, Microsoft MS-SQL Server (T-SQL), PostgreSQL / Postrgres and SQLite3. All of these SQL databases provide wonderfully different approaches to get same stuff done. Also minor changes in how things are made, greatly affects performance. SERIALIZABLE isolation caused lots of super slow deadlocks on MS-SQL Server. Using UPDLOCK with key queries is 100x faster when updating counters, etc. So many ways to fail and getting things slightly wrong and therefore getting also absolutely wrong results or data messed up in database. Also using ROWLOCK with small test database still doubled the performance for four test threads, when using ROWLOCKs. Reason for this is that most of the records being updated reside on same database PAGE.
  • It seems that Windows is prone to partial crash or hang if disk storage system is slow. Things which are purely ram based, seem to remain working. But everything else is basically dead. This unfortunate situation doesn't at times get automatically fixed, even if disk system starts performing normally. This situation often leads to totally black console screen, or login screen, which doesn't react to anything at all. Interestingly many services still keep connecting TCP sockets etc, but whenever data is being passed to threads which actually do something else, than handle network traffic, things get stuck again. One big problem is still, that nobody knows whats exactly causing it.
  • Checked out Kalyna Cipher from Ukraina.
  • Had lot of discussion with Database Administrator friends about best page size for database. Well, the answer is that it depends from so many factors, that it's hard to say. Most of databases seem to default to around 8 KiB pages. But Some tables benefit from larger pages and others from smaller ones. I personally think that for many reasons like disk storage and memory management, for the general purpose use the 4 KiB page would be pretty much optimal. Also SQLite3 has defaulted to 4 KiB pages since 2016 or version 3.12.0. I've used that page size long before that, because with modern hardware smaller pages won't make any sense.
  • Finally got annoyed enough by Windows 10 updates, to find out the place to change the settings practically. It seems that gpedit.msc is one way and then just configure automatic updates. It overrides all the junk configured in the user interface and gets the stuff finally working as it should have been working all the time. Now Windows updates won't mess up everything, by making systems randomly unavailable.
  • Lot of discussion and testing on SQL Server Performance Tuning matters. See: Wait Stats. Good old: RBAR or Row-By-Agonizing-Row, made me smile again. Transact-SQL (T-SQL). pgSQL.
  • Kludge solutions. Ahh. Those are always as interesting. On the other hand, everyone hates kludge solutions. But if you think it from engineering perspective. It's probably getting done what's required but much less work. Solution might not be pretty or optimal, but it will fulfill the requirements. Is that bad then? Today I recommended kludge solution something pretty similar to VFAT long filenames. Yep. It's not pretty, but it works. And requires 20x less work than 'doing it properly', because it would break so many different dependencies. Just barely good enough, but gets the job done. As example, in this particular case. The kludge drastically improves user experience and usability, even if it's not pretty from hidden technical parts.
  • A friend confirmed that he's also getting the strange error message while doing Microsoft Live login with OTP enabled. Duh. So I'm not the only one having the issue. It's pretty much clear, that they've got some kind of bug with Firefox and their setup, when Firefox is started with 'clean' setup. - Let's see how long it takes before they can fix this kind of challenging fail. Heh.
  • Yes, Python is Slow, and I Don’t Care. I quite much agree with him. But as said, everything is a trade-off. And if you're writing compression algorithm or video encoder for embedded devices, I think high performance options are required. But generally I've been happy with the performance and as being said, usually customers care more about development time (expensive) versus runtime. I just made some load testing for one customer, who was worried about the ETL tasks performance when I mentioned Python. Then we run some codes, and yes, we could double the amount of transactions to be handled daily. And still my ETL task for one days batch, finished under a minute. So I think that's pretty much adequate performance. I do all kind of caching, etc as said earlier. In this case, the database access is slow, processing the actual data is fast as long as you've got it all loaded.