Topic mega dump 2013 (1 of 2)

Post date: Jan 6, 2015 6:36:04 PM

Yeah, I know it's already 2015, but this is exactly the reason I'm just doing this quick dump:

Multi-tenant, cloud based, on demand capacity scaling. Average server load is under 10%, so what do we need the other 90% for?
Intel HTML5 tools
SQLite3 parallel access
Error detection, handling and self recovery
BGP / Anycast
Implementing message queues in relational databases, I think I've covered this topic already. Not optimal, but works.
Database as queue anti-pattern, yet again. I even wrote a longer post about this. But other queue solutions might not be much better if data stays queued for too long.
I have seen way too many directory / file based or database based queues. - Yet those do work as well. My current mail server is actually doing just that. (Postfix, with maildir)
So about using a database as a queue: here are a few problems and a few suggestions on how to improve it (a post by me).
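A minimal sketch of a database-backed queue, assuming SQLite and a hypothetical `queue` table; the function names (`open_queue`, `enqueue`, `claim_next`, `ack`) are made up for illustration and are not taken from the linked post. It shows two of the usual mitigations: claiming a message inside a short write transaction, and deleting processed rows so the table does not grow without bound.

```python
import sqlite3
import time


def open_queue(path=":memory:"):
    """Create (if needed) a hypothetical 'queue' table and return the connection."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # transactions below are controlled explicitly with BEGIN / COMMIT.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " claimed_at REAL)"  # NULL means: not yet claimed by any worker
    )
    # Partial index so finding the next unclaimed row stays cheap.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS queue_pending"
        " ON queue(id) WHERE claimed_at IS NULL"
    )
    return conn


def enqueue(conn, payload):
    conn.execute("INSERT INTO queue (payload) VALUES (?)", (payload,))


def claim_next(conn):
    """Atomically claim the oldest unclaimed message; return (id, payload) or None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM queue"
            " WHERE claimed_at IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE queue SET claimed_at = ? WHERE id = ?",
                (time.time(), row[0]),
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


def ack(conn, msg_id):
    """Delete a processed message so the table stays small."""
    conn.execute("DELETE FROM queue WHERE id = ?", (msg_id,))
```

A worker would call `claim_next`, process the payload, and then `ack` the id. Messages whose `claimed_at` is old but which were never acknowledged could be re-queued by a separate sweep, which is left out here.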
How to design REST APIs for mobile.
Web store, card acquirer, point of sale, bookkeeping, invoicing, ERP, card issuer, card transaction clearing, order number, data structure, ticket sales, reference numbers.
Physical Aggregation, Fabric Integration, Subsystem Aggregation. Solution where there are several completely separated modular subsystems like processing, memory and I/O.
Whole rack is "single computer" with shared resources. (Rack-Scale Architecture, Open Compute Project, silicon photonics technology, Intel Atom S1200)
Slow database server? Yeah. After checking why it was slow, I found a process which repeatedly requested 69421 rows 536 times in 10 minutes. Another query by another process requested 55512 rows 536 times in 10 minutes. So it seems that some of these queries were basically running in a never-ending loop. This raised interesting questions: should the code be fixed, is it designed correctly, why is the data being requested all the time, how much memory should the server have, do the requests even hit indexes, and so on. The engineers thought it would be best to add a lot of memory and CPU resources to the system, because it's now slow. Finding out these things took less than an hour. As for fixing the issue, it remains to be seen if that ever happens. I guess the fix is adding more resources after all.
Performance issues, endless loop without delay. Why are solutions x, y, z being applied when nobody knows what the problem is? I have seen this happening over and over again.
Discussed the above matters with independent IT professionals and they all agreed that these problems are really common. Code is horrible, there's a lot of lock contention and so on, performance tanks, and even adding resources might not help when things are done so badly. One guy said that they're having huge problems with their ERP system's database. The ERP system alone runs just fine, but there are some reports which were written by 3rd party consultants, and those seriously abuse locking, causing lock contention and basically stalling the whole database server.
- It's a generic problem that people do not understand what the problem is, nor are they able to use tools which would reveal it. Instead, they just do something random and hope it fixes the problem.
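For the "endless loop without delay" case, a minimal sketch of a polling loop that backs off when there is nothing new, instead of re-running the same query as fast as it can. The function `poll` and its parameters are hypothetical; `fetch` stands for whatever query the process runs and `handle` for whatever it does with the rows.

```python
import time


def poll(fetch, handle, min_delay=0.5, max_delay=30.0,
         max_iterations=None, sleep=time.sleep):
    """Call fetch() repeatedly, sleeping longer while it keeps returning nothing."""
    delay = min_delay
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        rows = fetch()
        if rows:
            handle(rows)
            delay = min_delay  # work was found: poll quickly again
        else:
            delay = min(delay * 2, max_delay)  # idle: back off exponentially
        sleep(delay)
        iterations += 1
```

Even a fixed short sleep would already remove most of the load in a case like the one described; the backoff only makes idle periods cheaper still.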
Total home renovation project is still taxing my IT 'hobby'. I've been so busy with it. I'm getting new just about everything. Every surface in the apartment is being renewed, as well as plumbing, networking, electric cabling, floors, walls, kitchen, home appliances etc. I hope this means that I don't need to worry about these things for the next 20 years or so.
Haiku OS - Played with it for a few hours.
Tachyon - Fault Tolerant Distributed File System with 300 Times Higher Throughput than HDFS
Designing REST JSON APIs - Nothing new at all. But could be problematic with mobile apps, if round trip times are high. All required data should be received using one request, even if it technically involves multiple resources. So some kind of 'view' resources should also be available which technically merge multiple lower level resources.
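A minimal sketch of such a 'view' resource, assuming three hypothetical lower-level resources (orders, customers, items) held in plain dictionaries. All names and data here are invented for illustration; in a real API each dictionary would be a separate endpoint and `order_view` would sit behind a single URL.

```python
import json

# Hypothetical lower-level resources, normally fetched with one request each.
ORDERS = {1: {"id": 1, "customer_id": 7, "item_ids": [10, 11]}}
CUSTOMERS = {7: {"id": 7, "name": "Example Customer"}}
ITEMS = {10: {"id": 10, "title": "Widget"}, 11: {"id": 11, "title": "Gadget"}}


def order_view(order_id):
    """Build one 'view' document that embeds the related resources."""
    order = ORDERS[order_id]
    return {
        "id": order["id"],
        "customer": CUSTOMERS[order["customer_id"]],
        "items": [ITEMS[item_id] for item_id in order["item_ids"]],
    }


def order_view_json(order_id):
    """Serialize the merged view, as it would be returned in a single response."""
    return json.dumps(order_view(order_id), sort_keys=True)
```

With this shape a mobile client pays one round trip instead of one for the order, one for the customer and one per item.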
- Working with systems which are broken by design is really painful and frustrating. There are multiple scheduled tasks which may or may not run, and if any of those tasks fails, data is lost and requires complex manual recovery etc. I don't want to see that kind of system at all anymore. It's so darn painful doing the manual recovery process, especially if it rarely happens. Let's say the crap fails 0.5-2 times / year. Nobody even remembers how the systems should be recovered. Simple and reliable are the keywords which I really like. Sometimes I do prefer a multi-step process, where in each step you can check the data and state easily. So you'll know that everything is good so far. One big black box which gets something in and spits something out might be a lot harder to debug.
More great code. Let's assume we have a huge database. Each database row has a flag telling if the row has been processed. I would prefer a pointer to a monotonically increasing counter instead of a per-row flag. But this is how someone implemented it. Now they're doing it like this.
Configured mail client to show spamgourmet and trashmail specific headers by default, so I don't need to use show all headers or message source to see a few important fields.
PGStorm accelerating PostgreSQL with GPU.
MongoDB hash based sharding
Wildfire.py - Self-modifying Python bytecode: w.i.l.d.f.i.r.e
Hardware based full disk encryption (FDE)
Firefox seems to use SQLite3 in WAL mode, that's good choice when there are lot of writes.
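A minimal sketch of enabling WAL mode from Python's standard `sqlite3` module; the helper name `open_wal` is made up. The pragma returns the journal mode actually in effect, so it is worth checking the result: in-memory databases and some file systems cannot use WAL.

```python
import sqlite3


def open_wal(path):
    """Open a database file and ask SQLite to switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma answers with the mode now active, e.g. "wal" or "delete".
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode
```

The WAL setting is persistent: once a database file has been switched, later connections open it in WAL mode as well.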
What's new in Linux 3.9
An experienced developer can give you a huge list of what to do and why, and what not to do and, yet again, exactly why. I've seen millions of ways of writing extremely bad and unreliable code. I can tell exactly why not to write such code. Nothing kills more productivity than totally unreliable code.
- Use transactions, locking, handle exceptions, give clear error indication, log possible issues. Use auto-recovery, if possible. It's not that hard, it should be really obvious for anyone. Don't write stuff that totally kills performance, use batching, sane SQL queries, indexes, etc.
WiDrop - Wireless Dead Drop. As far as I know there isn't one in Helsinki yet. Maybe I should build one? Just have to ask some friends living in the very center of town to run it. Or maybe I could find some local business to host it. Which probably won't work out after someone abuses the dead drop with wrong stuff.
When using Windows as a file server, I just wonder why people always also give remote desktop access to the server. It's just like when people share sftp / ftp / ftps / scp accounts, they usually also give shell access. I guess that tells something about the administrators.
SaaS company providing superior Web-based security solutions for businesses, institutions, and government agencies to securely encrypt, time stamp, store, transmit, share, and manage digital data of any kind and size across a broad array of operating systems and devices, ranging from smart phones to Supercomputers. These services are further supported by our comprehensive and certifiable audit trail, including irrefutable time stamping. The proprietary methodology at the core of safely locked allows for a wide range of applicability of its software and services, all of which brings to the world market innovative, highly secure, interoperable and cost-effective services and products. - Sounds pretty cool.
Inception - Inception is a physical memory manipulation and hacking tool exploiting PCI-based DMA. The tool can attack over FireWire, Thunderbolt, ExpressCard, PC Card and any other PCI/PCIe interfaces. - If there's physical access to your systems, you're so owned.
Linode started to use TOTP
Flash Cache, Bcache, Bcache Design
WiDropChan - anonymous wireless WLAN chan with HTML5 off-line support. Just a thought play. Could be useful for someone. Providing pseudonymous access with private messaging & attachments would be a nice bonus. Server software would provide basic access control, flooding protection, captive portal and HTML5 client off-line features. To update data it would be enough to visit the page when you pass by. After that, you could handle messages and files offline.
PCBoard BBS system - Sorry, no time to write about it. It would have been just a good example of file based locking and a multi-computer shared environment.
Babel routing protocol
Thinking about games and their closed world and money systems is a great exercise. How do you make the game fair? How do you prevent inflation, deflation and generally control the supply of money in different economic situations? Making the game fair in such a sense that all the money doesn't end up with one user etc. I did spend a lot of time thinking about these concepts, but I never wrote about it, nor did I write any code.
Big-O Cheat Sheet - This webpage covers the space and time Big-O complexities of common algorithms used in Computer Science.
When bad data is received, what should be done? Halt, continue, log error, send alert? I have often found that the best way is to halt, because it forces the problem to be investigated. If processing is continued and the error is only logged, it might take months before anyone notices that things aren't as they are supposed to be.
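A minimal sketch of the halt-on-bad-data policy, assuming a hypothetical `id;amount` record format. The names `BadDataError`, `parse_record` and `process_all` are invented for illustration. The point is only that the first invalid record raises, with enough context to investigate, instead of being logged and skipped.

```python
class BadDataError(Exception):
    """Raised when an input record cannot be trusted."""


def parse_record(line):
    """Parse a hypothetical 'id;amount' record, raising on anything unexpected."""
    parts = line.strip().split(";")
    if len(parts) != 2:
        raise BadDataError(f"expected 2 fields, got {len(parts)}: {line!r}")
    try:
        return int(parts[0]), float(parts[1])
    except ValueError as exc:
        raise BadDataError(f"unparseable record: {line!r}") from exc


def process_all(lines):
    """Stop at the first bad record instead of logging it and carrying on."""
    results = []
    for number, line in enumerate(lines, start=1):
        try:
            results.append(parse_record(line))
        except BadDataError as exc:
            # Re-raise with the line number so the failure is easy to locate.
            raise BadDataError(f"halting at line {number}: {exc}") from exc
    return results
```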
Had a long discussion with a good friend about: fluid intelligence, crystallized intelligence (aka wisdom)
Read: The Dangers of Surveillance - Neil M. Richards - Washington University School of Law
Strangest problem ever. I made a too large transaction by giving the commands:
- delete from junktable;
- commit;
- It took a long time, and after the commit log got huge the database service crashed. After that, when restarting the database server, the crash recovery crashed. I was forced to delete the transaction log. After all this the database was totally corrupt. Great going! Luckily I was using my test environment and not a production environment. - Phew!
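One common way to avoid a single huge transaction is to delete in small committed batches. A minimal sketch follows, using SQLite only because it ships with Python; the database in the story above is not named, and the batching syntax differs between products. The helper `delete_in_batches` is hypothetical and relies on SQLite's `rowid`.

```python
import sqlite3


def delete_in_batches(conn, table, batch_size=10000):
    """Empty `table` in small committed chunks; return the number of rows deleted."""
    # The table name is interpolated into the SQL, so it must come from
    # trusted code, never from user input.
    total = 0
    while True:
        with conn:  # each batch is its own transaction
            cur = conn.execute(
                f"DELETE FROM {table} WHERE rowid IN"
                f" (SELECT rowid FROM {table} LIMIT ?)",
                (batch_size,),
            )
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

Each commit keeps the amount of pending log data bounded, at the cost of the delete no longer being atomic as a whole.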
I really dislike programs which malfunction so that they require operator / admin attention. When things are out of beta, things should just work, work and work. Not fail randomly, giving everyone a headache.
Bitmessage load spikes on higher tier streams - Forever continuing retransmissions. The protocol used to synchronize data between nodes in the same group is not described. Possible traffic timing correlation attacks when using outbound message relay. Passive mode makes retransmission and proof of work even worse.
Skimmed The Checklist Manifesto, and watched the Fukushima Two Years After documentary, where they analyzed the failures in essential processes of cooling the reactors. In very short form, after the power outage the passive emergency cooling system was off and they didn't even realize it.
Just like database managers saying: yeah, there is that commit log, just delete it. Great choice guys, it seems that you don't know what the function of the journal is after a crash. It's true that recovery after a crash is faster if the journal is deleted. But it leads to data corruption. Great move, but it seems to be one of the tools in the DBA's standard toolset.
Spanning tree, real-time web applications
Tested: Freedcamp, Trello, Asana, Apptivo - For software development team work I liked Freedcamp, for very simple task management I loved Trello. But Apptivo seems to be the most complete product of these. Maybe a bit too heavy for simple tasks, but in general it seems to be the best tool of all of these.
Bitmessage - Only route refresh messages should be flooded. Flooding every message to every stream node is a very bad thing. I just remember too well how much of a fail the original Gnutella protocol was. It did work well with a few nodes, but it was totally unscalable by design. Message types (inventory, getdata, senddata, sendpeers)
- The client allows the system to be flooded with stale TCP connections. Because I can complete the 3-way handshake using raw sockets, I could pretty easily flood all nodes on the network with countless stale connections. I didn't try what would actually happen if I flooded all network nodes with millions of stale connections. But it seems that there's a real problem there. First of all, tons of connections shouldn't be allowed from the same IP, and there should be some limits which keep stale connections alive for a shorter time. Now connections remain open for quite a long time (too long?), allowing the attack to be effective without any additional handshake or state.
- A) Do not allow tons of connections from the same IP, especially if there hasn't been ANY negotiation. This flaw makes the attack super trivial and easy. I can disable the ability to connect new nodes for the whole network from one of my servers with a gigabit connection in a few minutes, and that's super trivial and easy to do.
- B) Because the client doesn't recover from the initial attack, there's a coding flaw. Normally the client should return to 'normal state' after those connections die for whatever reason. But this doesn't seem to be happening. The client can't form even outbound connections after the attack.
- All this is followed by the classic: socket.error [Errno 24] Too many open files.
- So there are at least two vulnerabilities. A third one is pretty bad too: they allow fake peer information to get propagated throughout the network. This can also be utilized to attack other TCP services by making almost all Bitmessage clients connect to those. There's no check if a peer is valid before propagation.
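The two mitigations suggested in A) and in the stale connection note can be sketched as a small bookkeeping class: a cap on open connections per IP, and an idle timeout after which silent connections are dropped. This is a hypothetical illustration, not Bitmessage code; `ConnectionLimiter`, its method names and its default limits are all assumptions.

```python
import time
from collections import defaultdict


class ConnectionLimiter:
    """Track open connections per IP; reject the excess and expire idle ones."""

    def __init__(self, max_per_ip=3, idle_timeout=30.0, clock=time.monotonic):
        self.max_per_ip = max_per_ip
        self.idle_timeout = idle_timeout
        self.clock = clock
        self._last_seen = defaultdict(dict)  # ip -> {conn_id: last activity}

    def try_accept(self, ip, conn_id):
        """Return True if the new connection may be kept, False to drop it."""
        self.expire_idle()
        conns = self._last_seen[ip]
        if len(conns) >= self.max_per_ip:
            return False
        conns[conn_id] = self.clock()
        return True

    def touch(self, ip, conn_id):
        """Record activity (e.g. a valid protocol message) on a connection."""
        if conn_id in self._last_seen.get(ip, {}):
            self._last_seen[ip][conn_id] = self.clock()

    def close(self, ip, conn_id):
        self._last_seen.get(ip, {}).pop(conn_id, None)

    def expire_idle(self):
        """Forget connections that have been silent longer than idle_timeout."""
        now = self.clock()
        expired = []
        for ip, conns in list(self._last_seen.items()):
            for conn_id, seen in list(conns.items()):
                if now - seen > self.idle_timeout:
                    del conns[conn_id]
                    expired.append((ip, conn_id))
            if not conns:
                del self._last_seen[ip]
        return expired
```

The caller is expected to actually close the sockets returned by `expire_idle`, which also frees file descriptors and addresses the recovery problem in B).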
I had nine computers running an Edonkey2000 cluster, which harvested files from Sharereactor. It included my own control software, which took care of load balancing and allocating files based on disk space, bandwidth demand & peers on different servers.
Early ed2k servers got overloaded when there were too many peers connecting to them. The worst part was that the server only sent a list of N peers to every client. So when the swarm, or client group downloading the same file, grew large enough, all new peers got information only about that limited number of peers connected to the server. The rest of the peers were unconnected because there was no peer-to-peer gossip protocol with early servers.
I convinced the ED2K guys to select the rarest block instead of a random one when choosing which block to download from a peer which has multiple required blocks available. I also told the GNUnet developers that the peers would need local block cache eviction. Old versions didn't evict blocks, and when the cache got full, those were simply unable to insert new blocks. Of course this isn't a problem as long as there's huge churn of peers which install the client, run it for a while and forget it. But this totally ruins the help provided by peers which are connected all the time and would be able to provide considerable resources to the network, if the cache were fully utilized. Note! It's not a cache, it's data storage, which remains persisted on disk when the client isn't running.
- [2015 note] This is exactly what I would also like to see from Tribler.
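A minimal sketch of rarest-first block selection as described above, not the actual ed2k client logic. The functions `count_availability` and `pick_block` are hypothetical, and availability is simply the number of known peers holding each block.

```python
import random
from collections import Counter


def count_availability(peers):
    """peers: mapping of peer name -> set of block numbers that peer holds."""
    counts = Counter()
    for blocks in peers.values():
        counts.update(blocks)
    return counts


def pick_block(needed, peer_has, availability, rng=random):
    """Pick the rarest block we still need that this peer can provide."""
    candidates = [b for b in needed if b in peer_has]
    if not candidates:
        return None
    rarest = min(availability[b] for b in candidates)
    # Break ties randomly so that not every downloader picks the same block.
    return rng.choice([b for b in candidates if availability[b] == rarest])
```

Fetching the rarest block first spreads scarce blocks through the swarm sooner, so a file is less likely to become incomplete when the few peers holding a rare block leave.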
Read: Introduction to financial economics by Professor Hannu Kahra
Managed Wlan systems, and configured one for a customer with five base stations
MitMproxy - Yet another tool for hijacking connections and stealing data
Finished reading Secrets of Big Data Revolution, by Jason Kolb and Jeremy Kolb
Watched Google I/O 2013 Keynote
Studied: software defined storage, universal storage platform, virtual storage platform, system storage SAN volume controller, storage hypervisor, virtual software, geofencing, location intelligence, integration architect
User interface design, general usability (UX), and viewing the product usability from end users and customers roles.
General Data Protection Regulation
UnQLite - An Embeddable NoSQL Database Engine
Refreshed my memory about Freemail (email over Freenet) documentation and how they check if certain keys are being used and which keys should be used to publish new messages
Just keywords: Predictive modelling, Intermessage delay, asm.js, Gitian, yappi, pump.io
Bitmessage test attack client goals: Network propagated, persistent attack (an invalid message which gets propagated by the network, but still crashes the peer afterwards when being processed by the client). As an example, the packet passes the networking and data store parts, but the user interface displaying a notice about it crashes the client. In some cases this would even cause the client to crash again when it's restarted. Only purging the data store would fix this issue.
I like web apps because those work on all platforms: Sailfish, Firefox OS, Tizen, Ubuntu, HTML5, COS.
What's the point of using state of the art VPN software when it's configured not to use keys, and the passwords being used with it are, to put it mildly, simply moronic? It's just a matter of time before the state of the art security is broken.
SQLite3 3.7.17 release notes
SQLite3 Memory-Mapped I/O
Bitmessage Fail - Number of connected hosts isn't managed properly, which can easily lead to several problems and running out of TCP sockets. Also, the inventory message flood allows a memory consumption attack (which caused remote clients to crash and therefore led to the situation where I detected the connection management issue) - Quite good reasons NOT TO run P2P software on systems doing anything other than running just the P2P system itself. Always properly isolate P2P systems from all other systems. Bitmessage also failed to exclude 172.16.0.0/12 addresses. So it was possible to make the Bitmessage clients connect to local address space by spoofing peer addresses as described above.
- Bitmessage vanity address generation is trivial. All it takes is changing one line in the source code and some time.
- Bitmessage potential timing attack... You can flood or crash connected nodes and see how quickly the message received confirmation comes. Timing attacks might also reveal who's the sender in case of distributed lists. Just connect to really many nodes, do not relay the message yourself and observe the order in which the message is offered to your node by other nodes. Restructure the network, crash a few peers and recheck. Address messages can also be used to check out the network structure, because messages are flood casted. Send a message to only one peer and see in which order and after what time other peers offer it back to you.
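The missing 172.16.0.0/12 exclusion mentioned above is the kind of check that Python's standard `ipaddress` module makes hard to get wrong, since it knows all the private ranges. A minimal sketch of validating an advertised peer address before connecting to it or propagating it; the function name `is_routable_peer` is hypothetical and this is not the actual Bitmessage fix.

```python
import ipaddress


def is_routable_peer(address):
    """Return True only for addresses that look like public, unicast hosts."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # not an IP address at all
    return not (
        ip.is_private        # 10/8, 172.16/12, 192.168/16, ...
        or ip.is_loopback    # 127/8, ::1
        or ip.is_link_local  # 169.254/16, fe80::/10
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified  # 0.0.0.0, ::
    )
```

This only filters obviously bogus addresses. It does not prove that the advertised peer really runs the protocol, so some form of connect-back verification before propagation would still be needed against the fake peer attack.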
Native applications versus HTML5 applications (in Finnish)
HTML5 features you need to know
Non-blocking transactional atomicity
Good reminder: You're dangerously bad at cryptography - Yes, it's hard as any other security related topic. Getting some basics right is quite easy, but after that getting it absolutely right is nearly impossible.
OpenVPN, SSH and Tor port forwarding & tunneling
Unhosted applications - Freedom from monopoly - Distributed Standalone Applications using Web Technology
Parallelism and concurrency need different tools
Checked out Wise.io
Is NUMA good or bad? Google finds it in some situations up to 20% slower. - I guess it's all about memory access patterns.
In the upcoming Google App Engine 1.8.1 release, the Datastore default auto ID policy in production will switch to scattered IDs to improve performance. This change will take effect for all versions of your app uploaded with the 1.8.1 SDK. - Ok, so they didn't like the original concept where 1000 consecutive keys were allocated at once for a peer. Even that didn't guarantee incremental key allocation, because each thread got its own key space for performance reasons.
Studied more Security Information and Event Management (SIEM)
I'm one of the StartMail beta testers.
Started to use a Comodo Personal Email Certificate as an alternative to my GnuPG / PGP keys.

Link: 2013 mega dump 2 of 2