ETag, gzip, HTTP, NSA, DHT, Integration, Specification, Bad code, Python 3.4 libs

posted Mar 2, 2015, 8:09 AM by Sami Lehtinen   [ updated Mar 2, 2015, 8:10 AM ]
New stuff:
  • NSA's firmware hacking - It seems that they're missing the fact that hard drives have much more space than the 'service area' alone, which is reserved for the drive's own operations. Spare sectors, the SSD wear levelling area and all "free space" can be used to store a lot of data on disk, especially if it's not completely full. Even then everything except the free space remains fully usable until the drive starts to fail. Also I didn't like the claim that the data is stored unencrypted. I'm sure that if they bother to go that far, they will also encrypt the data when storing it. Not encrypting it would be just silly. Ok, often obfuscating data is enough: it's much lighter and faster, yet makes the data such that it's not immediately obvious what it is. There's no reason why ROM alone should be used to store documents. Even code which doesn't fit in ROM could be stored on disk and loaded on demand, so the code base for this kind of application can be larger than the storage space in ROM. Did somebody forget dynamically loaded segments, which were used with .exe apps a long time ago? The same address space is just swapped with different code loaded from disk if there's not enough room to store or load everything at once.
  • "One of the five machines they found hit with the firmware flasher module had no internet connection and was used for special secure communications." - This part reminded me of my secure system with an RS-232 connection. When I said that it's being used with a low speed serial link, I did mean it. The out of band attack channels are also disconnected: DCD, DSR, RTS, CTS, DTR, RI. Only transmit data or receive data and ground are connected, and RD / SD is selected using a switch on the cable, which makes the cable always unidirectional. If the other pins were connected, it would be possible to carry data over the CTS, RTS, DSR, DTR and RI pins without the LEDs indicating it. I'm using the DB9 pinout; with the DB25 pinout there would be many other pins which could be used to relay out of band data. As said, it's important to make relaying data in or out as visible and as hard as possible.
  • "The attack works because firmware was never designed with security in mind" - Made me smile. Well, that's true. In most cases, software is barely working. Who would want to spend additional resources to make a program secure when adding those features could also make the system brittle and harder to manage & maintain? Security not being a priority is the norm when creating software. There are much more important things to consider, like whether the program is working at all and not crashing all the time. Anyway, aren't applications and security only for honest people? If somebody really wants to get in, they will.
  • Got a bit bored at home and wrote decorators for ETag and gzip handling for bottle.py.
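A minimal, framework-agnostic sketch of what such a decorator could look like. This is not the actual bottle.py code: the handler signature (a dict of request headers in, bytes out) and the name `with_etag_and_gzip` are assumptions made to keep the example self-contained. It uses a CRC32 based tag and, as a simplification, the same tag for the compressed and uncompressed representation.

```python
import gzip
import zlib
from functools import wraps


def with_etag_and_gzip(handler):
    """Wrap handler(request_headers) -> bytes with ETag and gzip handling.

    Returns a (status, response_headers, body) tuple. All names here are
    hypothetical; a real bottle.py decorator would use bottle.request and
    bottle.response instead of plain dicts.
    """
    @wraps(handler)
    def wrapper(request_headers):
        body = handler(request_headers)
        # Validator derived from the uncompressed body (simplification:
        # the same tag is used for both encodings).
        etag = '"%08x"' % (zlib.crc32(body) & 0xFFFFFFFF)
        headers = {"ETag": etag, "Vary": "Accept-Encoding"}
        # Client already has this version: send no body at all.
        if request_headers.get("If-None-Match") == etag:
            return 304, headers, b""
        # Compress only when the client says it accepts gzip.
        if "gzip" in request_headers.get("Accept-Encoding", ""):
            body = gzip.compress(body)
            headers["Content-Encoding"] = "gzip"
        return 200, headers, body
    return wrapper


@with_etag_and_gzip
def hello(request_headers):
    return b"hello world " * 100
```

Calling `hello({"Accept-Encoding": "gzip"})` gives a compressed 200 response, and repeating the request with the returned tag in `If-None-Match` gives an empty 304.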
  • Enjoyed installing a few Fujitsu Server PRIMERGY RX200 systems with LSI MegaRAID CacheCade Pro 2.0 SSD caching solution. 
  • It seems that many storage solution sellers don't even understand the meaning of Hot-Warm-Cool-Cold tiered storage. There's no reason whatsoever to store archival data on expensive SLC RAID arrays. Only a small amount of hot data should be on the fastest possible disk system and the rest can be stored on slower tiers.
  • Wondered how some server dealers try to sell you tons of stuff you don't need, included in a package, while leaving the stuff which isn't included in the package without a listed price. I think this kind of pricing model is just annoying and wastes everybody's time. Just give me a clear price list which includes everything and I can draw my own conclusions from that. I don't want to waste time negotiating stuff which doesn't really matter. If prices are too high or service isn't what I would like it to be, then I'm not buying. Multiple negotiation rounds just waste everybody's time. Another thing which is pretty ridiculous nowadays is long contracts, like "we demand a 36 month contract". Ok, fine. If prices are lowered during the contract, do these price cuts also apply to existing contracts? No? Ok fine, I don't want that kind of deal. Some service providers do cut prices also for existing contracts, others do not, and I don't like that. If you offer a backup solution, I'm interested to know if it is an off-site backup. I would prefer an option where the backups can be fetched at any time without any assistance from the service provider, so that if required, I can even keep my own copies. How do we gain access to the backups in case of total data center loss? Yes, I know it's rare, but it has happened before and it will happen in the future too. Is invoicing clear and right? Some service providers send horrible invoices with mistakes and unclear lines, others deliver clear invoices which are always right. Some service providers provide a clear invoice every three months or so. Other service providers require advance payment per contract per month, which is horrible. Questions about how refunds are done in case the service is cancelled are also always interesting. Does the service provider provide flexible contracts where you can modify system resources as needed? If I need extra CPUs or memory for some heavy batch run, is that possible? Many service providers also offer SSD storage. Well, nice deal, but what if you don't need it? The same goes for tons of bandwidth included in the package which isn't needed either. I don't care if it's included, it's nice, but including bandwidth in the package shouldn't bring its price to the pain point. I just wonder how much server resources are sold using these kinds of Dilbert deals. Lots of BS talk, few facts and then just let's roll the monthly billing. Does anyone even know what they really need and what they're buying? Nope? I guess that's unfortunately true in many cases. Clueless customers and managers are truly clueless, and those are also the customers which keep these kinds of service providers running.
  • Quickly read through Flux article - It's yet another pub/sub messaging / queue solution. 
  • Had long discussions with friends about DHT, STUN, TURN, how to know if peers are alive, how ping and pong should actually work and how often. How to prevent reflection and amplification attacks with UDP based solutions. How to manage peer information in a sane way. Listing 10k nodes is of no use if there's really high churn. Keeping a list of a smallish number of known reliable nodes is a good idea. That way, if the bootstrap / seed / fixed initial nodes are under attack, the network won't fully collapse just because peers can't find information about currently active peers to join the network. The list also shouldn't contain too many peers which are unreliable or short term nodes. In most P2P networks it's really common to have extremely high churn rates. Some peers might run just a few minutes in a month and so on. Looking for those at a random time is quite unlikely to be successful. For example: the client pings the server every 900 seconds. If there's no reply, it enters test mode and sends 6 pings at 10 second intervals. If no pong is received, it considers the server to be dead. And if the server doesn't receive any pings from the client in 1000 seconds, it considers the peer to be dead. Of course this is only for times when the state is idle. During normal operation there's constant bidirectional communication as well as ACK / NACK packet traffic. We also covered software engineering aspects and integration architecture consulting. Lots of debugging, hanging threads, non-blocking socket I/O and all the general stuff. Plus a lot of discussion about NAT loopback: some devices support it and others do not. Some allow it to be configured freely. It's also known as NAT hairpinning or NAT reflection. I'm hosting several networks which do support loopback but a few networks do not. It's really annoying because the services can't be accessed using the public name or IP; you have to know the private IP address to access the service. Sometimes the NAT is also doing NAPT and translating the port number, so even the port number might be different for the LAN than for the "rest of the world".
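The idle-state ping / pong rules above can be sketched as a small state machine. The 900 s, 6 × 10 s and 1000 s values are from the discussion; the class and function names are hypothetical, and the sketch assumes that the client waits one 10 second interval for the reply to its regular ping before entering test mode. Time is passed in explicitly so that no real clock or socket is needed.

```python
PING_INTERVAL = 900     # seconds between pings while idle
PROBE_INTERVAL = 10     # seconds between pings in test mode
PROBE_COUNT = 6         # unanswered test-mode pings before giving up
SERVER_TIMEOUT = 1000   # server side: silence after which a peer is dead


class ClientLiveness:
    """Client-side view of whether the server is alive (hypothetical)."""

    def __init__(self, now):
        self.state = "idle"              # "idle", "testing" or "dead"
        self.next_ping = now + PING_INTERVAL
        self.awaiting_pong = False
        self.probes_sent = 0

    def on_pong(self, now):
        """Any pong proves the server is alive and resets the schedule."""
        self.state = "idle"
        self.awaiting_pong = False
        self.probes_sent = 0
        self.next_ping = now + PING_INTERVAL

    def tick(self, now):
        """Return True if a ping should be sent at time `now`."""
        if self.state == "dead" or now < self.next_ping:
            return False
        if self.state == "idle":
            if not self.awaiting_pong:
                # Regular idle ping; give the pong one probe interval.
                self.awaiting_pong = True
                self.next_ping = now + PROBE_INTERVAL
                return True
            # The regular ping went unanswered: enter test mode.
            self.state = "testing"
        if self.probes_sent >= PROBE_COUNT:
            self.state = "dead"
            return False
        self.probes_sent += 1
        self.next_ping = now + PROBE_INTERVAL
        return True


def server_considers_dead(last_ping_at, now):
    """Server side: no ping from the client within SERVER_TIMEOUT."""
    return now - last_ping_at > SERVER_TIMEOUT
```

With a silent server, a client started at time 0 pings at 900 s, probes at 910 to 960 s and declares the server dead at 970 s.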
  • Firefox 36.0 Release notes - Adds support for HTTP/2. After using this for a while, I don't know what they got wrong. This is just like what I cursed a few posts ago. Shit code is shit and you'll notice it. Firefox totally freezes and hangs and seemingly nothing is happening. Network is idle, CPU is idle, there's plenty of RAM and disk I/O capacity etc. But alas, nothing happens, why? Why? WHY?!
  • Zoompf guys wrote that they doubled their database performance by using multirow SQL inserts.
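For reference, a multi-row insert sends many rows in one `INSERT ... VALUES (...), (...), ...` statement instead of one statement per row. The sketch below uses Python's bundled `sqlite3` only to stay self-contained; the `items` table, the batch size and the function name are made up for the example, and it says nothing about which database Zoompf uses or what speedup to expect.

```python
import sqlite3

rows = [(i, "item-%d" % i) for i in range(1000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")


def insert_multirow(conn, rows, batch_size=100):
    """Insert rows using one multi-row VALUES statement per batch."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One "(?, ?)" group per row, values still passed as parameters.
        placeholders = ", ".join(["(?, ?)"] * len(batch))
        params = [value for row in batch for value in row]
        conn.execute(
            "INSERT INTO items (id, name) VALUES " + placeholders, params
        )
    conn.commit()


insert_multirow(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

The batch size is kept small here because databases limit the number of bound parameters per statement.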
  • First thing to remember with UDP is that its source addresses can be spoofed. So data shouldn't be sent to a recipient without first verifying that the request is valid. This is exactly what TCP does by its nature with the handshake. If this step is skipped, it's very easy to make any such program amplify and reflect attacks. I'm just sending packets to all OB nodes which tell that some random IP and port just requested that huge image. It's very usual to measure the amplification factor. If one 512 byte UDP packet can trigger sending 100 kB, then the amplification factor is roughly 200. If there are no measures whatsoever to prevent this (I know there are already at least some window limits) I could use my 1 Gbit/s connection to trigger a 200 Gbit/s DDoS attack easily. The targets also wouldn't know it's me even if I did it from home. So this is just a theoretical example. Sometimes even no amplification is enough for attackers, they're just happy with the masking feature. They can use a few servers with high bandwidth to indirectly attack a site, making attack detection and mitigation harder. It's important that the recipient validation is done in a way that can't itself be spoofed.
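A rough sketch of both points, with hypothetical names throughout. The amplification factor is simply response bytes divided by request bytes, so 100 kB (taken as 100 × 1024 bytes) triggered by a 512 byte request is a factor of 200. One common way to validate the source address without keeping state is to answer the first request with only a small cookie, an HMAC over the claimed address, and to send the large payload only when the client echoes that cookie back. This is an illustration of the idea, not the protocol of any particular network.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(32)   # per-server secret, never sent on the wire


def amplification_factor(request_bytes, response_bytes):
    """Bytes sent towards the victim per byte sent by the attacker."""
    return response_bytes / request_bytes


def make_cookie(addr):
    """16-byte token bound to the claimed source (ip, port)."""
    msg = ("%s:%d" % addr).encode("ascii")
    return hmac.new(SECRET, msg, hashlib.sha256).digest()[:16]


def handle_request(addr, cookie, payload):
    """Return the datagram to send back to `addr`.

    Without a valid cookie the reply is only the 16-byte cookie, so a
    spoofed source address never receives the large payload. A spoofer
    can't learn the cookie because it is sent to the claimed address.
    """
    if cookie is None or not hmac.compare_digest(cookie, make_cookie(addr)):
        return make_cookie(addr)
    return payload


factor = amplification_factor(512, 100 * 1024)
```

A cookie issued for one address is useless for another, which is what makes the validation hard to spoof.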
  • Attended Retail and Café & Restaurant 2015 expo / convention / fair / conference. Same stuff as always, self service, mobile apps, RFID, digital signage, loyalty programs and retail analytics. 
  • Had once again an interesting discussion about customer data retention. Whatever information is received will be stored indefinitely and won't ever be removed. So when you use cloud storage, have you ever considered the fact that whatever you store there, you can't ever remove? Did you understand that? Maybe not? But you should really think about it. Yes, there might be a "delete button", but it's just a scam. Nothing is ever removed, it's just hidden from YOUR view. It's still there. These are very common practices and there's nothing new or special about this. Even all temporary files back from 2013 are stored. When asked if those can be deleted, the answer was nope, we don't ever delete any data which we have once gained access to.
  • Enjoyed configuring Office 365 for one business & domain + installing Office 365 clients as well as configuring email accounts and SharePoint.
  • Replaced CRC32 etags with Python's own hash based etags using base64 encoding. Computing it is about 7 times faster, and it provides plenty more bits to avoid collisions.
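To illustrate the difference in tag size, here is a sketch of both variants. The post doesn't say which hash function was used, so the truncated SHA-256 from `hashlib` below is purely an assumption, as are the function names. No speed claim is made for this particular combination.

```python
import base64
import hashlib
import zlib


def etag_crc32(body: bytes) -> str:
    """32-bit checksum rendered as 8 hex digits."""
    return '"%08x"' % (zlib.crc32(body) & 0xFFFFFFFF)


def etag_digest(body: bytes, nbytes: int = 16) -> str:
    """Truncated SHA-256 digest, base64url-encoded without padding."""
    digest = hashlib.sha256(body).digest()[:nbytes]
    token = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return '"' + token + '"'
```

With the default `nbytes=16` the tag carries 128 bits in 22 base64 characters, compared with 32 bits in 8 hex characters for CRC32.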
  • Adding HTTP xz (LZMA2) content-encoding compression support would also be trivial, but currently no browsers support it.
  • Requirements specification, all that joy. Fixed a few things for an old project. No, 'fixing' is the wrong term, there wasn't anything wrong to begin with. The program worked exactly as specified. But after it had been in production for six months, the customer had an unexpected situation which created NEW requirements. Then there's all that age old and boring discussion about whether they should pay extra, because the integration isn't working. But they just don't get what's causing it not to work. In this case it was an especially boring case. Data is transported over HTTP as XML to another system. The structure is really simple and clear, and there are three systems interoperating via message passing. Problem? Well.
  • For some reason system, let's call it N, doesn't accept messages from system S which are generated by system W. And the reason is? Well, for some undefined reason system N can't handle a tag T which contains data for several days, even if there's no reason whatsoever for such a limitation.
    Example:
    <data>
      <day date="1">
        <stuff/>
      </day>
      <day date="2">
        <moar-stuff/>
      </day>
    </data>
    They insisted that there has to be a separate message for each day, even if there's no technical reason for it and no documentation requires it. Of course this situation creates a problem only when there's data for several days to be delivered.
    So who made the mistake? Me? Them? Nobody? And who's going to pay for it? - All just so typical integration stuff.
    Well, I 'fixed it'. It was naturally trivial to fix. I still say that I didn't fix anything, because there was no mistake to fix to begin with. I just close and reopen the message between days. Totally pointless and doesn't change a thing practically, except that now it works.
    Funny thing about these things is that sometimes it takes months of pointless discussions about how it should be fixed, even if fixing it would take just 5 minutes. Some companies and integrators just seem to be much more capable than others.
    In one other case the situation was quite similar, but instead of date it was profit center. One major ERP vendor said that it's impossible to handle transactions from multiple profit centers in the same message, even if there's no technical limitation for it. In that case it wasn't even my app which was generating the data. I wrote a simple proxy which received one mixed message, split it up per profit center and then forwarded the per profit center messages. Totally insane stuff, but it works. Both parties said that it's impossible to fix such complex things, which made me laugh. One party couldn't generate per profit center data and the other party couldn't receive mixed data. I think they both got pretty bad coders. Well, luckily there was someone who was able to deal with this impossible to solve technical problem in a few hours.
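A minimal sketch of that kind of splitting step, using the example message above: one `<data>` message with several `<day>` elements goes in, and one `<data>` message per day comes out. The function name is hypothetical and this is not the actual integration or proxy code, just the idea expressed with the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def split_per_day(message):
    """Turn one <data> message with many <day> elements into one per day."""
    root = ET.fromstring(message)
    messages = []
    for day in root.findall("day"):
        # New envelope with the same tag and attributes as the original.
        single = ET.Element(root.tag, root.attrib)
        single.append(day)
        messages.append(ET.tostring(single, encoding="unicode"))
    return messages


mixed = (
    '<data><day date="1"><stuff/></day>'
    '<day date="2"><moar-stuff/></day></data>'
)
parts = split_per_day(mixed)
```

The same loop works for the profit center case by grouping on a different element or attribute before building each outgoing message.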
  • Studied Bitcoin Subspace anonymous messaging protocol for direct P2P communication. I also wrote about it.
  • CBS got the same problem PBS had earlier:
    "Unfortunately at this time we do no accept any foreign credit cards. In order for you to make a donation you will have to use a bank issued credit card from the U.S."
  • Read about Payment Services Directive (PSD2)
  • Python: Problem Solving with Algorithms and Data Structures - Just read it.
  • Noticed that SecureVNC allows a cipher called 3AES-CFB. Yay, AES256 isn't enough? Do we need 3AES already? What about using ThreeFish with 1024 bit keys?
  • Checked out twister, which is a distributed P2P implementation of Twitter.
  • Checked out Transip servers in EU - https://www.transip.eu/vps/ Excellent hosting option like Digital Ocean, Vultr, OVH, Linode and so on.
  • Quru wrote about Stockmann's webshop. - It just sucks. Actually I proved it just yesterday. My friend couldn't make her purchases from the store. I had to make the purchases because the payment solution was so broken that a standard MasterCard didn't work with it. Nothing happened after entering the credit card information, nothing at all.
  • Now it's clear, Samsung S6 doesn't even have an SD card slot. This was a really expected move, because even the old phones with an SD card slot were crippled by firmware updates so that the SD card couldn't practically be used with applications. I just wonder why nobody made a bigger fuss about this. Devices which you have already bought get downgraded via software 'updates', duh!
  • Python 3.4.3 released
  • Python 3.4 statistics library
  • Studied the Opus audio format - Because the latest VLC 2.2.0 - https://www.videolan.org/press/vlc-2.2.0.html - supports it.
  • Peewee ORM ala Charles Leifer - Techniques for querying list of objects and determining the top related item
  • dataset - Super lightweight ORM for Python
  • Python 3.4 tracemalloc - Track application memory allocation in detail 
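A short sketch of basic `tracemalloc` use from the standard library: start tracing, allocate something, then read the totals and the top allocation sites. The list of byte strings is just a stand-in workload for the example.

```python
import tracemalloc

tracemalloc.start()

# Stand-in workload: allocate roughly 1 MB in small pieces.
data = [bytes(1000) for _ in range(1000)]

# Current and peak size of traced memory blocks, in bytes.
current, peak = tracemalloc.get_traced_memory()

# Group live allocations by source line and show the biggest ones.
snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")[:3]
for stat in top:
    print(stat)

tracemalloc.stop()
```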
  • Python PEP 448 - Additional Unpacking Generalizations
  • Google PerfKitExplorer - Track performance across clouds  
  • For some reason the multi threaded version of par2 seems to crash on my new 16 core system. *** Error in `par2': 6429 Segmentation fault      (core dumped) par2 c -u -r10 recovery.par2 *** Most interestingly the crash happens after the Reed Solomon matrix construction is completed. So there's probably some kind of simple addressing failure somewhere. I'm pretty sure it's a simple bug, and not a hardware related issue. It also seems to be happening quite often.
Phew, now my backlog is gone. I did it. Hooray!