I've often talked about problems and slowness caused by multiple overlaid abstraction layers. In some cases those seem to make things simpler, but in reality the whole mess just becomes more complex, and overall performance suffers due to the complexity and lack of transparency.
I did enjoy reading Making full table scan 10x faster in InnoDB. But personally I would have optimized the task differently. Of course this is purely technical writing and I have no plans to implement it for MySQL, but this is how I would do it in theory.
Things that were really wrong with the original MySQL dump:
1. Dumping the table in primary key order is pointless if we're dumping the whole table for later processing.
2. Even ignoring that, dumping the table in logical page order isn't optimal either.
Well, how would we then reach optimal performance?
Let's keep it very simple and just dump the table in the order it is stored on the block storage medium.
1. Get the list of disk blocks which contain the table. This is of course easier if the table is in its own file, but it's doable anyway. Also skip blocks linked to pages which contain only empty / deleted records.
2. Order the block list by block number (this is the optimal read order for the data, completely ignoring logical database order or any key order).
3. Read the blocks from disk in order, and dump (complete) records in read order. Where records aren't aligned with disk blocks, keep the partial records in memory until the missing data has been read, then dump them out.
4. To optimize this (as already mentioned in the 10x article), keep at least 32 parallel read requests in the queue to maximize disk read speed. This greatly affects even modern SSD performance.
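A minimal Python sketch of the core of these steps. Step 1 is highly engine- and filesystem-specific, so the block-to-page mapping is simulated here with a plain dict, and step 4 (queue depth) is left out for clarity:

```python
def dump_in_disk_order(block_map, read_block):
    """block_map: {disk_block_number: page_id}; read_block(n) returns the
    raw bytes of block n. Yields (page_id, data) in on-disk order, i.e.
    steps 2 and 3 above. Partial-record buffering is omitted for brevity."""
    for block_no in sorted(block_map):        # step 2: sort by block number
        yield block_map[block_no], read_block(block_no)

# Simulated 'disk': pages stored out of logical order on blocks 7, 2 and 9.
disk = {7: b'page-A', 2: b'page-C', 9: b'page-B'}
block_map = {7: 'A', 2: 'C', 9: 'B'}

result = list(dump_in_disk_order(block_map, disk.__getitem__))
# Blocks are read in order 2, 7, 9, regardless of logical page order.
```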
Afaik, this is the most efficient way to dump a database table from an HDD. On an SSD block device it's highly likely that steps 1 and 2 don't provide any real benefit, but step 4 is still very important.
I've been doing stuff like this a lot when implementing ETL (Extract, Transform, Load) solutions for BI / CRM / ERP systems. I always wonder how the 'official solution' can be so extremely slow. The reason is often that it's doing something which is, in this context, meaningless. When you make a light implementation which does only and exactly what is required, efficiently, it's usually a lot faster, sometimes amazingly faster, than the 'right way of doing it'.
Of course this is a low-level approach, and it requires that you have direct access to the data storage and that there won't be any writes to the table during the dump.
I often use backup copies of the database for these large dump tasks, so there's no other activity on the data source.
I usually achieve 10x or even 100x speeds easily when using Python. Everyone tells me that Python is slow, but in this case it's usually about disk I/O and the database engine: the Python code ends up using 1-5% of the CPU time. So plain Python threads are usually enough, even though I'm completely familiar with multiprocessing for more CPU-intensive processing tasks.
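Since the workload is I/O-bound, plain threads really are enough to keep ~32 reads in flight (step 4 above): the GIL is released during blocking I/O. A hedged sketch, where `read_block` stands in for a real `os.pread()` on the device:

```python
from concurrent.futures import ThreadPoolExecutor

def read_block(block_no):
    # In a real dump this would be something like
    # os.pread(fd, BLOCK_SIZE, block_no * BLOCK_SIZE) -- simulated here.
    return ('data-%d' % block_no).encode()

def parallel_read(block_numbers, queue_depth=32):
    """Read blocks with up to queue_depth requests in flight at once."""
    with ThreadPoolExecutor(max_workers=queue_depth) as pool:
        # map() preserves input order, so the output stays in disk order
        # even though the reads themselves complete out of order.
        return list(pool.map(read_block, block_numbers))

blocks = parallel_read([2, 7, 9])
```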
With Ext4, Btrfs and other extent-based filesystems, extents can be used instead of blocks; basically those are just contiguous block groups.
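One practical way to get that extent list on Linux is to parse the output of `filefrag -v` (from e2fsprogs), which reports each extent's logical and physical offsets. The column layout below is from a typical run and may vary between versions, so treat this parser as a sketch:

```python
import re

def parse_filefrag(output):
    """Return [(physical_start_block, length_in_blocks), ...] sorted by
    physical start, i.e. the on-disk read order for the file's extents."""
    extents = []
    for line in output.splitlines():
        # Extent lines look like:
        #    0:        0..    2047:      90112..     92159:   2048:
        m = re.match(r'\s*\d+:\s*\d+\.\.\s*\d+:\s*(\d+)\.\.\s*\d+:\s*(\d+):', line)
        if m:
            extents.append((int(m.group(1)), int(m.group(2))))
    return sorted(extents)

# Sample `filefrag -v` output for a hypothetical table.ibd file:
sample = """\
Filesystem type is: ef53
File size of table.ibd is 20971520 (5120 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..    2047:      90112..     92159:   2048:
   1:     2048..    5119:      34816..     37887:   3072:      92160: last,eof
"""
extents = parse_filefrag(sample)
# Note the second logical extent comes first in physical (on-disk) order.
```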
But if we want to make this really complex, we can talk about NTFS compression. Now the data is in 64 kilobyte compression units, which are still written to disk in 4 kilobyte clusters (as many as are required after compression). This is something that would require really low-level disk access, decompressing a unit only when all of its (up to 16) 4 kilobyte clusters have been read. So it would be yet another buffer to fill before further processing when scanning the disk for data. This could consume a lot of memory when the disk is highly fragmented and the data isn't well compressible. But with files that compress better, less memory is used, since there are fewer on-disk clusters per 64 kilobyte compression unit.
I'm very happy to receive feedback & follow-up posts, and I'll link to those too. - Thanks!
1. My friend immediately asked: why not just copy the whole database file? But a file copy isn't an optimal solution either. There may be other tables, data & indexes in the same file, depending on the solution being used, plus stale data, i.e. dead "deleted" records which may be on a free list. Also, a file copy doesn't read the data in "on-disk order"; it reads the data file in "logical file" order, so it's much slower with highly fragmented files anyway.
KW: mysql, postgresql, postgres, mongodb, database, speed, dumping, tables, table, export, processing, process, refining, reports, reporting, graphs, analysis, analytical, analyzing, solid state disk, hard disk drive, device, harddisk.