Linux filesystem: btrfs fragmentation / allocation strangeness and bad sector ranges
I'm having trouble understanding why these files are allocated like this. And no, this isn't the first nor the last time this happens. Note that these files have NOT been preallocated, NOR have they been partially updated in a way that would rewrite only segments of the file. I would therefore assume that delayed allocation would make the file "mostly contiguous" on a file system with plenty of free space. Note that the pattern repeats throughout the whole file; this is just a very small snippet. The file was written to disk by Firefox while downloading from a local server.
The primary question is: why have the file(s) been written in an alternating A,B,A,B,A,B "area / location" pattern? I would appreciate it if anyone could give any meaningful answer or guess.
Maybe Firefox flushes the file to disk repeatedly during the download? I already proved this to be wrong.
I'll repeat the same test with Python, writing 64 KiB blocks at 100 MiB/s for a total of 2 gigabytes, to see what happens.
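For reference, a minimal sketch of such a test program. The block size, rate and total size come from the description above; the file name and the throttling details are my own assumptions, not the exact program used.

# Minimal sketch: write a 2 GiB file in 64 KiB blocks, throttled to ~100 MiB/s.
# The output path and the throttling loop are assumptions.
import os
import time

BLOCK = 64 * 1024            # 64 KiB per write
RATE = 100 * 1024 * 1024     # target rate, ~100 MiB/s
TOTAL = 2 * 1024**3          # 2 GiB total
PATH = "testfile.bin"        # hypothetical output path

data = os.urandom(BLOCK)
start = time.monotonic()
written = 0
with open(PATH, "wb") as f:
    while written < TOTAL:
        f.write(data)
        written += BLOCK
        # Sleep just enough to keep the average rate near the target.
        ahead = written / RATE - (time.monotonic() - start)
        if ahead > 0:
            time.sleep(ahead)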
Firefox file save from network:
6924: 5118950.. 5120965: 43563352.. 43565367: 2016: 36311680: shared
6925: 5120966.. 5120997: 36311680.. 36311711: 32: 43565368: shared
6926: 5120998.. 5125061: 43565368.. 43569431: 4064: 36311712: shared
6927: 5125062.. 5125093: 36311712.. 36311743: 32: 43569432: shared
6928: 5125094.. 5127109: 43569432.. 43571447: 2016: 36311744: shared
6929: 5127110.. 5127141: 36311744.. 36311775: 32: 43571448: shared
6930: 5127142.. 5138061: 43571448.. 43582367: 10920: 36311776: shared
6931: 5138062.. 5138069: 36311776.. 36311783: 8: 43582368: shared
6932: 5138070.. 5139389: 43582368.. 43583687: 1320: 36311784: shared
6933: 5139390.. 5139413: 36311784.. 36311807: 24: 43583688: shared
6934: 5139414.. 5141445: 43583688.. 43585719: 2032: 36311808: shared
6935: 5141446.. 5141477: 36311808.. 36311839: 32: 43585720: shared
6936: 5141478.. 5143493: 43585720.. 43587735: 2016: 36311840: shared
6937: 5143494.. 5143525: 36311840.. 36311871: 32: 43587736: shared
6938: 5143526.. 5145541: 43587736.. 43589751: 2016: 36311872: shared
6939: 5145542.. 5145573: 36311872.. 36311903: 32: 43589752: shared
6940: 5145574.. 5147589: 43589752.. 43591767: 2016: 36311904: shared
6941: 5147590.. 5147613: 36311904.. 36311927: 24: 43591768: shared
6942: 5147614.. 5148381: 43545016.. 43545783: 768: 36311928: shared
6943: 5148382.. 5148389: 36311928.. 36311935: 8: 43545784: shared
Python test program:
642: 478736.. 479807: 44152216.. 44153287: 1072: 36317921:
643: 479808.. 479823: 36317921.. 36317936: 16: 44153288:
644: 479824.. 480911: 44153288.. 44154375: 1088: 36317937:
645: 480912.. 480927: 36317937.. 36317952: 16: 44154376:
646: 480928.. 481999: 44154376.. 44155447: 1072: 36317953:
647: 482000.. 482015: 36317953.. 36317968: 16: 44155448:
648: 482016.. 484191: 44155448.. 44157623: 2176: 36317969:
649: 484192.. 484207: 36317969.. 36317984: 16: 44157624:
650: 484208.. 485295: 44157624.. 44158711: 1088: 36317985:
651: 485296.. 485311: 36317985.. 36318000: 16: 44158712:
652: 485312.. 486383: 44158712.. 44159783: 1072: 36318001:
653: 486384.. 486399: 36318001.. 36318016: 16: 44159784:
654: 486400.. 487487: 44159784.. 44160871: 1088: 36318017:
655: 487488.. 487503: 36318017.. 36318032: 16: 44160872:
656: 487504.. 488575: 44160872.. 44161943: 1072: 36318033:
657: 488576.. 488591: 36318033.. 36318048: 16: 44161944:
658: 488592.. 489679: 44161944.. 44163031: 1088: 36318049:
659: 489680.. 489695: 36318049.. 36318064: 16: 44163032:
660: 489696.. 490767: 44163032.. 44164103: 1072: 36318065:
661: 490768.. 490783: 36318065.. 36318080: 16: 44164104:
662: 490784.. 491871: 44164104.. 44165191: 1088: 36318081:
663: 491872.. 491887: 36318081.. 36318096: 16: 44165192:
664: 491888.. 492959: 44165192.. 44166263: 1072: 36318097:
665: 492960.. 492975: 36318097.. 36318112: 16: 44166264:
Most interesting. Answers please!
Minor update: I made a few extra tests. I wrote another Python program which writes small blocks of data to the disk faster, and I also ran both programs on an alternate file system. The results are still similar, with only very small differences. The primary point remains the same: why the alternating storage location pattern with very small extents?
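For completeness, the faster small-block writer was along these lines; the 4 KiB block size, file name and unthrottled loop here are assumptions rather than the exact program.

# Hypothetical variant: smaller, unthrottled writes (block size is an assumption).
import os

BLOCK = 4 * 1024             # assumed small block size
TOTAL = 2 * 1024**3          # 2 GiB total
data = os.urandom(BLOCK)
with open("testfile_small.bin", "wb") as f:
    for _ in range(TOTAL // BLOCK):
        f.write(data)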
Bad Sectors (data ranges on disk)
Another interesting observation: one huge ~20 TB HDD had just a few bad sector ranges, but btrfs doesn't support a bad block list. So I used the partition table to work around the bad ranges. Btrfs (@ Wikipedia) supports spanning multiple partitions, so it was easy to first create "bad" partitions covering the broken areas of the storage, then create data partitions in between, and finally create the file system on those. Does it really make sense? Hmm, no? Was it an interesting experiment? Sure! It's the traditional storage issue where drives DO NOT efficiently detect and remap problematic areas and keep causing constant problems, even when there's plenty of spare space available. Well, I lost only ~20 gigabytes of the total 20 terabytes to the bad regions. Not bad. I also left a full 1 gigabyte reserve zone before and after each actual bad area, and all partitions are aligned to GiB. The file system has now been in a burn-in test for two weeks and zero problems or slowdowns have been detected. Job completed. I just wish the drive's firmware were slightly smarter and had remapped the bad sectors to spare sectors for me, but nope. This isn't the first (nor the last) time this unfortunate situation happens: the drive happily writes data to sectors and then says it can't read it back, and even with repeated testing it still won't fix the problem, even though it SHOULD.
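A rough sketch of how the usable data partition ranges could be computed from a list of bad sector ranges, following the layout above (1 GiB reserve zone on both sides of each bad area, GiB-aligned boundaries). The sector size, the simplified disk size handling and the example bad ranges are assumptions for illustration only.

# Sketch: compute GiB-aligned usable sector ranges around bad sector ranges,
# leaving a 1 GiB reserve zone before and after each bad area.
# Sector size and the example bad ranges below are assumptions.
SECTOR = 512
GIB = 1024**3 // SECTOR                 # 1 GiB expressed in sectors
DISK_SECTORS = 20 * 10**12 // SECTOR    # ~20 TB drive

# (first_bad_sector, last_bad_sector) pairs, hypothetical values
bad_ranges = [(1_234_567_890, 1_234_570_000), (30_000_000_000, 30_000_002_000)]

def align_down(sector):
    return (sector // GIB) * GIB

def align_up(sector):
    return ((sector + GIB - 1) // GIB) * GIB

usable = []
cursor = 0
for first, last in sorted(bad_ranges):
    blocked_start = align_down(max(first - GIB, 0))  # reserve + alignment before
    blocked_end = align_up(last + 1 + GIB)           # reserve + alignment after
    if blocked_start > cursor:
        usable.append((cursor, blocked_start - 1))
    cursor = max(cursor, blocked_end)
if cursor < DISK_SECTORS:
    usable.append((cursor, DISK_SECTORS - 1))

for start, end in usable:
    print(f"data partition: sectors {start}..{end}")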
TRIM / DISCARD / UNMAP / DEALLOCATE
It's very nice that btrfs starts with a discard operation over the whole file system block range when creating a new file system. We just discussed this with colleagues and wondered why Clonezilla (@ Wikipedia) / partclone doesn't do something similar. It would be trivial to trim (@ Wikipedia) the whole drive or just the partition (as a block discard range) before writing the data, and of course by default. Sure, I can run it manually before writing the image, but most users won't do that.
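As an illustration, a minimal sketch of discarding a whole block device from Python via the BLKDISCARD ioctl; blkdiscard(8) from util-linux does the same from the command line. The device path is a placeholder, the ioctl constant is the one defined in linux/fs.h, and this of course throws away everything on the device.

# Sketch: discard the whole block device before restoring an image to it.
# WARNING: this destroys all data on the device. Device path is a placeholder.
import fcntl
import os
import struct

BLKDISCARD = 0x1277      # _IO(0x12, 119) from linux/fs.h

dev = "/dev/sdX"         # placeholder device
fd = os.open(dev, os.O_RDWR)
try:
    size = os.lseek(fd, 0, os.SEEK_END)   # block device size in bytes
    # BLKDISCARD takes a pair of uint64 values: (offset, length) in bytes.
    fcntl.ioctl(fd, BLKDISCARD, struct.pack("QQ", 0, size))
finally:
    os.close(fd)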
2025-08-17