
7-zip MS .cab file handling bug?

posted Sep 17, 2014, 8:29 AM by Sami Lehtinen   [ updated Sep 17, 2014, 8:31 AM ]
It seems that 7-Zip fails when handling larger .cab files, and so does Microsoft's own makecab.exe.

I created and extracted a few .cab files and generally puzzled over the internals of the format, because I had received one .cab file that expand.exe extracts successfully. But when I try to create a similar .cab file, I get the error message "ERROR: (FCIAddFile)Data-size or file-count exceeded CAB format limits", and 7-Zip says the .cab file is invalid after extracting about 30% of it. That's very strange. All this started when I received a ~1.5 GB .cab file containing a ~2.8 GB file, and 7-Zip refused to extract it.

When compressing that extracted file with makecab.exe, the error occurs at: " 77.02% - raw=2,147,450,880  compressed=1,133,243,115". Just as expected when using 32-bit signed addresses: the raw byte count sits right below 2^31 = 2,147,483,648. I do personally wonder what the point of signed addressing is.
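To put a number on "right below the limit", here is a small arithmetic check. The only input is the raw byte count from the makecab.exe output above; the variable names are mine. The second comparison is just something I noticed in the numbers, and whether it points at a 16-bit block counter in the format or is a coincidence is a guess on my part.

```python
# Raw byte count reported by makecab.exe at the point of failure.
raw_at_failure = 2_147_450_880

# First value that no longer fits in a signed 32-bit integer (2 GiB).
two_gib = 2**31

# How far below the signed 32-bit limit did makecab.exe stop?
gap = two_gib - raw_at_failure
print(gap)  # 32768

# The same figure also factors as 65,535 chunks of 32 KiB each.
is_block_multiple = raw_at_failure == 65_535 * 32_768
print(is_block_multiple)  # True
```

So makecab.exe gave up exactly one 32 KiB chunk short of 2 GiB.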

I have already asked how they created it in the first place. It might be possible that MS SQL Server is able to create .cab files that contain files larger than 2 GB, and that seems pretty likely at this point. I also verified that the file extracted with expand.exe does seem to contain valid data all the way to the end of the file, which makes this case even stranger. The file extracted by 7-Zip is exactly the same size but, as said earlier, roughly the last two thirds of it are full of zeros.
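One way to pin down where the two extracted copies start to disagree is to compare them chunk by chunk and report the first differing offset. This is only a sketch of how such a check could look, not the exact method I used; the function name and the file names in the usage note are made up for illustration.

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(chunk_size)
            block_b = b.read(chunk_size)
            if block_a != block_b:
                # Walk the mismatching chunk to find the exact byte.
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file is shorter.
                return offset + min(len(block_a), len(block_b))
            if not block_a:
                # Both files ended at the same point without a difference.
                return None
            offset += len(block_a)
```

Calling something like `first_difference("from_expand.bin", "from_7zip.bin")` (hypothetical names) would return the offset where the 7-Zip output stops matching the expand.exe output.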

Why did I try to create a similar file? Well, I just wanted to know whether this is a 7-Zip bug in handling large .cab files or whether something else is wrong with the file. Unfortunately, the file I'm talking about contains a confidential database, so I can't share the original. I hope there will be some kind of resolution to this question.

Details:
Compressed size: 1,461,158 KB (Smaller than 2GB)
Original size: 2,788,312 KB (Larger than 2GB)
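For reference, the same two sizes converted to bytes and compared against the 2 GiB mark. I'm assuming here that the KB figures are in units of 1024 bytes, as Windows usually shows them, so the byte counts are approximate.

```python
KIB = 1024           # assumption: the sizes above are in units of 1024 bytes
LIMIT_BYTES = 2**31  # 2 GiB, the signed 32-bit boundary

compressed_bytes = 1_461_158 * KIB
original_bytes = 2_788_312 * KIB

print(compressed_bytes, compressed_bytes < LIMIT_BYTES)  # 1496225792 True
print(original_bytes, original_bytes < LIMIT_BYTES)      # 2855231488 False
```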

Compressed file magic number: MSCF
Uncompressed file magic number: TAPE
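Checking a magic number like this only takes reading the first four bytes of the file. A minimal sketch, with helper names of my own choosing:

```python
def read_magic(path, length=4):
    """Return the first `length` bytes of the file at `path`."""
    with open(path, "rb") as f:
        return f.read(length)


def looks_like_cab(path):
    """True if the file starts with the cabinet signature 'MSCF'."""
    return read_magic(path) == b"MSCF"
```

The same `read_magic` call on the extracted file is how the TAPE signature shows up.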

Anyway, the CAB file specification says that the maximum file size is 2 GB, so how is it possible that I just successfully extracted a larger file? Has Microsoft got some non-standard, MS-only kludges in place?
Maybe they have something like 'append', so that a cabinet can contain 64K * 2GB files, which are simply appended when extracting?

Wikipedia: CAB