SofaUnZip 1.0b released

by Louthrax on 27-10-2015, 02:44
Topic: Development

Seems like there's been a "ZIP frenzy" in the MSX dev community recently. TNI released a first unzipper for MSX, followed by Grauw releasing a blazing fast gunzipper. Now it's Louthrax's turn to release another unzipper, based on Grauw's code, with the following features:

  • Support for the DEFLATE and STORE methods.
  • Directory support.
  • Long file name support.
  • File time and date support.
  • Fast extraction speed.
  • CRC32 check.
  • Output concatenation to a single file (useful for disk images).

It can be downloaded from Louthrax's MSX game room. Note that it is still in a beta stage. Please report any bugs on the SofaUnZip bug report thread.

Relevant link: Louthrax's MSX game room

Comments (22)

By meits

Scribe (6534)

27-10-2015, 02:51

Wow...

Need I say more?

By Grauw

Ascended (10707)

27-10-2015, 08:52

Very awesome, I'm super excited to see this development! The power of open source, people! Smile

Can't wait to see what will come next in the Sofa series.

By Louthrax

Prophet (2436)

27-10-2015, 10:43

I just released a new bugfix version (beta2) that handles empty directories correctly.

By syn

Prophet (2115)

27-10-2015, 12:19

Looks nice Big smile

By Lord_Zett

Paladin (807)

27-10-2015, 13:19

We're zipping on MSX like hell!

By Parn

Paladin (833)

27-10-2015, 13:35

Very cool. This is turning out to be a great year for MSX unarchivers. Can't wait to see what's next. Smile

By Jipe

Paragon (1604)

27-10-2015, 15:35

It's time for the apéritif Wink

By Louthrax

Prophet (2436)

27-10-2015, 16:01

Jipe wrote:

It's time for the apéritif Wink

Smile And I do believe I'm going to have to buy a third round of SUZ...

By Louthrax

Prophet (2436)

27-10-2015, 16:02

I just released the Beta3 version. Beta2 introduced a side-effect bug that caused wildcards to crash the application!

By Grauw

Ascended (10707)

27-10-2015, 23:03

Did a quick performance test: it decompresses an Aleste 2 disk image in 116s on Z80 and 24s on R800. With the CRC check disabled, it takes just 79s on Z80 and 18s on R800. Like gunzip, twice as fast as any other general-purpose unarchiver for MSX Smile.

By Louthrax

Prophet (2436)

28-10-2015, 00:09

Looks like the performance may vary depending on your SD card content / layout. Did some tests today with big ROM/DSK archives (over 600MB), using Nextor, and the disk access times were significantly slower. This might be improved by using a "level 2" read-buffer cache in the memory mapper... Thinking about it, I never imagined dealing with such big files on MSX!!

By Grauw

Ascended (10707)

28-10-2015, 10:24

Louthrax wrote:

Looks like the performance may vary depending on your SD card content / layout.

While testing gunzip, at some point I added a bunch of test gz files to the test environment, and a little later my performance tests had suddenly gotten slower for no clear reason. It turned out to be due to those added files, even though they were in a subdirectory I wasn't accessing. Quite surprising. Perhaps due to more seeking for free FAT table entries?

Anyway, now every time I do a performance test, I move those extra test files out of the test environment so I can reproduce the same numbers. A bit of a hassle but what can you do.

Louthrax wrote:

Did some tests today with big ROM/DSK archives (over 600MB), using Nextor

Oh wow Smile. Now I want to try it too, just to see it unpack such a huge file. With openMSX running unthrottled in the background, I wonder how long it would take.

Louthrax wrote:

and the disk access times were significantly slower. This might be improved by using a "level 2" read-buffer cache in the memory mapper...

DOS 2 does have a configurable cache for directory entries etc. I thought it was big by default, but it might be worth trying to increase it; I think you can specify it through an environment variable.

By Louthrax

Prophet (2436)

28-10-2015, 12:02

Tried again; there's indeed a significant slowdown on disks containing a big amount of data, probably due to FAT crawling (I tried extracting in a directory containing only one ZIP file and it did not help, so it's not "directory crawling").

So we'll need to minimize calls to DOS functions. The write buffer is 32KB and can't be expanded. I'll try to allocate one 16KB page as a "level 2" read cache and see the results. Allocating more than that would not help much, as you need separate DOS calls to read each page anyway...

I've never heard of a way to configure the cache for MSX-DOS 2 (except with external tools like DOS2CASH, but that does not work with FAT16).
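
To illustrate the "level 2" read cache idea above: a rough sketch (assumed design, not actual SofaUnZip code) where one big _READ fills a 16KB page and subsequent bytes are served from memory. The buffer address, the labels and the GetByte interface are all hypothetical, and EOF handling is omitted for brevity (a real version must stop when _READ returns 0 bytes).

    _READ   equ 48h             ; MSX-DOS 2: read from file handle
    BDOS    equ 0005h           ; DOS function dispatcher
    CACHE   equ 8000h           ; hypothetical address of the 16KB cache page
    CSIZE   equ 4000h           ; 16KB

    ; GetByte: returns the next file byte in A. The file handle is
    ; assumed to be stored in Handle; CachePtr = CacheEnd initially,
    ; so the very first call triggers a refill.
    GetByte:
        ld hl,(CachePtr)
        ld de,(CacheEnd)
        or a
        sbc hl,de
        jr z,Refill             ; cache exhausted: fetch the next 16KB
        ld hl,(CachePtr)
        ld a,(hl)               ; serve the byte from memory...
        inc hl
        ld (CachePtr),hl        ; ...and advance the cache pointer
        ret
    Refill:
        ld a,(Handle)
        ld b,a                  ; B = file handle
        ld de,CACHE             ; DE = destination buffer
        ld hl,CSIZE             ; HL = bytes requested
        ld c,_READ
        call BDOS               ; one big DOS call instead of many small ones
        ld de,CACHE             ; HL now holds the bytes actually read
        add hl,de
        ld (CacheEnd),hl        ; mark the end of valid data
        ld hl,CACHE
        ld (CachePtr),hl
        jr GetByte              ; serve the first byte of the new block

    Handle:   db 0
    CachePtr: dw CACHE
    CacheEnd: dw CACHE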

By Grauw

Ascended (10707)

28-10-2015, 12:39

Louthrax wrote:

I've never heard of a way to configure the cache for MSX-DOS 2 (except with external tools like DOS2CASH, but that does not work with FAT16).

Search for the "BUFFERS" command in the MSX-DOS 2 Commands Reference. Not sure whether it will actually do anything (can’t try now, at work).
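
For reference, a hypothetical invocation (the syntax is described in the MSX-DOS 2 Commands Reference; the buffer count shown here is arbitrary):

    A:\>BUFFERS 10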

By Louthrax

Prophet (2436)

28-10-2015, 16:55

I tried the BUFFERS command; it had no effect.

Also, it seems that the slowdown is caused by the write operation, not the read one. To test that, I put a lot of archive files on the "B:" drive, using 2 SD cards on my MegaFlashROM. Extracting from the "B:" drive to the near-empty A: drive is super fast. You can also simply try a "mkdir" on a full SD card and on an empty one; the difference is significant.

As the write buffer is already 32KB, I'm afraid there's not much we can improve on the gunzip / SofaUnZip side.

By Grauw

Ascended (10707)

28-10-2015, 17:50

Sounds like it’s the FAT cluster allocation that can take a long time.

You could maybe try to allocate the space for the entire file in one go... That is, a call to _SEEK with the file size followed by a call to _WRITE of 0 bytes. Maybe that way it can allocate a series of clusters more efficiently.

By Louthrax

Prophet (2436)

29-10-2015, 00:22

Grauw wrote:

That is, a call to _SEEK with the file size followed by a call to _WRITE of 0 bytes. Maybe that way it can allocate a series of clusters more efficiently.

Works. Fixed. Thanks Grauw Running Naked in a Field of Flowers

By Louthrax

Prophet (2436)

29-10-2015, 00:24

Small note: _WRITE with 0 bytes does not seem to do anything. I had to _SEEK to (filesize-1) and _WRITE 1 byte when filesize > 0.
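
For readers following along, a minimal sketch (untested) of the pre-allocation trick as corrected above, under assumed MSX-DOS 2 conventions: file handle in B, 32-bit file size in DE:HL (DE = high word), BDOS entry at 0005h. The routine name and the Handle/OneByte storage are illustrative, not SofaUnZip internals.

    _WRITE  equ 49h             ; MSX-DOS 2: write to file handle
    _SEEK   equ 4Ah             ; MSX-DOS 2: move file pointer
    BDOS    equ 0005h           ; DOS function dispatcher

    ; PreAlloc: B = file handle, DE:HL = file size
    PreAlloc:
        ld a,b
        ld (Handle),a           ; BDOS calls may clobber B, so keep the handle
        ld a,h
        or l
        or d
        or e
        ret z                   ; filesize = 0: nothing to pre-allocate
        ld a,h                  ; DE:HL := filesize - 1
        or l
        jr nz,NoBorrow
        dec de                  ; borrow into the high word when HL = 0
    NoBorrow:
        dec hl
        xor a                   ; A = 0: seek relative to the start of the file
        ld c,_SEEK
        call BDOS               ; move the file pointer to the last byte
        ld a,(Handle)
        ld b,a
        ld de,OneByte           ; DE = address of the single byte to write
        ld hl,1                 ; HL = byte count
        ld c,_WRITE
        call BDOS               ; writing at EOF-1 makes DOS allocate the
        ret                     ; whole cluster chain in one go

    Handle:  db 0
    OneByte: db 0

A real unpacker would then _SEEK back to offset 0 before writing the actual data.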

By Grauw

Ascended (10707)

29-10-2015, 00:26

Cool, good to know Smile.

By gdx

Enlighted (6108)

10-11-2015, 09:52

I tried SofaUnZip to decompress a ZIP that contains files in a folder. It decompressed all the files and the folder correctly, but the decompressed files were placed outside the folder.

By Louthrax

Prophet (2436)

10-11-2015, 11:42

gdx wrote:

I tried SofaUnZip to decompress a ZIP that contains files in a folder. It decompressed all the files and the folder correctly, but the decompressed files were placed outside the folder.

You should use the "x" command to extract with paths; the "e" command extracts without paths (have a look at the documentation). That's the same convention as command-line 7z (maybe also arj and rar, IIRC). Let me know if it does not work!
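
For example (a hypothetical session; the executable name SUZ.COM and the archive name are assumptions, while the "x" and "e" commands are confirmed above):

    A:\>SUZ x GAME.ZIP      extract, recreating the stored directories
    A:\>SUZ e GAME.ZIP      extract everything into the current directory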

By gdx

Enlighted (6108)

10-11-2015, 12:00

Thank you. It works with the X option. Smile