Archive Mount is a FUSE application that allows you to mount any zip/gz/bz2/tar file (in fact, anything that libarchive supports). This is very useful if you want to get at the files inside an archive without extracting them all.
In my use case I’m using Archive Mount with a zip file containing 10,000 files. This turned out to be a problem: Archive Mount took ~20 seconds to mount the zip file, and just as long to run an “ls” in the mounted directory.
So I downloaded the source and started making some tweaks to improve the performance. All my changes can be found on GitHub, and so far I’ve done the following:
- Fixed a couple of minor problems. 1 2
- I made some tweaks to store the head child, as well as the last child. This improves the start up speed by ~20%. 3
- I also store the basename as well as the full file name. This reduces the calls to strrchr and gave a measurable improvement (at the cost of one additional pointer per file). 4
- I also changed init_node and free_node a little bit. This simplified the code in places. 5 6
- Finally, I completely replaced the linked list structure with a hash table. For small archives the speed difference is not noticeable, but for large archives I saw a 50x speed improvement! The awesome uthash library helped me do that. 7
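To give a feel for why the last two data-structure changes help, here is a minimal sketch in C. It is not the actual Archive Mount code and it does not use uthash: it uses a small hand-rolled chained hash table so that it compiles on its own. The names `node_t`, `node_insert`, `node_find` and `BUCKETS` are made up for illustration, and keying the table by the full path is my assumption. Each node stores its basename as a pointer into the full name, so strrchr runs once per file at insert time rather than on every lookup.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 1024              /* hypothetical fixed table size */

typedef struct node {
    char *name;                   /* full path inside the archive (owned copy) */
    const char *basename;         /* points into name, just past the last '/' */
    struct node *next;            /* next node in the same bucket */
} node_t;

static node_t *table[BUCKETS];    /* zero-initialised: all buckets empty */

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t hash_path(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Add a path to the table; returns the new node, or NULL if out of memory. */
node_t *node_insert(const char *path)
{
    size_t len = strlen(path) + 1;
    node_t *n = malloc(sizeof *n);
    if (!n)
        return NULL;
    n->name = malloc(len);
    if (!n->name) {
        free(n);
        return NULL;
    }
    memcpy(n->name, path, len);

    /* Compute the basename once, so later code never calls strrchr again. */
    const char *slash = strrchr(n->name, '/');
    n->basename = slash ? slash + 1 : n->name;

    uint32_t b = hash_path(path) % BUCKETS;
    n->next = table[b];           /* push onto the front of the bucket chain */
    table[b] = n;
    return n;
}

/* Look up a path; only the nodes in one bucket are compared. */
node_t *node_find(const char *path)
{
    node_t *n = table[hash_path(path) % BUCKETS];
    for (; n; n = n->next)
        if (strcmp(n->name, path) == 0)
            return n;
    return NULL;
}
```

With a plain linked list, every lookup walks up to all N entries, so touching each of 10,000 files costs on the order of N² string comparisons. With the table above, a lookup only compares against the handful of entries that share a bucket.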
I’m also currently working on a complete overhaul of the open/read code. Once done, I’ll be able to open and read from files very efficiently. At the moment a read bizarrely takes O(N) time (where N is the number of files in the zip file), and each read then requires re-reading the entire file up to the seek point.
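One possible way to avoid the re-reading, sketched below in C, is to remember how far the forward-only stream has already advanced and only restart when a read asks for an earlier offset. This is an illustration under my own assumptions, not the design I have settled on: `stream_t`, `stream_pull` and `cached_read` are hypothetical names, and an in-memory buffer stands in for the decompressed archive entry. The `consumed` counter exists only to show how much data gets pulled through the stream.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *data;   /* stands in for the decompressed entry contents */
    size_t size;        /* total length of the entry */
    size_t pos;         /* current position of the forward-only stream */
    size_t consumed;    /* total bytes pulled so far (for illustration only) */
} stream_t;

/* Advance the stream by up to n bytes; copy them to out unless out is NULL. */
static size_t stream_pull(stream_t *s, char *out, size_t n)
{
    size_t left = s->size - s->pos;
    if (n > left)
        n = left;
    if (out)
        memcpy(out, s->data + s->pos, n);
    s->pos += n;
    s->consumed += n;
    return n;
}

/* Read len bytes starting at offset, restarting only on a backward seek. */
size_t cached_read(stream_t *s, char *out, size_t offset, size_t len)
{
    if (offset < s->pos)
        s->pos = 0;                          /* cannot go backward: reopen */
    stream_pull(s, NULL, offset - s->pos);   /* skip ahead to the offset */
    return stream_pull(s, out, len);
}
```

For sequential reads, which is the common case when a program copies a file out of the mount, each byte is then pulled through the stream once instead of once per read call. A backward seek still pays the full cost of starting over.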