Archive Mount is a FUSE application that allows you to mount any zip/gz/bz2/tar file (in fact, anything that libarchive supports). This is very useful if you want to get at the files inside an archive without extracting them all.
In my use case I’m using Archive Mount with a zip file containing 10,000 files. This turned out to be problematic: Archive Mount would take ~20 seconds to actually mount the zip file, and just as long to run an “ls” in the mounted directory.
So I downloaded the source and started making some tweaks to improve the performance. All my changes can be found on GitHub, and so far I’ve done the following:
- Fixed a couple of minor problems. 1 2
- I made some tweaks to store the last child of each node as well as the head child, so appending a child no longer has to walk the list. This improves the startup speed by ~20%. 3
- I also store the basename as well as the full file name. This reduced the number of calls to strrchr and gave a measurable improvement (at the cost of one additional pointer for each file). 4
- I also changed init_node and free_node a little bit. This simplified the code in places. 5 6
- Finally, I completely replaced the linked list structure with a hash table. For small archives the speed difference is not noticeable, but for large archives I saw a 50x speed improvement! The awesome uthash library helped me do that. 7
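To make the list above concrete, here is a minimal sketch of the three ideas together: a tail pointer for constant-time appends, a cached basename, and a hash lookup by full path. It is not the actual archivemount code. The names `NODE`, `child_list`, `new_node`, `add_child` and `find_node` are made up for illustration, and a small hand-rolled chained hash table stands in for uthash so the snippet has no external dependencies.

```c
#include <stdlib.h>
#include <string.h>

#define BUCKETS 1024 /* arbitrary table size for this sketch */

typedef struct node {
    char *name;                /* full path, owned by the node */
    const char *basename;      /* points into name, just after the last '/' */
    struct node *next_sibling; /* next entry in the child list */
    struct node *hash_next;    /* next entry in the same hash bucket */
} NODE;

typedef struct {
    NODE *head; /* first child */
    NODE *tail; /* last child, so appending never walks the list */
} child_list;

static NODE *buckets[BUCKETS];

/* djb2 string hash, reduced to a bucket index */
static unsigned long hash_path(const char *s) {
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % BUCKETS;
}

/* Allocate a node and compute its basename once, up front */
NODE *new_node(const char *path) {
    size_t len = strlen(path) + 1;
    NODE *n = calloc(1, sizeof *n);
    if (!n)
        return NULL;
    n->name = malloc(len);
    if (!n->name) {
        free(n);
        return NULL;
    }
    memcpy(n->name, path, len);
    const char *slash = strrchr(n->name, '/');
    n->basename = slash ? slash + 1 : n->name;
    return n;
}

/* O(1) append via the tail pointer, plus registration in the hash table */
void add_child(child_list *list, NODE *n) {
    if (list->tail)
        list->tail->next_sibling = n;
    else
        list->head = n;
    list->tail = n;

    unsigned long b = hash_path(n->name);
    n->hash_next = buckets[b];
    buckets[b] = n;
}

/* Lookup by full path: scans one bucket instead of every node */
NODE *find_node(const char *path) {
    for (NODE *n = buckets[hash_path(path)]; n; n = n->hash_next)
        if (strcmp(n->name, path) == 0)
            return n;
    return NULL;
}
```

With a plain linked list, every lookup compares against up to all 10,000 names. With the table, a lookup only touches the handful of nodes that share a bucket, which is the kind of change that only shows up on large archives.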
I’m also currently working on a complete overhaul of the open/read code. Once done, I’ll be able to open and read from files very efficiently. At the moment a read bizarrely takes O(N) (where N is the number of files in the zip file), and each read then requires re-reading the entire file up to the seek point.
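One way to avoid re-reading from the start on every call is to remember where the decoder currently is, and only rewind when a read asks for an earlier offset. The sketch below illustrates that idea under my own assumptions; it is not the rewritten archivemount code. The `stream` type is a hypothetical stand-in for a forward-only decompressor (here it just copies from a memory buffer), and `bytes_decoded` exists only to show how much work each read costs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical forward-only stream standing in for a decompressor */
typedef struct {
    const char *data;     /* decoded contents (a real decoder would produce these lazily) */
    size_t size;          /* total decoded size */
    size_t pos;           /* current position in the decoded stream */
    size_t bytes_decoded; /* total bytes processed so far, for illustration */
} stream;

/* Read (or skip, when out is NULL) up to len bytes from the current position */
static size_t stream_read(stream *s, char *out, size_t len) {
    size_t avail = s->size - s->pos;
    if (len > avail)
        len = avail;
    if (out)
        memcpy(out, s->data + s->pos, len);
    s->pos += len;
    s->bytes_decoded += len;
    return len;
}

/* Restart decoding from the beginning */
static void stream_rewind(stream *s) {
    s->pos = 0;
}

/* Positional read that reuses the current stream position when it can */
size_t read_at(stream *s, char *out, size_t len, size_t offset) {
    if (offset < s->pos)
        stream_rewind(s);                          /* backward seek: start over */
    if (offset > s->pos)
        stream_read(s, NULL, offset - s->pos);     /* forward seek: skip the gap */
    if (s->pos < offset)
        return 0;                                  /* offset is past the end */
    return stream_read(s, out, len);
}
```

For the common case of sequential reads, each call picks up where the previous one stopped, so the total work is proportional to the file size rather than to the file size times the number of reads. Only a backward seek pays the cost of decoding from the beginning again.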
I’m sending all these changes upstream, so hopefully my work will appear in your local copy of archivemount soon! Until then, follow my project on GitHub.