<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Zip on bramp.net</title>
    <link>https://blog.bramp.net/</link>
    <description>Recent content in Zip on bramp.net</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-GB</language>
    <lastBuildDate>Tue, 08 Nov 2011 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://blog.bramp.net/tags/zip/" rel="self" type="application/rss+xml" />
    
    <item>
      <title>Archive Mount</title>
      <link>https://blog.bramp.net/post/2011/11/08/archive-mount/</link>
      <pubDate>Tue, 08 Nov 2011 00:00:00 +0000</pubDate>
      
      <guid>https://blog.bramp.net/post/2011/11/08/archive-mount/</guid>
      <description><p><a href="http://en.wikipedia.org/wiki/Archivemount">Archive Mount</a> is a <a href="http://en.wikipedia.org/wiki/Filesystem_in_Userspace">FUSE application</a> that allows you to mount any zip/gz/bz2/tar file (in fact anything that <a href="http://code.google.com/p/libarchive/">libarchive</a> supports). This is very useful if you want to get at the files inside a archive without extracting them all.</p>
<p>In my use case I’m using Archive Mount with a zip file containing 10,000 files. This seemed to be very problematic as Archive Mount would take ~20 seconds to actually mount the zip file, and just as long to run a “ls” in the mounted directory.</p>
<p>So I downloaded the source, and started to make some tweaks to improve the performance. All my changes can be found on <a href="https://github.com/bramp/archivemount">github</a>, and so far I’ve done the following:</p>
<ol>
<li>Fixed a couple of minor problems. <a href="https://github.com/bramp/archivemount/commit/f173bb8766aed2ae62a53115c6f7a0a0a157b081">1</a> <a href="https://github.com/bramp/archivemount/commit/457e8f9199d0829b247229eb3910d94c1d98263c">2</a></li>
<li>I made some tweaks to store the head child, as well as the last child. This improves the start up speed by ~20%. <a href="https://github.com/bramp/archivemount/commit/882ff0979ec379b8e46e25c2bbf23ba0bbe19f6c">3</a></li>
<li>I also store the basename as well as the full file name. This reduced the calls to strrchr, and actually had a measurable improvement (At the cost of using one additional pointer for each file). <a href="https://github.com/bramp/archivemount/commit/10178dc167e06598468a644cb5b5469aac4ea098">4</a></li>
<li>I also changed init_node and free_node a little bit. This simplified the code in places. <a href="https://github.com/bramp/archivemount/commit/a996236177a8f742aa91cf8b5b90feb943d41ddd">5</a> <a href="https://github.com/bramp/archivemount/commit/0c022825795c2394e8788f289a869f05be9537ce">6</a></li>
<li>Finally, I actually completely replaced the linked list structure with a hash table. For small archives the speed difference is not noticeable, for large archives I had a 50x speed improvement! The awesome <a href="http://uthash.sourceforge.net/">uthash library</a> helped me do that. <a href="https://github.com/bramp/archivemount/commit/1f152876b1f39f53622d89d7fbb7b34fd70cfd10">7</a></li>
</ol>
<p>I’m also currently working on a complete re-haul of the open/read code. Once done, I’ll be able to very efficiently open and read from files. At the moment a read bizarrely takes O(N) (where N is the number of files in the zip file), and then each read requires re-reading the entire file up until the seek point.</p>
<p>I’m sending all these changes <a href="http://www.cybernoia.de/software/archivemount/">upstream</a>, so hopefully my work will appear in your local copy of archivemount soon! Until then <a href="https://github.com/bramp/archivemount/">follow my project on github</a>.</p></description>
    </item>
    
  </channel>
</rss>
