<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Utf8 on bramp.net</title>
    <link>https://blog.bramp.net/</link>
    <description>Recent content in Utf8 on bramp.net</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-GB</language>
    <lastBuildDate>Thu, 23 Sep 2010 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://blog.bramp.net/tags/utf8/" rel="self" type="application/rss+xml" />
    
    <item>
      <title>UTF-8 Directory Listing</title>
      <link>https://blog.bramp.net/post/2010/09/23/utf-8-directory-listing/</link>
      <pubDate>Thu, 23 Sep 2010 00:00:00 +0000</pubDate>
      
      <guid>https://blog.bramp.net/post/2010/09/23/utf-8-directory-listing/</guid>
      <description><p>I needed to create a directory listing with all the UTF-8 characters intact. This is quite a chore on Windows, because anything that goes through the shell seems to mangle the characters and show ???? in their place. For example, the built-in <strong>dir</strong> and Cygwin's <strong>ls</strong> and <strong>find</strong> were all affected. This turns out to be a <a href="http://stackoverflow.com/questions/379240/is-there-a-windows-command-shell-that-will-display-unicode-characters">limitation of the Windows shell</a>.</p>
<p>To solve this problem I wrote a bit of Python that reads the file names as Unicode and writes the results directly to a UTF-8 encoded file (not via a pipe, which would again go through the shell). The resulting very simple script is as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">os</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">codecs</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">log</span> <span class="o">=</span> <span class="n">codecs</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s1">&#39;listing&#39;</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="s1">&#39;w&#39;</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="s2">&#34;utf-8&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">root</span><span class="p">,</span> <span class="n">dirs</span><span class="p">,</span> <span class="n">files</span> <span class="ow">in</span> <span class="n">os</span><span class="o">.</span><span class="n">walk</span><span class="p">(</span><span class="sa">u</span><span class="s1">&#39;.&#39;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">	<span class="n">log</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">root</span> <span class="o">+</span> <span class="sa">u</span><span class="s2">&#34;</span><span class="se">\n</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">	<span class="k">for</span> <span class="n">file</span> <span class="ow">in</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">files</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">		<span class="n">log</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="n">file</span><span class="p">)</span> <span class="o">+</span> <span class="sa">u</span><span class="s2">&#34;</span><span class="se">\n</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">log</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</span></span></code></pre></div></description>
    </item>
    
  </channel>
</rss>
