<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Snakes on your Summer</title>
	<atom:link href="http://gsoc.robertlehmann.de/feed/" rel="self" type="application/rss+xml" />
	<link>http://gsoc.robertlehmann.de</link>
	<description>Sphinx Native Language Support</description>
	<lastBuildDate>Sat, 21 Aug 2010 14:45:20 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Making Ponies Fly</title>
		<link>http://gsoc.robertlehmann.de/making-ponies-fly/</link>
		<comments>http://gsoc.robertlehmann.de/making-ponies-fly/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 11:11:40 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Pootle]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=97</guid>
		<description><![CDATA[Separate from the tremendous amount of feedback from the community I quickly want to outline the pydotorg setup for later reuse. We have the most recent versions of Sphinx (with Native Language Support), Pootle and Translate Toolkit running from checkout on our servers and a current checkout of the Sphinx project source in question. There [...]]]></description>
			<content:encoded><![CDATA[<p>Separate from the <em>tremendous</em> <a title="Twitter / Robert Lehmann" href="http://twitter.com/rlehmann/status/17958643337">amount of feedback</a> from the community I quickly want to outline the pydotorg setup for later reuse.<span id="more-97"></span></p>
<p>We have the most recent versions of Sphinx (with Native Language Support), Pootle and Translate Toolkit running from checkout on our servers and a current checkout of the Sphinx project source in question. There is a directory for <code>.pot</code> files (message templates) built by Sphinx and one for <code>.mo</code> files (compiled catalogs) built from Pootle&#8217;s <code>.po</code> files (message catalogs). One last directory is used for translated HTML builds.</p>
<p><em>Attention:</em> Apache should drop into the same system user as the one maintaining these files. This saves you a <em>lot</em> of hassle messing around with shared files, group rights, yadda yadda.</p>
<p><em>Note:</em> I will use Unix environment variable syntax for locations, feel free to statically insert your preferences there (we do).</p>
<p>Sphinx has to build message catalogs from a <code>$SRC</code> directory into Pootle&#8217;s <code>PODIRECTORY</code> set in <code>localsettings.py</code>. For our deployment, <code>$PROJECT</code> is always <code>python</code>.</p>
<pre>sphinx-build -b gettext $SRC $PODIRECTORY/$PROJECT/templates</pre>
<p>Pootle creates its translations in <code>$PODIRECTORY/$PROJECT/$LANGUAGE</code> but currently Sphinx only reads compiled message catalogs.</p>
<pre>for file in "$PODIRECTORY/$PROJECT/$LANGUAGE"/*.po; do
    # foo.po is compiled to $MODIRECTORY/$LANGUAGE/LC_MESSAGES/foo.mo
    mofile="$MODIRECTORY/$LANGUAGE/LC_MESSAGES/$(basename "$file" .po).mo"
    rm -f "$mofile" # msgfmt will do merging otherwise
    msgfmt "$file" -o "$mofile"
done
</pre>
<p>Sphinx can now pick the translations up from <code>$MODIRECTORY</code> once you add it to <code>locale_dirs</code> in your <code>conf.py</code>. The code snippet above only works for a single project (or several non-overlapping ones) but is easily extended to multiple targets. (Warning: <code>locale_dirs</code> cannot be set from the command line as it needs to be a list of paths.)</p>
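<p>A minimal sketch of the matching <code>conf.py</code> excerpt. Reading the location from the environment and the <code>locale</code> fallback are assumptions made for this example; we insert the path statically.</p>

```python
# conf.py (excerpt). A sketch: the fallback path is a placeholder.
import os

# Directory holding LANGUAGE/LC_MESSAGES/*.mo, i.e. $MODIRECTORY from above.
# Taking it from the environment is an assumption made for this example.
modirectory = os.environ.get("MODIRECTORY", "locale")

# locale_dirs is a list of paths, so it has to be set here in conf.py.
locale_dirs = [modirectory]
```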
<pre>sphinx-build -Eaq -Dlanguage=$LANGUAGE $SRC $BUILD/$LANGUAGE</pre>
<p>Your HTML build should end up in <code>$BUILD/$LANGUAGE</code>.</p>
<p><em>NB.</em> If you want to reproduce a Python documentation setup you will likely need to enable <code>sphinx.ext.oldcmarkup</code> for the time being because Python had not switched to Sphinx 1.0 at the time of our deployment. This extension will become moot as soon as Python makes the jump. Additionally you might want to enable <code>AUTOSYNC</code> and the Google Translate backend.</p>
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/making-ponies-fly/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Way We Roll</title>
		<link>http://gsoc.robertlehmann.de/the-way-we-roll/</link>
		<comments>http://gsoc.robertlehmann.de/the-way-we-roll/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 07:54:20 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Pootle]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=96</guid>
		<description><![CDATA[Without further ado I would like to announce the beta launch of the Python translation services, available at pootle.python.org. I am reprinting the full announcement made to the Python Documentation Special Interest Group here for posterity: Dear Python Documentation community, we are proud to announce the *BETA* launch of the translation services for the official [...]]]></description>
			<content:encoded><![CDATA[<p>Without further ado I would like to announce the <strong>beta launch</strong> of the <a title="Python Translations" href="http://pootle.python.org">Python translation services</a>, available at <code>pootle.python.org</code>.<span id="more-96"></span></p>
<p>I am reprinting the <a title="[Doc-SIG] Translations" href="http://mail.python.org/pipermail/doc-sig/2010-July/003877.html">full announcement</a> made to the <a title="Doc-SIG — Python Documentation Special Interest Group" href="http://www.python.org/community/sigs/current/doc-sig/">Python Documentation Special Interest Group</a> here for posterity:</p>
<blockquote><p>Dear Python Documentation community,</p>
<p>we are proud to announce the *BETA* launch of the translation services for the official Python documentation as part of Google&#8217;s Summer of Code. It is available from</p>
<p style="padding-left: 30px;"><a href="http://pootle.python.org/">http://pootle.python.org/</a></p>
<p>and is open for registration now. We have added a few languages that we felt would generate enough feedback but are always happy to add more language teams.</p>
<p>Please note that the software toolchain used to create this service is still experimental and might probably change substantially. We are trying our best to maintain stable services but cannot guarantee every bit you submit will be ultimately usable in our final translation targets.</p>
<p>If you are experiencing any trouble, want to bring up any suggestions or have other feedback do not hesitate to contact me or file a ticket at <a href="http://bitbucket.org/lehmannro/sphinx-i18n/issues">http://bitbucket.org/lehmannro/sphinx-i18n/issues</a>.</p>
<p>Robert Lehmann</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/the-way-we-roll/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Into the Wheel Shop</title>
		<link>http://gsoc.robertlehmann.de/into-the-wheel-shop/</link>
		<comments>http://gsoc.robertlehmann.de/into-the-wheel-shop/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 03:12:29 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Pootle]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=55</guid>
		<description><![CDATA[To rehash quickly, the Sphinx Native Language Support project actually spans two very different aspects: a Sphinx extension to extract/incorporate translatable strings an interface to maintain translations It turns out the latter half is already partially solved by Pootle and I can build on that instead of rolling my own half-baked ad-hoc web interface. Now [...]]]></description>
			<content:encoded><![CDATA[<p>To rehash quickly, the Sphinx Native Language Support project actually spans two very different aspects:<span id="more-55"></span></p>
<ul>
<li>a Sphinx extension to extract/incorporate translatable strings</li>
<li>an interface to maintain translations</li>
</ul>
<p>It turns out the latter half is already partially solved by <a title="Pootle, part of the Translation Toolkit" href="http://translate.sourceforge.net/wiki/pootle/index">Pootle</a> and I can build on that instead of rolling my own <a title="World's Worst Website" href="http://www.angelfire.com/super/badwebs/">half-baked</a> ad-hoc web interface.</p>
<p>Now this changes my whole schedule (<em>sigh</em>) and I can concentrate on improving Pootle in the second term of the summer. We have a handful of priorities which will enable a pydotorg setup and wide-spread adoption of a Sphinx-Pootle conflation.</p>
<h1>Installation</h1>
<p>Pootle is <a title="pootle:installation — Translate Toolkit &amp; Pootle" href="http://translate.sourceforge.net/wiki/pootle/installation">easy_installable</a> per se but behaves a bit strangely in virtual environments due to absolute paths in its setup. There are packages <a title="Details of package pootle in Debian lenny" href="http://packages.debian.org/lenny/pootle">in Debian</a> stable but they are old as the hills and not of any particular use as we expect changes to Pootle during this summer, aight. (I do not want to downplay Debian, and Pootle 2.x is already <a title="Details of package pootle in Debian sid" href="http://packages.debian.org/sid/pootle">in the next release</a> pool.)</p>
<h1>Features</h1>
<p>The most dramatic changes are going to be about <em>Change Tracking</em>. Pootle needs to grow facilities to show differences between updates, i.e. between several versions of a message catalog. Pootle already features <a title="Descriptions of all pofilter tests" href="http://translate.sourceforge.net/wiki/toolkit/pofilter_tests">checkers</a> for a couple of metrics and should probably be extended with reStructuredText validation.</p>
<p>I have <a title="[translate-pootle] Pootle and Sphinx" href="http://sourceforge.net/mailarchive/forum.php?thread_name=1275304723.29086.10161.camel%40localhost.localdomain&amp;forum_name=translate-pootle">started discussion</a> with the Pootle developers and they welcomed my plans.</p>
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/into-the-wheel-shop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Since I&#8217;d Gone This Far, I Might As Well Turn Around</title>
		<link>http://gsoc.robertlehmann.de/since-id-gone-this-far-i-might-as-well-turn-around/</link>
		<comments>http://gsoc.robertlehmann.de/since-id-gone-this-far-i-might-as-well-turn-around/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 03:11:54 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Sphinx]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=70</guid>
		<description><![CDATA[Extracting messages from Sphinx is fairly easy. Apart from the occasional obstacle here and there when dealing with non-plain text such as directives the machinery already in place makes it a straight-forward task. But collecting messages from documents is only half the battle in implementing Native Language Support for Sphinx — they also need to [...]]]></description>
			<content:encoded><![CDATA[<p>Extracting messages from Sphinx is fairly easy. Apart from the occasional obstacle here and there when dealing with non-plain text such as <a title="reStructuredText Directives" href="http://docutils.sourceforge.net/docs/ref/rst/directives.html">directives</a> the machinery already in place makes it a straightforward task. But collecting messages from documents is only half the battle in implementing Native Language Support for Sphinx: the translations also need to go back in again.<span id="more-70"></span> Sphinx already exposes mechanisms to <a title="Internationalization in Sphinx" href="http://bitbucket.org/lehmannro/sphinx-i18n/src/tip/doc/intl.rst">configure language</a> settings, commonly called <a title="The build configuration file — Sphinx documentation" href="http://sphinx.pocoo.org/latest/config.html#confval-locale_dirs">locale</a>. Hidden deep down in its innards there is a procedure to load gettext-style message catalogs (<a title="sphinx.locale: Locale utilities" href="http://bitbucket.org/birkenfeld/sphinx/src/tip/sphinx/locale/__init__.py"><code>sphinx/locale/__init__.py</code></a>) which I only needed to <a title="Cross-changeset (r2223:2fb72bb549a1-r2361:d5c6178cbf95) on sphinx/locale/__init__.py" href="http://bitbucket.org/lehmannro/sphinx-i18n/diff/sphinx/locale/__init__.py?diff2=d5c6178cbf95&amp;diff1=2fb72bb549a1">augment</a> for domains other than Sphinx itself.</p>
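<p>To give an idea of what loading a catalog for an arbitrary domain looks like, here is a rough sketch built on the standard library&#8217;s <code>gettext</code> module (Python 3). It is not the code in <code>sphinx/locale/__init__.py</code>; the function name <code>load_catalog</code>, the domain <code>tutorial</code> and the directory are made up for illustration.</p>

```python
import gettext

def load_catalog(domain, locale_dirs, language):
    """Return the first compiled catalog found for `domain`.

    Looks for DIR/LANGUAGE/LC_MESSAGES/DOMAIN.mo in every directory and
    falls back to a pass-through catalog if none of them has it.
    """
    for directory in locale_dirs:
        try:
            return gettext.translation(domain, localedir=directory,
                                       languages=[language])
        except OSError:
            # No .mo file for this domain here, try the next directory.
            continue
    return gettext.NullTranslations()

# Hypothetical domain and path: nothing is found, so msgids pass through.
catalog = load_catalog("tutorial", ["/nonexistent/locale"], "de")
print(catalog.gettext("Hello"))
```

<p>With a real <code>.mo</code> file in place the same <code>gettext()</code> call returns the translated text instead of the original message.</p>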
<p>With that done we have the translated message texts for each document node at our fingertips. These are still flat strings rather than nested document trees and thus not directly usable for our purposes.</p>
<p>As luck would have it Docutils readily exposes the <a title="Contents of /trunk/docutils/docutils/parsers/rst/__init__.py" href="http://svn.berlios.de/viewvc/docutils/trunk/docutils/docutils/parsers/rst/__init__.py?view=markup">reStructuredText parser</a> and we can just turn the messages into small doctrees and merge those into our existing document. This is now implemented in Sphinx.</p>
<p>There is one last remaining caveat: the translation step as it stands turns the build process into a dog slow mess. Sphinx updates the whole environment (that is, re-reads all files) whenever the <code>language</code> value changes, which is the case <em>every. single. build.</em> right now. For our current use I don&#8217;t mind too much, but this needs to be fixed when <a title="lehmannro / sphinx-i18n / source" href="http://bitbucket.org/lehmannro/sphinx-i18n/src">sphinx-i18n</a> is merged upstream. It&#8217;s likely to require more in-depth architectural changes to Sphinx&#8217;s core to make this as quick and resource-friendly as possible.</p>
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/since-id-gone-this-far-i-might-as-well-turn-around/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drumming Up</title>
		<link>http://gsoc.robertlehmann.de/drumming-up/</link>
		<comments>http://gsoc.robertlehmann.de/drumming-up/#comments</comments>
		<pubDate>Thu, 17 Jun 2010 07:24:37 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Internationalization]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=68</guid>
		<description><![CDATA[During LinuxTag 2010 in Berlin I discussed internationalization issues with a bunch of people from major Linux distributions. I hereby express my gratitude to all of you and will summarize my impressions. Any errors are very likely to be me mixing up the facts and I would be pleased to learn better from you! Gentoo&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>During <a href="http://www.linuxtag.org/2010/">LinuxTag 2010</a> in Berlin I discussed internationalization issues with a bunch of people from major Linux distributions. I hereby express my gratitude to all of you and will summarize my impressions. Any errors are very likely to be me mixing up the facts and I would be pleased to learn better from you!</p>
<p><strong><span id="more-68"></span>Gentoo</strong>&#8217;s tools remain mostly untranslated. There is no coordinated i18n effort and, if at all, they use plain PO files with a strong dependency on shared translation memory.</p>
<p>For software, <strong>FreeBSD</strong> follows the same strategy. They have a German <a title="FreeBSD German Documentation Project" href="https://doc.bsdgroup.de/">manual</a> in DocBook format which can be translated and pushed to version control. To maintain a uniform language style translators need to work on whole chapters. Issues are resolved through their mailing list which will supply SGML markup to plain text translations or pick up half-finished ones. Updates to the reference documentation are tracked through revision numbers.</p>
<p>The <strong>Sidux</strong> <a title="sidux Manuals" href="http://manual.sidux.com/">manual</a> follows a similar workflow: there is one person writing the reference documentation in English and one maintainer in charge per translation. Changes propagate by word of mouth.</p>
<p><strong>Debian</strong> has the excellent <a title="The Debian Description Translation Project" href="http://www.debian.org/international/l10n/ddtp">Debian Description Translation Project</a> (DDTP) which handles translations via e-mail and has a built-in notion of review. Their home-brewed web interface <a title="Debian Distributed Translation Server Satellite" href="http://ddtp.debian.net/ddtss/index.cgi/">Debian Distributed Translation Server Satellite</a> (DDTSS) acts as a gateway to their mail server.</p>
<p>Localization is a <a title="Why is This Important? — Development/Tutorials/Localization/i18n — KDE TechBase" href="http://techbase.kde.org/Development/Tutorials/Localization/i18n#Why_is_This_Important.3F">core value</a> in the <strong>KDE</strong> culture. They have <a title="KDE Localization" href="http://i18n.kde.org/">a lot of documentation</a> on internationalization issues and try to remove friction where possible, eg. by encouraging <a title="Development/Tutorials/Localization/i18n Semantics" href="http://techbase.kde.org/Development/Tutorials/Localization/i18n_Semantics">semantic markup and context</a> in translatable strings.<br />
Approaching releases there are <a title="Message Freezes — Development/Tutorials/Localization/i18n Challenges" href="http://techbase.kde.org/Development/Tutorials/Localization/i18n_Challenges#Message_Freezes">time spans</a> where translatable strings ought not be changed by developers. This guarantees there are translations available on release date.</p>
<p><strong>OpenSUSE</strong> has its translations for YaST and other system components <a title="SVN structure — openSUSE Localization Guide" href="http://en.opensuse.org/OpenSUSE_Localization_Guide#SVN_Structure">stored</a> in a central <a title="WebSVN — opensuse-i18n" href="http://svn.berlios.de/wsvn/opensuse-i18n/trunk/lcn/en_GB/po/#_trunk_lcn_en_GB_po_">SVN repository</a>. PO files are <a title="OpenSUSE Localization Work with PO Files" href="http://en.opensuse.org/OpenSUSE_Localization_Work_with_PO_Files">modified</a> locally and aided by a shared memory. <code>msgfmt</code> statistics are adequately <a title="openSUSE Localization Statistics" href="http://i18n.opensuse.org/stats/index.php">visualized</a>.</p>
<p>I also had the chance to talk to <a title="Henning Eggers in Launchpad" href="https://launchpad.net/~henninge">Henning Eggers</a> of <strong>Canonical</strong> at the <a title="LinuxTag 2010 Ubuntu Berlin Barbeque" href="http://www.ubuntu-berlin.de/LinuxTag10-BBQ-english">Ubuntu BBQ</a> who is part of the <a title="Software translations" href="https://translations.launchpad.net/">Launchpad Translations</a> team. I learned that they are actually a pretty shallow layer above gettext message catalogs and settled for providing utilities such as shared translation memory around the actual process. Fuzzy translations are discarded because <em>wrong</em> translations are absolutely fatal to their use case. Henning mentioned they would be interested in change tracking (for instance by means of <a href="http://en.wikipedia.org/wiki/Edit_distance">edit distance</a> between two <code>msgid</code>s) but do not have any such mechanisms in place currently.</p>
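<p>To make the edit distance idea concrete, here is a small sketch of the classic Levenshtein distance applied to two <code>msgid</code>s. The function and the sample strings are my own illustration, not anything Launchpad ships.</p>

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]

old_msgid = "Open the file."
new_msgid = "Open the files."
# A small distance relative to the message length hints at a minor update.
print(edit_distance(old_msgid, new_msgid))
```

<p>A tracker could flag translations whose source message changed by only a few characters for a quick review, and treat larger distances as new messages.</p>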
<p>I heard users absolutely want translation services to Just Work™ when they invest time in localization. Errors caused by maintainers&#8217; mistakes (such as duplicate <code>msgid</code>s) which inhibit their productivity usually kill motivation. Better diffs and change tracking seem to be a key feature people are looking for.</p>
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/drumming-up/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fine or Coarse?</title>
		<link>http://gsoc.robertlehmann.de/fine-or-coarse/</link>
		<comments>http://gsoc.robertlehmann.de/fine-or-coarse/#comments</comments>
		<pubDate>Tue, 08 Jun 2010 05:01:55 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Sphinx]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=65</guid>
		<description><![CDATA[I touched on message granularity in my proposal already and nailed down a pragmatic policy in my prototype: messages are basically split on a per-paragraph level. Inline markup is explicitly atomic and never propagated to a sole message. Docutils makes this as easy as shooting fish in a barrel — it provides nodes marked up [...]]]></description>
			<content:encoded><![CDATA[<p>I touched on <a title="gsoc-sphinx-i18n: Granularity" href="http://docs.google.com/View?id=df6p74cq_7gkvgmhgx#Granularity_and_Tracking_74959_5145674120425918">message granularity</a> in my proposal already and nailed down a pragmatic policy in my prototype: messages are basically split on a per-paragraph level. Inline markup is explicitly atomic: it stays inside its surrounding message and is never split out into a message of its own.</p>
<p><span id="more-65"></span>Docutils makes this as easy as shooting fish in a barrel — it provides nodes marked up as <em>“containing text immediately”</em> (<code>nodes.TextElement</code>) which just need to be serialized to messages.</p>
<p>Other projects such as <a title="Real World Haskell" href="http://book.realworldhaskell.org/read/">Real World Haskell</a> use the same policy for their comment spans. Of course paragraphs are the most trivial message. List items get a message each, as do admonitions.</p>
<p>The Translate Toolkit supplies <a title="toolkit:posegment · Translate Toolkit &amp; Pootle" href="http://translate.sourceforge.net/wiki/toolkit/posegment">a tool</a> called <code>posegment</code> to split messages containing multiple sentences into smaller chunks but it has the usual pitfalls like decomposing <em>“Dr. Jekyll”</em> into two messages.</p>
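<p>The pitfall is easy to reproduce with a deliberately naive splitter. This is a sketch of the general problem and not <code>posegment</code>&#8217;s actual algorithm; the function name is made up.</p>

```python
import re

def naive_segments(message):
    """Split a message after every '.', '!' or '?'.

    Abbreviations are not handled, and trailing text without closing
    punctuation is dropped, so this is only good for demonstration.
    """
    return [part.strip() for part in re.findall(r"[^.!?]+[.!?]", message)]

# "Dr." ends up as a segment of its own.
print(naive_segments("Dr. Jekyll opened the door. Nobody was there."))
```

<p>Splitting per paragraph sidesteps this class of errors entirely, at the price of larger messages.</p>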
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/fine-or-coarse/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>It&#8217;s alive, it&#8217;s moving, IT&#8217;S ALIVE!</title>
		<link>http://gsoc.robertlehmann.de/its-alive-its-moving/</link>
		<comments>http://gsoc.robertlehmann.de/its-alive-its-moving/#comments</comments>
		<pubDate>Tue, 08 Jun 2010 04:29:00 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Sphinx]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=58</guid>
		<description><![CDATA[I have pushed an early prototype of a PO builder — boldly called MessageCatalogBuilder — to Sphinx. I previously announced this to be a Sphinx extension but changed my mind and incorporated it into Sphinx&#8217; core because patching translation sets into doctrees is likely a tightly integrated task. It extends the build mechanism by a [...]]]></description>
			<content:encoded><![CDATA[<p>I have pushed an early prototype of a PO builder (boldly called <code>MessageCatalogBuilder</code>) to <a title="lehmannro / sphinx-i18n / changesets" href="http://bitbucket.org/lehmannro/sphinx-i18n/changesets">Sphinx</a>. I previously announced this to be a Sphinx extension but changed my mind and incorporated it into Sphinx&#8217;s core because patching translation sets into doctrees is likely a tightly integrated task. It extends the build mechanism with a new <code>gettext</code> target and extracts raw messages into a collection of <code>.pot</code> files.</p>
<p><span id="more-58"></span>“Raw messages?,” you say. That&#8217;s basically synonymous with <em>woo, a lot of output… which is entirely useless</em>. These messages contain no markup at all and thus are handy to estimate a message catalog&#8217;s contents and size but have no practical application for documents — except if you want to lose all inline markup, that is.</p>
<p>As the doctree is readily available during serialization to message catalogs I have looked into <a title="A reStructedText *writer*? — docutils-user" href="http://article.gmane.org/gmane.text.docutils.user/5657">writers producing reStructuredText</a> themselves. I am pretty settled on reusing docutils&#8217; <a title="Docutils Hacker's Guide" href="http://docutils.sourceforge.net/docs/dev/hacking.html#parsing-the-document">reStructuredText parser</a>, or at least its inliner, to rematerialize translations. There are a few subtle intricacies to inline markup which need to be resolved later on: some inline markup should be usable at the translator&#8217;s discretion (eg. strong, emphasis), while other markup should not need to be retyped by every translator (eg. reference URIs, role targets) but could come in handy in some places (eg. references to multilingual pages).</p>
<p>I briefly stumbled upon smaller style issues in the message catalogs while trying to rewrap <code>msgid</code>s to 80 characters per line. Unfortunately (Daniel and Georg <a title="#pocoo logs from 4th of June 2010" href="http://dev.pocoo.org/irclogs/%23pocoo.2010-06-04.log">don&#8217;t seem too happy</a> with this situation either) we are stuck with Python 2.4+ and thus do not have <a title="textwrap — Text wrapping and filling — Python documentation" href="http://docs.python.org/library/textwrap">textwrap</a>&#8217;s <a title="Issue 1581073: Allow textwrap to preserve leading and trailing whitespace - Python tracker" href="http://bugs.python.org/issue1581073">newest features</a> at our disposal. I consider this a non-issue for now, though.</p>
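<p>For reference, this is roughly what the rewrapping could look like on a newer Python. It is a sketch, not the builder&#8217;s code: it relies on <code>textwrap</code>&#8217;s <code>drop_whitespace</code> option, which is not available on Python 2.4, and the helper name <code>format_msgid</code> is made up. It assumes the message is a single paragraph without embedded newlines.</p>

```python
import textwrap

def format_msgid(text, width=80):
    """Render `text` as a PO msgid entry wrapped to `width` columns."""
    # Escape backslashes first, then double quotes.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    # Short messages fit on the msgid line itself.
    if width >= len(escaped) + len('msgid ""'):
        return 'msgid "%s"' % escaped
    # Two columns per line are taken up by the surrounding quotes.
    # Keeping whitespace means the joined chunks reproduce the message;
    # words are never broken, so escape sequences stay intact.
    chunks = textwrap.wrap(escaped, width=width - 2,
                           drop_whitespace=False, break_long_words=False)
    return "\n".join(['msgid ""'] + ['"%s"' % chunk for chunk in chunks])

print(format_msgid("Hello"))
print(format_msgid(" ".join(["word"] * 30)))
```

<p>Without <code>drop_whitespace=False</code> the spaces at the line breaks would be lost and the concatenated <code>msgid</code> would no longer match the original message.</p>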
<p>Georg supplied initial code review and instructed me to write tests and docs. He maintains a <a title="birkenfeld / sphinx-i18n / overview" href="http://bitbucket.org/birkenfeld/sphinx-i18n">fork of sphinx-i18n</a> for hotfixes and merging.</p>
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/its-alive-its-moving/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Casting South</title>
		<link>http://gsoc.robertlehmann.de/casting-south/</link>
		<comments>http://gsoc.robertlehmann.de/casting-south/#comments</comments>
		<pubDate>Tue, 11 May 2010 13:36:17 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Internationalization]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=38</guid>
		<description><![CDATA[I have set sail for the Community Bonding Period and am veering away from the Sphinx codebase to more research-related realms. I abandoned the XLIFF format, as there really is no point in duplicate representation, and will focus my efforts on PO files. I am going to dive into the gettext family of translation [...]]]></description>
			<content:encoded><![CDATA[<p>I have set sail for the Community Bonding Period and am veering away from the Sphinx codebase to more research-related realms.</p>
<p><span id="more-38"></span>I abandoned the <a title="XLIFF 1.2 Specification" href="http://docs.oasis-open.org/xliff/xliff-core/xliff-core.html">XLIFF format</a>, as there really is no point in duplicate representation, and will focus my efforts on PO files. I am going to dive into the gettext family of translation tool suites to get an impression not only of the <a title="GNU `gettext' utilites: The Format of PO Files" href="http://www.gnu.org/software/hello/manual/gettext/PO-Files.html">PO File Format Specification</a> but of the whole normative landscape of solutions.</p>
<p>The new tool will require user feedback, user feedback, and user feedback and thus I require the community to help in building the best possible workflow for <em>them</em>. I will reach out to some projects in the future but appreciate contributions from anybody. (<a title="Das Python3.1-Tutorial auf Deutsch" href="http://bitbucket.org/cofi/py-tutorial-de/wiki/Home">Py-Tutorial-de</a>, I&#8217;m looking at you! <em>hint hint</em>)</p>
<p>I discussed the change tracking mechanism briefly with Martin and we concluded that walking the version history is a complex task which might require more time and skill than I have at hand. Thus it is desirable to save revision information <em>explicitly</em> in message catalogs.</p>
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/casting-south/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Meet Your Mentors</title>
		<link>http://gsoc.robertlehmann.de/meet-your-mentors/</link>
		<comments>http://gsoc.robertlehmann.de/meet-your-mentors/#comments</comments>
		<pubDate>Mon, 03 May 2010 10:30:34 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Meta]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=23</guid>
		<description><![CDATA[This Saturday morning I had a meeting with my three(!) mentors, namely Jannis Leidel, Martin von Löwis and Georg Brandl on IRC (with Daniel Neuhäuser and Armin Ronacher chiming in occasionally). I have been asked in advance why I chose gettext .PO files to store translations and we briefly discussed that issue. gettext was written [...]]]></description>
			<content:encoded><![CDATA[<p>This Saturday morning I had a meeting with my three(!) mentors, namely <a href="http://jannisleidel.com/">Jannis Leidel</a>, <a href="http://loewis.de/martin/">Martin von Löwis</a> and <a href="http://pythonic.pocoo.org/">Georg Brandl</a> on IRC (with Daniel Neuhäuser and Armin Ronacher chiming in occasionally).</p>
<p><span id="more-23"></span>I have been asked in advance why I chose gettext .PO files to store translations and we briefly discussed that issue. gettext was written for translating <em>messages</em> and not necessarily whole paragraphs of free text. Its key tool for updating translation sets, <code>msgmerge</code>, uses fuzzy matching with the intention of producing better results but this tends to fail. There is no notion of versioning, so VCSes (or patch queues) need to be integrated into the workflow to retain history. Inline markup is still a whole different can of worms (which I need to meditate on). The two selling points for .PO files are their suitability as a key-value store and that there <em>are</em> established tools after all. The other shortcomings have to be overcome by the new tool, which will monitor version control and display stale documentation segments (and export message catalogs along the way).</p>
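<p>To illustrate what fuzzy matching means here, the following sketch looks up old translations for a changed message with the standard library&#8217;s <code>difflib</code>. This is a stand-in for the idea only: <code>msgmerge</code> uses its own matching algorithm, and the function name, the cutoff and the sample catalog are assumptions of mine.</p>

```python
import difflib

def fuzzy_candidates(new_msgid, old_catalog, cutoff=0.6):
    """Return (old msgid, translation) pairs resembling `new_msgid`.

    The best match comes first; each result would still need a human
    to confirm that the old translation fits the new message.
    """
    matches = difflib.get_close_matches(new_msgid, list(old_catalog),
                                        n=3, cutoff=cutoff)
    return [(msgid, old_catalog[msgid]) for msgid in matches]

# Made-up catalog mapping msgids to German translations.
old_catalog = {
    "Save the file.": "Speichere die Datei.",
    "Print the page.": "Drucke die Seite.",
}
print(fuzzy_candidates("Save the files.", old_catalog))
```

<p>For one-line messages this works reasonably well. For whole paragraphs a handful of changed words can either fall below the cutoff or attach an outdated translation to a paragraph that now says something different.</p>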
<p>We had a look at <a title="publican" href="http://jfearn.fedorapeople.org/Publican/">Publican</a>, a publishing system based on DocBook XML. For translations they use .PO files, too &#8212; it&#8217;s turtles all the way down.</p>
<p>It quickly became apparent that the tool for maintaining document translations will be separate from Sphinx&#8217;s core. There are already projects heading in the same direction, eg. <a title="Bring Your Content To The Rest Of The World." href="http://www.transifex.net/">Transifex</a> / <a title="Transifex development portal" href="http://trac.transifex.org/">Txo</a> or <a title="Software translations on Launchpad.net" href="https://translations.launchpad.net/">Launchpad Translations</a> (which both neglect change tracking and are built for UI translations).</p>
<p>As there has to be at least one public instance on python.org we need to authenticate users. Invitations gained quite a bit of traction and give individual translation communities (groups) the possibility to self-organize. Integration with Roundup (Python&#8217;s bug tracker) would be nice-to-have but is not too important for now.</p>
<p>If you want to participate in our IRC meetings, we&#8217;ll be in #pocoo Wednesday evening.</p>
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/meet-your-mentors/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Warming up…</title>
		<link>http://gsoc.robertlehmann.de/warming-up/</link>
		<comments>http://gsoc.robertlehmann.de/warming-up/#comments</comments>
		<pubDate>Wed, 28 Apr 2010 17:36:01 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Sphinx]]></category>

		<guid isPermaLink="false">http://gsoc.robertlehmann.de/?p=13</guid>
		<description><![CDATA[So for starters I wrote a sidebar extension inspired by Python Sidebar (of Edgewall credit). I initially planned to file a simple patch but it grew into a full-fledged extension so the commit history is pretty meaningless now. Behind the scenes it still took about five iterations to get this Done Right. I started off [...]]]></description>
			<content:encoded><![CDATA[<p>So for starters I wrote a <a title="Mozilla panels for Sphinx documentation generator inspired by Python Sidebar" href="http://bitbucket.org/lehmannro/sphinx-sidebar">sidebar extension</a> inspired by <a title="A Python Sidebar for Mozilla" href="http://www.edgewall.org/python-sidebar/">Python Sidebar</a> (of Edgewall credit).</p>
<p><span id="more-13"></span>I initially planned to file a simple patch but it grew into a full-fledged extension so the commit history is pretty meaningless now. Behind the scenes it still took about five iterations to get this Done Right.</p>
<p>I started off by writing a new builder from scratch, which was a nice exercise for the i18n builder I am going to write but overkill for this simple extension. It was a pretty braindead implementation and used close to zero of Sphinx&#8217;s Builder API. I briefly looked into inheriting from StandaloneHTMLBuilder (the HTML document builder), which was a little obtuse and clumsy for creating just a handful of generic documents. While discovering the intricacies of the build process I learned how to hook into the right places and use existing build features (such as status iterators, which render a handy progress bar during the build).</p>
<p>It then hit me that the extension mechanism might well be a better entry point for my plugin, and so it was. An initial version created all documents in isolation from the original build process, which was okay but would&#8217;ve required a time machine for integrating the sidebar extension into existing pages, which struck me as an odd and overly heavy dependency. I quickly split up the (already two-pass) build process into a collection and a build phase and could easily insert links to the sidebar panels into original documents.</p>
<h1>Lessons Learned</h1>
<ul>
<li>Pocoo&#8217;s already using Mercurial. Developing features, as minor as they may seem, should happen in a fork, not a checkout-patch (which I&#8217;m still used to from Python SVN).</li>
<li>Sphinx&#8217;s documentation needs an update in the &#8220;Writing a new builder&#8221; chapter. I will perhaps get around to that while writing/documenting my next builder.</li>
<li>Review is awesome and does not need an extensive toolchain. Georg Brandl briefly discussed the extension (by loading and trying it himself); his comments allowed me to fill in gaps in the documentation.</li>
<li>Docutils perhaps has a few bugs in trunk that interfered during debugging. (See <a title="Docutils Artifact ID 2993967: Rendering DOM nodes for tuple attributes" href="https://sourceforge.net/tracker/?func=detail&amp;aid=2993967&amp;group_id=38414&amp;atid=422032">#2993967</a> and <a title="Docutils Revision 6293: repr(Text) failed with long string" href="http://svn.berlios.de/viewcvs/docutils?view=rev&amp;revision=6293">r6293</a>/<a title="Docutils Artifact ID 2975987: No test case for Text.shortrepr with long string." href="https://sourceforge.net/tracker/index.php?func=detail&amp;aid=2975987&amp;group_id=38414&amp;atid=422030">#2975987</a>).</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://gsoc.robertlehmann.de/warming-up/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

