I have pushed an early prototype of a PO builder, boldly called MessageCatalogBuilder, to Sphinx. I previously announced this as a Sphinx extension but changed my mind and incorporated it into Sphinx's core, because patching translation sets into doctrees is likely a tightly integrated task. It extends the build mechanism with a new gettext target and extracts raw messages into a collection of
“Raw messages?” you say. That’s basically synonymous with woo, a lot of output… which is entirely useless. These messages contain no markup at all, so they are handy for estimating a message catalog’s contents and size but have no practical application for documents, unless you want to lose all inline markup, that is.
As the doctree is readily available during serialization to message catalogs, I have looked into writers that produce reStructuredText themselves. I am pretty settled on reusing docutils’ reStructuredText parser, or at least its inliner, to rematerialize translations. There are a few subtle intricacies to inline markup which need to be resolved later on: some inline markup should be usable at the translator’s discretion (e.g. strong, emphasis); other markup should not need to be retyped by every translator (e.g. reference URIs, role targets) but could come in handy in some places (e.g. references to multilingual pages).
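To make the difference between raw messages and markup-preserving messages concrete, here is a minimal sketch of what such a writer might do. It is not the prototype's code: the tuples below are a hypothetical stand-in for docutils' inline node classes, the names `PARAGRAPH`, `to_raw` and `to_rst` are made up for illustration, and escaping of special characters inside the text is ignored.

```python
# Hypothetical stand-in for a paragraph's inline nodes: (kind, text, target).
# Real doctrees use docutils node classes; plain tuples keep the sketch small.
PARAGRAPH = [
    ("text", "See ", None),
    ("strong", "this", None),
    ("text", " and ", None),
    ("reference", "the docs", "http://example.org/"),
    ("text", ".", None),
]


def to_raw(nodes):
    """Concatenate the text only, which is what a raw message amounts to."""
    return "".join(text for _kind, text, _target in nodes)


def to_rst(nodes):
    """Serialize the nodes back to reStructuredText inline markup."""
    parts = []
    for kind, text, target in nodes:
        if kind == "text":
            parts.append(text)
        elif kind == "emphasis":
            parts.append("*%s*" % text)
        elif kind == "strong":
            parts.append("**%s**" % text)
        elif kind == "reference":
            # The URI ends up in the message, so every translator would
            # have to retype it: one of the open questions noted above.
            parts.append("`%s <%s>`_" % (text, target))
        else:
            raise ValueError("unsupported inline node: %r" % (kind,))
    return "".join(parts)


print(to_raw(PARAGRAPH))
print(to_rst(PARAGRAPH))
```

The first line printed is the markup-free raw message; the second keeps the strong emphasis and the reference, at the price of exposing the URI to the translator.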
I have briefly stumbled upon smaller style issues in the message catalogs while trying to rewrap msgids to 80 characters per line. Unfortunately we are stuck with Python 2.4+ and thus do not have textwrap’s newest features at our disposal (Daniel and Georg don’t seem too happy with this situation either). I consider this a non-issue for now, though.
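For illustration, rewrapping a msgid to 80 columns could look roughly like the sketch below. It is an assumption-laden example, not the builder's code: `escape_po` and `format_msgid` are hypothetical helpers, and the sketch relies on textwrap's `drop_whitespace` option, which was only added in Python 2.6 and so would not run unchanged on Python 2.4.

```python
import textwrap


def escape_po(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def format_msgid(message, width=80):
    """Return a msgid entry whose lines stay within `width` columns."""
    escaped = escape_po(message)
    # Short messages fit on the keyword line itself.
    if len(escaped) + len('msgid ""') <= width:
        return 'msgid "%s"' % escaped
    wrapper = textwrap.TextWrapper(
        width=width - 2,           # leave room for the surrounding quotes
        drop_whitespace=False,     # keep spaces so the chunks rejoin exactly
        replace_whitespace=False,
        expand_tabs=False,
        break_long_words=False,    # never split inside an escape sequence
    )
    # PO convention: an empty first string, then one quoted chunk per line.
    lines = ['msgid ""']
    lines.extend('"%s"' % chunk for chunk in wrapper.wrap(escaped))
    return "\n".join(lines)


print(format_msgid(" ".join(["translation"] * 20)))
```

Keeping the whitespace at the line breaks matters because gettext concatenates the quoted chunks verbatim; dropping it would silently glue words together. A single word longer than the width is left on an overlong line rather than split.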
Georg supplied initial code review and instructed me to write tests and docs. He maintains a fork of sphinx-i18n for hotfixes and merging.