I already touched on message granularity in my proposal and settled on a pragmatic policy in my prototype: messages are split per paragraph. Inline markup is deliberately treated as atomic: it stays inside its enclosing message and is never extracted into a message of its own.
Docutils makes this as easy as shooting fish in a barrel: it provides nodes marked up as “containing text immediately” (nodes.TextElement), which just need to be serialized to messages.
Other projects such as Real World Haskell use the same policy for their comment spans. Paragraphs are, of course, the simplest kind of message. List items get a message each, as do admonitions.
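To make the policy concrete, here is a minimal sketch of collecting per-paragraph messages from a Docutils doctree. It is not the prototype's actual code: the function name extract_messages is made up for illustration, and it only covers the simple case where each list item or admonition holds a single paragraph. Inline nodes such as emphasis are themselves TextElement subclasses, so the sketch skips them to keep inline markup inside its enclosing message. It uses astext(), which flattens inline markup; a real extractor would more likely keep the original markup.

```python
from docutils import nodes
from docutils.core import publish_doctree


def extract_messages(rst_source):
    """Return one plain-text message per block-level text node.

    Hypothetical helper for illustration only.
    """
    doctree = publish_doctree(rst_source)
    # findall() exists in newer Docutils releases; fall back to the
    # older traverse() when it is missing.
    walk = getattr(doctree, "findall", None) or doctree.traverse
    messages = []
    for node in walk(nodes.TextElement):
        # Inline nodes (emphasis, strong, ...) are TextElements too;
        # skip them so they never become messages of their own.
        if isinstance(node, nodes.Inline):
            continue
        messages.append(node.astext())
    return messages


SAMPLE = """\
A paragraph with *inline* markup.

- first item
- second item

.. note:: An admonition.
"""

for message in extract_messages(SAMPLE):
    print(message)
```

With this sample, the paragraph, each list item, and the admonition each yield one message, and the emphasized word stays part of the paragraph's message.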
The Translate Toolkit supplies a tool called posegment to split messages containing multiple sentences into smaller chunks, but it has the usual pitfalls of sentence segmentation, such as decomposing “Dr. Jekyll” into two messages.
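The abbreviation problem is easy to reproduce with any splitter that treats a period followed by whitespace as a sentence boundary. The following sketch is not posegment's implementation; naive_segment is a hypothetical function that only illustrates this class of failure.

```python
import re


def naive_segment(message):
    """Split after '.', '!' or '?' when followed by whitespace.

    Hypothetical illustration; it knows nothing about abbreviations.
    """
    return [part for part in re.split(r"(?<=[.!?])\s+", message) if part]


print(naive_segment("Dr. Jekyll met Mr. Hyde. They talked."))
# → ['Dr.', 'Jekyll met Mr.', 'Hyde.', 'They talked.']
```

Splitting per paragraph sidesteps this entirely, at the cost of larger messages.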