TranslationRestructuring

How to get the translation files in order so we can encourage more translations.

Created by: sikko, Last modification: 30 Sep 2007 (21:35 UTC) by WaterDragon

Translation Restructuring


I was looking at the translation of the bitweaver project when I noticed that a lot of strange strings are included in the master file: numerous instances of a message with variables, strings like "X's shoutbox", and even strings in other languages. This makes the project harder to translate, and a translator has no way of knowing whether everything has been translated.

There was a discussion about this on IRC, where I found out that:
  • The strings for each language are extracted from the bitweaver.org site, and the strings to be translated include both the strings for the project and those for the bitweaver.org site (bw.o) itself.
  • For a while, strings were picked up by accident. They aren't needed but are still in the current master strings, and some are being shipped in the R2B2 masters.php file.
  • If strings are removed but are still used by the system, they will be re-added to the masters.php file when they are next used. (To me this means that the right strings are in there for all the "used" parts of bw.o, but strings that are never called for will not be added.)
  • xing removed about 2350 of the probably unused strings, and I manually removed about 1800. The Shoutbox strings and translated versions keep growing back though: in 0-9 a few of them are back again after yesterday's removal.
  • Getting bw to spit out a complete master file with all the -tra- and -tr- strings could be difficult.
  • If you remove one of the master strings, the associated translations are lost. There is no way of checking whether this string has already been translated, so we could keep those translations for an eventual merge.
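
To illustrate the last point, here is a minimal sketch of how translations could be set aside rather than lost when their master string is removed. It is not how bw handles this today. It assumes, purely for illustration, that the master strings are available as a list and that a language's translations are a mapping from master string to translated string; the function name merge_translations is hypothetical.

```python
def merge_translations(master_strings, old_translations):
    """Split existing translations into those whose master string still
    exists and those whose master string has been removed.

    Hypothetical helper: the list/dict layout is an assumption for this
    sketch, not the actual format of masters.php or the language files.
    """
    master = set(master_strings)
    kept = {s: t for s, t in old_translations.items() if s in master}
    orphaned = {s: t for s, t in old_translations.items() if s not in master}
    return kept, orphaned


# Example data (made up): "Shoutbox" is no longer a master string.
masters = ["Save", "Cancel", "Preview"]
dutch = {"Save": "Opslaan", "Cancel": "Annuleren", "Shoutbox": "Shoutbox"}

kept, orphaned = merge_translations(masters, dutch)
```

Here kept would go into the new language file, while orphaned could be stored separately in case the master string comes back later.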

Proposed Solution:


I think the best way to resolve this and provide better support to translators would be:
  1. Identify translations in templates and PHP files that should be cleaned up
  2. Clean up the templates and PHP files so there is a minimum number of translations to be made
  3. Create a new masters.php file from the bw program by checking all occurrences of -tr- and -tra- (how to do this? What about arguments to Smarty functions that auto-translate?)
  4. Check all translations for strings that match the ones that are used by bw
  5. Merge the matched translations into the language files created from the new masters
  6. Run a spellcheck on the English masters
  7. Diff the spellchecked version of the masters against the non-spellchecked version
  8. Fix the spelling errors in the original .php and .tpl files
  9. Make stats for each language available
  10. Inform the translators of the easy way to translate
  11. Try to get as many languages as possible up to 100%
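
Steps 3 and 9 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual bw extraction code: it assumes the markers are Smarty {tr}...{/tr} blocks in templates and tra('...') calls with a literal string argument in PHP files, and the function names (extract_strings, build_master, completion) are hypothetical. It does not cover arguments to Smarty functions that auto-translate, which is the open question in step 3.

```python
import re

# Assumed marker syntax (not confirmed by this page):
#   templates: {tr}Some text{/tr}
#   PHP files: tra('Some text') or tra("Some text")
TPL_TR = re.compile(r"\{tr\}(.*?)\{/tr\}", re.DOTALL)
PHP_TRA = re.compile(r"""\btra\(\s*(['"])((?:\\.|(?!\1).)*)\1""", re.DOTALL)


def extract_strings(source, is_template):
    """Return the translatable strings found in one file's source text.

    Escape sequences inside PHP string literals are left as written.
    """
    if is_template:
        return [text.strip() for text in TPL_TR.findall(source)]
    return [content for _quote, content in PHP_TRA.findall(source)]


def build_master(files):
    """files maps a filename to its source text; returns sorted unique strings."""
    found = set()
    for name, text in files.items():
        found.update(extract_strings(text, name.endswith(".tpl")))
    return sorted(found)


def completion(master, translations):
    """Percentage of master strings that have an entry in translations."""
    if not master:
        return 0.0
    done = sum(1 for s in master if s in translations)
    return 100.0 * done / len(master)


# Example input (made up for illustration).
files = {
    "a.tpl": "<h1>{tr}Shoutbox{/tr}</h1> {tr}Save{/tr}",
    "b.php": "echo tra('Save'); echo tra(\"Cancel\");",
}
master = build_master(files)
percent = completion(master, {"Save": "Opslaan"})
```

Building the master from the source files rather than from strings seen at runtime means strings that are never called for on bw.o would still be included.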

After discussion on IRC, we came to the conclusion that this wasn't feasible before the R2 release, so this would be something for R2.1. I'm looking into the strings at the moment and have removed most of the German/Dutch/French/Russian/etc. translations from the masters, but the Shoutbox/Megaphone/Arabic strings keep coming back.

There is a script attached to this page, collect-tr.php, to help identify PHP files and templates in need of cleanup. It pulls the strings marked for translation out of templates and PHP files, and can be run against the codebase with the following:

find . \( -name "*.tpl" -o -name "*.php" \) -print0 | xargs -0 php -n -q collect-tr.php

The generated list should be combed to find the places where we need some cleanup.
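
As a starting point for combing the list, a small filter could flag the kinds of strings mentioned above. This is a rough sketch only: it assumes the generated list has been loaded as one string per entry (the actual output format of collect-tr.php is not described here), and the function name flag_suspicious and both heuristics are illustrative assumptions. Foreign-language strings that use only ASCII characters will not be caught and still need a manual check.

```python
import re

# Heuristic for user-generated titles such as "X's shoutbox":
# a single leading word followed by 's and whitespace.
POSSESSIVE_TITLE = re.compile(r"^\S+'s\s")


def flag_suspicious(strings):
    """Return (string, reasons) pairs for entries that look like leaks."""
    flagged = []
    for s in strings:
        reasons = []
        if any(ord(ch) > 127 for ch in s):
            reasons.append("non-ASCII")
        if POSSESSIVE_TITLE.match(s):
            reasons.append("possessive title")
        if reasons:
            flagged.append((s, reasons))
    return flagged


# Example input (made up, apart from the "X's shoutbox" pattern).
sample = ["Save", "X's shoutbox", "Fran\u00e7ais", "Opslaan"]
flagged = flag_suspicious(sample)
```

The result is only a list of candidates for review; legitimate English strings can match these heuristics too.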


Comments

by dspt, 23 Oct 2007 (09:22 UTC)
another leak is a shoutbox title:
09xIwZdDnc's صندوق المحادثة