Getting started with a forked project

The OpenHistoricalMap project forks many OpenStreetMap projects, but because our mission, naming, and licensing are different we need to diverge from the original projects’ translations. One such project is the tasking-manager. The OSM version already uses Transifex and is modest in scope, but I’m having a hard time setting up our fork with Transifex.

The unmodified config file is at: tasking-manager/.tx/config at staging · OpenHistoricalMap/tasking-manager · GitHub

Existing translations are at: tasking-manager/frontend/src/locales at staging · OpenHistoricalMap/tasking-manager · GitHub

I’ve tried running tx init locally; it produces a much smaller file and, frankly, I’m unclear what a resource is. I’ve done this a few times: once there were no prefilled options in the CLI for resources, but another time en.json was prefilled. Although the config has file_filter = frontend/src/locales/<lang>.json, tx push doesn’t reference those files, and manually uploading them through the dashboard results in a few errors (strange, because they come from an operational instance) and they don’t seem to be recognized in the dashboard.

Any tips or pointers to documentation about how to initialize a project that’s already in motion would be greatly appreciated. Thanks in advance.

The original project’s documentation appears to be out of date as there’s no mention of transifex in the requirements.txt file. tasking-manager/docs/developers/translations.md at develop · hotosm/tasking-manager · GitHub

I’ve come across GitHub: Installation and configuration | Transifex Help Center which seems to provide the way forward.

Hello @erictheise,

I am Antonis from the Transifex Customer Success team. I hope you’re well!

The Transifex GitHub integration guide should definitely have what you need to get your fork set up properly with Transifex.

During the initial sync the integration will retrieve any pre-existing translations from your repo but keep in mind that beyond that point, Transifex becomes the source of truth for translations and there won’t be a bidirectional sync if you update translations outside of Transifex.

Please let me know if you have any questions or need further assistance, and I will be happy to help!

Hi @Mylon, thanks for your warm welcome. I definitely felt like I was making progress through the GitHub integration guide but, after the dust cleared, I’m not so sure. All of our locales files were imported but your system doesn’t seem to understand that they are already translations of en.json. It seems to think, for example, that el.json, a file full of Greek strings, still needs to be translated into Greek.

I ended up using

filters:
  - filter_type: dir
    file_format: KEYVALUEJSON
    source_file_extension: json
    source_language: en
    source_file_dir: frontend/src/locales/
    translation_files_expression: 'frontend/src/locales/<lang>.json'

settings:
  pr_branch_name: transifex_<br_unique_id>

because attempts to specify a single source_file wouldn’t parse … any suggestions?

Wow. I’ve spent half a day trying to integrate with your system, providing links to configuration files to be clear about what I’m doing, and the community has reported me as a spammer.

Alerts at 1am telling me I’ve got ten minutes to revise my posts to prove the project’s legitimate.

Hello @erictheise,

Apologies for the inconvenience. Multiple subsequent posts may trigger the spam filter, especially for new posters. Hopefully, this doesn’t happen again, but instead of creating multiple replies, consider editing your existing response.

To answer your question, it is possible that your translation files are being imported as source files because they sit in the same directory as the source file while you’re using the dir filter type.

In this scenario, the filter will consider all files in a single directory as source files, which is likely what you’re experiencing.

As an alternative, you could consider using the dynamic filter type:

filters:
  - filter_type: dynamic
    file_format: KEYVALUEJSON
    source_file_extension: json
    source_language: en
    source_files_expression: frontend/src/locales/<file>_en.json
    translation_files_expression: 'frontend/src/locales/<file>_<lang>.json'

In the above example, <file> acts as a wildcard for any filename, and you can adjust it based on your naming scheme. You can also use <folder> as a wildcard for any folder and its subfolders in the specified path.
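To make the wildcard behaviour concrete, here is a minimal Python sketch of how an expression such as the ones above could be matched against repository paths. It is an illustration only, not Transifex's implementation: the function name expression_to_regex and the language-code pattern are assumptions, and <folder> is left out because its exact semantics are not described in this thread.

```python
import re


def expression_to_regex(expression: str) -> re.Pattern:
    """Compile a '<file>'/'<lang>' path expression into a regex (hypothetical helper)."""
    pattern = ""
    # re.split with a capturing group keeps the placeholders in the result.
    for part in re.split(r"(<file>|<lang>)", expression):
        if part == "<file>":
            # Any file-name fragment, but never across a directory boundary.
            pattern += r"(?P<file>[^/]+?)"
        elif part == "<lang>":
            # Assumed shape of a language code: 'el', 'pt_BR', 'zh_TW', ...
            pattern += r"(?P<lang>[A-Za-z]{2,3}(?:_[A-Za-z]{2,4})?)"
        else:
            pattern += re.escape(part)
    return re.compile(pattern + r"\Z")


if __name__ == "__main__":
    flat = expression_to_regex("frontend/src/locales/<lang>.json")
    suffixed = expression_to_regex("frontend/src/locales/<file>_<lang>.json")
    for path in ["frontend/src/locales/en.json", "frontend/src/locales/pt_BR.json"]:
        print(path, bool(flat.match(path)), bool(suffixed.match(path)))
```

Under these assumptions, a layout that names files en.json, el.json, and so on matches the plain <lang>.json expression but not the <file>_<lang>.json one, so the dynamic example would need adjusting before it fits a repository laid out that way.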

You should consider creating a separate directory for your source and translation files if you want to continue using the dir filter type. If you want to specify a single source file and its translations, then you can simply opt for the file filter type:

filters:
  - filter_type: file
    file_format: KEYVALUEJSON
    source_file_extension: json
    source_language: en
    source_file: 'frontend/src/locales/en.json'
    translation_files_expression: 'frontend/src/locales/<lang>.json'

You can use the test functionality when setting up the GitHub integration to see if your config correctly detects the content you want to sync or if there are any issues.

Please let me know if this helps. If you have any additional questions, I will be happy to assist.

Thanks, @Mylon, your suggestions have made a big difference. Because our fork will periodically sync with upstream I’m trying not to restructure our file layout so filter_type: file makes the most sense. FWIW, source_file_extension causes an error with that type and needed to be removed.

filters:
  - filter_type: file
    file_format: KEYVALUEJSON
    source_language: en
    source_file: 'frontend/src/locales/en.json'
    translation_files_expression: 'frontend/src/locales/<lang>.json'

settings:
  pr_branch_name: transifex_<br_unique_id>

My dashboard is now reporting percents translated but there are a few oddities. Arabic & Malayalam, for example, show 0% translated, which is correct. But Hebrew, Italian, Spanish, & others also show 0% even though they are extensively, possibly even completely, translated. How would I go about remedying this?

Hello Eric,

Could you please confirm the language codes you’ve configured in your Transifex project for Hebrew, Italian, and Spanish? Additionally, what are the corresponding translation file names for these languages in your GitHub repository, and where exactly are they located? Are they stored in the same folder as the en.json source file?

One common issue we see is a mismatch between the language code used in Transifex and the one in the filename. For example, if Spanish is set as es in Transifex but the filename in your repo is es_MX.json, the integration will not recognize or import the file.
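A quick way to spot this kind of mismatch locally is to compare the file names in the locales folder with the language codes configured in the project. The sketch below is hypothetical: compare_codes is an illustrative helper, and the list of project languages has to be typed in by hand from the Transifex dashboard.

```python
def compare_codes(filenames, project_languages, source_language="en"):
    """Compare locale file stems with the language codes configured in the project."""
    # 'es_MX.json' -> 'es_MX'; anything that is not a .json file is ignored.
    stems = {name.rsplit(".", 1)[0] for name in filenames if name.endswith(".json")}
    stems.discard(source_language)
    targets = set(project_languages) - {source_language}
    return {
        # Languages configured in the project with no file of that exact name.
        "no_matching_file": sorted(targets - stems),
        # Files whose name does not correspond to any configured language.
        "no_matching_language": sorted(stems - targets),
    }


if __name__ == "__main__":
    print(compare_codes(["en.json", "es_MX.json", "it.json"], ["es", "it"]))
```

For the es versus es_MX case described above, the helper reports es under no_matching_file and es_MX under no_matching_language.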

I’m also asking about the file path because the integration uses the expression frontend/src/locales/<lang>.json to locate translation files. If your files are stored in a different path, that could also explain the issue.

Lastly, could you check the GitHub integration wizard for any specific error messages in the syncing logs and share them with us?

Looking forward to your reply so we can help you get everything running smoothly.

Best regards,

Hi @Sandy_DLR, thanks for your help.

Our filenames match the language setting. In most cases we just use the two letter code, the exceptions being: fa_IR, nl_NL, pt_BR, & zh_TW (of those, all but pt_BR show > 0% translated).

The layout & files are visible at … well, I’m not allowed to post a link to GitHub. OpenHistoricalMap / tasking-manager / frontend/src/locales on the transifex branch.

It took some digging but I do see some failures in the log related to plurality. Hmmm. This is a fork of a project that uses Transifex and we’ve yet to make any changes so I’m surprised to see failures here when it must be working upstream. I’ll make some edits and report back.

@Sandy_DLR, I tried adding a “many” value to es.json for the problematic key mytasks.tasks.comments.number and committed it to our GitHub repo but there’s no change and I don’t know if I need to retrigger it somehow.

Hello! I believe English only supports “one” and “other” plural forms, as shown in the example below:

{
  "files": "{count, plural, one {You have {count} file.} other {You have {count} files.}}"
}

Apologies, @Mylon & @Sandy_DLR, I got pulled into other projects since my last post. But I am back and I am stuck.

To recap: the OpenHistoricalMap project is forking HotOSM’s tasking-manager. We’ve yet to make any changes in their translations, also handled through Transifex, but syncing our fork has been problematic. Many languages that are extensively translated show, in your dashboard, 0% translation: French (fr), Galician (gl), Hebrew (he), Italian (it), Portuguese (pt), Portuguese (Brazil) (pt_BR), Spanish (es).

Most if not all of these were showing failures related to plurality. Logs showed errors related to “many” but, as English does not have a “many”, I have no idea how to coax your importer to accept the translations we have, even though they are already working with the upstream HotOSM project.

Please advise. We can’t recruit translators with these extensively translated languages being marked as 0% translated.

Hey @erictheise,

Thanks for the update.

The 0% translation status you’re seeing for those languages (French, Galician, Hebrew, Italian, Portuguese variants, and Spanish) is actually Transifex correctly identifying that these translations don’t follow Unicode CLDR (Common Locale Data Repository) formatting standards. This isn’t a bug since Transifex follows the Unicode CLDR standard, which prevents importing malformed locale data that could potentially break applications.

The pluralization errors you mentioned are exactly why CLDR compliance matters. Each language has specific plural rules that must be appropriately formatted for the translation system to handle them correctly. When translations do not conform to these rules, you get an error similar to the one you encountered.

To resolve this:

  1. The existing translations need to be reformatted to match CLDR standards for each respective language
  2. Once properly formatted, they should import without issue

You can find information about all supported Unicode CLDR language plural rules here. You might need to add or remove plural forms from your translation files depending on the errors you received.

Please let me know if this helps. If you have any additional questions, I will be happy to assist.

What I don’t understand, @Mylon, is how these files can be acceptable/operational in the upstream hotosm repository but not in our fork, where we’ve yet to make a single change.

I also cannot figure out how to get your system to process one of the failed translation files after it’s failed once. I’ve tried pushing an altered es.json file to our transifex branch; no background task seems to pick up the change and neither does a manual sync. I expect to see a new failure (or perhaps a success) but I only see files that have previously passed being processed.

I could destroy and recreate the project but that doesn’t seem like the correct approach.

Hello @erictheise,

You raise a valid question regarding the acceptance criteria. Let me clarify what’s happening here and also provide a solution for the issue you mentioned.

Why Transifex Enforces CLDR Standards

While upstream repositories may have more flexible acceptance criteria for translation files, Transifex requires Unicode CLDR compliance because we serve thousands of diverse projects and need a universal standard to ensure reliable file processing, prevent data corruption, and guarantee compatibility across different platforms and tools.

CLDR compliance provides standardized pluralization rules, consistent locale formatting, future compatibility with translation tools, and seamless interoperability across different platforms, ensuring your translations work reliably everywhere, not just in one specific repository.

Resolving the Sync Issue

You’re not seeing new failures or successes for the altered .json file because it has not been fetched again since the first sync. Translation files are fetched from your repository only during the initial sync or when you add new source files. This means the files staying at 0% right now is expected.

Here’s what I recommend to address this:

  1. Fix the plurals for one translation file per problematic language first
  2. Manually upload the files through the Transifex interface to make sure everything works as expected
  3. Once you confirm it works, apply the same formatting fixes to all affected files
  4. Resync your translation files by unlinking and relinking the GitHub integration so you can trigger the initial sync process again

Let me know if you need help with the specific CLDR formatting requirements for any of the affected languages or with any Transifex features.

As I said in my opening post, and throughout this thread, @Mylon, the upstream project uses Transifex and the translations I’m trying to sync are unchanged from theirs. It appears that the upstream project is exempt from CLDR Standards. I want to be clear that the translations I’m trying to sync did not materialize out of the blue and it seems to me they should just work. But I don’t care anymore; I just want to get through this.

These are my import errors (pulled out of the tooltips by inspecting the html since there is no way to copy them):

es.json: Invalid plural types for string: mytasks.tasks.comments.number. Language supports: [‘one’, ‘many’, ‘other’], but found: [‘one’, ‘other’] instead.
gl.json: expected plurals rules ‘[‘one’, ‘other’]’ instead got ‘[‘other’]’ for resource string ‘582731090’ and language ‘gl’
he.json: Invalid plural types for string: mytasks.tasks.comments.number. Language supports: [‘one’, ‘two’, ‘other’], but found: [‘one’, ‘two’, ‘many’, ‘other’] instead.
it.json: Invalid plural types for string: mytasks.tasks.comments.number. Language supports: [‘one’, ‘many’, ‘other’], but found: [‘one’, ‘other’] instead.
pt.json: Invalid plural types for string: mytasks.tasks.comments.number. Language supports: [‘one’, ‘many’, ‘other’], but found: [‘one’, ‘other’] instead.
pt_BR.json: Invalid plural types for string: mytasks.tasks.comments.number. Language supports: [‘one’, ‘many’, ‘other’], but found: [‘one’, ‘other’] instead.
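Errors like these can be reproduced locally before attempting another import. The following Python sketch extracts the plural keywords from an ICU plural message and compares them with the categories named in the error messages above. The EXPECTED table is copied from those errors rather than from an authoritative CLDR source, and the function names are illustrative assumptions.

```python
import re

# Categories as reported by the import errors quoted above (not a full CLDR table).
EXPECTED = {
    "es": {"one", "many", "other"},
    "it": {"one", "many", "other"},
    "pt": {"one", "many", "other"},
    "pt_BR": {"one", "many", "other"},
    "he": {"one", "two", "other"},
    "gl": {"one", "other"},
}

PLURAL_HEAD = re.compile(r"\{\s*\w+\s*,\s*plural\s*,")


def plural_categories(message):
    """Return the plural keywords of the first ICU plural block, or None if there is none."""
    head = PLURAL_HEAD.search(message)
    if head is None:
        return None
    categories, token, depth = set(), "", 0
    for ch in message[head.end():]:
        if ch == "{":
            keyword = token.strip()
            # Exact-value selectors such as '=0' are not plural categories.
            if depth == 0 and keyword and not keyword.startswith("="):
                categories.add(keyword)
            token = ""
            depth += 1
        elif ch == "}":
            if depth == 0:
                break  # closing brace of the plural block itself
            depth -= 1
            token = ""
        elif depth == 0:
            token += ch
    return categories


def plural_problems(strings, lang):
    """Map each key whose plural keywords differ from EXPECTED[lang] to what was found."""
    expected = EXPECTED[lang]
    problems = {}
    for key, message in strings.items():
        found = plural_categories(message)
        if found is not None and found != expected:
            problems[key] = sorted(found)
    return problems
```

Calling plural_problems on the parsed contents of a locale file (for example the dictionary returned by json.load for es.json) with the matching language code lists the keys an importer with these expectations would be likely to reject.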

It seems that for he.json I need to remove many {# הערות}.

es.json, it.json, pt.json, and pt_BR.json have the opposite problem; they lack a key/value pair for many. es.json, for example, looks like this:

"mytasks.tasks.comments.number": "{number, plural, one {# comentario} other {# comentarios}}"

What do you recommend I do for these four locales? Will it sync if I set those to an empty string? Should I replicate the value of other for many and hope real translators come along and fix them once the files sync? Seems I either have to discard useful translations or introduce faulty ones in order to get these to sync.
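For reference, the "replicate other for many" option could be scripted roughly as follows. This is only a sketch of that one option, not a recommendation confirmed anywhere in this thread: it copies the other form into any missing category and drops categories the language is not expected to have, so the copied forms are placeholders that a translator would still need to review. The function names are hypothetical, and only messages that consist of a single plural block are handled.

```python
import re

PLURAL_HEAD = re.compile(r"\{\s*(\w+)\s*,\s*plural\s*,")


def split_plural(message):
    """Split a message that is one ICU plural block into (variable, [(keyword, body), ...])."""
    head = PLURAL_HEAD.match(message)
    if head is None:
        return None
    rest = message[head.end():]
    forms, keyword, body, depth = [], "", "", 0
    for i, ch in enumerate(rest):
        if ch == "{":
            depth += 1
            if depth == 1:
                continue  # opening brace of a form body
        elif ch == "}":
            if depth == 0:
                # End of the plural block; refuse messages with trailing text.
                return None if rest[i + 1:].strip() else (head.group(1), forms)
            depth -= 1
            if depth == 0:
                forms.append((keyword.strip(), body))
                keyword, body = "", ""
                continue
        if depth == 0:
            keyword += ch
        else:
            body += ch
    return None  # unbalanced braces


def conform(message, expected):
    """Rebuild the message with exactly the `expected` keywords, in that order."""
    parsed = split_plural(message)
    if parsed is None:
        return message  # not a simple plural message: leave untouched
    variable, forms = parsed
    bodies = dict(forms)
    if "other" not in bodies:
        return message  # nothing sensible to copy from
    # Missing categories get a placeholder copy of the 'other' form.
    rebuilt = " ".join(f"{k} {{{bodies.get(k, bodies['other'])}}}" for k in expected)
    return "{" + variable + ", plural, " + rebuilt + "}"
```

With expected set to ("one", "many", "other"), the Spanish string above gains a many form identical to its other form; with ("one", "two", "other"), a Hebrew string loses its many form.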

There are some translations in gl.json but no pluralization rules of the form {number, plural, one {# comentario} other {# comentarios}} and the error message is different. How do you recommend that I chase down “resource string ‘582731090’”?
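One way to narrow the search, assuming the gl.json error comes from a string that is plural in en.json but translated as plain text, is to list the keys where only the source uses a plural block. That assumption is a guess, not something confirmed in this thread, and the helper below is a hypothetical sketch; it does not resolve the numeric resource string id.

```python
import re

PLURAL = re.compile(r"\{\s*\w+\s*,\s*plural\s*,")


def plural_only_in_source(source, translation):
    """Return keys that are plural in the source but non-empty plain text in the translation."""
    return sorted(
        key
        for key, message in source.items()
        if PLURAL.search(message)
        and translation.get(key)
        and not PLURAL.search(translation[key])
    )


if __name__ == "__main__":
    en = {"a": "{number, plural, one {# comment} other {# comments}}", "b": "Save"}
    gl = {"a": "comentarios", "b": "Gardar"}
    print(plural_only_in_source(en, gl))
```

Running it on the parsed en.json and gl.json would give a short list of candidate keys to inspect by hand.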