Projects share branch names when using GitHub app PR integration

Problem statement

Consider a repository configured as follows:

  • The repository has a main branch for new development, and release branches for maintenance of released versions
  • The main branch is linked to a “main” project on Transifex
  • Each release branch is linked to a corresponding release project on Transifex. For example:
    • The release-1.x branch is linked to a “1.x” project on Transifex
    • The release-2.x branch is linked to a “2.x” project on Transifex
  • All branches have the same directory structure.
  • All integrations are configured to push translations by creating pull requests

I would like to note that a repository configured this way would not be a good candidate for the documented multi-branching functionality, which is designed for allowing translation work to begin before feature branches are merged. The release branches are not feature branches, and will never be merged.

This configuration exposes a major issue with the Transifex GitHub app integration: every project uses the same branch naming when creating PRs.

For example, if the aforementioned repository had the following resources:

  • resource-base
  • resource-extra

Edit: This is specifically an issue when the resources in each project are using the same slug. The following example assumes resource-base is the slug for the resource in both the “1.x” and “2.x” Transifex projects.

and a language (es_419 for example) was updated to pass the sync threshold in both 1.x and 2.x, then we would see:

  • A string is updated in language es_419 in the resource-base resource in the “1.x” project. The resource is over the threshold so a sync is queued.
  • PR “A” is created with release-1.x as the base branch, and translations_resource-base_es_419 as the head branch.
  • We assume PR “A” has not yet been merged.
  • A string is updated in language es_419 in the resource-base resource in the “2.x” project. The resource is over the threshold so a sync is queued.
  • The base branch of PR “A” is release-1.x, which doesn’t match the release-2.x base branch of the translations the sync is trying to push.
    • It doesn’t make sense to try to add a commit to the existing translations_resource-base_es_419 head branch.
  • PR “A” is closed, and PR “B” is created with release-2.x as the base branch, and translations_resource-base_es_419 as the head branch.

This causes extreme amounts of PR churn when strings are updated in multiple Transifex projects simultaneously.
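
To make the churn concrete, here is a minimal sketch of the behaviour described above. It is a simplified model, not the integration’s actual code: it assumes the head branch name is built only from the resource slug and language code (as the branch names above suggest), and that an open PR is closed whenever a sync arrives for the same head branch with a different base branch. `RepoModel` and `sync` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


def head_branch(resource_slug: str, language_code: str) -> str:
    # Assumed naming pattern, inferred from the PRs above.
    # Note that the Transifex project is not part of the name.
    return f"translations_{resource_slug}_{language_code}"


@dataclass
class RepoModel:
    """Toy model of a repository: at most one open PR per head branch."""
    open_prs: dict = field(default_factory=dict)  # head branch -> base branch
    closed: int = 0

    def sync(self, base_branch: str, resource_slug: str, language_code: str) -> str:
        head = head_branch(resource_slug, language_code)
        previous_base = self.open_prs.get(head)
        if previous_base is not None and previous_base != base_branch:
            # The existing PR targets another base branch, so it is closed
            # and replaced by a new PR on the same head branch.
            self.closed += 1
        self.open_prs[head] = base_branch
        return head


repo = RepoModel()
for _ in range(3):
    # Alternating updates in the "1.x" and "2.x" projects.
    repo.sync("release-1.x", "resource-base", "es_419")
    repo.sync("release-2.x", "resource-base", "es_419")

print(repo.closed)  # 5 PRs closed after only 6 syncs
```

In this model every sync after the first closes a PR, which is how a modest number of string updates can turn into thousands of PRs.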

In the case of the Open edX project, this resulted in over 19,000 PRs being created over the course of 2 days.


Proposed solution

I would like to propose a new sync content option for the Transifex GitHub app integration: “Pull request branch prefix”

It is currently possible to configure a commit message prefix, and a pull request title prefix. If it was also possible to configure a branch prefix, then the previously described scenario could have instead played out as:

  • A string is updated in language es_419 in the resource-base resource in the “1.x” project. The resource is over the threshold so a sync is queued.
  • PR “A” is created with release-1.x as the base branch, and 1x_translations_resource-base_es_419 as the head branch.
  • We assume PR “A” has not yet been merged.
  • A string is updated in language es_419 in the resource-base resource in the “2.x” project. The resource is over the threshold so a sync is queued.
  • PR “B” is created with release-2.x as the base branch, and 2x_translations_resource-base_es_419 as the head branch.

In this scenario, the PR head branches would not conflict at all, so the Transifex GitHub app integration would not need to close PR “A” in order to create PR “B”.
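
As a rough sketch of how the proposed option could change branch naming: the `branch_prefix` parameter below is hypothetical (no such setting exists today), and joining the prefix with an underscore is my assumption, chosen to match the example branch names above.

```python
def head_branch(resource_slug: str, language_code: str, branch_prefix: str = "") -> str:
    # Naming pattern assumed from the example branch names in this post.
    name = f"translations_{resource_slug}_{language_code}"
    # Hypothetical "Pull request branch prefix" option: prepend it when set.
    return f"{branch_prefix}_{name}" if branch_prefix else name


branch_a = head_branch("resource-base", "es_419", branch_prefix="1x")
branch_b = head_branch("resource-base", "es_419", branch_prefix="2x")

print(branch_a)  # 1x_translations_resource-base_es_419
print(branch_b)  # 2x_translations_resource-base_es_419
```

With an empty prefix the name is unchanged, so existing integrations would keep their current behaviour.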


Closing thoughts

I do not mean to imply that my proposed solution is the only solution to this issue. I am just hoping to demonstrate how, given the additional setting, it would be possible to avoid the issue I encountered.

I’m looking forward to hearing other thoughts on how to address this!

Thank you so much for reading this!

Hi @bsmithaxim

Thanks for sharing. Since you have a ticket open for this issue, we will continue the discussion there so that you receive updates in a single channel.

We are currently testing your scenario to try to replicate the issue. We’ll get back to you as soon as we have any news.

Best regards,

Carlos Olvera from Transifex Support

To summarize the issue:

When multiple Transifex projects are connected to the same GitHub repository using pull requests in the GitHub app integration, the slugs for each resource must be unique across all projects to avoid conflicts.

This uniqueness is not enforced on the Transifex side.

When there are multiple resources using the same slug (even when those resources exist in separate projects), the conflicts can lead to immense numbers of pull requests being closed/opened (in our case, over 19,000 in 2 days).
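
Since nothing enforces this, a check along the following lines can flag the problem before it reaches GitHub. This is only a sketch over in-memory data: how the slugs are fetched from Transifex is left out, and the project and slug values (including `resource-extra-v2`) are made up for illustration.

```python
from collections import defaultdict


def find_shared_slugs(projects: dict) -> dict:
    """Return slugs that appear in more than one project.

    `projects` maps a project name to an iterable of its resource slugs.
    """
    owners = defaultdict(set)
    for project, slugs in projects.items():
        for slug in slugs:
            owners[slug].add(project)
    return {slug: sorted(names) for slug, names in owners.items() if len(names) > 1}


# Illustrative data only.
projects = {
    "1.x": ["resource-base", "resource-extra"],
    "2.x": ["resource-base", "resource-extra-v2"],
}

print(find_shared_slugs(projects))  # {'resource-base': ['1.x', '2.x']}
```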

Hi @bsmithaxim

Could you please elaborate on what you mean when you say that uniqueness is not enforced on the Transifex side?

Transifex generates a hash as the slug every time a new resource is created. Even when a resource is synced through the GitHub integration, Transifex assigns a different slug to each resource in the same project.


If a slug has changed, it is because you either have a process that changes slug names or you are assigning slug names manually. Can you confirm whether you are doing either of these?

Regarding the branches for translations: when we create a branch to update or create translations, we follow this naming convention:

"translations_{resource_slug}_{language_code}"

So, if you use the same repository with different branches but the same resource slug, the same branch name will be generated for that GitHub repo. That’s why your process should check whether a resource slug is already assigned to another resource.

For example, suppose Branch 1 has a resource with slug resource_slug1 linked to Transifex project 1, and Branch 2 has a resource with slug resource_slug1 linked to Transifex project 2, both in the same GitHub repo. If you translate the Spanish locale in both projects, your repo will receive two PRs with the head branch translations_resource_slug1_es.

Both come from different Transifex projects and both point to the same GitHub repo, but since the resource slug is the same in both projects, the branch name is the same: Branch 1 generates translations_resource_slug1_es and Branch 2 also generates translations_resource_slug1_es.

In summary, Transifex generates a unique slug for each resource every time a resource is synced from GitHub, by assigning a hash to each resource. But if you modify the slug manually, Transifex will not re-check that it is unique; it assumes you have already validated this when modifying the resource slug.

If you need to make your resource easier to identify or read to translators, the best practice is to only modify the resource name field.

Best regards,

Carlos Olvera from Transifex support

It is possible to set the same slug on multiple resources, both via the API and (I believe) manually in the web interface.

We have a process that changes both resource names and resource slugs. The automatically generated names made it very hard for translators to distinguish between resources (see this PR and this follow-up commit).

Transifex automatically sets resource names to a long name, e.g. translations..frontend-app-something..src-i18n-transifex-input--main

This script sets it to frontend-app-something, which makes it usable in the Transifex UI for translators.

Without this fix, it’s nearly impossible to know which resource belongs to which app.

We are currently working to update that script to ensure uniqueness in resource slugs, while maintaining human readability of URLs etc.
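
One possible shape for that, as a sketch only and not the actual script: append the project slug to the readable name so the result stays legible but differs per project. The function name, the separator, and the set of characters kept in the slug are all assumptions on my part.

```python
import re


def unique_slug(readable_name: str, project_slug: str) -> str:
    # Combine the readable name with the project so the slug differs per project.
    raw = f"{readable_name}-{project_slug}".lower()
    # Assumed slug alphabet: lowercase letters, digits, "_" and "-".
    # Any other run of characters is replaced by a single hyphen.
    return re.sub(r"[^a-z0-9_-]+", "-", raw).strip("-")


print(unique_slug("frontend-app-something", "1.x"))  # frontend-app-something-1-x
print(unique_slug("frontend-app-something", "2.x"))  # frontend-app-something-2-x
```

The resource name shown to translators can stay as frontend-app-something; only the slug needs to carry the project suffix.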

Thanks for clarifying @bsmithaxim

As mentioned before, Transifex sets a unique hash for each resource within a project, but you are correct that we do not check the resource slug when it is modified manually. What I can do is raise a product improvement request so that it can be considered for future releases.

About your script, let us know how it goes and if you need further assistance.

Best regards,

Carlos Olvera