-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Bulk Download from GlotPress #401
base: trunk
Are you sure you want to change the base?
Conversation
cc @mokagio in case you're curious (squirrel!) about this early implementation and my vision/attempt for the new code modular architecture for all those GlotPress-export-related code (see PR description) 😃 To be clear, this is still WIP and not fully ready for review yet (I'll add you as reviewer once I got unit tests and it's ready); though the new architecture and its separation of concerns used for the new classes (separating the download part, the post-process part, and the writing part, to be ready to potentially support a platform other than GlotPress one day) is already where I want it to be. But since I'm not sure when I will have time to come back on this code/project in the near future (hence why I decided to push it as draft anyway, to avoid losing my work so far) to finish it, I figured that, knowing you, you might be interested/curious to take a pick at my brain dump and architecture idea here 😉 |
] | ||
end | ||
|
||
# The common standardized set of Metadata rules for an Android project |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# The common standardized set of Metadata rules for an Android project | |
# The common standardized set of Metadata rules for an iOS project |
# Note: If the translation for `key` exceeds the specified `max_len`, we will try to find an alternate key named `#{key}_short` by convention. | ||
# @param [String] filename The (relative) path to the `.txt` file to write that metadata to | ||
# | ||
MetadataRule = Struct.new(:key, :max_len, :filename) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: add keyword_init: true
for the Struct
so we can update all the MetadataRule.new
call sites to use named parameters?
# Visit each key/value pair of a translations Hash, and yield keys and matching translations from it based on the passed `MetadataRules`, | ||
# trying any potential fallback key if the translation exceeds the max limit, and yielding each found and valid entry to the caller. | ||
# | ||
# @param [#read] io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# @param [#read] io | |
# @param [Hash<String,String>] translations The hash of key => translations for a single locale |
next if rule.nil? | ||
|
||
if rule.max_len != nil && value.length > rule.max_len | ||
UI.warning "Translation for #{key} is too long (#{value.length}), trying shorter alternate #{key}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UI.warning "Translation for #{key} is too long (#{value.length}), trying shorter alternate #{key}." | |
UI.warning "Translation for #{key} is too long (#{value.length} > #{rule.max_len}), trying shorter alternate #{key}." |
if rule.max_len != nil && value.length > rule.max_len | ||
UI.warning "Translation for #{key} is too long (#{value.length}), trying shorter alternate #{key}." | ||
short_key = "#{key}_short" | ||
value = json[short_key] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value = json[short_key] | |
value = translations[short_key] |
query_params = filters.transform_keys { |k| "filters[#{k}]" }.merge('export-format': format) | ||
uri = URI::HTTPS.build(host: host, path: File.join('/', 'exporter', project, '-do'), query: URI.encode_www_form(query_params)) | ||
UI.message "Downloading #{uri}" | ||
zip_stream = uri.open(REQUEST_HEADERS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrap this in begin…rescue…end
block
zip_file.each do |entry| | ||
next if entry.name.end_with?('/') && entry.size.zero? | ||
|
||
prefix = File.dirname(entry.name).gsub(/[0-9-]*$/, '') + '-' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefix = File.dirname(entry.name).gsub(/[0-9-]*$/, '') + '-' | |
# Each entry in the ZIP looks like `apps-android-release-notes-current-2022-11-08-1951/apps-android-release-notes-current-zh-cn` | |
# So to get the locale, we use the `dirname`, minus the `2022-11-08-1951` timestamp at the end, as a prefix to strip from the `basename` to only have `zh-cn` left. | |
prefix = File.dirname(entry.name).gsub(/[0-9-]*$/, '') + '-' |
🚧 Next Steps / TODO
rubocop
violationsWhy?
With us encountering the
429 Too Many Requests
quota limitation more and more often while downloading translations from GlotPress during our release duties, I've recently asked our Team Global and Orbit (ref: pxLjZ-72D-p2) to install thegp-import-export
plugin on our GlotPress instances (ref: 307-gh-Automattic/Orbit), so that we can bulk-download the translations for all the locales at once (as a ZIP).This will allow us to only make a single network request to export them instead of one request per locale (which ends up being a lot of requests especially for WordPress and the 40+ locales it supports)
How?
I've reimplemented the logic for the download of GlotPress exports into a more modern infrastructure / code architecture in our code base. I intend to deprecate
glotpress_helper
andgp_downloadmetadata_action
once this new implementation is validated as more stable, and refactoringandroid_download_translations_action
/ios_downlaod_strings_files_from_glotpress
to use the new helpers and models (or deprecating those actions and creating new ones to replace them).The new implementation/architecture is structured like this:
Models to manipulate locales easily
This will supersede my #296 draft PR (which I opened… one year ago!)
models/locale
: is aStruct
which represents a single locale, exposing the various locale code representations for that locale — since there are multiple ISO standards for locale codes needed/used in various places of the codebase (ios folder names, android folder names, glotpress locale codes…)models/locales
: represents a set ofLocale
objects. That model allows to build a set of locales based on their locale codes (to have an object representing all the locales supported by a project/app, typically), build convenience sets (e.g. Mag16), easily add/remove/filter locales from sets, and find a known locale from the set of all known localesA class dedicated to download exports from GlotPress
The
helper/glotpress_downloader
class knows how to exports from GlotPress inandroid
,ios
orjson
formats.You start by initializing an instance with a GlotPress URL host and URL project path
You can then call methods to download exports from that project:
gp-import-export
plugin installed and supports it — as a bulk download of all the locales at once (as a ZIP stream that will be decompressed by the downloader class on the fly)That class also knows how to parse the odd format used by GlotPress for its JSON exports (where the JSON keys are actually the copy key +
\0004
+ the original copy, and the JSON values are an array with mostly only one entry at most) and convert it to a usableHash
.This class is supposed to only have the knowledge of the GlotPress side of things. Once it downloaded the export from GlotPress in memory (
File
/IO
instance) and parsed it, it is not responsible for doing anything with the data except handing it over to the next piece in that new architecture (typically a*_file_writer
).That way, if we want to support alternative Localization backends in the future (e.g. OneSky or Weblate, see 298-gh-Automattic/i18n-issues), this would be nicely decoupled.
File writers — to write the downloaded translations for each locale to disk
helper/ios/ios_strings_file_writer
knows how to take a single.strings
file export, for a single locale, and write it to disk(†).helper/android/android_strings_file_writer
knows how to take a singlestrings.xml
file export, for a single locale, and write it to disk(†).helper/fastlane_metadata_file_writer
knows how to take aHash
of exported translations (key => translation) for a single locale, and write it as.txt
files to disk (typically infastlane/metadata/…/{locale}/
folders)MetadataRule
which describes the rules to apply to each key, especially itsmax_len
and the name/path of the corresponding.txt
file to associate with it.FastlaneMetadataFileWrite
then knows how to use a list ofMetadataRule
objects to extract the proper keys from the translations, check their length, try the shorter alternatives (entries for#{original_key}_short
, by convention), and write the one that fits the limit into the proper.txt
file (or delete the file if no fitting translation was found).MetadataRule
lists for Android and iOS projects, with predefined keys for typical App Store / Play Store metadata.MetadataRule
for any unknown key, allowing us to describe how to handle unexpected/unusual keys (e.g. screenshots) as we encounter them.(†) Those two
*_strings_file_writer
helpers are currently very simple, just taking the IO stream and writing it to disk verbatim, with the only logic being to know the relative path to use for the subfolder based on the locale passed.But in the future I'm hoping we can improve those classes to optimize the way files are written to disk, especially reorder the XML nodes in
strings.xml
files to make sure they are always sorted in the same order, making PR diffs for them easier to review. This is not implemented in this PR yet though.To Test
🚧 There is currently no unit tests / specs. This is the next step on my TODO list.
In the meantime, I've written some "demo code" of how those helpers are supposed to interact with each other and assembled together.
This demo code is likely the scaffold of what the future new fastlane actions (the ones using this new implementaion) will look like, but for now they are quite drafty and only serve the purpose of brain-dumping my idea of how that new API of mine would work together at call site.