Content Flow
This document organizes the process and interface calls involved in the flow of content from repository -> quotation -> CAT production -> delivery -> acceptance.
Repository
Source data enters the content repository according to configured rules. This stage does not perform word counting or the other operations of traditional localization, but it does maintain the tag generation rules (Tag Rules).
Quotation
The quotation service obtains the source text from the content repository and needs to address three issues: tag generation, word count, and match rate. The content pulled from the content repository should then be persisted to the order information.
Tag Generation and Word Count Calculation
Tag generation and word count are prerequisites for calculating match rates. Both are provided by the content service (Geralt). Take the following source text as an example:
<p>Hello World</p>
Tag Rules contain a set of regular expressions for matching tags, for example:
/<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>/g
First, the source text should be converted to an array of tagged parts:
[
{"tagType": 4, "tagID": "lockedContent", "textEquivalent": "<p>"},
{"tagType": 0, "text": "Hello World"},
{"tagType": 4, "tagID": "lockedContent", "textEquivalent": "</p>"}
]
Word count calculation uses the text with tags removed, which here is:
Hello World
This text is used to calculate the word count and character count. The quotation service uses the Snapshot ID to get this result.
The interface should also accept an optional parameter for additional regular expressions: when a quotation is transferred to a PM for re-quotation, the PM can add regular expressions based on their experience. In that case, the regular expressions used by the content data service to generate the tagged source text are the Tag Rules corresponding to the Snapshot ID plus the additional regular expressions.
The returned content should also include a segment ID that can be recognized throughout the process. This ID is imported into CAT as the segment ID. After translators finish production, the content service uses this ID to retrieve translations and map them back to the source content.
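The tagging and counting steps above can be sketched roughly as follows. This is a minimal illustration, not the content service's implementation: the function names are hypothetical, Python's built-in re module stands in for an RE2-based engine, words are counted per text part by splitting on whitespace, and characters are counted excluding whitespace. All of these are assumptions.

```python
import re
from typing import Sequence

# Example Tag Rule from this document, written as a bare pattern.
HTML_TAG_RULE = r"""<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>"""


def strip_js_delimiters(rule: str) -> str:
    """Turn a JavaScript-style literal such as '/<[^>]+>/g' into a bare pattern.

    Flags after the closing slash are dropped; matching here is always global.
    """
    last_slash = rule.rfind("/")
    if rule.startswith("/") and last_slash > 0:
        return rule[1:last_slash]
    return rule


def tag_source(source: str, tag_rules: Sequence[str],
               additional_regex: Sequence[str] = ()) -> list:
    """Split source text into tag parts (tagType 4) and text parts (tagType 0)."""
    patterns = [strip_js_delimiters(r) for r in (*tag_rules, *additional_regex)]
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    parts, pos = [], 0
    for match in combined.finditer(source):
        if match.start() == match.end():
            continue  # ignore empty matches so no empty tags are produced
        if match.start() > pos:
            parts.append({"tagType": 0, "text": source[pos:match.start()]})
        parts.append({"tagType": 4, "tagID": "lockedContent",
                      "textEquivalent": match.group(0)})
        pos = match.end()
    if pos < len(source):
        parts.append({"tagType": 0, "text": source[pos:]})
    return parts


def count_words(parts: list) -> int:
    """Assumption: whitespace-separated words, counted per text part."""
    return sum(len(p["text"].split()) for p in parts if p["tagType"] == 0)


def count_characters(parts: list) -> int:
    """Assumption: whitespace is excluded from the character count."""
    return sum(1 for p in parts if p["tagType"] == 0
               for ch in p["text"] if not ch.isspace())
```

For example, tag_source("<p>Hello World</p>", [HTML_TAG_RULE]) yields the three-part array shown earlier, and a PM-supplied rule such as "/\{[^}]+\}/g" can be passed as additional_regex to lock placeholders like {name}. Whitespace splitting does not suit languages without word spacing, so a real word counter would need language-aware rules.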
Note: To avoid differences between the frontend and backend regex engines, it is recommended that the frontend use re2js (https://re2js.leopard.in.ua/) and the backend use re2j (https://github.com/google/re2j), which eliminates implementation differences between regex engines in different languages.
The accepted parameters and return format of the final content interface should be similar to the following:
Accepted Parameters
{
"repositoryID": "1234567890",
"snapshotID": "1234567890",
"additionalRegex": ["/<[^>]+>/g", "/<[^>]+>/g"]
}
Return Results
{
"originalContent": "<p>Hello World</p>",
"wordsCount": 2,
"charactersCount": 10,
"taggedContent":
[
{"tagType": 4, "tagID": "lockedContent", "textEquivalent": "<p>"},
{"tagType": 0, "text": "Hello World"},
{"tagType": 4, "tagID": "lockedContent", "textEquivalent": "</p>"}
],
"segmentID": "1234567890"
}
Match Rate
Match rate measures how closely each source segment matches content already held in the language assets, and it feeds into the quotation. After the word count is calculated, the batch match rate calculation interface is called: the request goes first to the language asset service (Morgan), which in turn calls the CAT Core (Allen) interface.
Persistence
The above segments and their corresponding calculation results should be persisted to the order information.
CAT Production
Import Segments
Project management will add a settings confirmation button. For users, clicking it indicates that the task workflow has been confirmed and task splitting is complete. After confirmation, the segment import is executed. From this point on, the workflow can no longer be changed and tasks can no longer be split.
Production
Translators fill in translations. Once a translator submits, they can no longer edit. The PM previews the submitted content and, if problems are found, can change the status back to unsubmitted so that the translator can continue editing.
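The submit and reopen rule can be pictured as a two-state task. The sketch below is only illustrative: the Task class, its method names, and the status values are hypothetical and do not describe the actual CAT data model.

```python
from enum import Enum


class TaskStatus(Enum):
    UNSUBMITTED = "unsubmitted"
    SUBMITTED = "submitted"


class Task:
    """Hypothetical translation task holding translations keyed by segment ID."""

    def __init__(self):
        self.status = TaskStatus.UNSUBMITTED
        self.translations = {}

    def edit(self, segment_id: str, translation: str) -> None:
        # Translators may only edit while the task is unsubmitted.
        if self.status is not TaskStatus.UNSUBMITTED:
            raise PermissionError("submitted tasks cannot be edited")
        self.translations[segment_id] = translation

    def submit(self) -> None:
        # Translator action: locks the task against further edits.
        self.status = TaskStatus.SUBMITTED

    def reopen(self) -> None:
        # PM action: returns a submitted task to the translator.
        if self.status is not TaskStatus.SUBMITTED:
            raise ValueError("only submitted tasks can be reopened")
        self.status = TaskStatus.UNSUBMITTED
```

Keeping the edit check inside the task, rather than only in the UI, means a stale editor tab cannot overwrite content the PM is already reviewing.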
Delivery
When all tasks in the project reach 100%, the Complete button must be clicked on the project. Clicking it calls the content service, passing the FileGuid, SubId, and Slot (e.g., flow-202404-edit) for each target language. Because tasks can be split, one target language may have multiple task records, with a structure similar to the following:
[
{
"targetLanguage": "zh-CN",
"tasks": [
{
"fileGuid": "1234567890",
"subId": 0,
"slot": "flow-202404-edit"
}
]
},
{
"targetLanguage": "ja-JP",
"tasks": [
{
"fileGuid": "1234567890",
"subId": 1,
"slot": "flow-202404-edit"
},
{
"fileGuid": "1234567890",
"subId": 2,
"slot": "flow-202404-edit"
}
]
}
]
After the content service interface receives the import request from TMS, the behavior depends on the task type. For a synchronous task, the interface waits and returns a version number, which TMS stores in the project information. For an asynchronous task, the interface returns immediately; once the asynchronous task succeeds, the project's completion status and version number are updated.
Next, the delivery button is clicked in the order list. The frontend then retrieves the project's completion status and checks whether a version number exists. Without a version number, delivery is not allowed. If delivery is possible, the order delivery interface is called with the delivery notes and the delivery version number.
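As a rough sketch of the two pieces described above, the following groups flat task records into the per-language structure and checks whether delivery is allowed. The function names, the flat input record shape, and the project field names (completed, versionNumber) are assumptions made for illustration; only the output structure comes from this document.

```python
def build_completion_payload(task_records: list) -> list:
    """Group flat task records by target language, keeping first-seen order."""
    grouped = {}
    for record in task_records:
        grouped.setdefault(record["targetLanguage"], []).append({
            "fileGuid": record["fileGuid"],
            "subId": record["subId"],
            "slot": record["slot"],
        })
    return [{"targetLanguage": lang, "tasks": tasks}
            for lang, tasks in grouped.items()]


def can_deliver(project: dict) -> bool:
    """Delivery requires a completed project that already has a version number."""
    return bool(project.get("completed")) and project.get("versionNumber") is not None
```

With an asynchronous import, can_deliver stays false until the background task has written back the completion status and version number, which is the condition the frontend checks before enabling delivery.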
Acceptance
After delivery, the order changes to acceptance-pending status. The customer clicks the preview button, which retrieves the current order's acceptance version number and jumps to the content repository. The content repository displays the content corresponding to that version number, and the customer evaluates whether it meets the acceptance conditions.
If it does, the customer clicks the Accept button, the order changes to accepted status, and the content repository is unlocked.
If the customer is unsatisfied with the deliverables, they click the Reject button. The content repository deletes the corresponding version, the order status changes to in-production, the project status of the corresponding order changes to incomplete, and the previous version number is cleared. The process above is then repeated until acceptance passes.
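The accept and reject branches can be summarized as state transitions on the order. The sketch below is a minimal model under stated assumptions: the Order fields, the function names, and the delete_version callback standing in for the content repository call are all hypothetical, and the repository is assumed to be locked while acceptance is pending.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Order:
    status: str = "acceptance-pending"
    version_number: Optional[str] = None
    project_completed: bool = True
    repository_locked: bool = True  # assumption: locked while acceptance is pending


def accept(order: Order) -> None:
    """Customer accepts: the order is closed and the repository is unlocked."""
    if order.status != "acceptance-pending":
        raise ValueError("order is not awaiting acceptance")
    order.status = "accepted"
    order.repository_locked = False


def reject(order: Order, delete_version: Callable[[str], None]) -> None:
    """Customer rejects: drop the delivered version and return to production."""
    if order.status != "acceptance-pending":
        raise ValueError("order is not awaiting acceptance")
    if order.version_number is not None:
        delete_version(order.version_number)  # content repository removes it
    order.status = "in-production"
    order.project_completed = False
    order.version_number = None
```

After a reject, the order has no version number, so it cannot be delivered again until the project is completed once more and a new version is produced.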