CAT Review

trinley · February 13, 2022, 6:22am

Review Criteria

UI
- User friendliness
API
- API support to update segments (must for Tibetan)
- API support to export comments (must for 84000/Kumarajiva)
Price
- discount for open source / NGO
Tibetan Compatibility
Features

OmegaT

Memsource (Phrase)

API
Memsource’s API seems to check all the boxes (https://cloud.memsource.com/web/docs/api#operation/relevantTermBases, https://cloud.memsource.com/web/docs/api#tag/Conversations), but API access is priced well above anything we have been able to afford so far, so we have not tested it.

Price
API access starts with their Ultimate package (minimum of 5 project managers at $350/month each, i.e. at least $1,750/month).
Reduced-price or pro bono access is possible for NGOs by arrangement with Memsource.

UI
Reasonably elegant and simple. Helpfully and intuitively customisable: one can shrink or expand different panels and change the font size.
One of the characteristic features of the Memsource UI is the CAT pane. Rather than having separate panels for exact and fuzzy matches, subsegment matches, termbase matches, and machine translations, Memsource combines them all in one location: the CAT pane. As a result, TM match suggestions are capped at five. To check for further matches one can, of course, use the search pane.

User Friendliness

Highly accessible, intuitive, and easy to use for translators with little or no experience of CAT tools.

Tibetan Compatibility
Basically fine. CAT platforms often read tshegs as characters rather than word delimiters, which can create significant issues for Tibetan. Memsource does calculate the Tibetan word count based on spaces rather than reading tshegs as word delimiters, and it does count tshegs as characters. However, tshegs appear to be treated as a form of punctuation, similar to the interpunct, and work fine to delimit each Tibetan syllable. Generally, spaces and punctuation have little effect on fuzzy matching (unless placed within a tsheg-delimited syllable), though extra spaces or punctuation can bring an exact match down to 99%, for example. Also, and importantly, word count doesn’t seem to have any bearing on the Memsource fuzzy matching algorithm, only characters, so for the purposes of fuzzy matching, the failure to recognize tshegs as word delimiters is broadly irrelevant.
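To make the distinction concrete, here is a minimal Python sketch contrasting a space-based word count with a tsheg-based syllable count, alongside a character-level similarity score. It is only an illustration: Memsource's actual counting and fuzzy matching algorithms are not public to us, so `difflib.SequenceMatcher` stands in as an assumed character-based measure, and the function names are made up for this example.

```python
from difflib import SequenceMatcher

TSHEG = "\u0F0B"  # Tibetan intersyllabic mark (tsheg)
SHAD = "\u0F0D"   # Tibetan shad (clause/sentence marker)


def count_by_spaces(text: str) -> int:
    """Count 'words' the way a purely space-delimited counter would."""
    return len(text.split())


def count_syllables(text: str) -> int:
    """Count tsheg-delimited syllables, treating shads and spaces as breaks."""
    chunks = text.replace(SHAD, " ").split()
    return sum(1 for chunk in chunks for syl in chunk.split(TSHEG) if syl)


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; an assumed stand-in, not Memsource's algorithm."""
    return SequenceMatcher(None, a, b).ratio()


# "bkra shis bde legs" followed by a shad: four syllables, no spaces.
segment = TSHEG.join(["བཀྲ", "ཤིས", "བདེ", "ལེགས"]) + SHAD
# The same segment with an extra space and shad appended.
variant = segment + " " + SHAD

print(count_by_spaces(segment), count_syllables(segment))
print(count_by_spaces(variant), count_syllables(variant))
print(round(char_similarity(segment, variant), 3))
```

In this sketch the extra space and shad change the space-based word count while leaving the syllable count untouched, and the character-level score drops only slightly below 1.0. That is the same pattern as described above: a character-based matcher does not care how words are counted.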

Features

Easy in-editor joining, splitting, and editing of source segments
Subsegment matching. As defined by Memsource: “If a smaller part of the original text was previously translated as a short segment, the CAT pane will display it even though the match is lower then the threshold set in the Editor’s Preference.” Sadly, this only applies to exact subsegment matches, so extra shads, minor morphological changes, etc. will prevent a subsegment match.
Cloud-based, but comes with a simple desktop editor. This can be used offline but requires an internet connection for full functionality (TM access etc.)
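
The exact-only limitation of subsegment matching can be sketched in a few lines of Python. This is an assumption about the observable behaviour (a verbatim substring test), not Memsource's implementation, and `exact_subsegment_matches` is a hypothetical helper name.

```python
TSHEG = "\u0F0B"  # Tibetan tsheg
SHAD = "\u0F0D"   # Tibetan shad


def exact_subsegment_matches(source: str, tm_sources: list[str]) -> list[str]:
    """Return the TM source segments that occur verbatim inside `source`."""
    return [entry for entry in tm_sources if entry and entry in source]


# A short segment already in the TM: "bkra shis bde legs" + shad.
tm_entry = TSHEG.join(["བཀྲ", "ཤིས", "བདེ", "ལེགས"]) + SHAD

# A longer segment that contains the TM entry verbatim.
contains_exact = TSHEG.join(["ཁྱེད", "ལ"]) + TSHEG + tm_entry
# A minor morphological change (an added final particle before the shad).
with_particle = TSHEG.join(["བཀྲ", "ཤིས", "བདེ", "ལེགས", "སོ"]) + SHAD

print(exact_subsegment_matches(contains_exact, [tm_entry]))  # one match
print(exact_subsegment_matches(with_particle, [tm_entry]))   # no match
```

The second call returns nothing even though the two strings differ by a single syllable, which is why small variations in Tibetan source text lose the subsegment suggestion.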

Transifex

API
segment I/O
programmer note I/O
comment/discussion I/O
No I/O for glossaries

Price for OpenSource, for copyrighted material
API access starts with their Ultimate package (minimum of 5 project managers at $350/month each)