Optimising source documents for translation - Part 2: The Portable Document Format – PDF for short
The idea behind the development of the Portable Document Format (PDF) is ingenious: a file format for sharing true-to-original electronic documents, regardless of operating system or application program. However, the format is poorly suited to translation, and especially to work with translation memory systems (TMSs).

Even state-of-the-art translation memory systems, from Across to Trados to memoQ, can process PDF files only in certain cases. This means that source documents in PDF format usually have to be converted into a different format before translation, which costs time and money. It rarely ends there, though: automatically converted files, normally in *.doc or *.docx format, contain a hotchpotch of stray formatting after conversion, such as line breaks, extra spaces, tab stops and hyphenation breaks. This has to be cleaned up by hand, because a TM system delivers optimum results only from an optimum source document. The manual post-processing takes additional time, which leads to delays and increased costs.
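To give a feel for the kind of clean-up involved, here is a minimal Python sketch of a first automated pass over text taken from a converted PDF. The function name clean_converted_text and the sample text are our own illustration, not part of any TM system, and the rules are deliberately simple: a word that genuinely carries a hyphen at the end of a line would be joined incorrectly, so the result still needs checking by a person.

```python
import re


def clean_converted_text(text: str) -> str:
    """Rough first pass over text from a converted PDF (hypothetical helper)."""
    # Rejoin words that were split by a hyphen at the end of a line,
    # e.g. "trans-\nlation" becomes "translation".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn a single line break inside a paragraph into a space;
    # blank lines (two consecutive breaks) are kept as paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tab stops into one space.
    text = re.sub(r"[ \t]+", " ", text)
    # Remove whitespace left over at the start or end of each line.
    return "\n".join(line.strip() for line in text.split("\n"))


sample = "The trans-\nlation memory\nsystem  needs\tclean text.\n\nNew paragraph."
print(clean_converted_text(sample))
```

A script like this can take care of the most mechanical repairs, but it cannot decide whether a line break was intentional (in a heading or a list, for example), which is why manual post-processing remains necessary.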

Our tip: wherever possible, please provide us with open (editable) files in their original format for translation, and supply PDF files only as extras for reference. All open file formats, except for graphics formats, can be processed by CAT (computer-assisted translation) systems. Office formats such as docx, doc, xlsx, xls, pptx and ppt are particularly suitable, as are xml and html, idml (InDesign), mif (FrameMaker) and a whole host of other file formats.
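If you are putting together a larger delivery, a short script can help you check what you are sending. The following Python sketch sorts file names into three groups by extension. The extension list simply mirrors the formats named above and is not exhaustive, and the function name sort_for_translation and the group names are our own assumptions for illustration.

```python
from pathlib import Path

# Extensions named in this article as suitable for CAT systems
# (illustrative only; many other editable formats exist).
EDITABLE = {
    ".docx", ".doc", ".xlsx", ".xls", ".pptx", ".ppt",
    ".xml", ".html", ".idml", ".mif",
}


def sort_for_translation(filenames):
    """Group file names by how they would be used in a translation job."""
    groups = {"translate": [], "reference_only": [], "check": []}
    for name in filenames:
        suffix = Path(name).suffix.lower()  # compare case-insensitively
        if suffix in EDITABLE:
            groups["translate"].append(name)
        elif suffix == ".pdf":
            # PDFs are kept as a visual reference, not as the source file.
            groups["reference_only"].append(name)
        else:
            # Graphics and unknown formats need a closer look.
            groups["check"].append(name)
    return groups


print(sort_for_translation(["manual.idml", "manual.pdf", "logo.PNG", "Prices.XLSX"]))
```

Anything that lands in the "check" group is worth a quick question before the project starts, since it may be a graphics file or a format that needs special handling.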

