SD - Document Translation
Translation Problem
- n sources to n-1 targets
- Direct: n * (n-1)
- Intermediate: 2n
- Characteristics of intermediate
- Allows representation of all information content of the source
- Separates content/structure and layout so content/structure can be rendered in multiple ways from the intermediate, as desired
- Difficulty of up and down translation: many-to-one, one-to-many, m:n
- Sources of inconsistency
- markup selection, ordering
- ambiguity, overloading of symbols
- semantic vs. syntactic orientation
- Examples of difficult translations
- math: fonts, sizes, spacing, meaning of adjacency (long string, product)
- tables: multiline fields
- Types of translation fidelity
- Hardcopy: looks the same when printed (e.g., PostScript, PDF)
- Screen: appears roughly the same on screen, using various editors: lines rewrap as window size changes, paragraphs break appropriately
- Editing: screen fidelity + document editable as on original system
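The translator counts at the top of this slide (direct: n * (n-1); intermediate: 2n) can be checked with a minimal sketch. The function names are illustrative assumptions, not part of the original material.

```python
def direct_translators(n: int) -> int:
    """Direct approach: each of n formats needs a translator to each of the other n - 1."""
    return n * (n - 1)


def intermediate_translators(n: int) -> int:
    """Intermediate approach: each format needs one translator up and one down."""
    return 2 * n


if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        print(n, direct_translators(n), intermediate_translators(n))
```

Under this count the two approaches break even at n = 3 (6 translators each), and the intermediate needs fewer translators for every larger n.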
ICA Approach
- Formal model, with braced languages
- Translation (down) and inverse translation (up)
- Developers to specify grammars, translations
- Users to apply translators, resolve ambiguities
- SGML intermediate
- Develop DTD
- Identify mappings
- Replacement
- Context, history dependent
- Ambiguous, so need human help
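The up and down mappings above can be sketched as two lookup tables, with a human-supplied resolver for ambiguous cases. This is only an illustration under stated assumptions: the source symbols (`@b`, `@i`), the intermediate tag names, and the target codes are all hypothetical, and this is not the actual ICA implementation.

```python
from typing import Callable, Dict, List

# Hypothetical up-mapping: source symbol -> candidate intermediate tags.
# A symbol with more than one candidate is overloaded in the source format.
UP_MAP: Dict[str, List[str]] = {
    "@b": ["bold"],
    "@i": ["emph", "cite"],  # italics may mean emphasis or a citation
}

# Hypothetical down-mapping for one target format (many-to-one).
DOWN_MAP: Dict[str, str] = {"bold": "BOLD", "emph": "ITALIC", "cite": "ITALIC"}

Resolver = Callable[[str, List[str]], str]


def translate_up(symbols: List[str], resolve: Resolver) -> List[str]:
    """Translate source symbols up to intermediate tags, asking for help when ambiguous."""
    tags = []
    for symbol in symbols:
        candidates = UP_MAP[symbol]
        if len(candidates) == 1:
            tags.append(candidates[0])
        else:
            tags.append(resolve(symbol, candidates))  # human resolves ambiguity
    return tags


def translate_down(tags: List[str]) -> List[str]:
    """Translate intermediate tags down to target codes; information may be lost."""
    return [DOWN_MAP[tag] for tag in tags]


if __name__ == "__main__":
    # Stand-in for a user who always picks the last candidate offered.
    intermediate = translate_up(["@b", "@i"], lambda symbol, options: options[-1])
    print(intermediate)
    print(translate_down(intermediate))
```

Because `emph` and `cite` both map down to the same target code, the down translation is many-to-one, which is why the corresponding up translation cannot be automated without context or human input.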
ICA Tools
- Develop Grammar
- BNF, SGML DTD, ICA spec, Yacc
- Replace Tags
- Insert Tags
- Map Specific to General
- Map General to Specific
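The tag-level operations listed above can be sketched with regular expressions. The tag names (`bd`, `h1`...`h9`, `head`, `p`) and function names here are assumptions chosen for illustration; they are not the ICA tools' actual interface.

```python
import re
from typing import Dict


def replace_tags(text: str, table: Dict[str, str]) -> str:
    """Replace Tags: rename each tag found in `table`, leaving other tags untouched."""
    def swap(match: "re.Match[str]") -> str:
        slash, name = match.group(1), match.group(2)
        return f"<{slash}{table.get(name, name)}>"
    return re.sub(r"<(/?)(\w+)>", swap, text)


def insert_tags(text: str, tag: str = "p") -> str:
    """Insert Tags: wrap each non-empty line that has no leading markup."""
    lines = []
    for line in text.split("\n"):
        if line and not line.startswith("<"):
            line = f"<{tag}>{line}</{tag}>"
        lines.append(line)
    return "\n".join(lines)


def specific_to_general(text: str) -> str:
    """Map Specific to General: h1, h2, ... become one tag carrying the level."""
    text = re.sub(r"<h(\d)>", r'<head level="\1">', text)
    return re.sub(r"</h\d>", "</head>", text)


def general_to_specific(text: str) -> str:
    """Map General to Specific: the level attribute selects the specific tag."""
    def expand(match: "re.Match[str]") -> str:
        level, body = match.group(1), match.group(2)
        return f"<h{level}>{body}</h{level}>"
    return re.sub(r'<head level="(\d)">(.*?)</head>', expand, text, flags=re.S)


if __name__ == "__main__":
    source = "<h2>Tools</h2>"
    general = specific_to_general(source)
    print(general)
    print(general_to_specific(general))
```

In this sketch the specific-to-general mapping keeps the heading level as an attribute, so the inverse mapping can recover the original tag. A mapping that discarded the level would be many-to-one and would need human help on the way back.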
Translation problems are difficult, but the ICA-style
approach of translating up to SGML and then
down to the target format is one important way to address them.