Review of "Markup Systems and the Future of Scholarly Text Processing"
by James H. Coombs, Allen H. Renear, Steven J. DeRose

Representing the group
Rick Compton, Fred L. Drake, Jr., Mark Missana, Steve Williams

The authors of this paper assert that scholars have allowed the state of authoring and research support applications from the computing field to slip over the fifteen to twenty years preceding the publication of their paper. The trend observed by the authors, in which non-structural "word processing" authoring tools dominate the workplace in place of more powerful structure-driven tools, has continued from the publication of the paper to the present. In many environments, the presentation-driven systems that the authors describe as the most difficult to use have supplanted structure-based tools so completely that the latter are no longer even considered as candidates for document production.

Several categories of document markup are described in the article:

1. no markup at all, a variation typically not used in modern languages,
2. punctuational markup, where word, phrase, and sentence boundaries are identified by spaces, commas, periods, and other punctuation characters inserted into the text,
3. presentational markup, where the visual form of the document is specified directly,
4. procedural markup, in which presentational instructions to some particular processing system are embedded in the text,
5. descriptive markup, as often found in applications of SGML, which approaches documents as structured objects containing semantically-interpretable parts,
6. referential markup, which allows external elements to be handled without regard to their content, either through procedural or declarative constructs, and
7. meta-markup, which allows new markup constructs to be created as needed for particular documents or classes of documents.
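As a rough illustration of the difference between procedural and descriptive markup (categories 4 and 5), the sketch below marks the same two passages both ways. The command names (`.skip`, `.indent`), the tag names (`quote`, `example`), the sample text, and the helper functions are all hypothetical, invented here for illustration; they are not taken from the paper or from any real formatter.

```python
import re

# Procedural markup (hypothetical commands): both passages carry the same
# formatting instructions, so a program cannot tell which one is a quotation.
procedural = [
    (".skip 1; .indent 5", "To be, or not to be."),
    (".skip 1; .indent 5", "print('hello')"),
]

# Descriptive markup (hypothetical tags): each passage is labelled with the
# role it plays in the document rather than with how it should look.
descriptive = "<quote>To be, or not to be.</quote><example>print('hello')</example>"


def elements(text):
    """Return (element_type, content) pairs from simple, non-nested tags."""
    return re.findall(r"<(\w+)>(.*?)</\1>", text)


def select(text, element_type):
    """Return the content of every element of the given type."""
    return [content for kind, content in elements(text) if kind == element_type]


# The procedural version collapses to a single set of commands.
procedural_commands = {command for command, _ in procedural}

# The descriptive version can still be queried by role.
quotes = select(descriptive, "quote")
```

Under these assumptions, the procedural form offers only one distinct command string for both passages, while the descriptive form lets a later program pick out the quotations without knowing anything about their presentation.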
The authors claim that documents marked up using a descriptive system of annotation, augmented by referential and meta-markup constructs, provide the highest level of reusability and the lowest level of initial investment on the part of document authors. A mental model of the process of marking documents is presented, which shows that descriptive markup offers the highest level of information content for the lowest cost of including markup beyond the punctuational level. Alternative approaches to handling documents are examined and found inadequate, with examples used to demonstrate the deficiencies of less powerful approaches to document preparation.

Descriptive markup also adds value to a document, so descriptive systems not only carry a low authoring overhead but also yield the highest return on the investment. Examples are presented in which the descriptive markup in a document is used to create additional interpretations of the document content, including outline- and index-style views.

The most important benefit of using descriptive markup, however, is that it allows future applications to interpret existing documents in new ways. This is possible because descriptive systems provide a means to identify document content by the role it plays within the document. Elements that are presented in a similar format can still be treated as distinct kinds of element when they are marked as having different types. The presentation format becomes independent of the content identification, allowing new processing to be created based on the roles of elements within documents.

The authors are clearly strong proponents of descriptive markup systems, and indicate that many new applications for scholarly text are waiting to become available if enough authors and publishers move toward using content-oriented systems.
This is a highly attractive view that has received more attention since the article was published, but in many ways it remains a distant vision. Much work remains to be done to make descriptive markup more readily re-usable.

===================================

"Markup Systems and the Future of Scholarly Text Processing"
by Coombs, Renear, and DeRose
SD COOM87

Article Summary by Group I: Kalafut, Muhlenburg, Klein, Fitzgerald

Not long ago, researchers were looking for new ways to compose documents. Some became mere computerized typesetters or programmer-typists, doing the same old thing a little faster. Their problems were that they ignored ongoing computing research that could help, overlooked the fact that "soft" copies are just as important as "hard" copies, and concentrated too much on the presentation of a composition. Then along came descriptive markup, which the article presents as the best imaginable solution.

In the beginning, markup was just written punctuation and presentation, which evolved into electronic forms. Now there are basically six types of markup: punctuational, presentational, procedural, descriptive, referential, and metamarkup. Markup processing usually takes one of three forms: reading by humans (presentational), formatting (procedural), or open-ended processing (descriptive). Markup can be exposed, disguised, concealed, or displayed.

Descriptive markup has many advantages over the other main document markup methods, procedural and presentational, and these make both composition and publishing far more manageable. Descriptive markup is the grand solution for document portability. It also eliminates source file maintenance, for example when a standard style changes. The four alternatives for portability all have drawbacks that descriptive markup easily overcomes. Descriptive markup also facilitates the most important phase of markup: selection.
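The point above about standard style changes can be sketched as follows. This is only a minimal illustration under stated assumptions: the element types (`title`, `para`, `quote`), the sample text, the rule tables, and the `render` helper are all hypothetical and do not come from the article.

```python
# Source document: element types only, no formatting instructions.
# (Hypothetical element types and sample text.)
document = [
    ("title", "Markup Systems"),
    ("para", "Descriptive markup names each element."),
    ("quote", "Markup is not part of the text."),
]

# Style rules live outside the document; each maps an element type
# to a function that formats the element's text.
house_style = {
    "title": lambda text: text.upper(),
    "para": lambda text: text,
    "quote": lambda text: "    " + text,
}


def render(doc, rules):
    """Format every element by looking up the rule for its type."""
    return "\n".join(rules[kind](text) for kind, text in doc)


# A style change touches only the rules; the source document is not edited.
new_style = dict(house_style, quote=lambda text: '"' + text + '"')

before = render(document, house_style)
after = render(document, new_style)
```

In this sketch the quotation is indented under the first rule set and wrapped in quotation marks under the second, while `document` itself is never modified. With procedural or presentational markup, the same change would mean editing every affected passage in the source file.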
In conclusion, descriptive markup is the best solution for manuscript composition and distribution because it allows authors to concentrate on content and structure, facilitates maintenance and portability, and contains the semantics and pragmatics for alternative views and structure-oriented editing.

===================================

MARKUP SYSTEMS AND THE FUTURE OF SCHOLARLY TEXT PROCESSING
Authors: Coombs, Renear, and DeRose

Group 2
Lauren Barton, Martin Falck, Nelson Kile, Carolyn O'Hare, Robert Ryan

Markup tells the reader something about the content or expression of words. When a reader sees markup, it is interpreted into gestures that convey the proper meaning. There are several ways to include markup in a document.

* Punctuational - consists of a closed set of marks that provide syntactic information about written utterances. This markup is relatively complicated, ambiguous, and subject to considerable stylistic variation.
* Presentational - includes horizontal and vertical spacing, folios, page breaks, notes, and ad hoc symbols.
* Procedural - consists of commands indicating how the text should be formatted (e.g., skip a line, indent).
* Descriptive - the author identifies the element types of text tokens. Descriptive markup indicates what the text elements are, or declares that a portion of a text stream is a member of a particular class.
* Referential - refers to entities external to the document and is replaced by those entities during processing.
* Metamarkup - provides a facility for controlling the interpretation of markup and for extending the vocabulary of descriptive markup languages (e.g., macros).

With descriptive markup, adjusting the set of rules establishes a design that will be executed automatically and consistently. If the rules need to change, only the rules change; the file remains intact.

There are three major categories of markup processing:

* Reading. Presentational markup; descriptive markup is moderately well suited.
* Formatting. Procedural markup.
* Open ended (includes formatting). Descriptive markup.

Markup is:

* Exposed when the system shows the markup as it occurs in the source file without performing any special formatting.
* Disguised when it is converted to a special character.
* Concealed when it is not shown at all.
* Displayed when it is shown as a specially formatted representation.

Text editors may show all four types of display.

Any change to the computing environment poses a threat to procedural and presentational markup. Both procedural and presentational markup may present maintenance problems. Descriptive markup allows for greater portability and can easily be converted into SGML. The authors discount other alternatives for portability and recommend conversion to descriptive markup to reduce costs.

* Presentational markup - the relation between the text and the typographical element is arbitrary.
* Procedural markup - very complex.
* Descriptive markup - reduces the process of markup selection to a single step that falls out of element recognition. It also supports the author in focusing on both the structure and content of documents, and it supports composition assistance features such as alternate views of a document.

The authors conclude that descriptive markup is the best approach.

===================================