Weaving Content with Coordination Widgets

Robert B. Allen
College of Information Science and Technology, Drexel University
Philadelphia PA, 19107, USA
[email protected]

Abstract

Text, as the term suggests, is a texture of interwoven threads. The grammatical organization of the words creates a cohesion of concepts. Beyond the words, additional conventions and structures, which we call coordination widgets, have been developed to support the reader’s comprehension and navigation. These additional structures include tables of contents, reference lists, and footnotes. Although they are known to every reader, they are rarely considered as a group. On one hand, the widgets incorporate a high degree of structure; thus, they are related to formal structures for composite hypertexts. On the other hand, they also depend on the semantics of the content. We take a cognitive perspective in describing how these widgets support better comprehension and navigation of texts. We also introduce the notion of templates. These issues could have implications for eReaders that do not capture some well-established techniques for texts. The tools proposed are more complex than most of the suggestions for semantic markup that have been made thus far. Indeed, the tools proposed here could lead to a new generation of environments for interacting with text and other types of content.

INTRODUCTION

Published texts include more than words. They are supported by a range of what we term content coordination widgets. These widgets are conventions and techniques which enrich the interaction with the words. Many of these widgets emerged with the introduction of printing, but we might expect a new generation of tools with interactive services. The emergence of eBooks demonstrates the enduring value of a complete, integrated conceptual presentation even when highly interactive but also highly fragmented hypertexts are so common.
Such information resources have many levels of explicit and implicit structure. While a general framework has been developed for bibliographic metadata (e.g., [Svenonius]), there is no comparable framework for content coordination widgets. A framework for content coordination could have practical applications. The current generation of markup languages and services to support interaction with text is ad hoc. A consistent framework would allow the development of more effective user interaction with text browsers and eReaders. Consider the treatment of footnotes by some eReaders. They are simply formatted as a part of the static page. Scrolling to them and then returning to the text can be very tedious. It would be possible to code interaction with each widget separately, but it should be helpful to develop an overall framework. A general framework might also allow the development of new modes of interaction. Moreover, there are many possibilities for new interactive reading widgets.

In the following section, we consider examples of the content coordination widgets. Then we consider the linking and anchor structures. Finally, we introduce a grid for dimensions of the source models. There are many opportunities for developing a new generation of interactive widgets.

© Copyright R.B. Allen, 2011

CONTENT COORDINATION WIDGETS

Tables of Contents

Chapter and section headings help to orient readers and provide short conceptual previews of the associated text. When extracted from the main body of the document and collected, the headings form a Table of Contents (TOC). While the hierarchical structure of TOCs is readily adapted to hierarchical interactive browsers (e.g., [Egan]), the actual structure is often more complex than a hierarchy. For instance, a quick survey of the TOCs of some chemistry textbooks found that there is often a rich conceptual structure within the hierarchy.
In one case, successive chapters considered the solid, liquid, gaseous, and plasma phases of matter but then went on to other topics. The headings often reflected a workflow that described the steps in the synthesis of a chemical compound. Indeed, some TOCs use additional but ad hoc features to present more nuanced structures within the content.

Structured Abstracts and Plot Summaries

While TOCs capture the formal structure as defined by the author, they do not necessarily capture the salient points that may be most useful for readers who want an overview. Structured abstracts (e.g., [1, Lancaster]) provide such overviews by focusing on the key aspects of the content. Those aspects are generally based on the discourse structures expected for a given genre. Many technical research articles have adopted structured abstracts. Editorial policy determines the features they require and reflects the document structures expected by a specific community. A given community may support several different standards. For instance, for publications dealing with Clinical Practices, the American Dental Association (ADA) requires the following sections: Background, Methods, Results, Conclusions, and Clinical Implications. For Case Descriptions, it requires: Background, Case Description, and Clinical Implications. These required sections in the various types of structured abstracts naturally reflect the approaches of researchers and the values of the communities they represent. In other words, they are outlines of genres and, thus, we call them genre templates. These genre templates imply a structure, but they are not purely structural. As such, they go beyond structural hypertext models and, as we shall describe below, beyond current proposals for standard web formats. Even traditional abstracts which are not formally structured are implicitly structured. That is, they often include much of the same information as the structured abstracts.
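As a rough illustration of how a genre template might be represented, the following minimal Python sketch treats a template as an ordered list of section labels and checks a draft abstract against it. The class name `GenreTemplate` and its methods are hypothetical and chosen only for this example; the section labels are the ADA lists quoted above. It captures only the structural side of a template, not the semantics of the content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenreTemplate:
    """Hypothetical model: an ordered list of sections expected for a genre."""
    name: str
    sections: tuple[str, ...]

    def missing(self, abstract: dict[str, str]) -> list[str]:
        """Return the required sections that are absent or empty."""
        return [s for s in self.sections if not abstract.get(s, "").strip()]

    def in_order(self, labels: list[str]) -> bool:
        """True if the recognized labels follow the template's ordering."""
        positions = [self.sections.index(label)
                     for label in labels if label in self.sections]
        return positions == sorted(positions)


# Section lists taken from the ADA requirements described in the text.
ADA_CLINICAL_PRACTICES = GenreTemplate(
    "ADA Clinical Practices",
    ("Background", "Methods", "Results", "Conclusions", "Clinical Implications"),
)
ADA_CASE_DESCRIPTION = GenreTemplate(
    "ADA Case Description",
    ("Background", "Case Description", "Clinical Implications"),
)

# A draft abstract with placeholder text (illustrative only).
draft = {"Background": "placeholder text", "Case Description": "placeholder text"}
print(ADA_CASE_DESCRIPTION.missing(draft))
```

Unlike a schema viewed as an unordered set of attributes, the template here keeps the order of its sections, which is why `in_order` is included alongside `missing`.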
Even non-technical reports may have typical structures such as: Purpose, Findings, Conclusions, and Recommendations. In addition, plot synopses and summaries share some aspects of structured abstracts. They provide a structured overview of stories, including attributes such as characters and setting.

Citations and the Reference List

Citations are obviously related to simple hypertext links, although their anchors can be complex. Consider an example from [Zhai]:

Example 1: Overexpression of the fusion protein in transgenic mice and rats reproduces the Wld phenotype [12,13],….

The referents of the citations are clear: the two cited articles. From the text, it is also clear what we expect them to show. Consider a second example, this one from [X].

Example 2: In particular, mitochondria are crucial in energy metabolism and as such have been implicated in the aging process by one of the very first theories of aging [2], the rate-of-living theory of aging [3], which suggested that the rate of aging is proportional to the rate of energy metabolism (reviewed in [4]).

Some improvements to navigating citations have been proposed by [Shotton] and [deWaard]. Many systems have been proposed for categorizing the role of citation links. Such role assignments appear to be a combination of the argument logic and the claims of the cited text. These structural models might be effectively coupled with discourse or argumentation models to determine the citation anchor.

The Reference List, when it appears, collects all the cited references. It is a distinct coordination element. Structurally, a Reference List is based on a composite hypertext that could be thought of as linking to both the source text and the cited works. It is often organized alphabetically. It can be helpful as a quick guide to the reader. Interactive or “malleable” versions of the reference list are described below.
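The idea that a Reference List links both to the source text and to the cited works can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `Citation` and `build_reference_list` are hypothetical, the anchors are our reading of Example 2, and the entry strings are placeholders because the cited works are not identified here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    ref_key: str   # key of the cited work, e.g. "2"
    anchor: str    # span of the source text that the citation supports
    position: int  # order of appearance in the source text (assumed field)


def build_reference_list(citations, works, order="alphabetical"):
    """Collect the cited works, keeping back-links to the citing anchors."""
    anchors = defaultdict(list)
    for c in sorted(citations, key=lambda c: c.position):
        anchors[c.ref_key].append(c.anchor)
    keys = list(anchors)  # insertion order equals first-citation order
    if order == "alphabetical":
        keys.sort(key=lambda k: works[k].lower())
    elif order != "citation":
        raise ValueError(f"unknown order: {order!r}")
    return [(k, works[k], anchors[k]) for k in keys]


# Anchors as we read them from Example 2 (an interpretation, not the author's).
citations = [
    Citation("2", "one of the very first theories of aging", 1),
    Citation("3", "the rate-of-living theory of aging", 2),
    Citation("4", "the rate of aging is proportional to the rate of "
                  "energy metabolism", 3),
]
# Placeholder entries; the actual cited works are not given in the text.
works = {"2": "Placeholder entry C", "3": "Placeholder entry A",
         "4": "Placeholder entry B"}

for key, entry, spans in build_reference_list(citations, works):
    print(key, entry, spans)
```

Each returned row points outward to a cited work and back to the anchors that cite it, which is the two-way linking described above. The `order` parameter hints at how a display could switch between orderings.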
Back of the Book Indexes

Unlike most algorithmic indexes, back-of-the-book indexes do not list all the terms that appear in a text [Maniez; Stauber]. Indeed, they may list concepts that are discussed in the text but not mentioned by name. Thus, we can call them concept indexes. They involve selection and insertion of index terms based upon a book’s focus and expectations about what a given reader is likely to be looking for. Back-of-the-book indexes can show how concepts are organized throughout a work.

Additional Coordination Widgets

Considered broadly, coordination widgets are any aspect of the presentation that goes beyond the text itself. Within a text, there may be tables, figures, and other displays that complement the text itself. While these are not usually considered in discourse analysis, they do add to the coherence of the text. These features could be modeled as a composite hypertext. Semantic annotation [X] is analogous to glossaries, though the semantic annotations are machine-readable while the glossaries usually are not. While we have focused on structured coordination widgets such as TOCs and structured abstracts, several additional widgets are unstructured. Footnotes serve two functions. The first is as a format for citations similar to those we have already covered. The second function is to provide an unstructured discussion format for the author (e.g., [Grafton]). Of course, it is common to link to explicit models outside the Work itself. These models range from relatively static (e.g., maps, genealogies) to processes. The presentation of these models may also range from relatively static (e.g., schematics) to interactive (e.g., information visualizations). Mappings: document-to-document mappings, for instance, between works and adaptations of those works. Exploring a network of connected roles and concepts without a single path through the structure. A Preface may include descriptions of the structure. Importantly, these can be indicative.
Colophons describe aspects of the printed edition such as the details of the font used.

A GENERAL FRAMEWORK

A complex information resource such as a book is composed of many interwoven threads. Indeed, the word “text” is closely related to the word “textile”. In the course of ordinary reading, the separate threads are of little direct interest to the reader. However, there are circumstances in which it is useful to extract and highlight them. Toward that end, we take a broad view and propose that an annotation is a link or mapping between an information resource and a source model or system.

Structure of Links and Anchors

Links provide an association between anchors. Links may be one-to-one, one-to-many, many-to-one, or many-to-many. Moreover, multi-headed links (one-to-many, many-to-one, and many-to-many) may have an ordering of the heads. These may all be viewed as some type of (perhaps implicit) linking and as forms of hypertext. The Open Hypertext System (OHS) model [x] is a comprehensive framework for hypertexts. OHS supports many attributes such as multi-headed anchor links. However, OHS is too general. So-called composite hypertexts [Halasz] are combinations of hypertext features and are structurally related to coordination widgets. However, in addition to the structure, the content and use are equally important. At a granular level, the coordination widgets anchor to the discourse units within the texts. The source models (from which the links originate) are highly varied. The links have a direction, but in most cases the links can be useful in either direction. These widgets are closely related to hypertext, and the Open Hypertext Systems (OHS) movement [Reich] provides a general framework for describing them. [van Ossenbruggen] has noted the overlap between OHS and recent directions for the Semantic Web that emphasize role labels.
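The link properties listed above (cardinality, optional ordering of heads, direction, and role labels) can be made concrete with a small Python sketch. It is not the OHS data model; the names `Anchor`, `Link`, `arity`, and `inverse`, as well as the resource identifiers and offsets in the usage example, are hypothetical and serve only to illustrate the distinctions drawn in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Anchor:
    resource: str  # identifier of a document or source model (assumed)
    start: int     # start offset of the anchored region
    end: int       # end offset of the anchored region


@dataclass(frozen=True)
class Link:
    sources: tuple[Anchor, ...]
    targets: tuple[Anchor, ...]
    role: Optional[str] = None  # e.g. a citation role label
    ordered: bool = False       # whether the order of the heads matters

    def __post_init__(self):
        if not self.sources or not self.targets:
            raise ValueError("a link needs at least one anchor on each side")

    def arity(self) -> str:
        """Classify the link as one-to-one, one-to-many, and so on."""
        left = "one" if len(self.sources) == 1 else "many"
        right = "one" if len(self.targets) == 1 else "many"
        return f"{left}-to-{right}"

    def inverse(self) -> "Link":
        """The same association followed in the opposite direction."""
        return Link(self.targets, self.sources, self.role, self.ordered)


# A single citing statement anchored to two cited works, as in Example 1.
statement = Anchor("source-text", 120, 198)
cited = (Anchor("work-12", 0, 0), Anchor("work-13", 0, 0))
citation_link = Link((statement,), cited, role="supports", ordered=True)
print(citation_link.arity())
print(citation_link.inverse().arity())
```

Keeping `inverse` as an explicit operation reflects the observation that links have a direction but are often useful when followed either way.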
The complexity of OHS goes beyond what is needed for the present discussion, but it is sufficient for it. Anchor regions derive from structure based on discourse as described in the previous section. These could be multi-part and thus require multi-headed links. In many cases, the anchors’ scope varies.

Links Anchored in Discourse Relationships

Discourse analysis emphasizes the coherence of texts by breaking the text into units based on their meaning and then considering how those meaningful units combine to create coherence. While the exact nature of the basic units is debated (e.g., [Schauer]), the effects are obvious. We are focused on structure that identifies meaningful conceptual units. Generally, these boundaries are said to be discourse structures. As noted above, citations frequently are anchored with specific statements in the text. So-called “semantic markup” is mostly based on simple named annotation. We want a richer type of description. The rhetorical relationships are particularly clear for scientific publications. De Waard [x] has examined the discourse elements associated with scientific research reports. RST is rarely applied beyond brief passages, but we believe there is a macro-structure. Genre Templates. Macro-units. Broadly speaking, discourse can be any sort of communication, but several approaches have identified specific discourse elements. One well-known system is Swales’ IMRD, which applies to scientific argumentation. Other argumentation systems have widgets that include claims and evidence. Narrative structure may arise from causal relationships. While XML is often used to define structure, that structure does not necessarily follow meaning boundaries. In addition, citations with regions. There are many ways that knowledge of the discourse widgets can be used by the text widgets. For instance, citation types are highly dependent on overall discourse structure. Of course, methodologies are typically found in the methods section.
We could also have metadata from narrative structure and even from a workflow. It is also important to note that discourse widgets do not stand alone but exist in the context of other discourse widgets. They might be assigned metadata, and that metadata might also include an attribute for the role played by a unit of text. Indeed, that metadata tag could be the familiar descriptor for a discourse element. Indeed, it might reflect the aspects included in the structured abstracts. The notion of Template suggests that a set of related concepts must all occur. This can be effective. In some cases, we want a more powerful structural description with conditionals and looping. We encourage greater emphasis on this type of content in developing interactive widgets.

Types and Characteristics of Source Content

Figure 1 lists three main categories of source systems. The first group is reflexive; that is, one part of the document refers to other parts of the document. In the second group, there are links to related materials that are not formally part of the original. The third group has links to knowledge organization systems.

Two dimensions: effect for readers (comprehension/navigation) versus the source of material to be linked. The first dimension refers to the reader’s cognition. The navigation function has sometimes been recognized. By comprehension, we mean that some of the widgets facilitate or extend reading comprehension, while other widgets support a combination of overviews and navigation. The second dimension is the relationship of the content model to the target content. This ranges from integral to the document to collection-oriented tools that lead the user to a particular document. The Integral widgets are part of the work and would generally be considered essential to it. Other widgets are centered on the target work but add supplementary material.
Comprehension ------------------------------------------------ Navigation & Survey
Integral: Figures and Tables; Table of Contents; Back of the Book Index; Abstract; Structured abstract
Supplemental: Appendix; Footnotes (commentary); Glossary; Supplementary data sets; Semantic annotation; Reader’s interpretation or reaction; Citations; Reference list
Formally Related: Adaptation; Translations; Reviews

Figure 1. Grid of the source content used by coordination widgets. The two dimensions reflect the reader’s needs and the relationship to the content.

Some of the assignments in the Figure are debatable. TOCs are internal but could be external. Abstracts may or may not be part of the original work. In addition, traditional abstracts may or may not describe links into the texts. Glossaries have implicit links. Semantic annotations may be seen as machine-readable glossaries. Another step is general-purpose reference works such as gazetteers. While we have focused on widgets most often associated with reading and accessing information in texts, these can provide advance organizers [Ausubel, Bransford]; other related widgets, such as exercises, support learning [Rothkopf].

Mappings across versions

The widgets in Figure 1 deal with the conceptual content, but there are additional types of annotations that relate to the non-conceptual aspects of the document. These are shown in Figure 2. They may be as particular as copy edits or as broad as tagged corpora. While we have focused on coordination widgets that link meaningful units, there are additional links that describe physical structure rather than meaning.

• Linguistic and Stylistic Notes
  o Discourse and Narrative tags
  o Parse trees
  o Copy edits
• Administrative Metadata
  o Rights
  o Preservation
  o Technical
  o Use
• Expression-Level
  o Page numbers
  o Preface
  o Colophon

Figure 2. Several types of non-conceptual annotation structures.

Discussion of RST issues. Some ambiguity with the discourse tags.
They can be used to develop services. We have focused on widgets for individual works, but there are related widgets at the collection level. As shown in Figure 3, guided tours could be thought of as relative to a collection. Metadata assigned from Knowledge Organization systems. There are a few cases of supporting comprehension across the collection. Tools for exploring collections are sometimes called Reader’s Aids although that term could also be applied to

Comprehension -------------------------------------------- Navigation & Survey
Collection/Web Level: Knowledge Organization Systems; Pathfinders and Guided Tours; Bibliographies; Search engines; Citation index; Catalog

Figure 3. Several types of non-conceptual annotation structures.

DISCUSSION

Text is supported by an array of coordination widgets. We need a much broader range of frameworks for organizing and coordinating electronic materials. This paper has not addressed interactive widgets and games. Compositionality of components in abstracts and TOCs. Relationship of individual documents to collections of documents. Structure in complex resources such as newspapers.

Mark-Up Systems

Several markup systems now support advanced text features. XML is purely hierarchical, whereas meaningful units are more general than a tidily organized hierarchy. However, it is worth mentioning work on XML retrieval in which index terms are weighted according to … XML structures for indexing and retrieval. Includes automatic indexing by hierarchical structure. However, in some cases, the discourse structure is not hierarchical. The XML linking package, XLink, incorporates attributes such as allowing simple multi-headed links and assigning role types to the links. More to the point of developing support for the proposed widgets are TEI XML and digital book standards such as EPUB. METS has many hooks as a wrapper for complex information resources. For instance, METS Technical metadata includes usage counts.
Potentially, the Structural Metadata element of METS could be of great use for indicating implicit structures. At first glance, the heading hierarchy of TOCs seems to have a simple structure that could readily be captured by METS StructMaps. To date, most implementations of METS StructMaps focus on the relationship of files rather than conceptual-level structures. Composite features could be created with RDF. This still does not support interactivity as a full programming language would. Schema.org, microformats.org.

Text Object Models, Linked Data, Collection-Level Tools, and Contextual Descriptions

We have emphasized the use of content coordination widgets as applied to complete conceptual units. We consider the text control widgets as adding to the coherence of the text. They are an adjunct to discourse. Note that we still observe the distinction between Work and Collection in Figure 1. This is different from OAI-ORE [X], in which the material is all undifferentiated. OAI-ORE is essentially one type of hypertext data model. In some cases, the notion of a collection of information resources remains useful, but many cases involve resources scattered across the Web. This dramatically changes the notion of a collection, which may instead be described as a context for information search [HRLee]. Descriptions of Works in Wikipedia go beyond traditional plot summaries to include recommended sections. We term these “contextual descriptions”. Is the concept of collection useful at this point? Serials have always been handled as an exception to other works. Moreover, works in other media have somewhat different policies. We have introduced the notion of a Genre Template. Text templates such as TOCs and IMRD are related to the expectations of genres. They differ from schemas, which are unordered lists of attributes. A related notion is that of Discourse Macro-Units.
These are also related to the notion of discourse macrostructure (e.g., [Kintsch]), but that term usually refers to the overall meaning of a work rather than a structure per se. Our Macro-units are also closely related to composites of simple discourse units such as those described by Toulmin [x]; argumentation structures are weakly structured. Another important concept is local discourse or rhetorical structures such as those suggested by RST. Useful for “highly-structured” and “model-oriented scientific research reports” [Allen2007, 2011]. Collection-level widgets point to specific texts, but that is in the context of other parts of the collection.

Interactive Widgets and Interfaces

A new generation of highly interactive widgets may be envisioned. Indeed, interactive versions of several widgets have already been developed or at least proposed. For instance, interactivity can greatly enhance the possible modes of interaction of these coordination widgets. The SuperBook document browser introduced an interactive table of contents in which search hits were posted against the headings that composed the TOC [Egan]. Potentially, indexing could be done the same way. Similarly, index terms could also be interactively posted against structured abstracts. Possibly improve increasingly multimedia models. User-centered indexing [Fidel]. Interactive pages [Sutcliffe]. Several interactive services have been proposed, but several others are possible. Interactivity has the possibility to dramatically change the experience of reading by introducing personalization based on a variety of factors such as local experience with that particular text, but it could also incorporate a broad knowledge of a user’s background. Dynamic personalization of aids. The Reference list itself could be interactive or malleable.
The user could pick between alphabetical order and citation order for the display. Related citations. A range of designs could be explored, starting from simply marking the clusters of conceptually related units within the hierarchical structure. A somewhat deeper revision might shift from displaying a hierarchical TOC to displaying an interactive hypertext graph (cf., [Allen & Acheson]). Text can often be ambiguous, and that ambiguity can sometimes be an advantage. However, this is not always the case. The emphasis on modeling and linking to events has implications for indexing processes [AllenModel]. There could be an advantage in authoring the structures rather than trying to extract them. Static indexes are unlikely to be used in electronic editions, but some of their functionality has been replaced by intra-text hypertext links. While the importance of Work seems to be upheld, the notion of collection is less clear. Highly coordinated maps with every change of scene. Workbenches as collections of related tools. An environment to support readers. This approach should greatly enhance the reader’s experience. Using information visualization and complex structures. There is the possibility for a very rich set of services to surround the readers’ experience. Bi-directional mappings between different documents and source models would be plausible in several cases. These might be implemented with coordinated windows [8], although cross-domain context may present a difficulty for users. Social annotations [Marshall Brush]. It would be helpful to have a standard reference link or an API for storing state with reference resources such as maps.

Personalization

Supporting personalization. Adaptive indexing: indexing for certain applications. Retrofitting text versus starting with a structured model. Standard sets of discourse tags for Works and, indeed, for other objects. User-centered widgets such as user-based indexing [Fidel, AdaptiveHypertext].
Personalization and the notion of a work. The work is the text itself, but the widgets are important supplements. Toward a fully personalized learning environment that deemphasizes complete works, but that is in the unknowable future.

Toward a Comprehensive Standard

Hypermedia may include animated sequences. Publishing standards to facilitate interoperability far beyond the current generation. Annotations to supplement data sets. For instance, the figures could be linked to the text via coordinated windows [Baldanado]. Social links for interacting with others. Making annotations visible to others (e.g., [Marshall’s] public/private distinction). Standards to facilitate personalization.

Conclusion

In addition to their content, most documents incorporate some widgets that allow users to identify, orient, or navigate to components of the document. We have proposed a general framework for describing these widgets. The main goal of this paper is to encourage viewing these activities as a whole related to the processes of reading and learning from content. We believe that will lead to improved design and ultimately better performance. This paper has combined notions from the bibliographic metadata, hypertext, discourse, and Semantic Web communities to understand the complexity of coordination widgets. Duality of building contextual understanding and learning the details in the text. Although we strongly support the distinction between works and mere text objects, we also argue that greater attention needs to be paid to the relationship of text objects to the whole. Greater application to multimedia or hypermedia. These results may also point the way to a unified model of rich interactive scholarly documents. In addition to rich linking, there This is also consistent with model-oriented approaches to scholarly publications [Allen]. Coordinating discourse widgets, theories of reading to describing and supporting access to works.
Multi-models across a broad range of structures. Primacy of model-oriented approaches over linked data. Links defined by parameters within the framework of the OHS, anchors defined as widgets of a discourse system, and sources defined by systems such as those identified in Figure 1. We aren’t focused on individual links but on ensembles of links. Essential attributes of annotations as providing navigation and elaboration. These are annotations but typically not free-form annotations. Scholarly publishing standards that natively support a full range of multimedia applications and simulations along with the interactive multimedia services.

REFERENCES

1. Ad Hoc Working Group for Critical Appraisal of the Medical Literature, 1987, A Proposal for More Informative Abstracts of Clinical Articles, Annals of Internal Medicine, 106, 598-604.
2. Allen, Tables of contents for multimedia presentations.
3. Allen, 2011 DLib.
4. Allen, R.B., & Acheson, J., 2000, Browsing the Structure of Multimedia Stories, ACM/IEEE Conference on Digital Libraries (JCDL), 11-18. DOI: 10.1145/336597.336615
5. Ausubel
6. Belkin, ASK
7. Bransford
8. Conklin, J., & Begeman, M.L., 1989, gIBIS: A Tool for all Reasons. Journal of the American Society for Information Science, 40, 200–213.
9. de Waard, A., Breure, L., Kircz, J.G., & van Oostendorp, H., 2006, Modeling rhetoric in scientific publications. International Conference on Multidisciplinary Information Sciences and Technologies, InSciT2006, Merida, Spain.
10. Egan, D.E., Remde, J.R., Gomez, L.M., Landauer, T.K., Eberhardt, J., & Lochbaum, C.C., 1989, Formative Design Evaluation of Superbook. ACM Transactions on Information Systems (ACM TOIS), 7, 30-57. DOI: 10.1145/64789.64790
11. Grafton, A., 1999, The Footnote: A Curious History, Harvard University Press, Cambridge MA.
12. Grosz, B., & Sidner, C., 1986, Attention, Intentions, and the Structure of Discourse. Computational Linguistics, 12(3), 175–204.
13.
Halasz, F.G., 2001, Reflections on NoteCards: Seven Issues for the Next Generation of Hypermedia Systems. ACM Journal of Computer Documentation, 25(3), 71-87. DOI: 10.1145/48511.48514
14. Hodge, G., 2000, Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. CLIR, Washington DC.
15. Kintsch, W., Comprehension.
16. IFLA Study Group on the Functional Requirements for Bibliographic Records (FRBR), 2008, Functional Requirements for Bibliographic Records, http://www.ifla.org/files/cataloguing/frbr/frbr_2008.pdf
17. Lagoze, C., & Van de Sompel, H., Open Archives Initiative Object Reuse and Exchange. http://www.openarchives.org/ore/1.0/primer
18. Lancaster, F.W., 2003, Indexing and Abstracting in Theory and Practice, 3rd ed. Univ. of Illinois Press, Champaign IL.
19. Lee, H.-L., 2000, What is a collection? Journal of the American Society for Information Science, 51, 1106-1113.
20. Library of Congress, METS: Metadata Encoding and Transmission Standard, http://www.loc.gov/standards/mets/
21. Maniez, J., & Maniez, D., 2009, Concevoir l'index d'un livre : histoire, actualité, perspectives. ADBS Éditions, Paris.
22. Mann, W.C., & Thompson, S.A., 1988, Rhetorical Structure Theory: Toward a Functional Theory of Text Organization. Text, 8(3), 243-281.
23. Marshall, C.C., 1998, Toward an Ecology of Hypertext Annotation. ACM Hypertext 98, 40-49. DOI: 10.1145/276627.276632
24. Marshall, C.C., & Brush, A.J.B., 2002, From Personal to Shared Annotations. ACM SIGCHI Extended Abstracts. DOI
25. Oren, E., Möller, K.H., Scerri, S., Handschuh, S., & Sintek, M., 2006, What are semantic annotations? Technical Report, DERI Galway.
26. Polanyi, L., 1989, Telling the American Story: A Structural and Cultural Analysis of Conversational Storytelling, MIT Press, Cambridge MA.
27.
Reich, S., Wiil, U.K., Nurnberg, P.J., Davis, H.C., Grønbæk, K., Anderson, K.M., Millard, D.E., & Haake, J.M., 1999, Addressing Interoperability in Open Hypermedia: The Design of the Open Hypermedia Protocol. New Review of Hypermedia and Multimedia.
28. Rimrott, A., The Discourse Structure of Research Article Abstracts – A Rhetorical Structure Theory (RST) Analysis.
29. Rothkopf, E.Z., 1970, The Concept of Mathemagenic Activities. Review of Educational Research, 40(3).
30. OAI-ORE
31. Schauer, H., 2000, From Elementary Discourse Units to Complex Ones. Proceedings of the SIGDIAL Workshop on Discourse and Dialogue.
32. Shotton, D.
33. Stauber, D.M., Facing the Text
34. Sutcliffe, ACM MM 1995.
35. Svenonius, E., 2000, The Intellectual Foundation of Information Organization. MIT Press, Cambridge MA.
36. van Ossenbruggen, J., Hardman, L., & Rutledge, L., 2002, Hypermedia and the Semantic Web: A Research Agenda, Journal of Digital Information, 3(1). http://journals.tdl.org/jodi/article/viewArticle/78/77