O917-01.qxd 7/24/03 4:12 PM Page 1

CHAPTER 1
Semantics: A Trillion-Dollar Cottage Industry

The U.S. economy is perched precariously on top of some 200 billion lines of aging legacy mainframe code [1] and a comparable amount of newer, but no less endangered, code on various flavors of servers and PCs. This represents a $3 trillion investment [2], most of which will need to be replaced over the next decade, at a price more likely to hit $10 trillion. This is pretty much business as usual, except for two things:

• A large percentage of this cost, perhaps as much as half, is avoidable.
• The approach we take to this next round of replacement will determine how much of this investment really will be an “investment” that will carry forward to subsequent generations of technology.

[1] Rekha Balu, “(Re)Writing Code,” Fast Company, April 2001, pp. 181–189.
[2] Paul Strassmann, “End Build-and-Junk,” Computerworld, July 5, 1999. Available at www.strassmann.com/pubs/cw/end-junk.shtml.

And the technology on which the realization of these benefits hinges is not really a technology at all. It is a 2500-year-old branch of philosophy, made suddenly relevant by a confluence of developments: semantics.

Consider the following:

• The Mars Climate Orbiter was lost on arrival at Mars, a victim not of a technical problem but of a semantic misunderstanding about the units of measure used in the thrust calculations.
• Between half and three quarters of the $300 billion spent annually on systems integration is spent resolving semantic issues.
• The entire Y2K adventure was two semantic problems piled on top of each other. The first was the simple problem of determining whether “01” meant “1901” or “2001.” The second was that the stewards of many of the affected systems did not understand their applications in enough detail to predict what would happen if they were altered.
• The most promising technologies currently offered up to solve our application development and implementation problems all share a foundational reliance on semantics: Enterprise Application Integration (EAI), XML, Business Rules, Web Services, Collaboration, and of course the Semantic Web.

Perhaps this is enough to whet your appetite, and maybe about now you are wondering: “Where can I buy some semantics?” or “How do I ‘do’ semantics?” or “Can I implement semantics in my organization?” But that’s not the nature of semantics. Semantics is a discipline you apply, not a technology you buy.

Sidebar: Monsieur Jourdain, in Jean Baptiste Molière’s play The Bourgeois Gentleman:

“And when one speaks, what is that?”
“That is prose, Monsieur.”
“What! When I say, ‘Nicole, bring me my slippers, and give me my nightcap’; is that prose?”
“Yes, Monsieur.”
“Well, well, well! To think that for more than forty years I have been speaking prose, and didn’t know a thing about it. I am very much obliged to you for having taught me this.”

Like Monsieur Jourdain in the accompanying sidebar, I trust most software developers will be quite pleased to find they have been applying semantics their entire careers. Maybe you haven’t been intentional or rigorous about it, but in order to get anything at all done in the world of software you have had to deal with semantics.

In this book, we look at every aspect of business systems anew. We also put semantics under the microscope to find out what it is composed of and how that might guide our further investigations. And we look at our applications and our development technologies from the point of view of semantics, to see how that changes our perceptions.

Before we go any further, let’s get this out of the way:

Semantics is the study of meaning.
Semantics is often defined as the study of the meaning of words, but we are going to take the broader definition here, allowing for the possibility that meaning resides in something other than just words. Ultimately, the relevance and success of our application systems rest on what the symbols that we are manipulating inside the computer really mean in the “real world.” What matters is not only what they mean, but also whether the people and other computer programs that deal with the presented information understand and agree with the meaning the system implies.

The Semantic Era of Information Systems

Most of what we had thought were the hard problems of computer science and business system development have been solved. We know how to write efficient algorithms. We know the most effective ways to process and store data. We’ve solved the problems of getting diverse computer platforms to interoperate. We routinely store terabytes (trillions of bytes) of data in data warehouses. The average home has more processing power at its disposal than the largest corporation of just a generation ago. We’ve connected nearly a billion devices to a single gigantic Internet.

What we’re left with, and what I believe will occupy us for most of the next decade or two, are some problems that don’t lend themselves to quite as mechanical a resolution. We have to determine what systems we really want to build. We have to find a way to determine which parts of a system need to be made flexible for future change and which are likely to be stable for a long time. We need a way to understand the systems we already have before we attempt to change them. We need a way to communicate with trading partners without a long burn-in period. And above all else we need a way for computers to help us with some of the processes that up until now we have thought of as being in the exclusive realm of the human: interpretation, negotiation, and reasoning.
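
One of the needs listed above, understanding the systems we already have before we change them, can be made concrete with the two-digit year from the Y2K example. The sketch below is illustrative only: both interpretation rules, their function names, and the pivot value of 50 are assumptions invented for the example, not rules taken from any particular system. It shows two programs reading the same stored symbol and disagreeing about what it means.

```python
# Illustrative only: both rules below are hypothetical, not taken from any real system.

def expand_year_fixed_century(two_digit: str) -> int:
    """Assumed legacy rule: every two-digit year belongs to the 1900s."""
    return 1900 + int(two_digit)


def expand_year_windowed(two_digit: str, pivot: int = 50) -> int:
    """Assumed windowing rule: values below the pivot mean 20xx, the rest 19xx."""
    value = int(two_digit)
    return 2000 + value if value < pivot else 1900 + value


stored = "01"  # the symbol exactly as it sits in the record
readings = {
    "fixed_century": expand_year_fixed_century(stored),
    "windowed": expand_year_windowed(stored),
}
print(readings)  # {'fixed_century': 1901, 'windowed': 2001}
```

Nothing in the stored symbol settles which reading is right. That knowledge lives in the surrounding system and in the people who maintain it.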
Scratch the surface on any of these issues and you’re into semantics. Indeed, for many of these problems, once the semantic issues are resolved, the remaining technical problems are routine.

No period of time is exclusively focused on one issue, but there are periods when a certain issue rises to the top as the one on which progress will be marked. In the 1980s it was application development: We had an incredible appetite to build computerized versions of all our manual processes. In the early 1990s it was user interfaces: What could we do to make these systems easier to learn and use? Later it was interconnections: If we could just overcome the barriers to getting our customers and supply chains, to say nothing of our internal systems, hooked up, we’d be able to move forward. Currently the top-of-mind issue may be security. But a groundswell is developing that suggests an impending sea change toward a semantic focus that may last a fair while.

This book is meant to be your guide for taking advantage of this shift, at a minimum to avoid overinvestment in projects, technologies, and approaches that are unlikely to stand up to the changes. But for many of you this will be the opportunity to vault ahead of your competitors, either corporately or individually. Let’s spend a minute discussing how this book can help with that.

The Plan of this Book

The first third of this book (Chapters 1 through 5) is descriptive. It steeps you in what semantics is and explains why something so seemingly simple can be so complex. We deal briefly with the history of semantics and some of the closely related fields, to familiarize you with this rich subject.
To make sure that you are clear about which aspects of our semantic conundrum were created by our systems and which were there before computers, we start the investigation of semantics in business systems before the arrival of computers. We then follow the progression through to the present, looking along the way at some of the areas that have used semantics the most to date: data modeling and metadata development.

The second third of the book (Chapters 6 through 10) is prescriptive and covers approaches and methodologies to uncover and make more explicit the semantics that are already implicit in your business and your business systems. This section is built for practitioners who wish to suffuse what they currently do with techniques and approaches that will raise the level of semantic awareness in all their system-related activities. As such we will cover the role of interpretation in semantics, as well as ways to elicit, record, and convey a more complete semantic understanding of the systems and processes.

The last third of the book (Chapters 11 through 15) is subscriptive in that it deals with relatively new technologies and approaches, some or all of which you are likely to be subscribing to in the future and each of which has a semantic twist to it. The chapter on XML deals with getting maximum value out of the tags, which have the potential to carry semantic information. The EAI chapter deals with using the study of semantics to overcome the single largest cause of integration difficulty: late discovery of semantic incongruities. To prevent Web Services from re-creating the tangle of point-to-point connections that characterize so many integration efforts, we describe a semantically inspired approach to their adoption. Chapter 14 discusses the Semantic Web, the follow-on project to the World Wide Web.
Fortunately, we don’t need to explain the semantic aspect of it, but we do cover some of the less obvious technologies that are being promoted along with the Semantic Web, as well as a scenario that should be helpful in visualizing how the Semantic Web will be used.

The book wraps up with a short chapter on getting started in your semantic endeavors, and two appendices: one a set of annotated resources for those who would like to pursue this further, and the other a glossary of the many arcane terms that this subject involves.

A Brief History of Semantics

I’ll make this brief, but I do believe there are some developments in the long history of semantics that will still be relevant in the twenty-first century. There are some philosophical arguments that we must be aware of, or we can waste considerable time. Figure 1.1 outlines some of the key developments in the history of semantics.

[FIGURE 1.1 Key developments in the history of semantics. Timeline labels: Spoken Language, Written Language, Ancient Greece, Enlightenment, Pragmatism, Linguistics, Artificial Intelligence. Dates marked on the axis: 700,000 BC, 20,000 BC, 400 BC, 1700 AD, 1870 AD, 1930 AD, 1960 AD.]

For our purposes, some of the key developments included the following:

• Spoken language: Most people rank the use of a spoken language with the development of tools as the defining event that separated our ancestors from the rest of the primate family. Semantically, early man had to make a giant leap from screaming and pointing to the use of abstract sounds to represent things that were not in the immediate environment.
• Written language: The advent of writing raised the bar considerably. Tone and gestures were no longer available as adjuncts to aid with the communication of meaning. Perhaps the most important development was the ability to communicate with people who were not present. Syntax and grammar gradually developed as writing became more formalized.
• Ancient Greece: Self-reflective inquiry into the meaning that our language carries had to wait until the Golden Age of Greece to be articulated. We don’t know much about Socrates’ formal position on semantics, other than that his famous Socratic method was mostly aimed at finding deeper meaning in thoughts, words, and deeds. Plato’s forms are a good representation of his take on semantics. He believed that we infer knowledge of the perfect forms (for example, a circle) from the less than perfect examples we come in contact with (round things). His metaphor of the cave concerns how we can make inferences only indirectly about the essence of things. Aristotle’s wide-ranging contributions included a great deal on classification and the establishment of identity, both central concerns for semantics. His syllogisms form the basis of how we can infer knowledge of a particular item, once we ascribe it to a type.
• The Enlightenment: The semantic embers burned dimly through the Middle Ages, and even the Renaissance, with its advances in many areas, saw little new work on semantics. Sir Francis Bacon, Sir Isaac Newton, and René Descartes shifted the semantic debate to focus on what could be observed and verified experimentally. A series of later Enlightenment thinkers, Empiricists such as David Hume, Thomas Reid, John Locke, and George “If a tree falls in a forest” Berkeley, debated the role of the human observer as establishing context in a world otherwise devoid of meaning.
• Pragmatism: Charles Peirce was responsible for several early, thought-provoking, high-level conceptual ontologies and for a formal approach to logic applied to semantics. William James, another pragmatist, brought us some of the concepts of verification and the belief that nature is to be understood deductively.
• Linguistics: By comparatively investigating human languages, and especially by anthropologically studying the languages of cultures that have not been exposed to mainstream languages, we have learned a great deal about which aspects of language are likely innate and which are a product of culture. Some of the notable contributors included Alfred Korzybski, Noam Chomsky, Ludwig Wittgenstein, Eleanor Rosch, and George Lakoff, who, although they were not all purely in the linguistic field, all contributed greatly to the twentieth century’s advances in this field. In particular, Rosch and Lakoff have contributed some of the seminal work on what constitutes a category or a type, a topic that those of us in the business of information systems use constantly with little understanding of what we are describing.
• Artificial intelligence (AI): The AI community has contributed many subfields to this pursuit, including the formalization of ontologies (organization of meaning of terms), inferences (how we deduce new information from presented information), and interpretations (for example, how a computer system can be built to interpret spoken English).

This brings us more or less up to the present. Yes, I’ve slighted some groups or individuals, but I wanted to convey as much of the flavor of the long history of the subject as possible without becoming tedious.

Throughout this rich history, people have been refining fields of knowledge, primarily within the domain of philosophy, specialized to study various aspects of the way we understand our place in the cosmos. In the next section we introduce some of these fields of study as they relate to semantics.

Putting Semantics in its Place

Semantics is not a stand-alone discipline; it is interlocked with various other areas of study that borrow from it, as it borrows from them. If you decide to pursue this study further, Figure 1.2 should be a helpful roadmap or at least provide some idea of where the major boundaries are.
Semantics is about meaning, and about distinguishing things that are close in meaning from each other. As such, we should spend a moment clarifying semantics by distinguishing it from several related terms.

[FIGURE 1.2 Semantics in relationship to other branches of metaphysics. Labels shown: Metaphysics, Epistemology, Ontology, Phenomenology, Linguistics, Semiotics, Cosmology, Philosophical Theology, Mereology, Semantics, Syntax, Pragmatics.]

• Metaphysics: Metaphysics attempts to explain the fundamental nature of everything, in particular the relationship of mind to matter. This is the more traditional definition and is not to be confused with many popular definitions that deal with occultism and mysticism.
• Epistemology: Epistemology is the branch of philosophy that studies the nature of knowledge. It is more concerned with how we know things than with what things mean.
• Mereology: You may not think there could be a branch of study devoted to the relationship of parts to wholes, but there is and this is it. The relationship to semantics is a bit complex. On the one hand, mereology informs us whether we are attempting to understand the meaning of something in its entirety or whether understanding its constituent parts is sufficient. On the other hand, we need to apply semantics to the many mereological distinctions to understand what it means to include something, be part of something, or contain something.
• Phenomenology: Phenomenology is a philosophy based on the belief that reality is composed of objects and events as they are perceived by a human mind. The Sophists believed that “man is the measure of all things” and that reality is as we perceive it to be. “Idealism,” the belief that the only real world is the “ideal” world and that the physical world is constantly changing, is a form of phenomenology.
• Linguistics: Linguistics is the study of language; it is generally the broader concept and includes semiotics. Linguistics also covers many other disciplines not related here, such as the study of sounds.
• Ontology: Ontology is a branch of metaphysics that deals with structures of systems. Currently, it is associated with organization and classification of knowledge. It is closely related to semantics, the primary distinction being that ontology concerns itself with the organization of knowledge once you know what it means. Semantics concerns itself more directly with what something means.
• Semiotics: Semiotics is the study of signs and symbols as used in language. It is a broader study than just the study of meaning in that it incorporates syntax, semantics, and pragmatics.
• Syntax: Syntax as a philosophical study is concerned with first-order logic, or how to construct very basic grammars. It forms the basis for formal semantics.
• Pragmatics: Pragmatics is a branch of semiotics concerned with the relationship between language (or signs) and the people using them. How does social context interact with meaning? The word pragmatic is often used to mean practical. This is an important body of work relative to semantics, especially as we come to apply semantics in a predominantly social context (business).
• Cosmology: Cosmology is a subdiscipline of metaphysics that concerns itself with the nature of being. It is concerned with how the universe works, not with what our terms mean. It has come to be associated more with astronomy of late. Relative to semantics, it asks “Why?,” whereas semantics asks “What?”
• Philosophical theology: Philosophical theology is the branch of metaphysics that deals with the relationship of a deity relative to the phenomenology of the world.
It has historically been a trump card in the discussion of semantics, in that the things we deal with in semantics could be construed to have a meaning not available to us but only to a divine creator.

I hope that this overview is useful in describing a few of the other fields that have been closely related to semantics over its long history.

A Semantic Solution to a Semantic Problem

To get us started, I’ve outlined a sketch that says, in effect: We have trillions of dollars’ worth of business software installed and in use. It is obsolete, or soon to become obsolete, and we are going to have to replace it. I make the claim that much of the complexity of these systems has its roots in semantics, as do most of the newer technologies with which we are now presented. And I further claim that a systematic study of the application of semantics to business systems is our best hope for the future.

But I haven’t really made an airtight case for these claims. That’s what the rest of the book is about.

I could have opened with the George Jetson-style world of the future where your refrigerator not only talks with your thermostat, but they have meaningful conversations. And your day timer understands the office politics of staff scheduling. But you’re not likely to buy that “if only” technological utopia. Instead, I’d rather appeal to that side of you that knows the current state of business systems is a deplorable mess, many times more complex than it needs to be, and yet is still not up to the tasks we have in store for it. You suspect that things could be much better than they are now. You’re eager to find out what to do to make things better.

We’ll get there, but before we do, let’s take a moment to understand how we built this semantic cacophony.