Past – Present – Sound. On Auralization as Augmented Reality ∗

Christian Kassung
Institute for Cultural History and Theory
Humboldt-Universität zu Berlin
Unter den Linden 6
D–10099 Berlin

ABSTRACT
Digital reconstructions of historical spaces have become a widespread tool within archaeology. However, these models seek to convey almost exclusively a visual representation of these past spaces. This dominance of the visual makes us forget that past as well as present spaces are characterized by a multitude of sensory experiences. Political speeches, bustling markets, or religious processions affected all human senses. Hence, the need to reconstruct the aural dimension, namely sound, in addition to purely visual models is obvious. The talk will firstly demonstrate technical solutions for virtual models of soundscapes like the ancient Forum Romanum. Secondly, it will discuss how these reconstructions can be used to better understand the multivalent experience and the acoustic function of these spaces. At the Forum Romanum, different venues were designated for addressing large crowds, and these spaces underwent constant historical transformations, especially in the period of transition from the late Republic to the early Empire. Our virtual models make it possible to analyze and reconstruct the acoustics of these different speaker’s platforms for the first time. Being able to listen to a speech of Cicero at different spots within the crowd, one can best hear how the auditory characteristics of these spaces can explain their manifold functional and structural changes. By comparing different simulated scenarios or acoustically augmented realities, it is possible to gain an understanding of how these spaces were designed for specific acoustic experiences, as well as to identify obstructions in these functional ensembles.
Audiovisual reconstructions of historical spaces therefore prove to be not only an excellent tool for illustration, but also for raising new questions. Through this latter aspect, augmented reality becomes a valuable instrument of historical research.

∗ All research and results presented here are based on collective work and disciplinary contributions by the team members of the project “Analog Storage Media. Auralizations of Archaeological Spaces”, namely Prof. Susanne Muth, Prof. Stefan Weinzierl, Erika Holter, Una Ulrike Schäfer, Christoph Böhm and Sebastian Schwesinger.

ICCS ’16 October 25–28, 2016, Windhoek, Namibia. ACM ISBN 978-1-4503-2138-9. DOI: 10.1145/1235

CCS Concepts
• Applied computing → Computer-aided design; Sound and music computing;

Keywords
Auralization; augmented reality; archaeology

1. SOUND AND SPACE
In his posthumously published work “On Man” (“Traité de l’Homme”) of 1677, the French philosopher, mathematician and physicist René Descartes used the following depiction to explain seeing (cf. fig. 1): We recognize a middle-aged, barefooted man with a goatee, holding two sticks in his hands. These sticks form a triangle whose vertex touches a tree. Or, to hear it in Descartes’ own words:

Notice also that if two hands f and g each hold sticks i and h with which they touch the object K, then even though the soul is otherwise ignorant of the length of the sticks, nevertheless, because it can tell the distance between the points f and g, and the sizes of the angles fgh and gfi, it will be able to tell, as if by a natural geometry, where the object K is.1

Now, what we cannot see in the illustration is the fact that the man is blind. Descartes conceives vision “in terms of analogies to the senses of touch.”2 It is the metaphor of the blind man that constitutes an epistemology of a tactile geometry in which vision operates by touching the outer world with two more or less material sticks.
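The “natural geometry” of the two sticks corresponds to what is now usually called triangulation: a known baseline and two angles fix the position of the touched object. The following is only a minimal sketch of that calculation, not anything taken from Descartes or from the project; the function name `locate_object`, the coordinate convention (f at the origin, g on the x-axis) and the example values are assumptions made for illustration.

```python
import math


def locate_object(baseline, angle_f, angle_g):
    """Locate the object K from the distance f-g and the angles at f and g.

    Assumed convention: f sits at (0, 0), g at (baseline, 0); both angles
    are given in radians and measured between the baseline and the stick.
    """
    angle_k = math.pi - angle_f - angle_g  # angles of a triangle sum to pi
    if angle_f <= 0 or angle_g <= 0 or angle_k <= 0:
        raise ValueError("the two sticks do not meet in front of the hands")
    # Law of sines: |fK| / sin(angle_g) = baseline / sin(angle_k)
    distance_from_f = baseline * math.sin(angle_g) / math.sin(angle_k)
    return (distance_from_f * math.cos(angle_f),
            distance_from_f * math.sin(angle_f))


# Hands 0.5 m apart, both sticks inclined at 80 degrees to the baseline.
x, y = locate_object(0.5, math.radians(80), math.radians(80))
print(f"K lies at x = {x:.2f} m, y = {y:.2f} m")
```

Note that the lengths of the sticks never enter the calculation, which is exactly the point of the quoted passage.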
However, the crucial point here is that there are two sticks, explaining why humans have two eyes and two ears.3 For Descartes, both seeing and hearing are like exploring the outer world by scanning its surfaces.

Figure 1: R. Descartes 1677: Model of Vision.
Figure 2: G. B. Piranesi 1748: Vedute di Roma: Campo Vaccino (Forum Romanum).

1 Descartes 1998: 133–134.
2 Crary 1990: 59.
3 Cf. Mach 1896: 83–84 and von Hornborstel 1923: 64.

Leaving aside this historical scene of the early modern period, we can state a close epistemic relationship between seeing and hearing, linked by an active and bidirectional signal process. The sound or light that we perceive comes only in first approximation directly from the seen or heard object. In the case of sound, only 10 % of the signal’s acoustic energy propagates without any reflections or refractions into the ear of the listener. In fact, the wave fields are much too complicated to allow a clear distinction between a sender and a receiver. Or, to put it in more general terms: Both seeing and hearing are modes of perception that are active and passive at the same time. We look at certain things while others attract our attention unconsciously. We turn our head automatically because we have heard something behind us. And we hear a dog barking behind a wall although we cannot see it.

Furthermore, the predominance of the visual and of visual media in the last two hundred years has made us forget that living space is strongly shaped by sounds. When we think about the structure of acoustic space, we primarily consider sound as an intentionally used element. Especially in public spaces, functional sounds like car horns, bells or jingles inform us or warn us of certain events. Signals that result from certain actions, e.g. the slamming of a door or steps on the stairs, play a similar role: They help us to navigate through our daily routines.
Functional signals act like a direct command, making us hesitate, rush, repeat, stop, or do anything else. However, soundscapes are structured by more acoustic information than just these more or less intentionally designed sound signals. Consider a person walking down a street. With every step, the soundscape changes and thus provides a wealth of information. And it is shaped not only by material parameters like the surrounding architecture but also by soft ones like wind, rain, emotions, education, or knowledge. To make a long story short: Our behaviour is to a large extent a result of the manifold acoustic interactions between us and our environment.

Now, switching to historical spaces, it seems surprising that the most famous public spaces in antiquity, such as the Forum Romanum in Rome or the Agora in Athens, have been investigated primarily with regard to their visual and symbolic function. Individuals or groups display and experience their collective or personal identity or status by decorative architectural styles and ornamental elements, by the formation of mosaics, or by the dramaturgy of views and lines of sight. However, this traditional archaeological approach ignores that, exactly like today, these spaces were characterized by a plurality of multisensorial experiences like bustling markets, civic assemblies, public speeches and law courts, processions and festivals.

The central aim of our project is to reconstruct this multisensorial dimension of ancient spaces and especially to evaluate the functional plausibility of different acoustic scenarios. To achieve this, we established a research group at the Cluster of Excellence “Image Knowledge Gestaltung” at the Humboldt-Universität zu Berlin.
In our group, Professors Christian Kassung from the Department of Cultural History and Theory and Susanne Muth from the Department of Classical Archaeology, both at Humboldt-Universität zu Berlin, teamed up with Professor Stefan Weinzierl from the Technische Universität Berlin’s Department of Audio Communication.4 In the following paper we will present a case study on speech comprehension during public addresses on the Forum Romanum in the Late Republican period.

4 Cf. https://www.kulturtechnik.hu-berlin.de/de/content/analogspeicher-ii-auralisierung-archaologischer-raume/.

2. MEDIA OF ARCHAEOLOGY
Casually speaking, one could say that the media history of archaeology dates back at least to Piranesi’s documentations of ancient buildings and ruins in the 18th century. However, when looking at these pictures one can easily discern their confusing, non-objective perspective. The observer looks at tiny and isolated human beings spread across a vast landscape of almost disturbing relics. Comparing these interpretations of the past to media technologies like photographs, which have been and are used extensively at excavations, the difference between “symbolic” and “real” representation becomes obvious. What is meant by this dichotomy?

When we walk across the Forum Romanum as visitors to the historical site, our internal simulation draws on knowledge from history classes, image reservoirs from brochures, books and museum objects, habitus from movies or comics, etc. All that we come up with is heavily mingled with our culturally acquired representations, which in turn depend heavily on their techno-media formats and our sensory “bio-media”. We cannot enter a historical site without augmenting the physical structures we observe with pictures, knowledge, or symbolic significations from various, partially unconscious sources.
On the other hand there is the “real”, the physical structures of the relics that cultural heritage initiatives try to save from decay.5 As can be seen very prominently in the Acropolis Restoration Project that began in 1975, today’s societies invest large amounts of energy to preserve material structures in order to retain a certain kind of historical reality.

Now, the crucial point is that instead of artificially trying to uphold this dichotomy one could bring the real and the symbolic into even closer contact. We would like to argue that the emergence of media technologies has the potential to bridge this gap. Instead of solely conserving historical reality in situ, the radical alternative that media technologies provide is to treat the real as symbolic.6 Today’s media, and of course especially the computer, turn this interaction between the real and the symbolic into a new and extremely productive relationship. Computer simulations have gained acceptance in scientific knowledge production. At CERN, you will hardly find any experiment in the traditional sense of measuring something that somebody calls reality. Instead, computer simulations create experimental environments that are likewise being measured by computer simulations.

Or, to give a historical example: What did Julius Caesar look like? What are the sources of our knowledge? Ancient marble busts? Old coins? Film stars like Louis Calhern? Cartoonists like Albert Uderzo? For this example one could argue that we can clearly distinguish between fictional sources and historical facts. But our argument is that you cannot fully trust ancient media like coins or busts either. And this suspicion holds true as well for every fallen column that has been built up again. Surely, there is a continuum of trustworthiness, but something like a self-contained historical reality with corresponding documents simply doesn’t exist.
Hence, the idea of a clean, depopulated and white ancient past gave way to approaches that consider the usage of ancient structures as important. In this sense, blurring the clear boundaries between fiction and fact prepared the ground for new methodologies that asymptotically approach these material structures. To this end, any archaeological model is a form of augmented reality. In our project, we take architecture as a medium for sound, with very specific functional requirements. Thus reconstructing the architecture of the Forum Romanum as an analogue storage medium allows us to draw conclusions on its usage and suitability for specific aural occurrences.

5 Cf. Kassung/Schwesinger 2016.
6 Cf. Kittler 1993.

3. VISUAL RECONSTRUCTION
The first step towards any reconstruction of what a historical “user” of such an environment might have seen, heard or in other ways experienced is the digital simulation of the built, architectural space. Historical simulations in particular require a digital model supported by as much scientific evidence as possible. Our research is therefore based on the digital reconstruction of this ancient public space, provided by “digitales forum romanum”, a project led by Prof. Susanne Muth.7

Figure 3: E. Holter, S. Muth 2016: View of the Forum Romanum during the Late Republic.

The goal of the project “digitales forum romanum” is to create a diachronic digital reconstruction of the Forum Romanum from the Archaic period until Late Antiquity and the Early Middle Ages. So far, seven different phases dating to 200 B. C., 100 B. C., 14 A. D., 96 A. D., 150 A. D., 210 A. D., and 310 A. D. have been reconstructed. This reconstruction should not be regarded as a simple visualisation, but as a tool for further research. In keeping with this, all available evidence (archaeological, literary, architectural) has been reviewed and analysed for the different structures on the Forum, considering very carefully their reliability.
Now, the case study of this paper focuses on the Forum Romanum as it appeared in the Late Republican period (cf. fig. 3). Since the situation in 200 B. C. is not fundamentally different from that of the Early and Middle Republic, we can start by taking a look at the architectural reconstruction of the earliest location for public speeches and assemblies. This is the Comitium, the area in front of the senate house, the Curia. Together with the very first speaker’s platform on the Forum, called the Rostra, the Curia and the Comitium constituted the architectural complex for the decision-making assembly of all Roman citizens. At the latest, the beginnings of this area as a functionally developed space can be traced to the foundation of the Early Republic and its corresponding Republican political structures (cf. situation 1). Speaking from this first position of a built platform, the politicians were able to address the citizens gathered in the Comitium in front of the Curia.

This situation changed for the first time in the middle of the 2nd century B. C., when the speakers on the platform turned their backs on the Comitium in order to face the crowds assembled on the other side of the Rostra, in the Forum square. The gathering place therefore moved from the Comitium to the center of the Forum (cf. situation 2). There are no architectural traces or remains of this inversion, which is solely documented in the literary tradition.

7 Cf. http://www.digitales-forum-romanum.de/.

Yet another area of the Forum became increasingly important as a place of public assembly in the course of the 2nd century B. C. On the opposite side of the Forum, the Temple of Castor was rebuilt in the early 2nd century, and in the course of this rebuilding a speaker’s platform was attached to the temple podium.
The temple podium could possibly have been used earlier for this purpose, but in any case, the reconstruction had become necessary because the architecture no longer fulfilled the requirements of this new function (cf. situation 3). Literary sources, however, first mention speeches and public assemblies in front of the Temple by the mid-2nd century B. C. In our analyses, we use the development stage of the temple under Lucius Caecilius Metellus in the late 2nd century.

According to present interpretations of the Forum Romanum, the architecturally unchanged situation of the Comitium-Curia complex from its first construction until the 1st century B. C. reflects the collective identity of an enduring Republican tradition. And the Temple of Castor is read as a victory monument for the Roman citizens. Its original construction commemorated the victory over the Latins, while it later became a symbol of patrician identity. Whatever the case, interpretations focusing on the symbolic nature of architecture fail to take into account that architecture serves, first and foremost, a specific functional purpose.8 The speaker’s platforms were intended to provide an ideal space for giving speeches to public assemblies, raised up for optimal view and comprehension of the speaker. The question is how well they did this.

From the very beginning of the Republican period, political communication in public spaces was a central aspect of Roman society: it was a political communication based on oral as opposed to written communication. Public assemblies met on the Forum in Contiones, meetings convened by a magistrate to discuss political and legal matters, meant to provide the citizenry with information and explanations for various votes. The Contio itself did not make a decision; instead, it was a necessary precursor to the comitium, the public assembly in which the citizens voted in elections or on other legal issues.
In addition, the Contiones were the appropriate platform from which to make public announcements, proclaim victories in battle, and report on senate resolutions, all events in which the people could experience themselves as part of a (victorious) whole. These public assemblies, where the plebs and nobiles met and communicated, were essential for creating a cultural consensus that could lead to the passing of even controversial legislation. It might be exaggerating a little, but successful oral communication at these assemblies could be a matter of life and death.

8 Cf. Muth 2015, Muth/Schulze 2014.

4. AURAL RECONSTRUCTION
Consequently, the functional importance of the Forum Romanum cannot be understood by its architectural changes alone. Investigating it as a space for political communication and comparing the three different speaking situations known for the Republican Forum (first from the speaker’s platform towards the Curia, second from the speaker’s platform towards the Forum itself, and third from the Temple of Castor) requires the simulation of these assemblies by means of digital media. In addition to peopling the Forum with crowds and recreating the point of view of a listening person, the auditory experience itself needs to be reliably recreated. Using the methods and tools of virtual acoustics helps us to reconstruct an objective aural impression, called auralisation, based on the architectural model as well as on literary sources.9 By this, we are able to systematically create scenario-based simulations which allow us to define plausible corridors and boundaries of speech comprehension.

Figure 4: St. Weinzierl 2016: Scheme of Auralisation.

To give a brief description of the technical procedure, one can distinguish four crucial components: first the acoustic properties of the Forum, second the listener’s physiognomy, third the pure speech signal, and fourth the Forum’s binaural noisescape (cf. fig. 4).
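How these four components might fit together can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the auralisation software used in the project: it assumes that the Forum's acoustics and the listener's head are condensed into one impulse response per ear, that the dry speech is filtered by convolution with these responses, and that the noisescape is simply added afterwards. The names `impulse_response` and `auralize`, the speed of sound, and all example values are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at roughly 20 °C


def impulse_response(arrivals, sample_rate, length):
    """Build one ear's impulse response from (path_length_m, gain) arrivals.

    Each arrival stands for a direct or reflected ray: its delay follows
    from the path length, its gain from what the ray lost on the way.
    """
    response = np.zeros(length)
    for path_length, gain in arrivals:
        index = int(round(path_length / SPEED_OF_SOUND * sample_rate))
        if index < length:
            response[index] += gain
    return response


def auralize(speech, response_left, response_right, noise):
    """Convolve dry speech with both ear responses and add background noise.

    Both responses must have the same length. `noise` is a (2, n) array;
    only the part that overlaps with the output is mixed in.
    """
    left = np.convolve(speech, response_left)
    right = np.convolve(speech, response_right)
    out = np.stack([left, right])
    n = min(out.shape[1], noise.shape[1])
    out[:, :n] += noise[:, :n]
    return out


# Toy example: a single click as "speech", two arrivals at the left ear
# (direct sound plus one reflection), one arrival at the right ear.
rate = 1000
speech = np.array([1.0, 0.0, 0.0])
left = impulse_response([(3.43, 0.5), (6.86, 0.1)], rate, 32)
right = impulse_response([(3.43, 0.3)], rate, 32)
binaural = auralize(speech, left, right, np.zeros((2, 8)))
```

In a real setting the arrival lists would come from the acoustic simulation and would differ per ear according to the listener's head orientation; here they are hand-written placeholders.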
As mentioned at the beginning of this paper, only 10 % of the energy of an aural impression reaches our ears directly. This means that the acoustically active matter of the environment is, according to Brian Larkin, “a powerfull mediating force that produces new modes of organizing sensory perception, time, [and] space”.10 Any speech sound has to pass through a complex filtering structure of human bodies, buildings, plants and trees, air, etc. before it is received by the listener.

Our simulation of the Forum’s space impulse response may be summarised as follows: Up to 20 million sound rays were sent out from various speaker positions into the virtual model of the digital Forum Romanum. Acoustic parameters like sound pressure level, absorption, reflection and scattering at bounding surfaces as well as the air absorption were calculated. These interactions in turn influence the level, coloration, and time delay of each ray. Now, all these simulated rays reach, either directly or reflected, a virtually placed listening head.

In the second step we thus take into account the listener’s head position plus its physiognomy, the so-called head-related transfer function. The Forum’s space impulse response is split into a right and a left ear impression that is changed by the software in real time when the listener turns his head. If, for example, the listener turns his right ear towards the orator, he hears the speaker more clearly with the right ear while the left ear receives primarily the spatial acoustics.

9 Cf. Weinzierl 2002, Weinzierl 2008.
10 Larkin 2008: 219.

Figure 5: Chr. Böhm, St. Weinzierl 2016: Auralization of the Forum Romanum, Situation 1.
Figure 6: Chr. Böhm, St. Weinzierl 2016: Auralization of the Forum Romanum, Situation 2.

In the third step this spatial filter system is convolved with the pure signal of the speech. Among the most famous speeches known to have been given during a Contio are Cicero’s second and third oration against Catilina.
In these speeches, Cicero informed the Roman citizens of the impending danger of a conspiracy. We chose an excerpt from the third speech against Catilina for our auralizations and hired a trained orator of similar age. We asked him to project himself into this speech scenario, in which he has to address as many people as possible because the occasion for the speech was so crucial. The speech was recorded in an anechoic studio. This architecture absorbs almost all sound waves, so that the recorded speech signal includes virtually no reverberation.

The final element of our auralization is the noisescape that has to represent the plausible background noise of whispers, air movement, and rustling. The number of people, the architecture, and the attentiveness of the crowd have to be comparable to the listening experience one is trying to reconstruct. For our case study we recorded samples of the crowd noise on St. Peter’s Square in Rome during the traditional Sunday Angelus Prayer of the Pope.

All four components of the signal path were processed by an auralisation software and then made available for an aural impression. It is important to note that in addition to this digital simulation, real ears are needed in the end to determine the speech’s comprehensibility. We test and document these hearing impressions to evaluate the results and calibrate parameters iteratively.

5. RESULTS
Within our simulations, we have started with a best-case scenario, in which all listeners are trying to be as quiet as possible in order to understand everything. However, this was not always the case: crowds were known to routinely interrupt the speaker with loud boos and shouts and to generally disturb the proceedings, especially as the Late Republic progressed. A gifted orator, on the other hand, was said to bring a crowd to silence. In any case, while the following figures depict the maximum number of people that could understand the speaker, we must assume that this maximum was often not reached.

5.1 Situation 1
Our first rendering illustrates the findings of the acoustic survey of the situation in the Late Republic when the speakers addressed the crowds standing in the Comitium (cf. fig. 5). The color spectrum indicates the sound intensity level that correlates with the speech comprehensibility as examined in listening tests. With the dotted line we have marked the dark red area of 2,650 sq m in which a listener would have been able to understand very well. Using a figure of four persons per square meter, a speaker at this position would have been able to reach 10,600 people easily. The dashed line indicates the zone of average comprehensibility, in which understanding is possible only with intense concentration. In this best-case scenario, therefore, in which the audience is quiet and straining to hear, the speaker could reach 11,200 people on an area of 2,800 sq m. On the basis of these results, we can assume that from the Rostra, a speaker could reach almost all listeners gathered in the Comitium very well.

5.2 Situation 2
This comfortable situation changes significantly when the orator addresses the citizens on the Forum square itself (cf. fig. 6). Here, the area of easy comprehensibility decreases to 2,300 sq m, corresponding to 9,200 people being able to understand a speech without great effort, which is 1,400 people fewer than in the earlier Republican situation. On the other hand, however, the number of people that can theoretically be reached has increased considerably, to a total space of 4,700 sq m that can hold 18,800 people in all. It is thus possible to discern a general development between the different situations. Speaking towards the Forum wasn’t just a symbolic gesture of literally turning one’s back on the senate: speeches were audible to a far greater number of people, if only because of the amount of space now available for the audience to listen.
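The audience figures for Situations 1 and 2 follow directly from the quoted areas and the stated density of four persons per square meter. The short sketch below merely reproduces that arithmetic so that it can be checked or repeated with a different density; the function name `audience` and the dictionary layout are assumptions made for illustration.

```python
PERSONS_PER_SQ_M = 4  # crowd density used in the text


def audience(area_sq_m, density=PERSONS_PER_SQ_M):
    """Number of listeners that fit into a comprehension zone."""
    return area_sq_m * density


# Areas in sq m as quoted in the text: (easy comprehension, total reach).
zones = {
    "Situation 1": (2650, 2800),
    "Situation 2": (2300, 4700),
}

for name, (easy_area, total_area) in zones.items():
    print(name, audience(easy_area), audience(total_area))

# Loss of easily reached listeners when the speaker turns to the Forum.
loss = audience(zones["Situation 1"][0]) - audience(zones["Situation 2"][0])
```

Since the density is the least certain input, passing a different `density` shows how strongly the absolute figures depend on it, while the ratios between the situations stay the same.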
However, the decrease in comprehensibility shows that this solution to the new need to address greater numbers of people was still less than ideal.

5.3 Situation 3
At the last stage of the Late Republic, with the speech given in front of the Temple of Castor, the area in which the oration can be comprehended increases once more: to 2,950 sq m for easy understanding and to 5,900 sq m for general comprehension. This corresponds to a range of 11,800 to 23,600 people, more than double what a speaker from the Rostra facing the Comitium could reach and considerably more than if he faced the Forum square. One of the reasons for this is that the architecture became an acoustically active matter, especially because the heightened rostrum reduced the sound absorption by the listeners’ bodies and noise.

Figure 7: Chr. Böhm, St. Weinzierl 2016: Auralization of the Forum Romanum, Situation 3.

These results are especially interesting when considered within the historical context of the Late Republic. By the mid-2nd century B. C. we find the Temple of Castor increasingly mentioned as a place of public assembly. At the same time the Contio became more and more controversial, with consensus less and less likely to be reached. The literary sources detail the violence and riots that began to surround public assemblies. Usually, this has been interpreted as a breakdown of communication between the senatorial elite and the people. Our acoustic results show, however, that this occurred at the very time when the greatest number of people could be reached.

6. CONCLUSIONS
Referring to my introductory remarks on the relationship between the real and the symbolic, one could say that with these (symbolic) auralisations we simulate the (real) material structures and vice versa, precisely because sound is essentially both: a symbolic signifier and a physical process.
As a subsequent augmentation these digital algorithms implement physical principles of sound propagation and reflection in simulations run with virtual augmentations of real archaeological structures. In the case of the Forum Romanum in the Late Republic, expert testing of our auralisations has revealed a continuously increasing number of listeners being able to comprehend the speeches in the course of three changes of speaking positions on the square. Whereas in the oldest position towards the Curia a maximum of approximately 11,000 people could have been adequately addressed, from the speaker’s platform in front of the Temple of Castor possibly up to 23,000 people could have followed an oration. Accounting for the growth of population and the significance of public address and assembly during the struggles of the Late Republic, these findings could challenge traditional interpretations that regarded these changes rather as representational strategies or symbolic gestures of a political elite. The sensory experience of the Forum in the Republican period can thus be expanded beyond the (symbolic) visual nature usually studied.

7. REFERENCES
[1] J. Crary. Techniques of the Observer. On Vision and Modernity in the Nineteenth Century. MIT Press, Cambridge/Massachusetts, London/England, 1992.
[2] R. Descartes. The World and Other Writings. Cambridge Texts in the History of Philosophy. Cambridge University Press, Cambridge/UK, 2004.
[3] E. M. v. Hornborstel. Beobachtungen über ein- und zweiohriges Hören. Psychologische Forschung, 4:64–114, 1923.
[4] C. Kassung and S. Schwesinger. How to Hear the Forum Romanum. On Historical Realities and Aural Augmentation. In C. Busch and J. Sieck, editors, Kultur und Informatik. Augmented Reality, pages 41–53. Verlag Werner Hülsbusch, Glückstadt, 2016.
[5] F. Kittler. Es gibt keine Software. In Draculas Vermächtnis. Technische Schriften, number 1476 in Reclam-Bibliothek, pages 225–242. Reclam Verlag, Leipzig, 1993.
[6] B. Larkin. Signal and Noise. Media, Infrastructure, and Urban Culture in Nigeria. Duke University Press, Durham, London, 2008.
[7] E. Mach. Warum hat der Mensch zwei Augen? In Populär-wissenschaftliche Vorlesungen, pages 78–99. Johann Ambrosius Barth, Leipzig, 1903.
[8] S. Muth. Das Forum Romanum: Roms antikes Zentrum neu verstehen. Antike Welt, 46(6):34–40, 2015.
[9] S. Muth and H. Schulze. Wissensformen des Raums: die schmutzigen Details des Forum Romanum – Archäologie & Sound Studies im Dialog. Cluster-Zeitung, 55:7–11, 2014.
[10] S. Weinzierl. Beethovens Konzerträume. Raumakustik und symphonische Aufführungspraxis an der Schwelle zum bürgerlichen Zeitalter. Verlag Erwin Bochinsky, Frankfurt am Main, 2002.
[11] S. Weinzierl, editor. Handbuch der Audiotechnik. Springer Verlag, Berlin, 2008.

APPENDIX
A. FIGURES
• Fig. 1: R. Descartes 1677: Model of Vision. In R. Descartes: The World and Other Writings. Cambridge University Press, Cambridge/UK, 2004, p. 133.
• Fig. 2: G. B. Piranesi 1748: Vedute di Roma: Campo Vaccino (Forum Romanum). In http://www.zeno.org/nid/20004223756.
• Fig. 3: E. Holter, S. Muth 2016: View of the Forum Romanum during the Late Republic. In digitales forum romanum, http://www.digitales-forum-romanum.de/.
• Fig. 4: St. Weinzierl 2008: Scheme of Auralization. Diagram by the research group.
• Fig. 5–7: Chr. Böhm, St. Weinzierl 2016: Auralization of the Forum Romanum, Situation 1–3. Diagram by the research group.