Past—Present—Sound.
On Auralization as Augmented Reality
∗
Christian Kassung
Institute for Cultural History and Theory
Humboldt-Universität zu Berlin
Unter den Linden 6
D–10099 Berlin
[email protected]
ABSTRACT
Digital reconstructions of historical spaces have become a
widespread tool within archaeology. However, these models
seek to convey almost exclusively a visual representation of
these past spaces. This dominance of the visual makes us
forget that past as well as current spaces are characterized
by a multitude of sensory experiences. Political speeches,
bustling markets, or religious processions were affecting all
human senses. Hence, the need for reconstructing the aural
dimension, namely sound, in addition to purely visual models is obvious. The talk will firstly demonstrate technical
solutions for virtual models of soundscapes like the antique
Forum Romanum. Secondly, it will discuss how these reconstructions can be used to better understand the multivalent
experience and the acoustic function of these spaces.
At the Forum Romanum, different venues were designated
for addressing large crowds, and these spaces underwent constant historical transformations, especially in the period of
transition from the late Republic to the early Empire. Our
virtual models allow us to analyze and reconstruct the acoustics of these different speakers’ platforms for the first time.
By listening to a speech of Cicero at different spots
within the crowd, one can hear how the auditory
characteristics of these spaces can explain their manifold
functional and structural changes.
By comparing different simulated scenarios or acoustically
augmented realities, it is possible to gain an understanding
of how these spaces were designed for specific acoustic experiences, as well as to identify obstructions in these functional
ensembles. Audiovisual reconstructions of historical spaces
prove therefore to be not only an excellent tool for illustration, but also for raising new questions. By this latter
aspect, augmented reality becomes a valuable instrument of
historic research.
∗All research and results presented here are based on collective work and disciplinary contributions by the team members of the project “Analog Storage Media. Auralizations
of Archaeological Spaces”, namely Prof. Susanne Muth,
Prof. Stefan Weinzierl, Erika Holter, Una Ulrike Schäfer,
Christoph Böhm and Sebastian Schwesinger.
ICCS ’16 October 25–28, 2016, Windhoek, Namibia
ACM ISBN 978-1-4503-2138-9.
DOI: 10.1145/1235
CCS Concepts
•Applied computing → Computer-aided design; Sound
and music computing;
Keywords
Auralization; augmented reality; archaeology
1.
SOUND AND SPACE
In his 1677 posthumously published work “On Man” (“Traité de l’Homme”), the French philosopher, mathematician
and physicist René Descartes used the following depiction to
explain seeing (cf. fig. 1): We recognize a middle-aged, barefooted man with a goatee, holding two sticks in his hands.
These sticks form a triangle whose vertex touches a tree. Or,
to hear it in Descartes’ own words:
Notice also that if two hands f and g each hold
sticks i and h with which they touch the object
K, then even though the soul is otherwise ignorant of the length of the sticks, nevertheless, because it can tell the distance between the points
f and g, and the sizes of the angles fgh and gfi,
it will be able to tell, as if by a natural geometry,
where the object K is.1
Now, what we cannot see in the illustration is the fact
that the man is blind. Descartes conceives vision “in terms
of analogies to the senses of touch.”2 It is the metaphor of the
blind man constituting an epistemology of a tactile geometry
in which vision operates by touching the outer world with
two more or less material sticks. However, the crucial point
here is that there are two sticks, explaining why humans
have two eyes and two ears.3 For Descartes, both seeing
and hearing are like exploring the outer world by scanning
its surfaces.
1 Descartes 1998: 133–134.
2 Crary 1990: 59.
3 Cf. Mach 1896: 83–84 and von Hornbostel 1923: 64.
Figure 1: R. Descartes 1677: Model of Vision.
Figure 2: G. B. Piranesi 1748: Vedute di Roma: Campo Vaccino (Forum Romanum).
Leaving aside this historical scene of the early modern
period, we can state a close epistemic relationship between
seeing and hearing, linked by an active and bidirectional signal process. The sound or light that we are perceiving comes
only in first approximation directly from the seen or heard
object. In the case of sound, only 10 % of the signal’s acoustic energy propagates without any reflections or refractions
into the ear of the listener. In fact, the wave fields are much
too complicated to allow a clear distinction between a sender
and a receiver. Or, to put it into more general terms: Both
seeing and hearing are modes of perception being active and
passive at the same time. We look at certain things while
others are attracting our attention unconsciously. We turn
our head automatically because we have heard something
behind us. And we hear a dog barking behind a wall although we cannot see it. Furthermore, the predominance of
the visual and visual media in the last two hundred years
has made us forget that living space is strongly shaped by
sounds.
When we think about the structure of the acoustic space
we primarily consider sound as an intentionally used element. Especially in public spaces, functional sounds like car
horns, bells or jingles inform or warn us of certain events.
Signals that result from certain actions, e.g. the slamming
of a door or steps on the stairs, play a similar role:
They help us to navigate through our daily routines. Functional signals act like a direct command making us hesitate,
rush, repeat, stop, or anything else. However, soundscapes
are structured by more acoustic information than just these
more or less intentionally designed sound signals. Consider a
person walking down a street. With every step, the soundscape changes and thus provides a wealth of information. And it is shaped not only by material parameters like the
surrounding architecture but also by soft ones like wind, rain,
emotions, education, or knowledge. To make a long story
short: Our behaviour is to a large extent a result of the
manifold acoustic interactions between us and our environment.
Now, switching to historical spaces it seems surprising
that the most famous public spaces in antiquity, such as
the Forum Romanum in Rome or the Agora in Athens, have
been investigated primarily with regard to their visual and
symbolic function. Individuals or groups display and experience their collective or personal identity or status by
decorative architectural styles and ornamental elements, by
the formation of mosaics, or by the dramaturgy of views
and lines of sight. However, this traditional archaeological
approach ignores that, just as today, these spaces
were characterized by a plurality of multisensorial experiences like bustling markets, civic assemblies, public speeches
and law courts, processions and festivals. The central aim
of our project is to reconstruct this multisensorial dimension
of antique spaces and especially to evaluate the functional
plausibility of different acoustic scenarios. To achieve this,
we established a research group at the Cluster of Excellence
“Image Knowledge Gestaltung” at the Humboldt-Universität
zu Berlin. In our group, Professors Christian Kassung from
the Department of Cultural History and Theory and Susanne Muth from the Department of Classical Archaeology,
both at Humboldt-Universität zu Berlin, teamed up with
Professor Stefan Weinzierl from the Technische Universität
Berlin’s Department of Audio Communication.4 In the following paper we will present a case study on speech comprehension during public addresses on the Forum Romanum in
the Late Republican period.
2.
MEDIA OF ARCHAEOLOGY
Loosely speaking, one could say that the media history of
archaeology dates back at least to Piranesi’s documentations
of ancient buildings and ruins in the 18th century. However,
when looking at these pictures one can easily discern their
confusing non-objective perspective. The observer looks at
tiny and isolated human beings spread across a vast landscape of
almost disturbing relics. Comparing these interpretations
of the past to media technologies like photographs that have
been and are used extensively at excavations, the difference
between “symbolic” and “real” representation becomes obvious. What is meant by this dichotomy?
4 Cf. https://www.kulturtechnik.hu-berlin.de/de/content/analogspeicher-ii-auralisierung-archaologischer-raume/.
When we walk across the Forum Romanum as
visitors to the historical site, our internal simulation draws
on knowledge from history classes, image reservoirs from
brochures, books and museum objects, habitus from movies
or comics, etc. All that we come up with is heavily mingled
with our culturally acquired representations, which in turn depend heavily on their techno-media formats and our sensory
“bio-media”. We cannot enter a historical site without augmenting the physical structures we observe with pictures,
knowledge, or symbolic significations from various, partially
unconscious sources.
On the other hand there is the “real”, the physical structures of the relics that cultural heritage initiatives try to
save from decay.5 As can be seen very prominently in the
Acropolis Restoration Project that began in 1975, today’s
societies invest high amounts of energy to preserve material
structures in order to retain a certain kind of historical reality. Now, the crucial point is that instead of artificially
trying to uphold this dichotomy one could bring the real
and the symbolic into even closer contact. We would like
to argue that the emergence of media technologies has the
potential to bridge this gap. Instead of solely conserving
historical reality in situ the radical alternative that media
technologies provide is to treat the real as symbolic.6
Today’s media, and of course especially the computer, turn
this potential interaction between the real and the symbolic
into a new and extremely productive relationship. Computer simulations have gained acceptance in scientific knowledge production. At CERN, you will hardly find any
experiment in the traditional sense of measuring something
that someone calls reality. Instead, computer simulations
create experimental environments that are likewise being
measured by computer simulations. Or, to give a historical example: What did Julius Caesar look like? What are
the sources of our knowledge? Ancient marble busts? Old
coins? Film stars like Louis Calhern? Cartoonists like Albert Uderzo? For this example one could argue that we can
clearly distinguish between fictional sources and historical
facts. But our argument is that you cannot fully trust ancient media like coins or busts either. And this suspicion
holds true as well for every fallen column that has been built
up again. Surely, there is a continuum of trustworthiness,
but something like a self-contained historical reality with
corresponding documents simply doesn’t exist. Hence, the
idea of a clean, depopulated and white ancient past gave way
to approaches that consider the usage of ancient structures
as important. In this sense, blurring the clear boundaries
between fiction and fact prepared the ground for new methodologies that asymptotically approach these material structures. In this sense, any archaeological model is a form of
augmented reality. In our project, we take architecture as
a medium for sound, with very specific functional requirements. Thus reconstructing the architecture of the Forum
Romanum as an analogue storage medium allows us to draw
conclusions on its usage and suitability for specific aural occurrences.
5 Cf. Kassung/Schwesinger 2016.
6 Cf. Kittler 1993.
3.
VISUAL RECONSTRUCTION
Figure 3: E. Holter, S. Muth 2016: View of the Forum Romanum during the Late Republic.
The first step towards any reconstruction of what a historical “user” of such an environment might have seen, heard
or in other ways experienced is the digital simulation of the
built, architectural space. Historical simulations in particular require a digital model supported by as much scientific
evidence as possible. Our research is therefore based on
the digital reconstruction of this ancient public space, provided by “digitales forum romanum”, a project led by Prof.
Susanne Muth.7 The goal of the project “digitales forum
romanum” is to create a diachronic digital reconstruction of
the Forum Romanum from the Archaic period until Late
Antiquity and the Early Middle Ages. So far, seven different phases dating to 200 B. C., 100 B. C., 14 A. D., 96
A. D., 150 A. D., 210 A. D., and 310 A. D. have been reconstructed. This reconstruction should not be regarded as a
simple visualisation, but as a tool for further research, and in
keeping with this all available evidence—archaeological, literary, architectural—has been reviewed and analysed for the
different structures on the Forum, considering very carefully
their reliability. Now, the case study of this paper focuses on
the Forum Romanum as it appeared in the Late Republican
period (cf. fig. 3).
Since the situation in 200 B. C. is not fundamentally different from that of the Early and Middle Republic, we can
start by taking a look at the architectural reconstruction of
the earliest location for public speeches and assemblies. This
is the Comitium, the area in front of the senate house, the
Curia. Together with the very first speaker’s platform on the
Forum, called the Rostra, Curia and Comitium constituted the
architectural complex for the decision-making assembly of
all Roman citizens. At the latest, the beginnings of this area as
a functionally developed space can be traced to the foundation of the Early Republic and its corresponding Republican
political structures (cf. situation 1). Speaking from this first
position of a built platform, the politicians were able to address the citizens gathered in the Comitium in front of the
Curia.
This situation changed for the first time in the middle of
the 2nd century B. C., when the speakers on the platform
turned their backs on the Comitium in order to face the
crowds assembled on the other side of the Rostra, in the
Forum square. The gathering place therefore moved from
the Comitium to the center of the Forum (cf. situation 2).
There are no architectural traces or remains of this inversion,
which is solely documented in the literary tradition.
7 Cf. http://www.digitales-forum-romanum.de/.
Yet another area of the Forum became increasingly important as a place of public assembly in the course of the
2nd century B. C. On the opposite side of the Forum, the
Temple of Castor had undergone a rebuilding in the early
2nd century, during which a speaker’s platform was attached
to the temple podium. The temple podium could possibly
have been used earlier for this purpose, but in any case,
the reconstruction had become necessary because the architecture no longer fulfilled the requirements of this new
function (cf. situation 3). Literary sources, however, first
mention speeches and public assemblies in front of the Temple by the mid-2nd century B. C. In our analyses, we use the phase of the temple built by Lucius Caecilius Metellus
in the late 2nd century.
According to present interpretations of the Forum Romanum, the architecturally unchanged situation of the Comitium-Curia complex from its first construction until the 1st
century B. C. reflects the collective identity of an enduring
Republican tradition. And the Temple of Castor is read as a
victory monument for the Roman citizens. Its original construction commemorated the victory over the Latins, while
it later became a symbol of the patrician identity. Whatever
the case, interpretations focusing on the symbolic nature
of architecture fail to take into account that architecture
serves, first and foremost, a specific functional purpose.8
The speakers’ platforms were intended to provide an ideal
space for giving speeches to public assemblies, raised up for
optimal view and comprehension of the speaker. The question is how well they did this.
From the very beginning of the Republican period, political communication in public spaces was a central aspect of
Roman society: it was a political communication based on
oral as opposed to written communication. Public assemblies met on the Forum in Contiones, meetings convened by
a magistrate to discuss political and legal matters, meant
to provide the citizenry with information and explanations
for various votes. The Contio itself did not make a decision;
instead, it was a necessary precursor to the comitium, the
public assembly in which the citizens voted in elections or
on other legal issues. In addition, the Contiones were the
appropriate platform from which to make public announcements, proclaim victories in battle, and report on senate
resolutions, all events in which the people could experience
themselves as part of a (victorious) whole. These public
assemblies, where the plebs and nobiles met and communicated, were essential for creating a cultural consensus that
could lead to the passing of even controversial legislation. It
may be a slight exaggeration, but successful oral communication at these assemblies could be a matter of life and
death.
8 Cf. Muth 2015, Muth/Schulze 2014.
4.
AURAL RECONSTRUCTION
Figure 4: St. Weinzierl 2016: Scheme of Auralisation.
Consequently, the functional importance of the Forum
Romanum cannot be understood only by its architectural
changes. Investigating it as a space for political communication and comparing the three different speaking situations
known for the Republican Forum (first from the speaker’s
platform towards the Curia, second from the speaker’s platform towards the Forum itself, and third from the Temple
of Castor) requires the simulation of these assemblies by
means of digital media. In addition to peopling the Forum
with crowds and recreating the point of view of a listening
person, the auditory experience itself needs to be reliably
recreated. Using the methods and tools of virtual acoustics helps us to reconstruct an objective aural impression
called auralisation, based on the architectural model as well
as on literary sources.9 In this way, we are able to systematically create scenario-based simulations which allow us to
define plausible corridors and boundaries of speech comprehension.
To give a brief description of the technical procedure, one
can distinguish four crucial components: first the acoustic
properties of the Forum, second the listener’s physiognomy,
third the pure speech signal, and fourth the Forum’s binaural noisescape (cf. fig. 4). As mentioned at the beginning of this paper, only 10 % of the energy of an aural impression reaches our ears directly. This means that the acoustically active matter of the environment is, according to
Brian Larkin, “a powerful mediating force that produces
new modes of organizing sensory perception, time, [and]
space”.10 Any speech sound has to pass through a complex
filtering structure of human bodies, buildings, plants and
trees, air, etc. before it is received by the listener. Our simulation of the Forum’s space impulse response may be summarised as follows: Up to 20 million sound rays were sent
out from various speaker positions into the virtual model
of the digital Forum Romanum. Acoustic parameters like
sound pressure level, absorption, reflection and scattering at
bounding surfaces as well as the air absorption were calculated. These interactions in turn influence level, coloration,
and time delay of each ray.
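The per-ray bookkeeping described above can be sketched as follows. This is a minimal illustration and not the project's simulation software: the function name `trace_ray`, the frequency-independent absorption coefficients, the air attenuation value, and the speed of sound of 343 m/s are all assumptions made for the example. Scattering and geometric spreading, which a ray tracer typically accounts for through the density of rays arriving at the receiver, are left out.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 °C


def trace_ray(path_length_m, absorption_coeffs, air_attenuation_per_m=0.001):
    """Return (delay in seconds, level in dB relative to emission) for one ray.

    absorption_coeffs holds one hypothetical absorption coefficient
    (0 = fully reflecting, 1 = fully absorbing) per surface the ray hits.
    """
    energy = 1.0
    for alpha in absorption_coeffs:
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("absorption coefficients must lie in [0, 1]")
        energy *= 1.0 - alpha  # energy remaining after this reflection
    # Exponential energy loss in air along the travelled path (assumed model).
    energy *= math.exp(-air_attenuation_per_m * path_length_m)
    delay_s = path_length_m / SPEED_OF_SOUND_M_S
    level_db = 10.0 * math.log10(energy) if energy > 0.0 else float("-inf")
    return delay_s, level_db


# A direct ray and a ray reflected once off a surface that absorbs half the energy.
direct = trace_ray(34.3, [], air_attenuation_per_m=0.0)
reflected = trace_ray(68.6, [0.5], air_attenuation_per_m=0.0)
```

In this toy case the reflected ray arrives later and weaker than the direct one, which is the kind of change in level and time delay the text refers to; coloration would require frequency-dependent coefficients.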
Now, all these simulated rays reach, either directly or
reflected, a virtually placed listening head. In the second
step we thus take into account the listener’s head position
plus its physiognomy, the so-called head-related transfer
function. The Forum’s space impulse response is split into
a right and a left ear impression that is changed by the software in real time when the listener turns his head. If, for
example, the listener turns his right ear towards the orator,
he then hears the speaker more clearly with the right ear
while the left ear receives primarily the spatial acoustics.
9 Cf. Weinzierl 2002, Weinzierl 2008.
10 Larkin 2008: 219.
Figure 5: Chr. Böhm, St. Weinzierl 2016: Auralization of the Forum Romanum, Situation 1.
Figure 6: Chr. Böhm, St. Weinzierl 2016: Auralization of the Forum Romanum, Situation 2.
In the third step this spatial filter system is convolved with the
pure signal of the speech. Among the most famous speeches
known to have been given during a Contio are Cicero’s second and third orations against Catilina. In these speeches,
Cicero informed the Roman citizens of the impending danger
of a conspiracy. We chose an excerpt from the third speech
against Catilina for our auralizations and hired a trained orator of similar age. We asked him to project himself into this
speech scenario in which he has to address as many people
as possible, as the occasion for the speech was so crucial.
The speech was recorded in an anechoic studio. This architecture absorbs almost all sound waves so that the recorded
speech signal includes virtually no reverberations.
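In signal-processing terms, the third step amounts to a convolution of the reverberation-free speech recording with one impulse response per ear. The sketch below shows this with a direct time-domain convolution in plain Python; the function names `convolve` and `auralize` and the toy sample values are hypothetical, and a real auralisation would more likely use an FFT-based routine (for example `scipy.signal.fftconvolve`) on impulse responses many thousands of samples long.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) linear convolution of two sample sequences."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def auralize(dry_speech, ir_left, ir_right):
    """Filter a reverberation-free recording with one impulse response per ear."""
    return convolve(dry_speech, ir_left), convolve(dry_speech, ir_right)


# Toy data: three speech samples. The left ear receives the direct sound plus
# one echo at half amplitude two samples later; the right ear receives only an
# attenuated direct sound.
left, right = auralize([1.0, 2.0, 3.0], [1.0, 0.0, 0.5], [0.5])
```

Because the recording itself contains virtually no reverberation, everything spatial in the result comes from the impulse responses, which is why the speech has to be recorded in an anechoic studio.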
The final element of our auralization is the noisescape that
has to represent the plausible background noise of whispers,
air movement, and rustling. The number of people, the architecture, and the attentiveness of the crowd have to be
comparable to the listening experience one is trying to reconstruct. For our case study we recorded samples of the
crowd noise on St. Peter’s Square in Rome during the traditional Sunday Angelus Prayer of the Pope.
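One simple way to combine such a crowd recording with the auralized speech is to scale the noise to a chosen signal-to-noise ratio before adding it. The text does not state how the levels were set, so the target ratio `snr_db`, the RMS-based scaling, and the function names below are assumptions made purely for illustration.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sample sequence."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_with_noise(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise RMS ratio equals snr_db."""
    if not speech:
        return []
    if len(noise) < len(speech):
        raise ValueError("noise recording must be at least as long as the speech")
    noise = noise[:len(speech)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return list(speech)  # silent noise track: nothing to add
    gain = rms(speech) / (noise_rms * 10.0 ** (snr_db / 20.0))
    return [s + gain * n for s, n in zip(speech, noise)]


# Hypothetical attentive crowd: noise 20 dB below the speech.
quiet_crowd = mix_with_noise([1.0, -1.0, 1.0, -1.0], [2.0, 2.0, -2.0, -2.0], snr_db=20.0)
```

Lowering `snr_db` would model a restless audience, which is one way to move from the best-case scenario towards less favourable ones.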
All four components of the signal path were processed by auralisation software and then made available
for an aural impression. It is important to note that in addition to this digital simulation, real ears are needed in the
end to determine the speech’s comprehensibility. We test
and document these hearing impressions to evaluate the results and calibrate parameters iteratively.
5.
RESULTS
Within our simulations, we have started with a best-case
scenario, in which all listeners are trying to be as quiet as
possible in order to understand everything. However, this
was not always the case, and crowds were known to
routinely interrupt the speaker with loud boos and shouts
and to generally disturb the proceedings, especially as the
Late Republic progressed. A gifted orator, on the other
hand, was said to bring a crowd to silence. In any case, while
the following figures depict the maximum number of
people that could understand the speaker, we must assume
that this maximum was often not reached.
5.1
Situation 1
Our first rendering illustrates the findings of the acoustic
survey of the situation in the Late Republic when the speakers addressed the crowds standing in the Comitium (cf. fig.
5). The color spectrum indicates the sound intensity level
that correlates with the speech comprehensibility as examined in listening tests. With the dotted line we have marked
the dark red area of 2,650 sq m in which a listener would
have been able to understand the speaker very well. Using a figure of
four persons per square meter, a speaker at this position
would have been able to reach 10,600 people easily. The
dashed line indicates the zone of average comprehensibility,
in which understanding is possible only with intense concentration. In this best-case scenario, therefore, in which the audience is quiet and straining
to hear, the speaker could reach 11,200 people on an area of
2,800 sq m. On the basis of these results, we can assume that
from the Rostra, a speaker could reach almost all listeners
gathered in the Comitium very well.
5.2
Situation 2
This comfortable situation changes significantly when the
orator addresses the citizens on the Forum square itself (cf.
fig. 6). Here, the area of easy comprehensibility decreases
to 2,300 sq m, corresponding to 9,200 people being able to understand a speech without great effort, 1,400 people fewer than
in the earlier Republican situation. On the other hand, however, the number of people that can theoretically be reached
has increased considerably, to a total space of 4,700 sq m in
which 18,800 people in all can stand. It is thus possible to discern a general development between the different situations.
Speaking towards the Forum wasn’t just a symbolic gesture
of literally turning one’s back on the senate: speeches were
audible to a far greater number of people, if only because
of the amount of space now available for the audience to listen. However, the decrease in comprehensibility shows that
this solution to the new need to address greater numbers of
people was still less than ideal.
5.3
Situation 3
At the last stage of the Late Republic, with the speech given
in front of the Temple of Castor, the area within which the oration can be comprehended increases once more: to
2,950 sq m for easy understandability and to 5,900 sq m for
general comprehension. This encompasses a range of 11,800
to 23,600 people, more than double what a speaker from the
Rostra facing the Comitium could reach and considerably
more than if he faced the Forum square. One of the reasons
for this is that the architecture became acoustically active
matter, especially because the raised rostrum reduced
the sound absorption by the listeners’ bodies and noise.
Figure 7: Chr. Böhm, St. Weinzierl 2016: Auralization of the Forum Romanum, Situation 3.
These results are especially interesting when considered
within the historical context of the Late Republic. By the mid-2nd century B. C. we find the Temple of Castor increasingly
mentioned as a place of public assembly. At the same time
the Contio became more and more controversial, with consensus less and less likely to be reached. The literary sources
detail the violence and riots that began to surround public
assemblies. Usually, this has been interpreted as a breakdown of communication between the senatorial elite and the
people. Our acoustic results show, however, that this occurred at the same time when the greatest number of people
could be reached.
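The audience figures quoted for the three situations follow directly from multiplying each simulated area by the density of four persons per square meter used in the text. The short sketch below reproduces that arithmetic; the areas and the density are taken from the text, while the dictionary layout and the function name `audience_sizes` are merely illustrative.

```python
PEOPLE_PER_SQ_M = 4  # crowd density used in the text

# (area of easy comprehension, area of comprehension with concentration) in sq m
AREAS_SQ_M = {
    "Situation 1 (Rostra towards Comitium)": (2650, 2800),
    "Situation 2 (Rostra towards Forum square)": (2300, 4700),
    "Situation 3 (Temple of Castor)": (2950, 5900),
}


def audience_sizes(areas=AREAS_SQ_M, density=PEOPLE_PER_SQ_M):
    """Convert each pair of areas into a pair of audience sizes."""
    return {
        name: (easy * density, total * density)
        for name, (easy, total) in areas.items()
    }


for name, (easy, total) in audience_sizes().items():
    print(f"{name}: {easy:,} to {total:,} listeners")
```

Changing `density` shows how strongly the absolute numbers depend on this single assumption, whereas the ratios between the three situations stay the same.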
6.
CONCLUSIONS
Referring to my introductory remarks on the relationship
between the real and the symbolic one could say that with
these (symbolic) auralisations we simulate the (real) material structures and vice versa, precisely because sound is essentially both: a symbolic signifier and a physical process. As
a subsequent augmentation these digital algorithms implement physical principles of sound propagation and reflection
in simulations run with virtual augmentations of real archaeological structures.
In the case of the Forum Romanum in the Late Republic
expert testing of our auralisations has revealed a continuously increasing number of listeners being able to comprehend the speeches in the course of three changes of speaking
positions on the square. Whereas in the oldest position towards the Curia a maximum of approximately 11,000 people
could have been adequately addressed, from the speaker’s
platform in front of the Temple of Castor possibly up to
23,000 people could have followed an oration. Accounting
for the growth of population and the significance of public
address and assembly during the struggles of the Late Republic, these findings could challenge traditional interpretations that regarded these changes primarily as a representational
strategy or symbolic gesture of a political elite. The sensory experience of the Forum in the Republican period can
thus be expanded past the (symbolic) visual nature usually
studied.
REFERENCES
[1] J. Crary. Techniques of the Observer. On Vision and
Modernity in the Nineteenth Century. MIT Press,
Cambridge/Massachusetts, London/England, 1992.
[2] R. Descartes. The World and Other Writings.
Cambridge Texts in the History of Philosophy.
Cambridge University Press, Cambridge/UK, 2004.
[3] E. M. v. Hornbostel. Beobachtungen über ein- und
zweiohriges Hören. Psychologische Forschung,
4:64–114, 1923.
[4] C. Kassung and S. Schwesinger. How to Hear the
Forum Romanum. On Historical Realities and Aural
Augmentation. In C. Busch and J. Sieck, editors,
Kultur und Informatik. Augmented Reality, pages
41–53. Verlag Werner Hülsbusch, Glückstadt, 2016.
[5] F. Kittler. Es gibt keine Software. In Draculas
Vermächtnis. Technische Schriften, number 1476 in
Reclam-Bibliothek, pages 225–242. Reclam Verlag,
Leipzig, 1993.
[6] B. Larkin. Signal and Noise. Media, Infrastructure,
and Urban Culture in Nigeria. Duke University Press,
Durham, London, 2008.
[7] E. Mach. Warum hat der Mensch zwei Augen? In
Populär-wissenschaftliche Vorlesungen, pages 78–99.
Johann Ambrosius Barth, Leipzig, 1903.
[8] S. Muth. Das Forum Romanum: Roms antikes
Zentrum neu verstehen. Antike Welt, 46(6):34–40,
2015.
[9] S. Muth and H. Schulze. Wissensformen des Raums:
die schmutzigen Details des Forum Romanum –
Archäologie & Sound Studies im Dialog.
Cluster-Zeitung, 55:7–11, 2014.
[10] S. Weinzierl. Beethovens Konzerträume. Raumakustik
und symphonische Aufführungspraxis an der Schwelle
zum bürgerlichen Zeitalter. Verlag Erwin Bochinsky,
Frankfurt am Main, 2002.
[11] S. Weinzierl, editor. Handbuch der Audiotechnik.
Springer Verlag, Berlin, 2008.
APPENDIX
A. FIGURES
• Fig. 1: R. Descartes 1677: Model of Vision. In R.
Descartes: The World and Other Writings. Cambridge
University Press, Cambridge/UK, 2004, p. 133.
• Fig. 2: G. B. Piranesi 1748: Vedute di Roma: Campo
Vaccino (Forum Romanum). In http://www.zeno.org/nid/20004223756.
• Fig. 3: E. Holter, S. Muth 2016: View of the Forum
Romanum during the Late Republic. In digitales forum
romanum, http://www.digitales-forum-romanum.de/.
• Fig. 4: St. Weinzierl 2008: Scheme of Auralization.
Diagram by the research group.
• Fig. 5–7: Chr. Böhm, St. Weinzierl 2016: Auralization
of the Forum Romanum, Situation 1–3. Diagram by the
research group.